Software Best Practices

Voices on Software Development Best Practices

Six Sigma approach to measuring Developer productivity

Last post 07-30-2007 2:44 PM by David Harper. 17 replies.
  • 05-23-2007 4:04 PM

    Six Sigma approach to measuring Developer productivity

    I know this forum is about metrics for software projects, but I'm curious about metrics to measure the productivity of the developers themselves.

    Certainly this topic has been discussed a lot in the literature and I've seen applications of Function Points, Feature Points, Defects Per KLOC, etc., to the problem.  But nothing ever hit home with me because the metrics were neither granular nor homogeneous enough, or the quantity of data was too low to draw any significant conclusions.  Several years ago, using a Six Sigma approach, I developed a metric based on code churn and I would like to get opinions from others on this metric.

    In Six Sigma, one of our key metrics, the sigma value, is based on two measurements: 1) an Opportunity for Defect (OFD) and 2) an OFD that is defective (a Defect).  Before converting to a sigma value, which I won't do here, the first step in calculating this metric is to count all the Defects and divide that count by the number of OFDs.  The result is also called a defect rate.  In my case, I defined an OFD to be a line of code.  Then I defined a Defect to be any line of code that later needs to be changed, because of developer error, after checkin.  The assumption here is that if a line of code was written correctly the first time, it wouldn't need to be changed later.  Thus, any line of code that changes is considered a defect.

    For example, say that a developer writes 1000 lines of code to implement a new feature.  Later, a tester finds a bug in that feature and logs a ticket for the developer to fix the bug.  The developer goes in and changes 3 lines, adds 2 new lines, and deletes 4 lines of code to fix the bug.  That developer's productivity rating would then be (3 + 2 + 4) / 1000 = 9 / 1000 or 0.009. 
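    The arithmetic above can be sketched in a few lines of Python.  This is only an illustration: the function names are made up for the example, and the sigma conversion uses the common Six Sigma convention of a 1.5-sigma shift, which is an assumption since the post deliberately skips that step.

```python
from statistics import NormalDist


def defect_rate(changed: int, added: int, deleted: int, lines_written: int) -> float:
    """Defects divided by opportunities, where every line written is one opportunity."""
    if lines_written <= 0:
        raise ValueError("lines_written must be positive")
    return (changed + added + deleted) / lines_written


def sigma_level(rate: float, shift: float = 1.5) -> float:
    """Convert a defect rate to a sigma value.

    Assumes the conventional 1.5-sigma long-term shift; the post does not
    say which convention its author uses.
    """
    if not 0.0 < rate < 1.0:
        raise ValueError("rate must be strictly between 0 and 1")
    return NormalDist().inv_cdf(1.0 - rate) + shift


# The example from the post: 3 changed, 2 added, 4 deleted out of 1000 written.
rate = defect_rate(changed=3, added=2, deleted=4, lines_written=1000)  # 0.009
```

    Under that convention, a rate of 3.4 defects per million opportunities comes out at roughly six sigma, which is a quick sanity check on the conversion.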

    I've been using this metric successfully for some time.  I use diff.exe to count the number of lines that change and then I parse the diff.exe output to calculate the metric.  There is one small snag in the process, though, that I've been able to work around.  Diff.exe often has trouble distinguishing a line of code that has changed from a new line of code added plus an old line of code deleted.  To get around this issue, I multiply the number of lines that have changed by 2, so I get the same numeric result no matter how diff.exe interprets the changes.  Thus, my result from the above example would be (3 * 2 + 2 + 4) / 1000 = 0.012 instead of 0.009.  Of course, that reduces the accuracy of the metric, but it's still useful nonetheless.
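    For readers who want to try the counting step, here is a rough sketch.  It is not the diff.exe parser described above; it swaps in Python's standard difflib module and treats every changed line as one deletion plus one addition, which has the same effect as the multiply-by-2 workaround.  The function names are hypothetical.

```python
import difflib


def churn_counts(old_lines, new_lines):
    """Return (added, deleted) line counts between two versions of a file.

    A 'replace' region counts on both sides, so a changed line scores the
    same whether the diff reports it as a change or as a delete plus an add.
    """
    added = deleted = 0
    matcher = difflib.SequenceMatcher(a=old_lines, b=new_lines, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag in ("replace", "delete"):
            deleted += i2 - i1
        if tag in ("replace", "insert"):
            added += j2 - j1
    return added, deleted


def churn_rate(old_lines, new_lines, lines_written):
    """Churned lines divided by the lines originally written."""
    added, deleted = churn_counts(old_lines, new_lines)
    return (added + deleted) / lines_written


old = ["x = 1", "y = 2", "z = 3"]
new = ["x = 1", "y = 20", "z = 3", "w = 4"]
added, deleted = churn_counts(old, new)  # one changed line, one new line
```

    Attributing each churned line to developer error rather than to a new requirement still needs a human decision or a ticket lookup; the sketch only counts lines.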

    Anyone have any thoughts on this approach?  Is it something you might consider using in your own organization?

  • 05-24-2007 2:09 AM In reply to

    Re: Six Sigma approach to measuring Developer productivity

    Your method doesn't take into consideration differences in the number of lines of code produced daily. As I understand it, given two developers A and B, A producing 500 lines of code with 10 defects is rated the same as B producing 1000 lines of code with 20 defects.

    Your measuring method also doesn't take into consideration a lot of factors, some of which are:

    1. Code quality: well-structured code, easy to maintain, etc.
    2. Performance of the code: developer avoids bottlenecks in the programming language or the platform based on his knowledge and experience.
    3. Complexity of the code or the algorithm used.
    4. The severity of bugs: A simple buffer overrun bug that can be fixed in only a small number of code lines can be a lot more dangerous than another one that is related to business logic and requires changing many lines.
    5. Nesting level of method calls.

    Measurement is generally one of the most difficult things in management. The way you apply your measurements and the way they affect developers can affect your overall application in a way that you cannot imagine. For example if you give a bonus to developers with high productivity based on your metrics (developers who have the lowest error rates per line of code), then developers can spend a lot more time over-testing, over-debugging and over-reviewing their code and producing a smaller number of lines of code daily.

    The only way we have found that works for measuring developer productivity is comparing developers with each other (without telling them we are comparing them). Assign two similar tasks to two developers and see who finishes first, the difference in the quality level of the code, the number of bugs, the severity of bugs, etc.

  • 05-24-2007 5:39 AM In reply to

    Re: Six Sigma approach to measuring Developer productivity

    Hello both.

    First let me declare that I'm a humble academic rather than a practising software developer, which I was a long time ago. I've much to learn from you folks, and hope to build it into my teaching.

    A couple of discussion points based on the above -

    It has traditionally been difficult to determine what's a real defect and what is an "inadequately specified requirement".

    Certainly here in the UK, if any aspects of employees' performance are being monitored, we must tell them about it. Yes, I know that sometimes that defeats the object of the monitoring process, but that's how it is.

    best regards

    Geoff

  • 05-24-2007 3:35 PM In reply to

    Re: Six Sigma approach to measuring Developer productivity

    MuhammadAdel:
    1. Code quality: well-structured code, easy to maintain, etc.
    2. Performance of the code: developer avoids bottlenecks in the programming language or the platform based on his knowledge and experience.
    3. Complexity of the code or the algorithm used.
    4. The severity of bugs: A simple buffer overrun bug that can be fixed in only a small number of code lines can be a lot more dangerous than another one that is related to business logic and requires changing many lines.
    5. Nesting level of method calls.

    In fact this is probably more important than most people think, and is probably why trying to use metrics to measure software 'productivity' has never seemed to me to be a very good idea.

    If you aspire to be a good software developer then one of the things that you want to do is to write good, easy-to-maintain code with as little complexity as required to achieve the result you want. This should lead to FEWER lines of code being written by a good developer. They may have the same defect injection rate as another developer (although in practice it is likely to be lower) but by writing less code they have fewer overall defects. You also have to be careful how and what you measure - if you're including comments rather than just logical lines of code and your developers know that, then you're going to end up with horribly bloated files full of useless comments.
    In fact Chapter 32 of CC2 (Code Complete, 2nd edition) is all about writing self-documenting code. Another good example is Rails. The whole point of Rails is that you don't have to end up writing lots of code!

    So just by measuring lines of code that a developer writes you can actually be influencing the wrong kind of behaviour.

  • 05-24-2007 6:15 PM In reply to

    Re: Six Sigma approach to measuring Developer productivity

    MuhammadAdel:
    Your method doesn't take into consideration difference in the number of lines of code produced daily

    You lost me; why should that make a difference?  Yes, 500 lines with 10 defective lines is the same as 1000 lines with 20 defective lines; that's what we want.  The fact that one developer produced twice as many lines as the other in the same period of time is a different metric, a metric I'll call throughput.  My metric allows us to analyze throughput to see if producing more lines of code per day is more productive than producing fewer lines of code each day.
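    To make the distinction concrete, here is a small sketch that keeps the two numbers separate.  The class and field names are hypothetical, and the 10-day period is an assumed value; the thread only says "the same period of time".

```python
from dataclasses import dataclass


@dataclass
class DeveloperRecord:
    lines_written: int
    defective_lines: int
    days: int  # assumed length of the measurement period

    @property
    def defect_rate(self) -> float:
        """Defective lines per line written (the churn metric)."""
        return self.defective_lines / self.lines_written

    @property
    def throughput(self) -> float:
        """Lines written per day, tracked as a separate metric."""
        return self.lines_written / self.days


dev_a = DeveloperRecord(lines_written=500, defective_lines=10, days=10)
dev_b = DeveloperRecord(lines_written=1000, defective_lines=20, days=10)

# Same defect rate, but B's throughput is twice A's.
same_rate = dev_a.defect_rate == dev_b.defect_rate
```

    Keeping the two values apart is what allows defect rate to be compared against throughput afterwards.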

    MuhammadAdel:

    Your measuring method also doesn't take into consideration a lot of factors, some of which are:

    1. Code quality: well-structured code, easy to maintain, etc.
    2. Performance of the code: developer avoids bottlenecks in the programming language or the platform based on his knowledge and experience.
    3. Complexity of the code or the algorithm used.
    4. The severity of bugs: A simple buffer overrun bug that can be fixed in only a small number of code lines can be a lot more dangerous than another one that is related to business logic and requires changing many lines.
    5. Nesting level of method calls.

    I disagree; my metric encourages high-quality code that is well structured and easy to maintain.  Code with poor performance, high complexity, serious bugs, etc., is more likely to result in lines of code having to be changed.  Thus, my metric does take all that into consideration.  It doesn't measure quality, complexity, bug severity, or nesting levels directly, but there are other good metrics for that, e.g., the Cyclomatic Complexity metric for measuring code complexity.  If I were to incorporate something like complexity into my metric, I would have to make an assumption as to whether complex code is better than simple code, or vice versa.  Instead, I just let the metric talk by reporting the percentage of code lines that need to be changed.

    Another output from my project was that the developers preferred being measured by my metric over other metrics such as Bugs per KLOC.  You mention counting the number of bugs as a metric, but bugs are apples and oranges.  You could have a very serious bug that involves only a minor correction to fix, or you could have a minor bug that requires throwing away months of code development.  My metric puts some uniformity around measuring coding productivity.  It doesn't really matter what the severity of a bug is; if it's bad enough to fix, lines of code will be changed.  If it's not serious enough to fix, nothing will be changed.  It's either changed or it's not changed.  Everything in between doesn't matter; all that matters is how many lines of code need to be changed.

  • 05-26-2007 1:55 AM In reply to

    Re: Six Sigma approach to measuring Developer productivity

    How do you keep track of which developer is responsible for which line? Where I work we share code extensively. There are sections of code that are a mixture of work from half a dozen different developers. It's true that svn (our usual source control system) can blame based on any single line, but this often fails due to: pairing, working at a different station, or through simple formatting changes. Tracking down a defect to the programmer that created it would certainly be possible but might take considerable time given the way I usually work.

  • 05-26-2007 2:13 AM In reply to

    Re: Six Sigma approach to measuring Developer productivity

    Paul Sinnett:
    How do you keep track of which developer is responsible for which line?
     

    Check in often and before someone else touches the code.  Then you can tell which developer is responsible for each line by just looking to see who checked it in.

  • 05-28-2007 2:42 AM In reply to

    Re: Six Sigma approach to measuring Developer productivity

     It's not enough to see who last checked in the code.  If they were working together as in our buddying system, either of them could have checked it out.  It's relatively easy to allocate responsibility (=blame?) to one or other developer but measuring effective productivity is much more difficult than the metrics imply.

    Improving developer productivity isn't just about changing things you can directly measure.  There are qualitative aspects too (and yes I know you can produce scales for "measuring" qualitative factors).  For example, you can certainly use cyclomatic complexity to measure how complex a piece of software is, but it's not so easy to estimate how much less complex it might have been, or how much more complex it needs to be.  If someone chooses a complex algorithm, it may turn out to be concise and occupy far fewer lines of code, and it may even have few defects.  But the very next modification needed to that code may initiate a string of defect injections caused by the complexity.  On the other hand, with something less complex, we might end up with more code, and perhaps take more time and inject more defects initially, but longer term it's maintainable and modifications do not inject further defects.  So which developer was more productive?

    It's also about how you decide to measure quality.  You can have code which has been produced to a very high quality but which won't integrate very well with other parts of the system - so the productivity needs to take account of its integration (quality of design and testing for example)?  An undetected defect in a design can wipe out any coding productivity so any measure of productivity needs to have a link with meeting use cases and requirements, passing regression testing, etc, to show that the productivity is measured against the right target (one the customer recognises), otherwise it's not really productive.  That also begs the question of how you measure the quality of the testing without producing a circular measure.

    We always struggle with the problem of relating the metrics to what we really care about, which is the right/acceptable level of quality.  Measuring defects/KLOC, or somesuch gives us a handle on the overall defect level, but it's still very difficult to understand how that translates into quality in front of the customer.
     

     

  • 05-30-2007 7:21 PM In reply to

    Re: Six Sigma approach to measuring Developer productivity

    Sorry, but I'm not sure how your response relates to the metric I proposed. 

    bob_lloyd:
    Improving developer productivity isn't just about changing things you can directly measure.

    What do you base that on?

    bob_lloyd:
    It's relatively easy to allocate responsibility (=blame?) to one or other developer but measuring effective productivity is much more difficult than the metrics imply.

    Why is that?

    My metric was not about complexity.  But in tests that we did using the Cyclomatic Complexity Metric, we established a correlation between the number of lines of code that need to be changed and the complexity of the code.

    bob_lloyd:
    So which developer was more productive?

    The one with the lowest ratio of (number of lines that need to be changed) / (number of lines written)


    bob_lloyd:
    You can have code which has been produced to a very high quality but which won't integrate very well with other parts of the system - so the productivity needs to take account of its integration (quality of design and testing for example)?

    It does.  If the code won't integrate well, the developer will have to change more lines of code than if it did integrate well.

    bob_lloyd:
    An undetected defect in a design can wipe out any coding productivity so any measure of productivity needs to have a link with meeting use cases and requirements, passing regression testing, etc, to show that the productivity is measured against the right target (one the customer recognises), otherwise it's not really productive.

    I don't understand why you say this.  If design, use cases, requirements, etc. are bad, it will result in more lines of code being changed.  The metric works perfectly for tracking this.

    bob_lloyd:
    That also begs the question of how you measure the quality of the testing without producing a circular measure.

    I'm not trying to measure the quality of the testing.  I don't understand why you brought this up.

    bob_lloyd:
    Measuring defects/KLOC, or somesuch gives us a handle on the overall defect level

    Measuring defects/KLOC is a terrible metric.  Managers like using that metric, but it's unfair to developers.  I developed my metric as a solution to developer complaints about being measured by defects/KLOC at Microsoft.  Defects are neither homogeneous nor granular enough to be fair.  One defect could be fixed by making a small change to a single line of code while another could require months of work to correct many modules of code.  The severity of the defect is taken into account with my metric because severity will be taken into account when deciding if the defect needs to be fixed or not.  I.e., if the problem is severe enough, the developer will have to change lines of code to resolve it.  If it's a low-severity defect, it might not be fixed; thus no lines of code will be changed.

  • 05-31-2007 12:57 AM In reply to

    Re: Six Sigma approach to measuring Developer productivity

    SixSigmaGuy:

    bob_lloyd:
    So which developer was more productive?

    The one with the lowest ratio of (number of lines that need to be changed) / (number of lines written)

    That doesn't take into account the exposure to the risk that a piece of code requires change. If I'm working on some rarely used module my productivity would be higher by this metric due to the module not being as thoroughly tested. At the extreme I might create a single line on a project that is never executed in practice and therefore is never changed. Should my productivity for this effort be the highest it is possible to achieve? 


  • 05-31-2007 1:33 AM In reply to

    Re: Six Sigma approach to measuring Developer productivity

     I have a lot of reservations about this measure particularly over the operationalisation of the productivity value.

     Clearly the defects/number of opportunities is a well-established quality metric and in the abstract seems to provide a reasonable measure of quality. The operationalisation of that denominator is where the judgement comes in. There are very many opportunities to inject defects and not all of them involve writing code.

    The operational definition of opportunity defined as a line of code that later needs to be changed because of developer error is a problem. A developer who correctly sees the need for refactoring as the result of a minor defect which could have been fixed with a quick patch (like a single byte change to reduce a privilege on a method call), would appear to be penalised for lower productivity because moving code is measured as changed code lines. Even with the consequent deletion of lines as a result of the move, any wrapping code (function headers, etc) may be very simple, low complexity, high speed changes, but are undifferentiated in the measure.

    SixSigmaGuy says

    “If I were to incorporate something like complexity into my metric, I would have to make an assumption as to whether complex code is better than simple code, or vice versa.  Instead I just let the metric talk by reporting the percentage of code lines that need to be changed.”

    And that's exactly one of the big problems. It reduces the measure of productivity of the programmer to production line metrics as if the product was simply a succession of lines of code. The metric produced gives you an indication of how fast the code conveyor belt is running. This isn't particularly useful information and can generally be obtained, as was mentioned above, just by looking at developers day-to-day.

    Without understanding the need for the code change and the risk associated with it, you can't provide a meaningful measure of the productivity of the fix. If all you want is a clock on the developer, there are easier ways. But developer productivity is intimately connected with quality and unfortunately you do have to take into account factors like quality, risk, complexity, and so on. That's why simple arithmetic to produce metrics gives a false sense of security. You could prove you have all the figures and reward highly productive developers but someone who is slower but refactors and tests well could be much more productive longer term and score lower in the productivity metrics.

    In fact in Kan's book "Metrics and Models in Software Quality Engineering", the measure of productivity is calculated not on a per-developer basis related to lines of code, but to project size (based on assembly-equivalent source size, to take account of different languages used) and the overall project effort. That allows for projects to be compared for productivity but not individual programmers.

    I hope that folks can see that my comments above are related very much to the problems with the operational definition of productivity proposed.
     


     

  • 06-01-2007 1:22 AM In reply to

    Re: Six Sigma approach to measuring Developer productivity

    Apologies about the length of the response and the quoting but there are lots of issues here that ought to have some discussion. 

     

    SixSigmaGuy:

    bob_lloyd:
    Improving developer productivity isn't just about changing things you can directly measure.

    What do you base that on?

    Simple aspects like the suitability and quality of algorithms, decisions about trade off between functionality and performance, appropriate compromises between usability and even things like the division of responsibility between classes, etc.  All have a bearing on productivity. 

     

    SixSigmaGuy:

    bob_lloyd:
    It's relatively easy to allocate responsibility (=blame?) to one or other developer but measuring effective productivity is much more difficult than the metrics imply.

    Why is that?

    My metric was not about complexity.  But in tests that we did using the Cyclomatic Complexity Metric, we established a correlation between the number of lines of code that need to be changed and the complexity of the code.

    I'm not yet convinced that that correlation can be generalised.  For example, with highly compact complex code, single line changes can carry a bigger functional payload so the size of the change (and hence the implied risk) may not correlate with the amount of code changed.  C++ programmers will doubtless recognise the single line function as a case in point. Despite the translation into assembly-line equivalents, the editing of complex condensed code may be rapid if the developer understands it well. This will skew the productivity measure.

     

    SixSigmaGuy:

    bob_lloyd:
    So which developer was more productive?

    The one with the lowest ratio of (number of lines that need to be changed) / (number of lines written)

     Refactoring can be highly productive and doesn't get adequately differentiated in this measure.

     

    SixSigmaGuy:

    bob_lloyd:
    You can have code which has been produced to a very high quality but which won't integrate very well with other parts of the system - so the productivity needs to take account of its integration (quality of design and testing for example)?

    It does.  If the code won't integrate well, the developer will have to change more lines of code than if it did integrate well.

    Not necessarily so.  For example, integration issues on the .NET platform can be very costly but may well be resolved by very small changes to configuration files instead of code changes. It's an artificial separation to consider only direct changes to code. The integration problem may well be fixed elsewhere as the cheaper option.

     

    SixSigmaGuy:

    bob_lloyd:
    An undetected defect in a design can wipe out any coding productivity so any measure of productivity needs to have a link with meeting use cases and requirements, passing regression testing, etc, to show that the productivity is measured against the right target (one the customer recognises), otherwise it's not really productive.

    I don't understand why you say this.  If design, use cases, requirements, etc. are bad, it will result in more lines of code being changed.  The metric works perfectly for tracking this.

    Again this is an assumption. I've seen cases where the requirements and use cases have been inadequately specified where the changes in code were very small but the required analysis time was very high.  These things affect productivity for developers without necessarily translating into a corresponding increase in the level of code change.

     

    SixSigmaGuy:

    bob_lloyd:
    That also begs the question of how you measure the quality of the testing without producing a circular measure.

    I'm not trying to measure the quality of the testing.  I don't understand why you brought this up.

    The reason it's pertinent is because there's no direct correlation between a code change and the impact on the size and scope of the test code. A small change in one can be the cause or the result of the other. We need to include the consideration of the changes in test code in any measure of coding productivity but the central problem remains unsolved - how does the productivity of generating code reflect the productivity of the developer on the project?

     

    SixSigmaGuy:

    bob_lloyd:
    Measuring defects/KLOC, or somesuch gives us a handle on the overall defect level

    Measuring defects/KLOC is a terrible metric.  Managers like using that metric, but it's unfair to developers.

    It sure is.  Any metric based on an estimate of lines of code suffers from the same problems, whether we are using assembly-language equivalents, or numerical measures of code lines changed. The problem is relating this measure to the quality of the change and the risks implied. At best, these metrics give us an ordering. In my experience, it is difficult to make cross-project or cross-developer comparisons without some pretty massive simplifying assumptions.


  • 06-01-2007 9:44 AM In reply to

    Re: Six Sigma approach to measuring Developer productivity

    One thing to always come back to when you are considering productivity is "does this change add value to our customers". Measuring lines of code, defects fixed, or defects introduced per kLoC isn't a good method of measuring productivity.

    Far too often it is used because the metrics are easy to collect. But this is a classic example of collecting metrics and then making them fit your objectives and goals.

    Productivity should be about the "amount of value you add versus the time taken to add it". E.g. just adding a feature isn't productive unless it is a feature that your customers have asked for and is of value to them.

    One of the main problems with metrics-based productivity measures is that they can often lead to and/or reinforce bad behavior.  Defects/LoC is a good example of this because you encourage people to write huge amounts of 'unreadable code' which could often be refactored to be half the size. You might even think you can turn this around - e.g. count extra test code as a 'positive' change - but that often just encourages people to write more bad test code. In fact you should really just be adding test data to your test harnesses, not having to add hundreds of lines of code.

    A good method is for a project to measure the amount of requirements fulfilled in a given time and the amount of customer changes that have been accommodated.

     A last thought: I've often found that the most productive developers are those that take the time to think about what they are doing. But how do you measure 'thinking' time? I.e. time when you might not look like you are doing anything! (Well ok I always make notes but I expect most people know what I mean)



     


     

     

     

     

  • 06-02-2007 6:23 AM In reply to

    Re: Six Sigma approach to measuring Developer productivity

    David Harper:

    A good method is for a project to measure the amount of requirements fulfilled in a given time and the amount of customer changes that have been accommodated.

    Okay, so the next question is how do you measure the relative worth of each requirement / customer change request? Or do you consider them all equal?
     

  • 06-03-2007 4:23 AM In reply to

    Re: Six Sigma approach to measuring Developer productivity

    Some of the metrics we are looking at include the following:

    • Percentage of use cases met and developer tested
    • Percentage of requirements met and developer tested
    • Same again but integrated
    • No of times regression testing has passed
    • Amount of risk remaining and its distribution
    • Defect discovery rate and fix rate
    • Defect latency - time after discovery before fix
    • A collection of customer call metrics
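    As a rough illustration of the discovery rate, fix rate and latency items in that list, the sketch below works from (discovered, fixed) timestamp pairs.  The record layout, the function names and the per-week unit are all assumptions made for the example; nothing here describes our actual tooling.

```python
from datetime import datetime
from statistics import mean


def mean_latency_days(defects):
    """Average days between discovery and fix, ignoring defects still open."""
    latencies = [
        (fixed - discovered).total_seconds() / 86400
        for discovered, fixed in defects
        if fixed is not None
    ]
    return mean(latencies) if latencies else None


def rates_per_week(defects, start, end):
    """Return (discovered per week, fixed per week) within [start, end)."""
    weeks = (end - start).total_seconds() / (7 * 86400)
    discovered = sum(1 for d, _ in defects if start <= d < end)
    fixed = sum(1 for _, f in defects if f is not None and start <= f < end)
    return discovered / weeks, fixed / weeks


# Hypothetical sample data: two fixed defects and one still open.
sample = [
    (datetime(2007, 6, 1), datetime(2007, 6, 3)),
    (datetime(2007, 6, 2), datetime(2007, 6, 8)),
    (datetime(2007, 6, 5), None),
]
```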

    We also use, inadequately, a measure we call Avoidable Defect Cost, which looks at the earliest phase in which a defect could have been detected (a judgement call by the tech lead and developer) compared with the actual phase in which it was discovered.  This is weighted according to phase and severity and gives us a way of looking at which phases in the development process were error-prone.  Not a perfect measure, and not a comparator between projects but sufficiently focussed to pick up defects in the inspection process, or dev testing, or defining interfaces, etc.
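    One way such a weighting could be sketched is shown below.  To be clear, the phase names, the phase costs, the severity weights and the formula itself are all invented for illustration; the real weighting is a judgement call and is not spelled out in this post.

```python
from collections import defaultdict

# Hypothetical relative cost of finding a defect in each phase.
PHASE_COST = {
    "requirements": 1,
    "design": 2,
    "coding": 4,
    "dev_test": 6,
    "integration": 8,
    "customer": 16,
}
# Hypothetical severity multipliers.
SEVERITY_WEIGHT = {"minor": 1, "major": 3, "critical": 5}


def avoidable_cost(earliest_phase, actual_phase, severity):
    """Weighted cost of a defect found in actual_phase instead of earliest_phase."""
    escaped = PHASE_COST[actual_phase] - PHASE_COST[earliest_phase]
    return max(escaped, 0) * SEVERITY_WEIGHT[severity]


def cost_by_phase(defects):
    """Total avoidable cost per phase in which defects could first have been caught."""
    totals = defaultdict(int)
    for earliest, actual, severity in defects:
        totals[earliest] += avoidable_cost(earliest, actual, severity)
    return dict(totals)
```

    Grouping by the earliest detectable phase is what points at the error-prone phases; a defect caught in the phase where it could first have been caught scores zero.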

    We also try to understand the distribution of risk across the project.  We avoid the assumption that the highest risk takes the most effort - it might just be a small amount of high quality effort, for example a couple of senior engineers scrutinising something in great detail.

    We also measure how much our change control mechanism is used and try to track the impact.  Where change could have been anticipated and designed for, we look at how cost effective that might have been - it's not used as measure of the developer but of the flexibility of the process.

    As far as developer productivity goes, we look more at the overall quality of their contribution to the project and we take into account the above measures, as well as more impressionistic (but no less valuable) indices of their contributions to design reviews and inspections, buddying, etc.

    You'll note that none of these measures are really about individual developer productivity but all of them contribute to increasing productivity on each project.  Developers are focussed on evidence of quality meeting what the customer cares about.  We have test leads who need to be convinced of what developers claim so evidence of testing is highly visible.

    It's already highlighted training issues in certain areas, poor developer testing in others, inadequate reviews, etc.  In a proactive supportive environment, developers are keen to see these metrics, and identify where things need improvement.

    All these things are related to developer productivity, can influence it positively, and are impersonal and highly visible.  Does this ring true with others out there?

     

  • 06-08-2007 2:52 PM In reply to

    Re: Six Sigma approach to measuring Developer productivity

     In my opinion there are two distinct topics here; "the measure" and "what you do with the measure".
     

    The Measure

    This is an interesting measure.  It does indeed (from a mathematical point of view) have some desirable qualities.

    1.  It appears to be scale invariant.  That is, it would allow us to compare the defect rates for a small project to a large project without having to introduce another correction factor.
    2. It looks like it should allow comparison across families of languages (e.g. Ruby & Python, C++ & Java, Perl & PHP & other "scripty things") fairly reliably.

    However it does not appear to measure productivity.  For example, if the power were out in the office for a week (certainly having a negative effect on productivity) this measure would not detect it.

    Also, if a developer writes a lot of lines like

    foo = 0
    bar = 0
    baz = 0

    rather than the simple

    foo, bar, baz = 0

    they come out looking much better by this metric than I think you'd intend.  :-)

    Like most metrics I'm sure this one does not exist in a vacuum.  You've probably got a proverbial boatload of other metrics that help you to understand what's going on inside of your particular projects.  And with that lead in I move on to...


    What you do with the measure

    I couldn't help but notice (and agree with) the negative reaction to measuring developers based on this metric.  There is a lot more that goes into being a good developer than producing a whole lotta lines of code.  My gut tells me that you need to be really careful about using and communicating this kind of metric.  This is especially true if you tie it in any way to compensation (e.g. bonuses for low changes/LOC).  For an excellent discussion of this see Joel Spolsky's "The Econ 101 Management Method".

    Now on the upside, you may be using this to indirectly measure things like:

    1. Did changing the Technical Specification template to include a section on unit testing decrease the defect rate?
    2. Did canceling the team meeting to review the finalized specification increase the defect rate?
    3. Did ______________ increase/decrease the defect rate?

    This kind of metric to analyze process changes is at the heart of six sigma (as I understand it).  But you want to be very careful about using it to measure people.
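One way such a before/after question might be checked is a two-proportion z-test on the defect rates. This is only a sketch: the counts are invented, and treating every line of code as an independent trial is a strong assumption that real code probably violates, so the result is a rough signal at best.

```python
import math


def two_proportion_z(defects_a: int, lines_a: int, defects_b: int, lines_b: int) -> float:
    """z statistic for the difference between two defect rates (pooled variance)."""
    rate_a = defects_a / lines_a
    rate_b = defects_b / lines_b
    pooled = (defects_a + defects_b) / (lines_a + lines_b)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / lines_a + 1 / lines_b))
    return (rate_a - rate_b) / std_err


# Hypothetical counts: (defective lines, lines written) before and after
# adding a unit-testing section to the specification template.
before = (180, 20_000)
after = (120, 20_000)

z = two_proportion_z(*before, *after)
print(f"before={before[0] / before[1]:.4f} after={after[0] / after[1]:.4f} z={z:.2f}")
```

A z value well above roughly 2 suggests the drop is unlikely to be noise under these assumptions. It says nothing about any individual developer.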

    At one of the Construx Executive Summits (two years ago I think) one of the participants said that they have a policy of never using metrics for performance evaluation.  They said this was specifically because people would begin to "game" the system to skew the metrics in their favor.

    Metrics are great for measuring processes since processes don't have a mind of their own.  They're not so good for measuring people in creative roles where you want them to have a mind of their own.

    To your original question, I would not consider using this as a metric at Team46 for our development projects.  Not because it is that bad, but because I feel I can better measure the things I'm interested in more directly.

    Bruce P. Henry

    LiquidPlanner
  • 07-24-2007 4:46 PM In reply to

    Re: Six Sigma approach to measuring Developer productivity

    At NEC Telecom Software Philippines, Inc., where I work, we allow project managers to measure their software engineers' productivity any way they want; however, it's our policy not to use productivity in evaluating individual performance.  We compute productivity two ways: creation productivity and overall productivity.

    We also count LOC per file, i.e., if an engineer adds a new LOC in an existing source file, we count all the lines in the source file as modified.  This way, we have a consistent way of measuring LOC.
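A small sketch of that file-level counting rule, assuming the only inputs are each file's line count and the set of files touched by a change. The function name, file names and sizes are hypothetical.

```python
def modified_loc(file_sizes: dict[str, int], touched_files: set[str]) -> int:
    """Count every line of each touched file as modified (file-level granularity)."""
    return sum(size for path, size in file_sizes.items() if path in touched_files)


# Hypothetical file names and sizes (total lines per file after the change).
file_sizes = {"parser.c": 420, "lexer.c": 310, "main.c": 95}
touched = {"parser.c"}  # a single line was added to parser.c

print(modified_loc(file_sizes, touched))  # 420
```

One added line counts as 420 modified lines here. That is coarse, but the same rule applies to every project.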

    Going back to how we compute productivity, we use the following formulas company-wide:

    Creation productivity = (New + Modified) / Total project hours, where Modified = Added + Changed + Deleted

    Overall productivity = (New + Modified + Reused) / Total project hours
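A worked sketch of the two formulas, reading each one as the whole LOC sum divided by total project hours, with Modified = Added + Changed + Deleted. That reading is an assumption, and the class name and sample figures are invented for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProjectCounts:
    """Hypothetical per-project LOC and effort totals."""
    new_loc: int
    added_loc: int
    changed_loc: int
    deleted_loc: int
    reused_loc: int
    total_hours: float

    @property
    def modified_loc(self) -> int:
        # Modified = Added + Changed + Deleted
        return self.added_loc + self.changed_loc + self.deleted_loc


def creation_productivity(p: ProjectCounts) -> float:
    """(New + Modified) LOC per project hour."""
    return (p.new_loc + p.modified_loc) / p.total_hours


def overall_productivity(p: ProjectCounts) -> float:
    """(New + Modified + Reused) LOC per project hour."""
    return (p.new_loc + p.modified_loc + p.reused_loc) / p.total_hours


project = ProjectCounts(new_loc=4000, added_loc=300, changed_loc=500,
                        deleted_loc=200, reused_loc=5000, total_hours=1000)

print(creation_productivity(project))  # 5.0
print(overall_productivity(project))   # 10.0
```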

    We only have a small set of company-wide metrics in addition to the above:

    - Percentage of upstream defects versus percentage of downstream defects

    - Percentage of post-release bugs

    - Rework hours

    - Bug density (# of bugs found/LOC)

    We use these metrics to evaluate current process performance and to determine improvement areas.  We are investing heavily in technical and software management training (so that creation quality -- doing things right the first time -- improves and very few defects are found during peer reviews) and in software tools from Parasoft to speed up our code reviews and testing.

    We understand that software measurement is more complex than our company's prescribed simplification but so far it's working.

    We also have bi-annual quality and productivity reports and we usually include project profiles so that we don't mistakenly compare dissimilar projects. 

    In friendship,

    Carlo

  • 07-30-2007 2:44 PM In reply to

    Re: Six Sigma approach to measuring Developer productivity

    brucehenry:

    2. It looks like it should allow comparison across families of languages (e.g. Ruby & Python, C++ & Java, Perl & PHP & other "scripty things") fairly reliably.


    brucehenry:

    Also, if a developer writes a lot of lines like

    foo = 0
    bar = 0
    baz = 0

    rather than the simple

    foo, bar, baz = 0

    they come out looking much better by this metric than I think you'd intend.  :-)

    Actually this depends on how and what you are measuring when you say LOC. Do you mean physical or logical lines of code? Most people measure physical LOC because it is easy, but you are better off using logical LOC (LLOC). In this case foo,bar,baz = 0 would be treated as 3 distinct statements.

    When you start measuring LLOC you run into problems with comparing things written in different languages. If you've structured your code well, however, you can look at the LLOC and defect rates for different types of code, e.g. your product code, test code and build-related/metadata code.
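To make the physical/logical distinction concrete, here is a rough sketch using Python's standard `ast` module. The counting rule (one per statement, one per target of a chained assignment) is my own simplification and not a standard LLOC definition. I also spell the one-line form as `foo = bar = baz = 0`, since that is how Python writes it.

```python
import ast


def physical_loc(source: str) -> int:
    """Count non-blank physical lines."""
    return sum(1 for line in source.splitlines() if line.strip())


def logical_loc(source: str) -> int:
    """Count statements, counting each target of a chained assignment separately."""
    total = 0
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Assign):
            total += len(node.targets)
        elif isinstance(node, ast.stmt):
            total += 1
    return total


verbose = "foo = 0\nbar = 0\nbaz = 0\n"
compact = "foo = bar = baz = 0\n"  # Python spelling of the one-line form

print(physical_loc(verbose), logical_loc(verbose))  # 3 3
print(physical_loc(compact), logical_loc(compact))  # 1 3
```

Under this rule both versions count as 3 logical lines, so the verbose style no longer looks better.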

     

    brucehenry:


    What you do with the measure

    I couldn't help but notice (and agree with) the negative reaction to measuring developers based on this metric.  There is a lot more that goes into being a good developer than producing a whole lotta lines of code.  My gut tells me that you need to be really careful about using and communicating this kind of metric.  This is especially true if you tie it in any way to compensation (e.g. bonuses for low changes/LOC).  For an excellent discussion of this see Joel Spolsky's "The Econ 101 Management Method".

    Now on the upside, you may be using this to indirectly measure things like:

    1. Did changing the Technical Specification template to include a section on unit testing decrease the defect rate?
    2. Did canceling the team meeting to review the finalized specification increase the defect rate?
    3. Did ______________ increase/decrease the defect rate?

    This kind of metric to analyze process changes is at the heart of six sigma (as I understand it).  But you want to be very careful about using it to measure people.

    At one of the Construx Executive Summits (two years ago I think) one of the participants said that they have a policy of never using metrics for performance evaluation.  They said this was specifically because people would begin to "game" the system to skew the metrics in their favor.

    Metrics are great for measuring processes since processes don't have a mind of their own.  They're not so good for measuring people in creative roles where you want them to have a mind of their own.

    To your original question, I would not consider using this as a metric at Team46 for our development projects.  Not because it is that bad, but because I feel I can better measure the things I'm interested in more directly.

     

    Agreed. What these types of metrics can do is to tell you where you might be more productive in the future.

    • What areas of code are most bug-prone?
    • What is the history of that area of code?
    • What types of defects are being found at what stages (e.g. how effective are your requirements gathering and review phases)?
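Questions like these can be answered from a plain defect log. A minimal sketch, assuming each defect is recorded with the code area it touched and the stage where it was found; the areas, stages and records below are all invented.

```python
from collections import Counter

# Hypothetical defect log: (code area, stage where the defect was found)
defects = [
    ("billing", "system test"),
    ("billing", "post-release"),
    ("auth", "code review"),
    ("billing", "code review"),
    ("reports", "system test"),
    ("auth", "system test"),
]

by_area = Counter(area for area, _ in defects)
by_stage = Counter(stage for _, stage in defects)

print(by_area.most_common())   # most bug-prone areas first
print(by_stage.most_common())  # stages that catch the most defects first
```

Grouping by area and stage keeps the focus on the code and the process, not on individual developers.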