Software Best Practices

Voices on Software Development Best Practices

Question about Defect Density Metric

Last post 07-17-2008 4:59 PM by bbrown4. 5 replies.
  • 11-21-2007 3:49 PM

    • bbrown4
    • Top 50 Contributor
    • Joined on 05-30-2007
    • Hillsboro, Oregon
    • Posts 4

    Question about Defect Density Metric

    The standard defect density metric is defined as the number of defects discovered normalized by code size. Each of these elements needs a crisp definition: for defects, whether to count all defects, all severity levels, or only open defects; for size, whether to use EKLOC, logical KLOC, or function points.

    In our defect density definition, the goal of the metric is to provide one measure of the amount of rework required. Therefore, we count total defects of all severity levels. For size we measure EKLOC; we agree that function points are a better size measure for this metric, but they are more difficult to measure in an automated manner, so we compromised.

    Imagine a large program with 1 million lines of code. Only 10% of those lines are in active development; the other 90% are legacy code that is rarely or never touched. We felt that changed defect density would be another useful metric, since it would measure where people were actually working and would not be diluted by the untouched legacy code. Our first thought was to measure defects normalized by changed lines of code.
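    To illustrate the dilution effect described above, here is a minimal sketch. The defect count of 500 is an assumed number chosen only for illustration, and the function name is hypothetical; only the 1 million lines and the 10% active share come from the post.

```python
def defect_density(defects: int, lines_of_code: int) -> float:
    """Return defects per thousand lines of code (KLOC)."""
    if lines_of_code <= 0:
        raise ValueError("lines_of_code must be positive")
    return defects / (lines_of_code / 1000)


# From the post: 1,000,000 lines in total, 10% in active development.
total_loc = 1_000_000
active_loc = total_loc // 10

# Assumption for illustration only: 500 defects found, all in the active code.
defects = 500

overall_dd = defect_density(defects, total_loc)   # diluted by legacy code
changed_dd = defect_density(defects, active_loc)  # normalized by active code only

print(f"overall: {overall_dd} defects/KLOC, changed: {changed_dd} defects/KLOC")
```

    With these assumed numbers the same defects look ten times denser once the untouched legacy code is left out of the denominator.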

    The tool we use to measure LOC also measures code churn (lines added, deleted, or modified from the previous version). Knowing where code churn is the greatest is useful to the development and test team for obvious reasons. However, code churn measures the amount of change from one version to the next and not the lines of code that have changed over time.

    If we imagine a single source file that has had the same line of code changed 10 times over 10 revisions, then the number of lines of code that have changed is 1, and the number of changes to that file over all revisions (churn) is 10. It seems impossible for a tool to determine the number of lines of code that have changed over a period of time.
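    The difference between the two counts can be sketched as follows. This assumes, unrealistically, that every line has a stable identifier across revisions; real tools see line numbers shift as code is added and deleted, which is the difficulty described above. The function names and the line identifier 42 are hypothetical.

```python
def aggregated_churn(revisions: list[set[int]]) -> int:
    """Sum of per-revision change counts: a line changed twice counts twice."""
    return sum(len(changed) for changed in revisions)


def distinct_lines_changed(revisions: list[set[int]]) -> int:
    """Number of distinct lines touched at least once across all revisions."""
    return len(set().union(*revisions))


# The example from the post: the same line (hypothetical id 42)
# changed once in each of 10 revisions.
revisions = [{42} for _ in range(10)]

print(aggregated_churn(revisions))        # churn over all revisions
print(distinct_lines_changed(revisions))  # distinct lines changed
```

    Normalizing defects by the first number treats repeated rework of one line as more code; normalizing by the second does not.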

    In your experience, is it useful to measure defects normalized by aggregated code churn (the number of changes to all files over time)? Is this a defect density metric? Has anyone used a similar metric?

    Bob Brown, Intel Corp, Intel Software Quality

    Bob Brown
  • 01-27-2008 2:31 PM In reply to

    Re: Question about Defect Density Metric

    There are lots of different ways to measure code churn, and the more of them you can (easily) use, the better. If your system is complicated and broken down into separate components or modules, then you should measure code churn per component. You can then also compare the amount of code churn with the frequency of changes to files.

     E.g., suppose 1000 lines are changed in each of two components, A and B: in A as the result of 10 changes, and in B as the result of 100 changes. The number of changes per changed line is then:

    A = 10/1000 =  0.01

    B = 100/1000 = 0.1

     You could then look at B and ask yourself why it is changing so frequently. Is this due to a lot of defect-fixing effort, etc.?
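    The per-component ratio above could be computed along these lines. This is only a sketch: the function name and the dictionary layout are hypothetical, and the numbers are the ones from the example.

```python
def change_frequency(num_changes: int, lines_changed: int) -> float:
    """Return the number of changes per changed line for one component."""
    if lines_changed <= 0:
        raise ValueError("lines_changed must be positive")
    return num_changes / lines_changed


# component name -> (number of changes, lines changed), from the example above
components = {"A": (10, 1000), "B": (100, 1000)}

ratios = {
    name: change_frequency(changes, lines)
    for name, (changes, lines) in components.items()
}

# List the most frequently changing components first.
for name in sorted(ratios, key=ratios.get, reverse=True):
    print(name, ratios[name])
```

    Sorting by the ratio puts B ahead of A, which is the component the example suggests investigating.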

    For more information on relative metrics see

    http://research.microsoft.com/research/pubs/view.aspx?type=Publication&id=1359 

     

  • 07-11-2008 8:42 AM In reply to

    • madan
    • Top 150 Contributor
    • Joined on 07-11-2008
    • Posts 1

    Re: Question about Defect Density Metric

    hi Bob,

    My question is related to your statement "The tool we use to measure LOC also measures code churn". I am in search of such a tool and would be glad to pilot it if you can share its name.

    thanks in advance

    regards

    Madan

  • 07-17-2008 4:16 PM In reply to

    Re: Question about Defect Density Metric

    By the way... Capers Jones strongly recommends that people NOT use lines of code as a basis for metrics like this. Simply put, lines of code are a measure of the size of the *solution*, and these are known to vary by a factor of 10 for the same *problem*. Capers recommends looking at defect density in light of Function Points (see, for example, www.ifpug.org). The idea is that Function Points are a measure of the size of the *problem* that's invariant with solution size. So an organization that delivers n defects per function point would be seen by the customer as twice as good as a different organization that delivered 2n defects per function point, regardless of how many lines of code were involved in either product.

    I'm not necessarily convinced that function points are the best measure of problem size, but they are a much more meaningful measure than lines of code.

     

    Cheers,

     -- steve

     

  • 07-17-2008 4:45 PM In reply to

    • bbrown4
    • Top 50 Contributor
    • Joined on 05-30-2007
    • Hillsboro, Oregon
    • Posts 4

    Re: Question about Defect Density Metric

    We did extensive evaluations of COTS tools that would measure both LOC and churn. Our two favorites were RSM (M Squared Technology) and Code Reports (was Smartbear and is now owned by ByIQ). RSM has the better solution but programming language support is limited. Code Reports has better language support but has a much more complex model using the SCM system and is somewhat buggy and performance constrained for large (> 1M LOC) code bases.

    Bob Brown
  • 07-17-2008 4:59 PM In reply to

    • bbrown4
    • Top 50 Contributor
    • Joined on 05-30-2007
    • Hillsboro, Oregon
    • Posts 4

    Re: Question about Defect Density Metric

    Yes, I agree that FP is a better measure. However, the cost (effort and expertise) to determine FP is prohibitive. You need a trained FP counter and it is a very manual process.

    The question is: can defect density using LOC be useful even if we all agree that it is not precise in measuring code size? If we were developing metrics for a mature SW organization, I would definitely want to use FP. But for immature orgs that do not use metrics today, we felt it was more important to have an easy-to-use method to calculate size, and the defect density derived from LOC is still useful. We do not want our teams to be comparing themselves across teams. However, if an org or a team has a goal to improve DD by (say) 10%, then normalizing within a team using LOC is useful. Comparisons across application domains (apps vs. FW vs. tools) are not realistic. Comparisons among different teams that use different methodologies or SDLC are also not going to be useful. But comparison over time within a team using LOC is useful in measuring improvement.
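    As a rough sketch of the within-team comparison over time described above: the release names, defect counts, and KLOC figures below are invented for illustration, and the helper names are hypothetical. Only the 10% goal comes from the post.

```python
def defect_density(defects: int, kloc: float) -> float:
    """Return defects per KLOC for one release of one team's code."""
    if kloc <= 0:
        raise ValueError("kloc must be positive")
    return defects / kloc


def improvement(previous_dd: float, current_dd: float) -> float:
    """Fractional reduction in defect density; positive means improvement."""
    return (previous_dd - current_dd) / previous_dd


# Hypothetical data for a single team: (release, defects, KLOC).
releases = [("R1", 120, 80.0), ("R2", 102, 85.0)]

dd_by_release = {name: defect_density(d, k) for name, d, k in releases}
change = improvement(dd_by_release["R1"], dd_by_release["R2"])

goal = 0.10  # the "(say) 10%" goal from the post
print(f"DD change from R1 to R2: {change:.1%}, goal met: {change >= goal}")
```

    Because both releases come from the same team and the same LOC counting rules, the imprecision of LOC as a size measure largely cancels out in the comparison.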

    Using LOC does not allow us to benchmark against external data from Capers Jones. Backfiring rules are too variable to use for conversion. Read the fine print on benchmarks that use LOC, since both the numerator and denominator are often ill-defined or inconsistent with our DD definitions (which defects count, which files count when measuring LOC).

    Bob Brown