This talk discusses Coverity Scan, which does static analysis of open source projects, and compares its results on open source projects such as Jenkins and freeRADIUS with those of other analyzers such as FindBugs and Clang.
Coverity Scan started as a project funded by the US DHS in 2006, but Coverity had already started scanning the Linux kernel in 2000 and found >500 issues. Since then, 18K defects have been identified, of which 10K have been fixed. The false positive rate is currently below 10% (it must stay below 20% for a tool to be usable at all).
How it is used: upload the code, triage the results (mark false positives), include it in CI. It’s all cloud based. Anybody can create an account and upload a project (you then become its admin), or join an existing project (you need to be approved by its admin). You have to submit a build, i.e. download a tool from Coverity that runs together with your own build. You also need to do modelling: telling the scanner about things it cannot see, e.g. what is done by shared libraries. This eliminates false negatives, which (unlike false positives) you cannot otherwise flag to the tool. False positives are tracked in the database, not in the code, to avoid cluttering the code.
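Modelling is done by writing stub implementations in plain C that use Coverity’s modelling primitives; the model file is compiled by Coverity’s own tooling, not by the normal build. A minimal, hypothetical sketch (the function names below are invented for illustration; `__coverity_panic__` and `__coverity_alloc__` are documented primitives):

```c
/* Hypothetical Coverity model file (a sketch, not a complete model).
 * No headers, no real implementations: only hints for the analyzer. */

/* Tell the analyzer that this project-specific fatal-error helper
 * never returns, so code paths after a failed check are pruned. */
void my_fatal_error(const char *msg)
{
    __coverity_panic__();
}

/* Model an opaque allocator living in a shared library the scanner
 * cannot see into: it returns freshly allocated memory that must
 * later be released, enabling leak detection across the call. */
void *my_lib_alloc(unsigned long size)
{
    return __coverity_alloc__(size);
}
```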
Linux still has many open defects. Python, on the other hand, has been completely cleaned up with Coverity and has zero defects. CI keeps it that way. Coverity also has a Java scanner.
A defect is shown directly in the source code. It typically consists of many steps at different points in the code, so there’s an overview of the steps which links to the places in the code.
The speaker showed a lot of nice examples of the powerful analysis that the tool can do, including detecting copy/paste errors and doing interprocedural analysis (limited to roughly seven call levels; commercial customers can configure this).
Comparison of static analysis tools
Does it find critical defects? What is the false positive rate? Does it give enough information to fix things? Is it accurate? Can I integrate it in the workflow (CI)? Can I persist false positive markings?
To make these comparisons, they actually still use Coverity, because its web front-end also allows integrating the results of other analysis tools.
Comparison for Java: Coverity vs FindBugs on Jenkins
Coverity found 196 issues, FindBugs found 627, 28 were overlapping. However, Coverity found many more critical issues; FindBugs mainly finds coding style issues.
Comparison for C: Coverity vs Clang analyzer on freeRADIUS
freeRADIUS has already been using Clang’s analyzer for a couple of years. In 2011 both tools found roughly 100 issues each, with only 3 overlapping. Again, the issues found by Coverity were more critical. The comparison was repeated two years later: 42 of the Coverity defects had been fixed, but only 10 of the Clang defects, probably because the issues found by Clang were not considered as important.
Quote by freeRADIUS developer Alan DeKok: “[Coverity] helped to make us better programmers”. Static analysis cannot check correctness, so it doesn’t replace testing, but it does help you write better code.
Comparison between open source and proprietary code
Coverity scans a lot of proprietary code, so it can compare it with open source. Conclusions: for both, defect density is going down and the bar for quality is being raised. The speaker draws some conclusions about project size, but to me it looks like there is just not enough data. It does seem clear, however, that for open source projects quality can only be maintained if there is budget. On average, though, there really isn’t a difference between open source and proprietary projects.