Measuring Software’s Security With Contests
Wednesday, September 18, 11:55-12:45pm - Presentation

Historically, capture the flag (CTF) was a game of mock warfare, with teams pitting their skill and stealth against each other to capture small fabric flags concealed in each other's territories. The CTF terminology was later appropriated for competitions held by computer security professionals, which pit teams or individuals against each other in contests of computer security. Current contests select for system administration and software vulnerability detection skills. However, we observe that contests can also select for individuals who excel at creating secure systems, and that there is overlap between individuals with the skills to create secure software and individuals with the skills to identify vulnerabilities in software. The results of secure coding contests can inform the general debate about the effectiveness (or ineffectiveness) of various coding practices with respect to security.

Current contests: The contest that defined computer security's version of CTF is held at the annual DEFCON hacker conference in Las Vegas. At the DEFCON CTF competition, teams compete to attack other teams' systems while defending their own. Each team is given an identical system that runs many different custom applications, such as an e-mail client or a web server. The contest organizers prepare these applications so that they purposefully contain vulnerabilities. Each team is responsible for identifying the vulnerabilities, then mitigating them in its own system while exploiting them in the systems of other teams. The "flag" takes the form of a "key" that the organizers place on each system. Compromising a system enables a team to copy the key and thus capture the flag. In the DEFCON CTF model, contestants have dual red and blue team roles: they must both defend their own systems and penetrate the systems of the other contestants.
In contrast, in some competitions, such as the Collegiate Cyber Defense Challenge (CCDC), contestants play only the role of a blue team, so their responsibilities end at the identification and mitigation of vulnerabilities. The Pwn2Own contest held at the CanSecWest conference focuses on attacks. In particular, it invites a broad field of hackers to demonstrate exploitable vulnerabilities in real-world software, typically web browsers. Those who are able to demonstrate vulnerabilities are rewarded with cash prizes and the gift of the hardware that was hosting the software they exploited, hence the contest name. This contest essentially places bounties on defects in important software, encouraging security researchers to dedicate resources to finding those defects well in advance of the contest.

We can do better: The field of computer security encompasses more than vulnerability analysis and system administration. Indeed, these skills are arguably an indirect response to a more central problem, which is the unfortunate prevalence of insecure software. The question is: can we use contests to accelerate progress on this more fundamental problem by identifying, among the winners, the tools, techniques, and developers that produce measurably more secure software? We believe the answer is "yes." Our idea takes inspiration from CTF contests like those just described and from programming contests such as Google Code Jam. Programming contests test contestants' ability to create correct and efficient software, while CTF contests test their ability to identify flaws in software. Combining the two, we can create a contest that charges one set of teams with building software that is fast, correct, and secure, and another set of teams with finding defects and vulnerabilities in the submitted software, to test whether the software is as good as its developers hope.
The outcomes of this contest can teach us which software development methodologies are effective for creating secure software, and which methodologies are effective for identifying (non-artificial) vulnerabilities.

The contest would be held in two phases. In the first phase, contestants are asked to build a software system of moderate complexity, for example a simple web server or a file parser. Contestants can use whatever programming language or technology platform they wish, as long as it can be deployed to the testing infrastructure we will use to judge it. We judge each contestant's submission with a pair of objective tests: does it meet predefined functionality requirements, and does it run within a certain amount of time? Points are awarded for software that meets the functionality requirements and that runs more efficiently than a baseline.
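As a rough illustration of how the phase-one judging might be turned into a score, here is a minimal Python sketch. The abstract does not fix a scoring formula, so everything specific here is an assumption made for illustration: the names PhaseOneResult and phase_one_score, the point values, and the rule that the speed bonus scales linearly with the improvement over the baseline and applies only to fully functional submissions.

```python
from dataclasses import dataclass


@dataclass
class PhaseOneResult:
    """Outcome of running one submission on the judging infrastructure."""
    tests_passed: int       # functionality tests the submission passed
    tests_total: int        # functionality tests in the suite
    runtime_seconds: float  # measured running time on the benchmark


def phase_one_score(result, baseline_seconds,
                    points_per_test=10, max_speed_bonus=50):
    """Score a submission on functionality and efficiency.

    All point values are hypothetical; the source only says that points
    are awarded for meeting the functionality requirements and for
    running more efficiently than a baseline.
    """
    functionality = result.tests_passed * points_per_test

    # Assumption: the speed bonus applies only to submissions that pass
    # every functionality test and beat the baseline running time.
    fully_functional = result.tests_passed == result.tests_total
    if not fully_functional or result.runtime_seconds >= baseline_seconds:
        return functionality

    # Fraction of the baseline running time that was saved (0 to 1).
    speedup = 1 - result.runtime_seconds / baseline_seconds
    return functionality + round(max_speed_bonus * speedup)
```

Under these assumed values, a submission that passes all 10 tests and runs in half the baseline time would score 100 functionality points plus a 25-point speed bonus.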

In the second phase, contestants perform vulnerability analysis on the software submitted during the first phase. Contestants in the second phase may include contestants from the first phase, as well as any other interested parties. In this phase, points are awarded to contestants for identifying vulnerabilities in a piece of software, as demonstrated by submitted test cases or exploits, and points are deducted from a builder's phase-one score for each unique vulnerability discovered in that builder's software. At the conclusion of phase two, the organizers determine winners in the individual categories of software builders and bug finders, plus a best overall winner, with cash prizes awarded to each.

From the performance of each team at each step, we acquire interesting data about the creation of security-critical software. Which technologies result in fewer exploitable vulnerabilities? Which bug-finding strategies have the most success? As we repeat the contest with a different software implementation challenge each time, do we observe commonalities among the technologies or methodologies that tend to present security problems?

I hope that, in presenting these ideas to the NICE audience, we can work through the details toward a viable contest model that ultimately serves to directly identify best practices for secure code development.
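The phase-two bookkeeping described above (points to bug finders, deductions from builders for each unique vulnerability) could be sketched as follows. This is only one possible reading: the function name phase_two_scores, the point values, the judge-assigned vulnerability identifiers used to detect duplicates, and the rule that only the first report of a vulnerability earns credit are all assumptions, not details given in the abstract.

```python
from collections import defaultdict


def phase_two_scores(phase_one, reports,
                     finder_points=20, builder_penalty=15):
    """Apply phase-two results to phase-one scores.

    phase_one: dict mapping builder name -> phase-one score.
    reports:   iterable of (finder, target_builder, vuln_id) tuples in
               submission order; vuln_id is a hypothetical identifier the
               judges assign so that duplicate reports can be recognized.
    Returns (builder_scores, finder_scores).
    """
    builder_scores = dict(phase_one)  # copy so the input is not modified
    finder_scores = defaultdict(int)
    seen = set()                      # (target, vuln_id) pairs already counted

    for finder, target, vuln_id in reports:
        key = (target, vuln_id)
        if key in seen:
            continue  # assumption: duplicate reports earn and cost nothing
        seen.add(key)
        finder_scores[finder] += finder_points
        builder_scores[target] -= builder_penalty

    return builder_scores, dict(finder_scores)
```

Deduplicating on the pair of target and vulnerability identifier is what makes the deduction apply once per unique vulnerability, as the text requires; how the judges would decide that two exploits demonstrate the same vulnerability is left open here.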