Hackers’ self-selection in crowdsourced bug bounty programs∗
A bug bounty program, also called a Vulnerability Research Program (VRP), is a form of crowdsourcing increasingly employed by modern companies to improve their systems’ security. It consists of offering monetary rewards to individuals who find new security flaws in a piece of software or a system.
One of the key challenges in the design of such contests is to attract enough participants while limiting low-quality participations. In this paper, we study how hackers’ perception of the uncertainty of obtaining a reward, determined by the level of information a contest provides about its contractual terms, affects the outcome of the contest both quantitatively (the number of participations) and qualitatively (the quality of participants). Specifically, we examine how a hacker’s choice to participate in a VRP depends on this level of information.
Using an unbalanced panel data set on 156 bug bounty programs run on a well-known bug bounty platform, we find that a more detailed contest policy, and in particular more information about the compensation scheme, attracts a greater number of participants. Conversely, providing less detail induces less participation but attracts higher-performing and more experienced hackers. Hackers self-select into VRPs according to the level of information about the contest’s contractual terms, which leads to a trade-off between inducing more participation and attracting more valuable participants.
Keywords: bug bounty program, vulnerability research program, innovation contest, contract-related incentives, self-selection effect, hackers.
∗ I thank my thesis advisor Marc Bourreau for his support. I also thank Maya Bacache, Christine Zulehner, and the seminar audiences at 3EN 2018, EARIE 2018, DIF Lyon 2018, ZEW summer Workshop 2018, and at the TSE Digital Seminar for their useful comments on a previous version of the paper.
Five years ago, a 17-year-old teenager in India discovered a serious vulnerability in several airline booking systems that allowed him to obtain free plane tickets. Despite his efforts to reach the airline companies and alert them about the flaw, only one company took him seriously and reacted.1 This is less true nowadays: an increasing number of companies seek to collaborate with benign vulnerability identifiers – the so-called white-hat hackers – to improve their systems’ security. Methodical approaches have been developed to work with independent researchers, such as offering incentives in the form of a “bounty”. Organizations started running Vulnerability Research Programs (VRPs)
– also commonly called Bug Bounty Programs – which give monetary compensation to crowdsourced contributors in exchange for information about vulnerabilities. Alongside them, web platforms called “bug bounty platforms” have emerged, hosting and managing these VRPs as third parties (e.g., HackerOne, BugCrowd, Yeswehack).
Launching a VRP has become a modern way to improve software and systems’ security and is becoming accepted as a normal part of the software development lifecycle. However, managing a successful VRP is not an easy task. One of the key challenges companies face is to strike the right balance between attracting enough participants and setting sufficiently high standards for participation in order to limit the proportion of low-value participations.2 Indeed, a VRP intends to benefit from the diversity of participants, so it is important to allow a large pool of individuals to participate. Yet each participation induces a cost, as it requires dedicated resources to sort relevant participations from invalid ones and to communicate with participants.
One method applied by bug bounty platforms to reduce the rate of invalid participations is to allow only individuals who have been sufficiently effective in the past to participate.3 Unfortunately, one can also lose potentially valuable participations by applying such restrictive policies (Zhao, Laszka, and Grossklags, 2017). Apart from setting such a minimum quality standard, it is the contest’s policy – i.e., the terms and rules of the contest – that may shape the outcome of the contest.
The policy of a crowdsourced contest such as a VRP is comparable to an employment contract, as it defines the contractual relationship between the contest owner and participants, in particular by specifying the compensation scheme the contest offers to participants and what it expects in exchange. The relationship between workers’ performance and the attributes of an employment contract has been extensively studied by economists. The literature distinguishes at least two types of effects. First, a compensation scheme can affect a worker’s performance – for instance, through a pay-for-performance scheme – by inducing a certain level of effort from the worker, which may be referred to as an effort effect. Second, it can have a sorting effect on personal attributes. For example, a number of empirical works show that more productive workers systematically prefer variable-pay to fixed-pay schemes (e.g., Dohmen and Falk, 2011). The possibility that agents with different individual characteristics are attracted by different pay schemes and therefore self-select into particular forms of contracts may be referred to as a self-selection effect.
In the same way as an employment contract, the characteristics of a contest’s policy may affect both the level of effort provided by participants and the type of individuals who choose to participate in the contest. In this paper, we focus on the second aspect, that is, how the attributes of a VRP affect an individual’s decision to participate. In particular, we are interested in how the completeness of the contract offered by a VRP affects a hacker’s choice to participate. By completeness of the contract, we mean how much information the VRP’s policy provides about the compensation scheme and about what it expects as an outcome.
Crowdsourced innovation contests like VRPs have several important specificities that make it difficult to derive the answer to our question directly from the case of a standard employment contract. First, the number of participants – the number of brains working on a problem – is an important factor in the effectiveness of a crowdsourced innovation contest (Terwiesch and Xu, 2008; Boudreau, Lacetera, and Lakhani, 2011). A contest policy should thus be applicable and attractive to a large panel of individuals. At the same time, it has to be precise enough to provide adequate incentives and make the research process efficient. Second, in VRPs, participants are asked to find new security flaws that were previously unknown. That is, they are asked to find an innovative way to penetrate a system rather than to carry out predefined tasks, as is the case on standard crowdsourcing platforms (e.g., Amazon Mechanical Turk). Certain types of individuals may be better qualified to innovate: solvers with deep knowledge and experience in the problem domain could be the most effective, or, on the contrary, technical and social marginality could be an advantage in successfully solving the problem (Jeppesen and Lakhani, 2010). All in all, the success of an innovation contest largely depends on sorting and attracting the right kind of solvers. Third, a VRP offers only variable pay: participants are compensated only if their contributions are relevant enough, regardless of the effort they have actually exerted. Variable-pay schemes are likely to attract more productive workers than fixed-pay schemes, but they also have a sorting effect on other attributes, such as relative self-assessment or risk preference (Dohmen and Falk, 2011), that might alter the effectiveness of a contest.
The purpose of this study is to examine how the completeness of the contract proposed by a VRP affects its effectiveness. As the goal of a bug bounty program is to find as many relevant vulnerabilities as possible (i.e., it is interested in maximizing the sum across all outcomes, as opposed to a one-prize contest in which the goal is to maximize the value of the highest outcome), we examine the effect both on the quantity (the number of participations) and on the quality of the participants.
Our analysis is based on publicly available data from the web platform HackerOne. We use an unbalanced panel data set on 156 bug bounty programs run from January 2015 to February 2019. VRPs on the platform we analyze are free to choose the level of information they provide about their contractual terms through a written policy. They can provide more or less detail about their compensation scheme, from fully specifying the payouts for each task to keeping a large degree of discretion over the monetary rewards and the targeted scope. They can also modify their policy over time. We consider that the level of information provided in the written policy reflects the degree of completeness of the contract the contest offers. We find that the more precise and detailed the policy is, the more participants it attracts. However, it also attracts participants with more heterogeneous performance and reduces the average quality of participants. Conversely, leaving more uncertainty about the monetary rewards and the targeted scope attracts fewer and more homogeneous participants, but of higher average quality.
This paper proceeds as follows. Section 2.2 reviews the relevant literature, Section 2.3 develops our hypotheses, and Section 2.4 describes the data and estimation strategies. Section 2.5 reports the results, and Section 2.6 concludes.
Our paper is closely related to three streams of research. The first is the economics of information security. This literature aims at studying the potential market failures causing information systems’ insecurity. Vulnerability discovery is one of the topics at the heart of this field. We contribute empirically to this literature, first by studying a marketplace – a market for software vulnerabilities, where hackers sell vulnerability information to software vendors and companies – that has barely been studied so far because of its novelty, and second by studying how the design of a bug bounty program affects the contributions of individual researchers.
Among the few papers that focus on VRPs, Finifter, Akhawe, and Wagner (2013) analyze two programs run by big pioneers in the vulnerability research community (Google and Mozilla) and examine whether running a VRP is economically profitable for a firm. Zhao et al. (2017) develop an analytical framework that compares different policies aimed at reducing the number of invalid reports.4 To our knowledge, there are only two empirical studies that analyze data from bug bounty platforms.5 Zhao, Grossklags, and Liu (2015) compare the currently biggest bug bounty platform, HackerOne (run by a US-based company), to Wooyun, a well-known Chinese bug disclosure platform. They compare the trends in discovered vulnerabilities on the two platforms, the different reward structures VRPs offer, and how offering monetary incentives attracts more participants to a VRP. Maillart, Zhao, Grossklags, and Chuang (2017) also use data from HackerOne and show that the number of participations in a VRP declines considerably over its duration and that hackers strategically switch to new programs as they become available.
Along with these two papers, ours is among the few that provide an empirical analysis of bug bounty platforms. Beyond the principal characteristics of VRPs already examined by existing papers – such as the effects of monetary incentives or the decreasing probability of finding new vulnerabilities – we identify an important mechanism that affects the effectiveness of a VRP. We are the first to focus on how the amount of information a VRP provides about its contractual terms affects its effectiveness. We liken a VRP’s policy to an employment contract proposed by a firm to workers and study how the perception of the uncertainty of obtaining a reward affects a worker’s choice to participate in the contest. Our data set is also unique, in that it is a recent and large panel data set on VRPs run by diverse types of organizations while managed on a single platform. As we have a panel data set, we are able to account for different fixed effects and robustly identify the effect attributable to a change in a VRP’s policy. Moreover, we use both quantitative (the number of participations) and qualitative (participants’ average quality and variance) information that defines the outcome of a contest.
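To fix ideas, the panel identification logic described above can be sketched as a two-way fixed-effects regression. The notation here is purely illustrative (the outcome $y_{it}$, the information measure $\mathit{Info}_{it}$, and the controls $x_{it}$ are our own placeholders, not the paper’s actual specification):

$$
y_{it} \;=\; \beta \,\mathit{Info}_{it} \;+\; \gamma' x_{it} \;+\; \mu_i \;+\; \tau_t \;+\; \varepsilon_{it},
$$

where $y_{it}$ is an outcome of program $i$ in period $t$ (e.g., the number of participations, or the average quality of participants), $\mathit{Info}_{it}$ measures the level of information in the written policy, $\mu_i$ is a program fixed effect, and $\tau_t$ is a time fixed effect. Because programs can revise their policies over time, $\beta$ is identified from within-program variation in $\mathit{Info}_{it}$, net of time-invariant program characteristics and common time shocks.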
Our paper is also related to the literature on innovation contests and tournaments. VRPs are a type of contest in which the organizer commits to rewarding participants according to the rules and terms it has defined, and participants spend resources in order to win the rewards. For each new security flaw, only the first finder is rewarded. A VRP is thus close to an innovation contest in that its goal is to find a new idea – an innovative way to penetrate the system and secure it. Economists have studied the optimal design of contests from various angles, mainly how to allocate the prizes (Archak and Sundararajan, 2009; Liu, Yang, Adamic, and Chen, 2014) and whether free entry or a restricted number of participants yields better outcomes (Terwiesch and Xu, 2008; Boudreau et al., 2011).
Like Boudreau et al. (2011), we are interested in how the degree of uncertainty faced by participants affects the outcome of the contest. However, our scope and approach differ from theirs in several respects. First, we study a type of contest increasingly used with the rise of crowdsourcing but barely studied so far. As mentioned earlier, in contests like VRPs, the goal is to maximize the sum of the outcomes, while “traditional” innovation contests like those launched on the well-known web platform InnoCentive aim at selecting a single best solution.6 Second, in Boudreau et al. (2011), the degree of uncertainty is measured by the number of problem domains on which a given solution draws. They focus on the fact that participants exert less effort when they face more uncertainty about solving the problem. In our case, the uncertainty comes from the level of information provided by the contest, and we are interested in how the preference for uncertainty attracts a given type of participant. Lastly, the results we obtain differ from the findings of Boudreau et al. (2011). In our work, we find that uncertainty attracts higher-performing participants, which has a positive effect on the overall outcome, while in their case, it is the larger number of participants that compensates for the reduced individual effort caused by problem uncertainty.
We also build our analysis on results from the rich body of literature on employment contracts. Specifically, we are interested in the incentive schemes firms use to attract specific types of workers, namely the self-selection effect as defined in Salop and Salop (1976) or Chow (1983). Analytical works show that individuals with higher skills are more likely than low-skill workers to choose a performance-based pay scheme (e.g., Salop and Salop, 1976; Demski and Feltham, 1978; Lazear, 2000b; Jensen, 2003). The basic idea is that a worker evaluates the match between his self-perceived personal attributes and the perceived attributes of available employment contracts and selects the contract that maximizes his expected utility. This theory is supported by a number of empirical papers. Most of them are laboratory experiments (Chow, 1983; Waller and Chow, 1985; Cadsby, Song, and Tapon, 2007; Eriksson and Villeval, 2008; Dohmen and Falk, 2011), apart from a field experiment by Fehrenbacher, Kaplan, and Pedell (2017) and the studies of Lazear (2000a,b) based on a large data set on an auto glass company’s workforce.
The originality of our work also comes from the fact that we investigate the mechanism of a crowdsourcing contest by drawing on research on standard employment relationships. In particular, our focus is on participants’ self-selection, while studies on contests are more concerned with the relationship between the design of the contest and the effort exerted by participants.
To our knowledge, Eriksson and Villeval (2008) is the only paper that has studied the self-selection effect in the context of tournaments. In a laboratory experiment, they show that when workers are allowed to choose between a performance-pay scheme and a tournament, there is a considerable reduction in the variance of effort among contestants in the tournament. They suggest that this is because subjects self-select their payment scheme according to their degree of risk aversion. We rely on their findings and examine whether their arguments can be applied to a more general case, in which the degree of uncertainty perceived by participants varies with the level of information the contest provides about its contractual terms. Our work confirms the findings of Eriksson and Villeval (2008) – which are based on a laboratory experiment with 120 students – in a more robust way, using a “natural setting” of 156 contests involving 184 participants on average. Additionally, most studies of the self-selection effect, including Eriksson and Villeval (2008), compare variable-pay schemes to “less variable” pay schemes. Our approach is unique in that the uncertainty of being compensated is captured through several measures of the completeness of a contract: the amount of information provided in the policy, and whether the VRP gives detailed information about the rewards and about the scope. This allows for greater granularity in the degree of uncertainty a contract represents.
Table of contents:
0.1 The economics of information security
0.2 Some barriers to overcome
0.3 The market for software vulnerabilities, platforms and open innovation
0.4 Presentation of the three chapters
1 The effect of competition intensity on software security
1.2 Literature review
1.3 The web browser and its revenue model
1.4 A model of competition in security quality
1.5 Empirical specification
1.6 Data and method
1.7 Estimation results
2 Hackers’ self-selection in crowdsourced bug bounty programs
2.2 Related literature
2.3 Hypothesis development
2.4 Data and empirical framework
3 Software vulnerability disclosure and third parties involvement
3.2 Related work
3.3 Data and empirical strategy
3.5 Interpretation and conclusion