Empirical analysis of two labor market policies

Anonymous applications

While unemployment insurance and job search counseling are well-established and widespread policies (accounting for 1.4% and 1% of GDP in France, respectively), anonymous applications have only recently arrived on the public policy stage. Anonymous resumes are used in Belgium in the federal administration. They have been tested locally in the Netherlands, Sweden and Switzerland, but the tests have not led to new legislation. In Great Britain, an attempt by the Liberal Democrats to impose anonymous resumes for recruitment was opposed by the government in 2009. In France, which we study here, a law passed in 2006 made the use of anonymous resumes mandatory for firms with more than 50 employees; yet the government did not take the steps needed to define the conditions under which the law would apply.
Anonymous applications aim at reducing pervasive differential treatment on the labor market. Differential treatment by race and gender is repeatedly documented as a prominent feature of many labor markets. Correspondence testing studies, first developed mainly in the UK, have been instrumental in providing direct and compelling evidence of differential treatment in interview rates. A notable example for race in the US is Bertrand and Mullainathan (2004). Correspondence studies have been increasingly used in France, the country under study here: see in particular Duguet, L'Horty, Meurs, and Petit (2010), introducing the special issue of Annals of Economics and Statistics, no. 99/100, on measuring discrimination. The exact interpretation of differential treatment in terms of discrimination is still debated. In this thesis, we abstract from this important normative question and focus, in the context of a randomized experiment, on the actual effectiveness of anonymous applications in reducing differential treatment in access to interviews and hires. More precisely, we consider differential treatment between women and men, and between majority workers and workers with a foreign background or living in deprived neighborhoods.
Given the correspondence testing evidence, there is a consensus on the likely effectiveness of anonymous applications in changing the pool of candidates called for an interview. However, effects on later stages of the hiring process, when candidates are actually interviewed, are controversial. The theory of statistical discrimination suggests that changing the information set of employers at the beginning of the hiring process may have a strong impact on final decisions, if skilled minority applicants who are the victims of the negative signal attached to their group are able to overcome that signal at the interview stage by demonstrating their credentials. Similarly, anonymous resumes may be effective against taste discrimination if meeting the applicant induces recruiters to overcome their prejudiced views against a group, or simply to give these views less weight once they know the individual characteristics of the applicant better. Clearly, little is known about whether these conditions hold in practice. Two non-experimental exceptions are Goldin and Rouse (2000) and Aslund and Nordstrom Skans (2007). They both find that anonymous procedures in the first stage of the hiring process increase the relative chances of women being interviewed and hired.
In the first chapter, we provide experimental evidence on the impact of anonymous resumes on the first and later stages of the hiring process. The experiment was implemented in 2010-2011 by the French public employment service (PES) to help the French government decide on the enforcement conditions of the law passed in 2006. It was felt that a randomized experiment was needed to provide simple and transparent evidence. Firms posting job offers at the PES were asked to participate in an experiment in which they would have a one-in-two chance of receiving anonymous resumes preselected by the PES agents, rather than standard ones. The experiment involved about 1,000 firms in eight local labor markets, and lasted 10 months. Although the experiment was initially designed to mimic the situation that would prevail if the law were finally enforced, compromises had to be struck given the government’s reluctance to impose anything on firms at that stage. In particular, participation in the experiment was not mandatory. This is however the first time, to our knowledge, that experimental evidence has been provided on the effectiveness of anonymous referral procedures in a large labor market, rather than on the mere existence of differential treatment.
The main findings are the following. First, and as expected, women do benefit from higher callback rates under the anonymous resume procedure; however, the effect is somewhat limited by the extent of labor market segmentation. Indeed, half of the job offers in the experimental sample have only male applicants, or only female applicants. Second, and much less expectedly, applicants with a foreign background or residing in deprived neighborhoods experience a decrease in their relative chances of being interviewed, as compared to the reference group. Third, we find evidence that anonymous resumes counter homophily in the hiring process: they undo the tendency of female recruiters to select female applicants, and of male recruiters to select male applicants. Interestingly, this effect persists at later stages of the hiring process, so that anonymous resumes in effect equalize the chances of applicants of both genders being interviewed and finally hired, irrespective of the gender of the recruiter. Last, we do not find any evidence that the anonymous procedure increases either the firms’ direct hiring costs or the opportunity cost of vacancies.
We also document the representativeness of the sample of firms entering the experiment. Although differences in terms of observable characteristics seem minor, there are indications that firms that accepted the experiment were initially rather favorable to applicants with a foreign background or residing in deprived neighborhoods. This limits the external validity of the experiment. It also provides a plausible interpretation of the counter-intuitive impact of anonymous resumes on that group: these self-selected firms may practice “reverse discrimination” (possibly motivated by many reasons, including the possibility of paying minority workers lower wages), and anonymous resumes may prevent that practice. Interestingly, there is no evidence of selection at entry along the lines of gender differential treatment. Although less visible in the French debate, gender differential treatment may well be a relevant target for anonymous resumes.

Unemployment insurance

In contrast to anonymous resumes, the impact of unemployment insurance (UI) generosity is documented by a large body of empirical evidence in the economic literature. Putting aside insurance provision, this literature mostly focuses on impacts on labor market transitions from unemployment to employment. When unemployment benefits are more generous, reservation wages may increase and/or search effort may be lower. This leads to a decrease in the rate of exit from unemployment to jobs. In his seminal work, Meyer (1990) identifies the effect of UI generosity in the US through variations across states. Since the adoption of more generous UI is potentially endogenous at the state level, Card and Levine (2000) propose to focus on exogenous variations in UI generosity due to targeted, unanticipated policy changes.
Using the same identification method, positive effects of potential benefit duration (PBD) on unemployment duration are found in European countries, such as Germany (Hunt (1995)), Austria (Winter-Ebmer (1998), Lalive and Zweimüller (2004), Lalive, van Ours, and Zweimüller (2006)), Poland (Puhani (2000b)), Slovenia (van Ours and Vodopivec (2006)), Finland (Kyyrä and Ollikainen (2008)) and Portugal (Addison and Portugal (2008)). Other authors rely on discontinuities in the UI system to identify the effects. Those discontinuities are usually age thresholds, as in Lalive (2008) and Caliendo, Tatsiramos, and Uhlendorff (2009). One exception is Card, Chetty, and Weber (2007), who use discontinuities based on past employment thresholds.
At the same time, unemployment benefits may affect match quality, as they encourage the unemployed to wait for higher productivity jobs (see Marimon and Zilibotti (1997) and Acemoglu and Shimer (2000)). Effects on match quality are far less documented (see the review in Addison and Blackburn (2000)). Using a structural model, Belzil (2001) finds that increasing the PBD by one week leads to an increase in subsequent employment duration of 0.5 to 0.8 days. Jurajda (2002) and Tatsiramos (2009) compare benefit recipients to ineligible unemployed persons and find large positive effects of eligibility on employment duration. Centeno (2004) estimates that a 10% increase in unemployment insurance generosity translates into a 3% increase in subsequent job tenure. More recent studies have focused on identifying causal effects through difference-in-differences methods (van Ours and Vodopivec (2008)) or regression discontinuity methods (Card, Chetty, and Weber (2007), Lalive (2007), Centeno and Novo (2009), Caliendo, Tatsiramos, and Uhlendorff (2009)). They do not find any average effect of PBD on subsequent wages or on employment duration. However, Centeno and Novo (2009) and Caliendo, Tatsiramos, and Uhlendorff (2009) document heterogeneity in the effect. Centeno and Novo (2009) find that the more constrained unemployed experience an increase of 3 to 8% in their earnings when PBD increases by 6 months. Caliendo, Tatsiramos, and Uhlendorff (2009) find that unemployed persons who find jobs just before their unemployment benefits run out accept less stable jobs than comparable unemployed persons who benefit from longer entitlement.
The second chapter of this thesis provides evidence that effects on match quality are indeed very limited in the French case. Compared to previous studies, this evidence is all the stronger because it concerns workers who are marginally attached to the labor market. Those workers are likely to benefit the most from extended unemployment benefits (UB). Their marginal attachment suggests that they typically lack productive or job search skills that they could acquire with extended UB. They are also likely to be financially constrained, so that extended UB would greatly change the value they attach to unemployment.
Our evidence is also all the stronger because we estimate the effect of a large increase in UB generosity. In a regression discontinuity design (RDD) similar to that of Card, Chetty, and Weber (2007) (see the second section of this introduction for an explanation of the method), we estimate the impact of an increase from 7 to 15 months in potential benefit duration (PBD). In the French unemployment insurance system, when workers work more than 8 months over the year before their job separation, they are entitled to 8 more months of UB: their PBD is more than doubled.
The absence of a match quality effect is all the more compelling because the extension of PBD actually slows down exits from unemployment to work. Unemployed persons with extended PBD wait longer before taking a job (roughly 2.5 months). Yet they do not find better jobs.
Our result is robust to different measures of match quality: the employment duration and the hourly wage of the first job after unemployment exit. We complement those two standard indicators with the wage two years after unemployment entry. This enables us to compare short and long PBD recipients at the same horizon, whatever the effect of PBD on unemployment duration.
The effect on exits from unemployment to employment starts early in the unemployment spell, even when both short and extended PBD recipients are still receiving benefits. This points to forward-looking behavior. However, the effect is somewhat stronger between 7 and 15 months after unemployment entry, when short potential benefits have expired but extended benefits are still paid.

Methodological issues for labor market policy evaluation

Our evaluation exercises adopt the conceptual framework of the treatment effect literature (Rubin (1978)). Treatment evaluation consists in comparing the situation of treated individuals to what their situation would have been had they not been treated. The fundamental problem of evaluation is a missing data problem: this counterfactual situation is not observed at the individual level. Moreover, for most real treatments, eligibility conditions and take-up behavior make the treated and untreated groups fundamentally different, such that untreated individuals are not a suitable control group. LaLonde (1986) documents the extent of this bias in the context of training evaluation. To interpret differences between treated and untreated groups as the causal effect of the treatment, randomness must somehow select groups in and out of the treatment. This condition guarantees ex ante statistical comparability between groups.
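The argument of the preceding paragraph can be sketched in potential-outcome notation. The symbols are assumptions introduced here for illustration: $D_i$ indicates treatment, $Y_i(1)$ and $Y_i(0)$ are the outcomes of individual $i$ with and without treatment, and the observed outcome is $Y_i = D_i Y_i(1) + (1 - D_i) Y_i(0)$.

```latex
\begin{align*}
\underbrace{E[Y_i \mid D_i = 1] - E[Y_i \mid D_i = 0]}_{\text{observed difference}}
  &= \underbrace{E[Y_i(1) - Y_i(0) \mid D_i = 1]}_{\text{average effect on the treated}} \\
  &\quad + \underbrace{E[Y_i(0) \mid D_i = 1] - E[Y_i(0) \mid D_i = 0]}_{\text{selection bias}}
\end{align*}
```

The selection bias term is the component documented by LaLonde (1986). When assignment is random, $D_i$ is independent of $Y_i(0)$, so this term is zero and the observed difference identifies the average effect on the treated.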
Randomness can be intentionally generated by researchers to conduct the evaluation exercise (see Duflo, Glennerster, and Kremer (2008)). Intentional randomness, or randomization, is usually thought of as the ideal policy evaluation experiment. Indeed, we rely on randomized experiments to evaluate the impact of anonymous applications and job search assistance.
However, randomization is not feasible in every context. For example, French law and culture make it difficult to randomize the generosity of unemployment benefits. To evaluate the extension of unemployment benefits, we thus rely on local randomness generated by discontinuous eligibility rules. Depending on past employment duration, the unemployed are entitled to different potential benefit durations. If they worked between 6 and 8 months over the preceding year, they may receive benefits for 7 months; if they worked more than 8 months, they are entitled to 15 months of benefits. Thus crossing the 8-month threshold in past employment duration leads to an increase in UB generosity. Potential benefit duration, the treatment, is a discontinuous function of past employment duration, which is called the forcing variable in Regression Discontinuity Design (RDD) terminology. The key assumption of the RDD evaluation exercise is the absence of manipulation of the forcing variable. In other words, workers are assumed not to sort around the 8-month threshold in past employment duration. This can be a strong assumption. Because they may gain 8 months of UB, workers have strong incentives to work a few more days. Workers reacting to this incentive would accumulate just above the threshold and thus have a higher propensity to receive generous UB. This phenomenon contaminates the comparison between treated and untreated individuals, as manipulating workers are also likely to react strongly to UB generosity. Fortunately, some statistical tests can detect manipulation and gauge the plausibility of the randomness assumption, also called the exogeneity assumption. In chapter 2, we apply an RDD and do not find evidence of manipulation.
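The logic of the design can be illustrated on simulated data. The following is a minimal sketch, not the estimator used in chapter 2: the data-generating process, the bandwidths, the size of the simulated jump and all function names are assumptions made for illustration. It fits a separate line on each side of the 8-month threshold and reads the effect off the gap between the two lines at the threshold, then runs a crude check for bunching of the forcing variable.

```python
import random

CUTOFF = 8.0  # months of past employment at which PBD jumps (from the text)


def ols_line(xs, ys):
    """Return (intercept, slope) of a least-squares line fitted to (xs, ys)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return mean_y - slope * mean_x, slope


def rdd_estimate(forcing, outcome, cutoff=CUTOFF, bandwidth=1.0):
    """Local linear RDD: gap between the two fitted lines at the cutoff."""
    pairs = list(zip(forcing, outcome))
    # Center the forcing variable so each intercept is the fit at the cutoff.
    left = [(x - cutoff, y) for x, y in pairs if cutoff - bandwidth <= x < cutoff]
    right = [(x - cutoff, y) for x, y in pairs if cutoff <= x <= cutoff + bandwidth]
    intercept_left, _ = ols_line(*zip(*left))
    intercept_right, _ = ols_line(*zip(*right))
    return intercept_right - intercept_left


def density_ratio(forcing, cutoff=CUTOFF, width=0.25):
    """Crude manipulation check: observations just above vs just below."""
    below = sum(1 for x in forcing if cutoff - width <= x < cutoff)
    above = sum(1 for x in forcing if cutoff <= x < cutoff + width)
    return above / below


def simulate(n=20000, true_jump=2.0, seed=0):
    """Hypothetical data: no sorting, outcome jumps by true_jump at the cutoff."""
    rng = random.Random(seed)
    forcing = [rng.uniform(6.0, 10.0) for _ in range(n)]
    outcome = [
        10.0 + 0.5 * (x - CUTOFF) + (true_jump if x >= CUTOFF else 0.0)
        + rng.gauss(0.0, 3.0)
        for x in forcing
    ]
    return forcing, outcome


forcing, outcome = simulate()
jump = rdd_estimate(forcing, outcome)   # should be close to the simulated jump
ratio = density_ratio(forcing)          # should be close to 1 without sorting
print(f"estimated jump: {jump:.2f}, density ratio: {ratio:.2f}")
```

In this simulation workers do not sort, so the density ratio stays near one. If workers bunched just above the threshold, the ratio would rise well above one, which is the pattern formal manipulation tests look for.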
Even when feasible, randomized experiments may be subject to several caveats. In this thesis, we devote specific attention to three of them:
• John Henry effects (or placebo effects)
• differential sample attrition
• equilibrium effects
Placebo effects are discussed in the evaluation of anonymous applications in the first chapter. The other two caveats are the main focus of the third and fourth chapters, respectively.
Among these caveats, placebo effects are specific to randomized experiments, while differential sample attrition and equilibrium effects are common criticisms of microeconometric evaluation in general.

Placebo effects

When individuals are aware that they are part of an experiment, whether treated or not, they may adopt specific behavior that they would not have outside of an experiment. Motivation effects can arise when individuals know that they are observed. For example, because they are in an experiment, individuals may exert more effort than usual. Another experimental effect could be that individuals alter their behavior to conform to their preconceived views of the treatment effect. If some individuals think that the treatment is ineffective, they may provide less effort when treated than when untreated, thus attenuating treatment effect estimates.
More fundamentally, being in a treatment group is usually perceived positively. It is seen as an opportunity, whereas being in the control group is seen as a lost opportunity and thus carries a negative connotation. Such perceptions may bias treatment effect estimates upwards.
Experimental effects are usually divided into two categories (see Duflo, Glennerster, and Kremer (2008)). When the behavior of the treatment group is altered by the experiment, they are called Hawthorne effects. When it is the behavior of the control group that is altered, they are called John Henry effects. In the anonymous application experiment, John Henry effects are an issue. They are all the more of a concern because they could explain the counterintuitive effect on the interview gap between applicants with foreign and native backgrounds. The control group is composed of employers who agreed to receive anonymous resumes but received nominative applications. We could suspect that they exert special effort not to discriminate, because they know that their practices are observed and that discrimination is prohibited by French law. To test for this placebo effect, the control group was observed after the end of the experiment. Applicants to post-experiment job offers posted by control employers were surveyed, and no difference in interview or hiring gaps was detected between the experiment period and the period after it. To control for confounding factors, employers outside the experiment were also observed, during the experiment and afterwards. Difference-in-differences estimates also reject placebo effects.
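The difference-in-differences comparison described above can be written compactly. The notation is an assumption introduced here: $g^{C}_{t}$ denotes the interview (or hiring) gap among applicants to control employers in period $t$, and $g^{O}_{t}$ the same gap among employers outside the experiment.

```latex
\[
\hat{\delta} \;=\; \bigl(g^{C}_{\text{during}} - g^{C}_{\text{after}}\bigr)
            \;-\; \bigl(g^{O}_{\text{during}} - g^{O}_{\text{after}}\bigr)
\]
```

The first difference captures any change in the behavior of control employers once they are no longer observed; the second nets out changes over time that affect all employers. A John Henry effect would show up as $\hat{\delta}$ being significantly different from zero.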

Differential sample attrition

Sample attrition is a pervasive issue for surveys in the social sciences. The damage appears particularly clearly in randomized experiments or quasi-experiments: while randomness in assignment to treatment creates a treatment group and a control group that are comparable, in the presence of sample attrition the observed treatment and control groups may no longer be comparable, threatening the validity of the evaluation. A variety of tools have been developed to correct for sample selection over the past decades, starting with the seminal papers by Heckman (1976 and 1979) and becoming less and less parametric, up to the “worst-case”, assumption-free approach developed by Horowitz and Manski (1995, 1998 and 2000).
The main purpose of the third chapter is to propose another approach to correcting for sample selection, at the crossroads of semi-parametric forms of the “Heckit” and of the bounding approach of Lee (2009b). The main advantage of our approach is that it yields informative set identification without requiring an instrument, by making the most of quite basic information on the number of attempts that were made to obtain a response to the survey from each individual who responded. When the number of attempts can be assumed continuous, we show that our approach even yields point identification. Our approach assumes that the distribution of the number of attempts gives information about the ranking of individuals in terms of reluctance to respond and that this ranking is the same whether the individual is treated or not. This assumption boils down to writing the response behavior as a latent threshold-crossing model with a latent variable independent of the treatment. It can also be restated in terms of monotonicity and compared to Lee’s assumption. Our assumption being stronger and our information set being larger, our bounds are tighter than Lee’s bounds.
We then apply our sample selection correction in the context of the Behaghel, Crepon, and Gurgand (2012) job search experiment, which can be viewed as a within-study comparison showing that sample selection can matter in practice. In the context of that job search experiment, exhaustive administrative data are available, but a phone survey yields richer information, albeit with low and unbalanced response rates. Using the administrative information, we show that selection into the phone survey is not as good as random, as it is correlated with potential outcomes. Moreover, point estimates suggest that the phone survey over-estimates the program’s impact by about 50%. Applying the sample selection correction procedure closes most of the gap between the estimates in the full and in the selected samples. Bounds à la Horowitz and Manski (2000) or Lee (2009) are, in this application, too wide to be very conclusive.
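For reference, the trimming bounds of Lee mentioned above can be sketched in a few lines. This is an illustration of Lee's bounds only, not of the correction proposed in the third chapter, which additionally uses the number of survey attempts. The function name and the toy numbers are hypothetical, and the sketch assumes that the response rate is higher in the treated group and that treatment can only induce response (monotonicity).

```python
def mean(values):
    return sum(values) / len(values)


def lee_bounds(treated_outcomes, n_treated, control_outcomes, n_control):
    """Trimming bounds on the treatment effect among always-respondents.

    treated_outcomes / control_outcomes: outcomes of survey respondents only.
    n_treated / n_control: numbers assigned to each arm, respondents or not.
    """
    rate_t = len(treated_outcomes) / n_treated
    rate_c = len(control_outcomes) / n_control
    if rate_t < rate_c:
        raise ValueError("this sketch assumes a higher response rate among the treated")
    # Share of treated respondents who would not have responded if untreated.
    trim_share = (rate_t - rate_c) / rate_t
    k = int(round(trim_share * len(treated_outcomes)))
    ordered = sorted(treated_outcomes)
    keep = len(ordered) - k
    control_mean = mean(control_outcomes)
    lower = mean(ordered[:keep]) - control_mean  # drop the k highest outcomes
    upper = mean(ordered[k:]) - control_mean     # drop the k lowest outcomes
    return lower, upper


# Toy data (hypothetical): 8 of 10 treated respond, 6 of 10 controls respond.
lower, upper = lee_bounds([1, 2, 3, 4, 5, 6, 7, 8], 10, [2, 3, 4, 5, 6, 7], 10)
print(lower, upper)
```

The width of the interval grows with the gap in response rates between the two arms, which is why such bounds can be uninformative when response rates are low and unbalanced, as in the phone survey discussed above.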

Equilibrium effects

The external validity of empirical results may be compromised in randomized experiments, or in any microeconometric evaluation, because of equilibrium effects. Even if treated and untreated individuals are ex ante identical, the presence of treated individuals may affect the untreated individuals, contradicting the so-called ‘no-interference’ (Rubin (1978)) or ‘stable unit treatment value’ (Angrist, Imbens, and Rubin (1996)) assumption. The difference in outcomes between treated and untreated individuals then measures the effect on the treated net of that on the untreated. When the proportion of treated individuals changes, the difference in outcomes is also likely to change. As a consequence, one particular microeconometric estimate is unlikely to be valid when the treatment is implemented in another environment at a different scale. To obtain a valid estimate, one solution is to change the scale of randomization such that randomized units do not interfere with one another. This is usually a hard task when evaluating labor market policies, as this solution means finding numerous segmented labor markets. Another approach is to try to estimate the extent of the equilibrium effects, as in Crepon, Duflo, Gurgand, Rathelot, and Zamora (2012), by varying the proportion of treated individuals in different labor markets.
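A simple way to see the point is to let potential outcomes depend on the share of treated individuals. The notation is an assumption introduced here: $Y(d, \pi)$ is the outcome of an individual with treatment status $d$ when a share $\pi$ of the labor market is treated, and $Y(0, 0)$ is the outcome when nobody is treated.

```latex
\[
\Delta(\pi) \;=\; E[Y(1,\pi)] - E[Y(0,\pi)]
\;=\; \underbrace{E[Y(1,\pi)] - E[Y(0,0)]}_{\text{effect on the treated}}
\;-\; \underbrace{\bigl(E[Y(0,\pi)] - E[Y(0,0)]\bigr)}_{\text{effect on the untreated}}
\]
```

A randomized comparison within one labor market identifies $\Delta(\pi)$ only for the share $\pi$ at which the experiment is run. Under the no-interference assumption the second term vanishes; otherwise both terms vary with $\pi$, which is why designs that vary the proportion treated across labor markets are informative.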
The issue of equilibrium effects, which is discussed in a broader perspective in the survey of Meghir (2006), is particularly relevant to the evaluation of labor supply based policies (such as increasing incentives or monitoring the unemployed). First, they generally aim at increasing the overall number of filled jobs, which depends on the interactions between aggregate labor supply and labor demand. Second, these policies may induce displacement effects: treated persons may crowd out the untreated because they compete for the same jobs.
Although they have long been recognized, these questions have received limited attention to date. Davidson and Woodbury (1993) and Calmfors (1994) are early contributions. More recently, Lise, Seitz, and Smith (2005) study the equilibrium effects of the Self-Sufficiency Project incentive program in Canada. They calibrate an equilibrium model of the labor market so that, when used in partial equilibrium, the model matches the effect of the program estimated by direct comparison of treated and untreated individuals. When equilibrium effects are simulated, the impact of the Self-Sufficiency Project is far lower. In contrast, Albrecht, van den Berg, and Vroman (2009) find, using a calibrated model, that the equilibrium effects of a Swedish training program are stronger than implied by direct comparison. Using a job search and matching model with skilled and unskilled workers, Van der Linden (2005) shows that micro and equilibrium evaluations are likely to differ widely when job search effort and wages are endogenous. When wages are bargained over, raising the effectiveness of, or the access to, counseling programs pushes wages upwards and leads to lower search effort among nonparticipants. Induced effects can outweigh positive micro effects on low-skilled employment when the response of wages is taken into account.
Equilibrium effects have also been analyzed in empirical evaluations that do not rely on structural models. For instance, Blundell, Costa Dias, Meghir, and Van Reenen (2004) evaluate the New Deal for Young People in the U.K. This program was piloted in certain areas before it was rolled out nationwide. Moreover, the program has age-specific eligibility rules. Blundell, Costa Dias, Meghir, and Van Reenen use these area- and age-based eligibility criteria, which vary across individuals with identical unemployment durations, to identify the program effects. They find that equilibrium wage and displacement effects either are not very strong or broadly cancel each other out.
The aim of the fourth chapter is to analyze the impact of counseling in the standard matching model of the labor market (Pissarides, 2000). In our specification, the counseled unemployed have a constant comparative advantage in job search. This is consistent with most empirical results on the evaluation of Job Search Assistance (JSA) (see the meta-analyses in Kluve (2010) and Card, Kluve, and Weber (2010)). According to Kluve (2010), out of 21 microeconometric evaluations of JSA in European countries, 15 find positive and significant short-term effects on transitions out of unemployment and 6 find insignificant effects. Using this simple matching model allows us to analyze the consequences of counseling in a dynamic set-up. More precisely, we shed some light on three important issues:
(i) What is the true impact of the policy when equilibrium effects are taken into account? The model shows that the true impact of counseling can be very different from what can be concluded when equilibrium effects are neglected, even when the treatment group is small. For instance, we find that counseling can increase unemployment when a small proportion of job seekers benefit from counseling, although counseling improves the efficiency of job search. Equilibrium effects rely on the adjustment of wages. The impact of policies on wages has been analyzed in some papers devoted to equilibrium effects of several labor market policies and education policies, in particular since the seminal contribution of Heckman, Lochner, and Taber (1998). Our model allows us to analyze precisely the reaction of wages to counseling, as in the paper of Van der Linden (2005).
(ii) What is the impact of the generalization of the policy to a large treatment group? The model shows that there is no simple answer. In particular, the relation between the impact of the policy on unemployment and the size of the treatment group is not necessarily monotonic. Strikingly, in our framework, unemployment increases with the size of the treatment group when a small share of job seekers are treated but diminishes with the size of the treatment group when a sufficiently large share of job seekers are counseled.
(iii) What is the dynamic impact of counseling? Many experiments conducted to evaluate labor market policies are transitory. Typically, a group of job seekers is selected to benefit from counseling (the treatment group) and the control group will never benefit from counseling. The comparison between the outcomes yields the evaluation of the impact of counseling. Our model allows us to stress that the consequences of permanent and transitory policies can be very different. The difference comes from the reaction of non-counseled job seekers. When the policy is transitory, non-counseled workers do not expect to benefit from counseling in the future. However, when the policy is permanent, the expectation of benefiting from counseling in the future induces the non-counseled workers to raise their reservation wage. In our framework, this phenomenon implies that permanent counseling increases unemployment when a small share of job seekers are counseled, whereas counseling always decreases unemployment when it is transitory. Accordingly, it can be misleading to conclude that a truly successful transitory policy will remain successful when it becomes permanent.

