The DADS-EDP panel
The data set used in this paper is the "DADS-EDP", a panel provided by the INSEE. It combines two sources: the DADS panel merged with the EDP (Échantillon Démographique Permanent, or Permanent Demographic Sample). The DADS is an administrative data set with information on wages, the type of employment contract, the employment sector, the size of the firm the person works in, the start and end dates of the period of paid work, the number of paid hours, etc. Each year, firms have to file a declaration for each of their employees. Every working person in France is covered by these declarations, except for employees of government bodies, self-employed people and employees of French firms established abroad.
Civil servants working in public institutions of an industrial and commercial nature are included in the DADS (since 1991 and 1992), as are publicly employed hospital staff (since 1984), civil servants of territorial communities (since 1988), unemployment benefit recipients (since 2002) and agricultural workers (since 2003). There exist different versions of the DADS, for instance a version that includes every civil servant. We have access to the panel version of the DADS (from 1976 to 2010), which is a 1/25th sample of the DADS until 2001 (it covers employees born in October of an even year). Since 2002, the sample size has been doubled.
The EDP is a panel with information on marital status, fertility, degrees obtained and the place of residence. From 1967 to 2004, people born from the 1st to the 4th of October are in the EDP. Since 2004, the data set has been enriched with individuals born from the 2nd to the 5th of January as well as those born between the 1st and 4th of April and July. The data are gathered from civil registries each year, and also from the census whenever it is needed to complete the information from the civil registries. For example, for people born between 1989 and 1997, the information on their children comes exclusively from the census. This is also the case for most people born between 1982 and 1989. Before 2004, only people living in continental France were in the EDP. Since 2004, people living in the French overseas territories are also included.
The merged DADS-EDP panel, to which we have access through the CASD, is composed of individuals born on EDP days who are also in the DADS panel. Therefore, the data set does not include civil servants of national public services, men or women who have never worked, or self-employed people. French nationals born abroad are not included in the data.
Before the data could be used for analysis, substantial preparation and cleaning were required. The raw data are in spell format, which is not well suited to analysis. We briefly describe how we transformed the data into a panel. The same operations were applied to both women and men, so this subsection only describes the steps for women.
First of all, we had to order the births of the children. According to the data documentation, the variable aeni gives the birth year of the i-th child. In practice, however, the year of birth of the oldest child was stored in aen2 for a woman who had two children, but in aen3 for a woman who had three. Something as simple as computing the mean age of childbearing by cohort for the first child was thus impossible to do with a single line of code. The first step was therefore to order the births so that the first (and oldest) child was in aen1, the second child born to a woman in aen2, and so on. The second step was to remove obvious errors; for example, some women were recorded as born after their children. We removed such lines completely. We filled in incomplete data where possible: for example, if the starting date of a contract and the duration of payment were both available but the ending date of the contract was not, it was easy to deduce the ending date. We then created two variables: the age in years of each individual (current year minus year of birth) and the number of children an individual had in a given year. Until then, we only had the total number of children someone ever had.
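The ordering and error-removal steps described above can be sketched as follows. This is a minimal illustration in Python with pandas, not the code used for the thesis. The column name `birth_year`, the restriction to three `aen` columns and the toy data are assumptions made for the example.

```python
import numpy as np
import pandas as pd

# Assumed layout: aen1..aen3 hold children's birth years in arbitrary order,
# birth_year is the mother's own year of birth (hypothetical column name).
CHILD_COLS = ["aen1", "aen2", "aen3"]

def order_births(df, cols=CHILD_COLS):
    """Sort each row so that the oldest child ends up in the first column."""
    out = df.copy()
    years = out[cols].to_numpy(dtype=float)
    # np.sort pushes NaN (empty child slots) to the end of each row.
    out[cols] = np.sort(years, axis=1)
    return out

def drop_impossible_births(df, cols=CHILD_COLS, birth_col="birth_year"):
    """Drop rows where the mother is recorded as born after one of her children."""
    born_after_child = df[cols].lt(df[birth_col], axis=0).any(axis=1)
    return df.loc[~born_after_child].reset_index(drop=True)

def children_in_year(df, year, cols=CHILD_COLS):
    """Number of children already born by the end of a given year."""
    return df[cols].le(year).sum(axis=1)

# Toy data (invented for illustration only).
women = pd.DataFrame({
    "id": [1, 2, 3],
    "birth_year": [1970, 1975, 1990],
    "aen1": [2001.0, 2003.0, 1985.0],
    "aen2": [1998.0, 2001.0, np.nan],
    "aen3": [np.nan, 1999.0, np.nan],
})

cleaned = drop_impossible_births(order_births(women))
cleaned["n_children_2000"] = children_in_year(cleaned, 2000)
```

Once the births are ordered, the birth year of the first child is always in `aen1`, so statistics such as the mean age at first birth reduce to a single expression on that column.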
One of the first important steps was to create a variable that counted the number of days a person stayed out of the labour force due to having a child. The starting and ending days of work contracts are reported, so it was possible for us to compute this variable. For women for whom no date of exit from or entry into a firm is indicated around the birth, we deduced the length of maternity leave from the reduction in hours worked or wages relative to the hours worked and hourly wage of the year preceding the birth. The variable "duration of maternity leave" is useful for comparing the impact of the birth of a child on women who had short maternity leaves (less than 6 months after the birth) with the impact on mothers who took longer leaves. We refer here to maternity leave as the period of exit from the labour market following the birth of a child. Within the framework of this article, maternity leave therefore includes both statutory maternity leave and parental leave.
We removed the parallel spells by defining the main employment for each woman and for each year. If a woman had more than one job in a given year, we only kept the one in which she worked the most days. If there were two or more jobs in which she had worked an equal number of days, we kept the job with the highest wage. For each job, we also had the hours worked, the number of days worked and the wages. We summed each of these variables over all jobs and then removed every spell other than the main one. After these operations, a line in our data set gives, for a given year, a woman's total number of days worked, her total earned wages and only her main job.
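The selection of the main job and the aggregation over parallel spells can be sketched as follows. The column names (`person_id`, `year`, `firm_id`, `days`, `hours`, `wage`) and the toy spells are hypothetical and chosen for illustration. The sketch follows the rules stated above (most days worked, ties broken by the highest wage) but is not the original code.

```python
import pandas as pd

KEYS = ["person_id", "year"]

def collapse_to_main_job(spells):
    """Return one row per person-year: summed days, hours, wages and the main job."""
    # Main job: most days worked; ties are broken by the highest wage.
    main = (
        spells.sort_values(KEYS + ["days", "wage"],
                           ascending=[True, True, False, False])
        .drop_duplicates(subset=KEYS, keep="first")[KEYS + ["firm_id"]]
        .rename(columns={"firm_id": "main_firm_id"})
    )
    # Totals over all spells of the year, including the non-main ones.
    totals = spells.groupby(KEYS, as_index=False)[["days", "hours", "wage"]].sum()
    return totals.merge(main, on=KEYS, how="left")

# Toy spells (invented): person 2 has two jobs with the same number of days.
spells = pd.DataFrame({
    "person_id": [1, 1, 2, 2],
    "year": [2002, 2002, 2002, 2002],
    "firm_id": ["A", "B", "C", "D"],
    "days": [200, 100, 180, 180],
    "hours": [1400, 500, 900, 1000],
    "wage": [18000, 6000, 9000, 12000],
})

panel = collapse_to_main_job(spells)
```

In this toy example person 1 keeps firm A (most days), and person 2 keeps firm D (equal days, higher wage), while days, hours and wages are summed over both jobs.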
We also created a variable giving the attained education level in five categories: no information on education, less than high school education, completed high school (or similar degree), 2-3 years of higher education (university or similar institution) and 4-8 years of higher education.
Finally we created hourly wages by dividing the annual wage by the number of hours worked in the year.
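As a small illustration of this last step, the hourly wage can be computed from the yearly totals as below. The column names `wage` and `hours` are assumptions. Treating years with zero reported hours as missing is also an assumption of this sketch, since the text does not say how such cases were handled.

```python
import pandas as pd

def add_hourly_wage(panel):
    """Add hourly_wage = annual wage / annual hours, missing when hours are not positive."""
    out = panel.copy()
    # Non-positive hours become NaN so that the division does not produce inf.
    hours = out["hours"].where(out["hours"] > 0)
    out["hourly_wage"] = out["wage"] / hours
    return out

# Toy yearly totals (invented for illustration).
yearly = pd.DataFrame({"wage": [24000, 9000], "hours": [1600, 0]})
with_wage = add_hourly_wage(yearly)
```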
To identify the impact of the birth of a child on hourly wages and the number of hours worked, we use a standard DiD approach, as in Duguet et al. (2015). The following description is fairly standard and can be skipped by readers familiar with DiD.
Card and Krueger (1994) use the DiD method to study the impact of a rise in the minimum wage in New Jersey that occurred on the 1st of April 1992. DiD works by considering two groups, a control group and a treatment group. In experiments, creating such groups is relatively easy: it suffices to assign the treatment (for example, taking a certain medication) randomly. The people who were randomly selected to get the treatment thus become the treated group.
One famous example of randomly assigned treatment in economics is the Vietnam draft lotteries. Angrist (1990) studies the impact of these lotteries on the wages of men and shows that “[…] as much as ten years after their discharge from service, white veterans who served at the close of the Vietnam era earned substantially less than nonveterans” (p. 330). In most cases, however, treatment is not assigned randomly in economics.
In cases where random treatment is not possible, quasi-experimental methods such as DiD have been shown to provide consistent estimates of the parameters of interest. The problem econometricians face is that they do not observe what the outcome variable (for example, hourly wages) would have been for the control group had it been treated, because people may have self-selected into the treatment group. Let us illustrate how DiD works. Let y […] wrong or unreported. As our study focuses on births of children in 2002-2003, we will present descriptive statistics only for the most recent cohorts.
Control and treatment groups
To analyse the impact of the birth of a child on hours worked and on the hourly wage of a woman or a man, we run a difference-in-differences estimation. This spares us from specifying a functional form for the wage equation and also deals with unobserved heterogeneity. It is thus important to discuss how we constructed the control and treatment groups. We only kept individuals who had a birth in either 2002 or 2003. We assume that having a child in 2002 is very similar to having one in 2003, so that pooling the individuals is not a problem. We selected both years in order to increase the number of observations for our analysis. We then observed the hourly wages and hours worked of these individuals two years before treatment. This counts as our first observation. We observed them again two, four and six years after treatment.
The reader might wonder why we chose the years 2002 and 2003 as the treatment years. The first reason is that we wanted to use the most recent data at our disposal. The second reason is the requirement that nothing but the treatment should influence the variables of interest. If we focus on the 2000s, our options are actually very limited. In 2000, the socialist government then in power in France introduced the 35-hour workweek. Before this reform, the normal workweek in France was 39 hours long. We had to be careful not to observe individuals both before and after this reform, or else we would have attributed to the birth of a child a decrease in hours that was due to the reform. Another reform, the complément de libre choix d'activité (CLCA), was implemented in 2004. The CLCA is a lump sum, which depends on family resources, given to parents who completely stop or reduce their labour supply to take care of a child younger than 3 years old. For their first child, parents received the CLCA for 6 months; from their second child onwards, they received it until the third birthday of the youngest child.
Before this reform, only parents of two children or more could receive such an allowance. Here again, we had to be careful, as this reform could have changed the behaviour of people. Since we also wanted to see the labour supply and hourly wages of treated individuals six years after treatment, we could not have chosen the year 2005 (or later) as a treatment year, as our data is available until 2010.
To create our control and treatment groups, we separated the individuals by education level. However, having five categories for the education variable reduced the number of individuals within the groups too much. We therefore pooled individuals into two categories: having at most a high school degree (or similar) or having more than a high school degree (such as any university degree). We also made sure that people in the groups would not have another child before 2011. We separated the treatment and control groups further into women (men) who stayed out of the labour force for less than 6 months, for more than 6 months but less than 2 and a half years, and women (men) who stayed out of the labour force for more than 6 months but less than 6 years. This allows us to distinguish the impact for people who stay out of the labour force for different lengths of time.
Then, to analyse the impact of having a first child, we proceeded in a similar manner to Duguet et al. (2015): the control group is composed of women without children. For the impact of a second child, the control group is made up of women with one child and for the impact of a third birth, the control group is composed of women with two children.
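To make the comparison concrete, the sketch below computes a simple difference-in-differences of group means for each post-treatment horizon, using the observation two years before treatment as the baseline. It is an illustration under stated assumptions, not necessarily the estimator used in this chapter. The columns `treated` (1 for parents with a birth in 2002-2003, 0 for the control group) and `rel_year` (years relative to the treatment year, also assigned to controls) are hypothetical, and the toy numbers are invented.

```python
import pandas as pd

def did_by_horizon(panel, outcome, pre=-2, horizons=(2, 4, 6)):
    """DiD of means: (treated_h - treated_pre) - (control_h - control_pre)."""
    means = panel.groupby(["treated", "rel_year"])[outcome].mean()
    estimates = {}
    for h in horizons:
        change_treated = means.loc[(1, h)] - means.loc[(1, pre)]
        change_control = means.loc[(0, h)] - means.loc[(0, pre)]
        estimates[h] = change_treated - change_control
    return pd.Series(estimates, name=f"did_{outcome}")

# Toy panel (invented): yearly hours worked, one row per group and period.
toy = pd.DataFrame({
    "treated": [1, 1, 1, 1, 0, 0, 0, 0],
    "rel_year": [-2, 2, 4, 6, -2, 2, 4, 6],
    "hours": [1600, 1300, 1400, 1500, 1600, 1580, 1590, 1600],
})

effects = did_by_horizon(toy, "hours")
```

With these invented numbers, the treated group works 280 fewer hours than the control group's trend would predict two years after the birth, 190 fewer after four years and 100 fewer after six. In practice the function would be applied separately within each education and leave-duration group, and separately for first, second and third births with the corresponding control group.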
Table of contents:
1.1 Short description of the different chapters
1.2 Access to data for empirical studies in economics
1.2.1 The LIS data
1.2.2 The DADS-EDP data
1.2.3 Version control to facilitate research
1.2.4 Questions that this thesis aims to answer
2 Childbearing differences between French and German women
2.1 Introduction and literature review
2.2 The LIS data: descriptive statistics and visualizations
2.3 Results from the hierarchical model
2.3.1 The effect of household income on the number of children
2.3.2 The effect of unemployment growth
2.3.3 Other results
2.4 Discussion and conclusion
3 The birth of a child and its impact on wages and worked hours by education level of the mothers and fathers: evidence from France
3.2 Data and econometric methodology
3.2.1 The DADS-EDP panel
3.2.2 Data preparation
3.2.3 Econometric methodology
3.3 Summary statistics and graphs
3.4 Control and treatment groups
3.5 Results and discussion
3.7 Appendix 1: data cleaning
3.7.1 Appendix 2: Common trend graphs
Impact of the first child
Impact of the second child
Impact of the third child
4 The time and the transitions back to work in France after maternity
4.2 Institutional setting
4.3 Data and summary statistics
4.3.1 The DADS-EDP panel
4.3.2 Control variables
4.3.3 Descriptive statistics
4.4 Econometric analysis
4.5 Results and discussion
5 Version control systems to facilitate research collaboration in economics
5.2 Scope of Git
5.3 Basic usage of Git