Presentation of the region
The thesis work concerned only Northeast France because a very large data set was available for this area and because forests cover a large part of the region (47.8 % of the Vosges administrative department is forested). Indeed, the Vosges mountains have been investigated many times by phytoecologists, and integrating all sources of information into the database produced the best data set ever assembled for Northeast France (more than 1,000 sites with complete floristic and ecological descriptions).
The region is bounded to the east and north by the French national borders. After comparing the map of available data with a geological map of France at the same scale (1/1,000,000), the western and southern boundaries were set to fixed co-ordinates: respectively 779,000 meters and 2,281,000 meters for longitude and latitude in the Lambert II co-ordinate system (an easy-to-use system that is homogeneous over the whole of metropolitan France). Two maps are included at the beginning of the report to show the geographic extent of the forest sites considered against an administrative background (with the main cities), as well as the topographic features of the study region.
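As a minimal sketch, the two fixed limits can be expressed as a simple filter on site co-ordinates. The function name and the example points are hypothetical, the comparison is assumed to be inclusive, and the national borders that close the region to the east and north are not modelled.

```python
# Fixed western and southern limits quoted in the text (meters).
WEST_LIMIT_X = 779_000
SOUTH_LIMIT_Y = 2_281_000


def in_study_region(x, y):
    """True when a point lies east of the western limit and north of the southern one."""
    return x >= WEST_LIMIT_X and y >= SOUTH_LIMIT_Y


# Hypothetical co-ordinates, for illustration only.
example_points = {
    "A": (850_000, 2_400_000),  # inside both limits
    "B": (700_000, 2_400_000),  # too far west
    "C": (850_000, 2_200_000),  # too far south
}
kept = [name for name, (x, y) in example_points.items() if in_study_region(x, y)]
```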
The distribution of the forest sites among the administrative regions and departments of Northeast France is shown in Table 1.
Presentation of the database
The concept of forest sites
The general principle behind the expression “forest sites” is to divide the existing forested areas, or areas that are likely to be forested, into homogeneous units called sites.
According to the definition unanimously accepted today, “a site is a piece of land of variable surface, homogeneous in its physical and biological conditions (mesoclimate, topography, soil, floristic composition and vegetation structure). A forest site justifies that, for a given species, a specific sylvicultural method may be applied, which can be expected to result in a productivity bound within known limits” (Delpech et al., 1985).
Sources of data
The catalogue of forest sites
The results of a typology study are compiled in a document commonly referred to as a “catalogue of forest sites”. This document is made available mainly to the forest managers concerned with the area, but anyone involved with forests, such as researchers or teachers, can find useful information in this synthesis, which presents detailed information about all identified sites: physical character, characteristic vegetation, spatial distribution, forest potential, sensitivity to various stresses… (Becker, 1996). It is estimated today that almost two thirds of the French forest territory is covered by a site catalogue, and work is ongoing. The concept of forest sites applies to regional areas of moderate size (50 to 2,000 km²).
As far as Northeast France is concerned, 16 catalogues were used, representing 663 sites out of the total 1,239 (53.5 %). Table 2 summarises the different sources of information used in the study.
Other sources of data
Besides the catalogues of forest sites, other sources of data that presented a complete description of the flora and an analysis of the first soil horizon were used: M.Sc. and Ph.D. dissertations represented 415 forest sites (33.5 %), phytoecological surveys 82 sites (6.6 %) and data from forest site networks 79 sites (6.4 %).
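The shares quoted for the four source types can be checked with a few lines of arithmetic. The counts below are those given in the text; only the dictionary keys and variable names are introduced for illustration.

```python
# Number of forest sites per source type, as quoted in the text.
sources = {
    "catalogues of forest sites": 663,
    "M.Sc. and Ph.D. dissertations": 415,
    "phytoecological surveys": 82,
    "forest site networks": 79,
}

total = sum(sources.values())

# Share of each source in percent, rounded to one decimal as in the text.
shares = {name: round(100 * count / total, 1) for name, count in sources.items()}
```

The four counts add up to the 1,239 sites of the study, and the rounded shares match the percentages reported above.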
The data set prepared during the thesis work is unique because, for 1,033 forest sites, a complete floristic relevé is available together with the corresponding description and chemical analysis of the soil. It is the first time in France that this kind of data is presented in a digital database.
Aims and extent of the database (adapted from Gégout et al., 2000)
The phytoecological database was created by the French Institute of Forestry, Agricultural and Environmental Engineering (ENGREF) in order to classify, structure and enable the use of data from various sources (forest site catalogues, M.Sc. or Ph.D. dissertations, forest site networks such as the European or RENECOFOR networks). The database contains information about ecology, pedology, climate and flora, and summarises many documents that would be difficult to exploit on their own. Its construction is justified by the fact that no such database existed for France: until now, only purely floristic (SOPHY) or pedologic (DONESOL) databases had been developed. The phytoecological database is being constructed in a French version. It is composed of complete floristic and ecological relevés carried out on many forest sites. Field ecological variables, which are not sufficient to explain the distribution of plant species completely, are supplemented where available by laboratory soil analyses (edaphic data) and by meteorological or mesoclimatic data derived with a geographic information system (GIS). Climatic variables that could not be obtained in the field by the authors, but were determined through the use of GIS, represent an original and new type of information.
In total, 3,500 to 4,000 relevés with soil analyses will be integrated into the database, of which 1,239 concern Northeast France and were selected for investigation in the thesis work. The construction of the database is financed by ENGREF, the French Ministry of Agriculture and Fisheries (DERF) and the French Agency for Environment and Energy Management (ADEME), and is carried out at ENGREF, Nancy, France (Gégout et al., 2000).
Structure of the database (adapted from Gégout et al., 2000)
The structure of the database is presented on the following page.
The numerous fields to be filled in are distributed across 19 tables in order to describe the phytoecological relevés, as well as the studies or projects they are derived from. The table « Observations » is crucial because an observation corresponds to a relevé carried out by one or more authors on a specific site and at a given time. The author(s) is (are) listed in the table « Observation’s Authors ».
Several observations are made for a specific study, and the fields that characterise it are found in the table « Studies ». The table « Studies’ Authors » gives information on the different authors of studies, projects, reports, dissertations… and the table « Authors » lists all the authors of all the studies and observations. The table « Study Financing » indicates the organisation(s) that take(s) part in financing the survey, and the table « Organisations » presents all the organisations that have financed one or more studies in the database.
The table « Forest Sites » characterises the location and the permanent ecological conditions, as well as the mean climatic conditions of the sites.
The table « Profiles » indicates the characteristic variables of the soil. It is separated from the table « Forest Sites » because some relevés do not present any soil description. The table « Mineral Horizons » describes each mineral or organo-mineral horizon and the table « Organic Horizons » each holorganic one. The table « Profiles Analyses » indicates methods and analytical characteristics common to all horizons from a given profile. The table « Horizons Analyses » presents the results of laboratory analyses of the different horizons.
Vegetal species sensitive to Global Change in Northeast France MScEE Thesis Work
The table « Vegetation » describes the plant community observed on a forest site. The table « List of Taxa » indicates the abundance/dominance of each species present at this location and is linked to the table « Usual Taxa », where all the species encountered in the different relevés of the database are listed. This table presents plant names in French or in Latin and is linked to the table « CIFF Codes », which contains all the current names and synonyms of all species of the French flora. It is finally linked to the table « French Names », which will contain all current French names and synonyms of French forest species (Gégout et al., 2000).
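To make the relational layout more concrete, the following is a minimal sketch of five of the 19 tables using Python's built-in sqlite3 module. The table names follow the text, but every column name, the choice of keys and the link between « List of Taxa » and « Observations » are assumptions made for illustration; the actual schema is not given in this section.

```python
import sqlite3

# Table names follow the text; all column names and keys are hypothetical.
SCHEMA = """
CREATE TABLE studies      (study_id INTEGER PRIMARY KEY, title TEXT);
CREATE TABLE forest_sites (site_id INTEGER PRIMARY KEY, x REAL, y REAL, altitude REAL);
CREATE TABLE observations (
    observation_id INTEGER PRIMARY KEY,
    study_id INTEGER REFERENCES studies(study_id),
    site_id  INTEGER REFERENCES forest_sites(site_id),
    obs_date TEXT
);
CREATE TABLE usual_taxa   (taxon_id INTEGER PRIMARY KEY, latin_name TEXT);
CREATE TABLE list_of_taxa (
    observation_id INTEGER REFERENCES observations(observation_id),
    taxon_id       INTEGER REFERENCES usual_taxa(taxon_id),
    abundance_dominance TEXT,
    PRIMARY KEY (observation_id, taxon_id)
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)

# Toy rows, for illustration only (not taken from the database).
conn.execute("INSERT INTO studies VALUES (1, 'example study')")
conn.execute("INSERT INTO forest_sites VALUES (1, 850000, 2400000, 400)")
conn.execute("INSERT INTO observations VALUES (1, 1, 1, '1990-06-01')")
conn.execute("INSERT INTO usual_taxa VALUES (1, 'Fagus sylvatica')")
conn.execute("INSERT INTO list_of_taxa VALUES (1, 1, '3')")

# Species recorded in one observation, with their abundance/dominance coefficient.
rows = conn.execute(
    """SELECT t.latin_name, l.abundance_dominance
       FROM list_of_taxa AS l JOIN usual_taxa AS t USING (taxon_id)
       WHERE l.observation_id = ?""",
    (1,),
).fetchall()
```

Keeping the species list in its own table, keyed by observation and taxon, is one common way to store a relevé without needing one column per species.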
Quality of the data entered in the database (adapted from Gégout et al., 2001)
A characteristic feature of the database is that it has been built up over a long period and that many different people have taken part in its construction. However, the rules and procedures for handling, verifying and harmonising the data have been summarised in an internal document (Gégout and Dossier, 2001). The data had to be verified by a person (here the author) other than the one who entered them into the database, and verification concerns every fifth relevé for floristic description and soil analyses. If the proportion of erroneous data is too high, all the data from the study concerned are checked. Harmonisation concerns the translation of French names into Latin names for floristic description, with reference to CIFF, the computer-based code of French flora (Brisse, 1994), and, for soil analyses, the expression of each variable in the same units (Gégout et al., 2000).
An error term for the whole database could be computed, based on the 1,239 relevés from Northeast France and a systematic check of every fifteenth relevé (concerning 86 complete relevés, see Table 3). An error term specific to each relevant component of the ecosystem was also computed and found to range between 0.2 and 1.7 %. As far as flora is concerned, the presence or absence of 1 species out of 500 is wrong (major mistakes), and the rate is higher if coefficients of abundance/dominance are checked (minor mistakes). Moreover, for the edaphic or site-specific data used in the computations, 1 value out of 200 is wrong. Finally, greater error rates (1 to 2 %) may be observed in the description of the soil horizons (such as the sampling depth) but those data are not used to characterise plant ecology.
When the focus is on the proportion of relevés that contain at least one mistake, values are of course much greater (53 down to 45 %, see Table 3) because the database contains numerous fields. Besides, every fifth down to every twentieth relevé contains at least one mistake in an important field (most of these are, however, minor mistakes). Since the checked data have been corrected, the rates could be computed again, giving a value of 0.2 to 0.5 % for the data further used in the computations of plant-environment relationships (Gégout et al., 2001).
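The gap between a small per-value error rate and a large share of relevés with at least one mistake can be illustrated with a short calculation. This is only a sketch under a simplifying assumption that is not made in the text, namely that errors occur independently from one field to the next; the numbers of fields used in the loop are hypothetical.

```python
def share_with_error(p_field, n_fields):
    """Probability that a relevé with n_fields independent fields has at least one wrong value."""
    return 1.0 - (1.0 - p_field) ** n_fields


# Per-value error rates quoted in the text.
flora_rate = 1 / 500     # presence/absence of a species
edaphic_rate = 1 / 200   # edaphic or site-specific value

# Hypothetical numbers of fields per relevé, for illustration only.
for n_fields in (10, 50, 100, 200):
    pct = 100 * share_with_error(edaphic_rate, n_fields)
    print(f"{n_fields:>4} fields: {pct:.1f} % of relevés with at least one error")
```

Under this assumption the share grows quickly with the number of fields, which is consistent with the remark that the values are much greater because the database contains numerous fields.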
Presentation of the variables
Plant ecology has two main facets: a descriptive one and an explanatory one. The descriptive facet concerns the listing of plant species in their environment (identification, quantification and description of all necessary information), and the explanatory facet deals with measuring the relation between vegetation and environment. Priority is given to plants because their presence reflects the kind of environment that suits them best, so they can be used as ecological measuring instruments (ecological calibration with plants). It is of crucial importance to determine the ecological behaviour of each species, but only a few research teams have started to compile such catalogues in France (De Ruffray et al., 2001).
The establishment of the phytoecological database by ENGREF is thus relevant for understanding plant species distribution and sensitivity to ecological factors. Dependent variables are floristic data (presence/absence or coefficients of abundance/dominance of plant species) and independent variables are ecological data that describe and characterise the local environment (mesoclimate, topography, geology, mineral characteristics, pedology…).
In the following, the derivation of the main working tables is presented. Only 1,033 relevés out of the 1,239 were selected for further computation because they presented both a floristic description and soil analyses of the first horizon.
Dependent variables concern all the species encountered in any of the 1,239 relevés. It was decided not to distinguish between vegetation layers for woody species: it would have been possible to distinguish between juvenile and adult behaviour, but the interest was in the behaviour of each species and in determining species-specific indicator values. Similarly, woody plants that were present only in the herb layer were rejected because they did not show signs of growth or adaptation to the environment.
Two main floristic tables could thus be constructed, with the forest sites in rows and the plant species in columns: a table of presence/absence and a table of abundance/dominance. In both cases, the value 0 was set at the intersection (i,j) when species j was absent from forest site i. In the presence/absence table, the value 1 was set when species j was present in forest site i; in the abundance/dominance table, coefficients ranging from + to 5 were attributed according to the Braun-Blanquet (1932) scale:
5 : cover greater than 75 %, any abundance;
4 : cover between 50 and 75 %, any abundance;
3 : cover between 25 and 50 %, any abundance;
2 : important abundance or cover between 5 and 25 %;
1 : relatively important abundance but poor cover;
+ : poor to very poor abundance (very poor cover).
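The construction of the two floristic tables can be sketched as follows. The function name, the input format (site, species, coefficient triples) and the toy records are hypothetical; only the coding rules (0 for absence, 1 for presence, Braun-Blanquet coefficients from + to 5) come from the text.

```python
# Braun-Blanquet coefficients in increasing order, as listed above; 0 marks absence.
BB_SCALE = ["+", "1", "2", "3", "4", "5"]


def build_tables(records, sites, species):
    """Build the presence/absence and abundance/dominance tables.

    records: iterable of (site, species, coefficient) triples.
    Returns two lists of rows, with sites in rows and species in columns.
    """
    row = {name: i for i, name in enumerate(sites)}
    col = {name: j for j, name in enumerate(species)}
    presence = [[0] * len(species) for _ in sites]
    abundance = [[0] * len(species) for _ in sites]
    for site, sp, coef in records:
        if coef not in BB_SCALE:
            raise ValueError(f"unknown coefficient: {coef!r}")
        presence[row[site]][col[sp]] = 1
        abundance[row[site]][col[sp]] = coef
    return presence, abundance


# Toy records, for illustration only.
sites = ["site_1", "site_2"]
species = ["Fagus sylvatica", "Abies alba", "Quercus petraea"]
records = [
    ("site_1", "Fagus sylvatica", "4"),
    ("site_1", "Abies alba", "+"),
    ("site_2", "Quercus petraea", "2"),
]
presence, abundance = build_tables(records, sites, species)
```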
A total of 567 plant species were recorded in Northeast France, but only 234 were selected because they were present in at least 10 relevés (approximately 1 % of the relevés), so that statistics can be derived for each species. The final floristic table was thus composed of 1,033 rows (forest sites) and 234 columns (plant species).
The cut-off seems severe if the number of species is considered, because only 234 species remain out of the 567 initial ones (41.3 %). However, what matters is the total number of recorded presences, whatever the species: the cut-off reduces this number from 19,525 to 18,625, which means that 95.4 % of the floristic information is kept when species present in fewer than 10 relevés are disregarded. The final list of species is presented in Appendix 1, with ecological indicator values found by different phytoecologists.
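The frequency cut-off and the two percentages quoted above can be reproduced with a short sketch. The function name and the toy table are hypothetical; the totals (234 of 567 species, 18,625 of 19,525 presences) are those given in the text.

```python
def filter_species(presence, min_occurrences):
    """Keep the species (columns) present in at least min_occurrences sites.

    presence: list of 0/1 rows (sites in rows, species in columns).
    Returns the kept column indices, the filtered table and the share of
    recorded presences that is retained.
    """
    counts = [sum(col) for col in zip(*presence)]
    kept = [j for j, c in enumerate(counts) if c >= min_occurrences]
    filtered = [[row[j] for j in kept] for row in presence]
    retained = sum(counts[j] for j in kept) / sum(counts)
    return kept, filtered, retained


# Toy table (3 sites x 3 species), for illustration only.
toy = [[1, 1, 0],
       [1, 0, 0],
       [1, 1, 1]]
kept, filtered, retained = filter_species(toy, min_occurrences=2)

# Figures quoted in the text.
species_share = round(100 * 234 / 567, 1)         # share of species kept
presence_share = round(100 * 18_625 / 19_525, 1)  # share of presences kept
```

In the toy table the third species occurs only once and is dropped, while five of the six recorded presences are retained.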
Fagus sylvatica is the most frequent species in Northeast France (present in 656 relevés), followed by Abies alba (440 relevés) and Quercus petraea (373 relevés). The exponential shape of the curve presented in Figure 2 is characteristic of plant species distributions. It was built from Appendix 1 by sorting the species by rank, in decreasing order of number of occurrences (frequency).
Most of the vegetal species presented in this report were described by Rameau et al. (1989 and 1991) and Ferry and Rameau (2000).
Independent variables are investigated with regard to their relation with the presence or absence of each plant species, and should include variables linked to physiological plant growth and variables linked to Global Change. Three main ecological components were selected in order to classify independent variables: climatic, pedologic and light-related. Little information was available on light conditions, however, and computations had to be carried out without this component.
Nevertheless, the relevés were performed in homogeneous light conditions (most of them in closed forests), meaning that light may be considered as a constant factor.
The list of independent variables is presented in Appendix 2a.
As already stated, only 1,033 relevés out of the initial 1,239 were selected because they presented soil analyses of the first horizon.
Two types of variables were used as far as climate is concerned: field-based and GIS-derived variables. The data measured in the field were altitude, topography and aspect. The location of the forest sites was derived from information given by the authors of the studies and from topographic maps of the French National Geographic Institute (IGN).
Geographic information systems were used to convert the location of the forest sites into different co-ordinate systems and to derive some climatic variables. Thus, 12 mean monthly temperature values and a mean annual one (computed over a period of 30 years between 1960 and 1990), as well as 12 mean monthly precipitation values and a mean annual one (computed over the same 30-year period), were available. These data come from the AURELHY meteorological model of France, on a grid with 1-km resolution (Benichou and Le Breton, 1987).
A third category of climatic variables comes from the computation of different indexes (Lebourgeois, 1999) based on the temperature and precipitation data:
Numerous variables linked to pedology were derived from the description of the soil horizons (see Appendix 2a). They concern the type of soil, the appearance of hydromorphy (reducing conditions), the different obstacles encountered in the soil (floors), the type of humus, the presence of coarse fragments and the effervescence of the material, the nature of the bedrock, the compactness and moisture of the different horizons, as well as structure and texture.
Edaphic variables also include the results of soil analyses of the first soil horizon (horizon A). They mainly characterise mineral, nitrogen and phosphorus nutrition and are quantitative measured data. Since different methods of measurement may have been used to derive them, it was decided to separate the data according to those different protocols. The main principles of pedology are described in Duchaufour (1991) and Delecour (1978), and the different protocols of soil sample analysis are described in Baize (1988 and 2000) and Baize and Girard (1995).
Summary statistics were derived for each independent variable and are presented in Appendices 2b and 2c. Qualitative variables (Appendix 2c) were described by the number of occurrences in each class, and quantitative variables (Appendix 2b) by their distribution: minimum, 1st quartile, mean, median, 3rd quartile, maximum, standard deviation and standard error of the mean.
For all variables, the number of relevés for which information is missing is reported.
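The summary statistics listed for quantitative variables can be sketched with Python's standard library. The function name and the toy values are hypothetical, missing values are assumed to be coded as None, and the quartile convention (the "inclusive" method of statistics.quantiles) is an assumption, since the text does not state which one was used.

```python
import math
import statistics


def summarise(values):
    """Summary statistics for one quantitative variable; None marks a missing value."""
    present = sorted(v for v in values if v is not None)
    q1, median, q3 = statistics.quantiles(present, n=4, method="inclusive")
    sd = statistics.stdev(present)  # sample standard deviation
    return {
        "n": len(present),
        "missing": len(values) - len(present),
        "min": present[0],
        "q1": q1,
        "mean": statistics.mean(present),
        "median": median,
        "q3": q3,
        "max": present[-1],
        "sd": sd,
        "sem": sd / math.sqrt(len(present)),  # standard error of the mean
    }


# Toy values, for illustration only.
summary = summarise([4, None, 1, 3, 5, 2])
```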
Presentation of the computation methods
Basic statistics relevant to this work were reviewed in Legendre and Legendre (1999) and StatSoft, Inc. (2001). Four computation methods are presented below, with their respective interest and relevance.
The first analyses performed were principal component analyses, in order to detect the relations between all soil variables and between all climatic variables or indexes. The goal was also to detect redundancy in the predictor variables and to select “master variables”. Correspondence analyses were then computed in order to identify the main ecological factors explaining vegetation distribution, considering all 234 species.
The last two methods assessed the effect of the predictor variables on the probability of presence of each species, with a non-parametric method (Kernel estimation regression) and a parametric one (logistic regression), the latter allowing more than one predictor variable to be taken into consideration simultaneously (multiple regression).
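As an illustration of the non-parametric idea, the sketch below estimates a probability of presence along one gradient as a kernel-weighted mean of presence/absence values (a Nadaraya-Watson estimator with a Gaussian kernel). The estimator, the kernel and the bandwidth are assumptions chosen for the example; this section does not specify which kernel estimator was actually used, and the pH values and presences are invented toy data.

```python
import math


def kernel_presence_probability(x0, xs, presences, bandwidth):
    """Gaussian-kernel weighted mean of 0/1 presences around x0 (Nadaraya-Watson)."""
    weights = [math.exp(-0.5 * ((x - x0) / bandwidth) ** 2) for x in xs]
    return sum(w * y for w, y in zip(weights, presences)) / sum(weights)


# Toy data: a hypothetical species found mostly on acidic soils.
ph = [3.5, 3.8, 4.0, 4.2, 4.5, 5.0, 5.5, 6.0, 6.5, 7.0]
present = [1, 1, 1, 1, 0, 1, 0, 0, 0, 0]

# Estimated probability of presence at a few pH values.
curve = [
    (x0, kernel_presence_probability(x0, ph, present, bandwidth=0.5))
    for x0 in (4.0, 5.0, 6.0, 7.0)
]
```

No functional form is imposed on the response curve, which is the main contrast with the logistic model, where the shape is fixed and several predictors can enter at once.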
Principal component analyses
Given a wide range of parameters and variables, it was important to characterise their correlation, through classical linear regressions and principal component analyses. Thus, some variables may be integrated or contained in more synthetic ones. Correlation matrices were derived for two groups, one concerning climate and one concerning soil.
Performing principal component analyses helps to avoid biased results due to correlated variables, and also enables the elimination of redundant variables as well as the definition of synthetic variables or parameters (Bouroche and Saporta, 1998).
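A minimal sketch of these two steps, in plain Python, is given below: a correlation matrix is computed, and the first principal component of that matrix is then obtained by power iteration. The function names and the toy variables are hypothetical, power iteration is used here only to keep the example free of external libraries, and only the first component is extracted.

```python
import math


def correlation_matrix(columns):
    """Pearson correlation matrix; columns is a list of equal-length numeric lists."""
    n = len(columns[0])
    standardised = []
    for col in columns:
        mean = sum(col) / n
        sd = math.sqrt(sum((v - mean) ** 2 for v in col) / n)
        standardised.append([(v - mean) / sd for v in col])
    p = len(columns)
    return [[sum(a * b for a, b in zip(standardised[i], standardised[j])) / n
             for j in range(p)] for i in range(p)]


def first_component(corr, n_iter=500):
    """Leading eigenvalue and unit eigenvector of a correlation matrix (power iteration)."""
    p = len(corr)
    # Non-uniform start, to avoid starting orthogonal to the leading eigenvector
    # in simple symmetric cases.
    vec = [1.0 / (i + 1) for i in range(p)]
    for _ in range(n_iter):
        new = [sum(corr[i][j] * vec[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(v * v for v in new))
        vec = [v / norm for v in new]
    eigenvalue = sum(vec[i] * sum(corr[i][j] * vec[j] for j in range(p))
                     for i in range(p))
    return eigenvalue, vec


# Toy variables: x and y are nearly redundant, z is weakly related to both.
x = [1, 2, 3, 4, 5, 6]
y = [2.1, 3.9, 6.2, 8.1, 9.8, 12.2]
z = [3, 1, 2, 3, 1, 2]

corr = correlation_matrix([x, y, z])
eigenvalue, loadings = first_component(corr)
explained_share = eigenvalue / len(corr)  # share of total variance on the first axis
```

In this toy case the two nearly redundant variables load together on the first axis, which is the kind of pattern that would suggest keeping only one of them as a predictor.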
Correspondence analyses
The principle of correspondence analysis relies on the construction of synthetic variables that summarise the variance (“inertia”) of a data set (Bouroche and Saporta, 1998; Dervin, 1990; Gégout and Houllier, 1993).
A first correspondence analysis was carried out on the floristic table composed of 1,033 relevés and the 234 species present in at least 10 relevés. The presence/absence floristic table was used rather than the abundance/dominance one in order to give a similar weight to each species.
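The first axis of a correspondence analysis on a presence/absence table can be obtained by reciprocal averaging, which is sketched below in plain Python. This is an illustrative algorithm, not necessarily the software or procedure used in the thesis; the function name and the toy table are hypothetical, and the table is assumed to have no empty row or column.

```python
import math


def reciprocal_averaging(table, n_iter=200):
    """First correspondence-analysis axis of a sites x species 0/1 table.

    Returns (site_scores, species_scores, eigenvalue).
    """
    n_sites, n_species = len(table), len(table[0])
    row_tot = [sum(row) for row in table]
    col_tot = [sum(col) for col in zip(*table)]
    total = sum(row_tot)

    def standardise(scores):
        # Centre and scale species scores, weighting each species by its frequency.
        mean = sum(c * s for c, s in zip(col_tot, scores)) / total
        centred = [s - mean for s in scores]
        sd = math.sqrt(sum(c * s * s for c, s in zip(col_tot, centred)) / total)
        return [s / sd for s in centred], sd

    def site_scores(species_scores):
        # Each site score is the mean score of the species present on the site.
        return [sum(a * v for a, v in zip(row, species_scores)) / row_tot[i]
                for i, row in enumerate(table)]

    species, _ = standardise([float(j) for j in range(n_species)])
    eigenvalue = 0.0
    for _ in range(n_iter):
        sites = site_scores(species)
        # Each species score is the mean score of the sites where it occurs.
        raw = [sum(table[i][j] * sites[i] for i in range(n_sites)) / col_tot[j]
               for j in range(n_species)]
        # The shrinkage of the scores over one cycle converges to the eigenvalue.
        species, eigenvalue = standardise(raw)
    return site_scores(species), species, eigenvalue


# Toy table: 6 sites x 5 species arranged along a single gradient.
toy_table = [
    [1, 1, 0, 0, 0],
    [1, 1, 1, 0, 0],
    [0, 1, 1, 0, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 1],
    [0, 0, 0, 1, 1],
]
site_axis, species_axis, eigenvalue = reciprocal_averaging(toy_table)
```

On this toy table the site scores come out in the order of the underlying gradient, so sites with similar species lists receive similar co-ordinates on the first axis.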
The ecological meaning of the main factorial axes was derived graphically or analytically from linear regressions (simple, stepwise or multiple) or analyses of variance between the ecological variables and the factorial co-ordinates of the forest sites.
Interpretation of the results of the first correspondence analysis led to a second analysis, intended to characterise definitively the main ecological gradients influencing the distribution of plant species in Northeast France.
The results of the second correspondence analysis are expected to be highly correlated with those of the first one because the computation of the factorial co-ordinates is based on the separation of forest sites that present very different characteristics. Indeed, two forest sites that are located close to each other on a factorial plane are expected to present similar characteristics for the factors considered. Similarly, if two plant species have factorial co-ordinates close to each other, they are likely to be present in the same kind of environments, as illustrated by the factors.
Table of contents:
1. INTRODUCTION TO GLOBAL CHANGE AND VEGETATION DISTRIBUTION
1.1. Main issues of Global Change
1.2. Facts and figures on the French forest
2. MATERIALS AND METHODS
2.1. Presentation of the region
2.2. Presentation of the database
2.2.1. The concept of forest sites
2.2.2. Sources of data
2.2.2.1. The catalogue of forest sites
2.2.2.2. Other sources of data
2.2.3. Aims and extent of the database
2.2.4. Structure of the database
2.2.5. Quality of the data entered in the database
2.3. Presentation of the variables
2.3.1. Dependent variables
2.3.2. Independent variables
2.3.2.1. Climatic variables
2.3.2.2. Pedologic variables
2.3.2.3. Summary statistics
2.4. Presentation of the computation methods
2.4.1. Principal component analyses
2.4.2. Correspondence analyses
2.4.3. Non-parametric Kernel estimation regression
2.4.4. Logistic regression
2.4.4.1. Logistic regression analysis
2.4.4.2. Fitting a logistic regression model to data
2.4.4.3. Test statistics
2.4.4.4. Comparison of different logistic regression models
2.4.4.5. Interpretation of the fitted logistic regression model
3.1. Results from the principal components analyses
3.1.1. Principal components analyses on soil variables
3.1.1.1. Characterisation of correlation between soil variables
3.1.1.2. Characterisation of the links between soil variables
3.1.2. Principal component analyses on climate variables
3.1.2.1. Characterisation of correlation between climatic variables
3.1.2.2. Characterisation of the links between climatic variables
3.2. Results from the correspondence analyses
3.2.1. First correspondence analysis
3.2.1.1. Representation of the forest species factorial planes
3.2.1.2. Ecological meaning of the first axis
3.2.1.3. Ecological meaning of the second axis
3.2.1.4. Ecological meaning of the third axis
3.2.2. Second correspondence analysis
3.2.2.1. Preparation of the second correspondence analysis
3.2.2.2. Correlation of forest sites factorial co-ordinates from the two analyses
3.2.2.3. Ecological meaning of the second axis
3.2.2.4. Selection of soil variables for further computations
3.2.2.5. Selection of climatic variables for further computations
3.2.2.6. Summary of the main explicative variables
3.3. Results from the Kernel estimation regression
3.4. Results from logistic regression
3.4.1. Choosing a “best” model for each species
3.4.2. Presentation of logistic regression results
3.4.3. Interpretation of logistic regression results
3.4.3.1. Five types of sub-models
3.4.3.2. Sensitivity of species towards variables
3.4.3.3. Significance of the regression coefficients
3.4.4. Discussion of logistic regression results
3.4.4.1. Goodness-of-fit (pseudo-R²)
3.4.4.2. Presence/absence data versus probability of presence
3.4.4.3. Other link functions
3.5. Comparison of ecological indicator values
3.5.1. Interests of ecological indicator values
3.5.2. Summary of the different methods
3.5.3. Correlation between the different methods
4. DISCUSSION ON THE SENSITIVITY OF VEGETAL SPECIES TO GLOBAL CHANGE
4.1. Scenarios of Global Change
4.1.1. Change in temperature: global warming
4.1.2. Change in soil pH: acidification
4.1.3. Change in the C/N ratio: eutrophication
4.1.4. Combination of the changes
4.2. Choice of species
4.2.1. Choice of non-tree plants
4.2.2. Sensitive species
4.3. Results from the different scenarios
4.3.1. Spatial representation of probabilities of presence
4.3.1.1. Lamium maculatum
4.3.1.2. Oxalis acetosella
4.3.2. Effect of Global Change scenarios on the numbers of occurrences
4.3.2.1. Disappearance of species
4.3.2.2. Decrease/increase in the probability of presence
4.3.3. Discussion on the methodology
4.3.3.1. Long-term predictions
4.3.3.2. Factors of uncertainty
5.1. Difficulty of ecological modelling
5.1.1. Heterogeneity of the data
5.1.2. Simple models
5.2. Autecology vs. synecology
5.3. Further research