Woodland Survey of Great Britain 1971–2001

Abstract. The Woodland Survey of Great Britain is a unique data set, consisting of a detailed range of ecological measurements at a national scale, covering a time span of 30 years. A set of 103 woods spread across Britain was first surveyed in 1971 and surveyed again in 2000–2003 (for convenience referred to subsequently as the "2001 survey"). Standardised methods of describing the trees, shrubs, ground flora, soils and general habitats present were used for both sets of surveys. The sample of 1648 plots spread through 103 woodland sites located across Britain makes it probably the most extensive quantitative ecological woodland survey undertaken in Britain; it is also notable for the range of sites that have been revisited after such a long interval. The data set provides a unique opportunity to explore the effects of a range of potential drivers of woodland change that operated between 1971 and 2001. The data set is available in four discrete parts, which have been assigned the following DOIs: doi:10.5285/4d93f9ac-68e3-49cf-8a41-4d02a7ead81a (Kirby et al., 2013b), doi:10.5285/d6409d40-58fe-4fa7-b7c8-71a105b965b4 (Kirby et al., 2013d), doi:10.5285/fb1e474d-456b-42a9-9a10-a02c35af10d2 (Kirby et al., 2013c), doi:10.5285/2d023ce9-6dbe-4b4f-a0cd-34768e1455ae (Kirby et al., 2013a).


Introduction
In 1971, a national survey of semi-natural woodlands in Great Britain was undertaken at the Nature Conservancy's research station at Merlewood, Grange-over-Sands, Cumbria (a predecessor of the Centre for Ecology and Hydrology). The survey of 103 sites was planned by R. G. H. Bunce and M. W. Shaw (Bunce and Shaw, 1972; Hill et al., 1975; Bunce, 1981). The project at this time had the following objectives:
1. To develop an efficient user-orientated method of classifying semi-natural woodland ecosystems in Britain.
2. To develop a complementary method of phytosociological classification for semi-natural woodlands.
3. To use or assist in the use of the classification in the fulfilment of the Nature Conservancy's aims and policies for wildlife conservation (Bunce and Shaw, 1973a).
Within the 103 woodland sites chosen, ecological information was recorded at the site level and in more detail from 16 sample plots, each of 200 m², located at random within each site. From each of these plots the following data were collected: presence of vascular plants and bryophytes from five nested quadrat sizes, measurement of diameters at 1.3 m (DBH, diameter at breast height) of all trees over 5 cm in diameter in the plot and of saplings and shrubs in specified quarters of the plot, site descriptions and soil samples. These data were collected from the 103 sites (1648 plots) by eight survey teams between July and September 1971.
In 2000, it was thought timely to revisit the 1971 survey. This time, the survey focused on assessing the changes that had occurred within the woodland sites in the intervening 30 years, moving away from the original goals of the 1971 survey as outlined above. Fourteen sites were visited in 2000 as part of a pilot survey to assess the logistical and analytical implications of trying to carry out a re-survey (Smart et al., 2001). No surveys were carried out in 2001 because of a serious foot-and-mouth disease outbreak in livestock (during which access to the British countryside was severely restricted in order to contain the disease), but 56 sites were surveyed in summer 2002 and the remainder in 2003 by teams of consultant ecologists using exactly the same field methods as in 1971, as described below.
Prior to each survey, a two-day training course was held at the Centre for Ecology and Hydrology to prepare the surveyors thoroughly in the detailed field protocols. Additionally, in 1971, all survey teams were initially accompanied by a supervisor and regular visits to the field were made by the project leader to ensure consistency and quality in data recording according to criteria laid out in the field handbook (Shaw and Bunce, 1971). In the 2001 survey, experienced survey staff were available in the office to answer post-training queries from the field throughout the survey via telephone, and a full quality assurance exercise was carried out as described below, and more fully in Kirby et al. (2005).

Survey sites
The 103 surveyed woodlands were chosen from a set of 2453 woodlands that had been part of a preliminary survey known as the "Steele" survey (Steele, 1968). This had begun in the late 1960s and was led by R. C. Steele, the head of the Nature Conservancy's Woodland Management section. Standard recording cards were used, and the data provided background information for the Nature Conservation Review (Ratcliffe, 1977).
The subset of 103 was derived from the 2453 by association analysis (Williams and Lambert, 1959) and other numerical techniques that, at the time, were still novel and undergoing rapid development (Hill et al., 1975; Bunce, 1981; Bunce and Shaw, 1973b). These analyses put the woods into 103 groups according to the similarity of their plant species composition. The wood that was most typical of each group was then selected for detailed survey. Site names and grid references are given in Table 1 (it should be noted that the majority of the sites are in private ownership and therefore permission from the landowner must be sought before any potential visit).

Site descriptions
The sites provide a representative sample of the geographic spread of woodland cover (see Fig. 1) and the range of broadleaved/semi-natural woodland types. The sites also show a considerable physiographic variability in terms of rainfall, slope and aspect (Corney et al., 2004). The number of sites recorded in the 1971 survey from each of the 32 original ITE land classes in Britain (Bunce et al., 1990) was compared with the mean percentage area of broadleaved woodland, estimated from Countryside Survey 2000 data (Haines-Young et al., 2000), for each land class (Bunce et al., 1996). The comparison shows a good correspondence between national woodland area and the number of woodland survey sites recorded, with proportionally more surveyed woods from land classes with a high broadleaved woodland cover (Kirby et al., 2005) (see Table 2).
Additionally, we can compare the number of plots allocated to each National Vegetation Classification (NVC) group (Rodwell, 1991) with the estimated total area of NVC types in ancient semi-natural woodland across Britain (Cooke and Kirby, 1994) (see Table 3). The 1971 survey data span the broad range of types in roughly the proportions that might be expected from the Cooke and Kirby data. A further comparison was made with the sample of woody vegetation from the GB Countryside Survey from 2000 (Haines-Young et al., 2000). The 1971 plots were grouped by Countryside Vegetation System classes (Bunce et al., 1999) and their frequency was compared to the estimated national area of each class. The two data sets are generally well correlated (Kirby et al., 2005).

Plot layout and descriptions
Sixteen plots were randomly positioned within each site in 1971 and the location of each was marked on a 1 : 25 000 map. Each plot was 14.1 × 14.1 m (200 m²; Fig. 2) and constructed as shown in Fig. 3, with one centre post and four corner posts, with a set of four strings tagged with markers at specified distances. The centre post had a right-angled gauge affixed to the top in order to orientate the plot at random. In the field, plots were located by pacing from the nearest relocatable feature. Data were then collected on ground flora, tree and shrub layers, soils and habitat characteristics for the plot as described below. A habitat sheet for the whole wood was also compiled.
In the 2001 survey, the original maps were used to find the same plot position from 1971 as accurately as possible. Analysis of the 1648 plot records taken in 1971 and 2001 described in Kirby et al. (2005) demonstrates that the records may be treated as paired data (i.e. relocation error was not significant, as described in the "Data quality" section below). The advantage of paired data is that derived variables, such as species richness, can be reduced to differences for purposes of statistical testing. The total variation across time and sites will be less than if two completely random samples were collected in each year, and the power of tests is thereby increased. Some relocation error was, however, inevitable given the limited information available.
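The reduction of paired records to differences can be illustrated with a minimal Python sketch. It is not the analysis used in Kirby et al. (2005); the function name and the richness values are hypothetical and serve only to show how a paired t statistic is built from per-plot differences.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t_statistic(first, second):
    """Return (mean difference, t statistic) for paired plot values.

    `first` and `second` hold one value per plot, in the same plot order.
    The statistic is undefined when all differences are identical.
    """
    if len(first) != len(second):
        raise ValueError("paired samples must have equal length")
    # Each plot contributes a single difference: later value minus earlier value.
    diffs = [b - a for a, b in zip(first, second)]
    mean_diff = mean(diffs)
    standard_error = stdev(diffs) / sqrt(len(diffs))
    return mean_diff, mean_diff / standard_error


# Hypothetical species-richness counts for five plots (not survey data).
richness_1971 = [22, 18, 30, 25, 15]
richness_2001 = [19, 17, 24, 23, 12]
mean_diff, t_stat = paired_t_statistic(richness_1971, richness_2001)
```

Because only the within-plot differences enter the test, between-site variation in richness does not inflate the error term, which is the gain in power referred to above.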

Methodology in context
It is often an insoluble problem that, in order to extend an older time series without breaking consistency with its established methods, methods have to be repeated despite a more modern design perhaps being preferable if we were to start again. However, although the protocols in question are old, that does not necessarily mean they are outdated.
This survey was the first time at a national level that samples were being used to obtain an integrated assessment of the response of vegetation to the environment across a defined population. The structure of the project provided the basis for the further development of strategic survey methods. A subsequent survey based on these methods (Bunce and Shaw, 1973b), the Classification of the Native Pinewoods of Scotland, set the conservation agenda for that scarce resource (Bunce and Jeffers, 1977). In later work, the concept of a woodland site, and subsequently a 1 km square sampled at random, with random plots sampled within, became a standard sampling strategy used as the basis of surveys such as the Cumbria Survey (Bunce and Smith, 1978) and the Terrestrial Survey of Shetland (Milner, 1975). Variations of this method are currently used very successfully in several other large ecological surveys in Britain, such as the Countryside Survey (Carey et al., 2008) and the Glastir Monitoring and Evaluation Programme (Emmett and GMEP team, 2014). Within the European Biodiversity Observation Network (EBONE), methods adapted from the basic principles in this woodland survey have been developed to roll out across the whole of Europe (Bunce et al., 2008, 2011). During the EBONE project, the methods were widely tested across 12 European countries, and also Israel, Australia and South Africa. The methods were proven to be robust, reliable and repeatable at a continental, landscape scale (Roche and Geijzendorffer, 2013).
A key aim of the sampling design was that the methods chosen should be standardised, and therefore highly repeatable. The size of the plot was chosen with reference to continental phytosociologists who at the time most widely used plots of between 100 and 200 m² (Bunce and Shaw, 1973b). After preliminary field tests, it was found that the number of species recorded usually stabilised at this size. The area of 200 m² was thus adopted for this survey, with five nests within. As the focus of the survey is on ground flora as well as tree and shrub information, the square plot with inner nests aids a systematic search of the vegetation within the plot. It is also straightforward to lay out in the field, and ensures a standard-sized plot is laid out every time. For these reasons, we consider the square plot as more advantageous than a circular plot. Plotless sampling was also dismissed, as it is not a suitable method for recording ground vegetation, only tree density. Random sampling was preferred to systematic sampling in this case to avoid the possibility of resonance with environmental features, for example a map grid line following the course of a stream. Random sampling also has practical advantages over systematic sampling, which requires continuous scale adjustment in order to obtain a constant sample from variably sized areas (Bunce and Shaw, 1973b).

Site information, plot locations and information, slope and aspect
For both the whole woodland site and each of the 16 plots of 200 m² within it, the presence or absence of a series of attributes was recorded. Attributes included management factors such as the presence of coppice or stumps, physiographic factors such as the presence of rock or cliffs, habitat-related factors such as the presence of rotting stumps or hollow trunks, aquatic habitats such as ponds, presence of buildings or open habitats such as glades and rides, presence of epiphytes on trees, presence of animals and birds, and also boundary types and nearby land use. A full list of habitats may be found in the 1971 field handbook (Shaw and Bunce, 1971) (supplied as supporting documentation with the data sets). The slope of each plot was measured in degrees using a hypsometer and the aspect of each plot was measured using a magnetic compass.

Vegetation data
Within the plot described in Fig. 2, the area within the first nest of the plot (2 × 2 m) was searched for the presence of all vascular plants (monocots, dicots and ferns), including tree species. This procedure was repeated for each nest of the quadrat, increasing the size each time as shown in Fig. 2.
In the final nest (the whole 200 m² plot), the percentage cover (to the nearest 5 %) of each species was estimated. In addition, the total cover of bryophytes was estimated from the entire plot, as was an overall estimate for litter, wood, rock, bare ground and standing water. Bryophytes and lichens were collected separately and specimens identified later in 1971; in the 2001 survey only a limited list of common bryophytes was recorded. Some species were recorded in 1971 as amalgamated taxa reflecting difficulties in their consistent separation, for example Quercus robur and Q. petraea. In the data set, amalgamated taxon codes have been applied in order to remove the effect of recorders separating out such species to differing degrees.
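The effect of applying amalgamated taxon codes can be sketched as follows. The lookup table, its label "Quercus robur/petraea" and the record layout are hypothetical; the published data set uses its own codes, which are documented with the data.

```python
# Hypothetical amalgam table: recorded species name -> amalgamated taxon label.
AMALGAMS = {
    "Quercus robur": "Quercus robur/petraea",
    "Quercus petraea": "Quercus robur/petraea",
}


def harmonise(records):
    """Map (plot_id, species) presence records to amalgamated taxa.

    Species without an entry in AMALGAMS are kept unchanged. Duplicates
    created by the mapping are dropped, so a plot in which both oaks were
    separated counts the amalgam only once.
    """
    mapped = {(plot, AMALGAMS.get(species, species)) for plot, species in records}
    return sorted(mapped)


# Illustrative presence records (not survey data).
records = [
    (1, "Quercus robur"),
    (1, "Quercus petraea"),
    (1, "Corylus avellana"),
    (2, "Quercus petraea"),
]
harmonised = harmonise(records)
```

After harmonisation, plot 1 holds two taxa rather than three, so species richness no longer depends on whether a recorder chose to separate the two oaks.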

Soil data
In both 1971 and 2001, soil samples were taken from every accessible plot in every woodland. A single composite soil sample was taken from each plot, at the centre of the vegetation quadrat, using a trowel. Samples (weighing approximately 1 kg) were taken to a depth of 15 cm and placed in a labelled plastic bag. On return to the laboratory, all soil samples were stored at 4 °C prior to processing and analyses. Soil samples from the 2001 survey were sieved using a 2 mm automatic sieving machine. A pH reading was taken on a representative fresh subsample from each soil sample before air-drying at 20 °C. Another subsample was then taken to determine loss on ignition (LOI), as a measure of soil organic matter content. Unless otherwise stated, soil pH values in the data set are from the soil samples prior to air-drying ("fresh"). All analyses were carried out under the supervision of the Environmental Chemistry Section at the Centre for Ecology and Hydrology (CEH), Merlewood, following standard methodologies and quality control procedures (Allen, 1989), including the analyses of certified standard reference samples within batches.
During the 2001 survey, the same soil analysis protocols were used as in the 1971 survey but the equipment was different. Changes in analytical precision since 1971, due to modifications in technical equipment, could have influenced the significance of the results obtained from both pH and LOI. Therefore repeat analyses of LOI on the 1971 samples and comparisons between fresh and air-dried soil samples from 1971 and 2001 were done to check the comparability of analytical methods between the two surveys. A representative number (ca. 20 %) of soil samples from 1971 were analysed for pH and LOI using the same procedures and equipment as for the 2001 survey. These results are included in the published data set.
Soil group information is derived from data recorded in 1971. Information on soil moisture, texture, structure and colour for different horizons was recorded in the field. This information was translated into comparable Avery (1980) soil codes in 2001.

Tree diameter
Trees, saplings and shrubs were recorded in the 200 m² plot, as described above. Decisions as to whether individuals are in the plot or not were based on the rooted base being 50 % or more within the plot.
For trees (stems of more than 5 cm diameter at breast height (DBH) of any species normally capable of attaining a treelike habit in Britain), the species and DBH of all stems in the whole plot greater than 5 cm were measured. Trees with multiple stems had each stem recorded separately. Standing dead trees were also measured and identified as such.
Saplings (definition as for trees, but with a height of greater than 130 cm and with a DBH of less than 5 cm) were recorded only in quarters 1 and 3 of the plot (see Fig. 2). The same measurements as for trees were made. Shrubs, like saplings, were also only recorded in quarters 1 and 3, and again the same measurements were taken. Shrubs were defined as species including hazel, blackthorn, Viburnum spp. and juniper. See Table 4 for a summary of data collected.
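A derived variable such as woody basal area can be computed from the DBH records. The sketch below is an illustration only: the function names are hypothetical, and the doubling of stems recorded in quarters 1 and 3 to represent the whole plot is an assumption of this example, not a rule stated in the survey documentation.

```python
from math import pi

PLOT_AREA_M2 = 200.0  # plot size used in both surveys


def stem_basal_area_m2(dbh_cm):
    """Cross-sectional area of one stem (m2) from its DBH in cm."""
    radius_m = dbh_cm / 100.0 / 2.0
    return pi * radius_m ** 2


def basal_area_per_ha(tree_dbh_cm, half_plot_dbh_cm=()):
    """Basal area in m2 per hectare for one plot.

    `tree_dbh_cm` holds stems measured over the whole plot.
    `half_plot_dbh_cm` holds stems measured only in quarters 1 and 3;
    they are doubled to scale to the whole plot (assumption).
    """
    total = sum(stem_basal_area_m2(d) for d in tree_dbh_cm)
    total += 2.0 * sum(stem_basal_area_m2(d) for d in half_plot_dbh_cm)
    return total * 10_000.0 / PLOT_AREA_M2


# One 20 cm tree in the whole plot and one 4 cm sapling in quarter 1.
example = basal_area_per_ha([20.0], [4.0])
```

Multi-stemmed trees need no special handling here because each stem was recorded separately.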

Data quality
The 1971 data sets were transferred from the original field sheets to spreadsheets prior to the 2001 surveys. The 1971 data were double-punched and then checked and corrected to produce a final validated copy. In the 2001 surveys, the consultant surveyors were asked to ensure that all data were corrected and validated prior to transfer in electronic form to CEH. Initial standard validation checks included plot and site counts to ensure no duplicate numbering and hence double counting of plots.
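Checks of this kind are simple to automate. The following sketch assumes each record is keyed by a hypothetical (site, plot) pair and that a complete site has 16 plots, as in the survey design; it is not the validation code used at CEH.

```python
from collections import Counter


def duplicate_plots(plot_keys):
    """Return the (site, plot) keys that occur more than once."""
    counts = Counter(plot_keys)
    return sorted(key for key, n in counts.items() if n > 1)


def sites_with_wrong_plot_count(plot_keys, expected=16):
    """Return sites whose number of distinct plots differs from `expected`."""
    per_site = Counter(site for site, _plot in set(plot_keys))
    return sorted(site for site, n in per_site.items() if n != expected)


# Illustrative keys: site 1 is complete but plot 3 is entered twice,
# and site 2 is missing one plot.
keys = [(1, p) for p in range(1, 17)] + [(2, p) for p in range(1, 16)] + [(1, 3)]
duplicates = duplicate_plots(keys)
incomplete = sites_with_wrong_plot_count(keys)
```

A flagged site is not necessarily an error, since a plot may have been inaccessible, but each flag is worth checking against the field sheets.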
As part of the quality assurance process for the ground flora data, six sites were visited by a different set of surveyors and eight plots at each site recorded within 2 weeks of the main survey. A mixed model ANOVA showed no overall difference in species richness between the different surveyors (Kirby et al., 2005). Some plot relocation error was inevitable given the limited information available and the nature of the original maps. In the repeat survey, the field botanist relied only on a marked point on a map as the sole aid to relocating the 1971 plot location. As statistical analyses of temporal vegetation change are more powerful when based on records from plots located in the same place rather than randomised to new locations at each survey, a method was developed to measure whether the 2001 record for a plot was more similar to the record for that plot in 1971 than another (randomly chosen) position from 1971. This follows from the general principle that locations near to each other tend to be more similar. Therefore, the principle of autocorrelation between near points was used to address the problem of quantifying the error involved in attempting to relocate the same vegetation monitoring plots. In attempting to measure the amount of relocation error, one cannot of course exploit a "true" set of temporal pairs known to have been recorded in exactly the same position. What can be done is to compare the average species compositional similarity between the ostensibly true temporal pairs with the average similarity for a random pairing of the 1971 data with the 2001 data. If, on average, attempts to relocate the true 1971 position had been successful then the similarity between the true pairs should be greater than the random pairs. This approach was tested on the 14 pilot resurvey sites (Smart et al., 2001). 
All the sites showed higher similarity between plots as a result of the search for the 1971 plot location, and for nine sites there was significantly higher similarity. The same analysis was carried out for all the remaining sites. Overall at 97 sites (out of 103) mean similarity was greater between "relocated" plot pairs compared to random-pair comparison; for 59 sites the difference was significantly greater. The data have therefore been improved through the identification of the original plot locations. There is still a need for caution in interpreting the explanatory power of plot-level variables because of the possible confounding of plot relocation error and change over time. Small differences between years in plot location, for example, from an open patch to a more shaded patch could result in lower species richness and higher woody basal area being recorded for that plot. However, given the size of the data set, individual plot errors due to this factor are likely to be balanced out over the whole sample. A full account of this is given in Appendix 3 of Kirby et al. (2005).
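The comparison of relocated pairs with random pairs can be sketched as a permutation test. This is a simplified illustration rather than the procedure of Kirby et al. (2005): the Jaccard index, the within-site shuffling, the number of permutations and all function names are assumptions of this example.

```python
import random


def jaccard(a, b):
    """Jaccard similarity between two species sets (1.0 when both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def mean_pair_similarity(old, new):
    """Mean similarity between plots paired by position in the two lists."""
    return sum(jaccard(a, b) for a, b in zip(old, new)) / len(old)


def relocation_test(old, new, n_perm=999, seed=0):
    """Compare relocated pairs with random re-pairings within one site.

    `old` and `new` are lists of species sets for the same plots in 1971
    and 2001. Returns the observed mean similarity and a one-sided
    permutation p-value (share of random pairings at least as similar).
    """
    rng = random.Random(seed)
    observed = mean_pair_similarity(old, new)
    shuffled = list(new)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # break the link between 1971 and 2001 plots
        if mean_pair_similarity(old, shuffled) >= observed:
            hits += 1
    return observed, (hits + 1) / (n_perm + 1)


# Illustrative site with four plots (species shown as single letters).
plots_1971 = [{"a", "b"}, {"c", "d"}, {"e", "f"}, {"g", "h"}]
plots_2001 = [{"a", "b"}, {"c", "d"}, {"e", "f"}, {"g", "x"}]
observed, p_value = relocation_test(plots_1971, plots_2001)
```

A small p-value indicates that the relocated pairs are more alike than chance pairings, which is the pattern expected when relocation has largely succeeded and nearby vegetation is spatially autocorrelated.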
It is important to note that there were some marked differences in the date of surveys between 1971 and the 2001 surveys, with most sites being recorded earlier in the year in 2001. This is likely to influence the recorded presence or abundance of vernal species in particular, with more species generally detectable in the late April to July period (Kirby et al., 1986; Sykes et al., 1983; Sykes and Horrill, 1979) than in much later surveys. More species records would therefore be expected from the 2001 surveys.
In terms of the analytical soil data, quality control measures were followed as outlined in Allen (1989). These included the analyses of certified standard reference samples within batches. The descriptive profile data collected in 1971 were collected following the standards set out in the training and field handbook but were not formally checked for quality aside from checks from supervisors during the survey.

The Woodland Survey in context
Although there are many schemes across the world that monitor trees and forestry, there are few long-term programmes that take an integrated approach such as the survey in question, including trees, but also vegetation and soil information. Many national forest and woodland monitoring schemes were initially set up with an emphasis on monitoring timber production, commonly in the 1920s, when timber supplies were low following the First World War. For example, in Britain, the Forestry Commission was set up in 1919; since then it has undertaken national forestry inventory surveys which concentrate on the size, distribution, composition and condition of all forests in Britain but do not focus on sampling ground flora or soils (Forestry Commission, 1952, 1970, 1984, 2013). The situation is similar in the heavily forested countries of northern Europe such as Sweden, Denmark and Finland, where national forest inventories are also carried out (Groom and Reed, 2001), and also in the United States of America, where the US Forest Service has had a monitoring programme in place since the 1920s (Smith, 2002; United States Forest Service, 2015a).
An additional driver for the initiation of forest surveys across central Europe was the mystery of Waldsterben (forest decline). This became a contentious issue in the early 1980s, when it was suggested that air pollution was causing a progressive death of forests (Hinrichsen, 1987). In Germany, the forest authorities initiated surveys of the national forests, starting in 1987 and repeated at approximately decadal intervals, and currently carried out by the Thünen Institute of Forest Ecosystems (Kändler, 2009; Kandler and Innes, 1995; Thünen Institute, 2015). In Switzerland also, a thorough national forest inventory was first carried out in the early 1980s, repeated in the mid-1990s and again in the mid-2000s. Since 2009, the inventory has become a continuous monitoring programme. The inventory records the current state and changes of the Swiss forest (Mandallaz, 2007; National Forest Inventory, 2015; Böhl and Brändli, 2007). In both of these countries, the inventories are, again, largely focused on monitoring timber production, although both have been concerned with forest condition from the start, and the Swiss inventory in particular has come to include greater detail regarding a range of habitat measures (as described in the field manuals, e.g. Keller, 2011).
In tropical regions there is a general shortage of biodiversity data (Balmford et al., 2005), which is largely due to the geographical inaccessibility of many of the areas and a lack of local resources. Many studies regarding forestry and woodland in these regions rely heavily on remotely sensed information and concentrate on extent, biomass and carbon stocks (Asner, 2015; DeVries et al., 2015; Sousa et al., 2015; Wani et al., 2015), rather than ground-level biodiversity at a national level. Efforts are being made in many countries to intensify soil and ground vegetation sampling, as in the USA (United States Forest Service, 2015b; Smith, 2002); however, it is important to remember that the focus of this British woodland survey is on the semi-natural woodland ecosystem (not only trees and shrubs but also soils and ground flora). Taking this into account, there is relatively little literature regarding comparable national long-term monitoring schemes across the world, particularly those dating back as far as 1971.

Examples of data usage
Since the first survey in 1971, the data have been analysed and used in a range of ways to answer a variety of questions. After the first survey in 1971, publications arising from the data included the production of "A Field Key for Classifying British Woodland Vegetation" (Bunce, 1982, 1989). The survey was also described in Bunce and Shaw (1972) and used to put British woodlands into a European context in Bunce (1981). The standardised methods, as described in Bunce and Shaw (1973b), became the basis for a range of subsequent large surveys, as described in Sect. 2.3.
Following the second survey in 2001, a range of analyses were undertaken, as described in Kirby et al. (2005), focusing on changes that had taken place between the two surveys. Some of the conclusions from the main findings were that there had been an overall increase in soil pH, particularly in organic soils, but there was no increase in the mean level of soil organic matter. Most tree and shrub species remained stable in terms of their frequency of occurrence at plot and site levels, although 15 species (9 of these shrubs) declined, whilst 5 other species (4 conifers) increased. There was a net loss of stems from the smallest size classes (particularly less than 10 cm DBH) with some smaller gains in the 30-60 cm classes. Stems greater than 60 cm remained scarce, although different species revealed distinct patterns of variation. Overall ground flora species richness declined by up to 32 % at a plot level (Kirby et al., 2005).
More recently, further studies have included an analysis of the impact of an extreme weather event: a storm in 1987 during which wind speeds locally gusted up to 160 kph and an estimated 15 million trees were blown down across the south of England. Using Bayesian methods, Smart et al. (2014) demonstrated that woodland plots inside the storm track showed a smaller loss of understorey species richness, or even an increase in richness, between 1971 and 2001. Marrs et al. (2013) analysed the data in order to investigate the impact of aggressive dominant native species on the species richness of native woodlands. Findings suggested that several species do have the potential to become "overdominant" and perhaps may impinge on other field-layer species. Corney et al. (2006) undertook a multivariate analysis to assess the effects of landscape-scale environmental drivers on the vegetation composition of British woodlands. The analysis investigated the degree to which field-layer vegetation composition in forests is determined by variables operating at different scales, from regional (such as climate, location) to local factors (such as the basal area of canopy trees and management).
Additionally, the plot species data have contributed to Great Britain niche models such as MultiMOVE (Henrys et al., 2015). MultiMOVE is a statistical package that contains fitted niche models for almost 1500 plant species in Great Britain. The models have been fitted using multiple statistical techniques in order to make predictions of species occurrence from specified environmental data, including these woodland data. It also allows plotting of relationships between species' occurrence and individual covariates so that the user can see what effect each environmental variable has on the specific species in question.
The metadata are stored in the ISO 19115 (2003) schema (International Organization for Standardization, 2015) in the UK Gemini 2.1 profile (UK GEMINI, 2015). Users of the data sets will find the following documents useful: "Long-term ecological change in British woodland" (Kirby et al., 2005), "Woodlands Survey of Great Britain 1971-2001: dataset documentation" (both supplied as supporting information with the data sets), "The effect of landscape-scale environmental drivers on the vegetation composition of British woodlands" (Corney et al., 2004) and the site reports, written by the Site Surveyors (2003).

Conclusions
The countryside of Great Britain and its woods have changed considerably over the last 50 years, for a variety of reasons. Some change has been gradual and can be attributed to factors such as evolving farming and forestry practices, climate change and atmospheric pollution. These have driven gradual responses in the composition and structure of woods. Other woods have undergone sudden change, in response to drivers such as the Dutch elm disease outbreak of the late 1960s and 1970s or the 1987 storm in south-east England.
The Woodland Survey of Great Britain thus provides a rare opportunity to explore the effects of a range of potential drivers of woodland change that operated between 1971 and 2001. It is a unique data set, consisting of a detailed range of ecological measurements at a national scale, covering a time span of over 30 years. It is also notable for the range of sites that have been revisited after such a long interval.
Author contributions. C. M. Wood prepared the manuscript with significant contributions from both co-authors, and is the current database manager for the Land Use Research Group at CEH Lancaster. S. M. Smart managed the survey in 2001 and has since carried out a wide range of analyses using the data sets described. R. G. H. Bunce designed the experiment (along with M. W. Shaw), ran the project in 1971 and made substantial contributions to the 2001 survey.