Standardised soil profile data to support global mapping and modelling (WoSIS snapshot 2019)

The World Soil Information Service (WoSIS) provides quality-assessed and standardised soil profile data to support digital soil mapping and environmental applications at broad scale levels. Since the release of the first ‘WoSIS snapshot’ in July 2016, many new soil data have been shared with us, registered in the ISRIC data repository, and subsequently standardised in accordance with the licences specified by the data providers. Soil profile data managed in WoSIS were contributed by a wide range of data providers; therefore, special attention was paid to measures for soil data quality and to the standardisation of soil property definitions, soil property values (and units of measurement), and soil analytical method descriptions. We presently consider the following soil chemical properties (organic carbon, total carbon, total carbonate equivalent, total nitrogen, phosphorus (extractable-P, total-P, and P-retention), soil pH, cation exchange capacity, and electrical conductivity) and physical properties (soil texture (sand, silt, and clay), bulk density, coarse fragments, and water retention), grouped according to analytical procedures that are operationally comparable. Further, for each profile, we provide the original soil classification (FAO, WRB, USDA, and version) and horizon designations insofar as these have been specified in the source databases. Measures for geographical accuracy (i.e. location) of the point data as well as a first approximation for the uncertainty associated with the operationally defined analytical methods are presented, for possible consideration in digital soil mapping and subsequent earth system modelling. The latest (dynamic) set of quality-assessed and standardised data, called ‘wosis_latest’, is freely accessible via an OGC-compliant WFS (web feature service). For consistent referencing, we also provide time-specific static ‘snapshots’.
The present snapshot (September 2019) comprises 196,498 geo-referenced profiles originating from 173 countries. They


1 Introduction

According to a recent review, over 800,000 soil profiles have so far been rescued and compiled into databases during the past decades (Arrouays et al., 2017). However, only a fraction thereof is readily accessible (i.e. shared) in a consistent format for the greater benefit of the international community. This paper describes procedures for preserving, quality-assessing, standardising, and subsequently providing consistent world soil data to the international community as developed in the framework of the Data\WoSIS (World Soil Information Service) project since the release of the first snapshot in 2016 (Batjes et al., 2017); this collaborative project draws on an increasingly large complement of shared soil profile data. Ultimately, WoSIS aims to provide consistent harmonised soil data, derived from a wide range of legacy holdings as well as from more recently developed soil datasets derived from proximal sensing (e.g. soil spectral libraries, see Terhoeven-Urselmans et al., 2010; Viscarra Rossel et al., 2016), in an interoperable mode, preferably within the setting of a federated, global soil information system (GLOSIS, see GSP-SDF, 2018).

We follow the definition of harmonisation given by the Global Soil Partnership (GSP, Baritz et al., 2014). It encompasses "providing mechanisms for the collation, analysis and exchange of consistent and comparable global soil data and information".
The following domains need to be considered according to GSP's definition: a) soil description, classification and mapping, b) soil analyses, c) exchange of digital soil data, and d) interpretations. In view of the breadth and magnitude of the task, as indicated soil gaseous emissions (Lutz et al., 2019). In turn, this type of information can help to inform the global conventions such as the UNCCD (United Nations Convention to Combat Desertification) and UNFCCC (United Nations Framework Convention on Climate Change), so that policymakers and business leaders can make informed decisions about the environment and human well-being.

WoSIS workflow
The overall workflow for acquiring, ingesting, and processing data in WoSIS has been described in an earlier paper (Batjes et al., 2017). To avoid repetition, we only name the main steps here (Fig. 1). These are, successively: a) store submitted data sets with their metadata (including the licence defining access rights) in the ISRIC Data Repository; b) import all datasets 'as is' into PostgreSQL; c) ingest the data into the WoSIS data model, including basic data quality assessment and control; d) standardise the descriptions for the soil analytical methods and the units of measurement; and e) ultimately, upon final consistency checks, distribute the quality-assessed and standardised data via WFS (web feature service) and other formats (e.g. TSV for snapshots).

Figure 1. Schematic representation of WoSIS workflow for safeguarding and processing disparate soils data sets.
As indicated, data sets shared with our centre are first stored in the ISRIC Data Repository together with their metadata (currently representing some 452,000 profiles), in particular the licence and data sharing agreement, in line with the ISRIC Data Policy (ISRIC, 2016). For the WoSIS standardisation workflow proper, we only consider those data sets (or profiles) that have a 'non-restrictive' Creative Commons (CC) licence, as well as the defined complement of attributes (see Appendix A). 'Non-restrictive' has been defined here as at least a CC-BY (Attribution) or CC-BY-NC (Attribution Non-Commercial) licence. Presently, this corresponds with data for some 196,498 profiles (i.e. profiles that have the right licence and data for at least one of the standard soil properties). Alternatively, some data sets may only be used for digital soil mapping sensu SoilGrids™; these represent an additional 42,000 profiles, or some 18% of the total amount of standardised profiles (~238,000). Although the latter profiles are quality-assessed and standardised following the regular WoSIS workflow, they are not distributed to the international community, in accordance with the underpinning licence agreements; as such, their description is beyond the scope of the present paper. Finally, several data sets have licences indicating that they should only be safeguarded in the repository; inherently, these are not being used for any data processing.

3 Data screening, quality control and standardisation

Consistency checks
Soil profile data submitted for consideration in WoSIS were collated according to various national or international standards and presented in various formats (from paper to digital). Further, they vary in completeness, as discussed below.
Proper documentation of the provenance and identification of each dataset, and ideally each observation or measurement, is necessary to allow for efficient processing of the source data. The following need to be specified: feature (x-y-z) and time (t) referenced profiles and layers, attribute (class, site, layer-field, and layer-lab), method and value, including units of expression.

To be considered in the actual WoSIS standardisation workflow, each profile must meet several criteria (Table 1). First, we assess whether each profile is geo-referenced, has (consistently) defined upper and lower depths for each layer (or horizon), and has data for at least some soil properties (e.g. sand, silt, clay and pH). Having a soil (taxonomic) classification is considered desirable (case 1), though not mandatory (case 2). Geo-referenced profiles for which only the classification is specified can still be useful for mapping of soil taxonomic classes (case 3). Alternatively, profiles without any geo-reference may still prove useful to develop pedotransfer functions (cases 4 and 5); however, they cannot be served through WFS because there is no geometry (x, y). The remaining cases (6 and 7) are automatically excluded from the WoSIS workflow. This first, broad consistency check led to the exclusion of over 50,000 profiles from the initial complement of soil profiles.

a Such profiles may be used to generate maps of soil taxonomic classes using SoilGrids™ (Hengl et al., 2017b).
b Such profiles (geo-referenced solely according to their country of origin) may be useful for developing pedotransfer functions. Hence, they are standardised though not distributed with the snapshot as they lack (x, y) coordinates.
c Lacking information on the depth of sampling (i.e. layer), the different soil properties cannot be meaningfully grouped to develop pedotransfer functions.
Consistency in layer depth (i.e. sequential increase of the upper and lower depth reported for each layer down the profile) is checked using automated procedures (see Section 3.2). In accord with current internationally accepted conventions, such depth increments are given as 'measured from the surface, including organic layers and mineral covers' (FAO, 2006; Schoeneberger et al., 2012). Prior to 1993, however, the beginning (zero datum) of the profile was set at the top of the mineral surface (the solum proper), except for 'thick' organic layers as defined for peat soils (FAO-ISRIC, 1986; FAO, 1977). Organic horizons were recorded as above and mineral horizons recorded as below, relative to the mineral surface (Schoeneberger et al., 2012, p. 2-6).
Insofar as possible, such 'surficial litter' layers are flagged in WoSIS as an auxiliary variable (see Appendix B) so that they may be filtered out during auxiliary computations of soil organic carbon stocks, for example.
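The depth-consistency rule described above can be illustrated with a short Python sketch. It is not the WoSIS implementation: the function names and the (upper, lower) tuple layout are hypothetical, and treating any negative upper depth as a possible litter layer is a simplification of the pre-1993 convention mentioned in the text.

```python
from typing import List, Tuple

# Hypothetical layout: (upper_depth_cm, lower_depth_cm), ordered from top to bottom.
Layer = Tuple[float, float]


def check_layer_depths(layers: List[Layer]) -> List[str]:
    """Return a description of each depth inconsistency found in a profile."""
    issues = []
    previous_lower = None
    for i, (upper, lower) in enumerate(layers):
        # Within a layer, the lower depth must lie below the upper depth.
        if lower <= upper:
            issues.append(f"layer {i}: lower depth {lower} is not below upper depth {upper}")
        # Down the profile, a layer must not start above the end of the previous one.
        if previous_lower is not None and upper < previous_lower:
            issues.append(f"layer {i}: overlaps the previous layer")
        previous_lower = lower
    return issues


def is_possible_litter_layer(layer: Layer) -> bool:
    """Assumption for illustration: a negative upper depth suggests a litter layer."""
    return layer[0] < 0
```

Gaps between successive layers are deliberately not reported in this sketch, since unsampled depth intervals are common in survey data; whether to flag them is a design choice left to the user.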

3.2 Flagging duplicate profiles

Several source materials, such as the harmonised WISE soil profile database (Batjes, 2009), the Africa Soil Profile Database (AfSP, Leenaars et al., 2014), and the dataset collated by the International Soil Carbon Network (ISCN, Nave et al., 2017), are compilations of shared soil profile data. These three datasets, for example, contain varying amounts of profiles derived from the National Cooperative Soil Survey database (USDA-NCSS, 2018), an important source of freely shared, primary soil data. The original NCSS profile identifiers, however, may not always have been preserved 'as is' in the various data compilations.

To avoid duplication in the WoSIS database, soil profiles located within 100 m of each other are flagged as possible duplicates. Upon additional, semi-automated checks concerning the first three layers (upper and lower depth, and sand, silt and clay content), the duplicates with the least comprehensive complement of attribute data are flagged and excluded from further processing. When still in doubt at this stage, additional visual checks are made with respect to other commonly reported soil properties such as pH (water) and organic carbon content. This laborious, yet critical, screening process (see Ribeiro et al., 2018) led to the exclusion of some 50,000 additional profiles from the initial complement of soil profile data.
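As an illustration of the first screening step (flagging profiles located within 100 m of each other), the following Python sketch compares all pairs of profiles using the haversine great-circle distance. The function names, the dictionary layout and the use of the haversine formula are assumptions made here for illustration; the actual WoSIS procedure is documented in Ribeiro et al. (2018) and is not reproduced.

```python
import math
from itertools import combinations
from typing import Dict, List, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, adequate for a 100 m screen


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance in metres between two WGS84 points (decimal degrees)."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


def flag_possible_duplicates(
    profiles: Dict[str, Tuple[float, float]], threshold_m: float = 100.0
) -> List[Tuple[str, str]]:
    """Return pairs of profile ids whose locations lie within threshold_m of each other.

    profiles maps a (hypothetical) profile id to (latitude, longitude).
    """
    pairs = []
    for (id_a, (lat_a, lon_a)), (id_b, (lat_b, lon_b)) in combinations(profiles.items(), 2):
        if haversine_m(lat_a, lon_a, lat_b, lon_b) <= threshold_m:
            pairs.append((id_a, id_b))
    return pairs
```

The pairwise comparison scales quadratically with the number of profiles; for a collection of this size a spatial index (for example in PostGIS) would be the more practical choice. Flagged pairs are only candidates and still require the attribute-based checks described above.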

Ensuring naming consistency
A next, key stage has been the standardisation of soil property names to the WoSIS conventions, as well as the standardisation of the soil analytical methods descriptions themselves (see Appendix A). Quality checks consider the units of measurement and plausible ranges for defined soil properties (e.g. soil pH cannot exceed 14), using checks on minimum, average and maximum values for each source data set. Data that do not fulfil the requirements are flagged and not considered further in the workflow, unless the observed 'inconsistencies' can easily be fixed (e.g. blatant typos in pH values). The whole procedure, with flowcharts and option tables, is documented in the WoSIS Procedures Manual (see App. D, E and F in Ribeiro et al., 2018).
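A plausible-range check of this kind might look as follows in Python. This is a minimal sketch under stated assumptions: the property keys and the lower bounds are hypothetical, and only the upper limit of 14 for soil pH is taken from the text.

```python
from statistics import mean
from typing import Dict, List, Tuple

# Hypothetical range table; only the pH upper bound (14) comes from the text.
PLAUSIBLE_RANGES: Dict[str, Tuple[float, float]] = {
    "ph": (0.0, 14.0),
    "sand_pct": (0.0, 100.0),
}


def summarise(values: List[float]) -> Tuple[float, float, float]:
    """Minimum, average and maximum of one property within one source data set."""
    return min(values), mean(values), max(values)


def flag_out_of_range(prop: str, values: List[float]) -> List[float]:
    """Return the values that fall outside the plausible range for the property."""
    low, high = PLAUSIBLE_RANGES[prop]
    return [v for v in values if not (low <= v <= high)]
```

For instance, `flag_out_of_range("ph", [5.6, 7.2, 72.0])` singles out the value 72.0, which could be a misplaced decimal point; whether such a value is corrected or excluded remains a manual decision.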
It should be noted that all measurement values are reported as recorded in the source data, subsequent to the above consistency checks (and standardisation of the units of measurement to the target units, see Appendix A). As such, we do not apply any 'gap filling' procedures in WoSIS (for example, when only the sand and silt fractions are reported), nor do we apply pedotransfer functions to derive soil hydrological properties. This next stage of data processing is seen as the responsibility of the data users (modellers) themselves, as the required functions or ways of depth-aggregating the layer data will vary with the projected use(s) of the standardised data (see Finke, 2006; Hendriks et al., 2016; Van Looy et al., 2017).

Providing measures for geographic and attribute accuracy
It is well known that 'soil observations used for calibration and interpolation are themselves not error-free' (Baroni et al., 2017; Cressie and Kornak, 2003; Folberth et al., 2016; Grimm and Behrens, 2010; Guevara et al., 2018; Hengl et al., 2017b; Heuvelink, 2014; Heuvelink and Brown, 2006). Hence, we provide measures for the geographic accuracy of the point locations as well as the accuracy of the laboratory measurements for possible consideration in digital soil mapping and subsequent earth system modelling (Dai et al., 2019).
All profile coordinates in WoSIS are presented according to the World Geodetic System (i.e. WGS84, EPSG code 4326). These coordinates were converted from a diverse range of national projections. Further, the source referencing may have been in decimal degrees (DD) or expressed in degrees, minutes, seconds (DMS) for both latitude and longitude. The (approximate) accuracy of georeferencing in WoSIS is given in decimal degrees. If the source provided degrees, minutes and seconds (DMS), the geographic accuracy is set at 0.01; if seconds are missing (DM), at 0.1; and if both seconds and minutes are missing (D), at 1.
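The rule for inferring geographic accuracy from the completeness of a DMS coordinate can be written down compactly. The Python sketch below is illustrative only: the function names and the use of `None` for a missing component are assumptions, while the three accuracy values (0.01, 0.1 and 1 decimal degrees) are those given in the text.

```python
from typing import Optional


def inferred_accuracy_dd(minutes: Optional[float], seconds: Optional[float]) -> float:
    """Approximate accuracy (decimal degrees) from the components present in the source."""
    if minutes is None:
        return 1.0   # degrees only (D)
    if seconds is None:
        return 0.1   # degrees and minutes (DM)
    return 0.01      # degrees, minutes and seconds (DMS)


def dms_to_dd(degrees: float, minutes: Optional[float] = None,
              seconds: Optional[float] = None, negative: bool = False) -> float:
    """Convert a DMS coordinate to decimal degrees; missing components count as zero.

    negative=True marks southern latitudes or western longitudes.
    """
    dd = abs(degrees) + (minutes or 0.0) / 60.0 + (seconds or 0.0) / 3600.0
    return -dd if negative else dd
```

A coordinate recorded as 52° 30' without seconds would thus be converted with `dms_to_dd(52, 30)` and carry an inferred accuracy of 0.1 decimal degrees.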
For most profiles (86 %, see Table 2), the approximate accuracy of the point locations, as inferred from the original coordinates given in the source datasets, is less than 10 m (total = 196,498 profiles, see Section 4). Typically, the geo-referencing of soil profiles described/sampled before the advent of GPS (Global Positioning Systems) in the 1970s is less accurate; sometimes we just do not know the 'true' accuracy. Digital soil mappers should duly consider the inferred geometric accuracy of the profile locations in their applications (Grimm and Behrens, 2010), since the soil observations and covariates may not actually correspond (Cressie and Kornak, 2003), both in space and time (see Section 4, second paragraph).

As indicated, soil data considered in WoSIS have been analysed according to a wide range of analytical procedures, and in different laboratories. An indication of the measurement uncertainty is thus desired; soil laboratory-specific Quality Management Systems (van Reeuwijk, 1998) as well as laboratory proficiency-testing (PT, Magnusson and Örnemark, 2014; Munzert et al., 2007; WEPAL, 2019) can provide this type of information. Yet, calculation of laboratory-specific measurement uncertainty for a single method, or for multiple analytical methods, will require several measurement rounds (years of observation) and solid statistical analyses. Overall, such detailed information is not available for the data sets submitted to the ISRIC data repository. Therefore, out of necessity, we have distilled the desired information from the PT literature (Kalra and Maynard, 1991; Rayment and Lyons, 2011; Rossel and McBratney, 1998; van Reeuwijk, 1983; WEPAL, 2019), in so far as technically feasible.
For example, accuracy for bulk density measurements, both for the direct core and the clod method, has been termed 'low' (though not quantified) in a recent review (Al-Shammary et al., 2018); using expert knowledge, we have assumed this corresponds with an uncertainty (or variability, expressed as coefficient of variation) of 35 %. Alternatively, for organic carbon content the mean variability was 17 % (with a range of 12 to 42 %) and for 'CEC buffered at pH 7' it was 18 % (range 13 to 25 %) when multiple laboratories analyse a standard set of reference materials using similar operational methods (WEPAL, 2019). For soil pH measurements (log scale), we have expressed the uncertainty in terms of '± pH units'.
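To show how such an indicative coefficient of variation (CV) might be carried along with a reported value, the Python sketch below converts a value and a CV into a simple plus-or-minus band. Applying an inter-laboratory CV to a single measurement as if it were one standard deviation is an assumption made here for illustration, not a procedure prescribed by WoSIS; the function name is hypothetical.

```python
from typing import Tuple


def cv_band(value: float, cv_percent: float) -> Tuple[float, float]:
    """Return (value - s, value + s), with s = value * CV / 100.

    Assumption: the inter-laboratory CV is read as a relative standard deviation
    that applies to the individual value.
    """
    spread = abs(value) * cv_percent / 100.0
    return value - spread, value + spread


# Example: an organic carbon content of 20 g/kg with the indicative CV of 17 %
low, high = cv_band(20.0, 17.0)
```

In this example the band spans roughly 16.6 to 23.4 g/kg. For soil pH, where uncertainty is expressed in pH units rather than as a percentage, an additive band would be used instead.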
Importantly, the figures for measurement accuracy presented in Appendix A represent first approximations. They are based on the inter-laboratory comparison of well-homogenised reference samples for a still relatively small range of soil types. These indicative figures should be refined once specific, laboratory and method-related accuracy (i.e. systematic and random error) information is provided with/for the shared soil data, for example using the procedures described by Eurachem (Magnusson and Örnemark, 2014). Alternatively, this type of information may be refined in the context of international laboratory PT networks such as GLOSOLAN and WEPAL. Meanwhile, the present 'first' estimates may already be considered to calculate the accuracy of digital soil maps and of any interpretations derived from them (e.g. maps of soil organic carbon stocks in support of the UNCCD LDN (Land Degradation Neutrality) effort).

Spatial distribution of soil profiles and number of observations
The present snapshot includes standardised data for 196,498 profiles (Fig. 2), about twice the amount represented in the 'July 2016' snapshot. These are represented by some 832,000 soil layers (or horizons). In total, this corresponds with over 5.8 million records that include both numeric (e.g. sand content, soil pH, and cation exchange capacity) and class (e.g. WRB soil classification and horizon designation) properties. The naming conventions and standard units of measurement are provided in Appendix A, and the file structure in Appendix B.
Being a compilation of national soil data, the profiles were sampled over a long period of time. The dates reported in the snapshot reflect the year the respective data were sampled/analysed: 1,397 (0.7%) profiles were sampled before 1920, 218 (0.1%) between 1921 and 1940, 7,657 (3.9%) between 1941 and 1960, 26,614 (13.5%) between 1961 and 1980, 62,691 (31.9%) between 1981 and 2000, and 31,084 (15.8%) between 2001 and 2020, while the date of sampling is unknown for 66,837 profiles (34.0%). This information should be taken into consideration when linking the point data with environmental covariates, such as land use, in digital soil mapping.

The overall density of observations is 1.35 profiles per 1000 km². The actual density of observations varies greatly, both between countries (Appendix C) and within each country, with the largest densities of 'shared' profiles reported for Belgium (228 profiles per 1000 km²) and Switzerland (265 profiles per 1000 km²). There are still relatively few profiles for Central Asia, South East Asia, Central and Eastern Europe, Russia, and the northern circumpolar region. The number of profiles by biome (Olson et al., 2001b) and by broad climatic region (Sayre et al., 2014), as derived from GIS overlays, is provided in Appendix D for additional information.

There are more observations for the chemical data than the physical data (see Appendix A), and the number of observations generally decreases with depth, largely depending on the objectives of the original soil surveys. The interquartile range for maximum depth of soil sampled in the field is 56-152 cm, with a median of 110 cm (mean = 117 cm). In this respect, it should be noted that some specific-purpose surveys only considered the topsoil (e.g. soil fertility surveys), while others systematically sampled soil layers up to depths exceeding 20 m.
Present gaps in the geographic distribution (Appendix C and D) and range of soil attribute data (Appendix A) will gradually be filled in the coming years, although this depends largely on the willingness or ability of data providers to share (some of) their data for consideration in WoSIS. For the northern Boreal and Arctic region, for example, ISRIC will regularly ingest new profile data collated by the International Soil Carbon Network (ISCN, Malhotra et al., 2019). Alternatively, it should be reiterated that for some regions, such as Europe (e.g. EU LUCAS topsoil database, see Tóth et al., 2013) and the state of Victoria (Australia), there are holdings in the ISRIC repository that may only be used/standardised for SoilGrids™ applications due to licence restrictions. Consequently, the corresponding profiles (~42,000) are not shown in Figure 2, nor are they considered in the descriptive statistics in Appendix C.

5 Distributing the standardised data

Upon their standardisation, the data are distributed through ISRIC's SDI (Spatial Data Infrastructure). This web platform is based on open source technologies and open web services (WFS, WMS, WCS, CSW) following Open Geospatial Consortium (OGC) standards, and is aimed specifically at handling soil data; our metadata are organised following standards of the International Organization for Standardization (ISO-28258, 2013) and are INSPIRE (2015) compliant. The three main components of the SDI are: PostgreSQL + PostGIS, GeoServer and GeoNetwork. Visualisation and data download are done in GeoNetwork with resources from GeoServer (https://data.isric.org). The third component is the PostgreSQL database, with the spatial extension PostGIS, in which WoSIS resides; the database is connected to GeoServer to permit data download from GeoNetwork. These processes are aimed at facilitating global data interoperability and citeability in compliance with FAIR principles: the data should be 'findable, accessible, interoperable, and reusable' (Wilkinson et al., 2016). With partners, steps are being undertaken towards the development of a federated, and ultimately interoperable, spatial soil data infrastructure (GLOSIS) through which source data are served and updated by the respective data providers, and made queryable according to a common SoilML standard (OGC, 2019).
The procedure for accessing the most current set of standardised soil profile data ('wosis_latest'), either from R or QGIS using WFS, is explained in a detailed tutorial (Rossiter, 2019). This data set is dynamic; hence it will grow when new point data are shared and processed, additional soil attributes are considered in the WoSIS workflow, and/or when possible corrections are required. Potential errors may be reported online via a 'google group' so that they may be addressed in the dynamic version (register via: https://groups.google.com/forum/#!forum/isric-world-soil-information). For consistent citation purposes, we provide static snapshots of the standardised data, in tab-separated values format, with unique DOIs (digital object identifiers); as indicated, this paper describes the second WoSIS snapshot.
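The tutorial cited above covers access from R and QGIS. For readers working in Python, the sketch below shows how a standard OGC WFS 2.0.0 GetFeature request could be composed. Only the generic WFS key-value parameters are used; the base URL and the layer name are placeholders (assumptions), and the actual service address and layer names should be taken from the ISRIC data portal or the tutorial.

```python
from urllib.parse import urlencode


def build_getfeature_url(base_url: str, type_name: str, max_features: int = 10,
                         output_format: str = "application/json") -> str:
    """Compose a WFS 2.0.0 GetFeature request URL (no network access is made)."""
    params = {
        "service": "WFS",
        "version": "2.0.0",
        "request": "GetFeature",
        "typeNames": type_name,          # layer to query
        "count": max_features,           # limit the number of returned features
        "outputFormat": output_format,   # must be a format offered by the server
    }
    return f"{base_url}?{urlencode(params)}"


# Placeholder endpoint and layer name, for illustration only
url = build_getfeature_url("https://example.org/geoserver/wfs", "wosis_latest:example_layer")
```

Limiting the number of features with `count` is advisable when exploring a large, dynamic data set; the output formats actually offered are listed in the service's GetCapabilities response.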

Discussion
The above procedures describe standardisation according to operational definitions for soil properties. Importantly, it should be stressed here that the ultimate, desired full harmonisation to an agreed reference method Y, for example 'pH H2O, 1:2.5 soil/water solution' for, say, all 'pH 1:x H2O' measurements, will only become feasible once the target method (Y) for each property has been defined, and subsequently accepted by the international soil community. A next step would be to collate/develop 'comparative' data sets for each soil property, i.e. sets with samples analysed according to a given reference method (Yi) and the corresponding national methods (Xj), for pedotransfer function development. In practice, however, such relationships will often be soil type and region specific (see Appendix C in GlobalSoilMap, 2015). Alternatively, according to GLOSOLAN (Suvannang et al., 2018, p. 10) "comparable and useful soil information (at the global level) will only be attainable once laboratories agree to follow common standards and norms". In such a collaborative process, it will be essential to consider the end user's requirements in terms of quality and applicability of the data for their specific purposes (i.e. fitness for intended use). Over the years, many organisations have developed or implemented analytical methods, and quality assurance systems, that are well suited for their countries (e.g. Soil Survey Staff, 2014a) or regions (Orgiazzi et al., 2018) and thus, pragmatically, may not be inclined to implement the anticipated GLOSOLAN standard analytical methods.

7 Data availability

Snapshot 'WoSIS_2019_September' is archived for long-term storage at ISRIC - World Soil Information, the World Data Centre for Soils (WDC-Soils) of the ISC (International Science Council, formerly ICSU) World Data System (WDS). It is freely accessible at https://dx.doi.org/10.17027/isric-wdcsoils.20190901. The zip file (154 MB) includes a 'readme first' file that describes key aspects of the data set (see also Appendix B) with reference to the WoSIS Procedures Manual (Ribeiro et al., 2018), and the data itself in TSV format (1.8 GB, decompressed) and GeoPackage format (2.2 GB, decompressed).

Conclusions
• The second WoSIS snapshot provides consistent, standardised data for some 196,000 profiles worldwide. However, as described, there are still important gaps in terms of geographic distribution as well as range of soil taxonomic units and/or properties represented. These issues will be addressed in future releases, depending largely on the success of our targeted requests and searches for new data providers and/or partners worldwide.
• We will increasingly consider data derived by soil spectroscopy and emerging innovative methods. Further, long-term time series at defined locations will be sought to support space-time modelling of soil properties, such as changes in soil carbon stocks or soil salinity.
• We provide measures for geographic accuracy of the point data as well as a first approximation for the uncertainty associated with the operationally defined analytical methods. This information may be used to assess uncertainty in digital soil mapping and earth system modelling efforts that draw on the present set of point data.
• Capacity building and cooperation among (inter)national soil institutes will be necessary to create and share ownership of the soil information newly derived from the shared data, and to strengthen the necessary expertise and capacity to further develop and test the world soil information service worldwide. Such activities may be envisaged within the broader framework of the Global Soil Partnership and the emerging GLOSIS system.

Kalra and Maynard, 1991; Rayment and Lyons, 2011; Rossel and McBratney, 1998; van Reeuwijk, 1983; WEPAL, 2019). These figures are first approximations that will be fine-tuned once more specific results of laboratory proficiency tests, or national Soil Quality Management systems, become available.
b Generally, the fine earth fraction is defined as being < 2 mm. Alternatively, an upper limit of 1 mm was used in the former Soviet Union and its satellite states (Katchynsky scheme). This has been indicated in file 'wosis_201907_layers_chemical.tsv' and 'wosis_201907_layer_physicals.tsv' for those soil properties where this differentiation is important (see 'sample pretreatment' in string 'xxxx_method', Appendix B).
c Provided only when the sum of clay, silt and sand fraction is ≥ 90 and ≤ 100 percent.
d Where available, the 'cleaned' (original) layer/horizon designation is provided for general information; these codes have not been standardised as they vary widely between different classification systems (Bridges, 1993; Gerasimova et al., 2013). When horizon designations are not provided in the source databases, we have flagged all layers with an upper depth given as being negative in the source databases (e.g. -10 to 0 cm, that is, using pre-1993 conventions; see text and WoSIS Procedures Manual 2018, p. 24, footnote 9) as likely being 'litter' layers.
wosis_201909_attributes.tsv: This file lists the four to six letter codes for each attribute, whether the attribute is a site or horizon property, the unit of measurement, the number of profiles and layers represented in the snapshot, and a brief description of each attribute, as well as the inferred uncertainty for each property (Appendix A).
wosis_201909_profiles.tsv: This file contains the unique profile ID (i.e. primary key), the source of the data, country ISO code and name, accuracy of geographical coordinates, latitude and longitude (WGS 1984), point geometry of the location of the profile, maximum depth of soil described and sampled, as well as information on the soil classification system and edition. Depending on the soil classification system used, the number of fields will vary. For example, for the World Reference Base for Soil Resources (WRB) system these are: publication_year (i.e. version), reference_soil_group_code, reference_soil_group_name, and the name(s) of the prefix (primary) qualifier(s) and suffix (supplementary) qualifier(s). The terms principal qualifier and supplementary qualifier are currently used (IUSS Working Group WRB, 2015); earlier WRB versions used prefix and suffix for this (e.g. IUSS Working Group WRB, 2006). Alternatively, for USDA Soil Taxonomy, the version (year), order, suborder, great group, and subgroup can be accommodated (Soil Survey Staff, 2014b). Inherently, the number of records filled will vary between (and within) the various source databases.
The corresponding field names are listed below:
For example, in the case of electrical conductivity (ELCO), the method is described using: sample pretreatment (e.g. sieved over 2 mm size), solution (e.g. water), ratio (e.g. 1:5), and ratio base (e.g. weight/volume). Details for each method are provided in the WoSIS Procedures Manual (Appendix D, E and F in Ribeiro et al., 2018).
xxxx_date: array listing the date of observation for each value
xxxx_dataset_id: abbreviation for source data set (e.g. WD-ISCN)
xxxx_profile_code: code for given profile (provides the link to profile_id in wosis_201909_profiles.tsv)
xxxx_license: licence for given data, as indicated by the data provider (e.g. CC-BY)
(...): as above, but for the next attribute (for full list see Appendix A)

Format: All fields in the above files are tab-delimited, with double quotation marks as text delimiters. File coding is according to the UTF-8 unicode transformation format.
Using the data: The above TSV files can easily be imported into an SQL database or statistical software such as R, after which they may be joined using the unique profile_id. Guidelines for handling and querying the data are provided in the WoSIS Procedures Manual (Ribeiro et al., 2018, p. 45-48); see also the detailed tutorial by Rossiter (2019).
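As a complement to the SQL and R routes mentioned above, the following Python sketch reads tab-delimited text with double quotation marks as text delimiters and joins layer records to their profile via profile_id, using only the standard library. The two small in-memory tables and all column names other than profile_id are invented for illustration and do not reproduce the actual snapshot files.

```python
import csv
import io
from typing import Dict, List

# Invented sample data in the snapshot's format (tab-delimited, double-quoted text).
PROFILES_TSV = "\n".join([
    '"profile_id"\t"country_name"',
    '"1"\t"Belgium"',
    '"2"\t"Kenya"',
]) + "\n"

LAYERS_TSV = "\n".join([
    '"profile_id"\t"upper_depth"\t"lower_depth"',
    '"1"\t"0"\t"20"',
    '"1"\t"20"\t"50"',
    '"2"\t"0"\t"15"',
]) + "\n"


def read_tsv(text: str) -> List[Dict[str, str]]:
    """Parse tab-delimited text whose fields are wrapped in double quotes."""
    return list(csv.DictReader(io.StringIO(text), delimiter="\t", quotechar='"'))


def join_on_profile_id(profiles: List[Dict[str, str]],
                       layers: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Attach the profile attributes to each layer record (inner join on profile_id)."""
    by_id = {row["profile_id"]: row for row in profiles}
    return [{**by_id[layer["profile_id"]], **layer}
            for layer in layers if layer["profile_id"] in by_id]


joined = join_on_profile_id(read_tsv(PROFILES_TSV), read_tsv(LAYERS_TSV))
```

For the real files, `io.StringIO(text)` would be replaced by an opened file handle (with `encoding="utf-8"` and `newline=""`); given the size of the decompressed snapshot, a database or a dataframe library may be more practical than plain lists of dictionaries.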

9 Competing interests. The authors declare that they have no conflict of interest.
The development of WoSIS has been made possible thanks to the contributions and shared knowledge of a steadily growing number of data providers, including soil survey organisations, research institutes and individual experts, for which we are grateful; for an overview please see https://www.isric.org/explore/wosis/wosis-contributing-institutions-and-experts. We thank our colleagues Laura Poggio, Luis de Sousa and Bas Kempen for their constructive comments on a 'pre-release' of the snapshot data. Further, the manuscript benefitted from the constructive comments provided by the two reviewers.
ISRIC − World Soil Information, legally registered as International Soil Reference and Information Centre, receives core funding from the Dutch Government.