Virtual water trade and water footprint of agricultural goods: the 1961-2016 CWASI database

To support national and global assessments of water use in agriculture, we build a comprehensive database of country-specific water footprint and virtual water trade (VWT) data for hundreds of agricultural goods. The water footprint, indicating the water needed for the production of a good including rainwater and water withdrawals, is expressed as a volume per unit weight of the good (or unit water footprint, uWF) and is here estimated at the country scale for every year in the period 1961-2016. The uWF is also differentiated, where possible, between production and supply, referring to local production and to a weighted mean of local production and import, respectively. The VWT data, representing the amount of water needed for the production of a good and virtually exchanged through international trade, are provided for each commodity as bilateral trade matrices, between origin and destination countries, for every year in the period 1986-2016. The database, developed within the CWASI project, improves upon earlier datasets because it takes into account the annual variability of the uWF of crops, it accounts for both produced and imported goods in the definition of the uWF and it traces goods across international trade up to the origin of the goods' production. The CWASI database is available on the Zenodo repository at doi.org/10.5281/zenodo.3987468 (Tamea et al., 2020) and welcomes contributions and improvements from the research community to enable analyses specifically accounting for the temporal evolution of the uWF.


Introduction
There has been a booming interest in the concept of Water Footprint (WF) since its introduction about 15 years ago (Hoekstra & Chapagain, 2007, 2008). The water footprint offers a common approach, language and method to a wide range of analyses and multidisciplinary studies, and it is appreciated for its capability to convey environmental messages to the public. The WF identifies the freshwater needed for the production of goods along the full supply chain, separating water from precipitation (green) and water from surface- and ground-water bodies (blue). The WF assessment provides a quantitative framework to analyse the volume of water embedded in agricultural goods and the efficiency of water use, when the metric is computed per unit weight of the good (hereafter referred to as the unit water footprint, or uWF). The term unit water footprint is here introduced to unify the current terminology, which includes "water footprint", used indifferently for volumes and for volumes per unit weight, "crop water footprint", which excludes livestock products, and "virtual water content", mainly used within the context of trade (see, e.g., Hoekstra et al., 2011; Konar et al., 2011; Dalin et al., 2012; Tuninetti et al., 2015). The concept of virtual water, originally proposed by J. A. Allan (1998) and from which the WF originated, has also been growing in popularity among both the scientific community and the general public. Virtual water is the volume of water needed to produce a certain good that is virtually traded as a factor of production when the good is exchanged among countries. Such virtual flow defines the international virtual water trade (VWT) and represents a metric that is suitable to analyse environmental and multidisciplinary aspects related to the global trade of agricultural goods, to water management and to agricultural policy.
Assessment of WF and VWT requires a relatively large amount of data, including production and trade data (in tonnes) and unit water footprint data (in cubic meters per tonne). The first database of uWF data has been prepared and shared by the Water Footprint Network, which published a large open-access dataset of uWF for several primary and processed agricultural goods, of crop and animal origin (Mekonnen & Hoekstra, 2010a, 2010b). This database, named WaterStat, includes average values over the period 1996-2005 and has been the basis of the water footprint assessment as presented, e.g., in Hoekstra et al. (2011). Other uWF datasets exist, which are based on spatially distributed models coupling the soil water balance with vegetation growth (see, e.g., Tuninetti et al., 2015, and references therein); such databases mostly refer to a single year or to long-term averages. Other datasets, referring to blue water or to scarcity-weighted indicators, are also available from the literature related to the Life Cycle Assessment (e.g., Pfister et al., 2011; Pfister et al., 2016). The temporal variability of uWF has been seldom considered. A few examples include water scarcity indexes (e.g., Pfister & Bayer, 2014), or annual time series of uWF in the EORA database, based on assumptions about the economic growth of different production sectors (Lenzen et al., 2013).
International trade statistics of agricultural goods are organized and shared by, e.g., the Food and Agriculture Organization (FAOSTAT) and the United Nations (UN-COMTRADE). Early publications by the Water Footprint Network (e.g., Hoekstra & Chapagain, 2011) are based on the combination of such trade databases and WaterStat to produce WF assessments. Trade data are also organized and shared as Input-Output tables, tracing supply chains across sectors and countries, whose worldwide dimension is captured by global multi-regional input-output tables (MRIO) (see Tukker & Dietzenbacher, 2013, for a review). In such a framework, some MRIO databases offer specific water-related extensions, quantifying water volumes associated to international trade (e.g., Geschke & Hadjikakou, 2017). Two relevant examples are the EORA database (Lenzen et al., 2013) and EXIOBASE (Stadler et al., 2018), both including a water assessment distinguishing between green and blue water and including the temporal variability, although product categories and geographical regions are more aggregated than in the present study. Supply chains and trade of specific products, with their impact on the local environment and the water resources, are also the objectives of the TRASE project developed by the Stockholm Environment Institute and the Global Canopy Programme (SEI, 2019). This project focuses on a limited set of products, although accurately investigating their supply chain and environmental effects.
Methodologies for VWT and WF assessment can be classified in two approaches: the bottom-up and the top-down approach. The bottom-up approach refers to a process-based analysis, with a detailed description of production processes and associated water volumes. Within such approach, the uWF of each good is multiplied by the (produced or traded) quantity of such good and the resulting water volumes are then summed across goods. WaterStat is the main example of a bottom-up approach. The top-down approach aims at tracing full supply chains throughout economic sectors and different countries. Input-output analyses, frequently used in Economics for environmental assessments, belong to this approach (Duarte & Yang, 2011). Bottom-up approaches do not consider the entire supply chain of goods and can be affected by truncation errors when used to assess the water footprint of final consumption (Feng et al., 2011). At the same time, bottom-up techniques can offer high commodity resolution, making it possible, for example, to differentiate the water associated to the production of a large variety of single (agricultural) products. A major problem affecting bottom-up approaches is the identification of the geographic origin of produced goods (Hubacek & Feng, 2016). In many cases, product re-export disconnects producing and consuming countries, not allowing a correct identification of dependencies and externalities. In the present work, we improve the traditional bottom-up approach by identifying the origin of produced goods and reconstructing the supply chain of agricultural goods, implementing the method proposed in Kastner et al. (2011). With such improvement, the present database aims both at best estimating the water embodied in bilateral trade and at providing accurate estimates of the total virtual water embedded in final consumption (Feng et al., 2011; Lenzen et al., 2013).
In this publication, we present an open-access database of virtual water trade, including the annual trade matrices (years 1986-2016) and the annual virtual water export (years 1961-2016) associated to a large number of agricultural products, as well as their unit water footprint in all countries (years 1961-2016), referring to the sum of green water (originated from rainfall) and blue water (originated from surface- and ground-water bodies). Starting from the uWF dataset in Mekonnen & Hoekstra (2010a, 2010b), we extend it to consider the temporal variability of uWF. Improvements also include the differentiation between the production-side and the supply-side uWF. The new time-varying uWF are applied to the FAO datasets of agricultural production, of country export, and to the reconstructed detailed trade matrices, generating the CWASI database.
The database addresses several needs: (i) the need for a comprehensive and harmonized database of uWF, WF and VWT, (ii) the need to adopt unit water footprints that vary in time, as recently pointed out by D'Odorico et al. (2019), (iii) the need to disentangle the production-side and the supply-side uWF to coherently assess the WF of production and consumption, (iv) the need for ready-to-use detailed trade matrices, accurately tracing goods' trade and origin, suitable for network analyses. The uWF dataset may also be useful for other methodologies of WF and VWT assessments, such as those based on input-output matrices or the one proposed in the ISO standardization (ISO, 2014).
The present database has been developed within the EU-funded CWASI project "Coping with WAter Scarcity In a globalized world" and it is shared through an online open-access repository (Tamea et al., 2020). In a relatively recent overview of the field, the research lines that originated from the concept of WF were identified (Hoekstra, 2017). These are (i) the role of trade and globalization in goods production and consumption and how they affect local water issues, (ii) the comparison of water requirements with water availability and renewability, (iii) the supply-chain approach applied to water management, and (iv) the combination of green, blue and grey indicators in a single framework. With the CWASI database we aim at contributing to these research lines and at providing all researchers with an up-to-date and ready-to-use starting point for their research. The database will welcome additions and external contributions that may become available in the future and will represent an open and shared source of data on water footprint and virtual water trade.

Data and preliminary arrangements
From the database of the Food and Agriculture Organization of the United Nations (FAOSTAT, 2018), we collected 31 years (1986-2016) of trade data of agricultural goods, including crop-based and animal-based commodities. Data originate from national accountings and are available as records with the following information: reporting country (with FAO code), type of trade (import or export), partner country reported within the trade record (with FAO code), year, commodity (with FAO code), unit of measure, quantity.
From FAOSTAT (2018), we also collected 56 years (1961-2016) of data on country export, regardless of the destination, and on agricultural production, with the following information: producing/exporting country (with FAO code), year, commodity (with FAO code), unit of measure, quantity.
From FAOSTAT (2018), data of agricultural yield and harvested area have also been collected for each considered crop, country and year in the period 1961-2016. Reference unit water footprint values for every commodity and country, averaged over the period 1996-2005, are taken from WaterStat (Mekonnen & Hoekstra, 2010a, 2010b), as well as the product fraction and the value fraction needed for the computation of the uWF of processed crops.

Commodities
Production and trade data collected from FAOSTAT (2018) include crops, processed crops, livestock primary, livestock processed and live animals. The commodities currently included in the CWASI database are 357 and have been identified as those whose FAO code or item name or description could be associated to a WaterStat database entry (commodities are listed in Table SI.1). Commodities include all products in the "Crop" production statistics of FAO and many processed crops, with the exception of feed products (such as bran and cake); animals and animal-based products are included for the most relevant species although not for others (such as camels, rodents, etc.). Among all commodities, some appear in both trade and production data, some appear only in trade and some others appear only in production. Production data are only available for primary goods and for a few processed goods, while trade includes primary and a larger set of processed goods. For example, wheat flour and bread are only available as trade data because production data only include the primary commodity (wheat). Conversely, yams and sugar cane are only available as production data because their trade is not recorded in the FAO statistics, possibly because they are not internationally exchanged as raw products. Commodities have been subdivided into 9 categories, whose numbers of produced and traded commodities are specified in Figure 1. The FAOSTAT database provides, for each commodity and year, the amounts of goods produced (or traded) in any given country (or pair of countries) expressed in tons or heads, depending on the type of product.

Countries
The database considers all geographical/political/economical entities reporting (or reported for) at least one product and one year, either in the trade or in the production data. From 1961 to 2016, agricultural goods were produced and traded among 255 entities having a temporary or permanent activity (the full list is reported in Table SI.2). Not all the 255 countries were active along the whole considered period, as they underwent political-administrative changes. Examples include the collapse of the USSR, the separation of Eritrea from Ethiopia, and the splitting of Belgium and Luxembourg, which were considered a single entity until year 2000. Despite being inactive, a country may be reported by partners as importing or exporting goods. Values reported for a country outside its range of active years are associated to the corresponding larger active country (e.g., a trade reported towards USSR in 1992 is associated to the Russian Federation). China has been considered as the aggregation of the following non-overlapping FAO entries: "China, Mainland", "China, Hong Kong SAR", "China, Macao SAR", "China, Taiwan Province of". There are also a few entries of unclear allocation (Neutral Zone, Unspecified): they are listed but their values are not considered. The period of activity of each country is listed in the Supplementary Information, together with the associations adopted during country inactivity (Table SI.3).
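The country associations described above can be sketched as a small lookup step. This is only an illustration: the function name `resolve_entity`, the table layout and the year from which the USSR is treated as inactive are assumptions made here, and the actual associations are those listed in the Supplementary Information.

```python
# Hypothetical association table: inactive entity -> (first inactive year, successor).
# Only the USSR example comes from the text; the year bound is an assumption.
ASSOCIATIONS = {
    "USSR": (1992, "Russian Federation"),
}

# The four FAO entries aggregated into a single "China" entity (from the text).
CHINA_ENTRIES = {
    "China, Mainland",
    "China, Hong Kong SAR",
    "China, Macao SAR",
    "China, Taiwan Province of",
}

# Entries of unclear allocation: listed, but their values are not considered.
UNALLOCATED = {"Neutral Zone", "Unspecified"}


def resolve_entity(name, year):
    """Return the entity a record is attributed to, or None if it is discarded."""
    if name in UNALLOCATED:
        return None
    if name in CHINA_ENTRIES:
        return "China"
    if name in ASSOCIATIONS:
        inactive_from, successor = ASSOCIATIONS[name]
        if year >= inactive_from:
            return successor
    return name
```

With this sketch, `resolve_entity("USSR", 1992)` is attributed to the Russian Federation, while a record for the USSR in an earlier year keeps its original entity.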

Trade matrices
The detailed trade data provided by FAOSTAT include the international trade records reported by each country. Reporting countries across the years are 184, whereas the remaining ones (up to 255) are only reported by others. There is a total of 9 million records (i.e., trade flows per country pair, per commodity and per year, for the commodities included in the CWASI dataset) and the number of records reported by each country is detailed in Figure 2. These records are used to reconstruct the trade matrix $T$ for each commodity and year, having dimensions 255 x 255 and carrying the exporting countries on the rows and the importing countries on the columns. The matrix element $T(i, j)$ thus identifies the trade flow from country $i$ to country $j$, which in general differs from the flow from country $j$ to country $i$, i.e. $T(i, j) \neq T(j, i)$. Sub-national trade is not considered in these matrices and the terms on the diagonal are zeros.
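As an illustration of the matrix layout, the following sketch assembles a trade matrix for one commodity and one year from a list of records. The function name, the record layout and the three country labels are hypothetical, and the records are assumed to be already free of double reporting.

```python
def build_trade_matrix(records, countries):
    """Build a square trade matrix for one commodity and one year.

    records   -- iterable of (exporter, importer, quantity) tuples
    countries -- ordered list of country names; fixes the row/column order
    Rows are exporters and columns are importers, as in the text.
    """
    index = {name: k for k, name in enumerate(countries)}
    n = len(countries)
    matrix = [[0.0] * n for _ in range(n)]
    for exporter, importer, quantity in records:
        if exporter == importer:
            continue  # sub-national trade is not considered: diagonal stays zero
        matrix[index[exporter]][index[importer]] = quantity
    return matrix


countries = ["A", "B", "C"]  # hypothetical country labels
records = [("A", "B", 10.0), ("B", "A", 4.0), ("A", "C", 2.5)]
T = build_trade_matrix(records, countries)
```

Here `T[0][1]` holds the flow from A to B and `T[1][0]` the flow from B to A, which are stored independently.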
A major problem arising in the construction of the trade matrices is that the same trade flow can be reported twice in the FAOSTAT database, once by the exporting country and once by the importing country. When a trade flow is reported by only one of the two countries, the reported flow is used to construct the matrix (single record); this is the case for 40% of the records in the database. All other records are "double" (reported twice) and require a comparison between the declarations of the exporting and the importing countries, which are usually different and show a mean (absolute) relative difference, across all items, countries and years, of 61%.
The choice of a value from two double records is called "reconciliation" and it is required for 42% of the non-null flows in the final trade matrices. The method adopted here is based on the identification of the most reliable reporting country, among the two involved in each flow, and on the use of the flow it reports. The reliability of countries is measured per commodity and per year with a data-based approach detailed below and adapted from Gehlhar (1996).

Country reliability
For each product, $p$, and year, $t$, two trade matrices are built, one collecting all "Importer-Reported" flows and the other collecting the "Exporter-Reported" flows. The matrices have the same structure and dimensions, with the exporting countries on the rows and the importing countries on the columns. Then a reliability index is calculated for each country, $c$, differentiating between import and export.
First, an accuracy measure is defined for every flux from country $i$ to country $j$ (Eq. (1)), comparing the importer-reported trade flux with the exporter-reported flux. The measure is modified from Gehlhar (1996) to maintain the conceptual symmetry between import and export. The smaller the measure, the more similar the information reported by the importing and the exporting country.
Then, the reliability of each country is measured, separately for import and export, based on the comparison between the flows reported by the country and by its trade partners. For every country $c$, a reliability index for imports and one for exports are defined (Eq. (2)) from, respectively, the fluxes from any country $i$ to $c$, as reported by $c$ (importer-reported), and the fluxes from $c$ to any country $j$, as reported by $c$ (exporter-reported). In Equation (2), $\Sigma_{all}$ is the sum of all import or export fluxes reported by $c$ and $\Sigma_{acc}$ is the sum of acceptable fluxes only, defined as the fluxes whose accuracy measure (Eq. (1)) is smaller than an acceptance threshold, set to 20% as in Gehlhar (1996). Equation (2) also involves the import from, and the export to, the worst partner, defined as the one having the maximum (worst) flow-weighted accuracy measure, computed separately for import and export fluxes (Eq. (3)).
Having computed all reliability indexes, the "reconciled" trade matrix for each item and year is built, combining importer-reported and exporter-reported data. Each matrix element $(i, j)$ is taken from the importer-reported or from the exporter-reported matrix, depending on whether the importing country $j$ or the exporting country $i$ has the larger reliability index. Where the reliability indexes are equal, the country having larger acceptable fluxes is chosen.
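The reconciliation step can be sketched as follows, taking the reliability indexes as given inputs. The function name and the dictionary layout are hypothetical, and the tie-breaking rule is deliberately simplified (ties go to the exporter here, whereas the text uses the size of the acceptable fluxes).

```python
def reconcile(importer_reported, exporter_reported, ri_import, ri_export):
    """Combine importer- and exporter-reported flows into one set of flows.

    importer_reported, exporter_reported -- dicts {(exporter, importer): quantity}
    ri_import, ri_export -- dicts {country: reliability index}, taken as given
    """
    reconciled = {}
    for pair in set(importer_reported) | set(exporter_reported):
        exporter, importer = pair
        m = importer_reported.get(pair)
        x = exporter_reported.get(pair)
        if m is None or x is None:
            # Single record: the only available declaration is used.
            reconciled[pair] = m if x is None else x
        elif ri_import[importer] > ri_export[exporter]:
            reconciled[pair] = m  # the importer is the more reliable reporter
        else:
            # Simplification: ties also fall here, in favour of the exporter.
            reconciled[pair] = x
    return reconciled


# Hypothetical declarations and reliability indexes for three countries.
M = {("A", "B"): 10.0, ("C", "B"): 3.0}
X = {("A", "B"): 8.0, ("A", "C"): 5.0}
ri_import = {"A": 0.5, "B": 0.9, "C": 0.2}
ri_export = {"A": 0.6, "B": 0.1, "C": 0.7}
flows = reconcile(M, X, ri_import, ri_export)
```

In this toy case the flow from A to B is a double record, and the importer-reported value is retained because B's import reliability exceeds A's export reliability.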

Unit water footprint
The unit water footprint measures the amount of water required to produce a unit amount of product and it can be expressed as m³/ton or, equivalently, as l/kg. The present work considers the sum of green water (originated from rainfall) and blue water (originated from surface- and ground-water bodies). Depending on the type of commodity, different approaches are applied for the computation of the unit water footprint. In the present work we propose a differentiation between the uWF of production (uWFp) and the uWF of supply (uWFs) of primary crops. The uWFp refers to locally-produced crops whose water footprint depends on the crop actual evapotranspiration and crop yield, with annual estimates starting from 1961. The uWFp is a suitable indicator to assess the WF of agricultural production. The uWFs, instead, refers to the domestic supply of primary crops, which relies both on local production and on international trade. Domestic supply is used as human consumption, food manufacturing, feed for livestock and as export towards other countries. The impossibility to track local production and imports into consumption and exports, within each country, makes the uWFs the best indicator to be used in conjunction with consumption and export data. The uWFs of primary crops is computed averaging local production and imports, after identifying the countries of origin of the goods with an appropriate procedure applicable from year 1986.
For primary crops, it has been possible to estimate both the uWFp and the uWFs. Processed crops are produced from a root product which may or may not originate from local production. The absence of systematic FAO data about the production of processed crops prevents the differentiation between the unit water footprint of production and of supply. Therefore, processed crops considered in this study have a single unit water footprint, depending on country and year, computed applying conversion factors to the uWFs of the root product. Finally, animal-based products are here considered only with the WaterStat values, without temporal variability.

Unit water footprint of locally-produced primary crops in time
When considering the production of primary crops, the unit water footprint of production, uWFp, is a function of the actual evapotranspiration along the growing period of the crop and of the actual crop yield. Due to precipitation, evapotranspiration, and yield fluctuations, the uWFp exhibits significant spatio-temporal variability.
We computed the uWFp in a given year by means of the Fast-Track (FT) method, introduced and substantiated by Tuninetti et al. (2017). This method is based on the use of the WaterStat dataset (Mekonnen & Hoekstra, 2010a, 2010b) for expressing the spatial variations of evapotranspiration and assumes that the temporal variability of the unit water footprint, not detailed in WaterStat, is mainly expressed by a ratio of agricultural yields.
According to the Fast-Track method, the unit water footprint of an agricultural product $p$ produced in country $c$ in year $t$, i.e. $uWFp_{c,p,t}$, reads:
$$ uWFp_{c,p,t} = \overline{uWF}_{c,p,T} \cdot \frac{\overline{Y}_{c,p,T}}{Y_{c,p,t}} \qquad (4) $$
where $\overline{uWF}_{c,p,T}$ is the reference unit water footprint provided by WaterStat (Mekonnen & Hoekstra, 2010a, 2010b) corresponding to an average in the period $T$ = 1996-2005, $\overline{Y}_{c,p,T}$ is the average crop yield over the same period $T$, and $Y_{c,p,t}$ is the annual crop yield in a generic year $t$ in the range 1961-2016. The average crop yield is obtained as an average of the annual yields in the years 1996-2005, weighted by the harvested areas across the years in country $c$.
Equation (4) is best referred to the total (green plus blue) unit water footprint, but it may also be applied separately to the two components of green and blue water.
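A minimal numerical sketch of the Fast-Track scaling is given below. It assumes that Eq. (4) multiplies the reference unit water footprint by the ratio of the reference-period yield to the annual yield, as implied by keeping evapotranspiration constant; the function names and all numbers are hypothetical.

```python
def area_weighted_mean_yield(yields, areas):
    """Average yield over the reference period, weighted by harvested area."""
    total_area = sum(areas)
    return sum(y * a for y, a in zip(yields, areas)) / total_area


def fast_track_uwf(uwf_ref, yield_ref, yield_t):
    """Scale the reference uWF by the ratio of reference to annual yield."""
    return uwf_ref * yield_ref / yield_t


# Hypothetical inputs: a two-year reference period and one target year.
ref_yields = [2.0, 3.0]      # ton/ha in the reference years
ref_areas = [100.0, 300.0]   # harvested hectares in the same years
uwf_ref = 1800.0             # reference uWF, m3/ton

yield_ref = area_weighted_mean_yield(ref_yields, ref_areas)  # 2.75 ton/ha
uwf_t = fast_track_uwf(uwf_ref, yield_ref, yield_t=5.5)      # yield doubled
```

Because the annual yield in the example is twice the reference yield, the resulting unit water footprint is half the reference value.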
The Fast-Track method implicitly keeps the actual evapotranspiration of crops constant, equal to the average in the period $T$, i.e., to the long-term average used in the WaterStat statistics. This hypothesis is reasonable for two reasons. On the one hand, yield implicitly expresses many factors, including climatic conditions, water availability, soil fertility and agricultural practices among others, and yield temporal variations dominate over the variability of the water volumes used (evapotranspired) by crops. On the other hand, the uWF is less sensitive to hydro-climatic conditions than actual evapotranspiration, because it is defined as the ratio between evapotranspiration and yield, both reacting with equal signs to hydro-climatic fluctuations (see, e.g., Doorenbos et al., 1979). In fact, when the Fast-Track estimates of unit water footprint were compared to the results of a complete model based on a daily soil water balance fed by year-specific hydro-climatic variables, the errors were within a 10% range (see Tuninetti et al., 2017). This uncertainty, around ±10% of the uWFp estimates for wheat, maize, rice and soybeans, is lower than the model uncertainty in the water footprint assessment (see Tuninetti et al., 2017). The Fast-Track method, initially applied to 4 crops (wheat, maize, rice and soybeans), has been here extended to a large set of primary products, including cereals, fruits, vegetables, seeds, luxury food and non-edibles. The extension is justified by the fact that similar error ranges are expected in all crops, because water stress affects the evapotranspiration of different crops in a similar way, the only difference being the phases of the growing periods affected by water stress and the crop coefficients describing the water requirements along the growing periods.
Water stress is assumed not to affect irrigated crops, implying that actual evapotranspiration matches the crop maximum evapotranspiration in irrigated conditions. The uncertainty associated to the Fast-Track method has been spot-checked on some other crops and the range of errors found in Tuninetti et al. (2017) has been confirmed.

Primary-equivalent trade matrix
For the correct identification of countries of origin of the crops traded internationally, the reconstruction of a primary-equivalent trade matrix, $T_{pe}$, is necessary (Kastner et al., 2011). This is defined as
$$ T_{pe} = T_{rp} + \sum_{dp} \frac{vf_{dp}}{pf_{dp}} \, T_{dp} \qquad (5) $$
where $T_{rp}$ is the trade matrix of any root product, $T_{dp}$ is the trade matrix of the derived products (dp) and $pf_{dp}$, $vf_{dp}$ are the product fraction and value fraction which convert the derived products into a root-product equivalent quantity. The summation is extended to all derived products which originate from the same root product and, in the case of a multi-step supply chain, Eq. (5) is applied iteratively until reaching a root product that is also a primary crop. The product fraction, $pf$, is defined as the weight of a derived product obtained from a ton of input product. For example, a ton of nuts with shells leads to $pf$ (<1) tons of shelled nuts. The value fraction, $vf$, is the market value of the derived product divided by the aggregated market value of all derived products resulting from a ton of input product. For example, in a production process of wheat flour there are other economically valuable by-products (e.g., wheat germs to feed animals); hence, the value of wheat flour constitutes only a portion (i.e., the value fraction) of the total value generated by the process. Product fractions and value fractions used in the CWASI database are time- and space-invariant and are taken from Mekonnen & Hoekstra (2010a, 2010b), as well as the root products and the full supply chains of the considered commodities.
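The construction of a primary-equivalent trade matrix can be sketched for a single processing step as below. The conversion factor applied to each derived product (value fraction divided by product fraction) is an assumption made here for consistency with the standard accounting of derived products in Hoekstra et al. (2011); the function name and all numbers are hypothetical.

```python
def primary_equivalent_matrix(t_root, derived):
    """Add derived-product trade, converted to root-product equivalents.

    t_root  -- trade matrix of the root product (list of lists, tons)
    derived -- list of (trade matrix, product fraction, value fraction)
    """
    n = len(t_root)
    t_pe = [row[:] for row in t_root]  # copy, so the input is left untouched
    for t_dp, pf, vf in derived:
        factor = vf / pf  # assumed conversion to root-product equivalent
        for i in range(n):
            for j in range(n):
                t_pe[i][j] += factor * t_dp[i][j]
    return t_pe


# Hypothetical two-country example: wheat as root product, flour as derived product.
t_wheat = [[0.0, 10.0], [2.0, 0.0]]
t_flour = [[0.0, 4.0], [0.0, 0.0]]
t_pe = primary_equivalent_matrix(t_wheat, [(t_flour, 0.8, 1.0)])
```

For a multi-step supply chain, the same function would be applied repeatedly, starting from the products farthest from the primary crop.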

Supply-side unit water footprint of primary crops
The country supply of a primary crop results from the sum of local production and imports, where imports may come from producing or non-producing countries, the latter case indicating a re-export of goods produced elsewhere. Therefore, the unit water footprint of supply, uWFs, is proportionally contributed by local production and by trade, specifying the relative contribution of every country from which the goods originated, considering re-exports and the processing of goods, if necessary. For each primary-equivalent crop and each year, we can define a column vector, $S$, containing the supply of all countries as rows. This vector is calculated as the sum of the production vector, $P$, and of the imports obtained from the bilateral trade matrix $T_{pe}$, where $T_{pe}(i, j)$ identifies the trade flow from $i$ to $j$, as
$$ S = P + T_{pe}' \, I $$
where $I$ is a column vector of ones (i.e., a summation vector) and $T_{pe}'$ is the trade matrix transposed. Hence, the uWFs of a country depends both on the domestic uWFp (through $P$) and on the uWFp of the origin countries, where the product is produced.
In order to trace the actual origin of the country's supply, namely tracing it back to the country where it was produced, we adopt the approach proposed by Kastner et al. (2011). First, we define a matrix $R$, where each element $R(i, j)$ is the quantity of supply in country $i$ that is produced in country $j$. A first approximation of $R$ can be based on reported flows only, being equal to the sum of a diagonal matrix with the elements of the production vector $P$ on the diagonal and of the transposed primary-equivalent trade matrix, $T_{pe}'$, i.e.
$$ R \approx \mathrm{diag}(P) + T_{pe}' $$
However, this approximation misses the fact that exporting countries may obtain the exported products not only from local production, but also from import. To account for this fact, a matrix of export shares, $A_X$, can be defined as
$$ A_X = T_{pe}' \, \mathrm{diag}(S^{-1}) $$
where $T_{pe}'$ is the transposed primary-equivalent trade matrix and $A_X(i, j)$ is the share of country $j$'s supply that is exported to country $i$. The term $\mathrm{diag}(S^{-1})$ denotes a diagonal matrix made up by the reciprocal elements of the supply vector $S$. In turn, the imported and re-exported products may partly originate from local production and import, and so on, recursively. It has been shown by Miller and Blair (2009) that such procedure converges to
$$ R = (I - A_X)^{-1} \, \mathrm{diag}(P) $$
where $I$ is the identity matrix and $P$ is the production vector. For further details and exemplification, see Kastner et al. (2011).
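The recursive tracing of origins can be illustrated with a small sketch. It assumes that supply equals production plus imports, that the export shares divide each flow by the exporter's supply, and it replaces the closed-form matrix inverse with a fixed-point iteration that has the same limit; the function name and the three-country data are hypothetical.

```python
def trace_origins(production, trade, tol=1e-12, max_iter=1000):
    """Return (supply, R), where R[i][j] is the supply of i produced in j.

    production -- list of produced quantities, one per country
    trade      -- matrix with trade[i][j] = flow from country i to country j
    """
    n = len(production)
    # Supply = local production + imports (column sums of the trade matrix).
    supply = [production[j] + sum(trade[i][j] for i in range(n)) for j in range(n)]
    # Export shares: ax[i][j] is the share of country j's supply exported to i.
    ax = [[(trade[j][i] / supply[j]) if supply[j] > 0 else 0.0 for j in range(n)]
          for i in range(n)]
    diag_p = [[production[i] if i == j else 0.0 for j in range(n)] for i in range(n)]
    # Fixed-point iteration R <- diag(P) + AX R, whose limit is (I - AX)^-1 diag(P).
    r = [row[:] for row in diag_p]
    for _ in range(max_iter):
        new_r = [[diag_p[i][j] + sum(ax[i][k] * r[k][j] for k in range(n))
                  for j in range(n)] for i in range(n)]
        change = max(abs(new_r[i][j] - r[i][j]) for i in range(n) for j in range(n))
        r = new_r
        if change < tol:
            break
    return supply, r


# Hypothetical chain: A produces and exports to B, B re-exports to C.
production = [100.0, 0.0, 50.0]
trade = [[0.0, 20.0, 0.0],
         [0.0, 0.0, 10.0],
         [0.0, 0.0, 0.0]]
supply, R = trace_origins(production, trade)
```

In this example country C imports only from B, which produces nothing, so the imported quantity is traced back to A, and each row of `R` sums to the supply of the corresponding country.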
The $R$ matrix identifies where the supply of each country originates from. Therefore, knowing the uWFp of the primary crop in such countries, we can now define the unit water footprint of supply in country $c$ and year $t$ of the primary product $p$, i.e.
$$ uWFs_{c,p,t} = \frac{\sum_{j} R(c, j) \, uWFp_{j,p,t}}{\sum_{j} R(c, j)} \qquad (9) $$
The evaluation of uWFs corresponds to a weighted average of the uWFp values, where the weights are the actual fractions of supply, $S$, traced back to their origins. Eq. (9) is valid for every primary crop $p$ and year $t$, considering that trade matrices, production vectors and uWFp values change from year to year. It is worth noticing that, because the trade matrices are available from 1986 only, the uWFs can be built from that year only.
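The weighted average can be sketched for one country as follows, taking as input the row of the origin matrix for that country and the production-side values of all origin countries. The function name and the numbers are hypothetical.

```python
def supply_side_uwf(r_row, uwfp):
    """Weighted mean of origin uWFp values for one country (one row of R).

    r_row -- quantities of the country's supply traced back to each origin
    uwfp  -- production-side uWF of each origin country, in the same order
    """
    total = sum(r_row)
    if total == 0:
        return None  # no supply, so no supply-side footprint is defined
    return sum(q * w for q, w in zip(r_row, uwfp)) / total


# Hypothetical country whose supply comes from origin 0 (10 tons) and from
# local production (50 tons); origin 1 contributes nothing, so its uWFp
# value (a placeholder here) has no influence on the result.
uwfs = supply_side_uwf([10.0, 0.0, 50.0], [1200.0, 0.0, 600.0])
```

The result lies between the two contributing uWFp values and is closer to the local one, which supplies most of the quantity.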

Unit water footprint of processed crops in time
Processed crops are based on the processing of root products, which are available as country's supply. The time-varying unit water footprint thus depends on that of the root product and on the conversion factors, i.e. (Hoekstra et al., 2011),

$$uWF_{c,t,d} = uWFs_{c,t,r} \cdot \frac{vf_d}{pf_d}, \qquad (10)$$

where $uWF_{c,t,d}$ is the unit water footprint of the processed crop (or derived product, d), $uWFs_{c,t,r}$ is the unit water footprint of supply of the root product, r, from which it derives, c and t are the country and year considered, respectively, $pf_d$ is the product fraction and $vf_d$ the value fraction of the processed crop. The method takes into account the temporal variability associated with both the crop production, through the Fast-Track method applied to the primary crop, and the evolution of trade, through Kastner's method applied to the crop supply; the method does not include water inputs for the processing of goods. When supply chains are formed by multiple steps, for example in the case of bread, made with flour, made in turn with wheat, Eq. (10) is applied repeatedly at each step. Within the CWASI dataset, the longest supply chain is made of 4 steps, leading to final products such as refined sugar or chocolate.
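The step-by-step conversion from a root product to a derived product can be sketched in Python as follows. The function names and the product and value fractions used here are purely hypothetical placeholders chosen for illustration; they are not the conversion factors used in the database.

```python
def derived_uwf(root_uwf, product_fraction, value_fraction):
    """uWF of a derived product from that of its root product (no processing water)."""
    return root_uwf * value_fraction / product_fraction


def chain_uwf(primary_uwfs, steps):
    """Apply the conversion once per processing step; steps is a list of (pf, vf) pairs."""
    uwf = primary_uwfs
    for product_fraction, value_fraction in steps:
        uwf = derived_uwf(uwf, product_fraction, value_fraction)
    return uwf


# Hypothetical supply-side uWF of wheat in one country and year (m3/ton)
wheat_uwfs = 1500.0
# Hypothetical (product fraction, value fraction) for wheat -> flour -> bread
steps = [(0.8, 0.9), (1.15, 1.0)]

flour_uwf = chain_uwf(wheat_uwfs, steps[:1])
bread_uwf = chain_uwf(wheat_uwfs, steps)
print(f"flour: {flour_uwf:.1f} m3/ton, bread: {bread_uwf:.1f} m3/ton")
```

Because the root value is a supply-side footprint that changes every year, the derived values inherit its temporal variability even though the fractions themselves are kept fixed in this sketch.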
Equation (10) describes the unit water footprint without differentiating between production and supply. This is because the absence of FAO production data for most processed crops hinders the application of Kastner's method (Sect. 3.3), and thus an explicit accounting of the countries of origin of trade, as in the case of primary crops. However, the trade of processed crops is implicitly taken into account in the procedure, thanks to the use of the primary-equivalent trade matrix (Eq. (5)), which serves to compute the uWFs of the primary crop.
For the very few derived products without indication of the root product (Mekonnen & Hoekstra, 2010a), an association is made based on logical considerations (such as "Figs dried" deriving from "Figs") or on similarities between products. The "Sugar" products (Raw Sugar, Refined Sugar, …) were also missing the root product, in that it can be either "Sugar Beet" or "Sugar Cane". For these products, we have traced the root product back to the one most largely available as country supply (either "Sugar Beet" or "Sugar Cane").

Unit water footprint of animal-based commodities
Animal-based commodities are available from FAOSTAT grouped in three categories: "Live animals", "Livestock primary", and "Livestock processed". Products of the first category are given in number of heads, which have been converted into tons according to FAO conversion factors (FAO, 2013). The missing conversion values for some countries (or animal products) have been assigned an average value by category or the value of similar animals. Due to the lack of reliable data about country-specific animal diets and their temporal variability, as well as the lack of detailed trade matrices of feed crops, we do not currently provide a time-dependent unit water footprint for animal-based commodities.
Nevertheless, we include these products in the present database, adopting the country-specific values provided by WaterStat (Mekonnen & Hoekstra, 2010b) and without differentiating between production and supply (i.e., uWFp = uWFs). These values take into account the feed-animal-commodity global supply chain, considering locally-produced and imported feed (Mekonnen & Hoekstra, 2012).

The water footprint of agricultural production in a country and year is obtained by multiplying the production data, expressed in metric tons, by the corresponding (commodity, country, year) unit water footprint, considering the unit water footprint of production, uWFp, in the case of primary crops. A problem arises when a country was not a producer in the 1996-2005 decade and thus does not have an associated value in the WaterStat database. In such cases, the uWF of the closest producing country within a certain distance (10°) is taken; if no producing countries are found (e.g., in the case of remote islands, adverse climatic conditions, or small producing areas), then the global average weighted by production is used for that country. In the case of countries having experienced political discontinuity, for example belonging to a larger country before the years 1996-2005 considered in the WaterStat database (e.g., USSR), the reference value of uWF required in Eq. (4) is computed as a production-weighted average of the values of the countries belonging to the union and available in WaterStat. This average value is then used to reconstruct the annual uWF from 1961 up to the year of the disaggregation.

After converting the agricultural production into water volumes, the overall water footprint of production is obtained by summing across all commodities. Care must be taken to avoid double-accounting of the water footprints of primary and derived goods. For this reason, only primary products must be considered in aggregated production data. In particular, when dealing with animal-based commodities, one should avoid including both livestock products and the crops used to feed the livestock. Primary, or single-accounting, products to be included in the sum are indicated in the Supplementary Material, Table SI.1.
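The gap-filling rule for countries without a reference uWF can be sketched in Python as below. This is only an illustration under stated assumptions: countries are represented by hypothetical centroid coordinates, and the distance is taken as a plain Euclidean distance in degrees, since the text does not specify how the 10° threshold is measured. The function name `fill_missing_uwf` and all data are invented for the example.

```python
import math


def fill_missing_uwf(target, centroids, uwf, production, max_deg=10.0):
    """Own uWF if available, else nearest producer within max_deg, else weighted global mean."""
    if target in uwf:
        return uwf[target]
    lat0, lon0 = centroids[target]
    best, best_dist = None, max_deg
    for country in uwf:
        lat, lon = centroids[country]
        dlon = abs(lon - lon0)
        dlon = min(dlon, 360.0 - dlon)  # shortest way around the globe
        dist = math.hypot(lat - lat0, dlon)  # assumption: distance in plain degrees
        if dist <= best_dist:
            best, best_dist = country, dist
    if best is not None:
        return uwf[best]
    # Fallback: global average weighted by production
    total = sum(production[c] for c in uwf)
    return sum(uwf[c] * production[c] for c in uwf) / total


# Hypothetical centroids (lat, lon), reference uWF (m3/ton) and production (tons)
centroids = {"A": (45.0, 10.0), "B": (48.0, 14.0), "C": (-20.0, 130.0), "D": (0.0, -150.0)}
uwf = {"A": 1200.0, "C": 2000.0}
production = {"A": 300.0, "C": 100.0}

b_value = fill_missing_uwf("B", centroids, uwf, production)  # A is 5 degrees away
d_value = fill_missing_uwf("D", centroids, uwf, production)  # no producer within 10 degrees
print(b_value, d_value)
```

Country B inherits the value of its neighbour A, while the remote country D falls back to the production-weighted global average of the available values.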
Computation of the supply-side unit water footprint of goods enables the fast computation of the water footprint associated with the consumption of commodities, under the hypothesis that consumption (and export) share with the country's supply the same mix of local and imported goods. The water footprint of consumption in a country and year can then be obtained by multiplying the consumed quantity of each good by the unit water footprint of supply, uWFs (per commodity, country and year), then summing across all commodities. In this case, there are no double-accounting issues.
The virtual water trade is obtained by multiplying trade data, expressed in metric tons, by the unit water footprint of supply, uWFs, of the exporting country. Thanks to the new definition of the supply-side unit water footprint, this computation takes into account the origin of goods, which are traced back to their places of origin across the supply chain, without being affected by re-export. In the few cases of goods exported from countries not having an associated uWF (less than 1% of trade links over the whole period, mainly from minor countries or remote islands), the global average uWF of supply is used, weighted by all countries' exports.
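A minimal Python sketch of the virtual water trade computation for one commodity and year is given below. The function name, the matrix layout (`T[i][j]` as the flow from country i to country j) and the numbers are assumptions made for illustration; exporters without a uWFs are given the export-weighted average of the available values, following the rule stated above.

```python
def virtual_water_trade(T, uwfs):
    """VWT[i][j] = T[i][j] * uWFs of exporter i (m3); missing uWFs use a weighted mean."""
    n = len(T)
    exports = [sum(row) for row in T]
    known = [i for i in range(n) if uwfs[i] is not None and exports[i] > 0]
    total_exports = sum(exports[i] for i in known)
    # Export-weighted global average, used only for exporters with no uWFs
    fallback = (sum(uwfs[i] * exports[i] for i in known) / total_exports
                if total_exports > 0 else 0.0)
    vwt = []
    for i in range(n):
        u = uwfs[i] if uwfs[i] is not None else fallback
        vwt.append([T[i][j] * u for j in range(n)])
    return vwt


# Hypothetical trade matrix (tons) and supply-side uWF (m3/ton); country 2 has no value
T = [[0.0, 10.0, 0.0],
     [30.0, 0.0, 0.0],
     [0.0, 5.0, 0.0]]
uwfs = [1000.0, 2000.0, None]

vwt = virtual_water_trade(T, uwfs)
for i, row in enumerate(vwt):
    print(f"virtual water exports of country {i}: {row}")
```

Summing the resulting matrix by row gives the virtual water exports of each country and summing by column gives its virtual water imports.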

The uWF index
In the Results section, the (volumetric) water footprint and the virtual water trade are summed across different commodities and the overall trends are assessed in time. However, the unit water footprint of different commodities cannot be assessed as a whole, but only for one commodity at a time. To overcome this problem, an appropriate index is constructed in analogy to some economic indices aggregating the prices of different commodities, such as the Agriculture Producer-Price Index (in FAOSTAT), calculated with the Laspeyres approach. The index is built as the inverse ratio between the WF of production (in m³) of all items (i) in all countries (c) in year 2000 and the WF obtained with the same quantities (year 2000) but with the uWF in year t, i.e.

$$I(t) = \frac{\sum_{i,c} uWF(i,c,t) \cdot P(i,c,2000)}{\sum_{i,c} uWF(i,c,2000) \cdot P(i,c,2000)} \cdot 100. \qquad (11)$$

In this way, $I(t)$ expresses the variation of uWF across all agricultural commodities, weighted by the productions in year 2000, $P(i, c, 2000)$. An index similar to that in Eq. (11) can also be built for trade using, e.g., the exports of each country in year 2000 as weights, thus leading to a uWF index for trade. In addition, indexes (for production or trade) referring to single categories of goods can be built by aggregating only the goods belonging to a given category.
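The Laspeyres-type uWF index can be sketched in Python as follows. The function name `uwf_index`, the dictionary layout keyed by (item, country) and the numbers are hypothetical and serve only to illustrate the fixed-weight aggregation with base year 2000.

```python
def uwf_index(uwf_t, uwf_base, production_base):
    """Fixed-weight index: WF with year-t uWF over WF with base-year uWF, times 100."""
    numerator = sum(uwf_t[key] * production_base[key] for key in production_base)
    denominator = sum(uwf_base[key] * production_base[key] for key in production_base)
    return 100.0 * numerator / denominator


# Hypothetical production in year 2000 (tons), keyed by (item, country)
production_2000 = {("wheat", "A"): 100.0, ("maize", "A"): 50.0}
# Hypothetical uWF (m3/ton) in the base year and in a later year t
uwf_2000 = {("wheat", "A"): 1500.0, ("maize", "A"): 1000.0}
uwf_later = {("wheat", "A"): 1200.0, ("maize", "A"): 900.0}

index_base = uwf_index(uwf_2000, uwf_2000, production_2000)
index_later = uwf_index(uwf_later, uwf_2000, production_2000)
print(f"I(2000) = {index_base:.1f}, I(t) = {index_later:.1f}")
```

By construction the index equals 100 in the base year, and values below 100 indicate that the same basket of goods would require less water with the uWF of year t. Replacing the production weights with year-2000 exports would give the corresponding index for trade.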

Results
The importance of considering a time-dependent unit water footprint is highlighted in Figure 3.

The time-varying uWF in the CWASI database is used to assess the temporal evolution of virtual water trade across the years, considering the contribution of different categories of goods. Figure 7 updates previous versions published in the literature (e.g., Konar et al., 2011; Carr et al., 2013; Tuninetti et al., 2017) by introducing the temporal variability of the uWF of crop-based goods, expanding the number of considered crops and/or extending the temporal range considered. Total VWT has increased from 750 to 2400 km³/y in the considered period (about 1000 km³/y in 1986).
Major categories are cereals, luxury food, seeds/oils and vegetables, with the relative contribution of cereals, which was very large in the 1960s, being overtaken by the other categories in the most recent years. VW volumes associated with cereals have doubled in the considered period, while volumes associated with vegetables have grown 9-fold. The growth of animal-based products is remarkable, but it should be specified that it only reflects the increased trade quantity, without considering the temporal variability of uWF.

Conclusions
The globalization of water resources through the international trade of food and agricultural goods is a remarkable global environmental change of our times, and the scientific community is devoting great effort to studying it. The quantification of the volumes of water involved in the production and trade of agricultural goods is a key tool to investigate the water-food-trade nexus issues. This study presents an open-source database specifically developed for this purpose. The main outcome of this study is the time-varying unit water footprint in the years 1961-2016 and the virtual water trade matrices for the years 1986-2016 of hundreds of commodities from the food and agricultural sector.
The water footprints of production per commodity are also available annually in the period 1961-2016. The current database includes a total of 26.8 million data values, half of them being elements of the trade matrices. Figure 8 shows the number of shared data per variable and per year.
The introduction of a supply-side estimate of the unit water footprint brings much more detail to water footprint accounting. This is a new concept and a key tool for the expeditious accounting of the virtual water trade and of the water footprint of consumption. It also overcomes previous problems related to the neglect of re-export and enables a more accurate assessment of virtual water trade, with the correct identification of the countries of origin of traded goods.
The time-varying unit water footprints of crops and crop-based commodities are estimated with a simplified but robust method, affected by a known range of uncertainty. Single-year data should be used with care and be put in a temporal perspective or in a multi-year average, in order to minimize the error and to avoid misinterpretations of year-specific results. At present, the uWF of animal-based commodities is kept constant in time. This limitation can be overcome when reliable data on the country-specific feed composition and diet of each animal type become available for the considered time period. Despite this shortcoming, the open-source database presented in this work aims to help the scientific community and policy makers to quantify and investigate the complex linkages between the global food system and water resource issues. Potential applications of the CWASI dataset range from supporting national-scale water management policies and agricultural policies oriented to the optimization of water use to, ultimately, providing indications for price formation or for trade agreements based on the efficient and sustainable use of water resources worldwide. The CWASI database is shared through the Zenodo online open-access repository (Tamea et al., 2020) and is planned to be improved and updated in the future, capitalizing on contributions from the overall scientific community.

(2017) proposed a Fast-Track methodology to adapt crop-based uWF data from WaterStat through agricultural yield data.

Figure 2 with the darker (lighter) line corresponding to the newest (oldest) values. Countries more involved in trade and reporting more information (to the left) are characterized, on average, by a larger reliability, while countries less involved in trade have lower average reliability, which used to be very low in the past. Current reliability values, instead, are more uniform across countries.

but are only available as time-averaged values over the period 1996-2005. Here data are generically referred to year 2000 and are arranged consistently with the rest of the CWASI database.

4 Virtual water trade and water footprint indicators

4.1 Water footprint and VWT data

Figure 3 shows the temporal trends of the global average uWF of production of some commodities. The global average is computed by weighting the uWFp of each country by the country's production of the crop. The relevance of the temporal change is evident, ranging from 4000 to 1500 m³/ton over the whole considered period. The uWF of production of other major crops is shown in Tuninetti et al. (2017). The values considered in WaterStat refer to the period 1996-2005, highlighted by a grey shade in Figure 3. It is clear that the average value in this reference period is scarcely representative of the whole period considered in the present dataset. It is thus very important to consider the temporal variability of the unit water footprint, especially in analyses spanning long periods or periods other than 1996-2005.

The temporal variation of the uWF of production of crops is marked all over the world. Compared to the values averaged over the period 1996-2005 (as in the WaterStat database), the uWFp values computed with the Fast-Track method at the beginning and at the end of the considered period are very different. Figure 4 shows the relative change of the uWFp of wheat in 1961 and 2016 with respect to the 1996-2005 average. The variation is quite uniform worldwide with countries producing cereals with lower efficiency than major exporters (e.g., the USA).

Seeds/oils and luxury food had intermediate periods where trade was on average more efficient than global production, while non-edible goods are currently traded with low efficiency, i.e. the trade-weighted uWF index is larger than the production-weighted one.

Figure 1: Commodities considered in the analysis, split into 9 categories: number of items in the trade and production datasets. Icons from Flaticon.com.

Figure 7: Global virtual water trade (as derived from export data) from 1961 to 2016 considering the 9 categories of goods.

Figure 8: Number of data in the CWASI database, per year and per variable (uWF: unit water footprint, WFP: water footprint of production, VWE: virtual water export, VWT: virtual water trade, n: number of commodities).