the Creative Commons Attribution 4.0 License.
The first rainfall erosivity database in Mexico: facing challenges of leveraging legacy climate data
Abstract. Soil water erosion (SWE) is the dominant soil degradation driver on a global scale. For quantifying SWE, erosivity is an index that reflects the potential (i.e., the energy) of rainfall to cause SWE. To support large-scale SWE studies and the assessment of the SWE process at the national scale in Mexico, the objectives of this research are a) to develop the first Mexican rainfall time series database for three climate normals (CNs; 1968–1997, 1978–2007, and 1988–2017) leveraging legacy climate data, and b) to estimate rainfall erosivity across continental Mexico by using daily rainfall time series. The workflow has three methodological stages: 1) development of the daily rainfall time series database, 2) identification of the best empirical relationship to estimate daily rainfall erosivity, and 3) estimation of the rainfall erosivity across Mexican territory. We compiled and harmonized 5410 rainfall time series (RTS) well distributed across the Mexican territory. We performed quality control and assurance, homogeneity analysis (using the standard normal homogeneity test), and data gap-filling (using the proportion method). Then, we tested three combinations of the α and β coefficients, proposed by three authors, in a power model to estimate rainfall erosivity; in this step, we used three validation databases (global, national, and local scales). Finally, we estimated the annual rainfall erosivity for all three CNs with multiple combinations of α and β coefficients. As principal results, the new database includes 1370, 1678, and 1676 RTS for each CN and its corresponding rainfall erosivity. The best parameter combination is the one proposed by Richardson et al. (1983) for all three validation databases. For the global and national databases, we observe a positive bias (mean error of 956 and 324 MJ mm ha⁻¹ h⁻¹ yr⁻¹, respectively); in contrast, for the local database, results show a negative and larger bias (mean error of -3699 MJ mm ha⁻¹ h⁻¹ yr⁻¹).
Regarding erosivity estimation across the Mexican territory, the median rainfall erosivity values for the three CNs were 3245, 3070, and 3327 MJ mm ha⁻¹ h⁻¹ yr⁻¹, respectively. The statistical distribution of the erosivity values was right-skewed for the three CNs, with high erosivity values reaching >12000 MJ mm ha⁻¹ h⁻¹ yr⁻¹ in all three CNs. The intra-annual pattern of rainfall erosivity was similar for the three CNs, with September making the highest contribution. The new database provides daily climatological data and analysis across Mexican territory over a multi-year period (1968 to 2017). Rainfall erosivity results support the study of SWE at the national scale by identifying areas with higher susceptibility to soil loss due to rainfall action and by providing a more spatially dense and well-documented rainfall erosivity database. Following the FAIR principles (Findability, Accessibility, Interoperability, and Reusability) for scientific data, this database is available from a scholarly accepted repository https://doi.org/10.6073/pasta/e0dc8bd3501f8c19bb750e853c3289cb (Varón-Ramírez et al., 2025) for public consultation.
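The power model referred to in the abstract can be illustrated with a minimal sketch. It assumes the common form in which daily erosivity equals α·P^β for daily rainfall P; the function names, the optional erosive-rainfall threshold, and the coefficient values in the usage example are placeholders chosen for illustration, not the coefficients or code used by the authors.

```python
# Sketch only: alpha, beta and threshold_mm are hypothetical inputs,
# not the values adopted in the manuscript.

def daily_erosivity(rain_mm, alpha, beta, threshold_mm=0.0):
    """Daily erosivity from the power model alpha * P**beta.

    Days with rainfall at or below threshold_mm are treated as
    non-erosive (the threshold itself is an assumption of this sketch).
    """
    if rain_mm <= threshold_mm:
        return 0.0
    return alpha * rain_mm ** beta


def mean_annual_erosivity(daily_rain_by_year, alpha, beta, threshold_mm=0.0):
    """Average of the yearly sums of daily erosivity.

    daily_rain_by_year maps a year to its list of daily rainfall (mm).
    """
    yearly_totals = [
        sum(daily_erosivity(p, alpha, beta, threshold_mm) for p in days)
        for days in daily_rain_by_year.values()
    ]
    return sum(yearly_totals) / len(yearly_totals)


if __name__ == "__main__":
    # Toy rainfall values, invented purely to exercise the functions.
    toy_series = {1988: [0.0, 12.5, 40.0], 1989: [5.0, 0.0, 22.0]}
    print(mean_annual_erosivity(toy_series, alpha=0.3, beta=1.8))
```

Because β is greater than 1 in such models, a few heavy-rain days dominate the annual total, which is consistent with the right-skewed distributions reported in the abstract.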
Status: final response (author comments only)
- RC1: 'Comment on essd-2025-306', Anonymous Referee #1, 07 Jan 2026
- RC2: 'Comment on essd-2025-306', Anonymous Referee #2, 06 Apr 2026
1. Significance
This manuscript presents the first national-scale daily rainfall time series and rainfall erosivity database for Mexico, covering three climate normals (1968-1997, 1978-2007, and 1988-2017). Mexico lacks publicly available sub-hourly rainfall data at the national level, which makes this contribution genuinely unique: no comparable effort exists in terms of spatial coverage, daily temporal resolution, rigorous quality control, and length of the study period for this country.
The database has broad potential uses: as input to the USLE/RUSLE model for national-scale soil loss estimates, as a reference for validating global products (GloREDa, GloRESatE), for climate trend studies, hydrological modelling, territorial planning, and soil conservation. The authors explicitly identify these applications, which is appropriate.
Two aspects deserve attention regarding significance:
- The coverage of the Mediterranean California ecoregion is very limited (absent in CN3, very sparse in CN1 and CN2). The authors acknowledge this, but it is not clear to what extent this limits the database's utility for that region.
- The most recent period available ends in 2017. Given that the SMN source data were available up to 2022, the authors should explain in more detail why the climate normals are not extended or why a more recent supplementary period is not included.
2. Data Quality
The methodological workflow is robust: quality control, homogeneity analysis (Alexandersson standard normal homogeneity test), data gap-filling (proportions method using the climatol package), and subsequent validation (McCuen test, 10% threshold). The use of a WMO-recommended framework (WMO 2020, 2023) is a significant strength. RMSE estimates for the gap-filling process are provided by ecoregion and month (Table A4).
However, the following issues must be addressed:
2.1 Uncertainty propagation
The highest RMSE values correspond to the Tropical Rain Forest ecoregion (up to 12.11 mm in October for CN3) and the Great Plains in CN3 (13.17 mm in July). These values are considerable. The authors should discuss their implications for the quality of erosivity estimates in those ecoregions, given that erosivity depends non-linearly on precipitation (power model with β ≈ 1.81). No uncertainty estimates propagated from the gap-filling error to the final R factor are provided. This would be an important addition for end users.
2.2 Numerical inconsistencies
Several numerical inconsistencies in the RTS counts must be resolved:
- The abstract reports 1,678 RTS for CN2, but Table 1 shows a total of 1,679, and Section 3.1 states 1,676. This inconsistency appears in multiple locations and must be corrected and unified throughout the manuscript.
- For CN3, the abstract cites 1,676 RTS, while Table 1 indicates 1,683, and Section 3.1 also states 1,683. The authors must review and reconcile these figures.
- Table A3 shows "RS for data gap-filling process" as 1,479 / 1,776 / 1,723, whereas Table 1 shows 1,479 / 1,774 / 1,721 as "RTS before data gap-filling". This discrepancy of 2 units for CN2 and CN3 is unexplained.
2.3 Data accessibility
The data are deposited in the Environmental Data Initiative (EDI) repository with a permanent DOI (https://doi.org/10.6073/pasta/e0dc8bd3501f8c19bb750e853c3289cb), and the R code is available via Zenodo. This complies with the FAIR principles. It is strongly recommended that the authors:
- Include a detailed README file in the EDI repository describing variables, units, missing value conventions, and known limitations.
- Clarify whether gap-filled values are distinguished from original observations in the released files (e.g., via a flag column indicating imputed vs. observed values). This is essential for users conducting trend analyses.
3. Presentation Quality
The manuscript is well organised and follows a logical structure. The workflow diagram (Figure 2) is helpful for understanding the methodological process. The figures are generally of good quality. Figure 8 (R factor map for CN3) and Figure 9 (comparison with Panagos et al., 2017) are informative. However, the following aspects need attention:
Figures:
- Figure 6 (verification of the three models) has four panels. The caption states that panel (d) corresponds to the Michoacán database. However, the x-axis is labelled "EI30 Factor" when the values shown are actually EI30 estimated from daily-resolution data, not from high-resolution sub-hourly data. This may cause confusion, and the legend/caption should be clarified.
- Figures A3a, A3b, and A3c show the number of locations with erosive rainfall for each day of the year across the three climate normals, but the text only discusses CN3 in detail. A more explicit comparative discussion among the three normals would be beneficial.
Language and abbreviations:
- Lines 99-100: "The country is located between latitudes 14°W and 32°N and longitudes 86°W and 118°W". Latitude values cannot be expressed in °W; this appears to be a typographical error (should read °N).
- The abbreviations "RTS" and "RS" are used interchangeably throughout the manuscript to refer to the same concept (rainfall time series). These must be unified.
References:
The reference list is broad and appropriate. However, Cortés (1991), a master's thesis, serves as the national-scale validation database and plays a central role in the study. Given its importance, the authors should ensure that this source is either fully accessible or that the methods used to obtain the data from it are described in greater detail.
4. Re-usability of the Dataset
Based on the information provided in the manuscript and the data deposited in EDI, a user with basic knowledge of R and climatology would be able to re-use the database. The code available in Zenodo facilitates reproducibility. The database columns are described in Section 7. Nonetheless, the following improvements are recommended to facilitate re-use:
- Include a detailed README file in the EDI repository.
- Clearly flag imputed versus observed values in the gap-filled daily time series files.
Summary of Recommendations
Major (must be addressed before acceptance):
- Resolve all numerical inconsistencies in RTS counts across the abstract, main text, and tables.
- Discuss uncertainty propagation from gap-filling errors to the R factor, especially for ecoregions with high RMSE.
- Correct the typographical error in latitude/longitude notation at lines 99–100.
Minor (recommended):
- Include a detailed README file in the EDI repository.
- Clarify the availability and accessibility of the Cortés (1991) dataset.
- Unify the use of abbreviations (RTS vs. RS) throughout the manuscript.
- Expand the comparative discussion among climate normals in Appendix Figures A3a–c.
- Improve the legend of Figure 6d to avoid ambiguity regarding temporal resolution.
Citation: https://doi.org/10.5194/essd-2025-306-RC2
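The uncertainty-propagation concern raised in point 2.1 of the review can be made concrete with a small sketch. For a power model R = α·P^β, a first-order expansion gives δR/R ≈ β·δP/P, so with β ≈ 1.81 a relative rainfall error is amplified by roughly that factor. The code below shows this rule next to a simple Monte Carlo check. The function names, the Gaussian error model, and the input values are assumptions made for illustration; they are not taken from the manuscript or its code.

```python
import random


def relative_erosivity_error(rain_mm, rain_error_mm, beta):
    """First-order relative error of alpha * P**beta for an error dP in P.

    dR/R ~= beta * dP/P; alpha cancels out of the ratio.
    """
    return beta * rain_error_mm / rain_mm


def monte_carlo_erosivity_spread(rain_mm, rain_rmse_mm, alpha, beta,
                                 n=10000, seed=0):
    """Mean and sample standard deviation of erosivity under noisy rainfall.

    Assumption of this sketch: the gap-filling error is Gaussian with
    standard deviation equal to the RMSE, and rainfall is clipped at zero.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n):
        perturbed = max(0.0, rng.gauss(rain_mm, rain_rmse_mm))
        samples.append(alpha * perturbed ** beta)
    mean = sum(samples) / n
    variance = sum((s - mean) ** 2 for s in samples) / (n - 1)
    return mean, variance ** 0.5


if __name__ == "__main__":
    # Illustrative inputs only: a 50 mm day with a 12 mm rainfall error.
    print(relative_erosivity_error(50.0, 12.0, beta=1.81))
    print(monte_carlo_erosivity_spread(50.0, 12.0, alpha=1.0, beta=1.81))
```

Because the model is convex for β > 1, symmetric rainfall errors also shift the mean erosivity upward, so an end user would need both a spread and a bias estimate for the R factor.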
Data sets
Daily rainfall series and rainfall erosivity in Mexico for three climatic normals (1968-1997, 1978-2007, and 1988-2017) V. M. Varón-Ramírez et al. https://doi.org/10.6073/pasta/e0dc8bd3501f8c19bb750e853c3289cb
Model code and software
Rainfall-Erosivity-Mexico: Rainfall-Erosivity-Mexico V. M. Varón-Ramírez https://doi.org/10.5281/zenodo.15468097
Viewed
- HTML: 1,374
- PDF: 235
- XML: 55
- Total: 1,664
- BibTeX: 60
- EndNote: 93
I would first like to thank the authors for the effort invested in compiling and ‘harmonizing’ multiple historical data sources, which are not always easily accessible, particularly for researchers who do not work in Mexico. I also appreciate that the manuscript focuses not so much on a purely historical climatic analysis, but rather on how the R factor (rainfall erosivity) is estimated; its utility, together with the potential future users of this dataset, is a key aspect of the work.
The manuscript “The first rainfall erosivity database in Mexico: facing challenges of leveraging legacy climate data” by Viviana Marcela Varón-Ramírez and colleagues provides a detailed dataset of historical precipitation time series for Mexico, applied to the estimation of rainfall erosivity. The dataset itself, as well as the calibration using several available empirical models, is interesting and represents a solid starting point. However, the applicability of the dataset would benefit from being presented more clearly, particularly in terms of its potential users and intended applications.
I do not have major concerns regarding the core content of the manuscript, and I have provided specific comments throughout the text that I hope the authors will find helpful and intuitive to address. My main concern—which may require more substantial work—does not relate to the dataset itself or its calibration, but rather to the discussion section. In its current form, the discussion is unsatisfactory and does not allow the reader to properly assess the potential usefulness or relevance of the dataset.
The discussion needs to be completely restructured in a more organized and focused manner, selecting and developing the strongest points of the article (some suggestions are provided in the annotated manuscript). In this sense, I expect that this manuscript will be accepted subject to revisions, which fall between minor and major. If the authors address the chaotic way in which they currently present their results, I believe this paper can make it through and be a valuable asset for people in need of R-factor data, maps, etc.
If possible, it would be highly valuable for the authors to incorporate the suggestions provided in the manuscript so that the significant effort invested in compiling historical data for Mexico can be communicated more clearly and effectively. Revising the discussion may also require supporting it with a broader range of references than are currently included, depending on the final focus the authors choose to adopt.
Specific comments:
In my opinion, the authors can follow my comments and suggestions more easily by checking their original manuscript against the annotated comments. I hope the editor finds this suitable given the ‘dataset paper’ structure of this manuscript.
Typing errors are listed within the original manuscript.