the Creative Commons Attribution 4.0 License.
CAMELS-NZ: Hydrometeorological time series and landscape attributes for Aotearoa New Zealand
Abstract. We present the first large-sample catchment hydrology dataset for Aotearoa New Zealand with hourly time series: the Catchment Attributes and Meteorology for Large-Sample Studies – Aotearoa New Zealand (CAMELS-NZ). This dataset provides hourly hydrometeorological time series and comprehensive landscape attributes for 369 catchments across Aotearoa New Zealand, ranging from 1972 to 2024. Hourly records include streamflow, precipitation, temperature, relative humidity and potential evapotranspiration, with more than 65 % of streamflow records exceeding 40 years in length. CAMELS-NZ offers a rich set of static catchment attributes that quantify physical characteristics such as land cover, soil properties, geology, topography, and human impacts, including information on abstractions, dams, groundwater or snowmelt influences, as well as on ephemeral rivers. Aotearoa New Zealand's remarkable gradients in climate, topography, and geology give rise to diverse hydroclimatic landscapes and hydrological behaviours, making CAMELS-NZ a unique contribution to large-scale hydrological studies. Furthermore, Aotearoa New Zealand’s hydrology is defined by highly permeable volcanic catchments, sediment-rich alpine rivers with glacial contributions, and steep, rainfall-driven fast-rising rivers, providing opportunities to study diverse hydrological processes and rapid hydrological responses. CAMELS-NZ adheres to the standards established by most previously published CAMELS datasets, enabling international comparison studies. The dataset fills a critical gap in global hydrology by representing a Pacific Island environment with complex hydrological processes. This dataset supports a wide range of hydrological research applications, including model development and climate impact assessments, predictions in ungauged basins and large-sample comparative studies. 
The open-access nature of CAMELS-NZ ensures broad usability across multiple research domains, providing a foundation for national water resource and flood management, as well as international hydrological research. By integrating long-term high-resolution data with diverse catchment attributes, we hope that CAMELS-NZ will enable innovative research into Aotearoa New Zealand's hydrological systems while contributing to the global initiative to create freely available large-sample datasets for the hydrological community. The CAMELS-NZ dataset can be accessed at https://doi.org/10.26021/canterburynz.28827644 (Bushra et al., 2025).
Status: final response (author comments only)
CC1: 'Comment on essd-2025-244', Sacha Ruzzante, 02 Jun 2025
Overall this seems like a valuable contribution to the growing number of CAMELS datasets. However, I have some suggestions to improve the usefulness of the data and to improve consistency with other CAMELS datasets.
- Can you include time series of glacier evolution, as was done for CAMELS-CH (Höge et al., 2023)? Or, at minimum, include a static attribute that describes glacier cover for each catchment.
- Some of the static attributes are provided as categorical variables that indicate the dominant category (e.g. land cover, geology). For many applications it is more useful to know the percentage of the catchment that falls into each category, rather than just the most common category.
- There are many useful static attributes that can be calculated but are not currently included. For example, soil characteristics from SoilGrids (Poggio et al., 2021), catchment average elevation, mean annual temperature, precipitation seasonality, etc. See other CAMELS datasets or the Caravan project (Kratzert et al., 2023) for examples.
- Are there other climate datasets that could be included as well? For machine learning models, previous work has shown that including several climate datasets in training usually improves overall model skill. Examples include ERA5-Land (Muñoz Sabater, 2019), the New Zealand Reanalysis Dataset (Pirooz et al., 2023), CHIRPS (Funk et al., 2015), CPC (Chen & Xie, 2008), etc. You may want to look at how these were included in other CAMELS datasets such as CAMELS-BR (Chagas et al., 2020). Some of these provide daily data only, but that is what many users will want anyway. For snow-affected catchments it would be useful to have a SWE product (e.g. ERA5-Land).
- It would be useful to also provide daily aggregated streamflow and meteorology data. Most hydrologic models are built on daily data, and for benchmarking models across different research groups it is useful to know that everyone is using exactly the same data. Providing the daily aggregated data helps ensure this.
- The paper states “Information on how to obtain permission [for the 13 gauges that require it] is provided in the readme file”, but this is missing from the readme file.
- I’m not sure what the authors mean by the “original temporal structure” in the following: “All time series data are reported in the local time zone, and include the effects of daylight saving time (DST) where applicable. No corrections or transformations were applied to standardise timestamps across the dataset. This decision was made to preserve the original temporal structure of the observations.” It would be more useful if all timestamps were provided in standard time, and it is quite possible to do this while preserving the temporal structure of the data.
- There are some negative streamflow values. For example, station 29231, which has a number of timestamps with flow of -0.003 cms. What does this mean?
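The daily aggregation and the negative-flow check raised in the list above can be sketched in a few lines. This is only an illustration under stated assumptions: the function names, the `min_hours` completeness threshold, and the in-memory `(timestamp, flow)` layout are hypothetical and are not part of CAMELS-NZ. Negative values are only reported, not altered, because the dataset does not say what they represent.

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean


def daily_mean_flow(records, min_hours=24):
    """Aggregate hourly (timestamp, flow) pairs into daily mean flows.

    records   : list of (datetime, float or None) pairs, one per hour.
    min_hours : hypothetical completeness threshold; days with fewer valid
                hourly values are returned as None instead of a mean.
    """
    by_day = defaultdict(list)
    for stamp, flow in records:
        day_values = by_day[stamp.date()]  # registers the day even if flow is missing
        if flow is not None:
            day_values.append(flow)
    return {
        day: (fmean(values) if len(values) >= min_hours else None)
        for day, values in sorted(by_day.items())
    }


def negative_flow_times(records):
    """Return the timestamps whose flow is negative, for manual review."""
    return [stamp for stamp, flow in records if flow is not None and flow < 0]


# Made-up example: one complete day, then a half day containing one negative value.
hourly = [(datetime(2020, 1, 1, h), 2.0) for h in range(24)]
hourly += [(datetime(2020, 1, 2, h), -0.003 if h == 5 else 1.0) for h in range(12)]

print(daily_mean_flow(hourly))
print(negative_flow_times(hourly))
```

With local timestamps that include DST, some calendar days hold 23 or 25 hourly values, so a strict 24-hour threshold would need relaxing unless the series is first converted to standard time.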
Chagas, V. B. P., Chaffe, P. L. B., Addor, N., Fan, F. M., Fleischmann, A. S., Paiva, R. C. D., & Siqueira, V. A. (2020). CAMELS-BR: Hydrometeorological time series and landscape attributes for 897 catchments in Brazil. Earth System Science Data 12(3), 2075–2096. https://doi.org/10.5194/essd-12-2075-2020
Chen, M., & Xie, P. (2008, January 8). CPC unified gauge-based analysis of global daily precipitation. Western Pacific Geophysics Meeting, Cairns, Australia.
Funk, C., Peterson, P., Landsfeld, M., Pedreros, D., Verdin, J., Shukla, S., Husak, G., Rowland, J., Harrison, L., Hoell, A., & Michaelsen, J. (2015). The climate hazards infrared precipitation with stations—A new environmental record for monitoring extremes. Scientific Data, 2(1), 150066. https://doi.org/10.1038/sdata.2015.66
Höge, M., Kauzlaric, M., Siber, R., Schönenberger, U., Horton, P., Schwanbeck, J., Floriancic, M. G., Viviroli, D., Wilhelm, S., Sikorska-Senoner, A. E., Addor, N., Brunner, M., Pool, S., Zappa, M., & Fenicia, F. (2023). CAMELS-CH: Hydro-meteorological time series and landscape attributes for 331 catchments in hydrologic Switzerland. Earth System Science Data, 15(12), 5755–5784. https://doi.org/10.5194/essd-15-5755-2023
Kratzert, F., Nearing, G., Addor, N., Erickson, T., Gauch, M., Gilon, O., Gudmundsson, L., Hassidim, A., Klotz, D., Nevo, S., Shalev, G., & Matias, Y. (2023). Caravan—A global community dataset for large-sample hydrology. Scientific Data, 10(1), 61. https://doi.org/10.1038/s41597-023-01975-w
Muñoz Sabater, J. (2019). ERA5-Land monthly averaged data from 1950 to present [Dataset]. Copernicus Climate Change Service (C3S) Climate Data Store (CDS). https://doi.org/10.24381/cds.68d2bb30
Pirooz, A., Moore, S., Carey-Smith, T., Turner, R., & Su, C.-H. (2023). The New Zealand Reanalysis (NZRA): Development and preliminary evaluation. Weather and Climate, 42(1), 58–74. https://doi.org/10.2307/27226715
Poggio, L., de Sousa, L. M., Batjes, N. H., Heuvelink, G. B. M., Kempen, B., Ribeiro, E., & Rossiter, D. (2021). SoilGrids 2.0: Producing soil information for the globe with quantified spatial uncertainty. SOIL, 7(1), 217–240. https://doi.org/10.5194/soil-7-217-2021
Citation: https://doi.org/10.5194/essd-2025-244-CC1
RC1: 'Comment on essd-2025-244', Anonymous Referee #1, 17 Jun 2025
This paper describes a new open access dataset that will be of great value to hydrologists and others in the earth sciences. It is clear and well written and includes excellent background on NZ's climate, landscape and geology. I recommend that the paper is published if the following minor comments are addressed satisfactorily.
Lines 140-145: I would recommend using a more recent source for future estimates of temperature and rainfall changes instead of King 2010 as this is based on NIWA's modelling done nearly 20 years ago for AR4. For instance:
Peter Gibson, et al., 2025, Downscaled CMIP6 future climate projections for New Zealand: climatology and extremes, Weather and Climate Extremes, https://doi.org/10.1016/j.wace.2025.100784.
Also, when discussing changes in climate, the reference period needs to be included; e.g. "expected to warm by 1 degC by 2040 relative to the 1986-2005 average". In addition, as currently written only a single scenario has been explored (probably King 2010's mid-range scenario), but this choice needs to be highlighted and the scenario information included for context.
In the context of this dataset, it might also be of interest to describe how NZ's climate has changed over the past 60 years, for example by referencing the 'seven station series' (https://niwa.co.nz/climate-and-weather/nz-temperature-record/seven-station-series-temperature-data) or the following recent research on changing climate normals, although there are many other studies that could be used.
Srinivasan, R., et al. (2024). Moving to a new normal: Analysis of shifting climate normals in New Zealand. International Journal of Climatology, 44(10), 3240–3263. https://doi.org/10.1002/joc.8521
Figure 4 and related text: Fig 4a units should be in %. Coefficient of variation is a normalised ratio of SD to MEAN, so either expressed as a fraction or %. The incorrect units are also in Table 3.
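A small worked example may help show why the coefficient of variation carries no rainfall units, and why it can rank sites differently from the standard deviation. The annual totals below are made up for illustration and are not taken from CAMELS-NZ; the function name is likewise hypothetical.

```python
from statistics import fmean, stdev


def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean (a unitless fraction)."""
    return stdev(values) / fmean(values)


# Illustrative (made-up) annual rainfall totals in mm for two sites.
wet_site = [9000.0, 10000.0, 11000.0]
dry_site = [400.0, 500.0, 600.0]

for name, totals in (("wet", wet_site), ("dry", dry_site)):
    cv = coefficient_of_variation(totals)
    print(f"{name}: SD = {stdev(totals):.0f} mm, CV = {cv:.2f} ({cv:.0%})")
```

In this example the wet site has ten times the standard deviation of the dry site but half its coefficient of variation, so "variability" needs to be qualified as absolute or relative.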
Line 253: Related to above, the sentence "Rainfall variability is highest in the eastern parts of both of the North and South Island" is not correct. Annual rainfall variability relative to the annual mean (i.e. the coefficient of variation) is highest in the east. Variability of annual rainfall in absolute terms (e.g. variance or standard deviation) is greatest in the higher elevation parts of the West Coast. Without a plot of annual mean rainfall, a reader unfamiliar with NZ might come away from this part of the paper not realising that the highest rainfalls are in the west. I recommend that Fig 4 also include a map of mean annual rainfall.
Citation: https://doi.org/10.5194/essd-2025-244-RC1
RC2: 'Comment on essd-2025-244', Anonymous Referee #2, 25 Jun 2025
I agree with the other comments posted about this manuscript - this new open-access dataset will be a great new resource for those interested in New Zealand hydrology and climatology. Similarly, I also find the manuscript to be well written and presented, and have only minor changes to suggest. These are as follows:
Introduction: The recency of its publication is probably the reason for this omission, but the ROBIN (Reference observatory of basins for international hydrological climate change detection) data set (Turner et al. 2025; https://doi.org/10.1038/s41597-025-04907-y) needs to be acknowledged here. As the name indicates, this is a global dataset of streamflow records that have minimal direct anthropogenic influence. Importantly, this includes 111 flow records for New Zealand. Although not diminishing the contribution that this new CAMELS-NZ dataset provides, it is important to note that another large streamflow dataset for NZ exists. Furthermore, it would be very helpful if an additional attribute could be added to this CAMELS dataset to indicate which of its records have passed the ROBIN standards around direct human influence.
Line 73 and elsewhere: it seems a bit odd for the paper to cite itself – surely this is unnecessary?
Line 87: Snelder & Biggs (2002) is not the best citation here, at least not in isolation. Although it has some relevance, it does not directly describe the nature of NZ gradients in climate, topography and geology, or the extent to which they are ‘remarkable’.
Lines 103-104: volcanic catchments might be typical of Pacific Islands, but alpine settings are not.
Line 114: Naming conventions. While Aotearoa is commonly added to New Zealand (i.e. Aotearoa New Zealand) for the name of the country, it is not the official name of the country. In contrast, the North and South Islands do have official names in Te Reo, according to the NZ Geographical Board: Te Ika-a-Māui and Te Waipounamu, respectively (https://gazetteer.linz.govt.nz/). Further, use of Aotearoa New Zealand vs just New Zealand in the manuscript is not consistent. Perhaps the Te Reo plus English versions could be given at first use (for all NZ place names), then only English thereafter?
Line 140: Glacier formation is not quite the right term – this is a process that takes many years (and probably is not occurring anywhere in the world right now?). Perhaps glacier mass balance would be more appropriate.
Line 143: state the emissions scenario this projection corresponds to. Same comment for the following precipitation projections. Note that the King (2010) report has also been superseded by more recent assessments.
Line 249: Given that this paper describes a dataset, more specific information on estimation of PET would be helpful. Conventionally, the Priestley-Taylor equation uses an empirical constant to model the effects of vapour pressure deficit on evaporation, meaning that relative humidity data are not used. Similarly, net radiation (rather than temperature) data are used. However, the Clark et al. (2008) study that is cited here states that “radiation terms are estimated empirically”, followed by a citation to Shuttleworth (1993). Correspondingly, it would be helpful to state explicitly how PET is calculated for this data set, and thus how it differs from the conventional Priestley-Taylor method.
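For reference, the conventional Priestley-Taylor form that this comment contrasts with can be sketched as follows. This is not the CAMELS-NZ or Clark et al. (2008) calculation, which the manuscript does not spell out. The constants (alpha = 1.26, a sea-level psychrometric constant, the FAO-56 expression for the saturation slope) and the daily units are assumptions chosen for illustration, and the function names are hypothetical.

```python
import math

ALPHA = 1.26        # Priestley-Taylor empirical constant (dimensionless)
LATENT_HEAT = 2.45  # latent heat of vaporisation, MJ kg^-1
GAMMA = 0.066       # psychrometric constant near sea level, kPa degC^-1 (assumed)


def saturation_slope(temp_c):
    """Slope of the saturation vapour pressure curve (kPa degC^-1), FAO-56 form."""
    es = 0.6108 * math.exp(17.27 * temp_c / (temp_c + 237.3))
    return 4098.0 * es / (temp_c + 237.3) ** 2


def priestley_taylor_pet(net_radiation, temp_c, ground_heat_flux=0.0):
    """Conventional Priestley-Taylor PET in mm per day.

    net_radiation and ground_heat_flux are in MJ m^-2 day^-1; temp_c is mean
    air temperature in degC, used only to evaluate the slope term.
    """
    delta = saturation_slope(temp_c)
    available_energy = net_radiation - ground_heat_flux
    return ALPHA * delta / (delta + GAMMA) * available_energy / LATENT_HEAT


# Illustrative inputs: 15 MJ m^-2 day^-1 of net radiation at 20 degC.
print(round(priestley_taylor_pet(15.0, 20.0), 2))
```

In this form relative humidity does not appear at all, and temperature enters only through the slope term, so a description of which inputs drive the radiation estimate in the dataset would make the difference explicit.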
Line 224-225. It would be more accurate to state that whilst a very dense gauging network would be needed to capture the high spatial variability in rainfall for the mountainous regions of NZ, such a network does not currently exist.
Line 226-227. While the VCSN does provide these data, access to the VCSN is largely restricted behind a paywall (https://data.niwa.co.nz/products/vcsn-timeseries?_gl=1*1q6gqoh*_ga*MTg1MTUxMzY0OS4xNzUwODE2ODg1*_ga_4CXN46915J*czE3NTA4MTY4ODUkbzEkZzAkdDE3NTA4MTY4OTIkajUzJGwwJGgw ). If this new CAMELS-NZ dataset effectively bypasses that paywall for these catchment-average time series, this is good news and should be more clearly noted!
Some comment on the use of catchment average rainfall should be provided, particularly in regions of high rainfall gradients such as the eastern side of the Southern Alps Main Divide. In regions such as this where mean annual rainfall can drop by an order of magnitude across a catchment, how informative is catchment average rainfall? Perhaps an additional data set attribute quantifying this rainfall gradient would be helpful?
Finally, with respect to the VCSN – previous studies have highlighted problems with these data with respect to interpolation across complex terrain using a sparse observational network, e.g. Tait and Macara (2014; https://doi.org/10.2307/26169743) and Tait et al. (2012; https://www.jstor.org/stable/43944886). These need to be acknowledged here.
Citation: https://doi.org/10.5194/essd-2025-244-RC2
CC2: 'Comment on essd-2025-244', Ather Abbas, 25 Jun 2025
I am the author of AquaFetch (https://github.com/hyex-research/AquaFetch), a Python library designed to unify open-source hydrological datasets within the Python environment. While integrating CAMELS-NZ into our package, we encountered a few issues and would greatly appreciate it if they could be addressed in a future release of the dataset:
- The streamflow file for station 74368 is empty.
- The index/datetime format in the streamflow file for station 57521 differs from that of the other stations.
- The time series files for all stations and all five features contain missing and duplicated datetime indices. This appears to be due to data missing or duplicated at the 02:00:00 timestamp on certain dates.
- Adding a new attribute to the shapefile to identify the Station_ID for each catchment polygon would be highly beneficial.
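The missing and duplicated indices described above can be located with a short check such as the sketch below. It assumes the timestamps have already been parsed into naive `datetime` objects at a regular hourly step; the function name and the sample series are hypothetical. A skipped 02:00 in spring and a repeated 02:00 in autumn is the pattern New Zealand's daylight saving transitions would produce in local-time stamps, although the source does not confirm that this is the cause.

```python
from collections import Counter
from datetime import datetime, timedelta


def audit_hourly_index(stamps, step=timedelta(hours=1)):
    """Report duplicated and missing timestamps in a regular series.

    stamps : iterable of naive datetime objects as read from a station file.
    Returns (duplicates, missing), both sorted lists of datetimes.
    """
    counts = Counter(stamps)
    duplicates = sorted(t for t, n in counts.items() if n > 1)
    if not counts:
        return duplicates, []
    missing = []
    current, last = min(counts), max(counts)
    while current <= last:
        if current not in counts:
            missing.append(current)
        current += step
    return duplicates, missing


# Made-up series with one repeated stamp (02:00) and one absent stamp (05:00).
base = datetime(2021, 1, 1)
stamps = [base + timedelta(hours=h) for h in (0, 1, 2, 2, 3, 4, 6, 7)]
duplicates, missing = audit_hourly_index(stamps)
print("duplicated:", duplicates)
print("missing:", missing)
```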
Thank you very much for making this dataset available to the open-source community.
Citation: https://doi.org/10.5194/essd-2025-244-CC2