Articles | Volume 18, issue 5
https://doi.org/10.5194/essd-18-3671-2026
© Author(s) 2026. This work is distributed under the Creative Commons Attribution 4.0 License.
OneDZ: a global detrital zircon database and implications for constructing giant geoscience database
Download
- Final revised paper (published on 01 Jun 2026)
- Supplement to the final revised paper
- Preprint (discussion started on 03 Jun 2025)
- Supplement to the preprint
Interactive discussion
Status: closed
Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
- RC1: 'Comment on essd-2025-157', Anonymous Referee #1, 03 Jul 2025
- AC1: 'Reply on RC1', Keran Li, 01 Aug 2025
- AC2: 'Comment on essd-2025-157', Keran Li, 01 Aug 2025
- RC2: 'Comment on essd-2025-157', Bryant Ware, 05 Aug 2025
- AC3: 'Reply on RC2', Keran Li, 07 Aug 2025
Peer review completion
AR – Author's response | RR – Referee report | ED – Editor decision | EF – Editorial file upload
AR by Keran Li on behalf of the Authors (22 Oct 2025)
ED: Referee Nomination & Report Request started (07 Dec 2025) by Kirsten Elger
RR by Bryant Ware (18 Mar 2026)
ED: Publish subject to minor revisions (review by editor) (20 Mar 2026) by Kirsten Elger
AR by Keran Li on behalf of the Authors (29 Mar 2026)
ED: Reconsider after major revisions (04 Apr 2026) by Kirsten Elger
AR by Keran Li on behalf of the Authors (29 Apr 2026)
ED: Publish as is (01 May 2026) by Kirsten Elger
AR by Keran Li on behalf of the Authors (04 May 2026)
Manuscript
General Comments
The OneDZ manuscript presents a modern approach to compiling detrital zircon U-Pb geochronology and Lu-Hf isotopic data. It greatly improves the efficiency of data management and curation by using AI and Python scripts to automate data validation and cleaning.
The need for such a compilation is evident, as reflected by the large download count of the archival dataset that accompanies this manuscript.
While great effort was put into creating the protocols, scripts and backend of OneDZ, its functionality as a standalone browser tool requires further development.
Specific Comments
User-serving components
There seems to be some missing functionality on the OneDZ user interface, e.g. https://dedc.geoscience.cn/onedz/HomePage.html returns a 404 error and the other two menu links are inactive.
When performing a coordinate search on the OneDZ user interface, there seems to be a certificate issue that blocks HTTP API requests made from the HTTPS domain and prevents data download. This should be addressed and made part of regular maintenance if the database frontend is intended as a community resource.
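One possible reading of this symptom, offered as an assumption rather than a diagnosis of the OneDZ server, is mixed content: a page served over HTTPS that calls an API over plain HTTP, which browsers typically block. A minimal sketch of a maintenance check is given below; the function name and the example.org URLs are hypothetical and do not refer to actual OneDZ endpoints.

```python
from urllib.parse import urlparse


def find_mixed_content(page_url, request_urls):
    """Return the request URLs that use plain HTTP although the page is HTTPS.

    Hypothetical helper for illustration: it only compares URL schemes and
    does not contact any server or validate any certificate.
    """
    if urlparse(page_url).scheme != "https":
        # Mixed-content blocking only applies to pages served over HTTPS.
        return []
    return [u for u in request_urls if urlparse(u).scheme == "http"]


# Placeholder URLs, not real OneDZ endpoints.
page = "https://example.org/onedz/search.html"
api_calls = [
    "http://example.org/api/coordinate-search",
    "https://example.org/api/download",
]
blocked = find_mixed_content(page, api_calls)
print(blocked)
```

Running such a check against the list of API endpoints used by the frontend after each deployment would flag this class of problem before users encounter it.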
The table data extraction tool in DeepShovel seems to have better accuracy than most commercially available OCR products.
Navicat is commercial software. If the intention is to “enhance user-friendliness” while providing an accessible interface, consider using open-source software (e.g. DBeaver).
Manuscript comments
The paper makes no reference to other existing databases such as EarthBank (https://ausgeochem.auscope.org.au/map) or Geochron (https://www.geochron.org/geochronsearch.php), which provide a similar product with more user-friendly interfaces.
The authors state that “Although almost no previous research summarized the difficulties in collecting data sources”. However, there is an extensive body of literature on this topic; one recent example is https://doi.org/10.3390/rs16091484.
The authors mention “To ensure accessibility and inclusivity, Chinese-language papers on detrital zircons have been meticulously translated into English.” This is a major effort that is highly welcomed by the international community. To further ensure accessibility and inclusiveness of data access, consider translating the menus and buttons of the user interface as well.
Regarding the documented “spatial skew” of the OneDZ dataset – a side-by-side comparison of OneDZ sample distribution maps with AusGeochem/EarthBank (a global compilation that began as a nationally focused effort) reveals that curatorial priorities also contribute to regional data availability.
How is the discordance ratio defined in the database and was it calculated for all papers in the same way? See https://doi.org/10.1016/j.earscirev.2019.102899 for discussion.
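To illustrate why the definition matters, the sketch below compares two commonly used discordance conventions for the same grain. It is not the OneDZ calculation; the function names, the ages, and the 10 % cutoff are illustrative assumptions.

```python
def discordance_vs_207_206(age_206_238, age_207_206):
    """Percent discordance relative to the 207Pb/206Pb age (ages in Ma)."""
    return (1.0 - age_206_238 / age_207_206) * 100.0


def discordance_vs_207_235(age_206_238, age_207_235):
    """Percent discordance relative to the 207Pb/235U age (ages in Ma)."""
    return (1.0 - age_206_238 / age_207_235) * 100.0


# Hypothetical grain: the three ages below are made-up illustrative values.
age_206_238, age_207_235, age_207_206 = 880.0, 915.0, 1000.0
cutoff = 10.0  # assumed filter threshold in percent

d_a = discordance_vs_207_206(age_206_238, age_207_206)
d_b = discordance_vs_207_235(age_206_238, age_207_235)

# The same analysis can fail one filter and pass the other.
print(f"vs 207Pb/206Pb: {d_a:.1f} %, kept: {abs(d_a) <= cutoff}")
print(f"vs 207Pb/235U:  {d_b:.1f} %, kept: {abs(d_b) <= cutoff}")
```

With these values the grain is rejected under the first convention and retained under the second, so a compilation that mixes conventions across source papers would apply inconsistent filters.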
Since this contribution is focused on an SQL database, the most useful figure would be a database schema with tables and keys and relationships noted.
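As a sketch of the kind of information such a figure would convey, the example below builds a toy three-table schema in an in-memory SQLite database and lists its foreign-key relationships. All table and column names are hypothetical placeholders and are not taken from the actual OneDZ SQL files.

```python
import sqlite3

# Hypothetical schema for illustration only; not the OneDZ schema.
SCHEMA = """
CREATE TABLE publication (
    pub_id INTEGER PRIMARY KEY,
    doi    TEXT UNIQUE
);
CREATE TABLE sample (
    sample_id INTEGER PRIMARY KEY,
    pub_id    INTEGER NOT NULL REFERENCES publication(pub_id),
    latitude  REAL,
    longitude REAL,
    lithology TEXT
);
CREATE TABLE analysis (
    analysis_id  INTEGER PRIMARY KEY,
    sample_id    INTEGER NOT NULL REFERENCES sample(sample_id),
    best_age_ma  REAL,
    two_sigma_ma REAL
);
"""


def build_demo_db():
    """Create the toy schema in an in-memory SQLite database."""
    conn = sqlite3.connect(":memory:")
    conn.execute("PRAGMA foreign_keys = ON")
    conn.executescript(SCHEMA)
    return conn


def list_relationships(conn):
    """Return (child_table, child_column, parent_table, parent_column) tuples."""
    tables = [row[0] for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name")]
    relationships = []
    for table in tables:
        # foreign_key_list columns: id, seq, table, from, to, ...
        for fk in conn.execute(f"PRAGMA foreign_key_list({table})"):
            relationships.append((table, fk[3], fk[2], fk[4]))
    return relationships


for rel in list_relationships(build_demo_db()):
    print("%s.%s -> %s.%s" % rel)
```

A listing of this kind, generated from the real database, could serve as the basis for the requested schema diagram.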
“Class-2 and Class-3 types provide a more nuanced classification based on grain size” - Class-2 seems to provide a classification based on lithology (conglomerate, sandstone, mudstone, etc.).
Please clarify if publication Best Ages are what users have access to in the database.
Please make the code used for the two resampling methods and for SMOTE available in the GitHub repository or the supplement, and mention it around rows 255 and 395, respectively, of the preprint.
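For readers unfamiliar with SMOTE, its core interpolation step can be sketched as follows. This is a generic, simplified illustration written for this comment, not the authors' implementation; the function name, parameters, and sample points are hypothetical.

```python
import random


def smote_sketch(minority, n_synthetic, k=5, seed=0):
    """Generate synthetic points by interpolating between a minority sample
    and one of its k nearest minority neighbours (simplified SMOTE)."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_synthetic):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Rank the other minority samples by squared Euclidean distance.
        others = [j for j in range(len(minority)) if j != i]
        others.sort(key=lambda j: sum(
            (a - b) ** 2 for a, b in zip(base, minority[j])))
        neighbour = minority[rng.choice(others[:k])]
        # Place the new point at a random position on the connecting segment.
        gap = rng.random()
        synthetic.append(tuple(
            a + gap * (b - a) for a, b in zip(base, neighbour)))
    return synthetic


# Made-up two-dimensional points standing in for an under-represented class.
points = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)]
new_points = smote_sketch(points, n_synthetic=10, k=2)
print(len(new_points))
```

Because every synthetic point lies on a segment between two existing samples, the method cannot add information outside the sampled range, which is one reason the exact code and parameters used should be published.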
The term “Paleo globality” is not frequently used in Earth Sciences. Consider rewording to paleo reconstruction of spatial distribution (or equivalent) to avoid reader confusion.
“Therefore, the evaluation results based on OneDZ, the world's largest detrital zircon database, indicate that the global scope of zircon big data research needs further assessment.” It would be useful to specify what types of assessment you have in mind, e.g. which present-day areas require more sampling. Comparisons with other databases would be useful as well.
“The impact of data sparsity is controlled by the 2 σ error”: while the errors might help with outlier identification, they do not control data sparsity. Consider rewording this sentence.
Technical Corrections
Pre-Print Technical Corrections and recommendations are presented as comments in the attached pdf file.
Zenodo Dataset
The organization of the Zenodo archival dataset is confusing. The first version of the dataset contains SQL files without any description. The description of version v2 then strongly recommends using the SQL files, but they are not present in the v2 file list. To improve the findability of key files, the SQL files should be added to v2, or at least a note should clarify that they must be downloaded from v1. The warnings in notes 1-3, while pertinent, are not very specific to this dataset. Since there are known and systematic errors, they should be specifically documented (e.g. which Chinese, Latin and Arabic characters have not been converted correctly) and/or fixed, either with Excel macros or AI cleaning. Documenting the cleaning process of the transformed dataset would be an important contribution for the community at large and could help improve LLMs, which also struggle with these types of data transformations.
Supplementary material
Some of the GitHub Python scripts contain the same header block, which states “This module is mainly designed to remove duplicate samples”, even for modules that have other functions, e.g. latitude and longitude estimation. Accurate code documentation is essential for reusability.
Congratulations to the authors for their sustained efforts and thoughtful considerations in improving access to geoscience data.