1 State Key Laboratory of Severe Weather, Chinese Academy of Meteorological Sciences, Beijing 100081, China
2 Hubei Subsurface Multi-scale Imaging Key Laboratory, Institute of Geophysics and Geomatics, China University of Geosciences, Wuhan 430074, China
3 Key Laboratory of Geographic Information Science (Ministry of Education), School of Geographic Sciences, East China Normal University, Shanghai 200241, China
4 College of Informatics, Huazhong Agricultural University, Wuhan 430070, China
5 Ministry of Education Key Laboratory for Earth System Modeling, Department of Earth System Science, Tsinghua University, Beijing 100084, China
6 Meteorological Observation Center, China Meteorological Administration, Beijing 100081, China
7 School of Environment and Spatial Informatics, China University of Mining and Technology, Xuzhou, China
8 China Meteorological Administration, Beijing 100081, China
9 State Key Laboratory of Atmospheric Boundary Layer Physics and Atmospheric Chemistry, Institute of Atmospheric Physics, Beijing 100029, China
Received: 03 May 2022 – Discussion started: 05 May 2022
Abstract. The planetary boundary layer (PBL) is the lowermost part of the troposphere and governs the exchange of momentum, mass and heat between the surface and the atmosphere. To date, radiosonde measurements have been used extensively to estimate the PBL height (PBLH); however, because of their low spatial coverage and temporal resolution, radiosonde data alone cannot describe the diurnal variation of PBLH across the globe. To fill this data gap, this paper aims to produce a PBLH dataset over the global land that is temporally continuous over the course of a day, by applying machine learning algorithms to integrate high-resolution radiosonde measurements, the ERA5 reanalysis, and the GLDAS product. The dataset covers the period from 2011 to 2021 with a temporal resolution of 3 hours and a horizontal resolution of 0.25°×0.25°. The radiosonde dataset contained around 180 million profiles over 370 stations across the globe. The machine learning model was established by taking 18 parameters derived from the ERA5 reanalysis and GLDAS as input variables, while the PBLH biases between radiosonde observations and the ERA5 reanalysis were used as the learning targets. The input variables are presumed to be representative of land properties, near-surface meteorological conditions, terrain elevations, lower-tropospheric stabilities, and solar cycles. Once the model had been trained, it was used to predict the PBLH bias at other grid points across the globe from parameters acquired or derived from ERA5 and GLDAS. The merged PBLH is then the sum of the predicted PBLH bias and the PBLH retrieved from the ERA5 reanalysis. Overall, this merged high-resolution PBLH dataset is globally consistent with the PBLH retrieved from radiosonde observations in both magnitude and spatiotemporal variation, with a mean bias as low as –0.9 m.
The dataset and related codes are publicly available at https://doi.org/10.5281/zenodo.6498004 (Guo et al., 2022). They are relevant to a wide range of scientific research and applications, including air quality, convection initiation, climate and climate change, to name a few.
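The merging step described in the abstract (learn the radiosonde-minus-ERA5 bias at stations, predict it elsewhere, then add it back to the ERA5 PBLH) can be sketched as follows. This is a minimal illustration, not the authors' code: a linear least-squares fit stands in for the machine learning model, all data are synthetic, and the function names `fit_bias_model` and `merge_pblh` are hypothetical. The sign convention (bias = radiosonde PBLH minus ERA5 PBLH) is the one implied by the abstract's statement that the merged PBLH is the sum of the predicted bias and the ERA5 PBLH.

```python
import numpy as np


def fit_bias_model(features, pblh_obs, pblh_era5):
    """Fit a linear stand-in model for the bias (radiosonde minus ERA5)."""
    bias = pblh_obs - pblh_era5  # learning target
    X = np.column_stack([features, np.ones(len(features))])  # add intercept
    coef, *_ = np.linalg.lstsq(X, bias, rcond=None)
    return coef


def merge_pblh(features, pblh_era5, coef):
    """Merged PBLH = ERA5 PBLH + predicted bias."""
    X = np.column_stack([features, np.ones(len(features))])
    return pblh_era5 + X @ coef


# Synthetic example only: 18 input variables, as in the abstract.
rng = np.random.default_rng(0)
n_station, n_grid, n_feat = 200, 50, 18

station_features = rng.normal(size=(n_station, n_feat))
station_era5 = rng.uniform(200.0, 2000.0, size=n_station)  # metres
true_w = rng.normal(scale=20.0, size=n_feat)
# Noiseless synthetic "observations": ERA5 plus a linear bias with offset 30 m.
station_obs = station_era5 + station_features @ true_w + 30.0

# Train at station locations.
coef = fit_bias_model(station_features, station_obs, station_era5)

# Apply at grid cells without radiosonde observations.
grid_features = rng.normal(size=(n_grid, n_feat))
grid_era5 = rng.uniform(200.0, 2000.0, size=n_grid)
grid_merged = merge_pblh(grid_features, grid_era5, coef)
```

Learning the bias rather than the PBLH itself keeps the ERA5 field as the first guess, so the model only has to account for the departure from it. Any regressor with a fit/predict interface could replace the least-squares step in this sketch.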
This study developed a state-of-the-science method to derive a global PBLH dataset by merging in situ observations with a reanalysis dataset. It improves the performance of a so-called “data fusion” technique and provides critical data for climate research. There are no obvious flaws in the methodology, and the final output is informative enough to compensate for the spatiotemporal discrepancies that limit current atmospheric datasets. Despite the good structure and comprehensive analysis, the authors should address the following questions and comments. After that, I think this manuscript can be accepted for publication.
Specific comments:
Line 123: Please explain in more detail how a terrain gradient or lower-tropospheric stability leads to underestimation of the PBLH.
The title of the paper includes “…ERA5 reanalysis, and GLDAS”. However, GLDAS is not mentioned until the last paragraph. Please add some description of GLDAS.
Line 154: Please clarify whether the interpolation is based on altitude or elevation.
Line 158: It does not seem correct to describe the coverage as spatially even. The coverage in Australia is clearly uneven, especially in Figure 1d.
Line 173: Any reference for the definition of LST?
Line 207: How did the authors match the station-based PBLH with the gridded PBLH in the comparison?
Line 259: Please specify clearly whether all the data from 2011-2021 were included in the model training stage. Were they divided by measurement time (e.g., 0000, 0006…)?
A simple question: What is the merit of a ~100/200 m improvement in PBLH (compared with the raw method) for future applications of this dataset? Would it have any impact on climate-scale studies?
Technical corrections:
Lines 99 and 116: the definition of ERA-5 should be moved earlier.
Please keep it consistent by using either ERA5 or ERA-5 in the whole manuscript.
Line 233: The main text states that Table 2 shows the correlation coefficients between PBLH and each variable, but the caption of Table 2 says that they are correlation coefficients with the PBLH bias between radiosonde and ERA5 reanalysis. This is easy to misinterpret. Please address.
Line 242: Please use subscripts or other notation to mark PBLH-M and PBLH-E in the equation. Otherwise, the hyphen is easily read as a minus sign.
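One possible subscript notation for the last point is sketched below. It assumes that PBLH-M denotes the merged PBLH and PBLH-E the ERA5 PBLH, and that the equation expresses the relation stated in the abstract (merged PBLH equals ERA5 PBLH plus predicted bias). The symbol for the predicted bias is illustrative, and the manuscript's actual equation at line 242 is not reproduced here.

```latex
\mathrm{PBLH}_{\mathrm{M}} = \mathrm{PBLH}_{\mathrm{E}} + \Delta\mathrm{PBLH}_{\mathrm{pred}}
```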
A merged high-resolution PBL height (PBLH) dataset over the global land, with good accuracy compared with radiosonde observations, is generated via machine learning algorithms. It covers the period from 2011 to 2021 with a 3-hour temporal resolution and a 0.25° spatial resolution. The machine learning model takes parameters derived from the ERA5 reanalysis and the GLDAS product as input and the PBLH biases between radiosonde and ERA5 as the learning targets. The merged PBLH is the sum of the predicted PBLH bias and the PBLH from ERA5.