Comment on essd-2021-268

This paper outlines a unique dataset of ground temperatures recorded over a 3-year period on the high plateau of the Bale Mountains. The area is underrepresented in past climate data records, and the data are of clear use in understanding the current environment of tropical high mountains. The dataset is also of interest because it was collected in remote circumstances. I therefore think that this paper and the data are of value and should be published. There are some concerns, however. The data are somewhat messy and contain a lot of gaps, and the record is not particularly long. The value would therefore be significantly enhanced if the data were combined with the attendant meteorological measurements (particularly air temperature), which have apparently also been made at many of the same locations at similar hourly resolution.

(3493-4377 m a.s.l.) in the Bale Mountains, Ethiopia (2017-2020)" by Alexander R. Groos et al., Earth Syst. Sci. Data Discuss., https://doi.org/10.5194/essd-2021-268-RC2, 2021
Since this is a data paper, the communication of the dataset and its organisation are important. I am a little confused by the numbering of the locations/loggers and the rather unsystematic organisation of this aspect. I know several loggers were stolen and there are no observations as a result. I would recommend ignoring these and numbering the sensors that have data starting with TM1 and GT1, in a more logical order (perhaps from high-elevation to low-elevation sites). Loggers that were lost don't really need numbers in the final dataset. Figure 3 can then be arranged in the same order (starting with TM1 and GT1). You could even label the loggers at the same location but different depths GT1h (2 cm), GT1m (10 cm) and GT1l (50 cm), i.e. high, mid, low (or something similar: a, b and c), to make it more obvious which are in groups of three. A more intelligent numbering system would make the dataset easier to navigate and make it appear less like a collection of disparate sub-experiments.
Figure 1 needs to show all the weather stations (two are off the bottom edge of the map), and I think some contours would help show the hypsometry more clearly. The current shading is somewhat confusing and concentrates on landforms rather than elevation bands; perhaps a 500 m contour interval?
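To illustrate the renumbering suggested above, here is a minimal Python sketch. It is not the authors' scheme: the location names, old identifiers and the `renumber` helper are all hypothetical, and only the depth suffixes (h, m, l for 2, 10 and 50 cm) and the high-to-low ordering follow the suggestion.

```python
# Depth suffixes as proposed in the comment: high (2 cm), mid (10 cm), low (50 cm).
DEPTH_SUFFIX = {2: "h", 10: "m", 50: "l"}


def renumber(loggers, prefix="GT"):
    """Map old logger IDs to new IDs ordered from highest to lowest location.

    loggers: iterable of (old_id, location, elevation_m, depth_cm) tuples.
    Loggers without data (e.g. stolen ones) are simply left out of the input.
    """
    # One entry per location, highest first; ties broken by name for stability.
    locations = sorted(
        {(loc, elev) for _, loc, elev, _ in loggers},
        key=lambda item: (-item[1], item[0]),
    )
    index = {loc: i + 1 for i, (loc, _) in enumerate(locations)}
    return {
        old_id: f"{prefix}{index[loc]}{DEPTH_SUFFIX[depth]}"
        for old_id, loc, _, depth in loggers
    }


# Hypothetical inventory, for illustration only.
example = [
    ("old_a", "plateau_site", 3900, 2),
    ("old_b", "summit_site", 4377, 2),
    ("old_c", "summit_site", 4377, 10),
    ("old_d", "summit_site", 4377, 50),
]
new_ids = renumber(example)
```

With a mapping like this stored alongside the data, the original logger IDs remain traceable while the published files follow one consistent order.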
There are two sets of calibration data comparing the high- and low-cost loggers, one from the lab and one from the field. Again, it would be best if these were clearly accessible and perhaps made available in a sub-directory, since this is important information. Having said this, I am somewhat concerned that the field calibration is of limited use, since one logger was installed slightly lower than the other (page 10: line 32). This seems like an error. The details of the lab calibration in the text are vague (page 5, line 31): it says "several hours" but does not say, for example, at what controlled temperatures. If there is too much information for the main text, put it in a text file with the calibration data.
I think the data themselves might be clearer in three directories: (a) raw data, (b) corrected data and (c) complete time series with gaps filled. The current structure is a little confusing. Put the readme files concerning each stage in the relevant directory.
I also have some comments about the analyses and findings.
Much of the data analysis concerns comparing ground temperatures with equivalent meteorological variables measured at Tuluka (3848 m). Since this AWS is not adjacent to most of the sensors, a more logical choice would be Tullu Dimtu (4377 m), which apparently also has an AWS. Can you explain why this station was not chosen? If it has similar data, I would suggest using this site. Having said this, Figures 4 and 5 are very useful. Vertical lines separating each season (NDJF, MAM, JASO) would help the reader see the seasonal changes each year much more clearly. I am also wondering whether June is a short (but cloudy) break between the two wet seasons, since it does appear that there is a short dry period around this time (any comments?). Maybe this is a fourth season?
The analysis of slope aspect is good (I would keep it), but it is not just the timing of the peak soil temperature that changes with aspect (page 12, line 14). The peak is much more subdued on the north-facing slope because the sky is cloudy during June, when the sun is at its most northerly point in the sky. The cloudless period coincides with the time when the sun is near its most southerly trajectory. This explains the much higher readings recorded on the south face.
The lapse rate relationships in Figure 6 appear to be skewed by an outlier which is much warmer than expected given the elevation (I guess it is TM12 since its elevation is just below 3800 m), particularly in NDJF (when it is often sunny). I suspect therefore that this site is south facing (or has a distinct microclimate) and I would drop it from the lapse rate calculation.
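The sensitivity of a lapse rate to a single warm site can be checked with a simple regression run twice, once with and once without the suspect site. The Python sketch below uses invented elevations and temperatures and hypothetical site labels (`S1` to `S5`). It is not the authors' data or method, only an ordinary least-squares fit of temperature on elevation.

```python
from statistics import mean


def lapse_rate(elev_m, temp_c):
    """Ordinary least-squares slope of temperature on elevation, in °C per km."""
    mx, my = mean(elev_m), mean(temp_c)
    sxx = sum((x - mx) ** 2 for x in elev_m)
    sxy = sum((x - mx) * (y - my) for x, y in zip(elev_m, temp_c))
    return 1000.0 * sxy / sxx


# Invented values for illustration: site -> (elevation in m, mean temperature in °C).
sites = {
    "S1": (3500, 12.0),
    "S2": (3790, 13.5),  # hypothetical warm outlier just below 3800 m
    "S3": (3900, 9.6),
    "S4": (4100, 8.4),
    "S5": (4300, 7.2),
}


def rate_for(selected):
    """Lapse rate computed from the given subset of site labels."""
    elev = [sites[s][0] for s in selected]
    temp = [sites[s][1] for s in selected]
    return lapse_rate(elev, temp)


rate_all = rate_for(list(sites))
rate_without = rate_for([s for s in sites if s != "S2"])
print(f"all sites: {rate_all:.2f} °C/km, without S2: {rate_without:.2f} °C/km")
```

Reporting both values in the paper would let readers judge how much the seasonal lapse rates depend on that one site.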

Some specific comments:
Page 2, line 1. It is not always true that high mountains in the tropics receive more precipitation than adjacent lowlands (see Kilimanjaro for example) and precipitation often decreases on the highest summits. This statement is a bit misleading.
Page 2: line 12: elevation-dependent warming (and elsewhere)
Page 2: line 16: define longer records (I think the original context was >20 years)
Page 5: line 21: TM04 was used as calibration but then no longer used in the field. Any reason? This whole section is a bit vague on detail.
Table 1: It strikes me that the elevation range of 3493 m to 4377 m (marketed in the abstract) is rather optimistic, since the lowest station was only recorded for a year, and without this station the range is only ~600 m.
Page 9: line 14 ff: It would be good to have the regression equations listed somewhere in the metadata files, rather than just r² values and RMSE. This would enable someone else to replicate your work.
Page 10: lines 23-28. This paragraph seems out of place. It is about the reliability of the method and should come after everything else about the dataloggers, or in the methods section.
Page 11, line 6: where does the 2°C figure come from (data source?)
Page 11, line 23: Also the solar angle is lowest in Dec/Jan, with a maximum elevation of only around 60° at the December solstice, yet it is overhead in Apr/Aug.
Page 11, line 27: This is so much higher than the sites shown in Figure 4 (which are also at 2 cm) and must be a result of specific soil properties or the datalogger becoming exposed to radiation at the surface? Can you comment?
Page 11, line 30: cold air ponding is an interesting hypothesis, but do you have any evidence, e.g. from air temperatures? It strikes me that an analysis of frost incidence at each site would be really interesting. Perhaps a histogram showing the number of hours below freezing and its seasonal distribution at each site would be a useful graph. This is especially important given the stated context for the research, which is about permafrost and periglacial landforms.
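The frost-incidence counts behind such a histogram could be produced along the following lines. This is a minimal Python sketch under stated assumptions: hourly records are given as (timestamp, temperature) pairs with `None` for gaps, the seasons follow the NDJF, MAM, JASO grouping used in this comment, June is kept as its own bin (an assumption, since it falls outside those three groups), and the sample values are invented.

```python
from collections import Counter
from datetime import datetime

# Month -> season label; June kept separate (assumption, see lead-in).
SEASON_BY_MONTH = {
    11: "NDJF", 12: "NDJF", 1: "NDJF", 2: "NDJF",
    3: "MAM", 4: "MAM", 5: "MAM",
    6: "Jun",
    7: "JASO", 8: "JASO", 9: "JASO", 10: "JASO",
}


def frost_hours_by_season(records, threshold_c=0.0):
    """Count hourly readings below threshold_c per season.

    records: iterable of (datetime, temperature in °C or None for a gap).
    Gaps are skipped, so counts are lower bounds where data are missing.
    """
    counts = Counter()
    for timestamp, temp_c in records:
        if temp_c is not None and temp_c < threshold_c:
            counts[SEASON_BY_MONTH[timestamp.month]] += 1
    return dict(counts)


# Invented hourly readings for one hypothetical site.
sample = [
    (datetime(2018, 1, 1, 5), -1.5),
    (datetime(2018, 1, 1, 6), -0.2),
    (datetime(2018, 1, 1, 12), 8.0),
    (datetime(2018, 4, 3, 5), -0.1),
    (datetime(2018, 8, 1, 5), None),  # data gap
    (datetime(2018, 8, 1, 6), 1.0),
]
frost = frost_hours_by_season(sample)
```

Because the records contain many gaps, it would be worth reporting the number of valid hours per season alongside the frost counts so that sites can be compared fairly.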
Page 12, line 14: the sun is not at its zenith in Jan/Feb; it is overhead in the southern hemisphere at that time. It may have a high local angle of incidence on the south-facing slope, but that is not the same thing.
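The solar geometry behind these comments can be checked with the standard noon relation, elevation = 90° − |latitude − declination|. The Python sketch below assumes a latitude of about 7°N for the Bale Mountains (an approximation introduced here, not a value taken from the manuscript) and solstice declinations of ±23.44°.

```python
def noon_solar_elevation(lat_deg, decl_deg):
    """Solar elevation at local solar noon, in degrees above the horizon."""
    return 90.0 - abs(lat_deg - decl_deg)


LATITUDE_DEG = 7.0      # assumed approximate latitude of the study area
DECL_SOLSTICE = 23.44   # magnitude of solar declination at the solstices

december = noon_solar_elevation(LATITUDE_DEG, -DECL_SOLSTICE)  # about 59.6°
june = noon_solar_elevation(LATITUDE_DEG, DECL_SOLSTICE)       # about 73.6°, sun to the north
overhead = noon_solar_elevation(LATITUDE_DEG, LATITUDE_DEG)    # 90° when declination equals latitude
```

Under these assumptions the December value is consistent with the "around 60°" quoted above, and the sun is overhead only when the declination equals the site latitude, not in Jan/Feb.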
Page 12, line 28: the five sites which are being continued need identification in Figure 1, Table 3.
Page 13, line 22: the mean annual air temp is how much lower than soil temp? Some figures would be useful here.
Page 13, line 28: not just the timing, but the amplitude of the seasonal cycle
Page 14, line 14: elevation-dependent warming (as earlier)
The conclusions are a bit similar to the abstract. I note that ground frost is again mentioned here as a major finding, yet there is no analysis of this aspect (frost frequency).
I hope that these suggestions will improve the organisation and communication of the findings.