Review of the manuscript “New 30 m resolution Hong Kong climate, vegetation, and topography rasters indicate greater spatial variation than global grids within an urban mosaic” by Morgan and Guénard
General comments
The manuscript “New 30 m resolution Hong Kong climate, vegetation, and topography rasters indicate greater spatial variation than global grids within an urban mosaic” describes a high- to medium-resolution dataset comprising a large variety of topography, vegetation and climate rasters for the area of Hong Kong. The authors explain the motivation for and the usefulness of such a dataset well, emphasizing its applicability especially in Species Distribution Modeling. The selection of the different variables, their elaboration and their evaluation are described in detail. While I cannot evaluate whether the data manipulation was properly designed and followed standard procedures, the authors make a great effort to describe the procedure they carried out in detail. The vast and varied dataset, along with the manuscript, fits well into the scope of the journal “Earth System Science Data” and could be considered for publication after the authors address some of the comments and technical corrections.
The dataset DOI link works seamlessly and the reference to the discussion paper is provided on the dataset landing page. The authors could include a short instruction on how to cite the discussion/final paper as well as the dataset itself (consider some entries on the Pangaea repository (https://www.pangaea.de/) for nice examples). The few randomly selected datasets downloaded and opened in two different GIS programs without any problems. The dataset names correspond to the descriptions in the discussion paper. However, on the Figshare page I was not able to locate the monthly zip files and the “readme” document (with file names, descriptions and summary statistics) that the authors describe in the “Data availability” section. The authors should upload these files to Figshare or modify the manuscript.
Specific comments
Title: For me personally the second part of the title (“indicate greater spatial variation than global grids within an urban mosaic”) is a bit redundant, as it is common knowledge that a finer resolution documents variation much better than a coarser resolution. In my opinion, the first part of the title perfectly describes the authors' contribution and is adequate on its own. That said, I do not insist on changing the title and just provide my opinion.
P1 L2: Maybe “including” would be a more appropriate word than “particularly” in this context.
P4 L8-12: Here you basically reiterate the motivation you already explained in P3 L9-15. I suggest you remove the part on P4 or include some of the text from P4 in P3.
P6 L17-19 and Table 2: The abbreviations of the variables from Table 2 should be specified in the manuscript (for example: “maximum temperature (tmax)”) and/or in the table caption.
P6 L22-23: “When necessary, each predictor was statistically transformed to approach a normal distribution.” What criteria did you use to determine whether it was necessary to transform a predictor?
P6 L29: The “AIC” abbreviation is not explained.
P7 L25: Maybe provide a reference for the standard equation?
Section 4.2: I have no experience in modeling climate interpolation, so I will not comment on the technical aspects and used variables, which, nevertheless, seem to be sufficiently described in Section 3.2. However, I do have some problems with understanding the climate interpolation modeling results. You describe monthly results of the climate variables, but I do not understand if this means monthly averages for a period of approx. 20 years (e.g. all the Januaries between 1998 and 2017) or monthly averages for every year (e.g. the average of a variable for January 1998). I think you refer to the first case, but in order to make the manuscript clearer, you should emphasize the considered period in parts of the manuscript and in the figures showing the results.
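To make the two readings of “monthly” concrete, here is a minimal sketch. The temperature values, the three-year span and the function names are hypothetical and chosen only for illustration; they are not taken from the dataset or the manuscript.

```python
from statistics import mean

# Hypothetical monthly mean temperatures (deg C), keyed by (year, month).
# These numbers are invented for illustration; they are NOT from the dataset.
monthly_values = {
    (1998, 1): 15.0,
    (1999, 1): 16.0,
    (2000, 1): 17.0,
    (1998, 7): 28.0,
    (1999, 7): 29.0,
    (2000, 7): 30.0,
}


def monthly_climatology(values):
    """Reading 1: one value per calendar month, averaged over all years."""
    by_month = {}
    for (_year, month), value in values.items():
        by_month.setdefault(month, []).append(value)
    return {month: mean(vals) for month, vals in by_month.items()}


def per_year_monthly(values):
    """Reading 2: one value per (year, month) pair, no averaging across years."""
    return dict(values)


climatology = monthly_climatology(monthly_values)
per_year = per_year_monthly(monthly_values)

# Reading 1 yields one layer per calendar month; reading 2 one per month and year.
print(len(climatology), "layers under reading 1")
print(len(per_year), "layers under reading 2")
```

Under the first reading a full product would contain 12 layers per variable; under the second it would contain 12 layers per variable for every year of the period, which is why the manuscript and figure captions should state the period explicitly.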
P8 L15: Why “Minimally”? Do you mean that 32,024 was the lowest used number of measurements for one of the variables? I suggest you rephrase the sentence, to make it clearer.
P8 L20: Why are only 3 of the 10 variables shown in Fig. 5?
P9 L29: Should you even refer to Fig. 1 in this part of the manuscript?
Section 4.2.4: When you compare your models with WorldClim 2, you do not specify whether both models cover the same (or at least a similar) time period from 1998 to 2017. Considering the changing climate, it is important to compare climates in the same time frame. If the models cover different time periods, you can still compare them, but you have to discuss the differences between them in light of the different time frames.
P11 L28-29: The sentence “Models projecting future climate scenarios would enable biodiversity change predictions, with additional variables like cloud cover and solar radiation useful.” is incomprehensible. Probably the comma and “useful” are remnants from a previous version of the manuscript?
References: As a reader I would prefer to have DOIs (where available) included in the reference list. However, I do not know if DOI inclusion is obligatory in Earth System Science Data.
Table 2: It should be clearly indicated in the table and/or in the table caption that SD is standard deviation. Additionally, what was the ratio in this table calculated from? From the description in the manuscript I suppose it is the ratio of the standard deviations of the two rasters; however, the calculations are off (for example, in the first line: 1/0.5=2 and not 1.9).
Table 2 Caption: The last sentence “Increased standard deviation ranges from 1.4x to 3.4x.” is not very clear, as it is not explained which standard deviation the authors have in mind. Additionally, standard deviation values in Table 2 are much larger than 3.4. If the authors refer to the ratio, they should first recalculate it (see previous comment).
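The check behind the two Table 2 comments can be sketched as follows. This is only a minimal sketch under my assumption that the ratio is the standard deviation of one raster divided by that of the other; the function name, the variable names and the tolerance are hypothetical, and only the values 1, 0.5 and 1.9 come from the first line of Table 2.

```python
def sd_ratio(sd_numerator, sd_denominator):
    """Ratio of two standard deviations (assumed definition of the Table 2 ratio)."""
    if sd_denominator == 0:
        raise ValueError("the standard deviation in the denominator must be non-zero")
    return sd_numerator / sd_denominator


# Values as printed in the first line of Table 2.
printed_sd_a = 1.0
printed_sd_b = 0.5
reported_ratio = 1.9

recomputed_ratio = sd_ratio(printed_sd_a, printed_sd_b)

# Hypothetical tolerance: half of the last printed decimal place.
tolerance = 0.05
is_consistent = abs(recomputed_ratio - reported_ratio) < tolerance

print("recomputed ratio:", recomputed_ratio)
print("consistent with reported ratio:", is_consistent)
```

Running the same check on every line of Table 2 would show whether the reported range of 1.4x to 3.4x still holds after recalculation.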
Figure S1: I really like this figure as it showcases the whole extent of the work the authors have done. In order to emphasize the several unprecedented datasets that the authors created, they could visually distinguish (by color, maybe) between the datasets from other sources and the datasets the authors created themselves (for example: Elevation (30 m) vs. all the Relative Elevations).
Technical corrections
P1 L2: “However, these data, …” instead of “However these data, …”
P3 L11: “… Kong in …” instead of “… Kong, in …”
P5 L18: “… the raster package …” should probably be “… the R raster package …”?
P5 L19: Probably “Secondly, …” instead of “Second, …”
P5 L19-20: The sentence “Second, water proximity … surrounding a given pixel” is a bit difficult to read and understand. As the authors continue in the manuscript by using the term radius, they could maybe write something along the lines of “… as the percent of land surface within a radius of a given pixel.”
P6 L3: Perhaps “… vegetated areas have a value of 0.” instead of “… vegetated areas are 0.”
P6 L8: Perhaps “… linear regressions …” instead of “… linear regression …”
P6 L32: Probably “Firstly, …” instead of “First, …”
P6 L32: Probably “Secondly, …” instead of “Second, …”
P8 L23: “6° C” instead of “6°C”
P8 L24: “> 900 m, < 18° C” instead of “>900 m, <18°C”; “> 24° C” instead of “>24°C”
P9 L4, L6, L10: “> 2500” instead of “>2500”; “< 1600” instead of “<1600”; “52 %” instead of “52%”
P9 L15, L16, L17: “15.5° C” instead of “15.5°C”; “19° C” instead of “19°C”; “90 %” instead of “90%”; “75 %” instead of “75%”
P9 L31: two times “2° C” instead of “2°C”
P10 L13: “… exception are the … forests at …” instead of “… exception is the … forests, at …”
P10 L20, L26: Probably “Firstly, …” instead of “First, …”; probably “Secondly, …” instead of “Second, …”; “… array of rasters …” instead of “… array rasters …”
P11 L4: “… presented data …” instead of “… data presented …”; probably “Firstly, …” instead of “First, …”
P11 L9, L10: Probably “Secondly, …” instead of “Second, …”; “For example …” instead of “Forexample …”
P12 L5: “… foreseeable …” instead of “… forseeable …”
P12 L10: “Morgan and Guénard …” instead of “Morgan Guénard …”
Fig. 7 caption: “average low temperature” is probably “average temperature”?