the Creative Commons Attribution 4.0 License.
OpenMesh: Wireless Signal Dataset for Opportunistic Urban Weather Sensing in New York City
Abstract. We introduce OpenMesh, a publicly available dataset of wireless signal measurements from a community-run communication network in New York City. While originally designed for affordable internet access, these links can be used opportunistically for high-resolution weather monitoring in dense urban areas, providing 1-minute sampling and dense spatial coverage. Spanning eight months of measurements (November 2023 to June 2024), the dataset comprises 103 directional links in Lower Manhattan and Brooklyn, operating in three primary frequency ranges: 5–6 GHz (C-band), 24 GHz (K-band), and 58–70 GHz (V-band)—part of the millimeter-wave (mmWave) spectrum.
Our analysis incorporates meteorological records from the study period, including precipitation from local weather stations, thereby enabling real-time analysis of signal-weather relationships and expanding in-city applications through opportunistic sensor networks. During the study period, diverse weather events—ranging from intense rainfall that caused link attenuations of up to 30 dB and occasional outages, to snowstorms in Winter 2024—demonstrated the network’s potential for broader meteorological sensing. Analyzing multi-band observations provides valuable insights into emerging 5G/6G challenges and uncovers new opportunities for urban environmental monitoring. The OpenMesh dataset is available at https://doi.org/10.5281/zenodo.15268340 (Jacoby et al., 2025). By publishing both the datasets and our preliminary analyses, we hope to encourage further research that leverages wireless networks in dense urban areas for real-time sensing.
Status: open (until 20 Sep 2025)
-
RC1: 'Comment on essd-2025-238', Anonymous Referee #1, 12 Aug 2025
-
AC1: 'Reply on RC1', Dror Jacoby, 08 Sep 2025
Dear Referee #1,
We appreciate your detailed assessment and the issues you identified. Thank you for reviewing our manuscript and for your encouraging remarks on its quality and importance. We are confident that the planned revisions will improve the quality of both the manuscript and the data package. All changes will be clearly marked in the revised manuscript, together with an updated and enriched data package and repository.
Based on your review and suggestions, we will make two main improvements:
Dataset. The OpenMesh data package will cover all data used in the manuscript—including Weather Underground PWS and NOAA station records—to ensure reproducibility. As suggested, we will add hourly-resolution datasets from nearby stations. We will release a new Zenodo version that includes microwave-link measurements, PWS, and NOAA station data, with documentation and reproducibility scripts available in our Git repository.
Manuscript. The revised version will focus on the dataset description and limit content to elementary analyses derived directly from the dataset, in line with ESSD’s scope.
Please find below detailed responses to all comments.
(MC labels map to the referee’s numbering.)
MC. Dataset and Repo Updates
MC1 — Add reference datasets (PWS/NOAA) for reproducibility.
Response: We will publish a new Zenodo version that consolidates all data used in the manuscript—microwave-link measurements, Weather Underground PWS, and NOAA station records—together with metadata, a README, and reproducibility scripts in our GitHub repository to ensure full reproducibility.
MC1a — Include all PWS and NOAA actually used.
Response: Included as part of MC1: the Zenodo package will contain the NOAA weather-station data and the Weather Underground PWS records used in the manuscript.
MC1b — Add hourly NOAA station data.
Response: We rechecked availability and, as suggested, confirmed hourly NOAA data from nearby stations; these will be added to the updated Zenodo archive, with sources and coverage documented in the README and workflow for reproducibility.
MC1c — Consider adding MRMS radar products.
Response: We appreciate this suggestion. At this stage, we decided not to include MRMS radar data in the present Zenodo release. Incorporating and aligning radar products is beyond the current scope, as we focus here on near-ground sensors. We agree it is promising and will consider it for a future extension.
MC1d — Add additional QC for PWS data.
Response: We will add additional quality-control routines for PWS data, with documentation and examples in the GitHub workflow.
MC2. Microwave-link dataset
MC2a — Explain why only a subset of links is released.
Response: Since the published microwave dataset already provides dense sensor coverage (wireless links and now also PWS) within a relatively small urban area, we release only the data that remain available and usable after filtering and checks, to avoid redundancy. In the revised manuscript, we will explain the subset selection from the NYC Mesh network and why certain links are not included: (i) many 5 GHz links are extremely short and in some cases partly indoors, so they are only marginally affected by weather due to both short path length and low carrier frequency (see Fig. 2); (ii) several links suffer from large data gaps and reliability issues, which makes their inclusion problematic. We note that even among the ~100 released links there remain real-world challenges that are valuable for testing methods. These reasons will be stated explicitly so that the subset selection is transparent and well-justified.
MC2b — Pair sublinks in NetCDF per OpenSense convention.
Response: We will restructure the NetCDF file so that, where available, sublinks are grouped into bidirectional pairs using a cml_id identifier, consistent with OpenSense conventions.
MC3. Content and structure
MC3 — Streamline content to fit ESSD data-paper scope.
Response: We will focus on dataset description and reduce extended analyses, aligning the structure with ESSD expectations.
MC3a — Improve data overview (maps, accumulations, official gauge).
Response: Following the addition of PWS and NOAA stations to the dataset, we will include a more detailed description of both metadata and raw data across all released datasets. Specifically, we will (i) replace or add a full-domain map showing all sensors and include zoomed insets of the urban core for specific areas to provide both an overview and detailed representation; and (ii) provide cumulative rainfall plots from PWS alongside official NOAA station data for comparison.
MC3b–c — Sections 3 & 4 are long/overlapping; clarify and merge.
Response: We will merge Sections 3 and 4 into a single, more concise section. Its purpose will be stated clearly at the outset, with a brief background and references to standard methods. Only essential content needed to describe the dataset will remain. The merged section will focus on the main analysis and highlight the novel dataset results (MC3d). Detailed workflows and code will be moved to the supplementary materials and GitHub.
MC3d — Emphasize novel results; reduce less-novel parts.
Response: We will highlight the novel results—particularly the multi-band case (Figure 8) and the snowfall events (Figure 10)—and reduce the length and emphasis of less-novel analyses. The text will be streamlined in line with ESSD expectations.
SC. Specific comments — general notes from the referee
SC0 — Prioritize restructuring over line-by-line fixes.
Response: Structural changes outlined in MC3—merging §§3–4 and streamlining content to fit ESSD’s data-paper scope—will take precedence. Many specific notes will be resolved by this restructuring; any remaining items will be reviewed and addressed individually afterward.
SC-L70+ — Language, precision, and objectivity.
Response: A full language/clarity pass is planned: wording will be tightened, phrasing made more objective, terminology unified, and redundancies removed. Later sections will be shortened to include only content directly relevant to the published dataset.
SC. Specific comments — line items (future tense, minimal edits)
SC-L11 — Comment: “emerging 5G/6G challenges” phrasing.
Response: This sentence will be rephrased for clarity. “Emerging 5G/6G challenges and opportunities” refer to multi-band trade-offs: stronger weather attenuation at higher frequencies versus the greater robustness of lower frequencies for communication.
SC-L18 — Comment: “satellites” vs. satellite microwave links.
Response: The text will clarify that the reference is to satellite microwave links (SML) used opportunistically, not to dedicated weather satellites.
SC-L19 — Comment: References not fitting.
Response: References will be revised to works that directly connect accurate meteorological observations with applications (e.g., water management, agriculture, urban planning), as suggested.
SC-L22 — Comment: “spatial detail” is incorrect; use “spatial representativeness.”
Response: Agreed — changed to “spatial representativeness.”
SC-L27–31 — Comment: Sentence is hard to read; link between OS and AI forecasting is unclear; “diverse datasets” claim is not accurate.
Response: The intention is forward-looking: while current AI models are trained mainly on reanalysis or gridded radar data with homogeneous inputs, opportunistic sensing can provide fine-scale, heterogeneous observations. The sentence will be rephrased to make this argument clear and avoid confusion.
SC-L32 — Comment: No need to differentiate “microwave” vs “mmWave”; simplify terminology.
Response: Agreed — terminology will be simplified.
SC-L33 — Comment: Not all mmWave frequencies behave similarly; better state the bandwidth–attenuation trade-off (e.g., oxygen absorption near 60 GHz).
Response: We will state the frequency-dependent trade-offs (bandwidth vs. attenuation). The dataset includes V-band links (58–70 GHz, incl. ~60 GHz) where oxygen absorption is relevant; related text will be reorganized for clarity.
SC-L49 — Comment: “NGN” is unnecessary/unclear; either explain or remove.
Response: The “NGN” abbreviation will be removed; we will describe the NYC Mesh architecture directly to avoid jargon.
SC-L50 — Comment: Define “co-temporal”: same period and/or same temporal resolution?
Response: We will rephrase for clarity: the datasets are time-aligned, covering the same observation period, with native resolutions (1–30 min) resampled during preprocessing for a unified structure.
SC-L51 — Comment: “Fine-grained” is not the right term; ESSD is not for in-depth analysis.
Response: “Fine-grained” was meant as high-resolution; we will replace it with clearer wording and keep the focus on the dataset description.
SC-L57 — Comment: “High-resolution mapping” is not done in §4.
Response: Agreed — the phrase “high-resolution mapping” will be removed from Section 4.
SC-Fig 1 — Comment: (a)/(b) labels should be smaller; figure edits look rough.
Response: Fig. 1 labels (a/b) will be reduced and formatting improved.
SC-L75 — Comment: Earlier only “backhaul” is mentioned; now “x-haul” appears without roles; is this needed?
Response: X-haul is mentioned only for context; in our dataset the links function as backhaul, which will be clarified.
SC-L79 — Comment: With your mmWave definition, “high mmWave” is misleading; focus on actual covered bands.
Response: Text will be revised to emphasize the specific frequency bands included in the dataset.
SC-L80–L84 — Comment: “Small cells” is unclear since the data are not from a cellular network; why is this mentioned?
Response: “Small cells” is used only as broader context in this sentence; we will clarify that the dataset is from a community mesh (not a cellular RAN) and frame the analogy as conceptual.
SC-L107 — Comment: Clarify maturity across variables: rainfall established; humidity limited (E-band potential); wind only proposed; air quality indirect (David 2016); avoid implying parity.
Response: Agreed. This part will be revised to treat rainfall sensing separately as the most established application. Other phenomena will be explicitly qualified by maturity: humidity—limited, with some potential at E-band; wind—exploratory only; air quality—indirect (David 2016). Relevant references (e.g., Rubin et al., 2022; David, 2016) will be added, and wording will be adjusted to avoid suggesting parity with rainfall estimation.
SC-L113 — Comment: Text sounds like OPENSENSE (COST Action) rather than GMDI; update per Fencl et al. 2025.
Response: Clarified — the sentence will be rewritten for precision.
SC-L121 — Comment: Unclear sentence; “With …” may need removal.
Response: Corrected — rephrased for clarity.
SC-L122–125 — Comment: Tone reads like an advertisement; is this truly a mesh? Topology description is unclear; be objective.
Response: The description will be revised to be neutral and precise. We will detail the NYC Mesh and focus on network topology and remove promotional phrasing (e.g., “fault-tolerant web,” “routing through multiple paths”) unless supported.
SC-L156 — Comment: Role of OSPF unclear; UISP not introduced; assess relevance.
Response: Noted — wording will be updated for clarity.
SC-L130 — Comment: Mesh structure not evident on map; most endpoints appear single-hop to hubs; be precise/neutral or show a map based on released data.
Response: This section will be revised to be neutral and technical, avoiding promotional language. The dataset is derived from NYC Mesh backhaul links with a hybrid topology: most endpoints connect in a star-like manner to hubs, and a number of hubs/supernodes are interconnected in a partial mesh backbone. As a result, the released dataset does not represent a uniform mesh across the city. The NYC Mesh design proposal (https://wiki.nycmesh.net/books/5-networking/page/mesh-design-proposal) explicitly discusses this trade-off between star-like and mesh-like structures.
SC-L126 — Comment: Unclear sentence and relation to the dataset.
Response: Sentence will be clarified and explicitly linked to the dataset.
SC-L128–L129 — Comment: Incomplete sentence.
Response: Sentence will be completed and tightened.
SC-L160 — Comment: Should read “… that runs daily:”.
Response: Wording will be corrected.
SC-L170+ — Comment: Why only 103 sublinks? What about others shown in Fig. 2?
Response: The released set is a curated subset of 103 sublinks (see MC2a): (i) many 5 GHz paths are very short/partly indoors and only marginally weather-sensitive; (ii) several links have large gaps/unreliable logging; (iii) archival during the study period is inconsistent. Fig. 2 will clearly highlight the released subset.
SC-L171 — Comment: You also provide TSL, not only RSL.
Response: Dataset provides RSL only; any TSL fields will be removed or marked as missing and noted in the README.
SC-L185 — Comment: “Official WS often follow …” — be precise; paragraph is vague; you mostly use PWS.
Response: Revised: official WS follow standardized WMO/ASOS protocols and serve as references; PWS provide dense local coverage in NYC.
SC-L195 — Comment: Add brackets around el Hachem et al. 2024.
Response: Typo will be fixed.
SC-L198 — Comment: Meaning of “responsiveness” unclear.
Response: Will be defined as reporting at short intervals in near real time; phrasing updated.
SC-L202 — Comment: “Enabling high-resolution analysis …” sounds promotional; are records long/reliable enough?
Response: Clarified: WUnderground offers dense, time-stamped PWS records useful for event-scale studies; reliability and long-term trends depend on QC and record length; claims softened and references added.
SC-L209 — Comment: Collector area must be in area units; “20 mm” unclear.
Response: Corrected: 20 mm was a diameter; converted to proper area units (cm²) consistent with tipping-bucket specs.
SC-L210 — Comment: Bad sentence; comma/“and” placement.
Response: Sentence will be rephrased as suggested.
SC-L224 + SC-L229 — Comments: Are airport stations official? Does WU include both PWS and official data? Why not use NOAA directly? Daily data vs. ASOS devices; why not higher-resolution NOAA portal data?
Response: In the preprint, focus was on PWS for spatial alignment with links, and airport series were accessed via WUnderground. The revision will clearly state that airport stations are official ASOS sites (NOAA/FAA/DOD); source and cite official WS data directly from NOAA/ASOS at hourly (and higher, where available) resolution; separate PWS (WU) from official WS (NOAA/ASOS) in the text, README, and metadata; and update figures/tables and provenance so sources and attribution are unambiguous.
SC-L236 — Comment: Show a map of all links and all other data in the open dataset (plus zoom if needed).
Response: Because coverage is not spatially continuous (Brooklyn core vs. a few Harlem links), a single city-scale map would hide short links. A full-extent overview with zoomed panels will be included so the entire released dataset (links, PWS, NOAA stations) is clearly visible and labeled.
SC-L239 — Comment: “We focus …” — is analysis a subset, or is the released data this subset?
Response: Analysis uses the entire published dataset (103 sublinks), which is a curated subset of the larger NYC Mesh network.
SC-L250 — Comment: PWS at 15 min; how resampled to 5 min? Why not keep original timestamps?
Response: Sensors operate at different, unsynchronized steps (e.g., PWS are user-maintained and user-reported, while links report at 56/64 s), so series were resampled to a unified grid via simple linear interpolation for analysis-ready use. As suggested, WU datasets with native timestamps will also be published, with preprocessing scripts, so users can choose between original and analysis-ready versions.
SC — Section 3 and subsections
Comment: Purpose/flow unclear; Fig. 4 details first, then §3.1 methods (k–R); §3.1–3.2 look like methods; see main comments.
Response: Agreed — Section 3 will be reorganized for a clearer flow between data, methods, and results. It will present dataset-driven analysis of NYC Mesh links.
Comment: Equation 3 — clarify Le vs geometric length and why it matters.
Response: We emphasize that Le denotes the effective path length in the k–R relation. While it is often approximated as the geometric distance L, Le may be treated differently in other formulations—both for long-link models (e.g., Silva-type) and short-link correction models. This matters for datasets with diverse link lengths such as OpenMesh.
https://digital-library.theiet.org/doi/abs/10.1049/ic.2007.0872
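To make the role of Le concrete, the following is a minimal sketch of inverting a power-law k–R relation for the path-averaged rain rate. It assumes the common form k = a·R^b (specific attenuation in dB/km) with total rain-induced attenuation A = k·Le; the symbols a and b, the function name, and the numeric values are illustrative assumptions, not the manuscript's Equation 3 or calibrated coefficients.

```python
def rain_rate_from_attenuation(attenuation_db, length_km, a, b,
                               effective_length_km=None):
    """Invert A = a * R**b * Le for the path-averaged rain rate R (mm/h).

    attenuation_db      -- rain-induced attenuation after baseline removal (dB)
    length_km           -- geometric link length L (km)
    a, b                -- power-law coefficients (placeholders here; in
                           practice they depend on frequency and polarization)
    effective_length_km -- effective path length Le; defaults to L, the
                           usual approximation mentioned in the response
    """
    le = length_km if effective_length_km is None else effective_length_km
    if le <= 0:
        raise ValueError("path length must be positive")
    if attenuation_db <= 0:
        return 0.0  # no excess attenuation, so no rain is inferred
    return (attenuation_db / (a * le)) ** (1.0 / b)


# Hypothetical values: the same 2 dB of attenuation on a 2 km link.
r_geometric = rain_rate_from_attenuation(2.0, 2.0, a=0.1, b=1.0)
r_effective = rain_rate_from_attenuation(2.0, 2.0, a=0.1, b=1.0,
                                         effective_length_km=1.0)
print(r_geometric, r_effective)
```

With these placeholder numbers, halving the effective path length doubles the retrieved rain rate, which is why the choice between L and Le can matter for a dataset that mixes very short and longer links.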
SC-L296 Comment: Is PWS data spatially interpolated, or is a mean used?
Response: No weighted interpolation was used; for each link, we take the mean of all PWS within a 3 km radius (as stated in the introduction to the analysis).
SC — Figure 6a
Comment: CCDF steps/quantization; avoid early binning; clarify $\Delta A_{15}$ axis scaling.
Response: PWS values are averaged within 3 km of each link to represent overall rainfall in the link area, not single-point measurements or spatial interpolation.
SC-L319
Comment: Citation format is wrong.
Response: Noted; the citation format will be updated.
SC-L321
Comment: Refers to Fig. 3a though dry conditions are unknown there.
Response: The intended reference is Fig. 6a; this will be corrected.
SC-L324
Comment: CCDFs are discussed as if events are temporally aligned; they are not.
Response: Will clarify that Fig. 6a compares independent CCDFs over the 8-month period (overall distributions, not temporally aligned events); sentences will be rephrased to separate PWS and link results.
SC-L327
Comment: Confusing: did PWS capture localized showers or not?
Response: Will be clarified: extremes may reflect real localized showers captured by dense PWS coverage or sensor errors. Corrected units (same line): these extremes refer to rain rates (>50 mm h⁻¹), not attenuation (>50 dB km⁻¹).
SC-L328
Comment: “Link collapse” suggests a mechanical failure.
Response: Rephrased to “link outage”.
SC — Figure 9
Comment: Which rain-rate data were used to group attenuation?
Response: Attenuation for each link was grouped using rain rates from nearby PWS (per L296 / §3.2); the description will state this.
SC-L344
Comment: Text says 15 min; axis shows 60 min.
Response: Correct aggregation is hourly (60 min); text will be corrected.
SC-L364
Comment: Should be Fig. 7c vs. Fig. 6c; 24 GHz mention unclear.
Response: Corrected to the intended figure reference; the 24 GHz mention will align with the figure content and surrounding text.
SC-L372
Comment: Outage gap not visible in time series—interpolated?
Response: The outage period was interpolated for plot continuity; we will mark this more clearly in the figure/caption.
SC-L380
Comment: If noting gauge limits for non-liquid precip, also note microwave-link limits; links may indicate phase but QPE for snow/graupel/sleet is challenging.
Response: Will be clarified accordingly: microwave-link QPE for non-liquid precipitation is challenging; links can indicate phase changes but are not necessarily superior to gauges.
Comment (L382): Identify devices at Central Park and the two other sites; are they dedicated snow-height sensors; how reliable/comparable are values?
Response: Details on the devices and their comparability will be provided within the published dataset description.
Comment (Fig. 10): Clarify “cyan” vs “blue,” align axes, consider plotting all PWS series, and state PWS count and distances.
Response: The lighter blue will be labeled as cyan and shading clarified in the captions; axes will be aligned and a common y-axis considered; precipitation is averaged across PWS from the Brooklyn area to show general patterns, with snow measured at airports. This will be clarified in text and captions.
Comment (L401): Daily NOAA data not shown; include higher-resolution NOAA data.
Response: Correct—these are daily totals summarized in a table; the text will note this explicitly, and higher-resolution NOAA datasets will be included as suggested.
Comment (L404): The 39 mm likely refers to snow depth, not rainfall.
Response: Yes—39 mm refers to daily snowfall depth reported by the NOAA station; this will be stated explicitly.
Comment (L411): Which sensors provide precipitation codes; why not show main classes in Fig. 10; if codes not used, omit.
Response: Codes come from ASOS airport stations (JFK, LGA); the text will be simplified to relevant examples and detailed codes removed since they are not shown in Fig. 10.
Comment (L417): Redundant phrasing on dry vs wet snow; rephrase.
Response: Wording will be rephrased concisely to avoid repetition.
Comment (L426): “Detect … effectively” overstates; be modest and define “effective.”
Response: Language will be moderated: links will be described as indicating frozen precipitation and highlighting transitions, without implying quantitative estimation or superiority.
Comment (L432): “the applicability OS of rainfall” unclear.
Response: The sentence will be corrected for clarity.
Comment (L434–L435): “classify and detect … solely from signal strength” sounds novel; rephrase.
Response: Wording will be softened to avoid implying novelty.
Comment (Fig. 11): What rain classes are used for PWS and why not show rain rates?
Response: This figure will remain a capability demo using a known algorithm; the rain-intensity classes (as in the CCDF plot) will be listed, and exact values can be provided.
Comment (Section 4 – general): Too much in-depth processing for a data paper; simplify and focus on dataset potential.
Response: Section 4 will be simplified to a minimal demonstration with references to standard methods; short-case metrics (e.g., NRMSE) will be removed; emphasis will shift to dataset potential and future analyses.
Comment (L471): “everyday internet backhaul” is odd; use “standard internet access.”
Response: Wording will be revised to “standard internet access.”
Comment (L474): “baseline calibration/broader weather patterns” unclear; how do low-freq links help?
Response: Wording will be clarified: low-frequency links will be presented as stable references in non-rain periods and for network-wide event indication; ambiguous terms will be removed.
Comment (L474): What are the “additional attenuation factors”?
Response: These will be clarified as non-rain contributors (e.g., oxygen absorption).
Comment (L474–L457): “critical consideration for hydrological sensing” unclear.
Response: The phrase will be removed; we will state plainly that strong oxygen absorption at ~60 GHz limits usable link length and constrains rainfall estimation, and exemplify this in the data.
Comment (L478–L480): Thought 60 GHz issue is short path-length; clarify.
Response: The text will clarify that both oxygen absorption and short path-lengths limit 60 GHz rainfall estimation in this dataset.
Comment (L484): State how many dual-band links are in the dataset.
Response: The count of dual-band links (60/70 GHz paired with 5 GHz backup) will be specified in the dataset description and Zenodo record and highlighted explicitly.
Comment (L510): What is the “opportunistic-hydrology” codebase; link points to OpenSense group.
Response: It will be clarified that the OpenMesh NetCDF is interoperable with repositories under the OpenSenseAction GitHub organization and follows the OpenSenseAction data-format conventions; specific repositories used will be cited.
Citation: https://doi.org/10.5194/essd-2025-238-AC1
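Two preprocessing steps described in the reply above lend themselves to a short illustration: resampling an irregular series onto a unified time grid by simple linear interpolation (SC-L250), and taking the unweighted mean of all PWS within 3 km of a link (SC-L296). The sketch below uses plain Python under several assumptions: the function names are hypothetical, a single reference point per link (for example its midpoint) stands in for "the link", timestamps are strictly increasing, and the sample numbers are invented. It is not the authors' preprocessing script.

```python
import math
from bisect import bisect_right

EARTH_RADIUS_KM = 6371.0  # mean Earth radius used by the haversine formula


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between two points given in degrees."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    h = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(h))


def interp_to_grid(times, values, grid):
    """Linearly interpolate (times, values) onto grid; None outside the record.

    times must be strictly increasing; all times share one unit (e.g. seconds).
    """
    out = []
    for t in grid:
        if t < times[0] or t > times[-1]:
            out.append(None)  # no extrapolation beyond the observed period
            continue
        i = bisect_right(times, t)
        if i == len(times):  # t coincides with the last observation
            out.append(values[-1])
            continue
        t0, t1 = times[i - 1], times[i]
        w = (t - t0) / (t1 - t0)
        out.append(values[i - 1] + w * (values[i] - values[i - 1]))
    return out


def mean_pws_near_link(link_lat, link_lon, stations, radius_km=3.0):
    """Unweighted mean rain value over stations within radius_km of the link.

    stations is a list of (lat, lon, rain_value); returns None if none qualify.
    """
    nearby = [rain for lat, lon, rain in stations
              if haversine_km(link_lat, link_lon, lat, lon) <= radius_km]
    return sum(nearby) / len(nearby) if nearby else None


# Invented example: a 15-min series (seconds) resampled to a 5-min grid.
resampled = interp_to_grid([0, 900, 1800], [0.0, 3.0, 0.0], list(range(0, 1801, 300)))

# Invented example: three stations roughly 1.1, 2.2 and 5.6 km north of a link.
stations = [(40.71, -73.95, 2.0), (40.72, -73.95, 4.0), (40.75, -73.95, 100.0)]
area_mean = mean_pws_near_link(40.70, -73.95, stations)
print(resampled, area_mean)
```

Returning None outside the observed period keeps gaps visible instead of extrapolating, and the radius filter excludes the distant station from the mean; both are design choices of this sketch rather than documented behavior of the dataset workflow.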
AC2: 'Reply on AC1', Dror Jacoby, 09 Sep 2025
Publisher’s note: the content of this comment was removed on 10 September 2025 since the comment was posted by mistake.
Citation: https://doi.org/10.5194/essd-2025-238-AC2
-
RC2: 'Comment on essd-2025-238', Anonymous Referee #2, 29 Aug 2025
Reviewer #1 already provided a thorough assessment of the paper and identified several important areas for improvement, with which I fully agree. To complement their review and avoid redundancy, I focused on other aspects of the paper, particularly the data. In doing so, I identified a number of issues that should be addressed during revision.
- Data file OpenMesh.nc
- variable rsl appears to contain only two digit integer values. Yet, it is encoded as a float. A signed 8-bit integer would save a lot of storage space.
- variable tsl does not have a long name nor a units attribute. Also, all tsl values appear to be empty/missing. Why include them at all?
- String polarization: all links appear to have a “Vertical” polarization. In other words: this variable is constant. Not a big deal, since there aren’t that many links. But the fact that all links use vertical polarization could also be specified in the global header.
- The use of double precision floats for latitude, longitude and length is not ideal. A standard float type would be enough to encode these variables.
- Similarly, the use of a 64-bit integer for storing time makes little sense. Time is expressed in minutes and ranges from 0 to 354240. A regular 32-bit integer would work fine.
- Links_Metadata.csv:
- why is there a rsl column in this metadata file? The README does not mention this variable. Therefore, it’s not clear what it is supposed to represent. I assume it’s the median RSL level mentioned in the paper.
- Figure 4: Please specify which link (sublink number) is shown. Moreover, in the top panel, what do you mean by “Wireless signal strength”. That’s just the received signal level right?
- Figure 5: The caption of this figure should be “Histogram of median received signal level for the 103 sublinks during the measurement period.” (or similar). Overall, I do not find this figure to be very useful since the median RSL is not actually used as a dry-weather baseline. It would be more useful to know the range of variation (min-max) of the dynamically estimated baseline.
- Figure 6: The x-axis does not show the fraction but the percentage.
- Figure 7: Path-averaged specific attenuation (specific is missing).
- Figure 8: the left map does not have coordinates and/or a scale associated to it.
- Figure 11: In the caption, please explain what the rain classes represent and/or refer to the corresponding section in the paper.
- Page 6, lines 127-129: “In addition to operating public Wi-Fi hotspots, NYC Mesh maintains an open https://wiki.nycmesh.net/ (Wiki) and a https://github.com/nycmeshnet/nycmesh.net (GitHub repository), enabling collaborative updates and service enhancements.”
- Please create a proper reference for the Wiki and the GitHub repository and cite it like you would cite any other source. Do not put URLs in the text directly.
- Proper referencing of the GitHub repository. My suggestion would be to create a persistent identifier (DOI) for the current release of the GitHub repository using Zenodo. Since the current GitHub repository does not have a specific release version, people won’t be able to easily access the current codes and replicate results (assuming that the codes will change over time). For more information on how to create a release version of a repository and associate a DOI to it, see: https://docs.github.com/en/repositories/archiving-a-github-repository/referencing-and-citing-content
Citation: https://doi.org/10.5194/essd-2025-238-RC2 -
AC3: 'Reply on RC2', Dror Jacoby, 09 Sep 2025
Dear Referee #2,
We thank you for your constructive review and for highlighting key improvements in our work for ESSD. Your comments will guide refinements in both the manuscript and the published dataset, ensuring clarity and long-term usability.
Please find below detailed responses to all comments.
Comment (Data file OpenMesh.nc): Issues noted with variable encoding.
Response: We follow the OpenSenseAction/netCDF_CML conventions (https://github.com/OpenSenseAction/OS_data_format_conventions/blob/main/netCDF_CML.adoc) to ensure full compliance with community standards for variable types, attributes, and metadata. While many of the reviewer’s points are valid in a general data-formatting context, strict adherence to these conventions guarantees interoperability with the broader netCDF ecosystem. Based on this framework, we address the comments point by point below.
Specific Comment (Data file OpenMesh.nc): RSL stored as float though values are integers, TSL missing attributes and unused, polarization constant, coordinates stored as double precision, and time stored as 64-bit int unnecessarily.
Response: RSL values appear integer-like since the device resolution is 1 dBm. Per the OpenSense CML conventions, RSL/TSL must be stored as float/double in dBm units (see the rsl (cml_id, sublink_id, time) convention), and we use float32 to preserve sub-dB precision while remaining compliant. TSL: As noted in the manuscript, only RSL is provided, since TSL is mostly constant in this dataset; the convention states that when RSL or TSL is constant, only the variable that changes is required. Therefore, to avoid confusion, the TSL entry will be removed from the NetCDF file, and we will clarify this in the manuscript. Polarization is defined per sublink as a string, and we will retain this variable for compliance while also noting in the global attributes that all sublinks are vertical. Latitude, longitude, and length will be standardized to single-precision float as suggested. Time. While the dataset would fit into int32 given its one-minute sampling resolution, the conventions require time to be stored in UTC seconds since 1970. This ensures consistency with other datasets that may use higher sampling rates and guarantees long-term usability. Therefore, we will continue to store time as int64.
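The encoding trade-offs discussed in this exchange can be checked with a few lines of arithmetic. The sketch below uses only the Python standard library and rests on stated assumptions: the time origin is a hypothetical placeholder (the actual origin of the file is not given here), the size estimate assumes one RSL value per sublink per minute with no gaps and no netCDF compression, and the helper names are illustrative.

```python
from datetime import datetime, timezone

INT8_RANGE = (-2 ** 7, 2 ** 7 - 1)
INT32_RANGE = (-2 ** 31, 2 ** 31 - 1)


def fits(value, bounds):
    """Return True if an integer value is representable within bounds."""
    lo, hi = bounds
    return lo <= value <= hi


def minutes_to_epoch_seconds(minute_offset, origin):
    """Convert minutes since a timezone-aware UTC origin to seconds since 1970-01-01."""
    return int(origin.timestamp()) + 60 * minute_offset


def uncompressed_bytes(n_sublinks, n_times, bytes_per_value):
    """Raw array size for a (sublink, time) variable, ignoring compression."""
    return n_sublinks * n_times * bytes_per_value


# Hypothetical origin, for illustration only.
HYPOTHETICAL_ORIGIN = datetime(2023, 11, 1, tzinfo=timezone.utc)

last_minute = 354240  # upper end of the time range quoted by the referee
last_epoch = minutes_to_epoch_seconds(last_minute, HYPOTHETICAL_ORIGIN)

print("minute offsets fit int32:", fits(last_minute, INT32_RANGE))
print("two-digit dBm values fit int8:", fits(-99, INT8_RANGE))
print("epoch seconds for this period fit int32:", fits(last_epoch, INT32_RANGE))
print("rsl as float32 (bytes):", uncompressed_bytes(103, last_minute + 1, 4))
print("rsl as int8 (bytes):", uncompressed_bytes(103, last_minute + 1, 1))
```

Both the minute offsets and present-day epoch seconds fit in 32 bits (signed 32-bit epoch seconds overflow in January 2038), so the case for int64 time and float RSL here is conformance with the convention and future-proofing rather than numeric range.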
Comment (Links_Metadata.csv): The unexplained RSL column is undocumented in the README.
Response: The RSL column reflects expected values provided by NYC Mesh for each sublink, but it will be removed to ensure consistency with the README and conventions and to avoid redundancy.
Comment (Figure 4): Sublink number should be specified and “wireless signal strength” clarified as RSL.
Response: We will specify the exact sublink shown and clarify that “wireless signal strength” refers to RSL.
Comment (Figure 5): Caption should be clearer; median RSL is not very informative compared to baseline variation (min–max).
Response: We agree and will revise the figure to show baseline variation (min–max), while retaining the median RSL, which is used as a dry-period baseline for each sublink.
Comment (Figure 6): X-axis shows percentages, not fractions.
Response: Correct: the x-axis shows percentages. We will update the label for clarity.
Comment (Figure 7): “Specific” missing in the caption.
Response: Will be rephrased for clarity.
Comment (Figure 8): Left map lacks coordinates/scale.
Response: As suggested, we will add coordinates and a scale bar to the left map to improve clarity.
Comment (Figure 11): Rain classes in the caption should be explained or referenced.
Response: We will add a brief explanation of the rain classes in the caption and reference their definitions (WMO, 2018).
Comment (Page 6, L127–129): Inline URLs should be replaced with proper references.
Response: Proper references for the Wiki and GitHub will be created and cited formally, with URLs removed from the main text.
Comment (Repository referencing): GitHub repository should be archived with a Zenodo DOI for reproducibility.
Response: A Zenodo DOI will be minted for the GitHub release to provide a persistent identifier and ensure reproducibility. This update will be reflected in both the dataset package and the manuscript.
Citation: https://doi.org/10.5194/essd-2025-238-AC3
Data sets
OpenMesh Dataset Dror Jacoby https://zenodo.org/records/15268341
Model code and software
read_OpenMesh_nc.py Dror Jacoby https://zenodo.org/records/15268341
# Summary
The authors present OpenMesh, a dataset of signal level records from a network of microwave links in New York. This dataset is interesting and relevant because there is still only a small number of open datasets from microwave links available and the OpenMesh dataset is complementing these because of its urban setting and the large range of frequencies at which the microwave links are operated.
Based on a first quick look at the manuscript, I was excited to learn more about the details of the OpenMesh dataset with its urban setting, a mixture of low-frequency and high-frequency microwave links and the figure that shows snow events. However, I found the paper hard to follow because it does not stick to describing the published dataset and often expands into lengthy method descriptions, analysis and interpretation of the data. Given the data-focused scope of ESSD, the authors naturally do the analyses only in a preliminary manner so that conclusions are not final. Some limited analyses like this would be fine to show the value of the data, but, in my opinion, the weight of these analyses is too large in the current manuscript. I base my assessment on the explanation of the content and scope of a „data description papers“ for ESSD which says
„Although examples of data outcomes may prove necessary to demonstrate data quality, extensive interpretations of data – i.e. detailed analysis as an author might report in a research article – remain outside the scope of this data journal“ from https://www.earth-system-science-data.net/about/manuscript_types.html.
In addition, many analyses are based on data that is not part of the published OpenMesh dataset, which severely limits reproducibility and makes these analyses fit even less for an ESSD paper. The writing in the paper is also of mixed quality and the structure should be improved, which could be achieved by focusing more on describing the dataset and having analysis and/or application only as a minor addition.
In conclusion, I find the dataset to be potentially very valuable and its description would fit well as a data paper for ESSD. However, given my arguments above, which I will lay out in more detail as main comments below, I suggest a substantial major revision of the manuscript and an associated extension of the published OpenMesh dataset.
# Main comments
1. Lack of reference and additional data in the OpenMesh dataset: A large portion of the manuscript describes or uses data (PWS, other WUnderground data, NOAA data) that is not part of the published OpenMesh dataset available on Zenodo. Based on the description in the manuscript, the WUnderground and NOAA data are openly available. Hence, I strongly suggest adding these to the OpenMesh data on Zenodo. Not directly providing these datasets will introduce a significant hurdle in reproducing what is shown in the manuscript, and it will additionally result in an unclear situation regarding reference and/or comparison data for future work with the OpenMesh data. Extension of the dataset is thus strongly recommended (and partially mandatory, see below) to allow successful usage similar to the existing OpenMRG and OpenRainER datasets (which additionally provide station and radar data). In the following I will give some suggestions ordered by decreasing priority:
1a. (Mandatory) Add all PWS data from WUnderground and all NOAA data that is used in the manuscript.
1b. (Strongly recommended) Check again for availability of high-resolution NOAA station data (see my specific comment) and add it.
1c. (Strongly recommended to increase impact of dataset) Add MRMS data, radar-only (5-minute) and/or radar-gauge-adjusted (1-hour) data that is freely available, see https://www.nssl.noaa.gov/projects/mrms/ or https://registry.opendata.aws/noaa-mrms-pds/. This would also make it possible to investigate potential improvements of radar data with high-resolution microwave link data in urban regions, including for mixed or non-liquid precipitation, where radars struggle.
1d. (Useful, but not strictly required) Perform additional quality control of the PWS data with the existing tools mentioned in the article, and add the quality-controlled data as well. This would provide more reliable station data that can serve as a reference if the NOAA stations are too far from the microwave links to allow comparison at high temporal resolution.
2. Issues regarding microwave link dataset: While it is nice to have the microwave link data also as NetCDF, in addition to the CSV files for raw and metadata, there is one important question and one suggested improvement.
2a. Not all data that seems to be available (see Figure 2) has been released. Why did you only release a subset and how was this subset chosen? In general, even if there are more challenges in the unpublished data, maybe due to artefacts or noise, it would be beneficial to be able to do an analysis on the full real-world dataset to also develop methods that can cope with the additional challenges.
2b. The microwave link data in the NetCDF file is ordered by sublinks and is not grouped into pairs of sublinks as it is recommended by the OpenSense data format convention. This pairing would make it easier to analyse and process sublinks along the same paths. Since the data stems from a communication network it is very likely that there are only bidirectional connections, i.e. there should be at least two sublinks per path. If a separate low-frequency and high-frequency system is used they do not have to be combined (i.e. one hop having 4 sublinks) since they are typically different physical systems with different antennas. But for all bidirectional hops it would be convenient to have sublink pairs combined, e.g. by a cml_id, as it is often done with other microwave link data analysis.
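The pairing suggested in 2b can be sketched as follows. This is only an illustration: the sublink IDs, site names, and the `band_label` helper are hypothetical and not taken from OpenMesh.nc, and the band boundaries are rough cut-offs chosen to separate the three frequency ranges named in the abstract.

```python
from collections import defaultdict

# Hypothetical sublink metadata: (sublink_id, site_a, site_b, frequency_GHz).
sublinks = [
    ("s1", "hubA", "node1", 60.0),
    ("s2", "node1", "hubA", 60.0),
    ("s3", "hubA", "node2", 5.8),
    ("s4", "node2", "hubA", 5.8),
    ("s5", "hubA", "node1", 5.8),
]

def band_label(freq_ghz):
    """Coarse band label (illustrative thresholds) so that separate
    low- and high-frequency systems on the same path are not merged."""
    if freq_ghz < 10:
        return "C"
    if freq_ghz < 40:
        return "K"
    return "V"

def group_sublinks(sublinks):
    """Group sublinks that share the same endpoints and band under one cml_id."""
    groups = defaultdict(list)
    for sublink_id, site_a, site_b, freq in sublinks:
        path = tuple(sorted((site_a, site_b)))  # direction-independent path key
        groups[(path, band_label(freq))].append(sublink_id)
    # Assign sequential IDs in a deterministic (sorted-key) order.
    return {f"cml_{i}": ids for i, (_, ids) in enumerate(sorted(groups.items()))}

cmls = group_sublinks(sublinks)
print(cmls)
# {'cml_0': ['s5'], 'cml_1': ['s1', 's2'], 'cml_2': ['s3', 's4']}
```

Keeping the band in the grouping key reflects the remark that co-located low- and high-frequency systems need not be combined into one four-sublink hop.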
3. Content and structure need improvement: The content goes beyond what is recommended for an ESSD paper. Reading the paper it feels like the authors were aware of the fact that in-depth analysis does not belong to an ESSD paper but they still tried to fit in as many interesting aspects as possible. This is understandable, but since the analyses can only be done superficially in such a paper, this distracts from the main scope of the paper without providing reliable insights. Below I list some specific points regarding content and structure.
3a. Give a better data overview: Figure 3a seems to not show all links. This should be changed. The NOAA stations seem outside of the domain of Figure 3a. If so, provide a zoomed out map so that one gets an idea of the distance between sensors. Figure 4 gives a nice overview of the occurrence of rainfall events but an accumulation plot of rainfall data, ideally from all PWS, would give a better idea of total rainfall sums over the dataset period and of potential spatial differences within the study domain. It would also be important to have at least one rainfall accumulation from an official rain gauge to get a clear idea of how the PWS compare to the official gauge.
3b. Section 3 is very long and mixes methods and analysis. Section 3.1 does not seem important since there is no in-depth analysis of quantitative precipitation estimation from the microwave links signal done in the manuscript and there is also no need to do one. Maybe one problem of section 3 is also that it is unclear what to expect from it. The names of the subsections and subsubsections, and/or the whole structure, could be improved. In addition it would be good to clearly state at the beginning of section 3 what is going to be presented here.
3c. Section 4 just shows a common and well established standard workflow to do rainfall estimation from a microwave link attenuation time series. In its current form this is not required in the manuscript. It has already been shown that microwave links with frequencies above 60 GHz can be used for rainfall estimation, e.g. in Fencl et al. (2020, https://doi.org/10.5194/amt-13-6559-2020). If feasibility with the OpenMesh data shall be shown, I suggest leaving out all details of the processing and referring to relevant literature and code (ideally providing it in the GitHub repo associated with OpenMesh). Ideally then the processing is done for several microwave links, but the authors should find a concise way of presenting these results to not steer too far away from the main goal of the paper, i.e. presenting the published dataset. In my opinion it is not required to add it to the manuscript, but having it in the example repository could be a nice addition. If a lot of effort is put into this, the processed rain rates could be added to the OpenMesh dataset, which would then require the processing to be described in the manuscript. Given the potential challenges with the 5 GHz CMLs, it is probably better to leave the rain rate processing for future research, though.
3d. I want to highlight two of the presented results, also to provide an argument why other parts can be removed or reduced. The multi-band data shown in Figure 8 is really interesting because, to my knowledge, this is the first time a low- and high-frequency microwave attenuation time series is shown together with rain gauge data. As the authors point out, the increasing number of these systems provides benefits, e.g. during heavy rainfall when the high-frequency link might have an outage. Interesting research can potentially be done with this data in the future. The data from the snowfall days in Figure 10 are also very interesting, showing the different effect on low- and high-frequency microwave links. The results in Figure 6 and 7 are less novel, but still interesting. I just find the way they are presented and explained in the text suboptimal, maybe just too long. In general, I hope that pointing out some highlights here will convince the authors that reducing the content in this manuscript is a plus and not a minus. I encourage them to take a brief look at the OpenMRG paper and take the length and structure from there as inspiration.
Below I provide a long list of specific comments. Note that some (or many of them) might stem from sections that should be shortened or removed. Do not take the existence of a specific comment as indication of relevance of a certain part of the manuscript. Consider restructuring/shortening based on my main comments and not the specific comments.
Additional note: From L70 onwards I decided to skip (most, but not all) comments on language and imprecise sentences. In particular towards the end of the paper I refrained from the cumbersome commenting on minor details, also assuming that portions of the text will be removed/rephrased/restructured anyway. The text of the whole paper should be checked carefully, making it more precise and objective. Many sections should be shortened, focusing on what is relevant for the presented dataset and avoiding lengthy descriptions of less relevant details.
# Specific comments
L11: It is unclear what „emerging 5G/6G challenges“ in the context of multi-band observations should be. The observations are interesting, but also available in 4G networks, or as is the case here, in microwave link networks that are not part of the cellular network infrastructure. Please rephrase here.
L18: You most likely mean satellite microwave links and not only „satellites“ because, at least to my knowledge, there is no opportunistic use of other satellite data for environmental modelling. Please adjust, or make it clearer what you mean here.
L19: I do not find the two references very fitting here. They focus on nowcasting but do not directly analyse the impact of „accurate meteorological observations“ for water management, agriculture, urban planning, etc. I suggest to find better references here.
L22: „spatial detail“ is not correct here, better write „spatial representativeness“.
L27-L31: This sentence is hard to read. Furthermore, I do not see the clear connection between OS data and the AI-based improvements of weather forecasting. In particular, the statement that „diverse datasets“ have been used to train the AI systems does not seem correct. To my knowledge, all the big AI models for weather forecasting have been trained with ERA5, which strives to be the most consistent and homogeneous global gridded meteorological dataset. The nowcasting models, like DGMR are trained with gridded radar data, which I would also not call diverse since meteorological services put a lot of effort into making them as homogeneous as possible. Please rephrase this sentence to make it more readable and provide a clearer argument for how AI weather forecasting and OS are related.
L32: There is no need to differentiate between microwave and mmWave because the term microwave is not limited to centimetre wave frequencies. Furthermore, the links in the „mm wave frequency“ from 30-40 GHz behave very much like the ones with centimetre wave length in the range 20-30 GHz, while other „mm wave“ links at 70 or 80 GHz show some distinct behaviour, regarding water vapour and also the non-linearity of the k-R power law.
L33: Not all mm wave frequencies are similar in that regard. Around 60 GHz there is significant attenuation from oxygen, which decreases again with increasing frequency. Your definition of mmWave is such that it starts at 30 GHz. Attenuation from rainfall is not yet too problematic regarding outages there. Only when going towards 100 GHz is it a problem with the available link budgets. Beyond 100 GHz atmospheric attenuation is severe anyway. It would maybe be better to just state here that there is a tradeoff, because increasing frequency provides more bandwidth but atmospheric attenuation also increases.
L49: There is no need to introduce the abbreviation NGN. It is only used two times in the text. If this term is relevant here, please explain. Next-generation network sounds fancy, but it is unclear if this is something existing or something still in development. Is the OpenMesh dataset from a NGN and others, e.g. OpenMRG, are not? Are all modern cellular networks NGNs?
L50: What does „co-temporal“ mean? Do they have the same temporal resolution?
L51: „Fine-grained“ does not seem to be the right term here. Anyway, a data paper in ESSD is not the right place for doing an in-depth analysis of a certain dataset, if this is what is meant here.
L57: I do not see where „high-resolution mapping“ is done in section 4. Please correct.
Fig 1: The (a) and (b) should be smaller in this figure. In some other figures, too, the labels look as if they were added too quickly with an image editor.
L75: Above you only write about „backhaul“, now you differentiate the x-haul options but it is not clear which one does what. Is this important in this paper?
L79: According to your definition of mmWave (30-300 GHz), „high mmWave bands“ would mean something above 150 GHz. Operational systems with such high frequencies are not yet available, though. As written above, I suggest to focus on the covered frequency bands instead of highlighting the use of mm wave frequencies.
L80-L84: It is unclear what „Small cells“ means here. The data described in this paper is not from a cellular network. Hence, this is confusing, also the description in the following sentence. What has this to do with the presented data?
L107: I strongly suggest clearly stating here the maturity of the different approaches mentioned, because readers who are not familiar with the details of the methods might think that sensing of humidity, wind and air quality with microwave link data works as well as rainfall estimation. This is not the case. For humidity some skill has been shown, but the paper you cite (Rubin et al 2022) also clearly points out the challenges and limitations. E-band frequencies clearly have some potential here. The possibility of sensing wind is only briefly mentioned as an option in the paper you cite (Messer et al 2012), but no results are shown. For air quality, a paper exists (David 2016, https://doi.org/10.1021/acs.est.6b00681), which you do not cite here, but it only shows that microwave link signals are prone to be disturbed by certain stable atmospheric layering, which can also trap air pollution if there is a relevant source of the pollutants. These methods should not be mentioned without further explanation in the context of microwave link rainfall estimation, because for rainfall estimation there is a clear, strong signal in the microwave link data which can fairly easily be inverted to estimate rainfall. Please update this sentence.
L113: The words you use here „…aims to foster collaboration among scientists, national weather services, sensor-network owners…“ sound like they describe the COST Action OPENSENSE, but not GMDI, which is a product of OPENSENSE, but has a very specific task. Please read the cited reference Fencl et al 2025 to find out what GMDI is and update this sentence.
L121: Unclear sentence. Maybe the „With…“ needs to be removed?
L122-L125: This sounds a bit like an advertisement of the NYC Mesh. But is this really a mesh network? The topology looks more star-like (see Figure 3) and it does not seem realistic to me that this is a „fault tolerant web“ that does „routing through multiple paths“. Please clarify and maybe be more objective in the description of the network.
L126: Unclear what is described in this sentence and how this is related to the presented dataset.
L128-L129: Incomplete sentence
L130 paragraph: I looked at https://map.nycmesh.net/ and I find it hard to see the described mesh structure there. From all the endpoints (non-hub points) my estimate is that 90% or more are connected via one hop to a hub. Is it only the hubs or supernodes that are built in a mesh structure to compensate for fibre-optic connection failures? In general it is interesting to learn about the network topology, but the text here reads more like an advertisement of NYC Mesh and not like a description of the actual network contained in the dataset. Please be more precise here. Or even better, show a map, based on the released data, that shows the different parts of the network and highlights the mesh configuration.
L156: Unclear what role OSPF plays here. Also UISP was not introduced. Please update. Is this relevant for the paper and dataset?
L160: has to be „…that runs daily:“.
L170 and following: Why are only 103 sub-links included here? What about the others that are shown in Figure 2? Was their data not recorded? Why? If the data was recorded, why is it not in the dataset?
L171: You also provide TSL, not only RSL.
L185: „Official WS often follow…“. Of course, an official met service weather station ALWAYS follows a standardised protocol! Please update and be more precise in your writing. This whole paragraph is a bit vague. You mostly use PWS data anyway, so why write a lot about official weather stations here?
L195: Add brackets around el Hachem et al 2024.
L198: What is meant with „responsiveness“ here?
L202: „…enabling high-resolution analysis of microclimates...“. That sounds again like an advertisement of the platform you got the data from, WUnderground in this case. How would they allow this analysis? Are the records long enough? Are the records reliable enough? Is there a publication that shows that? Same for „precipitation trends“? Reliability and length of data records are crucial for that.
L209: A collector area has to be given in square meter or square mm. What are the 20 mm here? Is it mm^2?
L210: Bad sentence. First comma should be removed. Needs „and“ before „might lead…“ and remove comma or rephrase a bit.
L224: I assume the stations at the airport are run by official authorities. So, does WUnderground also include these, i.e. they provide both PWS and official stations data? Aren’t these data available via NOAA or another official platform, or are they only available via WUnderground? Please check and update or clarify in the text.
L229: Does the daily data here stem from the same devices as the ones described under „Airport WS“, since they both stem from the ASOS network? Why not get all the data with higher temporal resolution from the NOAA portal? See https://www.ncei.noaa.gov/products/land-based-station/automated-surface-weather-observing-systems
L236: Please show a map of all links and all other data that is included in your open dataset. Note that I strongly suggest that you add all data that you use for the presented analysis, see my main comments. If the zoomed in map in Figure 3a is required, add that in addition to the full map. But the reader has to see on one map the full extent of the dataset, including links, gauge and weather stations, hopefully also the radar grid (see my main comment).
L239: „…we focus…“ So you focus on a subset in the analysis here or is the released data this subset?
L250: PWS might provide data at 15min resolution. How was this resampled to 5min? Linear interpolation? Why not keep the original time stamps so that it is clear to the user and a choice can be made when comparing to higher resolution data. When preparing the station data for release (see my main comment), I suggest to keep the original time stamps, assuming they are equidistant at 5min, 10min and 15min for individual stations.
Section 3 and subsections: Why is there a description of details of Fig 4 in the first paragraph and then we go into a subsection 3.1 that describes the k-R relation? What is the purpose of section 3? Section 3.1 and 3.2 seem to explain methods. They do not fit here. See my main comments.
Equation 3: How is the effective link length L_e different from the geometric link length and why is this important in this paper? Please update or explain.
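For orientation, the relation this comment refers to is usually written with the geometric path length. The block below is the common textbook form, not necessarily the manuscript's Equation 3; the symbols a and b (frequency- and polarization-dependent coefficients), k (specific attenuation in dB/km), R (rain rate in mm/h), L (geometric link length in km), and A (rain-induced path attenuation in dB) are assumptions of this sketch.

```latex
k = a R^{b}, \qquad A = k\,L = a R^{b} L, \qquad R = \left(\frac{A}{a\,L}\right)^{1/b}
```

If an effective length L_e is used instead, it takes the place of L in these expressions; how L_e is defined in the manuscript is the open question raised here.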
L296 paragraph: Is the PWS data spatially interpolated or do you take the mean of the PWS in the vicinity of a link? Please make this clear in the text.
Figure 6a: Why has the PWS CCDF only the three big steps? The data comes with a quantisation of 0.25 mm. With 15-minute temporal resolution that equates to 1 mm/h increments. Please explain or redo the plot. I strongly suggest to not bin the PWS values into the three categories first. You can keep the dashed lines that indicate the rain rate thresholds, but you should plot the continuous CCDF. Is the y-axis of Delta_A_15 scaled so that it corresponds to the rain rates values that would be derived via the k-R power law? That would make sense.
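The arithmetic in this comment and the unbinned CCDF it asks for can be sketched as follows. This is a minimal illustration: the rain-rate values are made up, and `empirical_ccdf` is a hypothetical helper, not code from the OpenMesh repository.

```python
TIP_MM = 0.25       # PWS quantisation quoted in the comment, in mm
INTERVAL_MIN = 15   # PWS reporting interval quoted in the comment, in minutes

# One quantisation step per reporting interval, expressed as a rain rate (mm/h).
rate_step = TIP_MM * 60 / INTERVAL_MIN

def empirical_ccdf(values):
    """Return (x, P(X > x)) for each distinct value x, without prior binning."""
    n = len(values)
    return [(x, sum(v > x for v in values) / n) for x in sorted(set(values))]

# Made-up 15-minute rain rates in mm/h, all multiples of the 1 mm/h step.
rates = [0.0, 0.0, 0.0, 0.0, 1.0, 1.0, 2.0, 4.0]
ccdf = empirical_ccdf(rates)

print(rate_step)  # 1.0
print(ccdf)       # [(0.0, 0.5), (1.0, 0.25), (2.0, 0.125), (4.0, 0.0)]
```

With real data the curve would have one step per occurring multiple of 1 mm/h, rather than only three steps at the class thresholds.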
L319: Citation format is wrong.
L321: In Figure 3a we do not know about dry conditions. Hence it is not clear how the conclusions are drawn here. Please update.
L324 paragraph: Here the CCDFs of PWS and link are discussed as if the events that cause the exceedance are temporally associated. But there is no guarantee of temporal alignment of events. Please rephrase the paragraph or align PWS and link data first to draw these kinds of conclusions. The sentences are also confusing since they switch back and forth between describing PWS and link data.
L327: This is confusing. Did the PWS capture the localised rain showers or not?
L328: The term „link collapse“ sounds as if the mechanical structure collapses. I assume, what you mean here are outages due to strong attenuation. I suggest to use another wording here.
Figure 9: Which rain rate data was used to group the attenuation data? The one from nearby PWS? Please explain.
L344: The x-axis in Figure 7 says this is 60-minute aggregations. The text here says 15 minutes. What is correct?
L364: Should be Fig 7c here, not Fig 6c. It is also not clear here why 24 GHz bands are mentioned. They are not shown in Fig 7.
L372: Is the data gap, which is caused by the outage during strong attenuation (note, that I do not like the term collapse, as commented above), visible in the plot or was the gap interpolated? Even when zooming in, I cannot see a gap in the time series.
L380: If you mention the limitations of conventional rain gauges with regard to measuring non-liquid precipitation, you have to also point out the limitations of microwave-based measurements of non-liquid precipitation. Doing quantitative precipitation estimation is very challenging for snow, graupel, sleet, etc, even with dedicated dual-pol weather radars. Using microwave links might provide additional insights, e.g. by giving an indication whether or not precipitation is liquid (see Øydvin et al., 2025 https://doi.org/10.5194/amt-18-2279-2025), but it is not clear to me, and from the existing literature, how a microwave-link-based measurement of non-liquid precipitation will be of better quality than the one from a rain gauge. Please clarify in the text.
L382: What is the device used in Central Park? What are the devices used at the other two locations? Are they all dedicated snow height sensors, e.g. ultra sonic height measurements? If not, how reliable and comparable are the values?
Figure 10: Unclear to me if „cyan“ and „blue“ are correctly attributed to the shaded areas. I only see a violet and a blue shading. Is cyan the blueish one? Furthermore, to make interpretation of the different magnitude of attenuation at different bands easier, I suggest to use the same x-axis spacing for the attenuation plots. In addition, I find the mean and median of PWS precipitation data not very helpful. Maybe it would be best to just draw thin lines for all individual PWS time series? How many PWS are used here and at what distance from the links are they?
L401: I assume you refer to the daily NOAA data here and the reason it is not shown in the figure is because it is just a daily value. Please explain in the text. Ideally you should also include higher resolution NOAA data anyway (see my main comments).
L404: So the 39 mm from the NOAA data mentioned above are snow depth and not rainfall depth? Please clarify.
L411: Which sensors provided the precipitation classification codes mentioned here? Why are they, or at least the main classes „dry snow“ and „melted snow“ or „mixed-phase“ not shown in Figure 10? If you do not have the precipitation classification codes available for your analysis, there is no need to mention them here. It is enough to state that there is dry snow and melting snow and that they cause different amounts of attenuation.
L417: There is no need to write „…suggesting that dry snow is effectively transparent to microwave signals“ and then state in the following sentence that wet snow causes strong attenuation. This is well known and you wrote about that some lines above. Please rephrase.
L426: „These examples demonstrate that wireless links detect both liquid and frozen precipitation effectively...“. I absolutely do not agree with this statement. For the exact reasons you described above, which you also summarise in the next part of this sentence, it is very challenging to detect frozen precipitation and to distinguish the transition from liquid to mixed-phase, with microwave link data. Please be more modest in your formulations. Also, what does „detect something effectively“ actually mean? This needs to be rephrased anyway.
L432: „…the applicability OS of rainfall…“ What should that mean? Please correct.
L434-L435: „…illustrating the ability to classify and detect precipitation solely from signal strength measurements“. This sounds as if you show something new here. This is not the case. Please rephrase.
Figure 11: What are the rain classes used for PWS here and why aren’t the rain rates shown?
General comment on section 4: Since this is a data paper, there is no need to do an in-depth analysis of processing performance because you are not sharing the processed data and thus do not have to report on the methods. It is a nice addition to show applicability of existing methods, though. However, with one selected sublink and a selected period, we do not learn much about the applicability of existing methods to the dataset. It is not surprising that existing methods work here. There is also no need to describe any details of the methods if they are just the standard ones. You can refer to standard literature. There is also no point in reporting an NRMSE for a super short period and a single sublink. If you want to show processing skill based on time series I suggest to use at least several sublinks and at least one month of data to also show challenges during dry periods. Or, you could focus on a specific rain event and process all sublinks for the given day, showing also spatial rainfall distribution. Ideally, if you include radar data, you can show the benefit of microwave link data, providing near-ground data with high temporal resolution, similar to what is shown in the OpenMRG ESSD paper. In general, show what potential users of your dataset can expect from your dataset and hint at the interesting analysis that can be carried out in the future.
(Note that I also cover this issue in one of my main comments above, which I wrote as synthesis after writing all specific comments. But I keep this specific comment here because it discusses some additional details.)
L471: „everyday internet backhaul“ is a strange term. Why not just write „standard internet access“?
L474: „…baseline calibration or detection broader weather patterns“. What is „baseline calibration“ and how should the low frequency links help here? And what are „broader weather patterns“?
L474: I agree that there is strong absorption due to oxygen that limits the link range, but what are the „additional attenuation factors“ that are mentioned?
L474-L457: I do not understand what the „critical consideration for hydrological sensing“ should be. Please clarify or remove.
L478-L480: I thought the issue with the 60 GHz links is due to the short path-length. Please clarify.
L484: It would be really good to know how many of these dual-band links are in the dataset. That should be clarified already above when the dataset is introduced.
L510: What is the „opportunistic-hydrology“ codebase? The provided link points to the Github group of the OpenSense COST Action. Please clarify or update.