the Creative Commons Attribution 4.0 License.
Hydrologic, biogeochemical, microbial, and macroinvertebrate responses to network expansion, contraction, and disconnection across headwater stream networks with distinct physiography in Alabama, USA
Abstract. Here we present a comprehensive dataset of hydrologic, biogeochemical, microbial, and macroinvertebrate community measurements from a set of multi-year, co-occurring, watershed studies in non-perennial stream networks that dynamically expand and contract over space and time. The data were collected over the 2022–2024 water years across three stream networks draining watersheds with a similar humid, subtropical climate but distinct physiographies (i.e., Piedmont, Appalachian Plateau, Coastal Plain) in Alabama, USA. Our goal was to characterize the spatiotemporal patterns and drivers of how non-perennial stream networks expand and contract, as well as the biogeochemical, microbial, and macroinvertebrate dynamics associated with changes in network connectivity and water availability. We used a combination of spatial, temporal, and spatiotemporal sampling and sensor-based monitoring approaches to capture hydrologic, biogeochemical, and ecological responses to network expansion and contraction in each watershed. This manuscript describes the overall study design, monitoring network and sampling approaches, data and sample collection and analysis, and specific datasets generated. All data products are publicly available through the Hydroshare data repository for hydrologic, biogeochemical, and macroinvertebrate data (https://www.hydroshare.org/group/247) and through the NCBI data repository for microbial data. All data product-specific DOIs and repository-specific unique IDs are cited in Appendix A (Table A1, Table A3).
Status: final response (author comments only)
- RC1: 'Comment on essd-2025-559', Anonymous Referee #1, 25 Nov 2025
- RC2: 'Comment on essd-2025-559', Anonymous Referee #2, 05 Jan 2026
I would like to congratulate the authors for this tremendous effort into collecting a very comprehensible 3-year dataset of intermittent streams in three watersheds located in Alabama, USA. This dataset is very valuable due to its broad coverage of hydrological, biogeochemical and ecological data including macroinvertebrate, microbial, and fungal community sampling. The authors claim that a high resolution dataset as such is highly rare, and necessary due to long-term monitoring traditionally focusing on perennial rivers. However, as a previous reviewer has mentioned earlier, the usability of this dataset may be limited due to multiple different sampling designs and temporal resolutions applied for the individual datasets.
A major confusion in this dataset are the different sampling/monitoring designs utilized, which are named Approach 1, 2 and 3. These are firstly introduced in Table 1, however, are only named in the Methods section. A quick check of data products also revealed that some samples do not belong to any of the approaches, which adds to the confusion. These approaches should be clearly introduced, explained and named in the last paragraph of the introduction to prepare the reader for the subsequent figures/tables and sections. Meaningful identifiers (e.g. watershed outlet, seasonal sampling etc.) rather than numbers would be preferable, for the reader to intuitively distinguish the approaches used.
Many acronyms are used in this manuscript, which easily becomes overwhelming. Some acronyms are also not defined in the data products (e.g. ‘sublocation’ in ENVI_SE_TAL.xlsx). A major confusion is that the watersheds themselves are referred to by the physiographical region and their acronyms do not follow the same pattern. If I googled correctly, the acronyms derive after a city in the watershed. This is highly non-intuitive. Watershed-scale studies typically name their watershed after the biggest river in the watershed. Namely the river, where the watershed outlet is located. Failing to do so makes searching for this dataset difficult, especially for scientists who work on a global scale and are unfamiliar with this specific region.
Furthermore, as another reviewer has pointed out, many sensors are used in this dataset and their error estimates are not given or discussed in the manuscript. The accuracy and potential drift that may have been observed over the years is not mentioned. It is highly unlikely that over the extensive temporal period, no drift was observed in any of the sensors.
Finally, Table 2, 3 and 4 are especially important tables as they are overviews of what variables are in each data product and every reader will skim over these to evaluate whether the dataset is suitable for their needs. Keeping this in mind, beyond variables taken and their units, it would be especially helpful to disclose the temporal resolution sampled for each data product, the time periods covered, and what the total sample size in each watershed is.
Specific comments:
Please treat each display product as stand-alone, and define acronyms in the table or figure captions for readers who do not read your entire manuscript.
The sampling dates in Table 1 are hard to comprehend (e.g. asterisks, bold distinction). It would be easier to read if the two approaches had their individual rows and sampling dates are separately disclosed.
Figure 1: I am assuming that the 5 distinct colours in the left map are physiographical areas. Please add a legend to define them, or remove the colours that are not relevant to this study.
Figure 2, 3, 4: Do the contours refer to elevation? Please describe in figure captions.
427-430: Please provide error ranges for variables that were measured with two different methods or instruments. Were any statistical tests done to statistically support that there were no differences?
501: When I first read ‘composite sample’ I understood that the four sample types (litter, biofilm, surface water and sediment) were merged together in the end, and sequenced together as one sample. However, reading the DNA extraction and sequencing procedure, it seems that the sample types were extracted and sequenced separately. In this case, I would recommend the authors not to refer to them as ‘composite’ samples but just refer to them as four different sample types that were sampled, extracted and analyzed separately.
Citation: https://doi.org/10.5194/essd-2025-559-RC2 -
AC1: 'Comment on essd-2025-559: Author Response to Reviewer Comments', Stephen Plont, 17 Feb 2026
Reviewer 1:
The datasets are particularly valuable because hydrological, biogeochemical, and ecological observations are rarely collected together.
RESPONSE: Thank you for taking the time to provide a thoughtful and thorough review of our manuscript! Based on your feedback and comments, we have taken steps (outlined below) to better clarify and describe our highly complex dataset, to improve the overall readability and comprehension of our manuscript, and to report and elaborate on potential sources of measurement error and uncertainty.
However, there are inconsistencies in the spatial and temporal resolutions among the datasets. Not all observations share the same spatiotemporal resolution, which represents a major limitation and may hinder usability for other researchers. The authors should also explicitly discuss these limitations in the text.
RESPONSE: We recognize that the use of multiple monitoring- and sampling-based approaches, each of which differs in spatial and temporal resolution, contributes greatly to the high complexity of our catalog of datasets. However, non-perennial streams are highly variable systems, and the complexity in our data collection approach reflects the complexity in expected hydrologic, biogeochemical, and ecological responses across a watershed during periods of network expansion and contraction. Further, many biogeochemical and ecological responses to stream drying and rewetting cannot be detected or inferred solely from sensor-based monitoring approaches. Our watershed study set out to collect data in such a way that individual datasets were co-collected to maximize harmony among datasets and comparability across watersheds through consistent sampling and monitoring approaches.
All sensor-based datasets were collected at 15-minute intervals from Autumn 2021 to Autumn 2024, with comparable temporal coverage across sensor-based datasets and across watersheds. All samples within sample-based datasets (e.g., water and dissolved gas chemistry, microbial and macroinvertebrate communities) were co-collected within each sampling approach at sites paired with either STIC or water sensors, ensuring both consistent spatial and temporal resolution and overlap between sampling and monitoring sites. The manuscript now notes the few cases where datasets were limited to specific watersheds or study periods.
Several ancillary datasets were collected in addition to the core data efforts detailed in this manuscript to support dissertation research for students or other projects. Further, some datasets have limited temporal coverage due to budgetary constraints or instrument malfunctions. Ancillary datasets were only included in this data catalog if they 1) were consistently sampled in at least two of the three sampling approaches in at least one of the study watersheds and 2) were sampled consistently for over one year of the study. The ancillary or limited data sets we opted to include based on these criteria were GHGS, TSSS, WAIS, and the macroinvertebrate community data. We have added text at the beginning of section 2.2 (Watershed study design) and throughout section 3.3 (Water chemistry sampling and analysis) and 3.4 (dissolved gas sampling and analysis), 3.5 (Microbial community collection and analysis) to describe these differences in sampling duration and resolution, and the potential limitations where relevant.
The data methods are clearly articulated; however, there is no discussion of uncertainty estimates. Although many sensors and analytical methods have been employed, the authors have not reported the associated accuracy and uncertainty values that users need to consider. This information is essential for evaluating spatiotemporal heterogeneity. Without it, we cannot determine whether observed differences across space or time are truly significant or simply within the bounds of measurement uncertainty.
RESPONSE: For sensor-based monitoring data products, we have incorporated sensor-specific measurements of error, detection limits, and accuracy (as provided in manufacturer documentation) throughout the text where applicable. Further, we have incorporated more details of specific sources of uncertainty (e.g., baseline shifts from pressure transducer removal for data downloads, uncertainty in streamflow and water quality sensor measurements during high flow events). We have also added more specific details throughout the manuscript about data quality assurance-quality check (QAQC) flags that were previously only contained in the sensor metadata to aid in interpretation of data quality and measurement uncertainty.
For sampling-based data products, namely samples for water chemistry, we have included more details on specific method detection limits in the text and in Table 3. Water chemistry samples were collected in triplicate, and replicate samples were filtered out if concentrations exceeded three standard deviations of the mean triplicate concentration for a specific analyte. In the final data, we report the mean and standard deviation for each sample, the method detection limit, and a flag for samples where the mean concentration is below the method detection limit. We have elaborated on these details in section 3.3 (Water chemistry sampling and analysis).
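For readers working with the water chemistry products, the reporting step described above can be sketched roughly as follows. This is a minimal illustration, not the authors' processing code: the function name, the dictionary keys, and the use of the sample (n − 1) standard deviation are all assumptions, and the replicate outlier screen is omitted because its exact formulation is not given in this response.

```python
from statistics import mean, stdev


def summarise_replicates(values, mdl):
    """Summarise replicate concentrations for one sample and one analyte.

    values: replicate concentrations, in the same units as mdl
    mdl:    method detection limit for the analyte
    All names here are hypothetical; they do not mirror the published files.
    """
    m = mean(values)
    # Sample standard deviation (n - 1 denominator) is an assumption.
    sd = stdev(values) if len(values) > 1 else 0.0
    return {
        "mean": m,
        "sd": sd,
        "mdl": mdl,
        # Flag samples whose mean concentration falls below the detection limit.
        "below_mdl": m < mdl,
    }
```

A user could then keep flagged rows but treat them as censored values rather than as measured concentrations.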
The authors claim “novelty” at Line 73, but it is not clearly established. The first two paragraphs primarily emphasize the need for this work, which is not the same as demonstrating its novelty. Could the authors explicitly articulate what makes this dataset novel?
RESPONSE: Thank you for this helpful comment. We have revised the introduction to more clearly articulate the novelty of our dataset. Specifically, our study represents one of the first and most comprehensive efforts to co-collect hydrologic, biogeochemical, microbial, and macroinvertebrate synoptic data within non-perennial stream networks. Our study design intentionally integrates sensor-based monitoring networks with both spatially extensive and temporally intensive watershed sampling. This paired approach enables direct investigation of spatiotemporal dynamics in network expansion and contraction and their consequences for ecosystem processes and community structure within the watershed as well as the consequences for downstream water quality. In addition, comprehensive datasets from non-perennial streams in mesic regions are rare, particularly in the southeastern US, an important global freshwater biodiversity hotspot and a region expected to undergo climate-driven hydrologic intensification in the coming decades.
There is an inconsistency regarding the study period: the abstract states 2022, while the main text (Line 239) indicates 2021. Please clarify which year is correct and ensure consistency throughout the manuscript.
RESPONSE: Thank you for highlighting this point of confusion. The study took place across the 2022-2024 water years, a unit primarily used by the U.S. Geological Survey and defined as the 12-month period between October 1 of a given year and September 30 of the following year. To avoid confusion, we have removed mentions of the specific water years and ensured that the study period is consistently defined as Autumn 2021-Autumn 2024 throughout the manuscript.
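The water-year convention explains why the 2022 water year begins in Autumn 2021. A minimal sketch of the mapping, assuming the usual USGS practice of labelling a water year by the calendar year in which it ends (the function names are illustrative):

```python
from datetime import date
from typing import Tuple


def water_year(d: date) -> int:
    """Return the water year containing date d.

    A water year runs from 1 October through 30 September and is
    labelled by the calendar year in which it ends.
    """
    return d.year + 1 if d.month >= 10 else d.year


def water_year_bounds(wy: int) -> Tuple[date, date]:
    """Return the first and last day of water year wy."""
    return date(wy - 1, 10, 1), date(wy, 9, 30)
```

Under this convention, the 2022–2024 water years span 1 October 2021 to 30 September 2024, which matches the Autumn 2021 to Autumn 2024 wording adopted in the revision.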
Overall, the structure and organization of the subsections are good. However, there are too many abbreviations, making it difficult to follow the text. I strongly recommend reducing the number of abbreviations to improve readability.
RESPONSE: We have removed several acronyms and abbreviations that were low usage throughout the manuscript, were used in specific sections but not throughout the manuscript, did not serve to improve the readability of the text, or were otherwise deemed to be unnecessary. We have also removed uses of sampling approach 1, 2, and 3 and streamlined our discussion of these different sampling methods throughout the manuscript to aid in overall readability and comprehension. Moreover, we have removed mentions of sampling approaches from resource titles on HydroShare to maintain consistency.
Reviewer 2:
I would like to congratulate the authors for this tremendous effort into collecting a very comprehensible 3-year dataset of intermittent streams in three watersheds located in Alabama, USA. This dataset is very valuable due to its broad coverage of hydrological, biogeochemical and ecological data including macroinvertebrate, microbial, and fungal community sampling. The authors claim that a high resolution dataset as such is highly rare, and necessary due to long-term monitoring traditionally focusing on perennial rivers.
RESPONSE: Thank you for taking the time to provide a thoughtful and thorough review of our manuscript! Your feedback and comments have been very helpful to better clarify and describe our highly complex dataset.
However, as a previous reviewer has mentioned earlier, the usability of this dataset may be limited due to multiple different sampling designs and temporal resolutions applied for the individual datasets.
RESPONSE: Although employing multiple monitoring and sampling approaches adds complexity to our dataset catalog, this design is intentional and necessary. We co-collected datasets using standardized methods to ensure strong integration within watersheds and robust comparability across sites. All sensor-based monitoring datasets were collected at a consistent 15-minute temporal resolution. Within each watershed, individual sensor records span Autumn 2021 to Autumn 2024, and comparable sensor datasets are temporally aligned across all three watersheds. Sample-based datasets (e.g., water and dissolved gas chemistry, microbial and macroinvertebrate community composition) were co-collected at sites equipped with STIC sensors (spatially extensive approach), surface and groundwater level sensors (seasonal approach), or at watershed outlets. Thus, within each sampling approach, datasets share consistent spatial and temporal resolution. We have clarified in the manuscript the few instances where datasets were limited to one or two watersheds or to specific study periods.
A major confusion in this dataset are the different sampling/monitoring designs utilized, which are named Approach 1, 2 and 3. These are firstly introduced in Table 1, however, are only named in the Methods section. A quick check of data products also revealed that some samples do not belong to any of the approaches, which adds to the confusion. These approaches should be clearly introduced, explained and named in the last paragraph of the introduction to prepare the reader for the subsequent figures/tables and sections. Meaningful identifiers (e.g. watershed outlet, seasonal sampling etc.) rather than numbers would be preferable, for the reader to intuitively distinguish the approaches used.
RESPONSE: We have removed uses of sampling approach 1, 2, and 3, explicitly defined sampling approaches as “temporal outlet sampling”, “seasonal watershed sampling”, and “spatially extensive watershed sampling” in the introduction, and streamlined our discussion of these different sampling methods throughout the manuscript to aid in overall readability and comprehension. Moreover, we have removed mentions of sampling approaches from resource titles on HydroShare to maintain consistency (see Table A1).
All sample-based data products belong to at least two of the three sampling approaches. For a given data product (e.g., “TAL_DOCS”), samples across the three sampling approaches are collated into a single dataset, and the sampling approach under which each sample was collected is designated using approach-specific binary indicator columns (e.g., in the appr1 column, “0” = not part of the sampling approach and “1” = part of the sampling approach). This is stated in Section 4 (Data Management/Availability) on L639-643. We chose to format sample-based data products in this way 1) to help collate data into fewer data products and files and 2) to allow specific data points that belong to multiple sampling approaches to be counted and filtered as such (e.g., samples that were collected at the watershed outlet during seasonal watershed sampling campaigns belong to both the “temporal watershed outlet” and “seasonal watershed sampling” approaches). We recognize that these clarifications need to be stated earlier and reiterated throughout the manuscript, as this key detail can be easily missed. We have incorporated text into the introduction to establish the sampling and sensor monitoring approaches early and elaborate throughout the manuscript on how the different sampling approaches are represented in the final data products.
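As a rough illustration of how such 0/1 approach columns can be used, the sketch below filters a small table by approach membership. Only the `appr1` column name appears in this response; `appr2`, `appr3`, the sample identifiers, and the mapping between column numbers and named approaches are assumptions made for the example.

```python
def filter_by_approach(rows, *flag_columns):
    """Keep rows flagged 1 in every listed approach column."""
    return [r for r in rows if all(int(r[c]) == 1 for c in flag_columns)]


# Hypothetical rows; real data products carry many more columns.
rows = [
    {"sample": "S1", "appr1": 1, "appr2": 0, "appr3": 0},
    {"sample": "S2", "appr1": 1, "appr2": 1, "appr3": 0},  # belongs to two approaches
    {"sample": "S3", "appr1": 0, "appr2": 0, "appr3": 1},
]

in_first = filter_by_approach(rows, "appr1")           # S1 and S2
in_both = filter_by_approach(rows, "appr1", "appr2")   # S2 only
```

Because membership is stored per column rather than as a single label, a sample collected under two approaches is retained by either single-column filter and by the combined one.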
Many acronyms are used in this manuscript, which easily becomes overwhelming. Some acronyms are also not defined in the data products (e.g. ‘sublocation’ in ENVI_SE_TAL.xlsx). A major confusion is that the watersheds themselves are referred to by the physiographical region and their acronyms do not follow the same pattern. If I googled correctly, the acronyms derive after a city in the watershed. This is highly non-intuitive. Watershed-scale studies typically name their watershed after the biggest river in the watershed. Namely the river, where the watershed outlet is located. Failing to do so makes searching for this dataset difficult, especially for scientists who work on a global scale and are unfamiliar with this specific region.
RESPONSE: As per recommendations from both reviewers, we have removed acronyms and abbreviations that were low usage, did not serve to improve the readability of the text, or were otherwise deemed to be unnecessary. We have clarified in the abstract, introduction, and “Data Management and Availability” section that the “SE” sublocation acronym within data product names is a regional designation (i.e., “SE” = “Southeast region”) within the larger “Aquatic Intermittency effects on Microbiomes in Streams (AIMS)” project. All AIMS data products are hosted through the same HydroShare data repository, with comparable data products collected using the same watershed sampling design existing in two other regions (“GP” = Great Plains, “MW” = Mountain West).
Watersheds were selected to be comparable in size to each other and representative of small, non-perennial, headwater streams in different physiographic regions throughout the southeastern United States. The streams where each watershed outlet is located are unnamed. The hydrologic conditions observed in the nearest downstream named, monitored rivers (for the Piedmont: Choccolocco Creek at Boiling Springs, AL; for the Appalachian Plateau: Estill Fork at Estillfork, AL; for the Coastal Plain: Tombigbee River at Gainesville, AL) are not comparable to those of our small study watersheds. We have included all relevant geospatial information needed to locate and map each of our focal watersheds as data products in this release. In each watershed description, we have added information on the nearest named, downstream river, and what larger river system each watershed drains into.
The watershed acronyms (i.e., TAL, WHR, PRF) were based on project-wide watershed names, originally assigned based on local landscape descriptors. “Talladega” is in reference to the Talladega National Forest surrounding our Piedmont watershed. “Weyerhaeuser” is in reference to the land management company that owns the region linked to our Coastal Plain watershed. “Paint Rock-Fanning Hollow” is in reference to the larger Paint Rock River watershed. Unfortunately, while we were able to change the watershed names to be more descriptive, comparable, and transferable, the watershed acronyms are deeply embedded in all data products featured in this publication and across the larger AIMS project-related products (e.g., site names, sample names, other published data records and descriptions, other publications) and are not feasible to change. We have done our best to 1) refer to each study watershed by physiographic province in the manuscript and data products and 2) include a reference to the relevant watershed acronym wherever possible in the manuscript to minimize confusion.
Furthermore, as another reviewer has pointed out, many sensors are used in this dataset and their error estimates are not given or discussed in the manuscript. The accuracy and potential drift that may have been observed over the years is not mentioned. It is highly unlikely that over the extensive temporal period, no drift was observed in any of the sensors.
RESPONSE: We have incorporated text referencing sensor-specific measurements of error, detection limits, and accuracy (as provided in manufacturer documentation), as well as sources of uncertainty throughout the text where applicable. We have also added specific information on how sensor drift was accounted for (e.g., threshold adjustments for water presence/absence; manual measurements and adjustments of water level in pressure transducer data; baseline shifts for detected drift in water quality sensors). We have also added more specific details about data QAQC flags that were previously only contained in the sensor metadata to aid in interpretation of data quality and measurement uncertainty.
Finally, Table 2, 3 and 4 are especially important tables as they are overviews of what variables are in each data product and every reader will skim over these to evaluate whether the dataset is suitable for their needs. Keeping this in mind, beyond variables taken and their units, it would be especially helpful to disclose the temporal resolution sampled for each data product, the time periods covered, and what the total sample size in each watershed is.
RESPONSE: In our updates to the captions for Tables 2, 3, and 4, we have now included details on the temporal resolution and time periods covered for sensor monitoring datasets, water/dissolved gas chemistry samples, and for microbial and macroinvertebrate community analyses. The temporal resolution and time periods covered are identical for the majority of parameters and data products outlined in each respective table, as they were co-collected during each sampling approach, and we have made note of exceptions to this in the table caption and in the manuscript text. The temporal resolution and time periods sampled are also illustrated for each study watershed in panel B of Figures 2, 3, and 4, respectively.
Specific comments:
Please treat each display product as stand-alone, and define acronyms in the table or figure captions for readers who do not read your entire manuscript.
RESPONSE: We have updated figure and table captions throughout to ensure each display product can stand alone.
The sampling dates in Table 1 are hard to comprehend (e.g. asterisks, bold distinction). It would be easier to read if the two approaches had their individual rows and sampling dates are separately disclosed.
RESPONSE: We have updated Table 1 to give each sampling approach its own row and removed asterisks and bold distinctions.
Figure 1: I am assuming that the 5 distinct colours in the left map are physiographical areas. Please add a legend to define them, or remove the colours that are not relevant to this study.
RESPONSE: Yes, the five distinct colours in the left map in Figure 1 are in reference to the 5 physiographic regions represented in Alabama. We have added a legend to define each of these.
Figure 2, 3, 4: Do the contours refer to elevation? Please describe in figure captions.
RESPONSE: We have updated figure captions for Figures 1-4 to state that contours in watershed maps refer to elevation.
427-430: Please provide error ranges for variables that were measured with two different methods or instruments. Were any statistical tests done to statistically support that there were no differences?
RESPONSE: We have incorporated error ranges and method detection limits for NO3-N concentrations in our cross-instrument and methods comparisons. Using a one-way analysis of variance of our cross-instrument comparison results, we found that there were no statistical differences between NO3-N concentrations measured using either of the ion chromatographs (p = 0.851, df = 35) or between ion chromatography and cadmium reduction methods (p = 0.782, df = 86).
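For readers unfamiliar with the test, a one-way analysis of variance compares between-group to within-group variability. The sketch below computes the F statistic and its degrees of freedom from scratch; it is a generic illustration under the standard textbook definition, not the authors' analysis, and the function name is an assumption. Obtaining a p-value additionally requires the F distribution (for example via `scipy.stats.f_oneway`, which returns both the statistic and the p-value).

```python
from statistics import mean


def one_way_anova_f(*groups):
    """Return (F, df_between, df_within) for two or more groups of measurements."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total

    # Variability of group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variability of observations around their own group mean.
    ss_within = sum((x - mean(g)) ** 2 for g in groups for x in g)

    df_between = k - 1
    df_within = n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within
```

With two groups, as in a comparison of two instruments, this is equivalent to a pooled-variance two-sample t-test (F equals t squared).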
501: When I first read ‘composite sample’ I understood that the four sample types (litter, biofilm, surface water and sediment) were merged together in the end, and sequenced together as one sample. However, reading the DNA extraction and sequencing procedure, it seems that the sample types were extracted and sequenced separately. In this case, I would recommend the authors not to refer to them as ‘composite’ samples but just refer to them as four different sample types that were sampled, extracted and analyzed separately.
RESPONSE: Thank you for highlighting this area of confusion. “Composite samples” refer to the merging of multiple samples collected from the same habitat compartment of a site to capture natural variability in habitat-specific microbial communities. We have further clarified text describing this sampling technique in our descriptions of microbial sampling and sequencing methods.
Citation: https://doi.org/10.5194/essd-2025-559-AC1
The datasets are particularly valuable because hydrological, biogeochemical, and ecological observations are rarely collected together. However, there are inconsistencies in the spatial and temporal resolutions among the datasets. Not all observations share the same spatiotemporal resolution, which represents a major limitation and may hinder usability for other researchers. The authors should also explicitly discuss these limitations in the text.
The data methods are clearly articulated; however, there is no discussion of uncertainty estimates. Although many sensors and analytical methods have been employed, the authors have not reported the associated accuracy and uncertainty values that users need to consider. This information is essential for evaluating spatiotemporal heterogeneity. Without it, we cannot determine whether observed differences across space or time are truly significant or simply within the bounds of measurement uncertainty.
The authors claim “novelty” at Line 73, but it is not clearly established. The first two paragraphs primarily emphasize the need for this work, which is not the same as demonstrating its novelty. Could the authors explicitly articulate what makes this dataset novel?
There is an inconsistency regarding the study period: the abstract states 2022, while the main text (Line 239) indicates 2021. Please clarify which year is correct and ensure consistency throughout the manuscript.
Overall, the structure and organization of the subsections are good. However, there are too many abbreviations, making it difficult to follow the text. I strongly recommend reducing the number of abbreviations to improve readability.