A 225-Year (1799–2024) Homogenized Daily Water Level Series of the Vistula River in Warsaw

Sobechowicz, Łukasz; Brykała, Dariusz; Kaznowska, Ewa; Wasilewicz, Michał; Wolski, Jacek; Noras, Marcin; Siwek, Wojciech Aleksander

doi:10.5194/essd-2025-538


04 Dec 2025

Status: this preprint is currently under review for the journal ESSD.

Abstract. We present a 225-year (1799–2024) homogenized daily water level series for the Vistula River in Warsaw, comprising 82,453 observations. The construction of this consistent dataset required adjustments for changes in gauge location, shifts in gauge zero, differences in historical measurement units, and calendar discrepancies between the Julian and Gregorian systems. A small number of missing observations were reconstructed using stage–stage relationships established between overlapping periods of observation at the Warsaw gauge and parallel measurements from downstream stations along the Vistula. The resulting dataset offers a robust foundation for long-term hydrological, climatic, and socio-environmental research. The dataset is openly available in the Zenodo repository: https://doi.org/10.5281/zenodo.16919654 (Sobechowicz et al., 2025).
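As an illustration of the calendar adjustment mentioned in the abstract, the following minimal sketch converts a Julian-calendar date to its Gregorian equivalent via the Julian Day Number. It is not taken from the preprint: the function name `julian_to_gregorian` is hypothetical, and which source records are actually dated in the Julian calendar is something only the original sources can tell.

```python
from datetime import date


def julian_to_gregorian(year: int, month: int, day: int) -> date:
    """Convert a Julian-calendar date to the equivalent Gregorian date.

    Hypothetical helper for illustration; not the authors' code.
    """
    # Julian calendar date -> Julian Day Number (standard integer formula).
    a = (14 - month) // 12
    y = year + 4800 - a
    m = month + 12 * a - 3
    jdn = day + (153 * m + 2) // 5 + 365 * y + y // 4 - 32083
    # Python's ordinal 1 is Gregorian 0001-01-01, which has JDN 1721426.
    return date.fromordinal(jdn - 1721425)


# In the 19th century the two calendars differ by 12 days.
converted = julian_to_gregorian(1850, 1, 1)
print(converted.isoformat())
```

Going through the day number rather than adding a fixed offset keeps the conversion correct across century boundaries, where the difference between the calendars grows by one day.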

Received: 02 Sep 2025 – Discussion started: 04 Dec 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: open (until 25 Mar 2026)

RC1: 'Comment on essd-2025-538', Anonymous Referee #1, 18 Jan 2026
The authors have done an impressive job of creating a 225-year dataset of daily water levels in Warsaw. Such a long data record is crucial for climate research, hydrological modelling, and flood risk management. They have exhaustively included data from publications and yearbooks prior to 1981 (some of which were in different units and from different locations). Nevertheless, there are some serious technical concerns that need to be addressed before publication. They are as follows:
A simple linear regression on neighbouring gauges has been used to fill gaps in the data, with specific highs and lows selected for this purpose. The authors observed a lag of up to 4 days for water to travel between the gauges. In addition, they use the maximum, minimum, and onset points of flood events to fill in data that do not represent these conditions. In short, I find the assumption of an average time lag and the use of specific points overly simplistic, given the nonlinear hydrodynamics of flows of varying intensity. More evidence supporting these assumptions would help readers judge the reliability of the filled data, for instance a cross-correlation between the time series of different gauges to show the time lag. The authors could have used more robust techniques such as Long Short-Term Memory (LSTM) neural networks for data filling, as these can capture non-linear temporal patterns while selectively retaining information relevant to the current output in long time series. The authors need to present the variations in time lag across the different gauge locations and discuss the associated uncertainties. Please refer to the work of Ren et al. (2022) in this regard.
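The cross-correlation check suggested above could look roughly like the following sketch. It is an illustration only: the series are synthetic, the names `lagged_correlations` and `best_lag` are hypothetical, and it assumes two gap-free daily series of equal length in which the neighbouring gauge responds after the reference gauge.

```python
from math import pi, sin, sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / sqrt(vx * vy)


def lagged_correlations(reference, neighbour, max_lag):
    """Correlate reference[t] with neighbour[t + lag] for lag = 0..max_lag days."""
    result = {}
    for lag in range(max_lag + 1):
        x = reference[: len(reference) - lag]  # drop the last `lag` days
        y = neighbour[lag:]                    # drop the first `lag` days
        result[lag] = pearson(x, y)
    return result


def best_lag(reference, neighbour, max_lag):
    """Lag (in days) with the highest correlation."""
    corr = lagged_correlations(reference, neighbour, max_lag)
    return max(corr, key=corr.get)


# Synthetic stand-ins for two gauges (NOT real data): the downstream
# series repeats the reference signal two days later, rescaled.
reference = [200 + 80 * sin(2 * pi * t / 45) + 25 * sin(2 * pi * t / 11)
             for t in range(180)]
downstream = [0.8 * reference[max(t - 2, 0)] + 30 for t in range(180)]

corr = lagged_correlations(reference, downstream, max_lag=6)
estimated = best_lag(reference, downstream, max_lag=6)
print(estimated, corr)
```

Repeating the calculation separately for low, medium, and high stages would show whether a single average lag is defensible or whether the lag varies with flow intensity.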

The validation of the proposed linear-regression-based data filling is incomplete. A robust k-fold cross-validation is needed to assess the accuracy of the proposed data-filling methods, with the data split iteratively into training and validation sets. Importantly, the error needs to be reported using metrics such as the root-mean-square error (RMSE), Nash–Sutcliffe efficiency (NSE), and Kling–Gupta efficiency (KGE).
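A minimal sketch of such a validation is given below, assuming a one-predictor linear stage–stage relationship and synthetic data. The function names are hypothetical and this is not the authors' procedure. Folds are taken as contiguous blocks rather than shuffled samples, which is one possible choice for autocorrelated daily series, and KGE follows the common formulation based on correlation, variability ratio, and bias ratio.

```python
import random
from math import sqrt


def mean(v):
    return sum(v) / len(v)


def fit_line(x, y):
    """Ordinary least-squares slope and intercept for y = slope * x + intercept."""
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    return slope, my - slope * mx


def rmse(obs, sim):
    return sqrt(mean([(o - s) ** 2 for o, s in zip(obs, sim)]))


def nse(obs, sim):
    mo = mean(obs)
    return 1 - sum((o - s) ** 2 for o, s in zip(obs, sim)) / sum((o - mo) ** 2 for o in obs)


def kge(obs, sim):
    mo, ms = mean(obs), mean(sim)
    so = sqrt(mean([(o - mo) ** 2 for o in obs]))
    ss = sqrt(mean([(s - ms) ** 2 for s in sim]))
    r = mean([(o - mo) * (s - ms) for o, s in zip(obs, sim)]) / (so * ss)
    return 1 - sqrt((r - 1) ** 2 + (ss / so - 1) ** 2 + (ms / mo - 1) ** 2)


def kfold_scores(x, y, k=5):
    """Fit on k-1 contiguous blocks, predict the held-out block, pool the predictions."""
    n = len(x)
    bounds = [i * n // k for i in range(k + 1)]
    obs, sim = [], []
    for lo, hi in zip(bounds[:-1], bounds[1:]):
        slope, intercept = fit_line(x[:lo] + x[hi:], y[:lo] + y[hi:])
        obs.extend(y[lo:hi])
        sim.extend(slope * v + intercept for v in x[lo:hi])
    return {"rmse": rmse(obs, sim), "nse": nse(obs, sim), "kge": kge(obs, sim)}


# Synthetic example (NOT real gauge data): target stage depends linearly
# on the neighbouring stage, plus random noise.
rng = random.Random(42)
neighbour = [150.0 + 2.0 * i for i in range(100)]
target = [0.9 * v + 40.0 + rng.gauss(0, 5) for v in neighbour]
scores = kfold_scores(neighbour, target, k=5)
print(scores)
```

Reporting the same scores per fold, or per stage class, would additionally show whether the regression performs worse during floods or low flows.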

The authors themselves say that the bed level varied extensively due to gauge relocation and anthropogenic activities. They have taken these into consideration and modified the zero level. They need to comment on the reliability and accuracy of these adjustments.
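To make the question concrete, the arithmetic of a gauge-zero adjustment, together with a simple uncertainty estimate, can be sketched as follows. All numbers are hypothetical, the function names are not from the preprint, and the uncertainty combination assumes independent errors in the reading and in the surveyed zero elevation.

```python
from math import sqrt


def to_reference_zero(reading_cm: float, gauge_zero_m: float,
                      reference_zero_m: float) -> float:
    """Re-express a stage reading (cm) relative to a common reference gauge zero.

    gauge_zero_m and reference_zero_m are elevations in metres above the
    same vertical datum.
    """
    return reading_cm + (gauge_zero_m - reference_zero_m) * 100.0


def combined_uncertainty_cm(reading_unc_cm: float, zero_unc_cm: float) -> float:
    """Combine independent uncertainties in quadrature (an assumption)."""
    return sqrt(reading_unc_cm ** 2 + zero_unc_cm ** 2)


# Hypothetical example: a historical gauge whose zero lay 0.50 m above
# the reference zero, with a reading of 250 cm.
adjusted = to_reference_zero(250.0, gauge_zero_m=80.00, reference_zero_m=79.50)
uncertainty = combined_uncertainty_cm(reading_unc_cm=3.0, zero_unc_cm=4.0)
print(adjusted, uncertainty)
```

A table of the zero elevation assumed for each period, with its source and estimated uncertainty, would let readers propagate the adjustment error in this way.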

Even high-resolution remote-sensing-based digital elevation models may not accurately represent riverbed topography. In addition, extensive cross-sectional surveys are needed to simulate the correct water levels that reflect the corresponding conditions. Therefore, the discharge time series is often highly useful to the hydrological community for validating hydraulic/hydrological models or for climate research. Can the authors provide a discharge time series or comment along these lines?

The minor comments are as follows:
Line 68, “km XXX of the Vistula River”, is difficult to understand. Please change similar lines in the manuscript.

Line 127, what do the authors mean by “km 421 + 600”? Please modify similar lines to provide greater clarity to readers.

In Table 4, what does the last column indicate?

Line 290, please use a clearer term than “early measurements”.

Lines 389–417: the summary need not include the uses of the dataset.

Reference
Ren, H., Cromwell, E., Kravitz, B., and Chen, X.: Technical note: Using long short-term memory models to fill data gaps in hydrological monitoring networks, Hydrol. Earth Syst. Sci., 26, 1727–1743, https://doi.org/10.5194/hess-26-1727-2022, 2022.

Citation: https://doi.org/10.5194/essd-2025-538-RC1

Supplement

https://doi.org/10.5194/essd-2025-538-supplement

Data sets

Daily Water Levels of the Vistula River at Warsaw, 1799–2024: A Complete and Homogenized Long-Term Record, Ł. Sobechowicz et al., https://doi.org/10.5281/zenodo.16919654

Viewed

Total article views: 518 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	Supplement	BibTeX	EndNote
347	147	24	518	58	25	21

Views and downloads (calculated since 04 Dec 2025)

Month	HTML	PDF	XML	Total
Dec 2025	191	50	14	255
Jan 2026	95	37	4	136
Feb 2026	61	60	6	127

Cumulative views and downloads (calculated since 04 Dec 2025)

Month	HTML	PDF	XML	Total
Dec 2025	191	50	14	255
Jan 2026	286	87	18	391
Feb 2026	347	147	24	518

Viewed (geographical distribution)

Total article views: 503 (including HTML, PDF, and XML), thereof 503 with geography defined and 0 with unknown origin.

Latest update: 28 Feb 2026

Short summary

We present a 225-year (1799–2024) homogenized daily water level record for the Vistula River in Warsaw, comprising 82,453 observations. Adjustments were made for gauge relocations, zero shifts, unit differences, and Julian–Gregorian discrepancies. Missing data were reconstructed using stage–stage relationships with downstream stations. The dataset provides a robust basis for long-term hydrological, climatic, and socio-environmental research.
