Interactive comment on “Satellite-based remote sensing data set of global surface water storage change from 1992 to 2018” by Riccardo

The manuscript entitled "Satellite-based remote sensing data set of global surface water storage change from 1992 to 2018" by Tortini et al. presents estimated global surface water storage changes ( ∆ V ) in large lakes and reservoirs using a combination of paired water surface elevation (WSE) and water surface area (WSA) extent products. In their approach, they used data produced by multiple satellite altimetry missions (TOPEX-Poseidon, Jason-1, Jason-2, Jason-3, and ENVISAT) from 1992 on, with surface extent estimated from Terra/Aqua Moderate Resolution Imaging Spectroradiometer (MODIS) from 2000 on. They used the relationships between elevation and surface area to produce estimates of ∆ V even during periods when either of the variables was not available. They produce time series of ∆ V as well as WSE and WSA for a set of 347 lakes and reservoirs globally for the 1992-2018 period. In general I ﬁnd the idea of manuscript very interesting and I also see the need for having such data base. Indeed, the production of long-term, consistent, and calibrated records of surface water cycle variables such as the data set presented here is of fundamental importance to baseline future SWOT products. We thank the anonymous reviewer for the kind words.

We are thankful to the anonymous reviewer for the thoughtful comments, which led to significant improvements to our manuscript. All comments were addressed as described below.

C1
The manuscript entitled "Satellite-based remote sensing data set of global surface water storage change from 1992 to 2018" by Tortini et al. presents estimated global surface water storage changes ( ∆V ) in large lakes and reservoirs using a combination of paired water surface elevation (WSE) and water surface area (WSA) extent products. In their approach, they used data produced by multiple satellite altimetry missions (TOPEX-Poseidon, Jason-1, Jason-2, Jason-3, and ENVISAT) from 1992 on, with surface extent estimated from Terra/Aqua Moderate Resolution Imaging Spectroradiometer (MODIS) from 2000 on. They used the relationships between elevation and surface area to produce estimates of ∆V even during periods when either of the variables was not available. They produce time series of ∆V as well as WSE and WSA for a set of 347 lakes and reservoirs globally for the 1992-2018 period. In general I find the idea of manuscript very interesting and I also see the need for having such data base. Indeed, the production of long-term, consistent, and calibrated records of surface water cycle variables such as the data set presented here is of fundamental importance to baseline future SWOT products. We thank the anonymous reviewer for the kind words.
Major comments: I believe the paper suffers from missing an important point. The authors calculate first the correlation coefficient between the two variables and use it as one of the decision parameter for taking the mean value instead of data itself. The correlation coefficient between two data set represents the linear dependency between two data sets, while the relationship between water level and surface area represent the bathymetry and the bathymetry of a lake should not follow a linear behaviour. I would strongly suggest to change this in the paper and in case the authors would like to assess the monotonic behaviour between water level and area, then they should use the Spearman rank correlation and not simply the Pearson correlation. C2 We agree with the reviewer's comment that the bathymetry of a lake should not follow a linear behavior, and acknowledge that the 0.5 multiplier used in Equation (1) usually underestimates the actual volume change by not taking into account factors such non-linear bathymetries and the shape of the shoreline. However, such approach works reasonably well at most lakes/reservoirs (cfr. Gao et al., 2012), ultimately proving more portable to lakes/reservoirs at the global scale. We now account for this as explained in page 15, line 25-35. "The quality of both elevation and surface area contribute to the accuracy of their relationship, but volume changes are mostly dominated by elevation changes. High correlations between elevation and area generally indicate reliable ∆V estimation. However, if either variable is systematically biased, the error associated with the relationship is carried to the estimated ∆V. For example, low correlation may arise when the target shows nearly constant WSA (vertical walls, in which case a variation in elevation reflects in a negligible change in WSA) or nearly constant elevation (i.e., shallow lakes, in which case a variation in surface area reflects in a negligible change in elevation). In these cases we proceeded in the modelling of ∆V with the parameterization of the invariant variable with its mean value. All the factors listed above introduce some degree of error in the WSE-WSA relationship; however, in most cases a linear approximation does not appear to be a major contributor (cfr. Gao et al., 2012)." My second major comment goes to the methodology for the area extraction. Figure 6 shows some vertical lines of points, which represent same area for different water levels. This is highly suspicious.
We acknowledge that the area classification algorithm may suffer from uncertainty due to the spatial resolution of the imagery used (i.e. 500 m). However, using MODIS imagery over other finer resolution satellite images (e.g. Landsat) ensured to obtain a denser observation time series (virtually 32 times higher) due to the satellite revisit times (i.e. two daily MODIS observations vs Landsat's 16-day revisit time. This ultimately led to establishing a more robust relationship between WSE and WSA in C3 order to model ∆V. We now emphasize this point in page 16, line 1-10 as reported below. "Despite GOLA's moderate spatial resolution it can potentially affect the accuracy of ∆V estimates, higher resolution satellite missions have longer satellite revisit time (e.g., 16 days for Landsat, 10 days for Sentinel-2A starting in 2015 and 5 days for Sentinel-2A and -2B in tandem formation starting in 2017). Because we leveraged the relationship between WSE and WSA to estimate ∆V, such satellite revisit times would produce sparser records, especially for water bodies located at high latitudes and/or altitudes as they are more affected by cloud cover. In fact, despite being highly desirable for monitoring of surface water dynamics, imagery from optical sensors is strongly affected by the presence of cloud cover, which can be extensive in late fall and winter, and in combination with low sun angle experienced at high latitudes may limit its usefulness at the global scale (Duguay et al., 2015). However, the integration of optical imagery (e.g., MODIS, Landsat, Sentinel-2) and radar altimetry data provides long-term continuity in the production of consistent and calibrated records, and we encourage to re-explore the lakes in our study using Landsat and/or Sentinel images with 20-30m spatial resolution." Specific comments: page 2, line 30, please mention River and Lake https://earth.esa.int/web/guest/-/river-and-lake-products-from-altimetry-4617 and HydroSat http://hydrosat.gis.unistuttgart. de We thank the reviewer for the suggestion. We now mention both in page 2, line 34-37.
"Further examples of global data sets are the University of Stuttgart's HydroSat (http://hydrosat.gis.uni-stuttgart.de/; accessed February 27th, 2020), and, despite being no longer actively maintained, the European Space Agency's River Lake Altimetry products (http://altimetry.esa.int/riverlake; accessed February 27th, 2020)." Page 4, Section 2.1, Did you make sure that all data from different data C4 centers have the same background models for atmospheric refraction? How did you deal with the intersatellite bias?
The reviewer brings up an important point here. The G-REALM10 products are constructed from the merger of (up to) four mission datasets. Elevation reconstruction within each mission is based not only on the version (standard) of the data set but also what atmospheric and tidal corrections are currently available for that mission. The 10-day products (G-REALM10) are thus a blend of Geophysical and Interim Geophysical data records, and a mix of data version's B through D. The G-REALM35 products (based on the ENVISAT mission) are dataset version 2.0. The atmospheric range corrections also vary between the datasets, for example, while the radiometer based wet tropospheric range correction is the primary selection across all, the secondary model-based choice utilizes the ECMWF estimates for Jason-2, Jason-3, and ENVISAT but is currently limited to employing the RADS/ERA model correction (TOPEX/Poseidon, Jason-1). The ionospheric range correction can also differ between missions, e.g., selecting the GIM model (Jason-3, ENVISAT) but otherwise utilizing various RADS options (TOPEX/Poseidon, Jason-1, and Jason-2). Full processing details can be found in the project ATBD document for the lake level products (Birkett et al., 2019). The altimetric community continues to upgrade mission datasets and is striving for a more common dataset version/standard across all missions. Merging (up to 4) 10-day resolution time series to create one uniform product spanning multiple decades relies on the availability of data within the 6-month overlap periods i.e., when the historical and new mission are in a tandem phase, overpassing the lake on the same ground track but spaced 1 minute apart. Any inter-mission range bias can be corrected for by noting the elevation shift required to align the results from the two time-series. Absence of data in this overlap period results in the application of a global mean inter-mission range bias estimated from global observation of ocean surface heights (Birkett et al., 2019, ATBD document). Merging a combination of GREALM-10, GREALM-35, DAHITI or LEGOS products to obtain the longest time record also cannot ensure uniformity of atmospheric corrections across all the different product C5 sources. In these merger cases elevation bias were estimated from the difference of the means of a subset (with good periodicity and few outliers) of each series. We now emphasize this point in page 4, line 21-22 and page 5, line 1-2. "Full details of the processing to create the G-REALM10 and G-REALM35 products can be found in the Algorithm Theoretical Basis Document (ATBD; Birkett et al., 2019). This includes the descriptions of the atmospheric corrections applied in the height reconstructions, the inter-mission height bias application, and the inherent differences between mission data set versions." In addition, we now reference the data sets' respective ATBD in "Abstract" (page 1, line 27) and "6. Data availability" (page 16, line 15). "The data sets presented and their respective Algorithm Theoretical Basis Documents are publicly available and distributed via NASA's Jet Propulsion Laboratory's Physical Oceanography Distributed Active Archive Center (PO DAAC; https://podaac.jpl.nasa.gov/)."

page 5 line 3, what is an acceptable accuracy? please quantify!
Our approach utilizes the classification algorithm described in Khandelwal et al. (2017) to estimate water surface area from MODIS imagery. In their paper, the authors validate the MODIS-based classification maps (500 m resolution) using higher spatial resolution Landsat-based reference maps (30 m resolution) at three target reservoirs (Mead, Kremenshugskoye, and Nova Ponte) under a dry and wet scenario (cfr. Table  2 in Khandelwal et al., 2017), discussing potential and limitations of such approach. Given the global nature of our study, it is virtually impossible to single-handedly establish "an acceptable accuracy" for 347 lakes/reservoirs. page 7, equation 1, I did not grasp the equation. shouldn't be WSA t+1 -WSA t?
We thank the reviewer for spotting the typo. We edited Equation (1)  We are thankful to the reviewer for further reinforcing this point, addressed in the reply to the major comment above. Figure 6, the extracted area is so noisy that similar area are obtained for different height. And in fact, no obvious linear relationship can be recognized.
We thank the reviewer for further highlighting this point, due to the resolution of the MODIS imagery utilized as discussed in previous comments.