the Creative Commons Attribution 4.0 License.
Global Ice Water Path Retrieval Using Fengyun series Satellite Data: A Machine Learning Approach
Abstract. This study presents a novel machine learning framework (RobustResMLP) for retrieving the global ice water path (IWP) and cloud ice water path (CIWP) from 2009–2024 via passive microwave observations from the microwave humidity sounders (MWHS-I/II) on China's Fengyun-3 series satellites. The framework employs a lightweight multilayer perceptron architecture enhanced with gated residual units and hierarchical differential dropout to address the challenges associated with high-noise satellite data. By establishing rigorous spatiotemporal collocation with CloudSat 2C-ICE products, we generate three operational products: (1) synoptic type that orbital-resolution IWP/CIWP (15 km; 2009–2024), (2) climatic type that gridded monthly composites (1°×1°; 2011–2024), and (3) cloud layer mask (CLM) products. Notably, the 89 GHz channel emerges as the most influential predictor despite theoretical limitations. This approach achieves a critical compromise between pointwise accuracy and spatiotemporal completeness, enabling unprecedented decadal-scale cloud feedback analyses. All datasets are openly available in netCDF4 format for community sharing.
Status: open (until 28 Oct 2025)
- RC1: 'Comment on essd-2025-447', Patrick Eriksson, 25 Oct 2025
Data sets
Fengyun polar-orbiting satellite total/cloud ice water path retrieval dataset (2009-2024). Yifan Yang, Tingfeng Dou, Gaojie Xu, Rui Zhou, Bo Li, Letu Husi, Wenyu Wang, Cunde Xiao https://doi.org/10.11888/Atmos.tpdc.302932
Model code and software
Global Ice Water Path Retrieval Using Fengyun series Satellite Data: A Machine Learning Approach (figure generation and pre-/post-processing code). Yifan Yang https://doi.org/10.5281/zenodo.16352116
The manuscript by Yang et al. presents a new dataset of retrievals based on the MWHS instrument series. These retrievals focus on the ice water path (IWP), but cloud IWP (CIWP) and a cloud mask are also considered. Retrievals of IWP based on operational microwave radiometers are surprisingly few, despite some clear advantages of such measurements for the task. An important forerunner is the work of Holl et al. (2014), also applying machine learning, using the same reference dataset (2C-ICE) and making use of similar microwave radiometers. However, Holl et al. (2014) also included passive near- and thermal-infrared (IR) measurements and in that way increased the sensitivity at conditions matching lower IWP. On the other hand, involving near-IR data introduced a restriction to daytime, a limitation avoided in this work. Another strength of this work is the relatively long time series of data provided, in contrast to Holl et al. (2014), which has so far not been applied in an operational manner.
That is, the retrievals presented fill an important gap, and we would like to see this dataset description published in ESSD. However, the manuscript requires a major revision; at a minimum, the details of how these machine-learning retrievals were developed must be better described, and the characterization of the retrieval performance must be extended. Details behind this recommendation are elaborated below.
As there will be several references to our own work, including a suggestion to consider data produced by us, we've decided not to stay anonymous, in the interest of transparency. This review is made by Patrick Eriksson, assisted by PhD student Peter McEvoy. That said, we think the references to our own work are justified.
Specific comments
Line 21: According to the tables in the supplement, the instruments of concern cannot be said to be "high-noise". It can also be questioned whether noise is the main challenge in these inversions; this would be an ill-posed problem even in the limit of zero noise.
Line 23: It is unclear what is meant by "synoptic type that orbital-resolution", but presumably this refers to what is normally referred to as level 2. The standard nomenclature of level 2 and level 3 data should be adopted, see e.g. https://www.earthdata.nasa.gov/learn/earth-observation-data-basics/data-processing-levels
Line 26: In what sense is there a compromise between footprint-level accuracy and spatio-temporal completeness?
Line 27: The claim of "unprecedented" is vague and can be questioned. With respect to understanding the cloud feedback, for example, the retrievals based on MODIS must be considered equally or more interesting. In any case, the CCIC retrievals (Amell et al. (2024); Pfreundschuh et al. (2025)) have much higher spatio-temporal coverage while offering similar accuracy (being also trained on 2C-ICE).
Lines 32-36: The impact of cloud ice on the radiation budget is brought forward as the main motivation, but as the measurements of concern do not constrain the amount of cloud ice in a direct manner (as discussed above), other passive observations are more relevant for this aspect. On the other hand, the relatively direct measurement of larger ice hydrometeors is highly relevant for e.g. the distribution of latent heat and the understanding of precipitation processes. That is, the choice of motivation to bring forward should be reconsidered.
Line 37: The statement of discrepancies in climate models "by orders of magnitude" needs closer specification. It is not true for mean IWP.
Line 49: Much of our knowledge in this matter goes back to work by Frank Evans, e.g. Evans and Stephens (1995), and it seems reasonable to cite one of those works (as done by Zhao and Weng (2002)).
Line 51: "vertical profiles of the IWP"; IWP is a column value.
Line 64: The logic in these two sentences is not clear. Rephrase.
Line 81: Please replace Amell (2021) with the related journal publication Amell et al. (2022).
Line 83: The statement about Tana et al. (2025) does not seem correct. This was achieved, at least, in Amell et al. (2024).
Line 104: Wang et al. (2024) does not exist in the reference list.
Line 105: Tables S1-S4 are referenced in the text. It is very unclear that this refers to tables within the supplemental material. Please clarify that there is a supplement.
Line 105-106: The meaning of this sentence is unclear. What else than L1 should be used as basis for the retrievals?
Line 124-135: The quality control of the input Fengyun data is presented. Was any quality control or filtering applied to the 2C-ICE reference data?
Line 140: FY-3D and CloudSat are presented to be 30 min apart. If correct, there should not be any tropical collocations inside 15 min.
Line 144: "pixel" seems here to mean boresight, but pixel indicates an area and is easily interpreted as footprint.
Line 145: This second criterion limits co-locations to cases where the coefficient of variation of the 2C-ICE pixels within an MWHS-II pixel is less than 0.6. This introduces a bias due to training only on relatively uniform cases. It would be helpful to have more motivation for this choice. Further, it must be clarified how the removed cases are considered in the error characterization.
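To make the concern concrete, here is a minimal sketch of how we read this criterion; the function name, the handling of zero-mean footprints, and the example values are our assumptions, not taken from the manuscript:

```python
import numpy as np

def passes_uniformity_filter(iwp_2cice_in_footprint, cv_threshold=0.6):
    """Keep a collocation only if std/mean (coefficient of variation) of the
    2C-ICE IWP values falling inside one MWHS-II footprint is below the
    threshold. Illustrative reading of the criterion, not the authors' code."""
    vals = np.asarray(iwp_2cice_in_footprint, dtype=float)
    mean = vals.mean()
    if mean <= 0:
        # CV is undefined here; how such cases are treated should be stated.
        return False
    return vals.std() / mean < cv_threshold

# A fairly uniform footprint passes; a highly variable one is rejected.
print(passes_uniformity_filter([100.0, 110.0, 95.0, 105.0]))  # True
print(passes_uniformity_filter([5.0, 300.0, 0.0, 150.0]))     # False
```

Note that exactly the rejected (heterogeneous) cases are the ones missing from the error characterization.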
Line 149: Please clarify what is meant by uniform distribution across latitude bands and how that is achieved.
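As an illustration of one possible interpretation, an equal-count draw per latitude band could look like the sketch below; the 30-degree band width and all names are our assumptions:

```python
import numpy as np

def sample_uniform_in_latitude(lats, n_per_band,
                               band_edges=np.arange(-90, 91, 30), rng=None):
    """Draw the same number of training samples from each latitude band.
    One possible reading of 'uniform distribution across latitude bands'."""
    rng = np.random.default_rng(rng)
    lats = np.asarray(lats)
    idx_out = []
    for lo, hi in zip(band_edges[:-1], band_edges[1:]):
        in_band = np.flatnonzero((lats >= lo) & (lats < hi))
        if len(in_band) >= n_per_band:
            idx_out.append(rng.choice(in_band, n_per_band, replace=False))
    return np.concatenate(idx_out) if idx_out else np.array([], dtype=int)

lats = np.random.default_rng(0).uniform(-90, 90, 10000)
idx = sample_uniform_in_latitude(lats, n_per_band=100, rng=1)
print(len(idx))  # 600 (6 bands x 100 samples)
```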
Lines 149-153: Please clarify if and how these multiple training subsets are used. Or are they combined in some manner? Is there a separate model trained for each combination of MWHS-II and MWHS-I with IWP and CIWP?
Line 155: What is a balanced representation? In any case, motivate why going away from using the actual statistics of the reference dataset.
Line 157: As mentioned, just a reference to Li et al. (2012) is not sufficient. There is also a dot after (2012).
Line 158: Why are the numbers of cases for CIWP and IWP not the same (a CIWP value should exist for every IWP value)?
Line 159: As Sec 3 is very short (too short) it seems reasonable to merge Secs. 2 and 3.
Line 161: How has this resolution been determined? It sounds unlikely as not all channels used have a resolution of 15 km, and this resolution is only reached at nadir.
Line 162-163: Please provide more details on how the monthly means are provided. Any weighting of the data? Are all grid cells filled? Typical number of retrievals in each mean? Are those numbers reported in the resulting data files? See also first data comment.
Line 167: What is meant by "fundamental model" here? We cannot find any RobustResMLP model outside this work. Or are the authors claiming to introduce RobustResMLP (but comments below contradict this)?
Line 169: Can 9 million parameters be considered lightweight, considering the few input data and the relatively limited scope of the model? For comparison, the MLP in Amell et al. (2022) had 0.3 million parameters. In any case, 9 million parameters seems large compared to the training set of 700 000 – 900 000 cases. There should be a high risk of overfitting.
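For reference, the parameter count of a plain fully connected MLP is easy to compute; the layer widths below are illustrative (not taken from either paper) and only show the scale of 9 versus 0.3 million parameters:

```python
def mlp_param_count(layer_sizes):
    """Weights plus biases of a plain fully connected MLP."""
    return sum(n_in * n_out + n_out
               for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]))

# Eight hidden layers of width 1024 already give ~7.4 million parameters,
# while a small net of four 256-wide layers stays around 0.2 million.
big = mlp_param_count([15] + [1024] * 8 + [1])
small = mlp_param_count([15, 256, 256, 256, 256, 1])
print(big, small)  # 7364609 201729
```

With fewer than a million training cases, a network of the larger size has more parameters than training samples.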
Line 170: “We make several significant improvements to the RobustResMLP” indicates that an existing model was used, but there is no reference to it.
Line 170: This list is appreciated. However, for a reader in the geoscience community, these techniques may not be familiar. It would be very helpful to have references to articles or other resources that provide more details on the techniques behind these improvements. Similarly, in Figure 1, references for "Lightweight Attention" and "Adaptive Feature Scaling" would be appreciated.
Sec 4.1: Since this is supervised machine learning, the retrievals will work as long as the scenario being observed is similar to those in the training dataset. How does the method handle rare events that are not close to the training set? Is there a way for the method to identify or flag retrievals that risk being out-of-distribution?
Line 186: Interesting solution for ensuring continuity across satellite generations by remapping values to the 150 GHz channel. However, at least a sentence or two quantifying any errors introduced by this approach, or providing motivation for why this can be expected to work sufficiently well, is warranted.
Line 193: Should be Fig. 1.
Figure 1: The first text box indicates that MWHS-I and II are used together. Presumably "and" should be "or".
Figure 1: The second text box explains that auxiliary data are used, but this is not mentioned in the text.
Figure 1: The text box "MLP Based Model for Mapping 150 GHz to 166 GHz" contradicts what was written in line 186, where the MLP is described as mapping to 150 GHz.
Line 200: With a log-transform, it must be described how IWP=0 was handled.
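Two standard options for making the log-transform defined at IWP = 0 are sketched below; which one (if any) the authors used should be stated explicitly. All constants and values here are illustrative:

```python
import numpy as np

iwp = np.array([0.0, 0.5, 10.0, 500.0])  # g/m^2, illustrative values

# (a) fixed offset: log10(IWP + eps), invertible via 10**y - eps.
eps = 1e-2
y_offset = np.log10(iwp + eps)

# (b) clip to a detection floor before taking the log; all IWP below the
# floor collapses onto one value, which affects the error statistics there.
floor = 1e-1
y_clip = np.log10(np.maximum(iwp, floor))

# Both keep every value finite; IWP = 0 maps to log10(eps) resp. log10(floor).
print(y_offset[0], y_clip[0])
```

Either choice determines how clear-sky cases enter the training loss, which ties into the detection question at line 219.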
Line 219: It must be clarified how "detect clouds" is defined, for both retrievals. In the case of MWHS a log-transform is used and then no retrieval will be strictly zero.
Line 229-230: The naming convention suggests the orbital retrieval data is level 1, even though it is level 2. Recommend removing or changing L1 to L2 in the filenames.
Line 237: The naming convention proposes that the gridded data is level 1, even though it is level 3. Recommend removing or changing L1 to L3 in the filenames.
Line 240: Detail how "the most temporally stable" were identified and extracted.
Line 241: We were unable to find “Merged_Global_Mean.nc” in the data portal.
Sec 6: In brief, this section must be extended. The presented results must be discussed more carefully. For example, Fig. 3 having six panels of results is just commented in very brief terms. In addition, errors not covered by the present analysis must be incorporated, as indicated above.
Sec 6: For what data are the statistics derived? Training or validation data should not be used here. It must be clarified that the test data are sampled in an unbiased way. They should represent a fully random selection.
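For clarity, by an unbiased selection we mean something like the following fully random permutation split (names and fractions are illustrative):

```python
import numpy as np

def random_split(n, test_fraction=0.1, seed=0):
    """Fully random train/test split over collocation indices, as opposed to
    any stratified or hand-picked selection. Illustrative sketch only."""
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n)
    n_test = int(round(n * test_fraction))
    return perm[n_test:], perm[:n_test]  # train indices, test indices

train_idx, test_idx = random_split(800_000, test_fraction=0.1)
print(len(train_idx), len(test_idx))  # 720000 80000
```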
Sec 6: Since the model is trained and tested on 2C-ICE data, any bias and error from this dataset will be inherited. This issue should be discussed and the magnitude of these inherited errors should be listed.
Sec 6.1: We suggest making a figure with the occurrence fraction (histogram of values) of IWP in the training and test data, and retrieved dataset. This would clarify the nature of the training and test data, and also show the dynamic range of the retrievals.
Line 253: Please quantify “high accuracy”. The statement can be questioned as the biases reported are considerable.
Line 256: Start a new paragraph when starting to discuss Table 3. Same at line 261, when moving to SHAP.
Lines 256-257: The statement referring to Table 3 must be explained, what results show this? The next sentence "Our analysis ...", is this an explanation to the previous sentence, or a new topic?
Line 261: Clarify that Figs. S1 and S2 are in the supplementary material. What is SHAP?
Line 264: Wang et al. (2024) does not exist in reference list. In what way is there a consistency with Wang et al.?
Tables 2 and 3: Negative biases far exceeding the stated global mean IWP are reported. How is this possible? Are the retrievals giving negative values?
Table 3: Which combination of these inputs is the one applied for the general processing? Consider stating that clearly in the text, for example in Sec. 4. The two last combinations use lat and lon as input. Is that a wise choice? The training can then basically learn the geographical distribution (especially when using a net with millions of parameters). This could maybe be OK for some application, but should be avoided if temporal changes and trends are considered. If the model applied generally takes lat and lon as input, the consequence of this choice must be explored.
Figure 3: As IWP spans order of magnitudes, panels c and d should have a logarithmic x-axis, and the y-axis in f should give the relative error. As pointed out above, the errors when true IWP is zero must be reported somehow. For the first bar in panel b, please note that any retrieved IWP > 0 for (true) IWP=0 corresponds to an infinite relative error.
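To be explicit about the IWP = 0 issue: relative error is only meaningful above some reference floor, and the cases below it must be reported separately (e.g. as a false-alarm statistic). A sketch, with the floor value being our assumption:

```python
import numpy as np

def relative_error(retrieved, truth, min_truth=1.0):
    """Relative error where the reference IWP exceeds a floor; below it the
    ratio is undefined or infinite, so those cases are masked (NaN) and
    should be characterized separately. Floor value is illustrative."""
    retrieved = np.asarray(retrieved, float)
    truth = np.asarray(truth, float)
    valid = truth >= min_truth
    rel = np.full(truth.shape, np.nan)
    rel[valid] = (retrieved[valid] - truth[valid]) / truth[valid]
    return rel

# First case: true IWP = 0, any retrieved IWP > 0 gives infinite relative
# error, hence the mask; the other two give +/-20 %.
rel = relative_error([5.0, 120.0, 80.0], [0.0, 100.0, 100.0])
print(rel)
```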
Sec. 7: The section only deals with IWP. There are no validations or comparisons for the other variables: CIWP and cloud mask. Such validations should also be performed.
Sec. 7: The section seems to ignore FY-3A. Does that indicate that those retrievals are not trustworthy? Include FY-3A, or remove it from the article (and the disseminated data).
Sec. 7: Consider including CCIC, as this is a product developed with similar objectives.
Figure 4: It is impossible to make a sensible comparison to 2C-ICE. This comparison requires that MWHS and ERA5 IWPs along the 2C-ICE transects are extracted and plotted together with 2C-ICE IWP as a line plot (and the transects added to panels e and h). Please include information on what satellite that carried the two instruments considered.
Line 305: The text reads as suggesting that Figure 5 shows that all IWP products exhibit fundamentally consistent spatial patterns. However, there are arguably larger differences in distribution between the FY-3X products than with the reference products, for example between FY-3B and FY-3D. The text mentions that they are both "afternoon satellites". Are such differences expected?
Line 315: For clarity, please list again here which satellites are MWHS-I-based, to make it easier for users of the dataset to know which files to use.
Figure 7: According to the gridded dataset, FY-3B and FY-3C have overlapping data 2014–2019. The full FY-3B range should be included, or its omission should be motivated.
Line 330: Melia et al. (2016) does not seem to be a proper reference for DARDAR.
Figure 7: A downward trend in IWP for the FY-3X satellites can be seen, which is not reflected in MODIS, VIIRS or ERA5. That should be discussed. The FY-3X retrievals also appear to have a clearer annual cycle than any other dataset, which should also be discussed.
Sec 8: This section should be rewritten considering the general issues raised above. In short, the section should also bring up limitations.
Line 348-353: In what way is the CLM a "distinct IWP product"?
Line 355-356: This reads as if neural networks automatically give superior sampling. This is not correct; it is the choice of instrument that governs this aspect.
Line 357: The text indicates that "temporal continuity" has been achieved, while the main text mentions several data issues and biases between the FY-3X satellites are seen in Fig. 7.
Line 386-393: This "outlook" is not very relevant and can be removed. If kept, it must be revised to properly account for ongoing work in these directions. In addition, the previous paragraph is already of outlook character.
Line 398: Make it very clear where to find the data. Consider putting it as the first sentence. "The presented datasets are available at". The current phrasing is unclear.
Line 433-438: Duplicate reference.
Data comments
In the gridded data, all FY3B_MWHSX_GBAL_L1_YYYY_MEAN.nc (iwp) files have clear swath artefacts for certain months and they show up clearly when taking the yearly mean. Is there a suggested way to filter out these artefacts?
Is there a recommended way to combine overlapping files from different satellites to get a best estimate for the monthly gridded IWP?
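As one possible answer the authors could document, a count-weighted merge of overlapping grids could look like the sketch below. Whether per-cell retrieval counts are stored in the files is exactly what needs clarifying; equal weights are the fallback assumption here:

```python
import numpy as np

def combine_monthly_grids(grids, counts=None):
    """Merge overlapping monthly gridded IWP fields from several satellites
    with a count-weighted mean, ignoring empty (NaN) cells. Illustrative
    sketch, not an endorsed recipe from the data provider."""
    grids = np.stack(grids)                      # (n_sat, nlat, nlon)
    if counts is None:
        counts = np.ones_like(grids)             # equal weights fallback
    counts = np.where(np.isnan(grids), 0.0, counts)
    vals = np.where(np.isnan(grids), 0.0, grids)
    total = counts.sum(axis=0)
    safe_total = np.where(total > 0, total, 1.0)
    return np.where(total > 0, (vals * counts).sum(axis=0) / safe_total, np.nan)

a = np.array([[100.0, np.nan]])                  # e.g. FY-3B monthly cell row
b = np.array([[200.0, 50.0]])                    # e.g. FY-3C monthly cell row
merged = combine_monthly_grids([a, b])
print(merged)  # cell 0: (100 + 200) / 2 = 150; cell 1: only one input -> 50
```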
There are some metadata issues in the NetCDF files; metadata units for IWP and CIWP are kg/m2, but value ranges suggest they are in g/m2. Cloud Mask Classification could benefit from a description on how to interpret the values.
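Until the metadata are fixed, users may need a heuristic like the following sketch to guard against the unit mismatch; the plausibility threshold is our assumption:

```python
import numpy as np

def ensure_kg_per_m2(iwp, plausible_max_kgm2=50.0):
    """Heuristic guard for the kg/m2 vs g/m2 mismatch noted above: IWP in
    kg/m2 should rarely exceed a few tens, so values in the hundreds or
    thousands suggest the data are actually g/m2 and need rescaling."""
    iwp = np.asarray(iwp, float)
    if np.nanmax(iwp) > plausible_max_kgm2:
        return iwp / 1000.0  # g/m2 -> kg/m2
    return iwp

vals = np.array([0.0, 120.0, 2500.0])  # range looks like g/m2
print(ensure_kg_per_m2(vals))          # rescaled to kg/m2
```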
Due to a suspected technical issue with the data provider, download speeds are unusably slow (max 20 KB/s). It takes > 4 hours to download the 300 MB gridded level 3 data files, and makes it unfeasible to download the > 400 GB orbital zip file. We have tried to download this on 9th Oct, 10th Oct and 12th Oct from multiple different internet connections in an effort to rule out local technical issues. Due to this, we were unable to look at the orbital level 2 dataset and its usefulness for the scientific community appears limited. For this reason we feel forced, at this moment, to rate the data quality as poor.
References
Amell, A., Eriksson, P., & Pfreundschuh, S. (2022). Ice water path retrievals from Meteosat-9 using quantile regression neural networks. Atmospheric Measurement Techniques, 15(19), 5701-5717.
Amell, A., Pfreundschuh, S., & Eriksson, P. (2024). The chalmers cloud ice climatology: Retrieval implementation and validation. Atmospheric Measurement Techniques, 17(14), 4337-4368.
Duncan, D. I., & Eriksson, P. (2018). An update on global atmospheric ice estimates from satellite observations and reanalyses. Atmospheric Chemistry and Physics, 18(15), 11205-11219.
Ekelund, R., Eriksson, P., & Pfreundschuh, S. (2020). Using passive and active observations at microwave and sub-millimetre wavelengths to constrain ice particle models. Atmospheric Measurement Techniques, 13(2), 501-520.
Eliasson, S., Buehler, S. A., Milz, M., Eriksson, P., & John, V. O. (2011). Assessing observed and modelled spatial distributions of ice water path using satellite data. Atmospheric Chemistry and Physics, 11(1), 375-391.
Eriksson, P., Baró Pérez, A., Müller, N., Hallborn, H., May, E., Brath, M., ... & Ickes, L. (2025). Advancements and continued challenges in global modelling and observations of atmospheric ice masses. EGUsphere, 2025, 1-42.
Evans, K. F., & Stephens, G. L. (1995). Microwave radiative transfer through clouds composed of realistically shaped ice crystals. Part II. Remote sensing of ice clouds. Journal of Atmospheric Sciences, 52(11), 2058-2072.
Holl, G., Eliasson, S., Mendrok, J., & Buehler, S. A. (2014). SPARE‐ICE: Synergistic ice water path from passive operational sensors. Journal of Geophysical Research: Atmospheres, 119(3), 1504-1523.
Li, J. L., Waliser, D. E., Chen, W. T., Guan, B., Kubar, T., Stephens, G., ... & Horowitz, L. (2012). An observationally based evaluation of cloud ice water in CMIP3 and CMIP5 GCMs and contemporary reanalyses using contemporary satellite data. Journal of Geophysical Research: Atmospheres, 117(D16).
Pfreundschuh, S., Kukulies, J., Amell, A., Hallborn, H., May, E., & Eriksson, P. (2025). The chalmers cloud ice climatology: A novel robust climate record of frozen cloud hydrometeor concentrations. Journal of Geophysical Research: Atmospheres, 130(6), e2024JD042618.
Zhao, L., & Weng, F. (2002). Retrieval of ice cloud parameters using the Advanced Microwave Sounding Unit. Journal of Applied Meteorology, 41(4), 384-395.