Articles | Volume 18, issue 2
https://doi.org/10.5194/essd-18-1287-2026
Data description article | 19 Feb 2026

FYAI: a Fengyun satellite-based dataset for atmospheric ice water path

Yifan Yang, Tingfeng Dou, Gaojie Xu, Rui Zhou, Bo Li, Letu Husi, Wenyu Wang, and Cunde Xiao
Abstract

This study introduces FYAI, a global, long-term atmospheric ice water path (IWP) and suspended ice water path (SIWP) dataset spanning 2010–2024, derived from passive microwave observations (MWHS-I/II) onboard China's Fengyun-3 series satellites. The dataset is generated using a machine learning framework featuring a lightweight multilayer perceptron architecture enhanced with gated residual units. This design robustly handles the inherent uncertainties in satellite brightness temperatures and the spatial mismatch between passive microwave footprints and active radar/lidar training data. By establishing rigorous spatiotemporal collocation with CloudSat 2C-ICE products, FYAI provides two operational product levels adhering to standard Earth observation data processing definitions: (1) Level-2 (L2) products, offering instantaneous orbital-resolution IWP and SIWP at a nominal 15 km nadir resolution; and (2) Level-3 (L3) products, comprising monthly global gridded composites at 1°×1° resolution. Both levels cover 2010–2024. FYAI bridges the gap between instantaneous pixel-level precision and broad spatiotemporal coverage, offering a comprehensive, decadal-scale record of global atmospheric ice content. This dataset, specifically designed to support long-term climate analysis and model validation, is openly available in netCDF4 format for community use (https://doi.org/10.11888/Atmos.tpdc.303143, Yang et al., 2025).

1 Introduction

Ice crystals play a pivotal role in cloud and precipitation processes, thereby significantly modulating the hydrological cycle, thermodynamics, and radiative transfer (Gultepe et al., 2017). Consequently, the reliable quantification of atmospheric ice content is critical for elucidating latent heat distribution and precipitation mechanisms (Amell et al., 2022). The primary metric used to describe this ice content is the ice water path (IWP), defined as the vertical integral of the ice water content (IWC). IWP is composed of both suspended ice and falling ice (also referred to as precipitation ice), although the criteria distinguishing these components remain ill-defined (Eliasson et al., 2011; Waliser et al., 2009). However, current climate models exhibit widespread inconsistencies and pronounced spatial heterogeneity in simulating IWP (Eriksson et al., 2025; Wang, 2022). Indeed, as highlighted in the Intergovernmental Panel on Climate Change Sixth Assessment Report (IPCC AR6), these cloud and precipitation processes remain primary sources of uncertainty in climate modeling and projections (IPCC, 2023). This underscores the critical need for high-quality observational constraints on atmospheric ice (Holl et al., 2014).

From an observational perspective, space-based remote sensing is the primary means of providing global IWP data, yet existing products face limitations. Visible and infrared sensors, such as MODIS and AIRS, have provided valuable long-term records. However, their measurements are often constrained by signal saturation in optically thick clouds, and they are primarily sensitive to upper cloud layers rather than probing the full depth of deep convective systems (Eliasson et al., 2011). Conversely, limb sounders like the Microwave Limb Sounder (MLS), while offering vertical profiles, are constrained by extremely sparse horizontal sampling, making them unsuitable for continuous regional monitoring (Wu et al., 2006). Active sensors (e.g., CloudSat/CALIPSO) offer high accuracy but represent only a “needle-thin” curtain of the atmosphere (Delanoë and Hogan, 2010; Hong and Liu, 2015). Passive microwave remote sensing bridges these gaps. On the one hand, microwave radiation can penetrate thick clouds and interact directly with ice particles via volume scattering to retrieve bulk ice mass, while polarimetric measurements provide further constraints on ice crystal shape and orientation. On the other hand, a notable limitation is its coarser horizontal and vertical spatial resolution compared to active sensors. Nevertheless, it remains the most effective approach for capturing broad-scale variability. Consequently, passive microwave instruments remain the optimal solution for retrieving large-scale, long-term, and all-weather IWP data due to their ability to penetrate dense clouds and interact directly with ice mass (Evans and Stephens, 1995; Wu et al., 2008, 2024).

Currently, microwave humidity sounders operating below 200 GHz (e.g., AMSU-B, MHS) are standard for ice detection. However, although they carry the Microwave Humidity Sounder (MWHS), China's Fengyun-3 (FY-3) series satellites remain largely untapped as a source of global climate datasets. The FY-3 series offers a unique advantage unmatched by other operational systems: a complete three-orbit constellation comprising morning (FY-3A/C/F), afternoon (FY-3B/D), and the distinct dawn-dusk (FY-3E) orbit satellites (An et al., 2023; Tan et al., 2019; Wang et al., 2022). This configuration allows for substantially improved temporal sampling, filling critical gaps in the diurnal cycle of IWP that are missed by sun-synchronous satellites restricted to fixed crossing times, particularly with the inclusion of FY-3E observations starting in 2023. By leveraging this 15-year continuous archive (2010–2024), there is an opportunity to construct a coherent, long-term IWP climate data record that overcomes the spatiotemporal limitations of existing datasets.

While traditional physical retrieval methods offer interpretability, they rely heavily on complex scattering databases and microphysical assumptions (e.g., particle shape and size distribution) that are often difficult to constrain globally (Letu et al., 2016, 2020). In contrast, machine learning (ML) has introduced a novel paradigm for remote sensing retrieval. Its primary novelty lies in its ability to approximate complex radiative transfer processes through data-driven representation learning, effectively bypassing the rigid dependence on a priori microphysical assumptions required by physical inversions. By constructing deep neural networks (NNs), ML can capture highly non-linear relationships and extract abstract features from multi-channel observations that are often imperceptible to traditional methods. Previous efforts, such as SPARE-ICE (Holl et al., 2014) or geostationary retrievals (Amell et al., 2022, 2024; Tana et al., 2025), have demonstrated the efficacy of NN-based approaches. Similarly, recent studies involving co-authors of this paper have explored ML applications to IWP retrieval using polar-orbiting FY-3 satellites (Wang et al., 2022, 2024). However, a dedicated, long-term IWP dataset derived specifically from the advanced capabilities of the FY-3 constellation – which also incorporates a distinction between total ice and suspended ice – is currently absent from the community.

To address these gaps, this study presents “FYAI” (Fengyun Satellite-Based Dataset for Atmospheric Ice Water Path), a novel global dataset generated using an NN-based framework. By training on 2C-ICE active remote sensing data and applying the trained model to the MWHS-I/II records from the entire FY-3 family, FYAI provides a seamless 15-year record (2010–2024) of both Level-2 (L2) orbital and Level-3 (L3) monthly gridded IWP. A unique feature of FYAI, achieved by integrating the 2B-CLDCLASS product, is its ability to provide a separate product specifically for suspended IWP (SIWP), distinguishing it from falling ice. This distinction offers additional observational constraints for climate models. FYAI combines all-sky capability, dense spatial coverage, and the first-ever inclusion of dawn-dusk microwave observations, offering new insights into the global atmospheric ice content.

2 Data

2.1 Input data

The primary passive microwave instruments utilized in this study are the MWHS-I and MWHS-II, onboard China's second-generation polar-orbiting FY-3 series meteorological satellites. The MWHS-I is carried on the initial batch of these satellites (FY-3A and FY-3B). The MWHS-II represents a significant upgrade and was deployed in two successive batches: the first batch aboard the second satellite group (FY-3C, FY-3D), and the second batch aboard the third group (FY-3E, FY-3F). It expands the channel count from 5 to 15, adding new oxygen absorption channels near 118.75 GHz and a window channel at 89 GHz (Wang et al., 2024). Both MWHS-I and MWHS-II operate as cross-track scanners. The MWHS-I offers a nadir resolution of approximately 15 km across all its channels. For the MWHS-II, all channels also have a nadir resolution of about 15 km, with the exception of the 89 and 118 GHz channels, which have a coarser nadir resolution of approximately 25 km. Detailed channel specifications, instrument parameters, and the data temporal coverage for each satellite are provided in Tables S1–S4 in the Supplement.

For input into our retrieval model, we selected not only the Level-1 (L1) brightness temperature data from these instruments but also a suite of auxiliary geographical and geometric parameters. These additional features include the Digital Elevation Model (DEM), solar zenith angle, satellite zenith angle, and land-sea mask, among others. A comprehensive list of all input variables is presented in Table 1.

Table 1. All input variables.

2.2 Reference data

2.2.1 2C-ICE

The CloudSat and CALIPSO ice cloud property product (2C-ICE) is developed by synergistically integrating measurements from the CloudSat Cloud Profiling Radar (CPR) and the CALIPSO CALIOP lidar. Specifically, it utilizes CPR radar reflectivity (from the 2B-GEOPROF dataset) alongside CALIOP attenuated backscatter at 532 nm. By combining the penetration capability of the radar with the high sensitivity of the lidar to tenuous ice, this joint approach effectively overcomes the limitations of single-instrument retrievals, yielding IWC estimates with enhanced accuracy (Deng et al., 2010). The base CPR data provides vertical profiles at a 240 m resolution with a 1.4 km × 1.8 km footprint. In this work, the 2C-ICE product serves as the IWP reference.

2.2.2 2B-CLDCLASS

The 2B-CLDCLASS product, based on CloudSat CPR observations, utilizes a multidimensional approach to categorize clouds with high precision. The classification framework integrates key parameters, including hydrometeor dimensions (vertical/horizontal scales) and the maximum radar reflectivity factor (Ze), alongside crucial ancillary data such as precipitation flags and ECMWF temperature profiles, which aid in phase determination (Sassen and Wang, 2008). Although the product also supports robust cloud climatology studies, in this work it is employed specifically to distinguish and extract the SIWP component from the IWP.

2.3 Validation data

To ensure comprehensive evaluation, multiple validation datasets are utilized alongside 2C-ICE. These include satellite-derived retrievals from active and passive remote sensing instruments, as well as independent reanalysis products.

2.3.1 DARDAR (raDAR/liDAR) IWP

DARDAR (raDAR/liDAR) is a synergistic ice-cloud retrieval that combines CloudSat radar and CALIPSO lidar measurements within a variational framework to yield profiles of extinction coefficient, ice water content, and effective radius (Re) (Delanoë and Hogan, 2008, 2010; Hogan et al., 2006). The algorithm adopts the “unified” particle-size distribution of Field et al. (2005) and employs in-situ-derived mass–dimension and area–dimension relations for non-spherical ice particles (Brown and Francis, 1995; Li et al., 2012).

2.3.2 CCIC IWP

The Chalmers Cloud Ice Climatology (CCIC) is a long-term climate data record of global total ice water path (TIWP). It is generated by a deep learning model using geostationary satellite infrared window channel observations and provides continuous, all-sky (day and night) TIWP estimates from 1983 to the present within 70° S–70° N. These estimates have been shown to agree well with in-situ and active radar observations (Amell et al., 2024; Pfreundschuh et al., 2025).

2.3.3 MODIS and VIIRS IWP

This study utilizes operational IWP data derived from MODIS and VIIRS instruments, obtained through the CERES SSF1deg product suite.

The IWP is retrieved via a bispectral algorithm from imager radiances and represents the total column ice mass. The native high-resolution retrievals are aggregated to CERES footprints and subsequently averaged onto a 1° global grid. Daily and monthly means are generated after temporal interpolation of instantaneous values (Platnick et al., 2017).

2.3.4 ERA5 IWP

ERA5 is the fifth-generation global atmospheric reanalysis from the European Centre for Medium-Range Weather Forecasts (ECMWF). It provides globally complete, hourly estimates of atmospheric variables from 1940 onward at a horizontal resolution of 0.25°. The dataset is produced using a fixed version of the ECMWF's Integrated Forecasting System (CY41R2) and a 4D-Var assimilation system, which incorporates over 200 diverse observation sources to ensure physical consistency (Hersbach et al., 2020). In this study, the ERA5 variable “Total column cloud ice water” is used as SIWP, while the sum of “Total column cloud ice water” and “Total column snow water” represents the total IWP.

3 Methodology

3.1 Preprocessing

Quality control

To ensure data reliability, rigorous quality control was applied based on the L1 product flags. For MWHS-II, we selected data points satisfying QA_Scan_Flag = 0, QA_Ch_Flag = 0, and QA_Score ≥ 90. For MWHS-I, we required cal_qc, pixel_qc, and scnlin_qc to all equal 0. Similarly, 2C-ICE data were filtered to exclude points where Data_quality was non-zero.
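
The screening above can be sketched as a pair of boolean masks. This is a minimal illustration rather than the processing code used for FYAI: the function names are hypothetical, the flag arrays are assumed to have already been read from the L1 files and broadcast to a common shape, and treating a QA_Score of 90 or above as passing is an assumption about how the threshold is applied.

```python
import numpy as np


def mwhs2_qc_mask(qa_scan_flag, qa_ch_flag, qa_score, min_score=90):
    """Boolean mask of MWHS-II samples that pass the L1 quality screening.

    Assumes all three inputs share one shape and that scores at or above
    `min_score` are acceptable (an assumption, see the lead-in).
    """
    qa_scan_flag = np.asarray(qa_scan_flag)
    qa_ch_flag = np.asarray(qa_ch_flag)
    qa_score = np.asarray(qa_score)
    return (qa_scan_flag == 0) & (qa_ch_flag == 0) & (qa_score >= min_score)


def mwhs1_qc_mask(cal_qc, pixel_qc, scnlin_qc):
    """Boolean mask of MWHS-I samples whose three QC flags are all zero."""
    cal_qc = np.asarray(cal_qc)
    pixel_qc = np.asarray(pixel_qc)
    scnlin_qc = np.asarray(scnlin_qc)
    return (cal_qc == 0) & (pixel_qc == 0) & (scnlin_qc == 0)
```

A mask of this kind would typically be applied to the brightness temperature arrays before collocation, so that rejected samples never enter the training set.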

3.2 Collocations

To train the ML model, passive microwave observations must be collocated with reference data in space and time. FY-3D and CloudSat are both satellites in afternoon orbits. FY-3D crosses the equator at approximately 14:00 LT (local time), while CloudSat crosses at 13:30 LT. Due to CloudSat's orbital drift during operation, the time difference between it and FY-3D is mostly within 15 min. Consequently, temporal matching is straightforward, and a 15 min time window was selected to ensure a sufficient number of collocations. Furthermore, given that the typical cloud lifetime is on the order of minutes to hours, this 15 min interval falls within the physical timescale where cloud features remain relatively stable, rendering these non-strictly synchronous observations scientifically valuable (Holl et al., 2010).

Spatially, matching is more complex because MWHS-II has a coarser resolution than 2C-ICE, resulting in multiple 2C-ICE pixels falling within a single MWHS-II field of view (FOV). Based on previous studies (Holl et al., 2010; Wang et al., 2022), two criteria were initially adopted to ensure sufficient representativeness and homogeneity of the 2C-ICE pixels within each MWHS-II FOV: (1) at least nine 2C-ICE pixels must lie within a 7.5 km radius of the MWHS-II FOV center, and (2) the coefficient of variation (standard deviation divided by the mean) of these 2C-ICE pixels must be less than 0.6.

However, two critical limitations regarding this spatial matching approach must be acknowledged. First, using a fixed 7.5 km distance threshold is imprecise because MWHS-II spatial resolution varies by frequency: approximately 15 km at 150/183 GHz, but 25 km at 89/118 GHz. Since channels near 118 GHz are not included in our model input, only the 89 GHz channel differs in resolution from the others. Although the 89 GHz channel has a coarser resolution (25 km) and is crucial for IWP retrieval (Wang et al., 2024), we prioritized the matching accuracy for the 183 GHz channels (15 km), which constitute the majority of the input features. Therefore, the 7.5 km threshold is a compromise to ensure the highest fidelity for the sounding channels, despite the partial spatial mismatch at 89 GHz. Second, MWHS instruments are cross-track scanners, meaning their spatial resolution degrades as the scan angle increases away from nadir (Fig. S1 in the Supplement). The stated resolutions of 15/25 km represent the nadir resolution (the theoretical maximum). This further indicates that using a fixed 7.5 km threshold across the entire swath is not entirely accurate. While we plan to introduce a scan-angle-dependent variable threshold in future updates, the fixed 7.5 km threshold was retained in the current version to maintain algorithmic simplicity and consistency across the swath matched with the nadir resolution baseline.
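
The matching rules described above (a 15 min window, at least nine 2C-ICE pixels within 7.5 km of the FOV center, and a coefficient of variation below 0.6) can be sketched for a single FOV as follows. This is an illustrative sketch under stated assumptions, not the collocation code used for FYAI: the function names are hypothetical, times are assumed to be in seconds, the FOV reference is taken as the mean of the enclosed pixels, the population standard deviation is used, and a neighbourhood whose mean is exactly zero is treated as homogeneous.

```python
import numpy as np

EARTH_RADIUS_KM = 6371.0


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in km between points given in degrees."""
    lat1, lon1, lat2, lon2 = map(np.radians, (lat1, lon1, lat2, lon2))
    a = (np.sin((lat2 - lat1) / 2.0) ** 2
         + np.cos(lat1) * np.cos(lat2) * np.sin((lon2 - lon1) / 2.0) ** 2)
    return 2.0 * EARTH_RADIUS_KM * np.arcsin(np.sqrt(a))


def collocate_fov(fov_lat, fov_lon, fov_time, ref_lat, ref_lon, ref_time,
                  ref_iwp, radius_km=7.5, max_dt_s=900.0, min_pixels=9,
                  max_cv=0.6):
    """Return the mean reference IWP for one FOV, or None if it is rejected."""
    ref_lat = np.asarray(ref_lat, dtype=float)
    ref_lon = np.asarray(ref_lon, dtype=float)
    ref_time = np.asarray(ref_time, dtype=float)
    ref_iwp = np.asarray(ref_iwp, dtype=float)

    dist = haversine_km(fov_lat, fov_lon, ref_lat, ref_lon)
    close = (dist <= radius_km) & (np.abs(ref_time - fov_time) <= max_dt_s)
    values = ref_iwp[close]

    # Criterion 1: enough reference pixels inside the search radius.
    if values.size < min_pixels:
        return None

    mean = values.mean()
    if mean == 0.0:
        # Assumption: an all-zero neighbourhood counts as homogeneous.
        return 0.0

    # Criterion 2: coefficient of variation must stay below the limit.
    if values.std() / mean >= max_cv:
        return None
    return float(mean)
```

A scan-angle-dependent threshold of the kind mentioned for future updates would amount to passing a different `radius_km` for each FOV.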

Ultimately, using FY-3D data from October 2018 to October 2020, we generated a dataset containing 2 667 945 matched points. For the MWHS-I instrument, FY-3B is also an afternoon satellite with an ascending node local time of 13:40 LT. We thus used its data from December 2010 to April 2011 and matched them with corresponding 2C-ICE data following the same criteria applied for MWHS-II. This process yielded 426 761 matched points. Both the MWHS-I and MWHS-II datasets were then split into training and testing sets. Subsequently, the training set was further divided, with 80 % used for model training and the remaining 20 % reserved for validation.

The calibration process for the SIWP training dataset followed an approach similar to that used for the IWP dataset. Based on the FLAG methodology described by Li et al. (2012), we isolated the suspended component of the ice water path. This involved applying strict filtering criteria: all retrievals identified as surface precipitation were discarded. Furthermore, to minimize convective influence, we excluded data points classified as “deep convection” or “cumulus” according to the 2B-CLDCLASS product. Similarly, the final dataset consisted of 2 667 945 matched points for MWHS-II and 426 761 matched points for MWHS-I.

3.3 Postprocessing

The L2 IWP product maintains a native spatial resolution of nominal 15 km at nadir. To support climatological analysis, we generate monthly L3 products on a uniform 1°×1° global grid. This is achieved by resampling and averaging all available L2 data points within each grid cell for each calendar month.
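
The aggregation step can be sketched as binning L2 pixel centers into 1°×1° cells and averaging per cell. This is a minimal sketch, assuming an unweighted arithmetic mean of all valid pixels in a month; the function name is hypothetical and the actual resampling used for FYAI may differ in detail.

```python
import numpy as np


def grid_monthly_mean(lat, lon, iwp, res_deg=1.0):
    """Average L2 samples onto a regular global grid.

    Returns (mean, count): `mean` is NaN where a cell received no samples.
    Row 0 is the southernmost band; column 0 starts at 180 degrees west.
    """
    lat = np.asarray(lat, dtype=float)
    lon = np.asarray(lon, dtype=float)
    iwp = np.asarray(iwp, dtype=float)
    n_lat = int(round(180.0 / res_deg))
    n_lon = int(round(360.0 / res_deg))

    valid = np.isfinite(lat) & np.isfinite(lon) & np.isfinite(iwp)
    i = np.floor((lat[valid] + 90.0) / res_deg).astype(int)
    j = np.floor(((lon[valid] + 180.0) % 360.0) / res_deg).astype(int)
    i = np.clip(i, 0, n_lat - 1)  # keeps lat = +90 inside the last row
    j = np.clip(j, 0, n_lon - 1)

    total = np.zeros((n_lat, n_lon))
    count = np.zeros((n_lat, n_lon), dtype=int)
    np.add.at(total, (i, j), iwp[valid])  # unbuffered add handles repeats
    np.add.at(count, (i, j), 1)

    mean = np.full((n_lat, n_lon), np.nan)
    filled = count > 0
    mean[filled] = total[filled] / count[filled]
    return mean, count
```

Returning the per-cell sample count alongside the mean makes it straightforward to flag sparsely sampled cells or months.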

3.4 IWP retrieval algorithm

To retrieve IWP from passive microwave remote sensing observations, we developed an NN-based model built upon the framework of quantile regression neural networks (QRNNs). QRNNs combine the non-linear representation learning capabilities of neural networks with the statistical framework of quantile regression. Unlike traditional regression models that estimate only the conditional mean of a response variable, QRNNs are designed to estimate multiple conditional quantiles of the target distribution simultaneously. This approach provides a comprehensive probabilistic view of the prediction, quantifying the aleatoric uncertainty inherent in the data, which is particularly valuable in remote sensing retrievals where robust uncertainty assessment is crucial. Previous studies have demonstrated QRNNs to be a high-performance and readily deployable model in this field (Amell et al., 2022; Pfreundschuh et al., 2018; Wang et al., 2024). Furthermore, to enhance model performance, we implemented a deep residual network architecture combined with attention mechanisms (He et al., 2016; Vaswani et al., 2017). This design allows the model to automatically focus on the most critical feature channels in the input satellite data while maintaining high training stability. To enable the prediction of this uncertainty range, our model employs the quantile loss, also known as the pinball loss, instead of the traditional mean squared error (MSE) loss function. The quantile loss is expressed as follows:

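$$L_{\tau}(x_{\tau}, x) = \begin{cases} \tau\,|x - x_{\tau}|, & x_{\tau} < x \\ (1 - \tau)\,|x - x_{\tau}|, & \text{otherwise} \end{cases} \tag{1}$$

$$x_{\tau} = \inf\{x : F(x) \geq \tau\} \tag{2}$$

$$L(x) = \frac{1}{N}\sum_{i=0}^{N} L_{\tau_i}(\hat{x}_i, x). \tag{3}$$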
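
The quantile (pinball) loss described above can be illustrated in a few lines of NumPy. This is a sketch for intuition rather than the training code used for FYAI: the function names are hypothetical, and the aggregate is taken as a plain average over all samples and quantile levels.

```python
import numpy as np


def pinball_loss(x_tau, x, tau):
    """Quantile loss for a predicted quantile x_tau and a true value x.

    Under-prediction (x_tau < x) is weighted by tau, over-prediction by
    (1 - tau), so the minimizer is the tau-th conditional quantile.
    """
    x_tau = np.asarray(x_tau, dtype=float)
    x = np.asarray(x, dtype=float)
    diff = np.abs(x - x_tau)
    return np.where(x_tau < x, tau * diff, (1.0 - tau) * diff)


def mean_quantile_loss(x_hat, x, taus):
    """Average pinball loss over samples and quantile levels.

    x_hat: (n_samples, n_quantiles) predicted quantiles
    x:     (n_samples,) reference values
    taus:  (n_quantiles,) quantile levels in (0, 1)
    """
    x_hat = np.asarray(x_hat, dtype=float)
    x = np.asarray(x, dtype=float)[:, None]
    taus = np.asarray(taus, dtype=float)[None, :]
    return float(np.mean(pinball_loss(x_hat, x, taus)))
```

For a high quantile level such as 0.9, predicting too low is penalized nine times more heavily than predicting too high by the same amount, which is what pushes each output toward its target quantile.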

Based on the fundamental assumption in deep learning that the training set, test set, and inference data are independent and identically distributed (i.i.d.), we calibrated our point estimation strategy using the test set statistics. Specifically, the deterministic point estimate was defined as the quantile associated with the mode of the optimal quantile distribution, calculated using 50 bins on the test set. Consequently, the optimal quantile was determined to be 47.87 % for the MWHS-I model and 40 % for the MWHS-II model. Additionally, the 5th and 95th percentiles were employed to define the uncertainty bounds for the IWP estimates. The matched dataset is partitioned into training and validation subsets. Prior to model training, the IWP reference values within the training set are log-transformed. To handle zero values in this transformation, they are replaced with a small positive value of 1×10⁻⁶. Analogous procedures were applied to the SIWP retrieval model. The specific structure of the model is shown in Fig. 1, and the detailed hyperparameters are listed in Table S5.
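
The target transformation and the extraction of a point estimate with its uncertainty bounds can be sketched as below. Several details here are assumptions made for illustration: the base of the logarithm is not stated in the text (base 10 is used), the function names are hypothetical, and linear interpolation between the predicted quantile levels is only one plausible way to read off a quantile such as 40 % or 47.87 %.

```python
import numpy as np

ZERO_FLOOR = 1e-6  # replacement for zero IWP before taking the logarithm


def log_transform_iwp(iwp):
    """Log-transform reference IWP, replacing exact zeros by ZERO_FLOOR."""
    iwp = np.asarray(iwp, dtype=float)
    return np.log10(np.where(iwp == 0.0, ZERO_FLOOR, iwp))


def inverse_log_transform(log_iwp):
    """Map model outputs in log space back to physical units."""
    return 10.0 ** np.asarray(log_iwp, dtype=float)


def point_and_bounds(quantile_levels, predicted_quantiles, tau_point):
    """Return (point estimate, 5th percentile, 95th percentile).

    quantile_levels must be increasing; values between the predicted
    levels are obtained by linear interpolation (an assumption).
    """
    levels = np.asarray(quantile_levels, dtype=float)
    preds = np.asarray(predicted_quantiles, dtype=float)
    point = float(np.interp(tau_point, levels, preds))
    lower = float(np.interp(0.05, levels, preds))
    upper = float(np.interp(0.95, levels, preds))
    return point, lower, upper
```

In use, `tau_point` would be 0.40 for the MWHS-II model and 0.4787 for the MWHS-I model, following the optimal quantiles reported above.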

https://essd.copernicus.org/articles/18/1287/2026/essd-18-1287-2026-f01

Figure 1. Structural diagram of the QRNN model and flowchart of the retrieval algorithm.

3.5 Evaluation metrics

The performance of the QRNN model in retrieving IWP is evaluated via the root mean square error (RMSE) and Pearson correlation coefficient (R), which are calculated as follows:

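$$\mathrm{RMSE} = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(y_{\mathrm{pred},i} - y_{\mathrm{ref},i}\right)^{2}} \tag{4}$$

$$R = \frac{\frac{1}{N}\sum_{i=1}^{N}\left(y_{\mathrm{pred},i} - \bar{y}_{\mathrm{pred}}\right)\left(y_{\mathrm{ref},i} - \bar{y}_{\mathrm{ref}}\right)}{\sigma_{\mathrm{pred}}\,\sigma_{\mathrm{ref}}}. \tag{5}$$

Here, $y_{\mathrm{pred}}$ and $y_{\mathrm{ref}}$ represent the model predictions and reference values, respectively, overbars denote their means, and $\sigma_{\mathrm{pred}}$ and $\sigma_{\mathrm{ref}}$ are the corresponding standard deviations.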
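
Both metrics are straightforward to compute with NumPy. The sketch below uses hypothetical function names and the population standard deviation, which matches the 1/N normalization of the covariance term.

```python
import numpy as np


def rmse(y_pred, y_ref):
    """Root mean square error between predictions and reference values."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_ref = np.asarray(y_ref, dtype=float)
    return float(np.sqrt(np.mean((y_pred - y_ref) ** 2)))


def pearson_r(y_pred, y_ref):
    """Pearson correlation coefficient with 1/N normalization throughout."""
    y_pred = np.asarray(y_pred, dtype=float)
    y_ref = np.asarray(y_ref, dtype=float)
    cov = np.mean((y_pred - y_pred.mean()) * (y_ref - y_ref.mean()))
    return float(cov / (y_pred.std() * y_ref.std()))
```

Because the same normalization appears in the numerator and the denominator, the result agrees with `np.corrcoef` up to floating-point error.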

For detection of the low-IWP regime, performance is evaluated via a confusion matrix M, with metrics including the false alarm ratio (FAR) and the critical success index (CSI), defined as:

$$M = \begin{pmatrix} \mathrm{TP} & \mathrm{FP} \\ \mathrm{FN} & \mathrm{TN} \end{pmatrix}. \tag{6}$$

True positives (TP) correspond to cases where both MWHS-I/II and CloudSat detect a low-IWP regime, whereas true negatives (TN) occur when neither of them identifies such a regime. False positives (FP) arise when MWHS-I/II detects a low-IWP regime that CloudSat does not confirm, and false negatives (FN) occur when CloudSat identifies a low-IWP regime that MWHS-I/II fails to detect.

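$$\mathrm{FAR} = \frac{\mathrm{FP}}{\mathrm{TP} + \mathrm{FP}} \tag{7}$$

$$\mathrm{CSI} = \frac{\mathrm{TP}}{\mathrm{TP} + \mathrm{FN} + \mathrm{FP}} \tag{8}$$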
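
The confusion-matrix entries and the two scores can be computed as in the sketch below. It is illustrative only: the function names are hypothetical, and the positive class is taken to be samples whose IWP exceeds the threshold, which is an assumption about the labeling convention rather than a statement of how the evaluation was implemented.

```python
import numpy as np


def confusion_counts(pred, ref, threshold=0.5):
    """Return (TP, FP, FN, TN) for a thresholded detection problem.

    Assumption: a sample is "positive" when its IWP exceeds `threshold`.
    """
    pred_pos = np.asarray(pred, dtype=float) > threshold
    ref_pos = np.asarray(ref, dtype=float) > threshold
    tp = int(np.sum(pred_pos & ref_pos))
    fp = int(np.sum(pred_pos & ~ref_pos))
    fn = int(np.sum(~pred_pos & ref_pos))
    tn = int(np.sum(~pred_pos & ~ref_pos))
    return tp, fp, fn, tn


def false_alarm_ratio(tp, fp):
    """FAR = FP / (TP + FP); NaN when nothing was flagged as positive."""
    return fp / (tp + fp) if (tp + fp) > 0 else float("nan")


def critical_success_index(tp, fp, fn):
    """CSI = TP / (TP + FN + FP); NaN when there are no events at all."""
    denom = tp + fn + fp
    return tp / denom if denom > 0 else float("nan")
```

Neither score uses TN, so both remain informative when one class dominates the sample, which is why they are preferred over plain accuracy for this kind of detection check.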
https://essd.copernicus.org/articles/18/1287/2026/essd-18-1287-2026-f02

Figure 2. Schematic of the data file structure: (a) L2 data file structure; (b) L3 data file structure.

4 Data records

We generated L2 (15 km resolution) and monthly L3 (1°×1° grid) IWP and SIWP products using MWHS data from the FY-3 series (2010–2024). L2 files follow the naming convention “FY3X_MWHSX_GBAL_L2…”, while L3 files are named “FY3X_L3_Gridded…”. Notably, L3 products for FY-3E/F further distinguish between ascending and descending orbits. Table 2 details variable specifications, and Fig. 2 visualizes the internal data structure.

Table 2. Data variables in FYAI L2 and L3 products.

Figure 3 shows the monthly count of FY-3 L1 data inputs to the model. Due to operational anomalies, hardware upgrades, and other mission-related factors, data availability dropped below 50 % in certain months. The 50 % data-availability criterion is not meant as a benchmark for climate-grade accuracy; whether it suffices depends on the study's objectives and the natural variability of the target region (Bertrand et al., 2024; Kotarba et al., 2021). Nevertheless, we recommend that users exercise caution when utilizing data from months where availability falls below 50 %.

https://essd.copernicus.org/articles/18/1287/2026/essd-18-1287-2026-f03

Figure 3. MWHS-I and MWHS-II L1 data availability onboard the FY-3 series satellites.

5 IWP retrieval performance

It is important to acknowledge that since the QRNN model was trained and tested based on the 2C-ICE dataset, it inevitably inherits the systematic biases of the 2C-ICE product. Previous studies have indicated that assumptions regarding the lidar ratio, particle size distribution (PSD), and particle shape in the 2C-ICE retrieval algorithm introduce systematic uncertainties. Comparisons with in-situ observations suggest an uncertainty of approximately 30 % in 2C-ICE retrieved IWC (Deng et al., 2010, 2013).

Figure 4 compares the IWP retrieval performance of the two satellite sensors. In terms of quantitative regression metrics, the model performs considerably better on FY-3D than on FY-3B. Specifically, the scatter plot for FY-3D (Fig. 4a) shows a high consistency between predicted and reference values, with a correlation coefficient (R) of 0.833 and an RMSE of 450.78 g m−2. In contrast, the scatter distribution for FY-3B (Fig. 4d) is more dispersed, yielding a lower R of 0.620 and a larger RMSE (871.40 g m−2). This disparity highlights the substantial contribution of the rich channel information provided by MWHS-II to the quantitative retrieval of IWP.

https://essd.copernicus.org/articles/18/1287/2026/essd-18-1287-2026-f04

Figure 4. Performance metrics of the QRNN model on the IWP test dataset. (a) scatter plot of mode-retrieved IWP values versus reference values on MWHS-II; (b) Q–Q plot of predicted values versus reference values on MWHS-II; (c) confusion matrix for MWHS-II using an IWP threshold of 0.5 g m−2; (d) analogous to (a) but for MWHS-I; (e) analogous to (b) but for MWHS-I; (f) analogous to (c) but for MWHS-I.

Regarding statistical distribution, we analyzed both the Quantile–Quantile (Q–Q) plots (Fig. 4b and e) and the Probability Density Functions (PDFs, Fig. 5) based on an independent test dataset. As shown in the PDF analysis, the retrieved IWP distribution exhibits remarkable agreement with the reference distribution across nearly six orders of magnitude (ranging from 10⁻² to 10⁴ g m−2). This confirms that the model successfully reproduces the climatological statistics without suffering from significant mean-reversion. Both the PDFs and Q–Q plots indicate that the model robustly captures the data distribution characteristics. Critically, given the global mean IWP of approximately 100 g m−2 (Xu et al., 2022), the model maintains robust performance across predominant atmospheric conditions. However, deviations are observed in the extremely low-value region in the Q–Q plots. This is likely attributable to the inherent physical limitations of passive microwave remote sensing, which is sensitive to large scatterers (e.g., snowflakes) but lacks sensitivity to small ice crystals.

https://essd.copernicus.org/articles/18/1287/2026/essd-18-1287-2026-f05

Figure 5. PDFs of IWP for the training dataset, testing dataset, and model retrievals. (a) FY-3D (MWHS-II model); (b) FY-3B (MWHS-I model). The histograms are calculated using logarithmically spaced bins to capture the wide dynamic range.

To further investigate model performance in the low-IWP value range, we performed a binary classification assessment on the test set using a threshold of 0.5 g m−2. The results (Fig. 4c and f) reveal distinct characteristics for the two sensors. Although FY-3D achieves higher quantitative retrieval accuracy, its confusion matrix (Fig. 4c) indicates a relatively high False Alarm Ratio (FAR = 0.76) and a lower Critical Success Index (CSI = 0.23). This is primarily due to a large number of background pixels (low values) being misclassified as exceeding the threshold (FP = 257 474). Conversely, while FY-3B (Fig. 4f) has lower regression accuracy, it exhibits a better balance in classification metrics, with a lower FAR (0.51) and a relatively higher CSI (0.48). While this difference may be partially influenced by the varying sample sizes in the test sets, it suggests that the FY-3D model, while accurate in estimating IWP magnitude, tends to be over-sensitive at the boundary between weak signals and background noise.

The performance analysis for SIWP yields similar conclusions to those for IWP and is detailed in the Supplement (Sect. S2, Fig. S2).

6 Product validation

6.1 Typhoon events

Figure 6 presents the FYAI L2 IWP retrievals, alongside IWP estimates from the 2C-ICE product, the CCIC dataset, and ERA5 reanalysis data, capturing the case of Tropical Cyclone CILIDA over the South Indian Ocean on 24 December 2018. The retrievals from both MWHS-I and MWHS-II effectively capture the spatial distribution of high-IWP regions within the cyclone's convective core, a feature that is also accurately characterized by the CCIC product. In contrast, while the ERA5 reanalysis dataset broadly reproduces the macroscopic structure of these high-IWP regions, it exhibits significantly lower spatial detail compared to the satellite retrieval products.

https://essd.copernicus.org/articles/18/1287/2026/essd-18-1287-2026-f06

Figure 6. Comparison of FYAI L2 IWPs from MWHS-I and MWHS-II retrievals, CCIC, 2C-ICE, and ERA5 in a case study of a tropical cyclone. UTC time is used. (a–e) Spatial distributions of IWP from MWHS-I, MWHS-II, CCIC, ERA5, and 2C-ICE, respectively. (f–i) Scatter plots of FYAI versus reference datasets: (f) MWHS-I vs. 2C-ICE, (g) MWHS-II vs. 2C-ICE, (h) MWHS-I vs. CCIC, and (i) MWHS-II vs. CCIC. The solid red line in (c) marks the collocation sampling track between CCIC and FYAI, along which data points are extracted to produce (h) and (i).

To further evaluate performance against the CCIC product and the narrow-swath 2C-ICE observations, we performed spatiotemporal collocation and generated scatter plots for quantitative analysis. As illustrated in the scatter plots, the retrievals from MWHS-II demonstrate a higher degree of agreement with both the CCIC and 2C-ICE benchmarks compared to MWHS-I. This indicates a substantial improvement in retrieval capability and performance for the second-generation instrument relative to its predecessor.

https://essd.copernicus.org/articles/18/1287/2026/essd-18-1287-2026-f07

Figure 7. Global average spatial distributions of the IWP compared with those of other satellite products and reanalysis products.

https://essd.copernicus.org/articles/18/1287/2026/essd-18-1287-2026-f08

Figure 8. Zonal mean IWP compared with other satellite products and the ERA5 reanalysis.

https://essd.copernicus.org/articles/18/1287/2026/essd-18-1287-2026-f09

Figure 9. Analogous to Fig. 7 but for SIWP.

6.2 Global gridded product comparison and zonal mean comparison

Figure 7 presents the multiyear average spatial distribution of the IWP, whereas Fig. 8 shows the zonal mean distribution of the IWP. All the IWP products were resampled to a spatial resolution of 1°×1°, and all exhibit fundamentally consistent spatial patterns. Notably, FYAI demonstrates closer alignment with active sensor products than with passive ones. However, compared to the 2C-ICE and DARDAR active remote sensing baselines, the IWP retrieved from MWHS-II shows a slight overestimation in the equatorial region. In contrast, the MWHS-I retrievals align more closely with active observations at these latitudes. Meanwhile, both MWHS-I and MWHS-II exhibit a notable underestimation in the mid-to-high latitudes of the Southern Hemisphere. Although the time series do not overlap, we selected the 2007–2010 period for active instrument comparison because of CloudSat's superior data completeness before 2011. This selection is necessitated by data constraints but remains scientifically justified, as both spatial patterns and total magnitudes show minimal variation in IWP sequences. Additionally, passive optical/infrared instruments (MODIS, VIIRS) and the ERA5 reanalysis significantly underestimate IWP at low-to-mid latitudes, whereas the MODIS and VIIRS retrieval products substantially overestimate it in polar regions. For the SIWP, the multiyear average spatial distribution and zonal mean are shown in Figs. 9 and 10; the overall distribution closely resembles that of IWP, but the values are lower in magnitude. Notably, the SIWP derived from FYAI MWHS-II shows a closer agreement with 2C-ICE.

https://essd.copernicus.org/articles/18/1287/2026/essd-18-1287-2026-f10

Figure 10. Analogous to Fig. 8 but for SIWP.

6.3 Long-term analysis of gridded products

Figure 11 presents the time series of global total atmospheric ice mass derived from our gridded retrieval products for the period of 2011–2024. For comparison, the orange and blue-green lines represent IWP data from 2C-ICE and DARDAR (another IWP product based on active remote sensing instruments; Delanoë and Hogan, 2008), respectively. Due to battery anomalies with CloudSat after 2011, which resulted in the loss of nighttime data, the time series for both 2C-ICE and DARDAR are restricted to the 2007–2010 period.

https://essd.copernicus.org/articles/18/1287/2026/essd-18-1287-2026-f11

Figure 11. Native time series of the monthly global average of total atmospheric ice content and comparison with other satellite products, along with the ERA5 reanalysis. All calculations of total atmospheric ice consider latitude area weighting.

https://essd.copernicus.org/articles/18/1287/2026/essd-18-1287-2026-f12

Figure 12. Analogous to Fig. 11 but for SIWP.

In terms of magnitude, our retrieval products align closely with 2C-ICE and DARDAR. In contrast, estimates from passive optical/infrared instruments (MODIS and VIIRS) and ERA5 reanalysis are significantly lower than the active radar-based baselines. Note that all mass calculations are area-weighted by latitude.
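The latitude area weighting mentioned above can be illustrated with a minimal sketch. It assumes a spherical Earth, a regular 1°×1° grid ordered from south to north, and IWP expressed in g m⁻²; the function names, the grid layout, and the unit are illustrative assumptions and are not taken from the FYAI code.

```python
import math

EARTH_RADIUS_M = 6.371e6  # mean Earth radius; spherical Earth is an assumption


def cell_area_m2(lat_south_deg, lat_north_deg, dlon_deg):
    """Exact area of a latitude-longitude cell on a sphere."""
    phi_s = math.radians(lat_south_deg)
    phi_n = math.radians(lat_north_deg)
    return (EARTH_RADIUS_M ** 2 * math.radians(dlon_deg)
            * (math.sin(phi_n) - math.sin(phi_s)))


def global_ice_mass_gt(iwp_grid, dlat_deg=1.0, dlon_deg=1.0):
    """Integrate gridded IWP (g m^-2, assumed unit) into a global mass in Gt.

    iwp_grid[i][j]: row i runs south to north starting at -90 degrees,
    column j runs over longitude. Cells holding None (no data) are skipped.
    """
    total_g = 0.0
    for i, row in enumerate(iwp_grid):
        lat_s = -90.0 + i * dlat_deg
        area = cell_area_m2(lat_s, lat_s + dlat_deg, dlon_deg)
        for value in row:
            if value is not None:
                total_g += value * area
    return total_g / 1e15  # 1 Gt = 1e15 g


# Hypothetical input: a uniform 100 g m^-2 field on a 180 x 360 grid.
uniform = [[100.0] * 360 for _ in range(180)]
mass = global_ice_mass_gt(uniform)
```

With this weighting, a uniform 100 g m⁻² field integrates to about 51 Gt, because the cell areas sum to the surface area of the sphere. Polar cells contribute far less than equatorial cells of the same angular size, which is why an unweighted grid average would overstate the influence of high latitudes.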

However, the time series reveals that the FYAI product exhibits larger interannual variability than the 2C-ICE baseline. This variability is not uniform over time; it is most pronounced during the FY-3B era. While variability decreases in the later period, the fluctuations in the early record likely reflect sensitivity differences inherent to the first-generation instrument. The mean global total atmospheric ice mass from our products for 2011–2024 is 57.62±2.32 Gt (mean ± 1 standard deviation (SD), based on a t-distribution; this also applies to the SIWP discussed below), which is consistent with our previous estimation using the DARDAR product (Xu et al., 2022).

Regarding SIWP, retrievals from both MWHS-I and MWHS-II align closely with ERA5 and exhibit strong consistency with the 2007–2010 2C-ICE baseline (Fig. 12). The estimated global suspended ice mass for the 2011–2024 period is 10.78±0.99 Gt.

7 Uncertainty analysis

Although the uncertainty in the 2C-ICE IWC is approximately 30 %, 2C-ICE remains one of the most reliable remote sensing IWP retrieval datasets currently available. As the FYAI dataset is generated using 2C-ICE as reference data for training ML models, it inevitably inherits uncertainty from 2C-ICE. This section outlines the uncertainty characterization for both FYAI L2 and L3 products.

7.1 L2 product uncertainty

The QRNN model employed in FYAI outputs an approximation of the quantile function (i.e., the inverse cumulative distribution function, or inverse CDF) of the conditional distribution. Consequently, the model implicitly represents a conditional probability distribution, allowing for the retrieval of specific percentiles of the estimated variable. We have selected the 5th and 95th percentiles of the predicted distribution to represent the lower and upper bounds of uncertainty, respectively.
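As a rough illustration of how percentile bounds are read from a predicted quantile function, the sketch below shows the quantile (pinball) loss that quantile regression networks are generally trained with, and a linear interpolation over a set of predicted quantiles. The quantile levels, the predicted values, and the function names are hypothetical; the actual FYAI network configuration is not reproduced here.

```python
def pinball_loss(y_true, y_pred, tau):
    """Quantile (pinball) loss for one sample at quantile level tau in (0, 1)."""
    diff = y_true - y_pred
    return tau * diff if diff >= 0 else (tau - 1.0) * diff


def interp_quantile(levels, values, tau):
    """Linearly interpolate a predicted quantile function at level tau.

    levels: increasing quantile levels; values: predicted quantiles at those levels.
    """
    if not levels[0] <= tau <= levels[-1]:
        raise ValueError("tau lies outside the predicted quantile range")
    for k in range(len(levels) - 1):
        l0, l1 = levels[k], levels[k + 1]
        if l0 <= tau <= l1:
            v0, v1 = values[k], values[k + 1]
            return v0 + (v1 - v0) * (tau - l0) / (l1 - l0)


# Hypothetical network output for one pixel (IWP in g m^-2, assumed unit).
levels = [0.05, 0.25, 0.50, 0.75, 0.95]
predicted = [12.0, 40.0, 85.0, 160.0, 410.0]

lower = interp_quantile(levels, predicted, 0.05)  # lower uncertainty bound
upper = interp_quantile(levels, predicted, 0.95)  # upper uncertainty bound
```

The asymmetry of the pinball loss is what drives each output toward its quantile: at a level of 0.95, underestimating the target is penalized 19 times more heavily than overestimating it by the same amount.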

7.2 L3 product uncertainty

The uncertainty of the FYAI L3 product is calculated in two distinct stages. The first stage defines the uncertainty when aggregating L2 instantaneous observations into L3 monthly mean products, using the standard error of the mean (SEM) as the metric. Based on the 5th/95th percentile bounds derived from the L2 products, and assuming errors follow a normal distribution, the variance for individual pixels is first estimated. Then, following the law of propagation of uncertainty (assuming independent errors among pixels within a grid cell), the variance of the grid mean is calculated as the sum of the individual variances divided by the square of the total number of observations falling within that grid cell. Finally, the square root of this variance is taken to obtain the monthly SEM.
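The first stage can be sketched as follows. Under the normality assumption, the 5th and 95th percentiles lie about 1.645 standard deviations on either side of the mean, so the per-pixel standard deviation follows from the width of the interval. The function names and the sample bounds are illustrative assumptions, not the authors' implementation.

```python
import math
from statistics import NormalDist

# 95th percentile of the standard normal distribution (about 1.645).
Z_95 = NormalDist().inv_cdf(0.95)


def sigma_from_bounds(q05, q95):
    """Per-pixel standard deviation implied by 5th/95th percentile bounds,
    assuming normally distributed errors."""
    return (q95 - q05) / (2.0 * Z_95)


def grid_cell_sem(bounds):
    """Monthly SEM of one grid cell from the (q05, q95) bounds of its L2 pixels.

    Assumes independent pixel errors: Var(mean) = sum(sigma_i^2) / N^2.
    """
    n = len(bounds)
    if n == 0:
        raise ValueError("grid cell contains no observations")
    variance_sum = sum(sigma_from_bounds(lo, hi) ** 2 for lo, hi in bounds)
    return math.sqrt(variance_sum) / n


# Hypothetical L2 bounds (g m^-2, assumed unit) for pixels in one cell.
pixel_bounds = [(20.0, 180.0), (35.0, 240.0), (10.0, 95.0), (50.0, 310.0)]
sem = grid_cell_sem(pixel_bounds)
```

Because the variance of the mean scales with 1/N² while the sum of variances scales with N, the SEM shrinks roughly as 1/√N for pixels of similar uncertainty, so densely sampled grid cells carry smaller monthly uncertainties.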

The second stage addresses the uncertainty when aggregating L3 monthly means into L3 annual means. To avoid underestimating the final uncertainty, a conservative estimation strategy is adopted: assuming highly correlated errors between months (e.g., potential systematic errors), the annual mean uncertainty is defined simply as the arithmetic mean of the uncertainties of the 12 months in that year.
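The effect of this conservative choice can be seen by comparing it with the independent-error alternative. For perfectly correlated errors, the standard deviation of a mean equals the arithmetic mean of the individual standard deviations, whereas independent errors combine in quadrature. The monthly values below are hypothetical.

```python
import math


def annual_uncertainty_correlated(monthly_sem):
    """Arithmetic mean of monthly uncertainties (fully correlated errors)."""
    return sum(monthly_sem) / len(monthly_sem)


def annual_uncertainty_independent(monthly_sem):
    """Quadrature combination (independent errors), shown for comparison only."""
    return math.sqrt(sum(s * s for s in monthly_sem)) / len(monthly_sem)


# Hypothetical monthly SEM values for one grid cell over one year.
monthly = [1.2] * 12
correlated = annual_uncertainty_correlated(monthly)
independent = annual_uncertainty_independent(monthly)
```

For twelve equal monthly uncertainties, the correlated estimate is larger than the independent one by a factor of √12 (about 3.5), which is the sense in which the adopted strategy avoids underestimating the annual uncertainty.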

Table 3. Summary of FYAI dataset components requiring cautionary usage or having specific limitations.

8 Code and data availability

The datasets generated in this study are available for download at https://doi.org/10.11888/Atmos.tpdc.303143 and https://cstr.cn/18406.11.Atmos.tpdc.303143, and should be cited as Yang et al. (2025). Additionally, the code and model weights have been deposited at https://doi.org/10.5281/zenodo.18479174 (Yang, 2026). Regarding the public source data used in this work, the FY-3 MWHS-I/II Level-1 observations are accessible via the National Satellite Meteorological Center (NSMC) data portal; the CloudSat-CALIPSO products (2C-ICE and 2B-CLDCLASS) can be obtained from the CloudSat Data Processing Center (DPC); the ERA5 reanalysis data are available via the Copernicus Climate Change Service (C3S) Climate Data Store under the dataset “ERA5 hourly data on single levels from 1940 to present”; and the CCIC product is hosted on the Amazon Web Services (AWS) Open Data Registry (Amell et al., 2024).

9 Conclusion and usage notes

A global IWP and SIWP dataset spanning 2010–2024 was derived from passive microwave observations (MWHS-I/II) onboard the FY-3 satellite series using an ML framework. Two distinct product levels were generated: (1) L2 IWP and SIWP preserving native sensor resolution (15 km at nadir); and (2) L3 monthly gridded global composites (1°×1°) for individual sensors.

Prioritizing global representativeness and long-term homogeneity over instantaneous pixel-level precision was a deliberate strategy in this study. While our passive microwave retrievals provide the wide-swath coverage essential for decadal climate analysis, they may not match the instantaneous accuracy of active sensors. We acknowledge a fundamental sensitivity gap: while 2C-ICE synergizes lidar and radar to capture the full spectrum of ice clouds, MWHS channels rely primarily on volume scattering from larger particles. Consequently, a detection “blind zone” exists for tenuous cirrus, leading to expected discrepancies in the low-IWP regime. Despite this frequency mismatch, 2C-ICE remains the optimal global benchmark for vertical structure. Our ML framework bridges this gap by capturing robust statistical mappings where sufficient scattering signals exist. Although the network effectively filters label noise – even under the spatial mismatch between the coarse MWHS footprint (∼15 km) and the narrow 2C-ICE track – it must be noted that reported error metrics likely underestimate uncertainty in highly heterogeneous scenes.

Specific limitations regarding variable definition and instrument stability must be acknowledged. First, the partition of SIWP from total IWP represents an exploratory effort. Since no single instrument currently distinguishes suspended from falling ice reliably, this separation serves primarily to facilitate model-observation comparisons. Second, regarding temporal stability, specific subsets of the FYAI dataset require cautionary usage (summarized in Table 3). The larger interannual variability observed in the FY-3B era reflects a necessary trade-off: lacking the 89 GHz channels available on MWHS-II, we incorporated the 150 GHz channel to ensure sensitivity to ice clouds (Wang et al., 2022). Unlike the opaque 183 GHz band, this window channel is susceptible to surface emissivity variations, introducing background noise into the time series – a stability issue largely resolved in the post-2014 MWHS-II era. Additionally, L3 products derived from FY-3B show anomalous positive deviations during 2017–2019, attributed to potential instrument aging. Conversely, FY-3A products (2010–2013) exhibit a slight underestimation. While FY-3A and FY-3B form a valuable morning-afternoon constellation, users should be aware of these calibration nuances when conducting long-term trend analyses. We are actively working to address these issues in future updates through physics-based constraints and close collaboration with instrument specialists.

Based on this methodology, we generated comprehensive retrieval products spanning FY-3A through FY-3F. A distinctive advancement of this dataset is its global applicability over both land and ocean – surpassing the ocean-only limitation of many existing passive microwave products.

Looking ahead, we will explore advanced data fusion architectures to address current limitations. Our future work will prioritize three key directions: (1) synergetic retrievals combining passive microwave with optical/infrared observations, utilizing cloud-top information to compensate for the microwave spectrum's insensitivity to cirrus clouds; (2) joint retrieval frameworks that simultaneously assimilate multispectral observations within a unified radiative transfer model; and (3) Physics-Informed Neural Networks (PINNs) that incorporate cloud microphysical constraints to enhance the accuracy of vertical stratification.

In particular, the deployment of next-generation observation missions, such as EarthCARE and DQ-1, will provide superior reference benchmarks. Integrating these high-fidelity datasets will allow us to mitigate label noise and further refine retrieval accuracy. Furthermore, recognizing the rapid advancements in terahertz remote sensing instrumentation (Li et al., 2023), we plan to leverage terahertz technology to achieve higher-precision retrievals of IWP and SIWP. Collectively, these enhancements will significantly bolster the product's utility for monitoring rapidly evolving meteorological phenomena and validating climate model cloud parameterizations.

Supplement

The supplement related to this article is available online at https://doi.org/10.5194/essd-18-1287-2026-supplement.

Author contributions

YFY conceived the main algorithm, produced the dataset, validated its accuracy, and drafted the manuscript. GJX and RZ also contributed to parts of the algorithm design. BL, LTHS, WYW, CDX, and TFD supervised data production and validation, and revised the manuscript.

Competing interests

The contact author has declared that none of the authors has any competing interests.

Disclaimer

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. The authors bear the ultimate responsibility for providing appropriate place names. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Acknowledgements

The authors acknowledge the National Satellite Meteorological Center (NSMC) and CloudSat Data Processing Center (DPC) for providing access to the satellite data utilized in this work. We also extend our sincere gratitude to the two anonymous reviewers for their insightful comments throughout the review process. In particular, we would like to thank Patrick Eriksson for his valuable suggestions during the interactive discussion phase.

Financial support

This research is supported by National Natural Science Foundation of China (grant no. 42222608).

Review statement

This paper was edited by Jing Wei and reviewed by Patrick Eriksson and two anonymous referees.

References

Amell, A., Eriksson, P., and Pfreundschuh, S.: Ice water path retrievals from Meteosat-9 using quantile regression neural networks, Atmos. Meas. Tech., 15, 5701–5717, https://doi.org/10.5194/amt-15-5701-2022, 2022. 

Amell, A., Pfreundschuh, S., and Eriksson, P.: The Chalmers Cloud Ice Climatology: retrieval implementation and validation, Atmos. Meas. Tech., 17, 4337–4368, https://doi.org/10.5194/amt-17-4337-2024, 2024. 

An, N., Shang, H., Lesi, W., Ri, X., Shi, C., Tana, G., Bao, Y., Zheng, Z., Xu, N., Chen, L., Zhang, P., Ye, L., and Letu, H.: A Cloud Detection Algorithm for Early Morning Observations From the FY-3E Satellite, IEEE T. Geosci. Remote, 61, 1–15, https://doi.org/10.1109/TGRS.2023.3304985, 2023. 

Bertrand, L., Kay, J. E., Haynes, J., and De Boer, G.: A global gridded dataset for cloud vertical structure from combined CloudSat and CALIPSO observations, Earth Syst. Sci. Data, 16, 1301–1316, https://doi.org/10.5194/essd-16-1301-2024, 2024. 

Brown, P. R. A. and Francis, P. N.: Improved Measurements of the Ice Water Content in Cirrus Using a Total-Water Probe, J. Atmos. Ocean. Tech., 12, 410–414, https://doi.org/10.1175/1520-0426(1995)012<0410:IMOTIW>2.0.CO;2, 1995. 

Delanoë, J. and Hogan, R. J.: A variational scheme for retrieving ice cloud properties from combined radar, lidar, and infrared radiometer, J. Geophys. Res.-Atmos., 113, 2007JD009000, https://doi.org/10.1029/2007JD009000, 2008. 

Delanoë, J. and Hogan, R. J.: Combined CloudSat-CALIPSO-MODIS retrievals of the properties of ice clouds, J. Geophys. Res.-Atmos., 115, 2009JD012346, https://doi.org/10.1029/2009JD012346, 2010. 

Deng, M., Mace, G. G., Wang, Z., and Okamoto, H.: Tropical Composition, Cloud and Climate Coupling Experiment validation for cirrus cloud profiling retrieval using CloudSat radar and CALIPSO lidar, J. Geophys. Res.-Atmos., 115, 2009JD013104, https://doi.org/10.1029/2009JD013104, 2010. 

Deng, M., Mace, G. G., Wang, Z., and Lawson, R. P.: Evaluation of Several A-Train Ice Cloud Retrieval Products with In Situ Measurements Collected during the SPARTICUS Campaign, J. Appl. Meteorol. Clim., 52, 1014–1030, https://doi.org/10.1175/JAMC-D-12-054.1, 2013. 

Eliasson, S., Buehler, S. A., Milz, M., Eriksson, P., and John, V. O.: Assessing observed and modelled spatial distributions of ice water path using satellite data, Atmos. Chem. Phys., 11, 375–391, https://doi.org/10.5194/acp-11-375-2011, 2011. 

Eriksson, P., Baró Pérez, A., Müller, N., Hallborn, H., May, E., Brath, M., Buehler, S. A., and Ickes, L.: Advancements and continued challenges in global modelling and observations of atmospheric ice masses, EGUsphere [preprint], https://doi.org/10.5194/egusphere-2025-4634, 2025. 

Evans, K. F. and Stephens, G. L.: Microwave Radiative Transfer through Clouds Composed of Realistically Shaped Ice Crystals. Part I. Single Scattering Properties, J. Atmos. Sci., 52, 2041–2057, https://doi.org/10.1175/1520-0469(1995)052<2041:MRTTCC>2.0.CO;2, 1995. 

Field, P. R., Hogan, R. J., Brown, P. R. A., Illingworth, A. J., Choularton, T. W., and Cotton, R. J.: Parametrization of ice-particle size distributions for mid-latitude stratiform cloud, Q. J. Roy. Meteorol. Soc., 131, 1997–2017, https://doi.org/10.1256/qj.04.134, 2005. 

Gultepe, I., Heymsfield, A. J., Field, P. R., and Axisa, D.: Ice-Phase Precipitation, Meteorol. Monogr., 58, 6.1–6.36, https://doi.org/10.1175/AMSMONOGRAPHS-D-16-0013.1, 2017. 

He, K., Zhang, X., Ren, S., and Sun, J.: Deep Residual Learning for Image Recognition, in: 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 770–778, https://doi.org/10.1109/CVPR.2016.90, 2016. 

Hersbach, H., Bell, B., Berrisford, P., Hirahara, S., Horányi, A., Muñoz-Sabater, J., Nicolas, J., Peubey, C., Radu, R., Schepers, D., Simmons, A., Soci, C., Abdalla, S., Abellan, X., Balsamo, G., Bechtold, P., Biavati, G., Bidlot, J., Bonavita, M., De Chiara, G., Dahlgren, P., Dee, D., Diamantakis, M., Dragani, R., Flemming, J., Forbes, R., Fuentes, M., Geer, A., Haimberger, L., Healy, S., Hogan, R. J., Hólm, E., Janisková, M., Keeley, S., Laloyaux, P., Lopez, P., Lupu, C., Radnoti, G., De Rosnay, P., Rozum, I., Vamborg, F., Villaume, S., and Thépaut, J.: The ERA5 global reanalysis, Q. J. Roy. Meteorol. Soc., 146, 1999–2049, https://doi.org/10.1002/qj.3803, 2020. 

Hogan, R. J., Mittermaier, M. P., and Illingworth, A. J.: The Retrieval of Ice Water Content from Radar Reflectivity Factor and Temperature and Its Use in Evaluating a Mesoscale Model, J. Appl. Meteorol. Clim., 45, 301–317, https://doi.org/10.1175/JAM2340.1, 2006. 

Holl, G., Buehler, S. A., Rydberg, B., and Jiménez, C.: Collocating satellite-based radar and radiometer measurements – methodology and usage examples, Atmos. Meas. Tech., 3, 693–708, https://doi.org/10.5194/amt-3-693-2010, 2010. 

Holl, G., Eliasson, S., Mendrok, J., and Buehler, S. A.: SPARE-ICE: Synergistic ice water path from passive operational sensors, J. Geophys. Res.-Atmos., 119, 1504–1523, https://doi.org/10.1002/2013JD020759, 2014. 

Hong, Y. and Liu, G.: The Characteristics of Ice Cloud Properties Derived from CloudSat and CALIPSO Measurements, J. Climate, 28, 3880–3901, https://doi.org/10.1175/JCLI-D-14-00666.1, 2015. 

IPCC: Climate Change 2021 – The Physical Science Basis: Working Group I Contribution to the Sixth Assessment Report of the Intergovernmental Panel on Climate Change, Cambridge University Press, Cambridge, https://doi.org/10.1017/9781009157896, 2023. 

Kotarba, A. Z. and Solecki, M.: Uncertainty Assessment of the Vertically-Resolved Cloud Amount for Joint CloudSat–CALIPSO Radar–Lidar Observations, Remote Sens., 13, 807, https://doi.org/10.3390/rs13040807, 2021.

Letu, H., Ishimoto, H., Riedi, J., Nakajima, T. Y., C.-Labonnote, L., Baran, A. J., Nagao, T. M., and Sekiguchi, M.: Investigation of ice particle habits to be used for ice cloud remote sensing for the GCOM-C satellite mission, Atmos. Chem. Phys., 16, 12287–12303, https://doi.org/10.5194/acp-16-12287-2016, 2016.

Letu, H., Yang, K., Nakajima, T. Y., Ishimoto, H., Nagao, T. M., Riedi, J., Baran, A. J., Ma, R., Wang, T., Shang, H., Khatri, P., Chen, L., Shi, C., and Shi, J.: High-resolution retrieval of cloud microphysical properties and surface solar radiation using Himawari-8/AHI next-generation geostationary satellite, Remote Sens. Environ., 239, 111583, https://doi.org/10.1016/j.rse.2019.111583, 2020. 

Li, J. -L. F., Waliser, D. E., Chen, W. -T., Guan, B., Kubar, T., Stephens, G., Ma, H. -Y., Deng, M., Donner, L., Seman, C., and Horowitz, L.: An observationally based evaluation of cloud ice water in CMIP3 and CMIP5 GCMs and contemporary reanalyses using contemporary satellite data, J. Geophys. Res.-Atmos., 117, 2012JD017640, https://doi.org/10.1029/2012JD017640, 2012. 

Li, M., Letu, H., Ishimoto, H., Li, S., Liu, L., Nakajima, T. Y., Ji, D., Shang, H., and Shi, C.: Retrieval of terahertz ice cloud properties from airborne measurements based on the irregularly shaped Voronoi ice scattering models, Atmos. Meas. Tech., 16, 331–353, https://doi.org/10.5194/amt-16-331-2023, 2023. 

Pfreundschuh, S., Eriksson, P., Duncan, D., Rydberg, B., Håkansson, N., and Thoss, A.: A neural network approach to estimating a posteriori distributions of Bayesian retrieval problems, Atmos. Meas. Tech., 11, 4627–4643, https://doi.org/10.5194/amt-11-4627-2018, 2018. 

Pfreundschuh, S., Kukulies, J., Amell, A., Hallborn, H., May, E., and Eriksson, P.: The Chalmers Cloud Ice Climatology: A Novel Robust Climate Record of Frozen Cloud Hydrometeor Concentrations, J. Geophys. Res.-Atmos., 130, e2024JD042618, https://doi.org/10.1029/2024JD042618, 2025. 

Platnick, S., Meyer, K. G., King, M. D., Wind, G., Amarasinghe, N., Marchant, B., Arnold, G. T., Zhang, Z., Hubanks, P. A., Holz, R. E., Yang, P., Ridgway, W. L., and Riedi, J.: The MODIS Cloud Optical and Microphysical Products: Collection 6 Updates and Examples From Terra and Aqua, IEEE T. Geosci. Remote, 55, 502–525, https://doi.org/10.1109/TGRS.2016.2610522, 2017. 

Sassen, K. and Wang, Z.: Classifying clouds around the globe with the CloudSat radar: 1-year of results, Geophys. Res. Lett., 35, 2007GL032591, https://doi.org/10.1029/2007GL032591, 2008. 

Tan, Z., Ma, S., Zhao, X., Yan, W., and Lu, W.: Evaluation of Cloud Top Height Retrievals from China's Next-Generation Geostationary Meteorological Satellite FY-4A, J. Meteorol. Res., 33, 553–562, https://doi.org/10.1007/s13351-019-8123-0, 2019. 

Tana, G., Lesi, W., Shang, H., Xu, J., Ji, D., Shi, J., Letu, H., and Shi, C.: A New Cloud Water Path Retrieval Method Based on Geostationary Satellite Infrared Measurements, IEEE T. Geosci. Remote, 63, 1–10, https://doi.org/10.1109/TGRS.2025.3526262, 2025. 

Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., Kaiser, Ł., and Polosukhin, I.: Attention is All you Need, Adv. Neural Inf. Process. Syst., 30, 5998–6008, 2017. 

Waliser, D. E., Li, J. F., Woods, C. P., Austin, R. T., Bacmeister, J., Chern, J., Del Genio, A., Jiang, J. H., Kuang, Z., Meng, H., Minnis, P., Platnick, S., Rossow, W. B., Stephens, G. L., Sun-Mack, S., Tao, W., Tompkins, A. M., Vane, D. G., Walker, C., and Wu, D.: Cloud ice: A climate model challenge with signs and expectations of progress, J. Geophys. Res.-Atmos., 114, 2008JD010015, https://doi.org/10.1029/2008JD010015, 2009. 

Wang, P. K.: Theoretical Studies on the Motions of Cloud and Precipitation Particles – A Review, Meteorology, 1, 288–310, https://doi.org/10.3390/meteorology1030019, 2022. 

Wang, W., Wang, Z., He, Q., and Zhang, L.: Retrieval of ice water path from the Microwave Humidity Sounder (MWHS) aboard FengYun-3B (FY-3B) satellite polarimetric measurements based on a deep neural network, Atmos. Meas. Tech., 15, 6489–6506, https://doi.org/10.5194/amt-15-6489-2022, 2022. 

Wang, W., Xu, J., Letu, H., Zhang, L., Wang, Z., and Shi, J.: A New Deep-Learning-Based Framework for Ice Water Path Retrieval From Microwave Humidity Sounder-II Aboard FengYun-3D Satellite, IEEE T. Geosci. Remote, 62, 1–14, https://doi.org/10.1109/TGRS.2024.3352654, 2024. 

Wu, D. L., Jiang, J. H., and Davis, C. P.: EOS MLS cloud ice measurements and cloudy-sky radiative transfer model, IEEE T. Geosci. Remote, 44, 1156–1165, https://doi.org/10.1109/TGRS.2006.869994, 2006. 

Wu, D. L., Jiang, J. H., Read, W. G., Austin, R. T., Davis, C. P., Lambert, A., Stephens, G. L., Vane, D. G., and Waters, J. W.: Validation of the Aura MLS cloud ice water content measurements, J. Geophys. Res.-Atmos., 113, 2007JD008931, https://doi.org/10.1029/2007JD008931, 2008. 

Wu, D. L., Gong, J., Deal, W. R., Gaines, W., Cooke, C. M., De Amici, G., Pantina, P., Liu, Y., Yang, P., Eriksson, P., and Bennartz, R.: Remote Sensing of Ice Cloud Properties With Millimeter and Submillimeter-Wave Polarimetry, IEEE J. Microw., 4, 847–857, https://doi.org/10.1109/JMW.2024.3487758, 2024. 

Xu, G., Dou, T., Yang, Y., Yue, H., Letu, H., Ma, L., and Xiao, C.: The total mass and spatio-temporal structure of the aerial cryosphere, Chin. Sci. Bull., 67, 4130–4139, https://doi.org/10.1360/TB-2022-0184, 2022. 

Yang, Y.: Yang Yifan/FYAI: A Fengyun Satellite-Based Dataset for Atmospheric Ice Water Path/code and training/testing set, Zenodo [code and data set], https://doi.org/10.5281/zenodo.18479174, 2026.  

Yang, Y., Dou, T., Zhou, R., Li, B., Husi, L., Wang, W., and Xiao, C.: Fengyun polar-orbiting satellite total/suspended ice water path retrieval dataset (2010–2024), National Tibetan Plateau/Third Pole Environment Data Center [data set], https://doi.org/10.11888/Atmos.tpdc.303143, 2025. 

Short summary
We developed "FYAI" (Fengyun Satellite-Based Dataset for Atmospheric Ice Water Path), a fifteen-year dataset (2010–2024) derived from Chinese Fengyun satellites. Using artificial intelligence, we mapped global atmospheric ice. This continuous record fills critical gaps in observation. It provides scientists with a vital tool to improve weather forecasts and better understand how atmospheric ice interacts with the global climate system.