09 Nov 2023
Status: this preprint is currently under review for the journal ESSD.

Characterizing clouds with the CCClim dataset, a machine learning cloud class climatology

Arndt Kaps, Axel Lauer, Rémi Kazeroni, Martin Stengel, and Veronika Eyring

Abstract. We present a new Cloud Class Climatology dataset (CCClim), quantifying the global distribution of established morphological cloud types over 35 years. CCClim combines active and passive sensor data with machine learning (ML) and provides a new opportunity for improving the understanding of clouds and their related processes. CCClim is based on cloud property retrievals from the European Space Agency's (ESA) Cloud_cci dataset, adding relative occurrences of eight major cloud types as defined by the World Meteorological Organization (WMO) at 1° resolution. The ML framework used to obtain the cloud types is trained on data from multiple satellites in the Afternoon Constellation (A-Train). Using multiple spaceborne sensors reduces the impact of single-sensor problems such as the difficulty passive sensors have in detecting thin cirrus or the small footprint of active sensors. We leverage this to generate sufficient labeled data to train supervised ML models. CCClim's nearly gapless global coverage from 1982 to 2016 allows process-oriented analyses of clouds on climatological time scales. Similarly, its moderate spatial and temporal resolutions make it a lightweight dataset while enabling straightforward comparison with climate models. CCClim creates multiple opportunities to study clouds, of which we sketch out a few examples. Along with the cloud type frequencies, CCClim contains the cloud properties used as inputs to the ML framework, so that all cloud types can be associated with relevant physical quantities. CCClim can also be combined with other datasets, such as reanalysis data, to assess the dynamical regime favoring the occurrence of a specific cloud type or its radiative effects. Additionally, we show an example of how to evaluate a global climate model by comparing CCClim with cloud types obtained by applying the same ML method used to create CCClim to output from the icosahedral nonhydrostatic atmosphere model (ICON-A).

CCClim can be accessed via the digital object identifier: 10.5281/zenodo.8369202 (Kaps et al., 2023b)
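As a minimal sketch of how relative cloud type frequencies on a 1° grid, as described in the abstract, might be condensed into a global mean, the example below applies cosine-latitude area weighting. This is a common convention for regular latitude-longitude grids and is not taken from the preprint. The function name, the grid layout, and the synthetic field are all hypothetical; no actual CCClim file or variable names are assumed.

```python
import math


def area_weighted_mean(field, lats):
    """Mean over a regular lat-lon grid, weighting each row by cos(latitude).

    field: list of rows (one per latitude); cells are floats or None for gaps.
    lats:  latitude of each row centre in degrees.
    """
    total = 0.0
    weight_sum = 0.0
    for row, lat in zip(field, lats):
        weight = math.cos(math.radians(lat))
        for value in row:
            if value is None:  # skip missing grid cells
                continue
            total += weight * value
            weight_sum += weight
    if weight_sum == 0.0:
        raise ValueError("no valid grid cells")
    return total / weight_sum


# Hypothetical 1-degree grid: 180 row centres (-89.5 ... 89.5), 360 columns.
lats = [-89.5 + i for i in range(180)]

# Synthetic relative frequency of occurrence for one placeholder cloud type:
# 0.4 within 30 degrees of the equator, 0.1 elsewhere (illustrative only).
toy_field = [[0.4 if abs(lat) < 30 else 0.1] * 360 for lat in lats]

global_mean = area_weighted_mean(toy_field, lats)
unweighted_mean = sum(sum(row) for row in toy_field) / (180 * 360)
print(f"area-weighted: {global_mean:.3f}, unweighted: {unweighted_mean:.3f}")
```

In this toy case the band within 30° of the equator covers half of the sphere's area but only a third of the grid rows, so the area-weighted mean (about 0.25) is higher than the plain average over grid cells (0.2).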


Status: open (until 26 Dec 2023)

Comment types: AC – author | RC – referee | CC – community | EC – editor | CEC – chief editor
  • RC1: 'Comment on essd-2023-424: Scratching the surface of extending active observation "curtains" of clouds', Anonymous Referee #1, 24 Nov 2023


Data sets

CCClim - A machine-learning powered cloud class climatology
Arndt Kaps, Axel Lauer, Veronika Eyring



Total article views: 54 (including HTML, PDF, and XML)
  • HTML: 42
  • PDF: 10
  • XML: 2
  • Total: 54
  • BibTeX: 4
  • EndNote: 2
Views and downloads (calculated since 09 Nov 2023)

Latest update: 01 Dec 2023
Short summary
CCClim presents cloud observations in terms of long-established cloud classes. It is a machine-learning-powered product based on multiple existing observational products from different satellites. We show that the cloud classes in CCClim are physically meaningful and can be used to study cloud characteristics in more detail. The goal is to make real-world clouds easier to understand and, eventually, to improve the simulation of clouds in climate models.