C³ONTEXT: A Common Consensus on Convective OrgaNizaTion during the EUREC⁴A eXperimenT

The C³ONTEXT (A Common Consensus on Convective OrgaNizaTion during the EUREC⁴A eXperimenT) dataset is presented as an overview of the meso-scale cloud patterns identified during the EUREC⁴A field campaign in early 2020. Based on infrared and visible satellite images, 50 researchers of the EUREC⁴A science team manually identified the four prevailing meso-scale patterns of shallow convection observed by Stevens et al. (2020). The common consensus on the observed meso-scale cloud patterns emerging from these manual classifications is presented. It forms the basis for future studies and reduces the subjective nature of these visually defined cloud patterns. This consensus makes it possible to contextualize the measurements of the EUREC⁴A field campaign and interpret them in the meso-scale setting. Commonly used approaches to capture the meso-scale patterns are computed for comparison and show good agreement with the manual classifications. All four patterns as classified by Stevens et al. (2020) were present in January-February 2020, although not all were dominant during the observing period of EUREC⁴A. The full dataset, including postprocessed datasets for easier usage, is openly available at the Zenodo archive at https://doi.

Acknowledgements. The author thanks the participants of the international remote classification event for their time in labeling the cloud patterns. Daniel Klocke is thanked for conducting the ICON-SRM simulations. Geet George is thanked for helpful feedback on an earlier version of this manuscript. We acknowledge the use of imagery from the NASA Worldview application (https://worldview.earthdata.nasa.gov), part of the NASA Earth Observing System Data and Information System (EOSDIS). GOES-16 Advanced Baseline Imager data is available at https://doi.org/10.7289/V5BV7DSR. Its Level 1b radiances were converted to brightness temperatures with Raspaud et al. (2019). This publication uses data generated via the Zooniverse.org platform, development of which is funded by generous support, including a Global Impact Award from Google, and by a grant from the Alfred P. Sloan Foundation.

The paper is structured as follows: the dataset and its collection are described in Section 2. Potential use cases are described in Section 3. In Section 4, additional classification methods that are able to detect the four meso-scale patterns are applied to the EUREC⁴A time period and compared to the manual classifications described in this paper. We conclude with Section 5.

Data description and development
The manual classifications were gathered through the online platform zooniverse.org, which has already been used successfully in an earlier project by Rasp et al. (2020). The platform makes it possible to crowd-source labels, e.g. for machine learning projects. Additional workflows can be defined to separate different image sources or to separate, for example, labels made during a practice run from those that belong to the actual classification. The practice run allowed everyone to familiarize themselves with the zooniverse.org platform without influencing the results.
For this dataset, we defined three workflows. Two workflows are based on satellite observations in the visible (EUREC⁴A VIS) and infrared channels (EUREC⁴A IR). The visible workflow uses images from both the Geostationary Operational Environmental Satellite 16 (GOES-16) Advanced Baseline Imager (ABI) and the Moderate Resolution Imaging Spectroradiometer (MODIS); the infrared workflow uses ABI images only. Another workflow (EUREC⁴A ICON) is based on a storm-resolving simulation covering the EUREC⁴A time period. This workflow has been included to test how well a storm-resolving simulation is able to reproduce the patterns. Here we used output from an ICOsahedral Nonhydrostatic (ICON) simulation with a grid spacing of 1.25 km that was initialized daily by the ECMWF IFS and forced hourly. Each run is 48 hours long. The first 24 hours were discarded to allow for spin-up.
To visualize the simulation output, we calculated a pseudo-albedo α following the approximation of Zhang et al. (2005):

$$\alpha = \frac{\tau}{6.8 + \tau} \qquad (1)$$

where τ is the optical depth, which is derived from the liquid water path LWP and an assumed cloud droplet number density N. Here a number density of 70 cm⁻³ is used, which is at the lower end of the observed concentrations (Siebert et al., 2013).
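As a minimal sketch of Eq. (1), the following Python function maps an optical depth to the pseudo-albedo. The function name is illustrative and not taken from the dataset's source code, and the optical depth is assumed to have been derived beforehand from the liquid water path and the droplet number density.

```python
def pseudo_albedo(tau):
    """Pseudo-albedo alpha = tau / (6.8 + tau) for an optical depth tau >= 0.

    Hypothetical helper for illustration only. It works for plain floats and
    for objects that support element-wise arithmetic, e.g. NumPy arrays.
    """
    return tau / (6.8 + tau)


# Optically thin scenes stay dark, optically thick clouds approach 1.
for tau in (0.0, 6.8, 68.0):
    print(f"tau = {tau:5.1f} -> alpha = {pseudo_albedo(tau):.2f}")
```

The mapping is monotonic and bounded by 1, which makes the simulated cloud field visually comparable to a reflectance image.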
All workflows are further described in Tab. 1.

On March 24, 2020 the international, virtual classification event was hosted with 51 scientists from 15 institutes participating to create the pattern classifications. For a full day the participants classified patterns of shallow trade-wind convection by labeling, i.e. drawing rectangles around the four common types: Sugar, Gravel, Flowers and Fish (Stevens et al., 2020).
In the end, over 12,500 labels were gathered, intentionally concentrated on the workflows with observations (see Fig. 2), as it quickly turned out that identifying the patterns in the model simulation was too demanding: the simulated features had too little similarity with those found in nature. The daily composites shown in Figure A11 reveal that Sugar was particularly hard to identify. Fish and Flowers, however, were often classified on the same days as in the observational workflows (EUREC⁴A VIS, EUREC⁴A IR). This is also reflected in the number of labels created for each class and workflow (Fig. 2). Sugar has been classified least in the simulation workflow, while the largest feature, Fish, however, has been [...]

Because all users are familiar with the patterns, either through previous work and/or through involvement in the classification event of Rasp et al. (2020), it can be assumed that the labels are of high quality. In addition, they were trained immediately before starting the classification through an online presentation to get familiar with the labeling interface on zooniverse.org and to refresh once more the different meso-scale cloud pattern categories. Compared to Rasp et al. (2020), where the focus was on classifying as many diverse cloud scenes as possible to capture the variability and thus serve as a better machine learning dataset, the aim of this dataset is to create a common classification for the EUREC⁴A time period that participating scientists agree on and that can be used directly in further studies. Therefore, the temporal frequency has been increased from daily cloud scenes to 2-hourly cloud scenes to also reflect changes on the sub-daily scale, such as those identified by Vial et al. (2021).
Due to this design difference, a single day is now classified on average 20 times in the case of the visible workflow, instead of just about 6 times (3 per daytime AQUA/TERRA satellite overpass) as in Rasp et al. (2020). Each individual cloud scene is, however, still viewed about three times.

Figure 3. Overview of processing levels of the datasets including the variable names used in the respective datasets.
After the joint classification event, the more than 12,500 labels were processed to make them more user friendly, especially because the raw data lack any temporal and geospatial information. The processing steps with the intermediate products are illustrated in Fig. 3 and described as follows:
- Level 0: The Level 0 dataset consists of the raw data output originating from the zooniverse platform. It consists of CSV files that contain entries for each workflow, image (subject) and classification, including technical details like the time spent on drawing a specific label. Labels are given by their origin (x, y), height (h) and width (w) in pixel coordinates.
- Level 1: The Level 1 dataset is further processed and combines the information distributed over the Level 0 dataset files. It contains each label as a separate entry, with information about the classified object, the user, and the geographical and Cartesian coordinates of the label. This product is saved in a netCDF file.

- Level 2: For the Level 2 dataset, the data are merged by classification_id. The classification_id is a unique identifier of a classification, where a classification refers here to the process of labeling a single image by a single user. The user might use several labels of the same or different kind to completely classify a scene. This process eliminates overlaps of same-user classifications for each pattern and turns the data into masks, rather than coordinates (see Fig. 3). Masks have the advantage that it is easier to query whether a specific location is influenced by a meso-scale pattern or not. This product is saved in zarr.
- Level 3: To ease working with the dataset, the percentage of agreement (p) among users on a specific pattern at each location is calculated and saved as Level 3 data for each workflow. It is calculated from U, the number of users that have seen the particular image, and c, the classification mask from the Level 2 data, at the geographic coordinates i, j. Because the labels of users that attributed several classes to one pixel are not removed, p can be greater than 100%.
An example of the daily average of this agreement is shown in Fig. 4; the agreement for each day and workflow is shown in Appendix A to give an impression of the dataset and in particular the presence and distribution of meso-scale patterns during the EUREC⁴A field campaign. This product is saved in zarr.

Figure caption (fragment): [...] (Konow et al., 2021). Coastlines are based on GSHHG shapefiles (Wessel and Smith, 1996).
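The Level 2 and Level 3 steps can be sketched in a few lines of Python with NumPy. This is an illustration under stated assumptions, not the dataset's actual processing code: the function names and the "pattern" key are hypothetical, x is taken as the column index and y as the row index, and the agreement is computed as the share of users whose mask covers a pixel, which is one plausible reading of the definitions of U and c.

```python
import numpy as np


def labels_to_mask(labels, pattern, shape):
    """Merge all rectangles of one pattern from one classification into a mask.

    `labels` is a list of dicts holding the pixel origin (x, y), width w and
    height h as in the Level 0 data; the "pattern" key is a hypothetical name.
    """
    mask = np.zeros(shape, dtype=bool)
    for label in labels:
        if label["pattern"] != pattern:
            continue
        x, y, w, h = (int(round(label[key])) for key in ("x", "y", "w", "h"))
        # Clamp at 0 so that negative starts do not wrap around; NumPy
        # already truncates slices that run past the image edge.
        mask[max(y, 0):max(y + h, 0), max(x, 0):max(x + w, 0)] = True
    return mask


def agreement(classifications, pattern, shape):
    """Percentage of users whose mask covers each pixel for one pattern."""
    if not classifications:
        raise ValueError("need at least one classification")
    masks = [labels_to_mask(labels, pattern, shape) for labels in classifications]
    return 100.0 * np.mean(masks, axis=0)


# Three users classify the same 4 x 6 pixel image (rows x columns).
classifications = [
    [  # user A: two overlapping Sugar boxes, counted once per pixel
        {"pattern": "Sugar", "x": 0, "y": 0, "w": 3, "h": 2},
        {"pattern": "Sugar", "x": 2, "y": 0, "w": 2, "h": 2},
    ],
    [{"pattern": "Sugar", "x": 2, "y": 1, "w": 2, "h": 2}],  # user B
    [{"pattern": "Fish", "x": 0, "y": 2, "w": 6, "h": 2}],   # user C
]
p_sugar = agreement(classifications, "Sugar", shape=(4, 6))
p_fish = agreement(classifications, "Fish", shape=(4, 6))
```

Because the masks are boolean, overlapping boxes drawn by the same user for the same pattern contribute only once per pixel, which mirrors the overlap removal described for Level 2. In this sketch each per-pattern map stays between 0 and 100%; only the sum over patterns can exceed 100% where users assigned several patterns to the same pixel.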

Potential dataset use and reuse
The EUREC⁴A field campaign has been an international study with a wide range of research platforms and many minor objectives (Stevens et al., 2021). This dataset does not only cover the core area of the experiment, but also the wider area [...] This dataset gives the opportunity to study all these measurements in the context of the meso-scale patterns observed in the downwind trades. Due to the high subjectivity of these meso-scale cloud pattern definitions, it is of particular importance to discuss results based on a common consensus to keep studies comparable. The C³ONTEXT dataset can serve as such a reference for the period of the EUREC⁴A field campaign. In Appendix A the Level 3 products are visualized for each studied day and can be used as a look-up [...]

[...] 6 shows the comparison of the different methods. We recognize in Fig. 6 that the I_org distribution has a large range of values for small mean cluster sizes and narrows with increasing mean cluster size. This is in agreement with Bony et al. (2020). From the appearance of the patterns, we expect Gravel and Flowers to be rather regularly distributed and therefore to have a lower I_org compared to Fish and Sugar. It should be noted that I_org is calculated based on a threshold in brightness temperature, and therefore only the deeper clouds in the Sugar field are detected, leading to a higher I_org than one would expect [...] Fish the upper right one. Flowers are harder to associate with a quadrant as they are more centered. This is also in agreement with Bony et al. (2020), where the lower right quadrant includes not only Flowers but also about 35% of the Fish cases (their Fig. 1c).

[...] regions. Fig. 7 shows that in January-February 2020 all patterns were dominant at least once. It also shows that day-to-day changes of the dominant pattern type are rather rare and that a rather smooth transition from Gravel to Fish to Sugar to Flowers is observed.
Overall, the different classification methods agree well with each other and no large discrepancies are found. This gives confidence that these methods are valid for further analysis of meso-scale patterns. While the I_org/S metric is computationally cheap and can easily be applied to different regions, the manual classifications are naturally more accurate. The neural network approach is a good compromise between precision and scalability. However, when concentrating on a limited time period like the EUREC⁴A field campaign, where each pattern occurs only a few times, manual classifications are most accurate. In general, it has been shown that classifications of the cloud field are possible with little effort and can be a huge benefit for the community, encouraging this approach for future studies.
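To illustrate why I_org is computationally cheap, the sketch below implements the nearest-neighbour formulation commonly used for this index: the observed cumulative distribution of nearest-neighbour distances between cloud objects is compared with the one expected for randomly (Poisson) distributed objects, and I_org is the area under the resulting curve. The text above does not spell out the exact implementation, so the function name, the centroid input and the neglect of domain-edge effects are assumptions; the brightness-temperature thresholding that yields the cloud objects is omitted.

```python
import math


def i_org(centroids, domain_area):
    """Organization index from object centroids (hypothetical helper).

    Values above 0.5 indicate clustering, values below 0.5 a regular
    arrangement, and 0.5 a random distribution. Edge effects are ignored.
    """
    pts = [(float(x), float(y)) for x, y in centroids]
    n = len(pts)
    if n < 2:
        raise ValueError("at least two objects are needed")
    # Nearest-neighbour distance of every object (brute force, O(n^2)).
    nn = sorted(
        min(math.dist(p, q) for j, q in enumerate(pts) if j != i)
        for i, p in enumerate(pts)
    )
    density = n / domain_area
    # Poisson reference CDF evaluated at the observed distances.
    reference = [1.0 - math.exp(-density * math.pi * d * d) for d in nn]
    observed = [(k + 1) / n for k in range(n)]
    xs = [0.0] + reference + [1.0]
    ys = [0.0] + observed + [1.0]
    # Trapezoidal area under the observed-vs-reference curve.
    return sum(
        (xs[k + 1] - xs[k]) * (ys[k + 1] + ys[k]) / 2.0
        for k in range(len(xs) - 1)
    )


# A regular 4 x 4 grid versus two tight clusters in a large domain.
grid = [(i + 0.5, j + 0.5) for i in range(4) for j in range(4)]
clustered = [
    (cx + dx, cy + dy)
    for cx, cy in ((10.0, 10.0), (90.0, 90.0))
    for dx in (0.0, 0.1)
    for dy in (0.0, 0.1)
]
i_org_grid = i_org(grid, domain_area=16.0)
i_org_clustered = i_org(clustered, domain_area=10000.0)
```

The regular grid yields a value well below 0.5 and the clustered case a value close to 1, consistent with the expectation stated above that regularly distributed patterns have a lower I_org.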
6 Code and data availability
The C³ONTEXT dataset, including raw data, is openly available at the zenodo database (European Organization For Nuclear Research and OpenAIRE, 2013): https://doi.org/10.5281/zenodo.5724585 (Schulz, 2021b). The source code necessary to generate the dataset is available at https://doi.org/10.5281/zenodo.5724762 (Schulz, 2021a) together with examples on how to process the data and retrieve the classifications for any platform as shown in e.g. [...] shapefiles (Wessel and Smith, 1996).