GVCCS: A Dataset for Contrail Identification and Tracking on Visible Whole Sky Camera Sequences

Jarry, Gabriel; Dalmau, Ramon; Very, Philippe; Ballerini, Franck; Bocu, Stefania-Denisa

doi:10.5194/essd-2025-444

Preprints

https://doi.org/10.5194/essd-2025-444

14 Oct 2025

Status: a revised version of this preprint is currently under review for the journal ESSD.

Gabriel Jarry, Ramon Dalmau, Philippe Very, Franck Ballerini, and Stefania-Denisa Bocu

Abstract. Aviation's climate impact includes not only CO₂ emissions but also significant non-CO₂ effects, especially from contrails. These ice clouds can alter Earth's radiative balance, potentially rivaling the warming effect of aviation CO₂. Physics-based models provide useful estimates of contrail formation and climate impact, but their accuracy depends heavily on the quality of atmospheric input data and on assumptions used to represent complex processes like ice particle formation and humidity-driven persistence. Observational data from remote sensors, such as satellites and ground cameras, could be used to validate and calibrate these models. However, existing datasets do not explore all aspects of contrail dynamics and formation: they typically lack temporal tracking and do not attribute contrails to their source flights. To address these limitations, we present the Ground Visible Camera Contrail Sequences (GVCCS), a new open dataset of contrails recorded with a ground-based all-sky camera in the visible range. Each contrail is individually labeled and tracked over time, allowing a detailed analysis of its lifecycle. The dataset contains 122 video sequences (24,228 frames) and includes flight identifiers for contrails that form above the camera. As a reference, we also propose a unified deep learning framework for contrail analysis using a panoptic segmentation model that performs semantic segmentation (contrail pixel identification), instance segmentation (individual contrail separation), and temporal tracking in a single architecture. By providing high-quality, temporally resolved annotations and a benchmark for model evaluation, our work supports improved contrail monitoring and will facilitate better calibration of physical models. This lays the groundwork for a more accurate understanding and assessment of climate impact.

Received: 25 Jul 2025 – Discussion started: 14 Oct 2025

Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors. Views expressed in the text are those of the authors and do not necessarily reflect the views of the publisher.

Status: final response (author comments only)

RC1: 'Comment on essd-2025-444', Anonymous Referee #1, 14 Oct 2025
General comments
This is a well-written paper that provides an extensive introduction to and description of a large, novel observational dataset of contrails. I have mostly minor comments, related to small details in the writing and presentation of the results. These can be found under “line-by-line comments”.
There are some open questions that remain after having read the manuscript, that I think should be addressed before the paper is published.
How were the to-be-labeled video sequences selected? Given a particular sequence start, how was its length determined? How do these choices affect the representativeness of the dataset, and the generalization performance of algorithms trained with it?

Do you expect that the video-based models would perform better as the length of the clips used for training them is increased? If I interpret the paper correctly, the maximum length of video clips that are used is 5 frames (so 2.5 minutes). This seems to be quite short, especially when placed in comparison to the length of temporal contexts used when analyzing geostationary satellite imagery (e.g. Ng et al. 2023).

It is found that the image-based models show better instance-segmentation performance than the video-based models, but that the latter also provide “tracking” output. What is the quality of these “tracking” outputs, and is it sufficiently better than what could be obtained from the “better instance segmentation” output in combination with a simple tracking algorithm (e.g. based on overlap between image frames)?
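
To make the "simple tracking algorithm" in this question concrete, the following is a minimal sketch of overlap-based linking between consecutive frames. It is not taken from the manuscript or the review: the function names (`mask_iou`, `link_frames`, `track_sequence`), the greedy matching strategy, and the IoU threshold of 0.3 are all illustrative assumptions. Each frame is assumed to be a list of boolean instance masks, as an image-based instance-segmentation model might produce.

```python
import numpy as np

def mask_iou(a, b):
    """Intersection over union of two boolean masks of equal shape."""
    inter = np.logical_and(a, b).sum()
    union = np.logical_or(a, b).sum()
    return inter / union if union else 0.0

def link_frames(prev_tracks, masks, next_id, iou_threshold=0.3):
    """Greedily match masks of the current frame to tracks of the previous one.

    prev_tracks maps track id -> mask in the previous frame.
    Unmatched masks start new tracks; unmatched tracks are dropped
    (hypothetical simplification: no memory across gaps).
    """
    candidates = sorted(
        ((mask_iou(pm, m), tid, j)
         for tid, pm in prev_tracks.items()
         for j, m in enumerate(masks)),
        reverse=True,
    )
    assigned, used_tracks = {}, set()
    for iou, tid, j in candidates:
        if iou < iou_threshold:
            break  # remaining pairs overlap even less
        if tid in used_tracks or j in assigned:
            continue
        assigned[j] = tid
        used_tracks.add(tid)

    tracks = {}
    for j, m in enumerate(masks):
        if j in assigned:
            tracks[assigned[j]] = m
        else:
            tracks[next_id] = m
            next_id += 1
    return tracks, next_id

def track_sequence(frames, iou_threshold=0.3):
    """Return the sorted list of active track ids for every frame."""
    tracks, next_id, history = {}, 0, []
    for masks in frames:
        tracks, next_id = link_frames(tracks, masks, next_id, iou_threshold)
        history.append(sorted(tracks))
    return history

def band(r0, r1, shape=(8, 8)):
    """Toy 'contrail': a horizontal band of True pixels (illustration only)."""
    m = np.zeros(shape, dtype=bool)
    m[r0:r1, :] = True
    return m

# Toy sequence: one band drifts by a row, one disappears, one appears.
frames = [
    [band(1, 3), band(5, 7)],
    [band(5, 7), band(2, 4)],
    [band(2, 4), band(0, 1)],
]
history = track_sequence(frames)
```

A baseline of this kind has no notion of appearance or motion, so it would be expected to struggle when contrails cross, merge, or drift quickly between frames, which is the comparison the referee is asking the authors to quantify.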

Could more examples be given of how the image pre-processing technique helps identify contrails? Currently, only one such result is included, but more examples that show the value of the re-projection and contrast enhancement would be helpful in my opinion.

Line-by-line comments
P1L6: “don’t” -> “do not”, “aspect” -> “aspects”
P1L21: “trap outgoing” -> “reduce outgoing”
P1L22: “as due to aviation CO2 emissions”
P2L30: “context-dependent and extremely difficult to model reliably” -> “highly variable and challenging to model.”
P2L33: APCEMM stands for “Aircraft Plume Chemistry, Emissions, and Microphysics Model”
P2L53: I think it would be nice for the authors to explain in more detail why the attribution of observed contrails to flights is difficult when using geostationary imagery, and how the introduced dataset does not suffer from this problem.
P2L56: I think I understand what this paragraph is trying to say, but I feel like it could be made clearer. Perhaps state this as, “existing datasets of contrails annotated in observational data such as the OpenContrails dataset, do not track individual contrails over time or provide information on the flights that formed them”?
P3L70: is there no reference for this Mask2Former approach?
P4L106: Although named after Schmidt and Appleman, the criterion in its current form was originally presented by (Schumann, 1996) so I would suggest also citing that paper here.
P4L109: “trap” the more physically correct term is “reduce”.
P4L111: “The precise relative impact” what exactly is meant here with “relative impact”?
P4L117: “Atmospheric imagery” what is meant with this term?
P5L131: “coverage” -> “spatial coverage”
P8L198: the labeling of these images was done at the “semantic segmentation” level, not at the instance segmentation level.
P8L205: “Thanks to the 10-minute temporal…” This sentence does not make much sense to me. Perhaps it could be left out, as the rest of the paragraph is clear.
P8L209: “A 2025 update” Technically, the instance-level labels were always there, but were not released as part of the 2023 Kaggle competition that utilized the OpenContrails dataset. So I would suggest leaving out this sentence.
P8L210: The authors could also cite (Pertino et al., 2024) here.
P8L219: “of dataset” -> “of a dataset”. The authors should note that earlier studies have collocated contrails between different remote sensing instruments. Examples are (Iwabuchi et al., 2012; Mannstein et al., 2010; Vazquez-Navarro et al., 2010).
P8L220: “Altitude” appears twice in this sentence.
P9L244: “Non-data-driven image-analysis techniques” -> “Traditional image analysis techniques”
P9L254: Citation “Jarry et al.” missing a year. Happens elsewhere as well.
P10L266: “Leveraging as well on Hough-based line detection” This sentence doesn’t make sense to me.
P10L282: provide exact coordinates of camera location, if possible.
P12L315: “or reviewing” -> “of reviewing”
P14L371: “U-net” written differently than elsewhere in the paper
P14L387: “k-means” maybe write “k” in italic?
P23 Figure 5: Include whether time is “UTC” or not. Same goes for Figure 7. Additionally, “Raw image” is used for an image that has already undergone quite some processing, so perhaps a different terminology could be used here?
P24 Figure 6: I think it would be helpful to combine figures 5 and 6 to make it easier for the reader to perform the visual comparisons.
P24L594: “erroneously merges contrails 5 and 6 into a single prediction” I don’t see this in the results at all? Is this potentially a typo? Should it be contrails 3 and 6?
P24L595: “merges contrails 5 and 6” Again, I don’t see this.
P26L638: “we are deployed” -> “we are deploying” ?
P26L656: “Integrating these tasks …” This sentence doesn’t make sense to me.
References
Iwabuchi, H., Yang, P., Liou, K., Minnis, P., 2012. Physical and optical properties of persistent contrails: Climatology and interpretation. J. Geophys. Res. Atmospheres 117.
Mannstein, H., Brömser, A., Bugliaro, L., 2010. Ground-based observations for the validation of contrails and cirrus detection in satellite imagery. Atmospheric Meas. Tech. 3, 655–669. https://doi.org/10.5194/amt-3-655-2010
Pertino, P., Pavarino, L., Lomolino, S., Miotto, E., Cambrin, D.R., Garza, P., Ogliari, E., 2024. Ground-Based Contrail Detection by Means of Computer Vision Models: A Comparison Between Visible and Infrared Images, in: 2024 IEEE 8th Forum on Research and Technologies for Society and Industry Innovation (RTSI). IEEE, pp. 254–259.
Schumann, U., 1996. On conditions for contrail formation from aircraft exhausts. Meteorol. Z. 4–23.
Vazquez-Navarro, M., Mannstein, H., Mayer, B., 2010. An automatic contrail tracking algorithm. Atmospheric Meas. Tech. 3, 1089–1101. https://doi.org/10.5194/amt-3-1089-2010
Citation: https://doi.org/10.5194/essd-2025-444-RC1
RC2: 'Comment on essd-2025-444', Anonymous Referee #2, 26 Oct 2025

The comment was uploaded in the form of a supplement: https://essd.copernicus.org/preprints/essd-2025-444/essd-2025-444-RC2-supplement.pdf

Citation: https://doi.org/10.5194/essd-2025-444-RC2
AC1: 'Comment on essd-2025-444', Gabriel Jarry, 17 Dec 2025

Dear referees,
Thank you for the effort in reviewing our manuscript “GVCCS: A Dataset for Contrail Identification and Tracking on Visible Whole Sky Camera Sequences”.
We appreciate the valuable and constructive feedback, which has significantly helped us improve the clarity, completeness, and context of our work.
We have addressed all comments point-by-point in the attached response letter.
Yours sincerely,

Jarry, G. et al.

Citation: https://doi.org/10.5194/essd-2025-444-AC1

Data sets

GVCCS: Ground Visible Camera Contrail Sequences. Gabriel Jarry, Philippe Very, Franck Ballerini, and Ramon Dalmau. https://doi.org/10.5281/zenodo.16419651

Viewed

Total article views: 456 (including HTML, PDF, and XML)

HTML	PDF	XML	Total	BibTeX	EndNote
295	132	29	456	23	34

Views and downloads (calculated since 14 Oct 2025)

Month	HTML	PDF	XML	Total
Oct 2025	147	22	8	177
Nov 2025	77	33	15	125
Dec 2025	68	75	5	148
Jan 2026	3	2	1	6

Latest update: 04 Jan 2026

Short summary

The Ground Visible Camera Contrail Sequences (GVCCS) dataset provides annotated video sequences of aircraft contrails recorded by a ground-based camera in the visible spectrum. Each contrail is segmented, tracked, and, where possible, attributed to individual flights. A baseline model based on panoptic segmentation is also provided to demonstrate instance-level detection. This dataset enables empirical analysis of contrail lifecycle and supports the validation and calibration of physical models.
