the Creative Commons Attribution 4.0 License.
The Total Carbon Column Observing Network's GGG2020 data version
Joshua L. Laughner
Geoffrey C. Toon
Joseph Mendonca
Christof Petri
Sébastien Roche
Debra Wunch
Jean-Francois Blavier
David W. T. Griffith
Pauli Heikkinen
Ralph F. Keeling
Matthäus Kiel
Rigel Kivi
Coleen M. Roehl
Britton B. Stephens
Bianca C. Baier
Huilin Chen
Yonghoon Choi
Nicholas M. Deutscher
Joshua P. DiGangi
Jochen Gross
Benedikt Herkommer
Pascal Jeseck
Thomas Laemmel
Erin McGee
Kathryn McKain
John Miller
Isamu Morino
Justus Notholt
Hirofumi Ohyama
David F. Pollard
Markus Rettinger
Haris Riris
Constantina Rousogenous
Mahesh Kumar Sha
Kei Shiomi
Kimberly Strong
Ralf Sussmann
Voltaire A. Velazco
Steven C. Wofsy
Minqiang Zhou
Paul O. Wennberg
- Final revised paper (published on 06 May 2024)
- Preprint (discussion started on 24 Aug 2023)
Interactive discussion
Status: closed
-
RC1: 'Comment on essd-2023-331', Denis Jouglet, 23 Sep 2023
Review of the preprint essd-2023-331
Title : The Total Carbon Column Observing Network’s GGG2020 Data Version
Authors : J. Laughner et al.
General comments
I read this preprint with much interest. The TCCON network has been providing a unique data set for several years, with important impacts on atmospheric composition, atmosphere-surface exchange knowledge, satellite measurement validation and spectroscopy improvements. The required level of accuracy for the TCCON products is very challenging, with continual improvements in the product processing. Such a complete description is therefore very important for traceability and deserves peer-reviewed publication.
This paper describes very well the huge amount of work done by the TCCON team to improve its data processing since the previous GGG2014 version. The authors demonstrate that they fully master their complete measurement chain (interferogram processing, prior, spectroscopy, calibration, etc.). Most parts of the measurement chain have been updated and can be considered state of the art. The authors have been able to include new uncertainty sources (e.g. the depletion of the O2 volume mixing ratio with time). A large effort has been made to give an exhaustive enumeration and description of the changes with respect to GGG2014.
The improvement of the forward model was done on a solid physical basis. In addition, a large part of the processing consists of empirical corrections. The scientific choices behind them could be debated, because they mean that some effects are not fully understood, and they raise some doubts about the absolute and inter-station accuracy. These choices are however pragmatic and can be considered acceptable since they are well documented, rigorous, validated and still being improved. Moreover, they can be considered a pure calibration, as is done for many other physical sensors, given that the other properties of the sensor (e.g. linearity) are also demonstrated. I will raise some questions about the scientific content of this paper (see the section Specific science comments below), but they do not require major revisions. Nevertheless, this paper uses assumptions made in Wunch et al. 2015, which is a technical note, not a peer-reviewed paper. Such assumptions should therefore be questioned here.
However, I think the main weakness of this paper is its lack of clarity:
- this paper is not self-sufficient, since many rationales can only be found in previous papers: Wunch et al. 2010, 2011 and 2015. I think that scientists not familiar with the data cannot understand this paper on its own. I had to go back and forth to these papers. This would at least require systematic references to these papers (probably including the section, figure or table number), and in most cases a quick reminder (for example in a more detailed introduction).
- some assumptions made in these previous works are not fully described in those papers. In some cases, I was not even able to find the answer (see examples below).
- the paper is very long. I think that all parts deserve publication, but it could be split or rearranged if possible (mostly the uncertainty budget).
I will suggest revisions in the Suggested clarity improvements section hereafter.
As a conclusion, I fully support the publication of this paper with minor revisions for the science content, and substantial revisions for the self-consistency (processing description, rationales and referencing).
Specific science comments
Section 1 - Introduction
- l 40: did you think of permanently removing channels in the center lines that become saturated at large SZA, so as to homogenize the biases along the SZA?
- l 48 : the assumption of “consistent across sites” seems to be in contradiction with Wunch et al 2011 appendix A.a (0.2ppm).
Even if instruments are perfectly consistent, their different environments (boreal vs tropics) could translate into apparent inconsistencies.
- l 58: be careful that the refraction effect may differ in the O2 and the other gas windows.
Section 2
- l 85 : I have a question a bit beyond the scope of this paper, but I did not find the answer in previous papers. What exactly is your definition of the AK? In a Bayesian framework, AKs are usually defined as the information coming from the measurement relative to the prior. According to Wunch et al. 2010, the retrieval is a least-squares fit of a scaling factor applied to an a priori profile, without an explicit prior value for this scaling factor or an associated uncertainty.
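For readers unfamiliar with the question, one common way to define a column averaging kernel for a profile-scaling retrieval can be sketched as follows. This is a minimal illustration under stated assumptions, not the definition used by GGG: the Jacobian `K`, prior profile `x_a`, pressure weights `h` and the unweighted least-squares gain are all hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
n_chan, n_lev = 50, 10

K = rng.random((n_chan, n_lev))          # hypothetical Jacobian dy/dx per level
x_a = np.linspace(400.0, 390.0, n_lev)   # hypothetical prior profile (ppm)
h = np.full(n_lev, 1.0 / n_lev)          # hypothetical pressure weights (sum to 1)

k = K @ x_a             # Jacobian with respect to the profile scale factor
g = k / (k @ k)         # unweighted least-squares gain for the scale factor
c_a = h @ x_a           # prior column average
ak = c_a * (g @ K) / h  # column averaging kernel, one value per level

# A true profile equal to the prior scaled by 2 % should be retrieved exactly.
x_true = 1.02 * x_a
c_hat = c_a + np.sum(h * ak * (x_true - x_a))
print(c_hat, 1.02 * c_a)
```

In this sketch the kernel describes the sensitivity of the retrieved column to a perturbation at each level, without any explicit prior covariance on the scale factor.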
Section 3
- Much work has been done to improve the spectroscopy, but no estimate of the gain in Xgas accuracy is given. Judging from section 7.1, these improvements do not seem to remove the need for the empirical ADCF correction. Can you provide an estimate?
- section 3.3: The O2 column is estimated from the 1.27µm band (always this band?). The O2 spectroscopic parameters are optimized so that Xluft, which is the ratio of the O2 column from spectroscopy to the O2 column from the local pressure measurement, is close to 1 with low variance. Only the O2 spectroscopic parameters are tuned, which means that the surface pressure measurement is assumed to be the truth. Therefore, why use the O2 band rather than the surface pressure measurement directly for the O2 column estimation?
- section 3.3: you choose to change some spectroscopic coefficients based on your empirical observations. This is fully understandable. Did you discuss this with spectroscopists? Are your changes within the uncertainties given by spectroscopists?
- l.173: You choose to use only one value, T700, but the vertical temperature profile may be heterogeneous, in particular in the boundary layer. Do you not think that this approximation could be a source of error?
- fig 2: In panel (b) a slope can still be observed.
Section 4
- Do you not think that cross-sections computed from met profiles interpolated to each hour of the day would improve the retrieval?
Section 5
- l.242: Does detector saturation happen often? Why do you not adjust the gains to the maximum possible intensity at the site? This is quite deterministic, depending mostly on AOD (not strongly on SZA in the SWIR when AOD is low and SZA is not at extreme values).
- l.243: In such saturation cases, only the very low frequencies are lost. These frequencies are mostly retrieved by the continuum fitting and should not bring information on gases. Do you think that gas information is lost in such configuration?
The exception could be the Collision Induced Absorption (CIA), in particular in the 1.27µm O2 band.
Section 6
- Is there no interference between the instrument continuum fitting and the O2 1.27µm CIA when using polynomial orders greater than 2? The CIA brings much information on O2 amount.
Section 7
- Section 7.1: can you confirm that the new ADCF are computed on the Xgas benefiting from the O2 spectroscopic improvements of section 3.3?
- l.395: you consider the temperature dependence of the ADCF as “spurious”. I think this is a hypothesis (like the symmetry of Xgas with respect to noon in equation (5)) that needs to be identified as such. Could natural phenomena not also be responsible for such trends? Seeing this effect in some but not all windows of a trace gas would probably reinforce the spurious hypothesis.
- In section 3.3 you update the O2 spectroscopic coefficients, using O2 from the pressure sensor as the truth to fit. Here in section 7.1, why do you not do the same optimization exercise for the spectroscopic coefficients of trace gases, rather than a posteriori empirical correction? This would require an external truth which could be the in situ measurements of section 7.3.
L.406 mentions this plan for temperature dependence, but it could be extended to SZA dependence.
- Section 7.1.1: you remove the windows that are most affected by the temperature dependence. What is your threshold for deciding between correcting and removing?
- Section 7.3: I would like to understand the differing results of GGG2014 and GGG2020. In l.455, what are the differences between the in situ dataset used for GGG2014 and that for GGG2020? Do you expect changes or improvements in your AICF estimation with the GGG2020 dataset? A larger variety of in situ instruments to vary the potential biases; a larger range of weather conditions to disentangle trends?
What is the size of the GGG2020 in situ data set? And thus the number of elements in fig 11?
OK this is answered in table 2, but should be given in plain text.
- The appendix C6 is very important for this paper. It would be useful to give orders of magnitudes of the several sources of uncertainties (partially given in table 3).
- Appendix C6 l.1213: Why do you take twice the std and not the std itself?
- l.518: I agree that the presentation of fig11 is better than the older presentation (fig5 in Wunch et al 2010, fig.8 in Wunch et al. 2015). Please specify that the in situ ratio of fig11 is equivalent to the inverse of the slope of the best fit in the older papers.
- l.524: Here for CO2 the ratio is ~1.01%, whereas in Wunch et al. 2010 fig5 and in Wunch et al. 2015 fig8 it was ~1.1% (1/0.989). Can we conclude that the updates of the GFIT processing and the new ADCF described in this paper have provided such a ~10% improvement? If so, please emphasize it. If not, please explain why (are the values not directly comparable because the data sets are different?).
- l.627: I think “small” is a bit of an understatement, since for 420ppm the order of magnitude is the same as that of the new XCO2 scale.
- As already mentioned in my “general comments”, this section is very interesting in terms of metrology. I would interpret section 7.1 as a correction of the intrinsic quality of the detector (removing all artifacts regardless of the conditions), and section 7.3 as the absolute calibration of the sensor (more precisely, a fit to the WMO standard).
In previous papers, only airmass dependencies were corrected. But now we can see that the list of potential dependencies for intrinsic quality is larger:
- ADCF in section 7.1
- Atmospheric temperature dependency in section 7.1
- Impact of Xgas. The classical Xgas(TCCON) = f(Xgas(in situ)) (like fig.8 in Wunch et al. 2015) has been discarded in this paper (section 7.3); this implicitly means that TCCON is linear with Xgas
- Impact of Xluft as seen by section 7.3 fig.11 (l.555 mentions that it will be a future update)
Maybe the paper should be more explicit about this “metrological” process.
As a consequence, and depending on the size of the section 7.3 data set, I think other dependencies could be looked for (humidity, AOD, altitude, etc.).
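The arithmetic behind the l.518 and l.524 comments (a ratio being the inverse of a zero-intercept best-fit slope) can be sketched as below. The slope 0.989 is the value quoted in the comment; `ratio_new` is only an illustrative placeholder for the "~1.01" mentioned there, not a value taken from the paper.

```python
# Slope of the TCCON-vs-in-situ best fit quoted for the older papers.
slope_old = 0.989
# Equivalent in situ / TCCON ratio, assuming a fit through the origin.
ratio_old = 1.0 / slope_old

# Illustrative placeholder for the newer ratio ("~1.01" in the comment).
ratio_new = 1.010

# Size of each correction relative to unity, and their relative change.
offset_old = ratio_old - 1.0
offset_new = ratio_new - 1.0
relative_change = (offset_old - offset_new) / offset_old

print(f"old ratio: {ratio_old:.4f}")
print(f"relative change in correction: {relative_change:.1%}")
```

Under these assumptions the correction shrinks from about 1.1 % to about 1.0 %, which is where a figure of roughly 10 % would come from.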
Section 8
- l.758: it is written that the surface pressure measurement is used for calculating the total column of air, whereas previously it was said that the O2 absorption band is used for that purpose. This is a contradiction.
- l.784: do you not think that after the works done in section 7.1, it could be possible to add the spectroscopic errors as an uncertainty source?
- l.784: It would be interesting to have an inter-instrument budget alongside the single-instrument budget. This would be very useful since one of the uses of the network is to analyse spatial gradients. For example, an error in the retrieval such as the choice of the prior will be partially common to all instruments (partially because it may depend on latitude). Some errors (pointing error, FOV error) will differ from one instrument to another.
- l.868: the classical way would be to use the standard deviation, please explain why you use the median absolute deviation here (robustness to outliers?)
- l.878: In your sensitivity study (first part of section 8), you did not include the radiometric noise, which would be the main random error source. Most sources you considered should be quite constant over a day, so the assumption of reduced sources of random error sounds good to me. Be careful however that some sources in your sensitivity study could vary slowly and therefore be seen mostly in the mean bias, not in the median absolute deviation.
- table 3 and l.883: In “Mean abs. dev.” there is the contribution of the instrument, of the in situ measurement and of the comparison between both. Do you not think you should compare “Mean abs. dev.” with the quadratic summation of “Error budget” and “Epsilon_insitu”, rather than “Mean abs. dev.” with “error budget”?
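Two of the points above (l.868 on the median absolute deviation, and the table 3 comment on quadratic summation) can be illustrated with a short sketch. All numbers are hypothetical and chosen only to show the behaviour; the helper names `median_abs_dev` and `quadrature_sum` are not from the paper.

```python
import numpy as np

def median_abs_dev(x):
    """Median absolute deviation about the median (unscaled)."""
    x = np.asarray(x, dtype=float)
    return np.median(np.abs(x - np.median(x)))

def quadrature_sum(*terms):
    """Combine independent error terms in quadrature."""
    return float(np.sqrt(np.sum(np.square(terms))))

# Hypothetical TCCON-minus-in-situ differences (ppm) with one outlier.
diffs = np.array([0.1, -0.2, 0.0, 0.3, -0.1, 0.2, -0.3, 5.0])

std = diffs.std(ddof=1)
mad = median_abs_dev(diffs)
# 1.4826 * MAD estimates sigma for normally distributed data.
robust_sigma = 1.4826 * mad

# Hypothetical error-budget and in situ uncertainty terms (ppm).
combined = quadrature_sum(0.3, 0.4)

print(f"std = {std:.2f}, MAD = {mad:.2f}, robust sigma = {robust_sigma:.2f}")
print(f"combined uncertainty = {combined:.2f}")
```

The single outlier inflates the standard deviation far more than the median absolute deviation, which is the robustness argument; the quadrature sum is the quantity one would compare against an observed scatter if the two error terms are independent.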
Suggested clarity improvements
Section 1 - Introduction
- This introduction should be expanded, and divided into several sub-sections:
- It should recall the main uses of the TCCON network (as given in the abstract).
- It should explain that the scope of this paper is to describe the major changes from GGG2014 to GGG2020, and justify why so much work has been undertaken. The expected accuracy for GGG2020 and the current performance of GGG2014 would be the best rationale.
- Introduction should also refer to Wunch et al 2011 and 2010, since the major parts of the algorithm are described in the 2011 paper.
- I think the complete window definition should be recalled in a table (or at least referenced) in introduction or in appendix. This will ease the comprehension of section 2 by newcomers (useful also for l126), and also give the current status.
- Introduction should recall the main steps of the retrieval (Bayesian approach or not?), including the cross-sections computation and the AK definition, so as to make the paper more self-consistent.
- L 75: maybe the introduction should also mention the systematic quality check done by the central facility? Does it include the filterings listed in section 7.3?
- The description of the new merging of several windows at line 30 is redundant with section 7.2 and therefore need not be mentioned in the introduction. I was not able to find how several windows were merged in previous papers.
- It is not clear in sections 1, 7.1 and 7.2 whether the ADCF correction and the window merging are performed on column densities or on column-average dry mole fractions (l.52). Please clarify.
- Please specify that Vgas and VO2 are column densities, and give the physical unit.
- l 45: can you give a reference for the 0.25%?
- The “scaling factor” or “scaling correction” could already be named AICF, and the “empirical airmass-dependent correction” ADCF.
Section 2
- People knowing the CO2 spectroscopy could wonder why the weak window at 6536cm-1 is not mentioned, maybe you can refer to section 7.1.1 l.415 which brings the explanation. Same for the 4905cm-1 strong CO2 band.
- What does the “l” of “lCO2” refer to? In the OCO-2 mission, such a band is called sCO2, and the ~6300cm-1 bands wCO2, which is confusing here.
Maybe the “CO2 window centered at 6220 cm−1” (l.119), which is the first standard window, should be given a short name, as is the case for sCO2 and lCO2?
Section 3
- For clarity, I think a chapter named “Improvement of the forward model and the retrieval” should be created to include sections 3, 4, 5 and 6. The following order would be more natural: 5, 4, 3, 6.
- L 114 : give a reference for “Numerous spectroscopic studies”?
- L149 : Xluft is an important notion, but new and never mathematically defined (later it is said “similar to Xair”). Please give the mathematical formula of Xluft.
Section 4
- Please recall (in introduction?) that the absorption cross-sections are pre-computed, using the meteorological profiles. Despite the 3-hourly new product, can you confirm in the plain text that only one profile per day is used?
Section 5
- section 5.2: I cannot see what the improvement of GGG2020 over GGG2014 is in this section.
- section 5.3: I was not able to find in the literature (Wunch et al. 2011, 2015) that the TCCON interferometer is single-sided. Please mention it (introduction?), and provide the length of the short arm (as well as that of the long arm).
- l 309: please explain why it is “more efficient” : is it for a better SNR?
Section 6
- L 312: “spectral response of the instrument” is ambiguous, may be confused with ILS. I understand you are talking about the instrumental ”continuum”.
- L 320: “the discrete Legendre polynomials” is not mentioned in Wunch et al 2015. Do you confirm it? Why Legendre polynomials and not classical polynomials?
Please give the orders used per window (or at least their maximum).
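As background to the l.320 question, the sketch below compares a Legendre basis with an ordinary power basis for fitting a smooth continuum. It is a generic numerical illustration, not a statement of why GGG uses Legendre polynomials: the window grid, the synthetic continuum and the order 8 are all assumptions.

```python
import numpy as np
from numpy.polynomial import legendre, polynomial

# Hypothetical fitting window mapped onto [-1, 1].
x = np.linspace(-1.0, 1.0, 201)
rng = np.random.default_rng(0)
# Synthetic smooth continuum with a little noise.
continuum = 1.0 + 0.2 * x - 0.1 * x**2 + 0.01 * rng.standard_normal(x.size)

order = 8  # assumed polynomial order, for illustration only

# Least-squares fits in the two bases.
leg_coef = legendre.legfit(x, continuum, order)
pow_coef = polynomial.polyfit(x, continuum, order)
leg_fit = legendre.legval(x, leg_coef)
pow_fit = polynomial.polyval(x, pow_coef)

# Conditioning of the design matrices for the two bases.
cond_leg = np.linalg.cond(legendre.legvander(x, order))
cond_pow = np.linalg.cond(polynomial.polyvander(x, order))

print("max fit difference:", np.max(np.abs(leg_fit - pow_fit)))
print("condition numbers (Legendre, power):", cond_leg, cond_pow)
```

Both bases span the same polynomials, so the fitted continuum is the same; the difference is that the Legendre design matrix is much better conditioned, which keeps the fitted coefficients less correlated as the order grows.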
Section 7
- Section 7.1: sub-sections would be welcome
- I think that equations (3), (4) and (5) cannot be understood without an explicit reference to Wunch et al 2011 appendix A.e.i (l356 mentions “like GGG2014” but Wunch et al 2015 does not mention it). Please refer to it. Please recall that f is a model for the observed Xgas diurnal variation, making the important assumption that any Xgas variation symmetrical around noon is not expected to be real but an artifact.
- I think an illustration of XCO2=f(t,theta) with several examples would be welcome, and also to show the standard deviation that is aimed at being minimized.
- I note that, unlike in Wunch et al 2011, the sin() function is replaced by a linear function. Why?
- l.383: please specify how you obtain these uncertainties.
- l 395: As far as I understand by comparison with previous papers, this temperature dependence correction is new in GGG2020. Please emphasize it.
- l 399: The use of the theta notation for potential temperature is a source of confusion, since theta is used earlier for SZA. Maybe you should change it.
- l.402: please detail the exact operation: division by the ADCF at theta_mid=310K? A linear correction requiring knowledge of theta_mid?
- L.419: in this paper, as well as in previous papers, HCl was never mentioned as an atmospheric gas measured by TCCON (see table 3 of Wunch et al 2015), but only as the gas in the cell used for interferometer calibration. The mentioned windows should therefore be explained.
- Section 7.2: after reading the Wunch et al. 2015, 2011 and 2010, I was not able to find the way windows were merged in GGG2014 and older. I do not fully understand the iterative process described here. Please detail it.
- l 432: you mention the “retrieval error”; previous papers do not seem to define it. I think it should be given in section 1.
- l.472: please give the formula of the FVSI index (or give a reference). Is it computed for a single scan duration? What is the duration of a single scan? (it can be mentioned in section 1)
l.481 also requires mentioning the duration of a single TCCON scan, so as to understand whether 30 TCCON scans are a large part of a 2h window or not.
- l.495: I guess ai is the averaging kernel, please clarify it.
- l.514 and fig.11 legend: please state explicitly that the uncertainty bars are given by appendix C6.
- l.564: can you tell which way was used for each dataset?
- l.582: for clarity, I would suggest starting a new sub-section dedicated to the O2 decrease, as this is a different correction source (even if applied simultaneously with the new XCO2 standard).
- l.634: It is not clear in this paragraph whether the new product includes the variable O2 mole fraction or not. The answer is given later in l.692, with some redundancy; I would therefore discard the l.634 paragraph.
Section 8
- This chapter is very important and must be kept. But it is a long part in an already long paper, and a bit different from the rest of the paper, which explains the updates of GGG2020. I would suggest several solutions:
- To place it in a dedicated companion paper. This can be merged with appendix B (which is small).
- Or (preferred) to move the text between l.704 and 782 in appendix B. This part is largely an update of similar works by Wunch et al. 2015 (section 8), 2011 (appendix B)
- Text between l.704 and 782 at least deserves its own sub-section
- Please refer to Wunch et al. 2011 and Wunch et al. 2015 for section 8.
In particular l.776 & 777 can refer to Wunch et al. 2015 for more details on ME.
- I would change the numbering of 8.x for x in {2,..,10} to 8.1.x.
- l.854: Please define this scale factors for HCl and how to use them to assess the ILS (or give a reference)
Section 9
- I think it would be more consistent to place this section at the beginning of the document, either after section 2 or after section 6.
Typing corrections
The paper is well written, and I did not detect typos.
- Section 6.1 has no 6.x follower
- Section 7.1.1 has no 7.1.x follower
- figures 8, 9, 11 and their font should be enlarged
- l.1595: the link seems to be dead.
Citation: https://doi.org/10.5194/essd-2023-331-RC1
AC1: 'Reply on RC1', Josh Laughner, 26 Feb 2024
The comment was uploaded in the form of a supplement: https://essd.copernicus.org/preprints/essd-2023-331/essd-2023-331-AC1-supplement.pdf
-
RC2: 'Comment on essd-2023-331', Anonymous Referee #2, 13 Oct 2023
This is an important paper for the satellite measurement community focused on greenhouse gases and other trace gases. The TCCON total column gas estimates serve as a primary validation data set for many current space missions, as well as for planned missions. Fundamental research is also conducted with this dataset. For the measurement of gases with high precision requirements, the performance of the TCCON network needs to be extremely well characterized, so the comparisons can be informative.
In this manuscript, data and methods are updated from the previous versions of data. The data processing, assumptions, sources of input data, and calculations are all very clearly described. The analysis performed to connect the GGG2020 to reference data sets is also described in detail, with assumptions, data selection criteria, and tradeoffs between different possible approaches.
The methods and material are described in great detail. While this creates a very long paper, it also creates a valuable and reliable reference document for this data set. The details of the approach, assumption, and input data sets can become scattered over many manuscripts and documents as data sets evolve. This paper creates the definitive reference document for the GGG2020 data set. One can imagine that for one or two data versions that follow, changes from this reference document will be sufficient for most purposes.
The detailed analysis and assessment of the error budget is something rarely documented so well with data sets, and this has a lot of value for users and others who are working in this area.
One note on the organization. The processing chain is complex and there are many steps. The first section of the paper gives a bit of an overview, and then notes the subprograms and where they are discussed in the paper. I do agree with another review that suggests the paper sections should be ordered to reflect the processing chain more directly. Start with the section on interferograms (now 5), and then move to gsetup (now section 4), and then the spectroscopy and continuum fitting (now sections 3 and 6). Finish with post processing, as the paper now has.
Regarding the dataset, there are no inconsistencies, implausible assertions or data, or noticeable problems. The data set is usable in its current form. There are netcdf files with detailed documentation and descriptions. The data set README file links to the wiki and to the published literature. Variables have detailed, informative names.
Although this is a complicated data set, someone who is working in this field (validation of remote sensing datasets, analysis of atmospheric constituents), should be able to find all the details they need to properly understand the creation and appropriate utilization of the dataset. This is not a trivial task, but all the relevant information is presented, and a motivated user can find everything that they need.
This dataset is definitely unique. It is derived from a record of ground based instruments that have been carefully deployed and characterized for periods that span from a few years to nearly 20 years. This data set falls into the category of a cost-intensive dataset which will not be replicated due to financial reasons.
This data set is very useful, to researchers and the satellite validation community. As it is gathered and analyzed with very rigorous approaches, and with careful attention to connecting the data set to reference standards (the WMO scale), it is uniquely valuable for satellite validation.
Presentation quality – this paper is long, but it has to be to fully and completely describe the complicated process of transforming these FTIR measurements into accurate, calibrated and precise estimates of total atmospheric columns of a range of gases. I believe there is a great deal of value in having a complete and comprehensive documentation of the data collection, approach to analysis and assumptions, and the final resulting dataset.
Editorial comments:
The introduction section steps through the process of gathering measurement data and transforming it into the desired gas columns. I was expecting it to set the stage for the subprogram discussion that starts at about line 60. I would suggest a modification in the paragraph that starts on line 23 with “TCCON instruments measure solar spectra in the near-infrared (NIR) wavelengths;”. I suggest you say that TCCON instruments measure interferograms from direct-sun observations at near-infrared (NIR) wavelengths. These are transformed into spectra… Or something to that effect, so that the idea of interferograms is introduced here. The transformation to spectra and issues such as detector non-linearity and phase correction are important, and improvements there are having a significant positive impact on the dataset.
Line 55 – I would replace “for all retrievals except those listed in §7.3.2” with “for the discussion of changing O2 mole fraction in §7.3.2”
Lines 82 and 83 – how do the windows discussed here relate to the measurement windows of OCO-2 and other satellite instruments? It seems that if they are the same, this could provide some insight into how the intercomparisons of TCCON and satellite data are influenced by window selection.
Line 111 – perhaps add a pointer to the section where the solar continuum is addressed (section 6)?
Line 117 in section 3.2 – what are the implications of the choice of lineshape? Now HITRAN is not a database that includes the info you need? Is that a change of strategy?
Line 198 – Are the changes you have made to the line intensities similar to the uncertainties reported by spectroscopists? Do experts in this area think this is a reasonable scaling?
Line 275 – is igram meant to be interferogram?
Line 300 – I’m curious to know what the level of disagreement is that remains? Is it hard to summarize with a single statistic, and that is what you have just stated “does not completely resolve the problem”, or can you be more quantitative?
Line 344 – is there information about the order of the polynomials in the final data product? How does a user know if this has been set properly or if channel fringing is an issue in a particular instrument?
Line 516 – figure 11 presents data as a function of Xluft. Did you examine other variables, like time or modulation efficiency, and decide that Xluft was the best variable to use?
Citation: https://doi.org/10.5194/essd-2023-331-RC2
AC2: 'Reply on RC2', Josh Laughner, 26 Feb 2024
The comment was uploaded in the form of a supplement: https://essd.copernicus.org/preprints/essd-2023-331/essd-2023-331-AC2-supplement.pdf
-
RC3: 'Comment on essd-2023-331', Gretchen Keppel-Aleks, 23 Oct 2023
Laughner et al. discuss the GGG2020 data version of the Total Carbon Column Observing Network. The paper and the dataset represent a monumental amount of work in terms of data collection, development of a vastly more capable processing and retrieval system, and standardization across the network. This is an invaluable dataset both for carbon cycle science and for satellite remote sensing. The focus of this paper is to describe in rigorous detail the advances to the GGG processing software that converts the measured interferograms into spectra, retrieves trace gas abundances from those spectra, and then applies post-hoc empirical corrections and calibration against in situ observations. The paper is very long, but this is necessary to adequately describe and provide visualization of the parameters and processes that have been modified in the GGG2020 version. Overall, I thought the figures were very helpful in illustrating the methods and results of the GGG2020 algorithm. Since the manuscript effectively is a reference manual, I believe it is critical that all the details in the paper be present within this single manuscript. The paper not only contains the details of the methodology used for GGG2020, but also contains asides that explain details about Fourier transform spectrometry (e.g., the description of detector nonlinearity) that were helpful in motivating the issues that the TCCON team has chosen to address with GGG2020.
In this review, I present a few opportunities for clarification within the manuscript. My attitude is that the GGG2020 data have been released, they are already being used by the community, and the data are filling critical needs for carbon cycle and greenhouse gas science as well as for space-based validation. The dataset itself is easy to access, download, and is in a format that it can be easily loaded into common software packages for analysis. Furthermore, the TCCON retrieval team has already moved on to making improvements to substantiate future data releases. Thus, these comments serve as minor suggestions to improve the clarity of the manuscript, not as major issues that should take precedence over the authors’ ongoing work to advance the GGG framework and continue providing high quality data to the community.
The paper does presuppose knowledge of the network’s goals and past versions of the data. For example, line 46 casually states that there is a 0.25% accuracy needed for greenhouse gas data, but this is neither motivated in the text nor cited from another source. It could be nice to provide a brief retrospective of the defined needs and past performance for TCCON’s precision/accuracy, noting that the precision needs for validation of satellite instruments and for carbon cycle science could be different.
It might be helpful to provide a table of contents at the end of the introduction, to let the reader know what the main sections of the paper are (e.g., 2 = new Xgas, 3 = Updated spectroscopy). Since many readers will not read the paper end-to-end, this could make the manuscript more effective as a reference.
While reading the paper, I thought it could be helpful for the authors to tabulate the remaining/known issues with the GGG2020 data that the next version of GGG will attempt to address, for example the XCO2 vs Xluft dependence or the T-dependence of N2O retrievals. A summary section is included at the end of the paper, however, so this might be unnecessary, but it would be helpful to list that section in the table of contents so the reader knows it is coming. On the other hand, it could be helpful to summarize the known issues in a compact format (table, or a call-out box rather than just another section of the paper) if the authors think it could help the user community avoid author-anticipated pitfalls when interpreting the data.
Somewhat similarly, when I read the error budget section, I noted that a table would be helpful, and there was a table at the end. It is great that the paper is so thorough, but adding a bit of organization to let the reader know that something is coming a bit later on would be very helpful.
Line 111: I am curious about the extent to which the linelists have been empirically adjusted. For example, how many weak lines have been added based on the empirical identification process described in this paragraph? It would also be interesting if the paper noted how/whether this information gained from the high resolution TCCON spectra flows back to other groups in Earth/Sun remote sensing.
Line 118: refers to Mendonca 2016 and Devi 2007ab, which leads me to believe that these are the lineshapes being used in the 6220 and 6339 cm-1 CO2 bands, but this is not explicitly stated.
Line 122: What is the metric the authors used to determine that the new spectroscopy and line shape improved “the quality of XCO2 retrievals in this spectral region”?
Section 3.3: The empirical process to optimize the O2 line widths was very detailed, and I had to read it a few times. I am not sure how the text could be more clear given the multiple interdependent constraints. The figures here were very helpful in communicating the approach. One minor comment is that the allusion to T700 as a metric for synoptic scale meteorology doesn’t seem to fit (since a relationship between non-oxygen gases and T700 represents real variability whereas with O2 it is a spurious relationship). It might be more clear to say that T700 was used as a temperature metric because this value is already saved with the observed spectra since it has also been applied for downwind scientific analyses.
Line 178: The sentence about minimizing the variance in Xluft rather than the average value was confusing to me (perhaps because the sentence is pre-empting an argument that didn’t occur to me -- I’m not sure why it would be useful to minimize the average value).
Section 4.3: The reference to trace gas profile updates here was not alluded to in the introduction; perhaps mention ginput in the introductory section.
Line 251: Clarify whether the DIP metric is reported for users to consider, or is an active filter employed by the TCCON team prior to reporting data. Upon the first reading, it seemed like it was a diagnostic, but I later realized (Line 940) that it is actively used in the filtering process.
Line 275: “igram” should be “interferogram”, I think.
Line 307: The need to phase correct the interferogram is clear, but it is not clear why computing a spectrum using both the long and short arm of the interferogram is “more efficient” than only using the long arm. Please add a brief sentence to clarify.
For section 7.1 on the airmass correction:
The description of the ACDF was a bit confusing. There were quite a few variables introduced, and some might be redundant. For example, my understanding is that alpha and ACDF are the same? And that c3 is closely related to these (just that it has a daily fitted value whereas alpha is the mean of all c3’s), so perhaps there is some notation that could be used to show this more clearly (e.g., calling it alpha_i, or alpha_daily)?
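One way to write this, assuming my reading of the text is correct and that alpha is an unweighted mean over the N days that were fitted (N, the index i, and the name alpha_i are my own notation, not the manuscript's):

```latex
\alpha = \frac{1}{N} \sum_{i=1}^{N} \alpha_i , \qquad \alpha_i \equiv c_{3,i}
```

Here alpha_i is the value fitted for day i (the current c3) and alpha is the resulting ACDF; if the average in the manuscript is weighted or filtered, the sum would change accordingly.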
Figure 5: Would it be possible to show panels with the corrected data (side-by-side panels or overlaid if that isn’t too messy) to show the result of the procedure?
Figure 6: It appears that the value selected for g in the 6339 cm-1 window is 45 deg, which is at the margin of the allowable parameter space. This suggests the optimization hasn’t converged, and may reflect that the functional form chosen is not appropriate. Can the authors comment on this?
Figure 8 and 9: Please increase the font size on these figures.
Section 7.2: perhaps clarify whether the sj are the same for all sites and times.
Line 450: The vertical scale has been changed from 70 uniform levels to one in which the level separation increases away from the surface. Can the authors comment on how this change impacts the error associated with the stratospheric extrapolation?
Line 475: how frequently is a negative Xgas retrieved, and are there any general conditions under which this occurs?
Figure 11: Could this be bigger and could the yellow color be replaced with a more saturated hue?
Figure 12: Does the orange line (X2019+VarO2 – X2019) imply that there is a ~0.2 ppm drift in the X2007 retrievals that do not account for variable oxygen, or is this partially corrected for by the slope of the in situ correction since XCO2? Could a residual error be estimated due to neglecting the trend in O2?
Equation 10: This is another equation where the key parameter is alpha, but with different meaning. Perhaps a different variable could be chosen so that the variables used in the paper have a unique definition.
The error budget was very thorough, but I do not believe Fig. 17 was referenced in the text. This was the figure I found least informative, so perhaps it could be removed completely.
Citation: https://doi.org/10.5194/essd-2023-331-RC3 -
AC3: 'Reply on RC3', Josh Laughner, 26 Feb 2024
The comment was uploaded in the form of a supplement: https://essd.copernicus.org/preprints/essd-2023-331/essd-2023-331-AC3-supplement.pdf