A long-term and reproducible passive microwave sea ice concentration data record for climate studies and monitoring

A long-term, consistent, and reproducible satellite-based passive microwave sea ice concentration climate data record (CDR) is available for climate studies, monitoring, and model validation with an initial operation capability (IOC). The daily and monthly sea ice concentration data are on the National Snow and Ice Data Center (NSIDC) polar stereographic grid with nominal 25 km × 25 km grid cells in both the Southern and Northern Hemisphere polar regions from 9 July 1987 to 31 December 2007. The data files are available in the NetCDF data format at http://nsidc.org/data/g02202.html and archived by the National Climatic Data Center (NCDC) of the National Oceanic and Atmospheric Administration (NOAA) under the satellite climate data record program (http://www.ncdc.noaa.gov/cdr/operationalcdrs.html). The description and basic characteristics of the NOAA/NSIDC passive microwave sea ice concentration CDR are presented here. The CDR provides user communities with spatial and temporal variability similar to that of the heritage products, together with the additional documentation, traceability, and reproducibility that meet current standards and guidelines for climate data records. The data set, along with detailed data processing steps and error source information, can be found at http://dx.doi.org/10.7265/N5B56GN3.


Introduction
The depletion of the Arctic sea ice coverage is occurring faster than most climate model predictions (Stroeve et al., 2007). In September 2012, a record low Arctic sea ice extent minimum was reached, well below the previous record minimum set in September 2007 (http://nsidc.org/arcticseaicenews/). While numerous sea ice products are available, the substantial changes in Arctic sea ice and their associated impacts on weather and the climate system, ecosystems, and coastal communities make it valuable to have a climate data record (CDR) quality sea ice concentration product.
A CDR is defined by the National Research Council (NRC, 2004) as "a time series of measurements of sufficient length, consistency, and continuity to determine climate variability and change". A CDR also needs to be well documented for transparency, traceability, and ultimately reproducibility. For sea ice, the satellite-based products have an advantage of providing a complete data set due to their superior spatial coverage and continuous measurements in time during the life of the satellite missions when compared to other types of observations such as in situ or ship-based measurements.
A satellite-based sea ice concentration product has been transitioned from research to operation (R2O), based on the recommendations from NRC (NRC, 2004), through collaboration between the National Snow and Ice Data Center (NSIDC) and the National Climatic Data Center (NCDC) of the National Oceanic and Atmospheric Administration (NOAA) under the Satellite Climate Data Record Program (CDRP). The purpose of this R2O process is to produce and preserve a complete and consistent sea ice concentration climate data record derived from satellite measurements based on mature research algorithms. The sea ice concentration CDR is currently available with an initial operation capability (IOC), which is the first iteration of the publicly released version that has met all the software, product validation, documentation, data archive, and access requirements with a score of three or higher in all categories of the NCDC CDR Maturity Matrix (CDRP, 2011a, b). The NCDC CDR Maturity Matrix is based on a maturity model that defines the CDR readiness of a product for R2O in six categories (software, metadata, documentation, product validation, public access, and utility), each rated on six levels (Bates and Privette, 2012). Level three or higher represents a key threshold in the maturity of the sea ice concentration product: (i) stability in source code that meets certain coding standards, (ii) metadata that meets NOAA-recommended standards for collection-level metadata and NCDC CDRP-recommended NetCDF Climate and Forecast (CF)-compliant attributes for file-level metadata, (iii) availability of documentation including a Climate Algorithm Theoretical Basis Document (C-ATBD) that describes the algorithm and process steps in detail, and (iv) publicly available data and source code for transparency and traceability of the algorithm and processing. Another integral part of CDR readiness is examining the maturity of the algorithms and application of the product in peer-reviewed publications.
In this paper, we present a description of the NOAA/NSIDC sea ice concentration CDR data set and basic characteristics of the CDR such as long-term mean and trend to provide a baseline for users.

Data set description
The NOAA/NSIDC CDR sea ice concentrations are daily and monthly estimates of the fraction of ocean area covered by sea ice. They are derived from the brightness temperatures from the Defense Meteorological Satellite Program (DMSP) series of Special Sensor Microwave Imager (SSM/I) passive microwave radiometers: F-8, F-11, and F-13 (Meier et al., 2011). The SSM/I sensors onboard the satellites have a swath width of about 1400 km. The CDR leverages two well-established and well-validated passive microwave sea ice concentration algorithms developed at NASA's Goddard Space Flight Center (GSFC): the NASA Team (NT) algorithm (Cavalieri et al., 1984, 1996, 1999) and the Bootstrap (BT) algorithm (Comiso, 1986, 2000; Comiso and Nishio, 2008). The algorithms use adjusted coefficients, based on overlap of sensor operations, to assure consistency through the series of sensors. The input DMSP SSM/I brightness temperatures are daily gridded fields archived at NSIDC (Maslanik and Stroeve, 2004) derived from swath fields generated by Remote Sensing Systems, Inc. (RSS) (Wentz et al., 2007). Data periods of passive microwave sensor sources for the sea ice concentration CDR are listed in Table 1.
Variables in both daily and monthly data files are gridded onto the same NSIDC polar stereographic grid with nominal 25 km × 25 km grid cells, covering the ocean surface area from 31.1 to 89.84° N in the Northern Hemisphere and 39.36 to 89.84° S in the Southern Hemisphere. The NSIDC polar stereographic projection sets the projection plane tangent to the earth's surface at 70 degrees northern and southern latitude to minimize the distortion of the cell area in the marginal ice zones. With the normal polar projection, which usually sets the plane tangent to the earth's surface at the poles, the distortion of the cell area at the edge of the northern/southern grid can reach 31 percent/22 percent, respectively (Pearson, 1990; Snyder, 1987). The grid dimensions (number of cells in the x and y directions) for each data file are 304 × 448 for the Northern and 316 × 332 for the Southern Hemisphere, with cell areas decreasing linearly away from the poles (Fig. 1). Figure 2 (left three panels) shows spatial distributions of the CDR sea ice concentration fields from monthly CDR data files in the annual minimum sea ice extent month (i.e., September) for the years 1987, 1997, and 2007, which provides an example of the decadal changes in the Arctic sea ice coverage. In addition to the distinct sea ice coverage depletion over these two decades, it also shows that ice concentrations are spatially homogeneous over much of the sea ice field, and that the large spatial variability tends to occur near the ice edge. While the overall patterns of the monthly CDR and GSFC sea ice concentrations are quite similar, the ice edge is also the region where one may find the largest difference between the two fields (Fig. 2).
The daily and monthly sea ice concentration data files are available in the NetCDF data format, which is self-describing and machine independent. The file metadata conform to the guideline recommended by the NCDC CDRP (CDRP, 2011c). The guideline utilizes existing metadata conventions such as the CF Metadata Convention and the Unidata Attribute Convention for Dataset Discovery (ACDD) for easy data set search and discovery, and for downstream and downscaling applications. Each file includes three CDR related variables: the primary CDR sea ice concentration field from 9 July 1987 to 31 December 2007, the local spatial standard deviation of the sea ice concentrations, and a quality flag field. The standard deviation and quality flags for each grid cell provide indications of uncertainty/error. Additional sea ice concentration variables in each data file include two products processed by NASA's Goddard Space Flight Center and archived at NSIDC using the NT and BT algorithms, respectively (hereafter referred to collectively as Goddard, or individually as NT or BT). A third additional variable, processed at NSIDC, merges these two Goddard concentrations using the same methodology that generates the CDR sea ice concentrations (referred to as GSFC hereafter). These additional Goddard parameters provide access to heritage variables familiar to user communities in the same format and grid, and effectively extend the CDR record back to 26 October 1978 (albeit without meeting the reproducibility requirement). They also provide a useful benchmark for evaluating the CDR. Each variable is described in more detail below. A list of input data sets for the CDR related fields is also provided.

Primary CDR variables
-CDR Sea Ice Concentration -this field provides sea ice concentration (fraction ice cover) for each grid cell.
The CDR value at each grid cell is defined as the higher concentration value between the NASA Team and Bootstrap outputs processed at NSIDC. This is done to mitigate the known issue that both algorithms tend to underestimate sea ice concentrations (Comiso et al., 1997; Kwok, 2002; Meier, 2005). A 10 % concentration threshold based on the Bootstrap concentration field is used to define the ice edge (the boundary between ice and open water). The processing includes several automated quality control steps. Two weather filters, based on ratios of passive microwave frequencies, are used to eliminate false ice due to atmospheric moisture and wind roughening over the open ocean (Cavalieri et al., 1999; Comiso and Nishio, 2008). A separate land-spillover correction is used in each algorithm to filter false ice near the coast that results from mixed land-ocean grid cells (Cavalieri et al., 1999; Cho et al., 1996). Finally, monthly ocean masks are applied to remove sea ice in regions where it is never likely to occur; in the Arctic, the masks are based on maximum extent through the time series with an added buffer, while the Antarctic uses sea surface temperature (SST) climatologies. The Antarctic masks are much more conservative (i.e., they allow a larger region of potential sea ice).
-Concentration Standard Deviation -the standard deviation field is derived at each valid grid cell from the NT and BT concentration values at that cell and the surrounding eight cells. It thereby accounts for two sources of variability: (1) the difference between NT and BT estimates, and (2) the spatial variation within the neighborhood of each grid cell. Thus the standard deviation is calculated from up to 18 values (9 grid cells each of NT and BT sea ice concentrations, as illustrated by Fig. 3).
The rationale for this approach is that grid cells where NT and BT differ significantly have higher uncertainty than cells where the two agree. Also, regions where there is high spatial variability tend to have higher uncertainty. Such high variability regions include near the ice edge (where the low spatial resolution of the sensors results in limited precision of the ice edge location) and isolated ice-covered grid cells near the coast (which indicates possible land-spillover error). Thus, while the values are largely measuring spatial variability, the field provides a quantitative relative estimate of uncertainty because regions of high errors tend to be well correlated with regions of high spatial variability. Standard deviations associated with spatial variability are typically a few percent (Cavalieri et al., 1984) and can potentially serve as a quantitative upper limit of the concentration error (Gloersen et al., 1993). Melt also tends to result in higher variability in the concentrations, particularly in terms of differences between NT and BT. In cold, winter conditions, the spatial variability is generally < 5 %, while in melting conditions, it generally increases to 10-20 %.
-Data Quality Field -in addition to the quantitative local standard deviation field, a data quality field is included to provide further information about the nature of the concentration values. Flag values are assigned to denote several conditions at each grid cell: (1) algorithm value selected (NT or BT), (2) masked by ocean climatology, (3) low concentration (< 50 %), (4) adjacent to coast, and (5) melt occurred during summer (Arctic only). This melt onset is detected at a given grid cell at a given time and does not address the motion of ice after the melt starts. Multiple flag values can be assigned to each grid cell.
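The merging rule and the local standard deviation described in the list above can be sketched as follows. This is a minimal illustration, not the archived CDR source code: the function names, the use of `None` for invalid (land or missing) cells, the choice to set cells below the Bootstrap threshold to zero, and the use of the population (rather than sample) standard deviation are all assumptions made for this sketch.

```python
import math


def merge_concentration(nt, bt, edge_threshold=10.0):
    """Merged concentration (percent) for one grid cell.

    Assumption: a cell whose Bootstrap value is below the 10 % ice-edge
    threshold is treated as open water (0 %); otherwise the higher of
    the NT and BT values is kept.
    """
    if nt is None or bt is None:
        return None  # assumption: invalid if either input is invalid
    if bt < edge_threshold:
        return 0.0
    return max(nt, bt)


def local_std(nt_grid, bt_grid, row, col):
    """Standard deviation of NT and BT values in the 3 x 3 neighborhood.

    Uses up to 18 values (9 cells from each algorithm); cells outside
    the grid or marked None are skipped.
    """
    n_rows, n_cols = len(nt_grid), len(nt_grid[0])
    values = []
    for r in range(row - 1, row + 2):
        for c in range(col - 1, col + 2):
            if 0 <= r < n_rows and 0 <= c < n_cols:
                for grid in (nt_grid, bt_grid):
                    if grid[r][c] is not None:
                        values.append(grid[r][c])
    if len(values) < 2:
        return None
    mean = sum(values) / len(values)
    # Assumption: population variance (divide by N, not N - 1).
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return math.sqrt(variance)
```

Because both algorithm fields enter the same pool of values, a uniform 10 % disagreement between NT and BT yields a nonzero standard deviation even where each field is spatially flat, which is the behavior the text describes.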

Additional variables included
-Merged Goddard Concentration -combined NT and BT sea ice concentration estimates, assembled in the same manner as the CDR concentration, but using the Goddard-produced fields as a source. The main difference between the merged Goddard and CDR concentrations is that a manual quality control procedure, which includes subjective removal of pixel values, is performed for the former on both the input brightness temperatures and the resultant NT and BT sea ice concentration estimates. These questionable pixel values tend to be associated with artifacts of weather effects in the passive microwave sea ice concentration retrieval algorithms.
-Goddard NASA Team -NT concentrations produced by Goddard. Concentrations are re-scaled to 0-100 percent for consistency with other variables.
-Goddard Bootstrap -BT concentrations produced by Goddard. Concentrations are re-scaled to 0-100 percent for consistency with other variables.
-Latitude/Longitude -included as a part of NetCDF CF-compliant file-level metadata requirements.
-Other Metadata -variables and attributes to satisfy NetCDF4 CF-compliant file-level metadata requirements, including projection information and grid size and resolution.
An archived source code package that includes source code for processing NSIDC NT and BT sea ice concentrations from DMSP SSM/I brightness temperatures and producing the NOAA/NSIDC passive microwave sea ice concentration CDR, along with a README file, can be downloaded from http://www.ncdc.noaa.gov/cdr/operationalcdrs.html (click on the document icon under the "Source Code" column next to "Sea Ice Concentration" under the "Oceanic CDRs" column to download the source code package). The source code for the sea ice concentration algorithms and re-gridding programs is in the C and FORTRAN programming languages, while Python is used to call all the necessary pieces to create the CDR. This source code package also contains the input data sets that are not archived elsewhere, such as the climatological minimum sea ice mask (CMIN).
A detailed description of the climate algorithm for the CDR and algorithm validation and error assessment can be found in Meier et al. (2011). The CDR has been shown to capture well the seasonal and interannual variability when compared with other satellite-based sea ice products with no significant systematic bias (Peng and Meier, 2013).
Twenty years of monthly CDR data from January 1988 to December 2007 are used next to show the basic characteristics of the CDR. Hereafter, we refer to the NOAA/NSIDC CDR sea ice concentrations as CDR and the merged Goddard sea ice concentrations as GSFC. As stated previously, CDR and GSFC are produced using the same methodology and based on the same algorithms. The main differences are that GSFC includes additional manual quality control of input brightness temperature and output sea ice concentration fields, gap filling in both time and space, and correction/replacement of obviously erroneous data (e.g., remaining weather effects). Because of the interpolation and manual corrections, GSFC represents a higher quality research product that can be used as a benchmark for evaluating and characterizing model or other satellite-based sea ice products, as we have done in Meier et al. (2013) for CDR; however, the higher quality comes at the expense of traceability and reproducibility and with longer data latency (about 12 months or longer). CDR, on the other hand, aims to ensure the consistency and sustainability of the sea ice time series with planned updates on a quarterly basis of the primary CDR fields, and offers transparency, traceability, and reproducibility with the NetCDF-4 data format and CF-1.5 compliant file-level metadata elements along with collection-level metadata that follows the ISO 19115-2 standards.

Basic characterization of the CDR
The basic characterization of the CDR is provided here using monthly sea ice extent from monthly sea ice concentration data files for a period of 20 yr (January 1988-December 2007) in terms of mean and long-term trend to provide a baseline for users. The sea ice extent is computed by summing the grid cell area of all cells that have 15 percent or greater sea ice concentrations, assuming the area not measured by the sensor at the North Pole as shown in Fig. 2 is entirely covered by at least 15 % ice.
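The extent calculation described above can be sketched as follows. This is an illustrative outline only: the function name, the nested-list grid representation, the `pole_hole` mask, and the use of `None` for land or missing cells are assumptions for this sketch, and the per-cell areas would in practice come from the grid definition rather than being constant.

```python
def sea_ice_extent(conc, cell_area, pole_hole=None, threshold=15.0):
    """Sum the area of all cells with concentration >= threshold (percent).

    conc      -- 2-D list of concentrations in percent (None = land/missing)
    cell_area -- 2-D list of cell areas in km^2, same shape as conc
    pole_hole -- optional 2-D list of booleans marking cells not observed
                 by the sensor near the North Pole; these are counted as
                 ice-covered, following the assumption stated in the text
    """
    total = 0.0
    for r, row in enumerate(conc):
        for c, value in enumerate(row):
            in_hole = pole_hole is not None and pole_hole[r][c]
            if in_hole or (value is not None and value >= threshold):
                total += cell_area[r][c]
    return total
```

Note that extent counts the full area of every qualifying cell, regardless of how far above 15 % its concentration is; weighting each cell area by its concentration would instead give sea ice area, which is a different quantity.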
The CDR sea ice extent values are in good agreement with estimates from GSFC (Fig. 4; see Meier et al., 2013 for detailed comparison and analysis). As expected, the sea ice extent undergoes distinct seasonal cycles in both polar regions. It peaks in March and reaches the minimum in September in the Northern Hemisphere but peaks in September and reaches the minimum in February in the Southern Hemisphere (Fig. 5). The mean annual CDR sea ice extent is about 12 million km² for both hemispheres with mean biases, relative to GSFC, of about 0.1 and −0.05 million km² for the Northern and Southern Hemispheres, respectively (Table 2). The standard deviation (SD), which is mainly associated with seasonal variability, is nearly twice as large in the Southern Hemisphere (5.656 compared to 2.929 million km²). However, the interannual variability, represented by the range of the shaded area for each month in Fig. 5, tends to be smaller in the Southern Hemisphere than in the Northern Hemisphere. As an example, the statistical characteristics of the annual minimum and maximum of the CDR sea ice extents are provided in Table 3 for both hemispheres. The results indicate that the interannual variability of the CDR extents is nearly twice as large for the annual minimum as for the annual maximum in the Northern Hemisphere while remaining similar for both in the Southern Hemisphere (Table 3).
With the predominant seasonal cycle, the cross-correlation coefficients between the CDR and GSFC sea ice extents are very close to one for both hemispheres with very small bias and root-mean-square (RMS) error (Table 2), largely reflecting the spatial homogeneity of ice concentrations over much of the field. A lot of variability does occur near the ice edge, but that is a small portion of the total grid cells. It has been shown that regional variability can be large (e.g., Cavalieri and Parkinson, 2008). More in-depth examination of the regional variability will be carried out but is beyond the scope of this paper.
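The comparison statistics mentioned above (bias, RMS error, and correlation between two extent time series) can be computed as in the following sketch. The function name and the sign convention (CDR minus reference, so that a positive bias means the CDR extent is larger) are assumptions for illustration; the paper does not spell out the exact formulas behind Table 2.

```python
import math


def compare_series(cdr, ref):
    """Return (bias, rms_error, correlation) for two equal-length series.

    bias        -- mean of (cdr - ref)
    rms_error   -- root-mean-square of (cdr - ref)
    correlation -- Pearson correlation coefficient at zero lag
    """
    n = len(cdr)
    if n != len(ref) or n < 2:
        raise ValueError("series must have equal length of at least 2")
    diffs = [a - b for a, b in zip(cdr, ref)]
    bias = sum(diffs) / n
    rms_error = math.sqrt(sum(d * d for d in diffs) / n)
    mean_c = sum(cdr) / n
    mean_r = sum(ref) / n
    cov = sum((a - mean_c) * (b - mean_r) for a, b in zip(cdr, ref))
    var_c = sum((a - mean_c) ** 2 for a in cdr)
    var_r = sum((b - mean_r) ** 2 for b in ref)
    # Undefined (division by zero) if either series is constant.
    correlation = cov / math.sqrt(var_c * var_r)
    return bias, rms_error, correlation
```

When both series share a strong seasonal cycle, as monthly extents do, the correlation is dominated by that cycle, so a value near one says little about agreement in the anomalies. Removing the monthly climatology before calling the function would give a stricter comparison.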
Least-squares linear regression of this twenty-year annual mean CDR sea ice extent time series indicates a sea ice coverage decrease of 0.597 million km² per decade in the Northern Hemisphere, which is significant at the 95 % confidence level. This decadal trend is about 4.94 % per decade of the annual mean sea ice extent of 12.076 million km². The margin of error is about 0.18 million km² per decade (Fig. 6). On the other hand, an almost zero but slightly positive trend (0.04 ± 0.21 million km² per decade) is found in the Southern Hemisphere (Fig. 6). This trend represents a rate of increase of less than 0.3 % per decade relative to the 20 yr annual mean sea ice extent of 12.179 million km² and is not significant at the 95 % confidence level. For the Northern Hemisphere, while the annual maximum sea ice extent decreases at a rate similar to that of the annual mean sea ice extent (0.556 ± 0.23 million km² per decade), the annual minimum sea ice extent decreases at a faster rate, 0.99 ± 0.48 million km² per decade (Fig. 7), indicative of the effect of enhanced summer melt (Markus et al., 2009), thinning of the ice cover, and loss of older ice types (e.g., Maslanik et al., 2011). Both annual maximum and minimum sea ice extent decadal trends are significant at the 95 % confidence level. On the other hand, both annual maximum and minimum sea ice extents in the Southern Hemisphere increase, with the rate for the annual maximum (0.35 million km² per decade, significant at the 95 % confidence level) more than double that for the annual minimum (0.138 million km² per decade, not significant at the 95 % confidence level). This difference in magnitudes is in part due to the large difference in absolute mean extent at the minimum and maximum, as well as climate factors such as atmospheric and oceanic circulation.
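A decadal trend with a 95 % margin of error can be estimated from an annual time series roughly as sketched below. The paper does not state how its margin of error was derived, so the approach here (ordinary least squares, with the margin taken as the two-sided Student t critical value times the standard error of the slope) is an assumption, as are the function and constant names. The default critical value of 2.101 applies to 18 degrees of freedom, i.e., a 20-year series.

```python
import math

# Two-sided 95 % Student t critical value for 18 degrees of freedom
# (n = 20 annual values minus 2 fitted parameters).
T_CRIT_95_DF18 = 2.101


def decadal_trend(years, extents, t_crit=T_CRIT_95_DF18):
    """Ordinary least-squares trend of extents against years.

    Returns a dict with the trend and its margin of error per decade,
    the trend as a percentage of the series mean, and a significance
    flag (True when the confidence interval excludes zero).
    """
    n = len(years)
    mean_x = sum(years) / n
    mean_y = sum(extents) / n
    sxx = sum((x - mean_x) ** 2 for x in years)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(years, extents))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    sse = sum((y - (intercept + slope * x)) ** 2
              for x, y in zip(years, extents))
    slope_se = math.sqrt(sse / (n - 2) / sxx)
    trend = 10.0 * slope
    margin = 10.0 * t_crit * slope_se
    return {
        "trend_per_decade": trend,
        "margin_per_decade": margin,
        "percent_per_decade": 100.0 * trend / mean_y,
        "significant": abs(trend) > margin,
    }
```

The percentage figure is simply the decadal trend divided by the mean extent. For the Northern Hemisphere values quoted above, 0.597 / 12.076 gives about 4.94 % per decade. This sketch also ignores serial correlation in the residuals, which would widen the margin of error if present.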
Therefore, while the Arctic region experienced diminishing sea ice coverage over the two decades, with a faster reduction rate for the annual minimum sea ice coverage, the Antarctic region, as a whole, experienced a small increase in its sea ice coverage with a noticeable increase in its annual maximum sea ice coverage, significant at the 95 % confidence level. These trends and variability are consistent with those observed in other passive microwave sea ice products and, in particular, are in close agreement with the Goddard estimates.
In comparisons between the CDR and GSFC variables, no significant systematic bias was found. Trends and variability are also consistent. Thus the sea ice concentration CDR provides user communities with spatial and temporal variability similar to that of the GSFC fields, together with the additional documentation, traceability, and reproducibility that meet current standards and guidelines for climate data records. Version 2 is now available, with minor changes, at http://dx.doi.org/10.7265/N55M63M1. The main improvement in version 2 is the introduction of a new snow melt onset date variable (Arctic only). Version 2 also extends the sea ice concentration CDR record from 2007 to 2012, using the Special Sensor Microwave Imager/Sounder (SSMIS) brightness temperature data from DMSP F-17.
Fig. 7. Same as Fig. 6 but for the annual maximum (left panels) and minimum (right panels) sea ice extent for both the Northern (top panels) and Southern (bottom panels) Hemispheres. The decadal trend in red is significant at the 95 % confidence level.