Greenhouse gas observations from the Northeast Corridor tower network

We present the organization, structure, instrumentation, and measurements of the Northeast Corridor greenhouse gas observation network. This network of tower-based in situ carbon dioxide and methane observation stations was established in 2015 with the goal of quantifying emissions of these gases in urban areas in the northeastern United States. A specific focus of the network is the cities of Baltimore, MD, and Washington, DC, USA, with a high density of observation stations in these two urban areas. Additional observation stations are scattered throughout the northeastern US, established to complement other existing urban and regional networks and to investigate emissions throughout this complex region with a high population density and multiple metropolitan areas. Data described in this paper are archived at the National Institute of Standards and Technology and can be found at https://doi.org/10.18434/M32126 (Karion et al., 2019).


Introduction
As the population of cities grows globally due to trends toward urbanization, so does their relative contribution to global anthropogenic greenhouse gas (GHG) budgets (Edenhofer et al., 2014; O'Neill et al., 2010). City governments are making commitments to reduce their emissions of GHGs through various sustainability and efficiency measures and coordination with organizations like the C40 Climate Leadership Group (http://www.c40.org, last access: 23 March 2020), the Global Covenant of Mayors for Climate and Energy (https://www.globalcovenantofmayors.org, last access: 23 March 2020), and others. These organizations require individual cities to conform to certain standardized mechanisms and practices for reporting their carbon emissions. City governments rely on inventories compiled using data on fuel use, energy usage, etc., to estimate their total emissions and changes over time and to determine the efficacy of various emissions mitigation policies. Analysis of atmospheric measurements provides additional useful information to such efforts, by confirming inventory estimates (Sargent et al., 2018; Lauvaux et al., 2016), detecting trends (Mitchell et al., 2018), or estimating emissions that are not well quantified using inventory methods, such as methane emissions (Ren et al., 2018; Lamb et al., 2016; Yadav et al., 2019). Several urban top-down measurement efforts are underway in various cities that include networks of observations, often in situ CO2 and CH4 measurements from rooftops or towers (Verhulst et al., 2017; Xueref-Remy et al., 2018; Bares et al., 2019) or using other long-path and remote sensing methods (Waxman et al., 2019; Hedelius et al., 2018; Wong et al., 2016; Pillai et al., 2016).
The National Institute of Standards and Technology (NIST) has partnered with other federal, private, and academic institutions to establish three urban test beds in the United States: the Indianapolis Flux Experiment (INFLUX, http://influx.psu.edu, last access: 23 March 2020), the Los Angeles Megacities Carbon Project (http://megacities.jpl.nasa.gov, last access: 23 March 2020), and the Northeast Corridor (NEC, http://www.nist.gov/topics/northeastcorridor-urban-test-bed, last access: 23 March 2020). The goals of the urban test beds are to develop and refine techniques for estimating greenhouse gas emissions from cities and to understand the uncertainty of emissions estimates at various spatial and temporal scales (e.g., whole-city annual emissions vs. 1 km weekly emissions). Recent results from the longest-running test bed, INFLUX, show that whole-city emissions can be estimated using three different methods to within 7% (Turnbull et al., 2019).
The Northeast Corridor (NEC) was established in 2015 as the third NIST urban test bed. The goal of this project is to demonstrate that top-down atmospheric emissions estimation methods can be used in a domain that is complicated by many upwind and nearby emissions sources in the form of surrounding urban areas. The objective is to isolate the anthropogenic GHG emissions from urban areas along the US East Coast from many confounding sources upwind (cities, oil and gas development, coal mines, and power plants) and from the large biological CO2 signal from the highly productive forests nearby and within the cities. The presence of highly vegetated areas such as urban parks, local agriculture, and managed lawns is expected to dominate the CO2 signal in summertime, as has been found in Boston, MA (Sargent et al., 2018). The NEC project has a current focus on the urban areas of Washington, DC, and Baltimore, MD, USA, with existing plans to expand northward to cover the entire urbanized corridor of the northeastern US, including the cities of Philadelphia and New York City, and eventually linking up with existing measurement stations in Boston, MA (McKain et al., 2015; Sargent et al., 2018).
The NEC project includes multiple measurement and analysis components. The backbone of the NEC project is a network of in situ CO2 and CH4 observation stations with continuous high-accuracy measurements of these two greenhouse gases. In addition, periodic flight campaigns of multiple weeks each year are conducted by the University of Maryland (FLAGG-MD, http://www.atmos.umd.edu/~flaggmd, last access: 23 March 2020) and Purdue University (https://www.science.purdue.edu/shepson/research/ALARGreenhouseGas/, last access: 23 March 2020), focusing on wintertime observations of CO2, CH4, CO, O3, SO2, and NO2 from instrumented aircraft (Lopez-Coto et al., 2020a). The use of low-cost CO2 sensors is also being investigated in Washington, DC, with work focusing on calibration and determination of long-term stability of inexpensive nondispersive infrared (NDIR) sensors with potential for use in CO2 data assimilation techniques (Martin et al., 2017). The NEC project also includes an extensive modeling component. First, high-resolution meteorological modeling (using the Weather Research and Forecast (WRF) model) is being conducted (Lopez-Coto et al., 2020b), with output coupled to Lagrangian dispersion models such as STILT (Lin et al., 2003; Nehrkorn et al., 2010) and HYSPLIT (Stein et al., 2015). These transport and dispersion models are used to interpret observations from both aircraft and tower stations and in atmospheric inverse analyses to estimate fluxes of CO2 and CH4 from the cities of Washington, DC, and Baltimore, MD (Lopez-Coto et al., 2020a; Huang et al., 2019). A high-resolution fossil fuel CO2 inventory, Hestia, is also being developed for this project (Gurney et al., 2012).
Here we focus on the high-accuracy tower observation network and associated data collection and processing methods. Section 2 describes the tower network design and characterizes the different site locations; Sect. 3 describes the measurement methods, instrumentation, and calibration; Sect. 4 presents the uncertainty derivation for the measurements; and, finally, Sect. 5 presents some of the observations from the current record.

Network design and site characterization
The NEC project includes 29 observation stations, all managed and operated by Earth Networks, Inc. 1 (http://www.earthnetworks.com/why-us/networks/greenhouse-gas, last access: 23 March 2020). A total of 10 stations were existing Earth Networks (EN) measurement sites in the northeastern US that became part of the NEC project in 2015. A total of 19 stations were established (or will be established) specifically for the NEC project, with site locations identified by NIST. A total of 16 of these station locations were chosen to be used for emissions estimation in a domain around Baltimore and Washington, DC (red boundary, Fig. 1), using inverse modeling techniques (Lopez-Coto et al., 2017; Mueller et al., 2018). Three others are in Mashpee, MA, Philadelphia, PA, and Waterford Works, NJ. As of publication, 14 of these 19 have been established; the remainder have been delayed by the difficulty of finding suitable towers whose owners agree to house the systems. The hardware and software operating at all the sites are identical, with a few exceptions noted in the text.
The initial design of the core urban Baltimore-Washington network was focused on optimizing tower site locations with the goal of reducing uncertainty in estimating anthropogenic CO2 emissions from Washington, DC, and Baltimore using an atmospheric inversion model (Lopez-Coto et al., 2017). A total of 12 communications towers were identified as part of that study as ideal locations for measurements. Actual measurement sites were sometimes established at locations near the ideal study location, usually due to logistical difficulties obtaining leases at the ideal tower sites. A second design study determined ideal locations for background stations, i.e., observation station locations that would aid in the determination of background CO2 entering the analysis domain (Mueller et al., 2018). Four stations were identified as part of that study; an existing EN site in Bucktown, MD, serves as a fifth background station southeast of the analysis domain (Fig. 1). Although the desired inlet height was 100 m above ground level (a.g.l.), shorter towers were often used because tall towers were not available in ideal locations; the shortest tower in this network has the uppermost inlet at 38 m a.g.l. (HRD). Table 1 lists the details and location of each site.
1 Certain commercial equipment, instruments, or materials are identified in this paper in order to specify the experimental procedure adequately. Such identification is not intended to imply recommendation or endorsement by the National Institute of Standards and Technology nor is it intended to imply that the materials or equipment identified are necessarily the best available for the purpose.
The stations in Table 1 are all situated in areas with different land use. Even among the Washington, DC, and Baltimore area urban stations, the degree of urban intensity varies, from densely urbanized areas (such as northeastern Baltimore, NEB) to dense and moderately developed suburbs (such as Capitol Heights, CPH, and Derwood, DER, both suburbs of Washington, DC, located in Maryland). Figure 2 indicates the intensity of development from the US Geological Survey (USGS) 2016 National Land Cover Database (Yang et al., 2018) surrounding each urban station in the Washington, DC, and Baltimore network.
Similar variability in land cover for the regional stations exists, as indicated in Fig. 3. The sites established to characterize background conditions for the urban network in Washington, DC, and Baltimore (SFD, TMD, BUC) are in areas with little development: SFD and TMD are both in forested regions, while BUC is near the Chesapeake Bay and large wetland areas. The other regional sites span a range of land cover types from urban (MNY in New York City and RIC in Richmond, VA), to mostly rural and forested (DNH in Durham, NH).

Carbon dioxide, methane, and carbon monoxide measurements, instrumentation, and calibration

Instrumentation
The instrumentation contained in the Earth Networks (EN) system module has been described elsewhere (Welp et al., 2013; Verhulst et al., 2017); we will summarize the system here but refer the reader to those publications for further details, including additional equipment and part numbers. Figure 4 shows the plumbing diagram of the typical tower setup. Three inlet lines reach from the sampling location on the tower into the equipment housed in a full-size rack inside a shed at the base of the tower. Typically, two inlet lines sample from the topmost level and one line samples from a lower level on the tower. Stafford, VA (SFD), is one exception with inlets at three different levels (50, 100, and 152 m), and a planned tower in New Jersey (Waterford Works) will have five inlet height levels, as indicated in Table 1. At some sites there was no space to house the equipment in existing structures; thus, small single- or double-rack-sized enclosures were purchased and installed. Air is pulled through a filter into the inlet lines (0.953 cm, i.e., 0.375 in., OD Synflex 1300) that are continuously flushed at ~10 L min−1 by aquarium pumps (Alita AL-6SA). The three air lines are connected to a rotary multi-port valve (MPV; eight-port, VICI, Valco Instruments Co. Inc.) housed within a sample control box (calibration box). Two or three calibration standards are also connected to the MPV with 0.156 cm (0.0625 in.) OD stainless steel tubing. The control system for the MPV directs the air stream to the analyzer, cycling through the three inlet lines every 20 min so that each inlet is sampled at least once an hour, and through each standard every 22 h (Sect. 3.2).
The common port of the MPV is connected to a pressure controller that reduces the pressure to 80 kPa (800 mb), after which the sample (either ambient air or air from a standard gas cylinder) enters a 183 cm long Nafion dryer (Permapure, Inc., model MD-050-72S-1), where it is dried to a water vapor mole fraction of ~0.1% prior to flowing through the cavity ring-down spectroscopic (CRDS) analyzer (Picarro, Inc., Model 2301). The lower-than-ambient inlet pressure of 80 kPa is prescribed in order to lower the flow rate of the analyzer to ~70 standard cm3 min−1. At Mashpee, MA (MSH), a CRDS Picarro Model 2401 analyzer is operational, and this is the only site currently also measuring carbon monoxide (CO) in addition to CO2 and CH4. The CRDS analyzers report the dry air mole fraction of each gas (hereafter simply the mole fraction), i.e., moles of the trace gas per mole of dry air. Throughout, we refer to these measurements in units of μmol mol−1 for CO2 and nmol mol−1 for CH4 and CO, following the SI recommendations (Bureau International des Poids et Mesures, 2019). Software (GCWerks, Inc.) installed on a separate mini-PC at each site controls the run cycle and the MPV selection valve. The data are collected on this computer and sent to the central EN data server, also running GCWerks. All data are processed on the central EN data server, but additional post-processing and uncertainty assignment to hourly observations are performed at NIST. As recommended by the World Meteorological Organization (WMO), the software has the capability of reprocessing all the data from the original raw files and thus can accommodate any changes to the assigned values of the standards (due to a reference-scale update, for example) at any time (WMO, 2018).

Calibration cylinders
When the Earth Networks GHG monitoring system was established in 2011, each site hosted two calibration cylinders (standards) with ambient level dry air mole fractions as part of the original system design. This continues to be the case at most NEC sites. At the NEC sites, these standards have values close to 400 μmol mol−1 dry air CO2, 1890 nmol mol−1 dry air CH4, and 115 nmol mol−1 dry air CO (at MSH only) and are sampled by the analyzer periodically, in a sequence identical to that described for the Los Angeles Megacity network by Verhulst et al. (2017). The standards are purchased from the WMO Central Calibration Laboratory (CCL), the National Oceanic and Atmospheric Administration's Earth System Research Laboratory (NOAA/ESRL) Global Monitoring Division in Boulder, CO, USA, where they have been calibrated on the WMO scales (X2007 for CO2, X2004A for CH4, and X2014A for CO; Zhao et al., 1997; Dlugokencky et al., 2005; Novelli et al., 2003). One of these two cylinders serves as a standard for calibration and drift correction, while the second serves as a target tank or check standard. The target tank is used for data quality checks and uncertainty calculations (Sect. 4). The residual of the target tank (the rms difference between its value assignment when treated as an unknown and its reference value from NOAA) is a critical indicator of data quality and is monitored in order to alert the operators to any general problems in the system such as leaks, mistakes in the assignment of MPV ports, or drift in calibration tank value. In the field, all gas standards are sampled for 20 min every 22 h. In data processing, the first 10 min of any tank run are filtered out to allow for system equilibration, including flushing of the regulator and tubing.
In some cases, when the standard runs were found not to equilibrate as quickly as desired, 15 min of data were filtered until the problem could be fixed (typically either contamination or inadequate regulator flushing). The first 10 min of the ambient air sample following a standard run are also filtered for equilibration, and the first 1 min of each 20 min ambient air run is filtered if it follows another ambient air run (i.e., an inlet switch). The longer flush time is desired for the standard runs because of the need to flush stagnant air remaining in the regulators and tubing when sampling from the cylinder, while the ambient air lines are continuously flushed.
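The filtering rules above can be illustrated with a minimal Python sketch. It is not the GCWerks implementation; the names `Run`, `minutes_to_discard`, and `kept_minutes` are hypothetical, and the handling of the very first run in a record (nothing discarded) is an assumption, since the text does not state a rule for that case.

```python
from dataclasses import dataclass


@dataclass
class Run:
    kind: str     # "std" for a standard (cylinder) run, "air" for an ambient inlet run
    minutes: int  # run length in minutes (20 in the cycle described in the text)


def minutes_to_discard(run, previous, std_flush=10):
    """Number of leading 1 min points to drop from `run` for equilibration."""
    if run.kind == "std":
        # 10 min by default; 15 min was used for slowly equilibrating standards.
        return std_flush
    if previous is not None and previous.kind == "std":
        return 10  # ambient air immediately following a standard run
    if previous is not None and previous.kind == "air":
        return 1   # inlet switch between two ambient air runs
    return 0       # assumption: no rule is given for the first run in a record


def kept_minutes(runs, std_flush=10):
    """Minutes retained from each run after equilibration filtering."""
    kept = []
    previous = None
    for run in runs:
        drop = min(minutes_to_discard(run, previous, std_flush), run.minutes)
        kept.append(run.minutes - drop)
        previous = run
    return kept
```

For a hypothetical sequence of air, standard, air, air runs of 20 min each, this sketch retains 20, 10, 10, and 19 min, respectively.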
At a few NEC sites (currently BWD and MSH, with more planned), a third gas cylinder is installed at the site to serve as a permanent high-concentration standard (referred to as the high standard), to improve calibration and reduce uncertainties. This standard typically contains air with a mole fraction of CO2 close to 500 μmol mol−1, CH4 at approximately 2300 to 2500 nmol mol−1, and, at MSH, CO near 320 nmol mol−1. At MSH, this cylinder has been provided directly by NOAA/ESRL, while at BWD this cylinder was purchased as natural whole air from Scott-Marrin, Inc. (now Praxair). The Scott-Marrin air is stripped of its original trace gases (CO2, CH4, CO, hydrocarbons, etc.) with CO2, CH4, and CO added back in to prescribed values. Several such standards have been purchased with the intent of placing them at urban stations to serve as high standards after calibrating them onto the WMO scales. We note that because they are being used together with NOAA/ESRL standards in the field, it is essential that these standards also be assigned values on the same scales. This calibration is transferred in the NIST laboratory using five standards calibrated and purchased from NOAA/ESRL. The CO2 in the Scott-Marrin cylinders is isotopically different (in terms of the 12C/13C ratio in CO2) from the ambient air tanks that are filled by NOAA/ESRL at Niwot Ridge, CO. However, the calibration is transferred from the NOAA standards to the Scott-Marrin gases using the same model (Picarro 2301) analyzer used in the field (i.e., measuring only 12CO2) in the NIST laboratory, effectively canceling out the error that would be caused by this isotopic mismatch (Chen et al., 2010; Santoni et al., 2014). Thus, the CO2 values assigned by NIST to these standards are effectively the total dry air mole fraction of CO2 the cylinders would contain if they were isotopically similar to the NOAA cylinders.
Additional sites in the network also benefit from the improved two-point calibration method in cases where measurements of a high standard were performed prior to analyzer deployment (NWB, NEB, JES, TMD, CPH, and HRD). Prior to system installation at these sites, tests were conducted at the EN laboratory in which the designated analyzer was set up measuring the calibration standard, target standard, and a high-value standard at ~490 μmol mol−1 CO2 and ~2560 nmol mol−1 CH4 daily for several days (enough for 3-5 measurements of 20 min each). This single high-standard cylinder was also calibrated by and purchased from NOAA/ESRL, with assigned values on the WMO scales. These laboratory tests allow the determination of the secondary correction to the instrument response or sensitivity, as described in Sect. 3.4.
The high-standard gas measurements are used to perform a secondary correction (referred to as a two-point calibration; Sect. 3.4) to the original one-point calibration described by Verhulst et al. (2017) and in Sect. 3.3, reducing the uncertainty of the measurements. We note that while, in principle, a secondary correction is desirable, and the uncertainty is indeed reduced by its implementation (see Sect. 4.2), the correction itself remains quite small relative to the signals of interest in an urban network. Deployment of high standards at all sites has not yet occurred due to both costs and logistical and operational constraints; for example, at many sites the space available for the equipment is limited and prohibits the installation of a permanent third tank. Thus, we plan to implement a round-robin procedure circulating additional standards at various values through the network to evaluate the calibrations and implement the secondary correction throughout the network. Although the current state of having two different calibration methods coexisting in the network is not ideal, we aim to implement the secondary correction throughout the network as soon as possible.

Drift correction and single-point calibration
Here we describe the calibration and drift correction applied to all the mole fraction data. This single-point calibration uses only a single reference value, that of the calibration standard, to correct the raw mole fractions for each gas. The equations are identical (with a few nomenclature differences) to those found in Verhulst et al. (2017). In the following analysis, $X'$ denotes a raw dry mole fraction measurement (i.e., a reported value from the CRDS analyzer after internal water vapor correction), while $X$ denotes a mole fraction after some correction has been applied (drift and/or calibration, as described in the equations below). A subscript cal indicates the main calibration standard (usually a single ambient level standard tank calibrated by NOAA/ESRL), subscript std indicates any other standard tank, tgt indicates a standard tank that is being used as a target, and the subscript air indicates the sample measurement. Note that within the GCWerks software, the meanings of the abbreviations cal and std are reversed from what is defined here; we choose to use the nomenclature by Verhulst et al. (2017) here for consistency with the literature. We note that we have changed some nomenclature slightly from Verhulst et al. (2017) for additional clarity and conciseness. We refer to the drift-corrected mole fraction as $X_{\mathrm{DC}}$, which is noted as $X_{\mathrm{corr}}$ by Verhulst et al. (2017); we refer to the mole fraction after a secondary correction is applied as $X_{\mathrm{SC}}$. We also refer to the assigned mole fraction of a standard by the calibration laboratory as $C$ rather than $X_{\mathrm{assign}}$. We define the sensitivity $S$ to be the response of the analyzer or the ratio of the measured to the true value.
In the case of the calibration tank, this is the ratio of the raw measured value, $X'_{\mathrm{cal}}$, to the assigned value of the standard by the calibration laboratory on the WMO scale for the given species, $C_{\mathrm{cal}}$:
$$S = \frac{X'_{\mathrm{cal}}}{C_{\mathrm{cal}}}. \tag{1}$$
When only a single calibration standard is present (which is the case at most sites in the NEC network), this sensitivity is assumed to be constant across mole fractions but varying in time. The sensitivity for the calibration tank is thus interpolated in time and applied as a correction for the dry air mole fractions of CO2 and CH4 reported by the CRDS analyzer ($X'_{\mathrm{air}}$):
$$X_{\mathrm{DC,air}} = \frac{X'_{\mathrm{air}}}{S}, \tag{2}$$
where $X_{\mathrm{DC,air}}$ is the drift-corrected air data. An alternative drift correction is to use an additive offset, which is also interpolated in time, rather than a sensitivity:
$$X_{\mathrm{DC,air}} = X'_{\mathrm{air}} + C_{\mathrm{cal}} - X'_{\mathrm{cal}}. \tag{3}$$
Measurements from MSH that include a high-value cylinder suggest that the single-tank drift correction performs (very slightly) better using the ratio correction (Eq. 2) than the difference method (Eq. 3) for CO2 and CH4, while the opposite is true for CO (Fig. 5); thus, the difference method is used only for CO in our network.
The calibration standard mole fractions are interpolated in time between subsequent runs in order to apply the above corrections to the air data, thus removing drift in the instrument's response. This drift-corrected mole fraction is reported in the hourly data files for sites and time periods where no range of concentrations is available in the standard tanks.
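As an illustration of the single-point drift correction, the following Python sketch interpolates the raw calibration-tank readings linearly in time and applies either the ratio correction (Eq. 2) or the difference method (Eq. 3). It is a sketch under stated assumptions rather than the GCWerks implementation: the function names are hypothetical, linear interpolation between bracketing standard runs is assumed, and times outside the first and last standard runs are clamped to the nearest run.

```python
from bisect import bisect_right


def interpolate(times, values, t):
    """Linearly interpolate `values` (sampled at sorted `times`) to time `t`.

    Assumption: outside the sampled range, the nearest value is held constant.
    """
    if t <= times[0]:
        return values[0]
    if t >= times[-1]:
        return values[-1]
    i = bisect_right(times, t)
    t0, t1 = times[i - 1], times[i]
    v0, v1 = values[i - 1], values[i]
    return v0 + (v1 - v0) * (t - t0) / (t1 - t0)


def drift_correct(x_air_raw, t, cal_times, cal_raw, c_cal, method="ratio"):
    """Drift-correct one raw air mole fraction measured at time `t`.

    cal_times, cal_raw: times and raw readings of the calibration standard runs.
    c_cal: assigned value of the calibration standard.
    """
    x_cal_raw = interpolate(cal_times, cal_raw, t)
    if method == "ratio":
        sensitivity = x_cal_raw / c_cal   # sensitivity: raw reading over assigned value
        return x_air_raw / sensitivity    # ratio correction (Eq. 2)
    if method == "difference":
        return x_air_raw + (c_cal - x_cal_raw)  # additive offset (Eq. 3)
    raise ValueError("method must be 'ratio' or 'difference'")
```

Because the assigned value of the calibration standard is constant, interpolating its raw readings and then forming the ratio is equivalent to interpolating the sensitivity itself.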

Multiple-point calibration
At some sites and for some time periods, a higher-mole-fraction standard is available, and a second-order correction can be made to the instrument sensitivity, accounting for the sensitivity being a function of mole fraction. Usually in the field, this correction employs only one additional standard, the higher-mole-fraction standard, so that it is a two-point calibration; here we describe the general procedure for applying a correction using multiple standards at a range of concentrations. This is applied as a second-order correction to the drift-corrected air data. In general, if a range of standard concentrations is available, the correction in GCWerks is applied as described below. First, a drift-corrected sensitivity ($S_{\mathrm{DC}}$) is calculated for each standard when it is measured, which is the ratio of the drift-corrected mole fraction of that standard ($X_{\mathrm{DC,std}}$, based on Eq. 2 for CO2 and CH4 or Eq. 3 for CO) to its assigned value:
$$S_{\mathrm{DC}} = \frac{X_{\mathrm{DC,std}}}{C_{\mathrm{std}}}. \tag{4}$$
For the calibration standard, this value is necessarily equal to 1, but measurements of standard tanks with different assigned values indicate that the instrument sensitivity is dependent on the composition of the sample gas (in this case, the mole fraction of the standard tank). In laboratory calibrations, we find that the drift-corrected sensitivity defined in Eq. (4) is a linear function of the mole fraction ratio to the calibration gas ($X'/X'_{\mathrm{cal}}$); thus, we use a linear fit to the range of standards to determine the slope $m$ and intercept $b$:
$$S_{\mathrm{DC}} = m\,\frac{X'}{X'_{\mathrm{cal}}} + b. \tag{5}$$
In this fit, we force $m + b = 1$ by fitting a slope $m$ and then setting $b = 1 - m$ in order to maintain the proper relationship for the calibration tank itself, when $S_{\mathrm{DC,cal}} = 1$. Applying this fit to the air data, the final air mole fraction $X_{\mathrm{SC,air}}$ is determined from
$$X_{\mathrm{SC,air}} = \frac{X_{\mathrm{DC,air}}}{m\,\dfrac{X'_{\mathrm{air}}}{X'_{\mathrm{cal}}} + b}. \tag{6}$$
In the NEC tower network, there are no sites with multiple standard tanks at various concentrations.
At several sites, there are measurements of a single high-concentration standard (hstd) in addition to the calibration and target standards. The high-standard measurements are either performed in the laboratory before the instrument is deployed to the field, or in the field if the third standard is permanently installed (Sect. 3.2). The above secondary correction is applied using only two tanks to perform the fit and obtain the drift-corrected sensitivity. In this special case, the fit has zero degrees of freedom with no residuals. The correction parameters (slope and intercept) are determined based on measurements over time or single measurements in the laboratory prior to a specific analyzer deployment. The correction is not applied automatically based on daily measurements of the high standard; instead, the science team determines the correction and specifies the time period over which it is applied to the data from a site. This is necessary to avoid applying the wrong correction if an analyzer is replaced or if there are changes made to the analyzer that might affect its calibration response. At eight sites where a high standard has been measured at any point (MSH, BWD, NWB, NEB, JES, TMD, CPH, and HRD), slopes and intercepts have been determined and the correction has been applied to the data. At stations with no high-standard measurements, we rely on the single-tank drift correction described in Sect. 3.3.
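For the two-tank special case, the slope and intercept follow directly from the line through the calibration standard (ratio 1, sensitivity 1) and the high standard. The Python sketch below illustrates this under the definitions given in this section; the function names are hypothetical, the tank values in the usage note are invented for illustration, and the final step assumes the corrected value is the drift-corrected value divided by the fitted sensitivity, consistent with the definition of sensitivity as measured over true value.

```python
def two_point_fit(x_hstd_raw, x_cal_raw, x_dc_hstd, c_hstd):
    """Slope m and intercept b of the drift-corrected sensitivity line.

    x_hstd_raw: raw reading of the high standard
    x_cal_raw:  raw reading of the calibration standard (same time)
    x_dc_hstd:  drift-corrected mole fraction of the high standard
    c_hstd:     assigned value of the high standard
    """
    s_dc_hstd = x_dc_hstd / c_hstd   # drift-corrected sensitivity of the high standard
    ratio = x_hstd_raw / x_cal_raw   # mole fraction ratio to the calibration gas
    if ratio == 1.0:
        raise ValueError("high standard must differ from the calibration standard")
    m = (s_dc_hstd - 1.0) / (ratio - 1.0)  # line through (1, 1) and (ratio, s_dc_hstd)
    b = 1.0 - m                            # enforces m + b = 1 for the calibration tank
    return m, b


def secondary_correct(x_dc_air, x_air_raw, x_cal_raw, m, b):
    """Apply the secondary correction to a drift-corrected air value."""
    sensitivity = m * (x_air_raw / x_cal_raw) + b
    return x_dc_air / sensitivity
```

With hypothetical values (calibration tank reading 400.0 against an assigned 400.0, high standard reading 500.2 against an assigned 500.0), applying the correction to the high standard itself returns its assigned value, and the calibration standard is left unchanged, as expected for a fit with zero degrees of freedom.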
Laboratory tests with multiple standards with the same model instrument used in the network (Picarro 2301) were performed to assess the relative improvement of a fit to two standards over a fit to a single standard. Figure 6a illustrates the fit of the drift-corrected sensitivity ($S_{\mathrm{DC}}$) to two standards (red line) vs. all five standards (blue line) for CO2, along with corresponding residuals in Fig. 6b. As was shown by Verhulst et al. (2017) for multiple analyzers, the fit to a single standard has a linearly varying residual that is typically 0.1 to 0.2 μmol mol−1 at 100 μmol mol−1 above the calibration standard value (green circles, Fig. 6b). The average slope of the one-point residual from multiple tests is used by Verhulst et al. (2017) to estimate the uncertainty of the single-point calibrations (called the extrapolation uncertainty, $U_{\mathrm{extrap}}$), described in Sect. 4.1. Performing the additional correction using a high standard shows improvement in the residuals of the fit (Fig. 6b), while using all five standards only improves the residuals incrementally. The two-point correction (red) in this figure was applied using the 406 μmol mol−1 tank as the calibration and the 496 μmol mol−1 tank as the high standard; thus, the measurement at ~711 μmol mol−1 is an extrapolation of the two-point fit. The residuals at values between the calibration and high standard are very small, equal to or below the uncertainty (reproducibility) of the scale reported by NOAA; this was confirmed for other analyzers and other species.
The improvement in calibration from the secondary correction is quite small compared to the signals and gradients of interest in our network. For example, when considering the enhancement between the rural site TMD and a polluted urban site, HRD, the calibration method makes a median difference of 0.4% for CO2 and 0.3% for CH4 (over all hours over 1 calendar year). We intend to implement this calibration throughout the network through deployment of additional standards and periodic traveling calibrations when permanent installation is not practical for logistical reasons.

Data quality and processing
Automated data filtering is performed within the GCWerks software with parameters set identically to those extensively described by Verhulst et al. (2017) for the Los Angeles Megacities network. For example, individual measurements are filtered when the cavity temperature or cavity pressure is outside set limits or when they occur during transitions between sample streams. The data are automatically downloaded from each site's Linux PC to the central EN Linux server, where they are processed automatically every hour. We note that all mole fraction assignments can be recalculated by the GCWerks software from the archived raw files if required due to a change in filtering or flagging, or in assignment of a standard tank, for example, in the case of a scale change by the CCL. The data files exported from GCWerks contain 1, 5, and 20 min averaged air data, as well as separate files with 1, 5, and 20 min averages of all standard runs. Individual or groups of 1 min data points are flagged manually by EN or NIST researchers in GCWerks if there is cause (e.g., a site visit that disrupted the sample stream or a leak in the line). Some additional quality checking is performed at this stage, specifically checking for systematic differences between measurements from two different inlets at the same height and checking for inconsistencies in the difference between measurements at different heights. For example, if the lower inlet is systematically reading lower CO2 than the upper inlet, especially at night, it would indicate that the inlet lines may be switched (mislabeled) or that there is a leak. These indications would then be verified by a field technician, and the data are either reprocessed or flagged accordingly. Filtered and flagged points are excluded from the subsequent averaging exported by GCWerks.
The 1 min air data files and 20 min standard data files are post-processed at NIST to calculate hourly averages from each air inlet level and to assign uncertainties to each hourly average (Sect. 4). Data from the two top-level inlets, when they are at the same height, are combined for inclusion into the hourly average. Thus, because of the 20 min cycling through the three inlets (Sect. 3.1), hourly averages at the upper inlet include approximately 40 min of measurements, and those at the lower inlet only 20 min (fewer if a calibration occurs). Publicly released hourly data from this second-level processing are contained in separate files for each species and each level for each site. The files contain the hourly average mole fraction along with its uncertainty, standard deviation, and number of 1 min air measurements included in that particular hourly average. These last two quantities are provided so users can determine the standard error of the hourly means in terms of the observed atmospheric variability within the hour. Observations at higher frequency and standard tank data are available by request.
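As a sketch of how the hourly statistics relate to one another, the Python snippet below computes the mean, standard deviation, count, and standard error from the 1 min values falling within one hour. The function name and dictionary keys are hypothetical, the use of the sample (n − 1) standard deviation is an assumption not stated in the text, and the standard error shown treats the 1 min values as independent, which is a simplification for autocorrelated atmospheric data.

```python
from math import sqrt
from statistics import mean, stdev


def hourly_summary(one_minute_values):
    """Summarize the unflagged 1 min mole fractions from one inlet level in one hour.

    Returns None when no valid 1 min points exist for the hour.
    """
    n = len(one_minute_values)
    if n == 0:
        return None
    avg = mean(one_minute_values)
    # Assumption: sample standard deviation; undefined for a single point, so use 0.
    sd = stdev(one_minute_values) if n > 1 else 0.0
    return {
        "mean": avg,
        "std": sd,
        "n": n,
        "sem": sd / sqrt(n),  # standard error of the hourly mean
    }
```

A user of the released files could compute the same standard error directly from the reported standard deviation and number of 1 min points, without access to the higher-frequency data.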

Comparison with measurements of NOAA whole air samples
Ongoing whole air sampling in flasks at several of the NEC sites by NOAA Earth System Research Laboratory's Global Monitoring Division (NOAA/GMD) provides a check on the quality of the in situ measurements. The flasks are analyzed for CO 2 , CH 4 , and CO, among a suite of additional trace gases and isotopes that are not discussed here. The flask-sampling equipment draws air from one of the inlet lines at the top of the tower that is also shared by the continuous in situ measurement equipment (as indicated by the flask port in Fig. 4). The flask measurements are otherwise independent from the continuous in situ measurements. Comparisons at all the sites with available data indicate good agreement with little or no bias in the mean over the time period of the comparison, with the exception of CO at MSH. There, the offsets show a consistent bias with a median of 8 nmol mol −1 , which is larger than both the 1σ uncertainty assigned to either measurement (described in Sect. 4) and the standard deviation of the offsets themselves (Table 2). Target tank residuals for CO in this period range from 1 to 7 nmol mol −1 , depending on the cylinders installed, indicating that at least some of this difference is caused by the calibration standard assigned value (possibly due to cylinder drift in time between the NOAA calibration and deployment to the site). Similar differences between NOAA flasks and in situ CO measurements were reported in Indianapolis (Richardson et al., 2017). This result requires further investigation, by sending the cylinders for recalibration and/or deploying different standards to the station. A significant bias in the mean CH 4 offset at NWB is also apparent, with a mean of −5.5 nmol mol −1 but a median of −1.7 nmol mol −1 ; the difference between the two is the result of a single outlier at −30 nmol mol −1 among only 17 samples compared. BWD did not have any samples at the time of writing; thus, we compare only LEW, MSH, TMD, NEB, and NWB.
Table 2 also reports the mean uncertainty, intended as a metric against which to compare the standard deviation of the offsets. For each flask sample, this uncertainty is the quadrature sum of the continuous data uncertainty (described in Sect. 4) at that hour, the standard deviation of the 1 min averages in the continuous data during that hour, and the uncertainty expected in the flask measurement, estimated here as 0.04 μmol mol −1 for CO 2 , 1.12 nmol mol −1 for CH 4 , and 0.59 nmol mol −1 for CO. The values for the flask uncertainty are from Table 1 in Sweeney et al. (2015), which reports the average offset between measurements of surface network and 12-pack flasks (such as those used for the NEC) filled with identical air after a short-term storage test. For CO 2 , flask offsets can be larger than indicated by those dry-air laboratory tests (Andrews et al., 2014; Karion et al., 2013), but we use 0.04 μmol mol −1 regardless because the average uncertainty in Table 2 is dominated by the atmospheric variability term, and increasing the CO 2 uncertainty in the flasks to 0.1 μmol mol −1 (for example) does not change the values significantly.
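The quadrature sum described above can be sketched as follows. The flask uncertainties are the values quoted in the text; the function name, the input values in the example, and the units bookkeeping are illustrative assumptions rather than the processing code used for Table 2.

```python
import math

# Flask measurement uncertainties quoted in the text
# (umol/mol for CO2, nmol/mol for CH4 and CO).
FLASK_UNCERTAINTY = {"CO2": 0.04, "CH4": 1.12, "CO": 0.59}

def comparison_uncertainty(species, u_continuous, std_1min, u_flask=None):
    """Quadrature sum of continuous-data uncertainty, within-hour variability, and flask uncertainty."""
    if u_flask is None:
        u_flask = FLASK_UNCERTAINTY[species]
    return math.sqrt(u_continuous**2 + std_1min**2 + u_flask**2)

# Hypothetical CO2 hour dominated by atmospheric variability (1.0 umol/mol).
base = comparison_uncertainty("CO2", u_continuous=0.1, std_1min=1.0)
# Same hour with the flask term raised to 0.1 umol/mol.
alt = comparison_uncertainty("CO2", u_continuous=0.1, std_1min=1.0, u_flask=0.1)
```

In this hypothetical case the two results differ by less than 0.01 μmol mol −1 , which illustrates why the choice of flask uncertainty matters little when the variability term dominates.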
Standard deviations of the offsets (Table 2) show considerable scatter in the results, especially at the more urban sites, where the continuous data exhibit high variability. For comparison, Turnbull et al. (2015) report agreement for CO 2 between the same flask system and continuous in situ measurements in Indianapolis as 0.04 μmol mol −1 (mean) with a standard deviation of 0.38 μmol mol −1 , somewhat smaller than what was observed at our sites. The standard deviation of the offsets is usually lower than the average uncertainty, however, with the exception of CO 2 at MSH and LEW, the two sites for which the flask samples are not integrated over an hour. It is likely that the large variability seen over an hour is the reason for the large scatter in the offsets. Because the continuous in situ measurements do not cover the entire hour of sampling (at the top level, the hourly average is typically the mean of only 40 min), the variability may not be captured in the mean uncertainty reported here and has a larger impact on the comparison than it would if the continuous hourly average were based on the full hour of observations. For example, a large plume or spike in concentration during a given hour might occur while the continuous system is sampling from the lower inlet and thus would not be included in the hourly average from the continuous system, while it would be included in the full 1 h flask sample.

Uncertainty
The data set includes an uncertainty estimate on each hourly average data point, consistent with recommendations from the WMO (WMO, 2018). This uncertainty is our estimate of the uncertainty of the measurement itself and does not include atmospheric variability or assess the representativeness of the measurement of a true hourly mean.

Uncertainty of hourly mole fraction data
Verhulst et al. (2017) outlined a method for calculating an uncertainty in mole fraction measurements when using the single-tank calibration correction (drift correction). Here we present a brief overview but refer the reader to that paper for further details. All uncertainties are standard uncertainties, i.e., 1σ or k = 1. In the analysis below, we assume independent uncorrelated error components, given no evidence to the contrary and no physical reason to believe that they should be correlated; therefore, we sum the various components of the uncertainty in quadrature.
The uncertainty in the final mole fractions (U air ) is expressed as the quadrature sum of several uncertainty components:
$$U_{\mathrm{air}} = \sqrt{(U_{\mathrm{H_2O}})^2 + (U_M)^2 + (U_{\mathrm{extrap}})^2}, \tag{7}$$
where U H2O is the uncertainty due to the water vapor correction, U M is a measurement uncertainty, and U extrap is the uncertainty of the calibration fit when assigning values relative to a single standard tank (more detail on this can be found later in this section and in the following section). U M encompasses errors due to drifting standard tank measurements (U b ), short-term precision (U p ), and error in the calibration standard's mole fraction assignment by the calibration laboratory (U scale ):
$$U_M = \sqrt{(U_b)^2 + (U_p)^2 + (U_{\mathrm{scale}})^2}. \tag{8}$$
Here we note that U p for CO 2 and CH 4 is assigned as described by Verhulst et al. (2017), as the standard deviation of the individual measurements during each 1 min average during a calibration, but for CO it is assigned as the standard error (standard deviation divided by the square root of the number of samples in the mean), based on Allan variance tests (not shown) indicating that the precision of the CO measurement increases with the number of points used in the average. If no calibrations have been performed over an entire calendar year, U p is set to the 10th percentile of the standard deviation of air measurements and U b is set to a default value of 0.1 μmol mol −1 , 0.5 nmol mol −1 , and 4 nmol mol −1 for CO 2 , CH 4 , and CO, respectively. This default value for U b is based on an upper limit of values that are observed in the network; typically, U b is much smaller than these values (Verhulst et al., 2017). In the current data set, this has only occurred once: there were no calibrations run at MNC over the entire 2015 calendar year, but we have no knowledge of abnormal operations or changes during this period, with analyzer sensitivity being similar before and after this period.
Because these uncertainty components are also tested through the use of a target tank, or check standard, the uncertainty U M is instead assigned as U TGT , the root mean square of the target tank residuals (the differences between the drift-corrected and assigned values of the target tank), when U TGT exceeds the quadrature sum of the components above.
This residual is calculated by GCWerks, and the root-mean-square residual is interpolated in time as a moving 10 d average. If a target tank has not been run through the system for 10 d or longer, U TGT is set to a default value that is currently set to 0.2 μmol mol −1 , 1 nmol mol −1 , and 6 nmol mol −1 for CO 2 , CH 4 , and CO, respectively, based on typical maximum values for this uncertainty calculated from many sites over several years. The target tank in the field generally has a concentration value very similar to the calibration tank; thus, this residual is a good estimate of the uncertainty caused by the precision, baseline changes, and tank value assignment. However, it is not a good indicator of uncertainty at mole fractions different from that of the calibration tank. Therefore, we assign an added uncertainty component, U extrap , indicating the uncertainty that increases as the measurement value moves farther from the value of the calibration tank in the case of a single calibration standard. This was found to be a linear relationship for a series of similar model analyzers that were tested in a laboratory, and the uncertainty was described as follows:
$$U_{\mathrm{extrap}} = \varepsilon \left| X_{\mathrm{DC}} - C_{\mathrm{cal}} \right|, \tag{10}$$
with X DC the drift-corrected air mole fraction and C cal the assigned value of the calibration tank. See Verhulst et al. (2017) for details on determining the unitless slope of the uncertainty, epsilon (ε), which is currently assigned as 0.0025, 0.0031, and 0.0164 for CO 2 , CH 4 , and CO, respectively, for all data that are only drift corrected (i.e., not using a high standard).
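To make the combination of terms in this section concrete, the following is a minimal sketch, not the NIST processing code. It assumes that (i) the components are combined in quadrature, (ii) U M is taken as the larger of the quadrature sum of U p , U b , and U scale and the target tank term U TGT , and (iii) the extrapolation term is ε multiplied by the absolute difference between the drift-corrected value and the calibration tank value. The ε and default U TGT values are those quoted in the text; all function and argument names and the example inputs are hypothetical.

```python
import math

# Values quoted in the text (unitless slope; umol/mol for CO2, nmol/mol for CH4 and CO).
EPSILON = {"CO2": 0.0025, "CH4": 0.0031, "CO": 0.0164}
DEFAULT_U_TGT = {"CO2": 0.2, "CH4": 1.0, "CO": 6.0}

def quadrature(*components):
    """Root sum of squares of independent uncertainty components."""
    return math.sqrt(sum(c * c for c in components))

def measurement_uncertainty(u_p, u_b, u_scale, u_tgt):
    """U_M: the target tank term replaces the component sum when it is larger (assumed reading)."""
    return max(quadrature(u_p, u_b, u_scale), u_tgt)

def extrapolation_uncertainty(species, x_dc, c_cal):
    """U_extrap grows linearly with distance from the calibration tank value (assumed form)."""
    return EPSILON[species] * abs(x_dc - c_cal)

def air_uncertainty(species, x_dc, c_cal, u_h2o, u_p, u_b, u_scale, u_tgt=None):
    """U_air as the quadrature sum of the water vapor, measurement, and extrapolation terms."""
    if u_tgt is None:  # e.g., no target tank run within the last 10 d
        u_tgt = DEFAULT_U_TGT[species]
    u_m = measurement_uncertainty(u_p, u_b, u_scale, u_tgt)
    u_extrap = extrapolation_uncertainty(species, x_dc, c_cal)
    return quadrature(u_h2o, u_m, u_extrap)

# Hypothetical CO2 example (umol/mol): air 40 umol/mol above the calibration tank.
u_air = air_uncertainty("CO2", x_dc=450.0, c_cal=410.0,
                        u_h2o=0.05, u_p=0.03, u_b=0.02, u_scale=0.03, u_tgt=0.1)
```

In this hypothetical case the target tank term (0.1) exceeds the component sum (about 0.047), the extrapolation term is 0.0025 × 40 = 0.1, and the total is about 0.15 μmol mol −1 , showing how the extrapolation term can become comparable to the measurement term well away from the calibration tank value.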

Uncertainty for observations with additional standards available
When a high-standard tank is available and the secondary correction described in Sect. 3.4 is applied, the uncertainty analysis remains similar, but the uncertainty U extrap from Eqs. (7) and (10) is replaced by an uncertainty in the two-point fit, U fit . To estimate this uncertainty for CO 2 and CH 4 , we use the reported uncertainty of the assigned value of the high-standard and calibration-standard tanks, U scale , (typically 0.03 μmol mol −1 CO 2 and 0.5 nmol mol −1 CH 4 at 1σ) along with an estimate of the precision of the analyzer, U p , to estimate an uncertainty in the drift-corrected sensitivity of the high standard, U SDC,hstd , using standard propagation of errors (black error bar, Fig. 7a). We note that in the case where the value assigned to the high standard is through a propagation of the WMO scale at NIST, the assigned value has additional uncertainty; i.e., U scale includes both the uncertainty that NOAA assigned to the cylinders used for the assignment and the uncertainty from the laboratory fit at NIST. This second uncertainty is equal to the standard deviation of the residuals of the fit and is added in quadrature to the NOAA uncertainty.
We note that the analysis described below assumes uncorrelated independent errors. We express the slope of drift-corrected sensitivity (m) and the overall drift-corrected sensitivity (S DC ) as functions only of the drift-corrected sensitivity of the high standard, S DC,hstd :
$$m = \frac{S_{\mathrm{DC,hstd}} - 1}{X'_{\mathrm{hstd}}/X'_{\mathrm{cal}} - 1}, \tag{11}$$
$$S_{\mathrm{DC}} = m\,\frac{X'}{X'_{\mathrm{cal}}} + b = m\left(\frac{X'}{X'_{\mathrm{cal}}} - 1\right) + 1. \tag{12}$$
This second equation uses b = 1−m. Here we do not include uncertainty in the x coordinate, i.e., X′/X′ cal . Uncertainty in the slope is as follows:
$$U_m = \frac{U_{S_{\mathrm{DC,hstd}}}}{X'_{\mathrm{hstd}}/X'_{\mathrm{cal}} - 1}. \tag{13}$$
We propagate the uncertainty in the drift-corrected sensitivity of the high standard, U SDC,hstd , to the overall drift-corrected sensitivity of all the air values using Eq. (14) and then to the two-point corrected air data by propagating through to obtain Eq. (15).
$$U_{S_{\mathrm{DC}}} = U_m\left(\frac{X'}{X'_{\mathrm{cal}}} - 1\right) = \frac{U_{S_{\mathrm{DC,hstd}}}}{X'_{\mathrm{hstd}}/X'_{\mathrm{cal}} - 1}\left(\frac{X'}{X'_{\mathrm{cal}}} - 1\right), \tag{14}$$
$$U_{X_{\mathrm{SC,air}}} = U_{\mathrm{fit}} = \frac{U_{S_{\mathrm{DC}}}}{S_{\mathrm{DC}}}\,X_{\mathrm{SC,air}}. \tag{15}$$
To evaluate the use of standard propagation of errors, we also use a bootstrap to estimate the uncertainty using the laboratory calibration shown in Fig. 6 by randomly selecting two tanks of the five tanks from the test to calculate 1000 versions of the correction (blue shading shows the standard deviation of the result, Fig. 7). For this test, the calculated 1σ uncertainty (red shading) was similar to the 1σ bootstrap uncertainty (slightly larger for CO 2 and slightly smaller for CH 4 , not shown). This comparison indicates that the estimated uncertainty using the equations above compares reasonably well with the uncertainty we would derive from a bootstrap analysis, which gives us confidence in our methodology.
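The propagation in this section can be sketched in a few lines. The sketch reads the relations as follows: the sensitivity line passes through 1 at the calibration tank and through S DC,hstd at the high standard, the slope uncertainty is U SDC,hstd divided by (X′ hstd /X′ cal − 1), and the fit uncertainty is the relative sensitivity uncertainty multiplied by the corrected air value. The function names, the use of absolute values to keep uncertainties non-negative, and all numerical inputs are assumptions for illustration, not the operational code.

```python
def two_point_slope(s_dc_hstd, x_hstd, x_cal):
    """Slope m of drift-corrected sensitivity versus X'/X'_cal, fixed to 1 at the calibration tank."""
    return (s_dc_hstd - 1.0) / (x_hstd / x_cal - 1.0)

def drift_corrected_sensitivity(m, x_raw, x_cal):
    """S_DC = m * X'/X'_cal + b with intercept b = 1 - m."""
    return m * (x_raw / x_cal) + (1.0 - m)

def sensitivity_uncertainty(u_sdc_hstd, x_hstd, x_cal, x_raw):
    """U_SDC: high-standard sensitivity uncertainty scaled by distance from the calibration tank."""
    u_m = u_sdc_hstd / abs(x_hstd / x_cal - 1.0)
    return u_m * abs(x_raw / x_cal - 1.0)

def fit_uncertainty(u_sdc, s_dc, x_sc_air):
    """U_fit: relative sensitivity uncertainty applied to the two-point corrected air value."""
    return u_sdc / s_dc * x_sc_air

# Hypothetical CO2 example: raw readings in umol/mol, sensitivities unitless.
x_cal, x_hstd = 400.0, 500.0
s_dc_hstd, u_sdc_hstd = 1.0005, 1.0e-4

m = two_point_slope(s_dc_hstd, x_hstd, x_cal)
s_dc_450 = drift_corrected_sensitivity(m, 450.0, x_cal)
u_sdc_450 = sensitivity_uncertainty(u_sdc_hstd, x_hstd, x_cal, 450.0)
u_fit_450 = fit_uncertainty(u_sdc_450, s_dc_450, x_sc_air=449.9)
```

By construction the sensitivity uncertainty vanishes at the calibration tank value and equals U SDC,hstd at the high standard, growing linearly beyond it, which matches the shape of the shaded uncertainty described for Fig. 7.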
The uncertainty in S DC leads to the estimate of the fit uncertainty, U fit , shown in Fig. 7b. To implement this uncertainty across all times and towers, we calculate it assuming a fixed nominal value of the high calibration standard of 490.50 μmol mol −1 CO 2 and 2560.61 nmol mol −1 CH 4 . This is based on the value of the high standard that was in residence in the Earth Networks laboratory when several of the CRDS analyzers were tested and assigned two-point calibration corrections. We use the site-specific (instrument-specific and period-specific) slope and intercept that are applied to the data (which are static over the time period they are applied) and the value of the calibration tank to calculate the remainder of the values required for the uncertainty analysis.
Only one site so far, MSH, measures continuous CO, and the history of standard tanks there indicates significant uncertainty in tank value assignments with large target tank residuals and corresponding U TGT relative to errors in slope. We have chosen not to implement the two-point calibration at this site for CO because the range of slopes of S DC includes one, i.e., the correction is so small that the uncertainty dwarfs the correction.
Mean absolute residuals of the two-point fit for nine laboratory calibrations analyzed (seven tested at NOAA/ESRL and described by Verhulst et al., 2017, Table S2, and two additional units at NIST) average to 0.03 μmol mol −1 for CO 2 between the calibration and high-standard data, and larger for the test that included an even higher-concentration tank, shown in Fig. 7 at ~ 711 μmol mol −1 for CO 2 . The fit uncertainty encompasses (at 1σ) this residual as well (Fig. 7b). The residuals at lower values can be explained by the uncertainty in the measurement (precision) and uncertainty in value assignment of the tanks. For CO, only eight tests were available, with a mean residual inside the range of the calibrations of 1.1 nmol mol −1 , higher than the reported reproducibility from NOAA of 0.4 nmol mol −1 (all values are noted here at 1σ although they are given by NOAA at 2σ). This larger residual is likely caused by the lower precision of the analyzers for CO but also could be caused by larger uncertainty in the tank assignments, possibly due to drift in the mole fraction of the tanks themselves. We intend to conduct additional tests outside the two-point calibration range with additional analyzers and tanks to evaluate and possibly update this uncertainty component, U fit , as needed, and especially focus on CO if and when additional CO measurements are added to our network.

Network observations
Here we show some observations and time series of CO 2 and CH 4 from the NEC in situ tower network, focusing on data coverage, vertical gradients, and observed differences between urban and rural or outer suburban sites.

Data coverage and network expansion
The NEC network is continuously growing, with sites coming online at different times. Figure 8 shows the availability of hourly observations as the various sites have come online.

Vertical gradients
Observations in global trace gas measurement networks (e.g., AGAGE, GGRN) are specifically sited far from local sources or strong sinks to ensure that air reaching the site is representative of the large spatial scales of interest to a global study. This allows the observations to be more easily interpreted by a coarser global model (e.g., Peters et al., 2007). In urban networks, it is desirable to measure trace gas concentrations closer to sources so that finer spatial gradients can be used to inform emissions estimates at urban scales. However, a balance must be struck between the necessity to observe and distinguish sources that are in close proximity to each other and the ability of a transport and dispersion model to simulate the observations. In some instances, novel ways to simulate observations at low heights above ground level and in very dense networks have been used to resolve this problem (Berchet et al., 2017). In the NEC urban network in Washington, DC, and Baltimore, the tower sites were selected to be between 50 and 100 m above the ground given the desire to place a tower in a specific location (as identified in an initial network design study by Lopez-Coto et al., 2017). Inlets at two (or three, at SFD) heights on the tower give some insight as to the proximity of each tower to sources whose emissions are not always vertically well mixed by the time they reach the inlets, depending on atmospheric stability conditions. Here we report average vertical gradients, determined using the observations at different levels, for the urban and background sites in our network. These gradients were calculated using hourly average data from each level; because the instruments sample from only one level at any given time and cycle between levels, this calculation assumes that the measurements averaged within a given hour are representative of the entire hour.
Because different towers have different inlet heights and different vertical spacing between the lower and upper inlet, here we compare three urban sites (ARL, NDC, and JES) with inlets at similar heights, ~ 90 m and ~ 50 m a.g.l. We define the gradient as the mole fraction of CO 2 or CH 4 at the topmost inlet minus that of the lowermost inlet divided by the distance between them so that a negative gradient indicates a higher concentration at the lower inlet (the most common case).
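The gradient definition above can be written as a short function. This is only an illustration of the sign convention; the function name and the example mole fractions are hypothetical.

```python
def vertical_gradient(top_value, bottom_value, top_height_m, bottom_height_m):
    """Gradient = (topmost inlet - lowermost inlet) / vertical distance between them.

    A negative result means the mole fraction is higher at the lower inlet.
    """
    dz = top_height_m - bottom_height_m
    if dz <= 0:
        raise ValueError("top inlet must be above the bottom inlet")
    return (top_value - bottom_value) / dz

# Hypothetical hourly CO2 averages (umol/mol) at ~90 m and ~50 m a.g.l.
gradient = vertical_gradient(415.0, 415.8, 90.0, 50.0)
```

Here the lower inlet reads 0.8 μmol mol −1 higher over a 40 m spacing, giving a gradient of about −0.02 μmol mol −1 m −1 ; dividing by the spacing is what allows towers with different inlet separations to be compared.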
Analysis of the diurnal cycle of the vertical gradient at urban sites in the Washington-Baltimore area (Fig. 9) indicates different characteristics in summer vs. winter. These differences are most likely caused by different meteorology and possible seasonal differences in the timing of fluxes, especially for sites influenced by the urban biosphere. Greater turbulent mixing in summertime boundary layers and different timing of boundary layer growth and collapse mostly dominate the seasonal differences. This analysis shows that at these three sites the wintertime average gradient in midafternoon hours (defined based on these figures as 11:00-16:00 LST) is approximately −0.016 μmol mol −1 m −1 for CO 2 (−0.105 nmol mol −1 m −1 for CH 4 ), which translates to a difference of −0.8 μmol mol −1 for CO 2 (−5.2 nmol mol −1 for CH 4 ) between levels spaced 50 m apart; this is not an insignificant gradient. At other urban sites with shorter towers, the gradients can be even larger. These observations can help evaluate vertical mixing in transport and dispersion models that might be used to estimate emissions, or to identify times when modeled and observed vertical gradients agree. Large vertical gradients overnight into the early morning at all sites and seasons are indicative of local sources (likely mostly anthropogenic but also including respiration from the biosphere) influencing the observations at these times when there is stable stratification in the boundary layer and concentrations are higher near the surface. The larger CO 2 gradients overnight in summer compared to winter periods suggest a strong respiration signal at these urban sites, with a large degree of variability between sites indicated by the large spread. Nighttime CH 4 gradients are slightly larger in winter than in summer, possibly reflecting greater wintertime anthropogenic CH 4 emissions, or possibly due to seasonality in mixing layer heights.
The diurnal cycle of the vertical gradients from the sites identified as background stations for the Washington-Baltimore urban network shows large variability in summertime gradients between the three stations (Fig. 10). Stafford, VA (SFD), shows that the surrounding biosphere causes relatively large gradients in nighttime and early morning hours at this low-density suburban site. These are apparent at Bucktown, MD (BUC), as well but less so at Thurmont, MD (TMD), a forested site in western Maryland. The large difference between summertime early morning vertical CO 2 gradients at SFD and TMD, despite the similar surrounding land use (mostly deciduous forest, Fig. 3), might be caused by the elevation difference, as SFD is close to sea level while TMD is on a ridge at 561m elevation. BUC observations show larger CH 4 gradients in summer, due to surrounding wetlands and agriculture (Fig. 3). Wintertime gradients are near zero at all hours at all three of these sites, indicating that they are far from local anthropogenic sources of either gas. We note that the top inlet height at BUC is lower, at 75m, than at SFD or TMD (100 and 111m), while the lower inlet is similar for all three (~ 50m). For SFD (inlets at 152, 100, and 50m), we use the 100 and 50m inlets to define the gradient to be more consistent with the inlet heights of the other towers (Table 1).

Urban and rural differences in seasonal cycles
Here we continue to describe the network in terms of differences between rural (background) and urban stations, determining typical enhancements from urban influences. The seasonal cycles of CO 2 and CH 4 indicate enhancements in the urban sites in our network relative to the more rural stations throughout the year (Fig. 11). Summertime CH 4 at urban sites is not as enhanced compared to the rural sites as it is in winter, possibly due to wetland sources influencing the background station at BUC or lower CH 4 emissions from natural gas in urban areas. Similarly, for CO 2 , some of the rural stations surrounded by active vegetation (Fig. 3) are likely to show stronger influence from biospheric uptake than urban sites, especially in the summer months (Fig. 10). We specifically caution against using any of the in situ data from the NEC rural stations directly as a background for analysis of the urban enhancement without examining these issues. Sargent et al. (2018) indicate that for an analysis of CO 2 enhancements in the Boston urban area, CO 2 observations from upwind stations alone did not represent the correct background. Even when the air that reaches an urban tower originates near an upwind rural site, back trajectories (from a Lagrangian particle dispersion model such as STILT, for example) indicate that much of the air may originate from a higher altitude than the upwind station. Thus, the measurement at an upwind station is not necessarily representative of the proper background or incoming concentration, given the large concentration gradients between measurements within the planetary boundary layer and in the free troposphere near background stations with local fluxes. Mueller et al. (2018) conducted an analysis of the issues concerning background determination for this urban network, mostly concerning the large emissions of both CO 2 and CH 4 upwind of the region that are difficult to capture at upwind stations.
We will examine the proper background for investigating urban enhancements in the Washington, DC, and Baltimore, MD, area further in future work.

Data availability
This data set of hourly averaged observations from the Northeast Corridor tower-based network is available on the NIST data portal at https://data.nist.gov (last access: 23 March 2020) under the DOI https://doi.org/10.18434/M32126. Initially, the repository will contain data from 23 sites (Table 1) for years spanning 2015-2018; not all years are available for all sites. Files are version-dated, and the current plan is to provide annual updates for 2019 and beyond.

Conclusions
Here we present a data set of hourly average observations of CO 2 , CH 4 , and CO (where applicable) from a network of towers in the northeastern United States. Measurements are funded by NIST and conducted in a collaboration with Earth Networks, Inc., with quality control, assurance, and uncertainty determination conducted by a science team that includes NIST, Earth Networks, and collaborators from the Los Angeles Megacities Carbon Project from NASA/JPL and the Scripps Institution of Oceanography. We present 4 calendar years of data (2015 through 2018), with different stations coming online through the years, and most Washington, DC, and Baltimore, MD, urban stations becoming established after late 2015. We have also presented our methodology for calibrating the measurements to WMO scales for each gas and determining uncertainties for these measurements, as recommended by the WMO (WMO, 2018). We show that analysis of observations at two different inlet heights can be useful for determining the presence of emissions close to the towers, which may be necessary for evaluating the efficacy and choice of transport model used to analyze the data. We also note that the tower stations that were established to characterize incoming or background air are not necessarily appropriate for use directly as background for the urban stations, as they are often affected by local fluxes that do not influence the urban stations. A more careful treatment of incoming background air is necessary for any given analysis.

Figure 3. Average fraction of land cover type within 5 km of regional tower sites in the Northeast Corridor network, in order of decreasingly developed land. Several NLCD classifications have been grouped for clarity (e.g., "developed" includes open spaces and low-, medium-, and high-intensity developed land). SFD, TMD, and BUC are sites established to help characterize background conditions for the Washington, DC, and Baltimore urban network.
Figure 5. Time series of standard tank run residuals (i.e., X DC −C) for CO 2 (a, b), CH 4 (c, d), and CO (e, f). X DC is calculated using a single calibration tank (not shown) and the ratio method (Eq. 2) on the left (a, c, e) and the difference method (Eq. 3) on the right (b, d, f). Assigned tank values are shown in the legend; one tank was not calibrated for CO so only the residuals of the high-concentration tank at 315 nmol mol −1 are shown. The residual magnitude is smaller for CO 2 and CH 4 using the ratio method, but the standard deviations (variability) are similar using both methods. For CO, both the magnitude of the residual and the standard deviation are smaller using the difference equation; the ratio equation does not properly account for the drift in the analyzer at the start of the time series (May-June). Data shown are from MSH; a measurement gap exists in July.
Figure 6. Example of a laboratory calibration of a CRDS analyzer with five standards of different assigned CO 2 mole fractions. (a) Secondary correction of drift-corrected sensitivity using either two (red) or all five (blue) standards. Green line at 1 indicates the assumed sensitivity when only a single standard is used. (b) Residual of each type of fit; error bars represent 1σ reproducibility stated by NOAA/ESRL. The simple single-tank drift correction results in the green circles as residuals; these residuals were used in the Verhulst et al. (2017) analysis to estimate the extrapolation uncertainty of the single-point correction. Red x symbols are the residuals of a fit to two standards, and blue asterisks are the residuals of the fit to all five standards.
Figure 7. Uncertainty (1σ) in fit for two-point calibrations. (a) Two-point fit to drift-corrected sensitivity (S DC ) (red line) with uncertainty (red shading) calculated using the uncertainty in the high standard (black circle with error bar). Blue shading shows uncertainty calculated using a bootstrap conducted by randomly selecting sets of two standards from the laboratory test (black circles) to calculate the slope. There is no uncertainty at 1 because the drift-corrected sensitivity is defined as equal to one at the value of the calibration standard. (b) Uncertainty in final CO 2 as a function of raw CO 2 ; red and blue shading have the same meaning as in (a).
Figure 8. Data (CO 2 and CH 4 , and CO for MSH only) availability from the various NIST-EN tower sites in the Northeast Corridor network included in this data release. Gaps represent data outages due to various failures (analyzer, communications, etc.).
Figure 9. Diurnal cycle of vertical gradients in CO 2 (a) and CH 4 (b) for urban towers in the Washington-Baltimore area, averaged over 2015-2017 in winter (blue) and summer (orange), with shading indicating 1σ standard deviation among sites. Some of the spread can be caused by sampling in different years at the different sites. Sites included are HAL, ARL, NDC, NEB, NWB, and JES. HRD was excluded due to lack of data in this period.
Figure 10. Diurnal cycle of vertical gradients in CO 2 (a) and CH 4 (b) at the three background towers for the Washington-Baltimore region in summer (orange shades) and winter (blue shades).
Figure 11. Seasonal cycles from urban and rural sites in the Washington, DC, and Baltimore region with at least 1 year of observations. Midafternoon (13:00-18:00 LST) daily averages are detrended using a linear fit to the annual trend at Mauna Loa (for CO 2 ) and the global average (for CH 4 ) (data from NOAA/ESRL) and then averaged monthly. Rural sites include TMD, SFD, and BUC; urban sites are ARL, NDC, JES, HAL, NEB, and NWB. Shading indicates 1 standard deviation of the averages from all the sites.