A high-resolution synthesis dataset for multistressor analyses along the US West Coast

Abstract. Global trends of ocean warming, deoxygenation, and acidification are not easily extrapolated to coastal environments. Local factors, including intricate hydrodynamics, high primary productivity, freshwater inputs, and pollution, can exacerbate or attenuate global trends and produce complex mosaics of physiologically stressful or favorable conditions for organisms. In the California Current System (CCS), coastal oceanographic monitoring programs document some of this complexity; however, data fragmentation and limited data availability constrain our understanding of when and where intersecting stressful temperatures, carbonate system conditions, and reduced oxygen availability manifest. Here, we undertake a large data synthesis to compile, format, and quality-control publicly available oceanographic data from the US West Coast to create an accessible database for coastal CCS climate risk mapping, available from the National Centers for Environmental Information (accession 0277984) at https://doi.org/10.25921/2vve-fh39 (Kennedy et al., 2023). With this synthesis, we combine publicly available observations and data contributed by the author team from synoptic oceanographic cruises, autonomous sensors, and shore samples with relevance to coastal ocean acidification and hypoxia (OAH) risk. This large-scale compilation includes 13.7 million observations from 66 sources and spans 1949 to 2020. Here, we discuss the quality and composition of the synthesized dataset, the spatial and temporal distribution of available data, and examples of potential analyses. This dataset will provide a valuable tool for scientists supporting policy- and management-relevant investigations including assessing regional and local climate risk, evaluating the efficacy and completeness of CCS monitoring efforts, and elucidating spatiotemporal scales of coastal oceanographic variability.

As a result of the connections between upwelling, low oxygen, and acidification events, models predict the CCS's vulnerability to extreme events will increase as climate change progresses (Gruber et al., 2012; Bakun et al., 2015). Relative to a preindustrial baseline, anthropogenic forcing has shoaled corrosive and hypoxic conditions by more than 50 m (Bograd et al., 2008; Feely et al., 2008; Chan et al., 2008; Gruber et al., 2012). Modeled projections of the CCS suggest that pH levels are declining sufficiently swiftly that by 2035 the range of annual variability may no longer overlap with conditions present in the 2010s, while the calcium carbonate mineral aragonite could be perennially undersaturated at 100 m depth by 2045 (Hauri et al., 2013; Marshall et al., 2017). Meanwhile, nearshore dissolved oxygen concentrations are expected to decline by 10-20 µmol kg⁻¹ by the end of the century (Siedlecki et al., 2021). Upwelling-favorable winds may intensify under future warming (Sydeman et al., 2014; Bakun et al., 2015; Wang et al., 2015), although this effect may be counteracted in some locations by increased stratification of seawater layers (Howard et al., 2020a; Siedlecki et al., 2021) or in areas where wind-driven upwelling is not the dominant process (García-Reyes and Largier, 2010). These competing forces might enhance the disparities between climate hot spots and refugia, underlining the importance of gathering and analyzing climate data with high spatiotemporal resolution.
Despite recognition of the complexity of CCS coastal climate stress, successfully capturing mesoscale, sub-seasonal, and very nearshore patterns of OAH and warming remains challenging. One impediment to unraveling this complexity is the decentralized and non-standardized nature of much OAH monitoring in the CCS, undertaken by governmental, non-profit, and academic centers with varying methodologies and approaches to data accessibility (Taylor-Burns et al., 2020). Further, existing synthesis datasets are not optimized for simultaneous analysis of nearshore warming, deoxygenation, and acidification risks (e.g., Hofmann et al., 2011; Sharp et al., 2022). While several excellent databases compile place-specific biogeochemical data, such as CeNCOOS and SCCOOS (Terrill et al., 2006; Ruhl et al., 2021), they are often regionally limited, provide access to only a single parameter at a time, lack key datasets, or do not require standard data formats or quality assurance/quality control (QA/QC) methods (Weisberg et al., 2020).
A deliberate synthesis of OAH-relevant datasets with standardized formatting and quality control maximizes our ability to explore, map, and resolve coastal climate stress on sub-regional scales (Bushinsky et al., 2019; Chan et al., 2019). Here, we present the Multistressor Observations of Coastal Hypoxia and Acidification (MOCHA) synthesis, the highest resolution OAH-relevant U.S. West Coast dataset to date. MOCHA is a compilation of published nearshore temperature, dissolved oxygen, and carbonate chemistry-relevant datasets for the CCS, newly archived at the National Centers for Environmental Information (NCEI, https://doi.org/10.25921/2vve-fh39) along with associated metadata and quality assurance in adherence with the FAIR principles (Wilkinson et al., 2016; Kennedy et al., 2023). We source published data from oceanographic cruises, buoys, moorings, and shore samples as well as previously unpublished observations contributed by the author team and present them in a formatted, quality-controlled, downloadable database for easy access and analysis (Fig. 1). While this dataset is not exhaustive, it both highlights real disparities in oceanographic monitoring intensity and provides future investigators the opportunity to compare and integrate their own datasets. This synthesis provides an important tool for scientists across disciplines and coastal decision-makers to investigate spatiotemporal variation in marine climate risk from OAH events and warming, evaluate the efficacy and completeness of CCS monitoring efforts, link oceanographic conditions to coastal social or socio-economic considerations across large geographic ranges, evaluate spatial management zones such as aquaculture sites and Marine Protected Areas, and pursue other questions of interest to coastal communities.
https://doi.org/10.5194/essd-2023-205 Preprint. Discussion started: 31 May 2023. © Author(s) 2023. CC BY 4.0 License.

Figure 1: All individual locations for temperature (a), dissolved oxygen (b), and carbonate-system (c) observations included in this synthesis along the U.S. West Coast. These figures overstate the useful spatial density of the data, as many individual locations have only been sampled once, but highlight the limited scale of available carbonate-system observations relative to more commonly assessed parameters like temperature and dissolved oxygen.

Data Sources and Types
This project compiled published and publicly available data, as well as data contributed by the author team, including multiparameter OAH-relevant observations from shipboard discrete water samples, in-situ autonomous sensors, and shore-collected datasets from along the U.S. West Coast. We primarily sourced multiparameter data through existing public data portals, such as NCEI, and literature searches, prioritizing datasets that included carbonate-system or dissolved oxygen observations in addition to temperature. When available alongside our target parameters, we also incorporated published chlorophyll and nutrient concentrations. In all cases, we took the published or publicly hosted data as our starting point, rather than asking for the unprocessed data from the original investigators, then applied additional quality-control measures described in Sect. 2.4. We have limited this publication to data collected before 2020, but we will continue to incorporate new observations according to the methods outlined below, where possible, and will periodically make updated versions of this synthesis dataset publicly available at NCEI (https://doi.org/10.25921/2vve-fh39; Kennedy et al., 2023).
The data in this synthesis come from a wide array of observational methods and instruments. We screened carbonate-system datasets before incorporating them, following the discussions of method reliability summarized in Martz et al. (2015).
The carbonate-system observational methods included in this synthesis dataset are: discrete seawater samples of pH, total alkalinity (TA), and dissolved inorganic carbon (DIC) (all preserved at the time of collection and analyzed in a lab with established techniques; e.g., Dickson et al., 2007); pH observations from ion-sensitive field-effect transistor-based autonomous sensors (e.g., Honeywell Durafet; Martz et al., 2010) or spectrophotometric sensors (e.g., SAMI-pH; Lai et al., 2018); and pCO2 observations from autonomous equilibrium-based spectrophotometric sensors (e.g., SAMI-CO2; Schar et al., 2009). We did not include pH measured on glass electrode sensors, due to known issues with precision (Martz et al., 2010). We discarded any dissolved oxygen and carbonate-system datasets that lacked accompanying temperature data, as accurate observations of both parameters require simultaneous temperature readings (Dickson et al., 2007). Data collection methods are available for all parameters except temperature and salinity and have been simplified into four groups: "discrete" for bottle-collected samples analyzed in a laboratory, "CTD" for observations from ship-side profiling devices, "autonomous sensors" for stationary instruments collecting data at pre-programmed intervals, and "handheld sensors" for observations collected in the field via a glass-electrode probe. The specific instruments associated with each data source are available in the Metadata Table archived at NCEI, Accession 0277984 (Kennedy et al., 2023).

Formatting
After identifying a dataset of interest, we downloaded all available processed data and metadata, including descriptive papers, primary investigator information, project and instrument descriptions, and the original source of the data. Each dataset was assigned a unique identifying number to ensure that every data point could be quickly associated with its parent data source and metadata (Table 1). For all datasets, we retained a copy of the original published data. We manipulated all original data into a comma-separated file with minimal alterations (typically limited to eliminating extra header rows and streamlining column names) before transferring datasets into R or Python for further formatting to ensure that all manipulations were trackable.

This synthesis dataset is structured such that each row represents a set of oceanographic observations from a shared time, depth, location, and data source. For easy filtering, we included a "collection method" column that classified each dataset as one of four types: "cruise" for ship-collected samples, "mooring" for autonomous instruments attached to buoys, "intertidal/subtidal autonomous sensor" for shore- or diver-accessed autonomous sensors, and "intertidal/subtidal hand collected" for water samples collected by hand from a dock or the shore. We also assigned each observation a habitat type, labeling observations as "estuarine" if they were collected within semi-restricted lagoons and bays (e.g., Humboldt Bay), or "oceanic" otherwise. We recorded measured variables, data types, and data quality in adjacent columns.
For a full description of included parameters, refer to the detailed metadata table archived at NCEI (Kennedy et al., 2023).
We retained all directly measured chemical oceanographic observations as we incorporated each dataset, converted observations to standard units if necessary, and mapped them directly to our corresponding synthesis dataset columns. We did not retain published data calculated from algorithms, such as TA extrapolated from salinity measurements, nor any calculated carbonate system variables, regardless of whether the source publication included such data. While we note that published data may have been summarized or filtered by the initial investigators, we did not further summarize or filter data before including it in this compilation except for the Ocean Observatories Initiative (OOI) datasets, discussed below.

Ocean Observatories Initiative (OOI) Datasets
The Washington and Oregon OOI data included millions of observations of temperature, salinity, dissolved oxygen, pH, and pCO2 at sub-minute resolutions. The size of these datasets required us to aggregate the data to daily mean values before incorporation into the larger synthesis dataset. We filtered the raw OOI data with input from the OOI staff to remove outlying and unreliable data, grouped the remaining data by day, aggregated to daily mean values, then quality-controlled the aggregated data a second time according to the methods described in Sect. 2.4. Because much of the publicly available OOI data had not been previously quality controlled, we contacted OOI staff for their guidance on initially filtering the raw data before aggregation. They provided extensive code developed by the sensor manufacturers and OOI staff to identify erroneous pH and DO data from the raw publicly available streams, available at https://github.com/oceanobservatories/ooi-data-explorations/tree/master/python. OOI staff also provided access to discrete sample analyses taken at the sensor moorings to further ground-truth sensor readings. We only retained data for aggregation if 1) the data passed through the manufacturer's code, 2) discrete samples were associated with the beginning and end of that sensor's deployment, 3) the daily mean sensor values for dissolved oxygen and pH on the day of discrete sampling were within 20 µmol/kg of the discrete sample dissolved oxygen and/or within 0.05 pH units of the discrete sample pH, and 4) the data displayed reasonable DO and pH values and reasonable variance in those values over time. We eliminated all DO data prior to 2018 based on advice of OOI staff because the DO sensors deployed before then did not have adequate biofouling control. We then aggregated these data into daily mean values before formatting and quality controlling them as normal.
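The ground-truthing criterion (3) above can be sketched in a few lines of Python. This is a minimal illustration and not the OOI or author code: the function and argument names are hypothetical, and the handling of a missing parameter (only checking a parameter when both the sensor and discrete values exist) is an assumption made because the text says "and/or".

```python
from typing import Optional

DO_TOLERANCE = 20.0  # µmol/kg, tolerance stated in the text
PH_TOLERANCE = 0.05  # pH units, tolerance stated in the text


def passes_ground_truth(
    sensor_do: Optional[float],
    discrete_do: Optional[float],
    sensor_ph: Optional[float],
    discrete_ph: Optional[float],
) -> bool:
    """Compare daily mean sensor values with same-day discrete samples.

    Assumption (not from the source): a parameter is checked only when both
    its sensor and discrete values are present, and at least one parameter
    must be checkable for the deployment to pass.
    """
    checks = []
    if sensor_do is not None and discrete_do is not None:
        checks.append(abs(sensor_do - discrete_do) <= DO_TOLERANCE)
    if sensor_ph is not None and discrete_ph is not None:
        checks.append(abs(sensor_ph - discrete_ph) <= PH_TOLERANCE)
    return bool(checks) and all(checks)
```

A deployment with no usable discrete comparison returns False under this assumption, which errs on the side of exclusion.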

Quality Control
After formatting individual datasets, we checked all observations to standardize quality across data sets and avoid using questionable data points in future analyses. Our QA/QC methods drew from a combination of the publishing authors' notes, plots of the data, and expert knowledge of the CCS. Incoming quality-control notes associated with each data source ranged widely, though most datasets that did include quality information followed the Quality Assurance/Quality Control of Real-Time Oceanographic Data (QARTOD) system, which assigns flags based on internal instrument checks, data reasonableness, and collection method (Bushnell, 2018). Using available existing QA/QC information and our further quality control investigations, we categorized each data point as one of three confidence levels: 1 for "plausible and reliable" data, 2 for data that we had not assessed yet, and 3 for "low quality or unreliable" data. We flagged all data the publishing authors had listed as unreliable with a 3. Regardless of published notes, we assigned all other observations a flag of 2 before additional evaluation by our project team.
Given the diversity of the datasets and projects this synthesis draws from, we examined each dataset individually using a combination of plots tailored to maximize our ability to identify and evaluate anomalies in that dataset's specific oceanographic and spatiotemporal context. Given that this synthesis sources mostly published data, we erred towards retaining data as "plausible", rather than following a more stringent flagging philosophy. We recommend that investigators perform additional QC with the MOCHA dataset targeted towards their project requirements. Common quality control plotting techniques included property-property plots of temperature, salinity, dissolved oxygen, pH, total alkalinity, and dissolved inorganic carbon against one another; single-parameter time series from sensor and long-running datasets; and map views and oceanographic cross sections of synoptic cruise data. We examined questionable data through as many different views as possible, such as examining apparent outliers in a temperature-salinity property-property plot individually in their respective time series, to ensure that we were not trimming real or plausible observations. When possible, we further evaluated suspicious observations against other datasets collected nearby. We discussed all data flagging decisions with at least three project members. After this focused quality control, all observations not flagged as "low quality or unreliable" (3) were upgraded to our "plausible and reliable" flag (1). All subsequent mapping and analysis with the observed oceanographic values used only "plausible and reliable" data.
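The three-level flag life cycle described in this section can be summarized as a small state transition. The sketch below is illustrative only: the function names and the boolean inputs are hypothetical stand-ins for the authors' notes and the project team's plot-based review, which cannot be reduced to code.

```python
PLAUSIBLE, UNASSESSED, UNRELIABLE = 1, 2, 3  # flag values defined in the text


def initial_flag(author_marked_unreliable: bool) -> int:
    """Flag on ingestion: 3 if the publishing authors marked the point
    unreliable, otherwise 2 (not yet assessed), regardless of other notes."""
    return UNRELIABLE if author_marked_unreliable else UNASSESSED


def final_flag(flag: int, failed_team_review: bool) -> int:
    """Flag after focused QC: anything not flagged 3 is upgraded to 1.

    `failed_team_review` is a hypothetical input standing in for the
    project team's decision after examining the data.
    """
    if flag == UNRELIABLE or failed_team_review:
        return UNRELIABLE
    return PLAUSIBLE
```

Under this scheme no observation in the finished product should carry a flag of 2, so downstream filters only need to distinguish 1 from 3.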

Example Subset: Daily Data
High-resolution autonomous sensors (those recording sub-daily observations) are an important component of this synthesis dataset, but the data they produce come with significant computational costs. Furthermore, variability on the scales of hours or minutes captured by such high-resolution records is less comparable to lower-resolution datasets such as those collected over quarterly or annual synoptic oceanographic cruises. To evaluate the spatiotemporal extent of our data coverage, seasonal patterns, and relationships between observed parameters, we aggregated the dataset to daily mean values for each location, depth, and data source. We dropped all questionable data (i.e., data flagged with a "3" QA/QC code) before creating this summary dataset to ensure that unreliable data did not influence averages. This reduced the total number of observations from 13.7 million to 1.2 million. We used this summary dataset in all following analyses that do not explicitly cite "original data." We have included the code necessary to reproduce this summary dataset from the published data compilation in our public code repository (https://github.com/egkennedy/DSP_public_code).
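The daily aggregation described above amounts to a filtered group-by. The following standard-library Python sketch shows the idea for a single parameter; it is not the code in the authors' repository, and the dictionary keys ("source", "lat", "lon", "depth", "time", "value", "flag") are hypothetical column names chosen for illustration.

```python
from collections import defaultdict
from statistics import mean

UNRELIABLE = 3  # QA/QC flag for "low quality or unreliable" data


def daily_means(rows):
    """Average one parameter by data source, location, depth, and calendar day.

    `rows` is an iterable of dicts with hypothetical keys: source, lat, lon,
    depth, time (a datetime), value, and flag. Observations flagged 3 are
    dropped before averaging so they cannot influence the daily mean.
    """
    groups = defaultdict(list)
    for row in rows:
        if row["flag"] == UNRELIABLE:
            continue
        key = (row["source"], row["lat"], row["lon"], row["depth"],
               row["time"].date())
        groups[key].append(row["value"])
    return {key: mean(values) for key, values in groups.items()}
```

Grouping on the data source as well as position keeps co-located instruments from different programs separate, matching the grouping stated in the text.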

Additional Carbonate System Calculations
To maximize the OAH information available in our daily summarized dataset, we calculated the full carbonate system parameters for all discrete samples that included at least two high-quality observations of primary variables of the carbonate system (pH, TA, DIC, or pCO2) in addition to high-quality, co-occurring temperature and salinity measurements. These calculated parameters can be reproduced using the code in our public code repository (https://github.com/egkennedy/DSP_public_code). We used the R package "seacarb" (Gattuso et al., 2018) for all carbonate system calculations and used constants appropriate for the temperature and salinity as recommended by Dickson et al. (2007). In cases where more than two carbonate system parameters were available, we prioritized TA-DIC pairs following Dickson (2010), then TA-pH pairs, then DIC-pH pairs. When applicable for mapping and time series analyses, measured and calculated carbonate system observations were concatenated, with measured data prioritized in all overdetermined systems.
All references to an analysis of "original" data and all discussions of the distribution of observations only include directly measured variables; however, the oceanographic relationships discussed in Sect. 3.5 and shown in Figs. 3, 5, and 6 include these additional calculated observations.
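The pair prioritization for overdetermined samples can be expressed as an ordered lookup. The Python sketch below only selects the input pair; it does not perform the carbonate system calculation, which the authors ran in the R package seacarb. The function name is hypothetical, and because the text does not rank pairs involving pCO2, this sketch returns None for them as a stated assumption.

```python
# Priority order stated in the text: TA-DIC, then TA-pH, then DIC-pH.
PAIR_PRIORITY = [("TA", "DIC"), ("TA", "pH"), ("DIC", "pH")]


def choose_pair(available):
    """Return the highest-priority input pair present in `available`.

    `available` is any collection of parameter names with high-quality
    observations, e.g. {"TA", "DIC", "pH"}. Returns None when no ranked
    pair is present (assumption: pCO2 pairs are not ranked in the text,
    so they are left unresolved here).
    """
    have = set(available)
    for pair in PAIR_PRIORITY:
        if have.issuperset(pair):
            return pair
    return None
```

Keeping the priority list as data rather than nested conditionals makes it straightforward to extend if a ranking for pCO2 pairs is adopted.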

Overall Data Totals
This synthesis dataset includes observations from 67 individual data sources organized across 13.7 million rows and 41 columns. This includes 24.1 million unique measurements, with 13.2 million temperature, 3.6 million salinity, 3.3 million DO, 2.1 million pH, 1.2 million chlorophyll, 561,000 nutrient, 113,000 pCO2, 10,400 TA, and 8,500 DIC measurements.
While we prioritized multiparameter datasets for this effort, our synthesis also includes several temperature-only, high-resolution records to fill specific project needs. Summarizing the data by day for each dataset, location, and depth provides a clearer picture of the availability of multiparameter data by muting the outsized influence of high-resolution sensors. Of the 1.2 million daily averaged observations, just 104,000 are temperature-only.
Data totals across dissolved oxygen and carbonate-system observations varied substantially by observational method. Autonomous sensors are the most common observational method in the original dataset with 5 million individual measurements, versus 226,000 individual discrete measurements, 193,000 CTD measurements, and 828 handheld field measurements. Across data aggregated by day, autonomous sensors are still the most common, with 643,000 individual daily averaged parameter measurements, versus 223,000 discrete, 192,000 CTD, and 816 handheld sensor observations. For evaluating the spatiotemporal coverage of carbonate-system observations, we calculated an additional 4,599 daily pH observations from paired discrete samples of two other primary carbonate system parameters, equal to 3.1% of the total directly measured daily pH observations (Table 2). The calculated pH observations were included in our analysis of the spatiotemporal extent of available OAH data discussed in Sect. 3.3 and the oceanographic relationships discussed in Sect. 3.5.

Table 2: Overview of parameter observation methods, total number of daily observations (grouped by data source, location, and depth), and the reliability rates. Autonomous sensors are associated with slightly lower reliability rates due to periods of sensor biofouling or malfunction.

Flagging and Reliability
The amount of original data flagged as unreliable varied substantially by dataset, parameter, and observation method, but was typically low (Fig. 2). As the bulk of the data in this synthesis product was previously published and had undergone some preliminary QA/QC prior to our incorporation, high reliability rates were expected. Of the dozens of datasets contributing temperature and salinity observations, only one dataset for each parameter had a flag rate above 5%. Flag rates above 10% were uncommon for all parameters across all datasets, and completely absent for TA and DIC observations. For pH and DO, flag rates within datasets were above 10% for 3 and 8 datasets, respectively. These high rates of "unreliable" data were caused by either 1) clear periods of autonomous sensor malfunction, 2) observational methods described by the publishing authors as unreliable, or 3) more rarely, slightly higher QA/QC standards applied to data that had not been previously screened and published. The vulnerability of autonomous sensors to periods of biofouling or sensor malfunction contributed to higher flag rates relative to other methods, but all four methods were largely reliable (Table 2).

Figure 2: The rate of unreliable ("flagged") observations varied by parameter and dataset, but was generally low, especially for temperature (Temp) and salinity (Sal) observations. All datasets that included dissolved oxygen (DO) observations with >30% flag rates used measurement methods described by the original publishers as "not quantitative". Flag rates between 10% and 30% were uncommon, but reflected occasional periods of fouling or equipment malfunction in high resolution autonomous sensor datasets or, in rare cases, more stringent standards applied to datasets that had not been previously published and initially quality controlled.

Spatiotemporal Data Distribution
[...] capacity between 2015 and 2020.
Temperature and dissolved oxygen measurements have the most extensive coverage, but are sparse outside of Southern California before 2000. Carbonate system records, here shown by both measured and calculated pH observations, are rare in all years north of 39° N. Overall, this data compilation demonstrates large spatial and temporal data gaps, which limit our ability to resolve rapid changes in ocean acidification, hypoxia, or warming risk or to contextualize current carbonate-system and dissolved oxygen conditions with respect to the recent past.
The intra-annual distribution of the daily data is more complex than the interannual distribution (Fig. 4). Temperature, salinity, and dissolved oxygen records are common throughout the year, but have distinct peaks in abundance in April, May, and July through November. Carbonate system records are more patchy temporally. Nearly 50% of all TA and DIC observations were taken in May or August, with an additional 19% of observations from September, reflecting the sampling months of the NOAA West Coast Ocean Acidification Cruises (Feely et al., 2016). Between October and April, no single month includes more than 8% of DIC observations or 5% of TA observations. pH observations are more evenly distributed throughout the year, with all months hosting 6-10.5% of the observations except August, which hosts 16%. The concentration of carbonate-system observations between May and September is particularly concerning, as upwelling season in Central and Southern California starts in earnest in April (García-Reyes and Largier, 2012; Jacox et al., 2018) and at least two carbonate-system parameters must be measured to fully constrain the carbonate system (Dickson et al., 2007), so the observational record may be missing significant low pH, low DO events from the early upwelling season.

Data Relationships
This synthesis dataset effectively captures seasonal and regional variability across OAH-relevant parameters (Fig. 5).
Median surface, nearshore (<25 m depth and <50 km from shore) temperatures rise in all regions during the spring and summer months, peaking between July and September. In Washington and Oregon, peak upwelling occurs between June and August (Bograd et al., 2009; Jacox et al., 2016), which coincides with the period of highest variability and lowest minima for pH and DO observations captured in this synthesis. In both California regions, seasonal surface data is less consistent with the expected upwelling patterns. There, peak upwelling occurs between April and June and is weakest south of Point Conception (Bograd et al., 2009; García-Reyes and Largier, 2012; Jacox et al., 2016). Somewhat unexpectedly, the highest variability and lowest minimum DO and pH observations occur between July and September in both California regions rather than during the months of expected peak upwelling. This trend may reflect intermittent upwelling into the warmer summer months or high surface respiration as waters warm, and it invites further investigation. October through March conditions across all West Coast regions are more poorly sampled, but have less variability, cooler mean temperatures, and higher dissolved oxygen concentrations and pH.
The relationships between daily measured OAH parameters illustrate the complexity of nearshore oceanographic processes. As expected in an upwelling ecosystem, low surface pH and DO conditions are most frequently associated with low temperatures, but warmer OAH events still occur (Fig. 6). pH conditions below 7.8 can be stressful for many marine organisms (e.g., Byrne and Przeslawski, 2013; Gobler and Baumann, 2016; Bednaršek et al., 2021; Kroeker et al., in press) and have been observed 9,928 times within 50 km of shore and 50 m of the surface (Fig. 6). Of these instances, 99 events are accompanied by DO concentrations below the "coastal hypoxia" threshold of 61 µmol kg⁻¹ and 548 events have DO concentrations below the "mild hypoxia" threshold of 107 µmol kg⁻¹ (Hofmann et al., 2011). An additional 1,765 nearshore, near-surface observations of DO concentrations below 61 µmol kg⁻¹ have been recorded without accompanying pH information. No simultaneous surface observations of DO and pH record coastal hypoxic conditions with pH levels above 7.8. The low pH, low oxygen observations are most common off the Oregon coast during low temperature upwelling events, but simultaneous low oxygen, low pH conditions are also found occasionally throughout the coast and at a range of temperatures, especially during late summer in semi-restricted estuaries. The few simultaneous observations of DO concentration and pH suggest that only 1.0% of observations of low pH (pH <7.8) are accompanied by hypoxic water, while shallow hypoxic waters are accompanied by low pH conditions 99% of the time. These relationships underscore the importance of multiparameter OAH observations, the clear need for pH monitoring efforts to catch up with dissolved oxygen monitoring efforts, and the potential for even shallow waters to experience extreme conditions.
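The two conditional rates quoted above (the share of low-pH observations that are hypoxic, and the share of hypoxic observations that have low pH) can be computed from paired observations as sketched below. This is an illustrative Python sketch, not the authors' analysis code; the function name is hypothetical, while the thresholds are those given in the text.

```python
LOW_PH = 7.8            # stress threshold used in the text
COASTAL_HYPOXIA = 61.0  # µmol/kg (Hofmann et al., 2011)
MILD_HYPOXIA = 107.0    # µmol/kg (Hofmann et al., 2011)


def cooccurrence(pairs, do_threshold=COASTAL_HYPOXIA):
    """Return (fraction of low-pH observations that are hypoxic,
               fraction of hypoxic observations that have low pH).

    `pairs` holds simultaneous (pH, DO) observations. A fraction is None
    when its denominator is zero.
    """
    pairs = list(pairs)
    low_ph = sum(1 for ph, do in pairs if ph < LOW_PH)
    hypoxic = sum(1 for ph, do in pairs if do < do_threshold)
    both = sum(1 for ph, do in pairs if ph < LOW_PH and do < do_threshold)
    frac_low_ph_hypoxic = both / low_ph if low_ph else None
    frac_hypoxic_low_ph = both / hypoxic if hypoxic else None
    return frac_low_ph_hypoxic, frac_hypoxic_low_ph
```

As a consistency check on the reported numbers, if all 9,928 low-pH observations are taken as the denominator, the 99 coastal-hypoxia co-occurrences give 99 / 9,928 ≈ 1.0%, in line with the rate quoted in the text. Passing `do_threshold=MILD_HYPOXIA` applies the same calculation to the mild hypoxia threshold.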

The nearshore, near-surface data in the MOCHA synthesis also highlight the difficulty of developing accurate nearshore algorithms that can predict carbonate-system parameters from other more commonly measured hydrographic variables in coastal ecosystems, even in the absence of large freshwater inputs. The relationship between salinity and TA is regionally dependent and less reliable in nearshore environments and near San Francisco Bay, as has been noted by investigators developing carbonate-system algorithms (Fig. 7; e.g., Alin et al., 2012; Davis et al., 2018). Excluding the San Francisco Bay area, surface TA-salinity relationships are linear and not highly dependent on distance from shore between 5 km and 100 km, and they have been successfully leveraged by previous investigators to extrapolate carbonate-system conditions (Alin et al., 2012; Davis et al., 2018; Middelburg et al., 2020). The weakness of coastal TA-salinity relationships underscores the importance of monitoring multiple parameters of the carbonate system.
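The strength of a TA-salinity relationship for a given region or distance band can be assessed with an ordinary least-squares line and its coefficient of determination. The sketch below is a generic Python implementation under that assumption; it is not the authors' fitting procedure, the function name is hypothetical, and no coefficients from the paper are reproduced here.

```python
def fit_line(x, y):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared). For this use case x would be
    salinity and y total alkalinity for one region or distance band.
    """
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    if sxx == 0:
        raise ValueError("x has no variance; slope is undefined")
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With no variance in y, the fitted line reproduces y exactly.
    r_squared = 1.0 if syy == 0 else (sxy * sxy) / (sxx * syy)
    return slope, intercept, r_squared
```

Comparing r_squared between offshore and very nearshore subsets is one simple way to quantify the weaker nearshore relationship described in the text.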

[Table: TA-salinity relationships by region; columns: Region, Offshore relationship, Nearshore relationship. Table body not recovered.]

Figure 7: [...] (b), and near the mouth of San Francisco Bay (c). Excluding the San Francisco area, TA-salinity relationships between 5 km and 100 km offshore are strong and linear, with small differences between geographic regions. Within 5 km of shore throughout the Coast and within 100 km of San Francisco Bay (right), the TA-salinity relationships are much less reliable. This limits the utility of carbonate-system algorithms and emphasizes the need to fully characterize the carbonate-system through simultaneous measurements of two master parameters to effectively assess nearshore acidification conditions.

Dataset Limitations
This data compilation reflects high-quality, publicly available data, and directly contributes to our ability to map coastal temperature, dissolved oxygen, and carbonate-system variation; however, this synthesis also encodes the limitations of our observational record and the differences in data availability, data scales, and data quality. High resolution autonomous sensors provide excellent temporal resolution for a specific location, but are vulnerable to sensor drift, are not often published with clear calibration records, and are rarely deployed in arrays that fully capture the carbonate system as well as temperature and dissolved oxygen variability. Conversely, discrete samples and CTD profiles from synoptic cruises provide extremely high-precision, multiparameter observations with broad spatial resolution, but are less relatable to high-resolution sensors or hand-collected observations from the surf zone. Carbonate-system observation availability has strong seasonal and spatial bias, with data concentrated in summer months and along coastal population centers. The MOCHA synthesis pulls these distinct data sources into a single location, but we do not claim to have fully solved the inherent difficulties of combining data of differing quantity, resolution, and quality into a unified picture of the nearshore CCS.
Additional data streams that provide both spatial and temporal resolution could help bridge some of the divides between quality, quantity, and spatial extent in this synthesis, and we acknowledge a few such potential data streams here. The temperature and dissolved oxygen records do not include CTD casts from most annual fishery-independent surveys, which could improve spatial resolution at all depths (e.g., Sakuma, 2022). This compilation also excludes some potentially valuable carbonate system data streams, particularly those focused on pCO2 measurements. For example, potential additional data sources include underway pCO2 records from transiting oceanographic ships or sail drones, pH or pCO2 records from autonomous gliders (e.g., Chavez et al., 2017), and pCO2 and DIC records from shore-based monitoring systems (e.g., Burke-o-Lators; Hales et al., 2004; Bandstra et al., 2006). The first would significantly improve the spatial coverage of surface pCO2 and could reduce seasonal bias, but would not have a significant impact on our ability to resolve the full carbonate system or to consider deeper water. Glider datasets would similarly improve our spatial coverage while providing additional information about water column structure. These could represent a valuable expansion to this synthesis, provided calibration records are also available, and will likely be included in updates to this synthesis product (Bushinsky et al., 2019). Shore-based monitoring systems recently deployed by the West Coast OOIs will also be valuable expansions to this synthesis and will also likely be included in an updated product.

Conclusions
The CCS is one of the most intensively monitored marine ecosystems in the world, but our ability to accurately resolve the true complexity of coastal climate stress remains limited by data fragmentation, availability, and quality. As interest has shifted from documentation of the global patterns of acidification and hypoxia to more complex coastal environments, the CCS has seen an explosion in nearshore (<50 km) and very nearshore (<5 km) monitoring efforts in the last 15 years. This explosion has included an increase in both surface and subsurface monitoring efforts, though monitoring efforts below 5 m depth are still much less common than surface observations in very nearshore environments. While this situation is improving, the continued relative paucity of subsurface nearshore measurements is of particular concern given that mildly hypoxic (DO <107 µmol kg⁻¹) and corrosive conditions have been documented at depths as shallow as 10 m (Kekuewa et al., 2022).
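The distance, depth, and oxygen thresholds quoted above can be applied to individual observations in a few lines. The sketch below is illustrative only: the function names, the tuple layout `(distance_km, depth_m, do_umol_kg)`, and the example records are assumptions rather than the dataset's actual schema or values, and the zones are treated as mutually exclusive for convenience even though the text's nearshore band (<50 km) includes the very nearshore band (<5 km).

```python
# Thresholds quoted in the text.
VERY_NEARSHORE_KM = 5.0
NEARSHORE_KM = 50.0
SURFACE_DEPTH_M = 5.0
MILD_HYPOXIA_UMOL_KG = 107.0


def shore_class(distance_km):
    """Assign a mutually exclusive distance-from-shore class."""
    if distance_km < 0:
        raise ValueError("distance from shore cannot be negative")
    if distance_km < VERY_NEARSHORE_KM:
        return "very nearshore"
    if distance_km < NEARSHORE_KM:
        return "nearshore"
    return "offshore"


def is_mildly_hypoxic(do_umol_kg):
    """True when dissolved oxygen falls below the mild hypoxia threshold."""
    return do_umol_kg < MILD_HYPOXIA_UMOL_KG


def subsurface_fraction(records, zone="very nearshore"):
    """Fraction of observations in `zone` collected below the surface layer.

    `records` is an iterable of (distance_km, depth_m, do_umol_kg) tuples.
    Returns None when the zone contains no observations.
    """
    in_zone = [r for r in records if shore_class(r[0]) == zone]
    if not in_zone:
        return None
    deep = sum(1 for r in in_zone if r[1] > SURFACE_DEPTH_M)
    return deep / len(in_zone)


# Invented example records (NOT taken from the dataset).
records = [
    (1.2, 0.5, 250.0),
    (3.0, 10.0, 95.0),
    (4.5, 1.0, 230.0),
    (20.0, 50.0, 120.0),
]
fraction = subsurface_fraction(records)
hypoxic = [r for r in records if is_mildly_hypoxic(r[2])]
print(fraction, len(hypoxic))
```

Tallying the subsurface fraction per zone and year is one simple way to track whether the imbalance between surface and subsurface monitoring is narrowing.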
Surprisingly, the U.S. West Coast had especially continuous spatial and temporal coverage of OAH-relevant parameters between 2012 and the beginning of 2015, before a reduction in coverage that lasted through 2020 (Fig. 5). Coincidentally, the reduction in dissolved oxygen and carbonate-system monitoring in 2015 overlapped with the second half of the marine heatwave known as "the Blob", which stretched from 2014 through 2016 and was associated with higher surface DO and pH (Bond et al., 2015; Siedlecki et al., 2016; Gentemann et al., 2017). Assessing the interactions of an unprecedented marine heatwave with DO and carbonate-system conditions lies at the heart of multistressor risk management; however, our ability to resolve both the Blob's impacts and the subsequent recovery was very limited in Northern California and Oregon by the concurrent contraction in oceanographic monitoring. Although the CCS is well monitored compared to many other parts of the world's oceans, our synthesis here highlights that a patchwork of monitoring projects, often driven by inconsistent funding, has an outsized impact on our ability to use those data to understand how the CCS is changing.
While increasing interest in coastal OAH monitoring and the availability of autonomous sensors have markedly enhanced CCS data availability, the frequency and footprints of synoptic oceanographic cruises have decreased in the region. Oceanographic cruises provide highly accurate and spatially broad water column measurements that can bridge the gap between the coastal and open-ocean domains and provide regional contexts for local observations. They also provide some of our only observations near remote portions of the coast. However, nearly all routine oceanographic cruises in the CCS have cut back their footprint, sampling frequency, and depth resolution. The Southern California-based CalCOFI cruises extended throughout the CCS during the 1960s, contracted to Southern and Central California by the 1980s, and now only cover the Southern California Bight while also sampling at significantly fewer depths (Bograd et al., 2003). The loss of CalCOFI cruises in Central California has been offset in part by triannual Applied California Current Ecosystem Studies cruises near San Francisco Bay, though these cruises are limited to the continental shelf between 37.3° N and 38.4° N. The NOAA West Coast Ocean Acidification Cruises took place along the entire CCS five times between 2007 and 2016, but a 2017 cruise only included Washington (Feely et al., 2016; Alin et al., 2019). The shift towards high-resolution, nearshore monitoring is a significant improvement over a wholesale reduction in oceanographic monitoring, but the concurrent erosion of consistent oceanographic cruises means the ability to resolve large-scale regional patterns is being traded for highly specific understanding of a few select locations.

This synthesis dataset provides one of the largest compilations to date of West Coast nearshore acidification and deoxygenation related data. This dataset highlights monitoring gaps, but equally provides opportunities for insight into coastal conditions. With the updated spatiotemporal resolution this effort affords, this dataset offers a wealth of opportunities to investigate questions about coastal oceanography and evaluate localized patterns of marine climate stress. We expect the MOCHA synthesis to also be of use for new projects combining temperature and dissolved oxygen records into species metabolic indices (e.g., Howard et al., 2020b), for investigating the frequency and interaction of individual and overlapping ocean acidification and hypoxic events (e.g., Burger et al., 2022), for developing updated carbonate system algorithms more suited to nearshore environments (e.g., Alin et al., 2012; Davis et al., 2018), and for evaluating the efficacy of spatial management zones such as Marine Protected Areas (Hamilton et al., in press). By archiving this dataset at the National Centers for Environmental Information (https://doi.org/10.25921/2vve-fh39; Kennedy et al., 2023) in an easily manipulated, consistent format that includes relevant metadata and quality assurance, we provide an important tool for scientists across ecological, oceanographic, and social disciplines and coastal decision-makers to address the environmental, economic, and cultural needs of coastal communities.

https://doi.org/10.5194/essd-2023-205 Preprint. Discussion started: 31 May 2023 © Author(s) 2023. CC BY 4.0 License.

Data Availability
The full Multistressor Observations of Coastal Hypoxia and Acidification dataset and detailed metadata tables are publicly available for download at NCEI as Accession 0277984 with the DOI 10.25921/2vve-fh39 (Kennedy et al., 2023). This dataset is discoverable via the NOAA Ocean Acidification Portal, NCEI Geoportal (https://www.ncei.noaa.gov/metadata/geoportal/#searchPanel), and other online discovery tools.

Code Availability
Code for performing carbonate-system calculations with the formatted dataset, creating a summarized dataset aggregated by day, and making all included figures is available on GitHub at https://github.com/egkennedy/DSP_public_code.

conceptualization. HMP, MW, AMR, GVG, CNR, GC, MD, MIW, EH, and SW provided data curation and sourced new datasets for inclusion.
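For readers who want a sense of what the daily aggregation mentioned under Code Availability involves, a minimal sketch follows. It is not the code from the authors' repository: the field names `station`, `time_utc`, and `do_umol_kg`, the function `daily_means`, and the example rows are all hypothetical, and the sketch simply averages one variable per station and calendar day while skipping missing values.

```python
from collections import defaultdict
from datetime import datetime
from statistics import fmean


def daily_means(rows, value_key="do_umol_kg"):
    """Average `value_key` per (station, calendar day).

    `rows` is an iterable of dicts with hypothetical keys 'station',
    'time_utc' (ISO 8601 string), and `value_key`. Missing values (None)
    are skipped. Returns a dict keyed by (station, date).
    """
    groups = defaultdict(list)
    for row in rows:
        value = row.get(value_key)
        if value is None:
            continue
        day = datetime.fromisoformat(row["time_utc"]).date()
        groups[(row["station"], day)].append(float(value))
    return {key: fmean(values) for key, values in sorted(groups.items())}


# Invented example rows (NOT taken from the dataset).
rows = [
    {"station": "A", "time_utc": "2019-07-01T00:00:00", "do_umol_kg": 200.0},
    {"station": "A", "time_utc": "2019-07-01T12:00:00", "do_umol_kg": 220.0},
    {"station": "A", "time_utc": "2019-07-02T00:00:00", "do_umol_kg": None},
    {"station": "B", "time_utc": "2019-07-01T06:00:00", "do_umol_kg": 180.0},
]
summary = daily_means(rows)
for (station, day), mean_value in summary.items():
    print(station, day.isoformat(), mean_value)
```

Aggregating to daily values puts high-frequency sensor records and sparse discrete samples on a common time step, at the cost of hiding sub-daily variability.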