BAWLD-CH4: A Comprehensive Dataset of Methane Fluxes from Boreal and Arctic Ecosystems

5 Department of Geological Sciences, Stockholm University, Stockholm, Sweden
6 Bolin Centre for Climate Research, Stockholm, Sweden
7 Marine Science Institute, University of California at Santa Barbara, Santa Barbara, USA
8 Institute of Arctic and Alpine Research (INSTAAR), University of Colorado Boulder, Boulder, CO, USA
9 Water and Environmental Research Center, University of Alaska Fairbanks, PO Box 755860, Fairbanks, Alaska 99775-5860, USA
Institute of Arctic Biology, University of Alaska Fairbanks, Fairbanks, Alaska 99775, USA


Introduction
Methane (CH4) is a strong climate-forcing trace gas that is naturally produced in and emitted from wetlands and lakes, which are abundant in northern regions (Matthews and Fung 1987; Lehner and Döll 2004; Messager et al. 2016). Current estimates of CH4 fluxes from the northern Boreal and Arctic region (>~50° N) range between 9 and 53 Tg CH4 yr⁻¹ from wetlands (Spahni et al. 2011; McGuire et al. 2012; Zhu et al. 2013; Bruhwiler et al. 2014; Treat et al. 2018; Watts et al. 2014; Thompson et al. 2017; Peltola et al. 2019; Saunois et al. 2020) and between 12 and 24 Tg CH4 yr⁻¹ from lakes (Bastviken et al. 2011; Wik et al. 2016a; Tan et al. 2016; Walter Anthony et al. 2016; Matthews et al. 2020; Saunois et al. 2020). Combined, CH4 emissions from northern ecosystems make up a significant but uncertain portion of fluxes from natural sources (232 to 367 Tg CH4 yr⁻¹ for averaged top-down and bottom-up global estimates, respectively; Saunois et al. 2020). One reason for the large range of high-latitude CH4 emission estimates is that top-down approaches yield consistently lower estimates than bottom-up approaches. Top-down approaches use atmospheric observations of CH4 concentrations with atmospheric-inverse modeling frameworks to estimate regional CH4 budgets (e.g. Bruhwiler et al. 2014; Thompson et al. 2017), while bottom-up approaches merge land cover datasets with empirical CH4 flux inventories or process-based models to scale emissions across regions (e.g. Wik et al. 2016a; Treat et al. 2018; Peltola et al. 2019). A key issue for bottom-up approaches is the lack of differentiation among wetland and lake types, despite clear evidence of differences in both the magnitude and drivers of CH4 fluxes among these types (Olefeldt et al. 2013; Turetsky et al. 2014; Wik et al. 2016a; Treat et al. 2018).
Net CH4 flux to the atmosphere depends on a suite of physical and biological controls linked to microbial production, oxidation, and transport via diffusion, ebullition, and plant-mediated processes (Bastviken et al. 2004;Whalen et al. 2005).
While the basic underlying CH4 processes are the same across all ecosystems, the dominance of different production, oxidation, and transport pathways varies within and among terrestrial (wetlands and non-wetlands) and lentic open-water aquatic ecosystems (lakes and ponds), leading to a wide range of reported CH4 fluxes at the site level, with differences of up to four orders of magnitude (Olefeldt et al. 2013; Wik et al. 2016a; Treat et al. 2018). Furthermore, drier terrestrial sites may draw down, or take up, CH4 from the atmosphere (Treat et al. 2018). Despite the wide range in reported CH4 fluxes, key overarching controls on emissions from wetlands and aquatic ecosystems have been identified through synthesis work (Olefeldt et al. 2013; Wik et al. 2016a; Treat et al. 2018), suggesting that different ecosystems can be partitioned based on a handful of key characteristics that control CH4 emissions.
For terrestrial ecosystems, CH4 fluxes across the Boreal-Arctic region are primarily linked to permafrost conditions and hydrology (Olefeldt et al. 2013; Treat et al. 2018), which encompass other important controls on CH4 emissions. For example, permafrost condition and hydrology can be directly linked to water table position and redox conditions (Moore et al. 1994; von Fischer et al. 2010; Olefeldt et al. 2017), which in turn influence plant composition (i.e. plant functional types including graminoids, Sphagnum mosses, shrubs, and trees; Olefeldt et al. 2013; Bridgham et al. 2013), microbial community composition (McCalley et al. 2014), productivity (Christensen et al., 2003), and organic matter availability (Wagner et al. 2003; Christensen et al. 2003). Both permafrost condition and hydrology can further be used as an indication of soil temperature, with typically colder conditions in drier soils and permafrost-dominated landscapes (Olefeldt et al. 2017). Methane fluxes are typically highest from graminoid-dominated wetlands such as marshes and fens, which are frequently inundated. Inundation, in turn, enhances primary productivity (Ström et al. 2012), creates a soil habitat conducive to CH4-producing microbes (Woodcroft et al. 2018), and facilitates CH4 transport through aerenchymatous roots and stems (Chanton et al. 1993; Ström and Christensen, 2007). Conversely, CH4 fluxes are typically low from permafrost bogs and bogs, which tend to have colder (in the case of permafrost bogs) and drier soil conditions (Belyea and Baird, 2006; Anderson et al. 2011) that are less conducive to the presence of graminoid species and promote the consumption of CH4 through oxidation (Bartlett et al. 1992; Moosavi and Crill, 1997).
Methane fluxes from aquatic ecosystems (lakes and ponds) are highly influenced by lake morphology (Rasilo et al. 2015; Holgerson and Raymond, 2016) and lake genesis (Wik et al. 2016a), including underlying permafrost conditions (Walter et al. 2006), which are associated with other key controls on CH4 fluxes. Lake morphology influences sediment temperature, macrophyte presence (Marinho et al. 2015; Wik et al. 2018), and turbulent transfer (MacIntyre et al. 2018). Lake morphology, permafrost condition, and lake genesis all determine organic substrate availability in sediments (Walter et al. 2006; Wik et al. 2016a) and trophic status (Bastviken et al. 2004; DelSontro et al. 2016). For example, peatland lakes and ponds, which form through degradation and permafrost thaw processes in peatlands, are relatively high CH4 emitters (Matveev et al. 2016; Kuhn et al. 2018; Burke et al., 2019). These waterbodies are underlain by organic-rich sediments, are typically small and shallow, and are less likely to be seasonally stratified, allowing for rapid sediment warming and carbon mineralization (Matveev et al. 2016). Glacial and post-glacial waterbodies, on the other hand, have relatively low CH4 fluxes due to deeper water columns, which limit ebullition by creating cooler sediment temperatures and greater hydrostatic pressures for bubbles to overcome (Bastviken et al. 2004; DelSontro et al. 2016). These waterbodies also tend to have mineral-rich sediments with typically less labile organic substrates (Schnurrenberger et al. 2003; DelSontro et al. 2016; Wik et al. 2016a). Therefore, while there are many physical and biogeochemical controls on aquatic CH4 fluxes, lake size and lake genesis can be useful proxies for many of these underlying factors.
Various methodologies are used to measure surface CH4 fluxes from terrestrial and aquatic ecosystems. Two approaches used in both terrestrial and aquatic ecosystems are micrometeorological eddy covariance (EC) techniques and chamber measurement techniques. Eddy covariance measurements are collected at high temporal frequencies from towers and typically cover a footprint of 100-10,000 m². The near-continuous nature of EC measurements provides valuable insight into the temporal patterns and drivers of CH4 fluxes; however, towers are geographically limited across the Boreal-Arctic region, and it can be difficult to attribute fluxes to transport pathways and specific source areas at fine spatial scales (Delwiche et al. 2021; Knox et al. 2019). Conversely, static chamber measurements cover small spatial areas that allow for detailed assessments of environmental controls on fluxes (Bäckstrand et al. 2008; Olefeldt et al. 2013). Chamber-based methods quantify fluxes by calculating the change in chamber headspace concentration over a set time, which varies based on extraction methods (i.e. syringe, automated chamber, or portable gas analyzer). While chamber-based techniques have drawbacks, including surface disturbance, typically low sampling frequency, and high labor intensity, they are easily installed, can capture environmental controls of CH4 fluxes at a sub-meter scale, and are cheaper than installing and maintaining EC towers. Thus, we focus mostly on chamber-based flux measurements in this synthesis because they have been performed at a large number of sites across the Boreal and Arctic region and represent more of the geographic variation across the region.
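The chamber calculation described above (change in headspace concentration over a set time) can be illustrated with a minimal sketch. It assumes a linear concentration increase, ideal-gas behavior, and concentrations reported as dry mole fractions in ppm; the function names, default temperature and pressure, and chamber dimensions are illustrative assumptions, not the procedure of any particular study in the dataset.

```python
# Minimal sketch: chamber CH4 flux from a linear fit of headspace concentration.
# All names and default values here are hypothetical illustrations.

R = 8.314462618   # universal gas constant, J mol-1 K-1
M_CH4 = 16.04     # molar mass of CH4, g mol-1


def linear_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


def chamber_flux_mg_m2_d(time_s, ch4_ppm, volume_m3, area_m2,
                         temp_c=20.0, pressure_pa=101325.0):
    """Return the CH4 flux in mg CH4 m-2 d-1 (negative values mean uptake)."""
    slope_ppm_s = linear_slope(time_s, ch4_ppm)              # ppm s-1
    mol_air_m3 = pressure_pa / (R * (temp_c + 273.15))       # mol air per m3
    dmol_m3_s = slope_ppm_s * 1e-6 * mol_air_m3              # mol CH4 m-3 s-1
    flux_mol_m2_s = dmol_m3_s * volume_m3 / area_m2          # mol CH4 m-2 s-1
    return flux_mol_m2_s * M_CH4 * 1000.0 * 86400.0          # mg CH4 m-2 d-1


# Example: a 30 L chamber over 0.1 m2, sampled four times in 30 minutes.
flux = chamber_flux_mg_m2_d([0, 600, 1200, 1800], [2.0, 2.6, 3.2, 3.8],
                            volume_m3=0.03, area_m2=0.1)
```

A decreasing headspace concentration yields a negative slope and therefore a negative flux, which corresponds to the CH4 uptake reported for drier terrestrial sites.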
In aquatic ecosystems, turbulence-driven modeling approaches, inverted funnels (i.e. bubble traps), and ice bubble surveys (IBS) are additionally used to quantify fluxes. Modeling approaches calculate net hydrodynamic flux (herein referred to as diffusion) to the atmosphere by determining the concentration of dissolved CH4 in the water column and an estimate of the gas transfer velocity k (see Sect. 2.4 for more information). Bubble traps capture the volume of bubble gas released from sediments; ebullitive flux can be estimated by using the concentration of CH4 found in the bubbles (Wik et al. 2013). Finally, IBS are used to quantify the spatial abundance and types of bubble formations trapped within lake ice over the winter (Walter et al. 2010). Importantly, these surface-based methods can be used to assess controls of CH4 exchange at the scale of individual ponds, lakes, and portions of open-water wetlands, providing key insights into the environmental processes controlling CH4 flux to the atmosphere (Olefeldt et al. 2013; Wik et al. 2016a).
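The two aquatic calculations described above can be sketched as follows. The diffusive flux is written in the commonly used bulk form F = k (C_w - C_eq), where the equilibrium concentration C_eq is an assumption added here (the text names only the dissolved concentration and k). The bubble-trap calculation converts the captured gas volume to moles with the ideal gas law. Function names, units, and default values are hypothetical.

```python
# Minimal sketch of diffusive and ebullitive CH4 flux estimates for lakes.
# Names, units, and defaults are illustrative assumptions.

R = 8.314462618   # universal gas constant, J mol-1 K-1
M_CH4 = 16.04     # molar mass of CH4, g mol-1


def diffusive_flux_mg_m2_d(k_m_d, c_water_mmol_m3, c_eq_mmol_m3=0.0):
    """Bulk diffusive flux F = k * (C_w - C_eq), returned in mg CH4 m-2 d-1.

    k_m_d: gas transfer velocity in m d-1.
    c_water_mmol_m3: dissolved CH4 in surface water, mmol m-3.
    c_eq_mmol_m3: concentration in equilibrium with the atmosphere, mmol m-3.
    """
    return k_m_d * (c_water_mmol_m3 - c_eq_mmol_m3) * M_CH4


def ebullitive_flux_mg_m2_d(bubble_volume_ml, ch4_fraction, funnel_area_m2,
                            days, temp_c=10.0, pressure_pa=101325.0):
    """Ebullitive flux from a bubble trap, returned in mg CH4 m-2 d-1."""
    volume_m3 = bubble_volume_ml * 1e-6
    mol_gas = pressure_pa * volume_m3 / (R * (temp_c + 273.15))  # ideal gas law
    mg_ch4 = mol_gas * ch4_fraction * M_CH4 * 1000.0
    return mg_ch4 / (funnel_area_m2 * days)


# Example: 100 mL of gas with 50% CH4 collected over 2 days by a 0.2 m2 funnel.
ebullition = ebullitive_flux_mg_m2_d(100.0, 0.5, 0.2, 2.0)
diffusion = diffusive_flux_mg_m2_d(k_m_d=1.0, c_water_mmol_m3=1.0)
```

Keeping the two pathways as separate functions mirrors the way the aquatic dataset reports diffusion and ebullition separately.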
Here we expanded, updated, and merged previous CH4 flux syntheses for northern wetlands (Olefeldt et al. 2013) and lakes (Wik et al. 2016a) to create a small-scale (sub-meter), surface-based dataset of CH4 fluxes collected from 189 studies across the Boreal-Arctic region. The dataset was built in parallel with a novel, CH4-specific land cover dataset for the circumpolar north, the Boreal-Arctic Wetland and Lake Dataset (BAWLD; Olefeldt et al. 2021), allowing flux observations and the spatial distribution of land cover features to be classified under the same criteria for the first time at a pan-Arctic scale. This dataset includes surface-based fluxes and associated environmental drivers from aquatic, wetland, and upland (i.e. non-wetland) ecosystems and can be utilized by both field researchers and the modeling community. The Boreal-Arctic region represents a potentially globally significant, but still highly uncertain, source of CH4. This dataset can be used to help constrain Boreal-Arctic flux estimates, compare field results, identify new research opportunities, or build and test models. This dataset includes and uniformly classifies lake, wetland, and upland (non-wetland) surface CH4 flux data for the circumpolar north. We show CH4 flux distributions and environmental drivers from various terrestrial (wetland and upland) and aquatic ecosystems across the north, compare the results to previous CH4 flux syntheses, highlight key gaps in the data, and suggest future research directions.

Dataset description and BAWLD land cover classification
The dataset is composed of two parts: 1) terrestrial ecosystems (vegetated wetland and non-wetland ecosystems) and 2) lentic open-water aquatic ecosystems (lakes, ponds, and open-water pools; hereafter referred to as "aquatic ecosystems"). This synthesis does not include lotic systems (streams and rivers), which are already synthesized in Stanley et al. (2015). The datasets for terrestrial and aquatic ecosystems are reported as separate components because both the drivers of CH4 fluxes and the data collection methods differ between them. The terrestrial dataset extends the work by Olefeldt et al. (2013), who compiled CH4 flux estimates for wetlands in the permafrost zones designated by Brown et al. (2002). Our dataset expands on this initial work to include flux data from non-permafrost and non-wetland sites throughout the Arctic and Boreal region (Olson et al. 2001) and flux data from studies published between 2012 and February 2020. We updated the initial dataset to include separate entries for individual sites that reported flux and water table data for multiple years. We expanded the number of site-year flux estimates in the original terrestrial dataset by 83% and the number of independent studies by 86%, leading to a total of 555 warm-season (~May through October depending on the location) flux estimates from 121 studies (Fig. 1a). The aquatic dataset extends the work by Wik et al. (2016a), a compilation of CH4 flux data for lakes and ponds north of 50° N. We expand on this initial work to include studies published between 2016 and February 2020. Additionally, we updated the original aquatic dataset to include the within-lake location for ebullition measurements and the equation used to model the gas transfer velocity k. We expanded the number of lakes in the dataset by 71% and the number of studies by 66%, for a total of 1251 lakes and 68 independent studies (Fig. 1b).
Finally, each terrestrial and aquatic site was reclassified into a new land cover classification, further explained below.

Figure 1: Sites included in the aquatic flux dataset. The number of "sites" in the terrestrial dataset represents site-years, which in some cases represent multiple years of data from one site or data from the same site reported by different studies. "Sites" in the aquatic dataset represent the reported average fluxes for one or multiple lakes. In some cases, studies reported one mean value for multiple lakes; therefore, the number of lakes and the number of sites are not the same. *Boreal-Arctic Region boundary from Olson et al. 2001. Permafrost zones are from Brown et al. 2002. Continental shoreline base layers are from Wessel et al. 1996.

Land cover classes in the Boreal-Arctic Wetland and Lake Dataset
Land cover classes in the Boreal-Arctic Wetland and Lake Dataset (BAWLD; Olefeldt et al. 2021) were chosen and defined to enable upscaling of CH4 fluxes at large spatial scales. As such, we aimed to include as few classes as possible to facilitate large-scale mapping, while still having sufficient classes to allow separation among groups of ecosystems with similarities in hydrology, ecology, and biogeochemistry, and therefore in net CH4 fluxes. The BAWLD land cover classification is hierarchical, with four upland classes, five wetland classes, seven lentic aquatic classes, and three lotic aquatic classes. As mentioned previously, fluxes from lotic ecosystems (streams and rivers) are not included in this dataset but are covered by Stanley et al. (2015).
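As a rough illustration of this hierarchy, the classes can be held in a simple mapping from group to class names. The class names follow the class descriptions in this section; the dictionary layout, the group keys, and the helper function are hypothetical and are not the format of the published dataset. The three lotic classes are omitted because they are not part of this flux dataset.

```python
# Minimal sketch of the BAWLD class hierarchy used in this dataset.
# The structure and key names are illustrative assumptions.

BAWLD_CLASSES = {
    "upland": ["Glaciers", "Rocklands", "Dry Tundra", "Boreal Forests"],
    "wetland": ["Bogs", "Fens", "Marshes", "Permafrost Bogs", "Tundra Wetlands"],
    "lentic_aquatic": [
        "Large Lakes",
        "Midsize Peatland Lakes", "Midsize Yedoma Lakes", "Midsize Glacial Lakes",
        "Small Peatland Lakes", "Small Yedoma Lakes", "Small Glacial Lakes",
    ],
}


def group_of(land_cover_class):
    """Return the group a class belongs to, or None if the class is unknown."""
    for group, classes in BAWLD_CLASSES.items():
        if land_cover_class in classes:
            return group
    return None


# Example: look up the group for a wetland class.
fen_group = group_of("Fens")
```

A flat mapping like this makes it easy to check that every site record carries exactly one recognized class before fluxes are summarized by class.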

Wetland Classes
Wetlands are defined by having a water table near or above the land surface for sufficient time to cause the development of wetland soils (either mineral soils with redoximorphic features, or organic soils with > 40 cm peat), and the presence of plant species with adaptations to wet environments (Canada Committee on Ecological (Biophysical) Land Classification et al., 1997; Jorgenson et al., 2001; Hugelius et al., 2020). Wetland classifications for boreal and arctic biomes can focus either on small-scale wetland classes that have distinct hydrological regimes, vegetation composition, and biogeochemistry or on larger-scale wetland complexes that are comprised of distinct patterns of smaller wetland and open-water classes (Glaser et al., 2004; Masing et al., 2010; Gunnarsson et al., 2014; Terentieva et al., 2016). While larger-scale wetland complexes are easier to identify through remote sensing techniques (e.g. patterned fens comprised of higher elevation ridges and inundated hollows), our classification focuses on wetland classes due to greater homogeneity of hydrological, ecological, and biogeochemical characteristics that regulate CH4 fluxes (Heiskanen et al., 2021).
Several boreal countries identify four main wetland classes, differentiated primarily based on hydrodynamic characterization: bogs, fens, marshes, and swamps (Canada Committee on Ecological (Biophysical) Land Classification et al., 1997; Masing et al., 2010; Gunnarsson et al., 2014). The BAWLD classification follows this general framework, but further uses the presence or absence of permafrost as a primary characteristic for classification and excludes a distinct swamp class, yielding five classes: Bogs, Fens, Marshes, Permafrost Bogs, and Tundra Wetlands (see Fig. 2 and Fig. 3). The swamp class was omitted due to the wide range of moisture and nutrient conditions of swamps, as well as the limited number of studies of swamp CH4 fluxes. We instead included swamp ecosystems in expanded descriptions of Bogs, Fens, and Marshes. The presence or absence of near-surface permafrost was used as a primary characteristic to distinguish between Permafrost Bogs and Bogs and to distinguish Tundra Wetlands from Marshes and Fens. The presence or absence of near-surface permafrost is considered key for controlling CH4 emissions given its influence on hydrology, and for the potential of permafrost thaw and thermokarst collapse to cause rapid non-linear shifts in CH4 emissions (Bubier et al., 1995; Turetsky et al., 2002; Malhotra and Roulet, 2015;

Bogs are described as ombrotrophic peatland ecosystems, i.e. dependent only on precipitation and snowmelt for water inputs. Peat thickness is at least 40 cm, with maximum thickness > 10 m. The peat profile is not affected by permafrost, although in some climatically colder settings there may be permafrost below the peat profile. Bogs are wet to saturated ecosystems, often with small-scale (<10 m) microtopographic variability, with stagnant water and a water table that is rarely above the surface or more than 50 cm below the surface (Fig. 3). Bogs have low pH (<5), low concentrations of dissolved ions, and low nutrient availability resulting from a lack of hydrological connectivity to surrounding mineral soils. Vegetation is commonly dominated by Sphagnum mosses, lichens, and woody shrubs, and can be either treed or treeless (Beaulne et al., 2021). Our description of Bogs also includes what is commonly classified as treed swamps, which generally represent ecotonal transitions between peatlands and upland forests (Canada Committee on Ecological (Biophysical) Land Classification et al., 1997).
Fens are described as minerotrophic peatland ecosystems, i.e. hydrologically connected to surrounding mineral soils through surface water or groundwater inputs. A Fen peat profile is at least 40 cm thick (Gorham et al. 1991), although maximum peat thickness is generally less than for bogs. The peat profile is not affected by permafrost. Fens are wet to saturated ecosystems, with generally slow-moving water (Fig. 3). Fens have widely ranging nutrient regimes and levels of dissolved ions depending on the degree and type of hydrological connectivity to their surroundings, ranging from poor fens to rich fens. Vegetation largely depends on wetness and nutrient availability, where more nutrient-poor fens can have Sphagnum mosses, shrubs, and trees, while rich fens are dominated by brown mosses, graminoids (sedges, rushes), herbaceous plants, and sometimes coniferous or deciduous trees (e.g. willows, birch, larch). Our description of Fens also includes what is commonly classified as shrubby swamps, which often are associated with riparian ecotones and lake shorelines.
Marshes are minerotrophic wetlands with dynamic hydrology and often high nutrient availability (Fig. 3). Vegetation is dominated by emergent macrophytes, including tall graminoids such as rushes, reeds, grasses, and sedges, some of which can persist in settings with >1.5 m of standing water. Marshes are saturated to inundated wetlands, often with highly fluctuating water levels, as they generally are located along shorelines of lakes or coasts, along streams and rivers, or on floodplains and deltas. It is common for marshes to exhibit both flooded and dry periods. Dry periods facilitate the decomposition of organic matter and can prevent the build-up of peat. As such, Marshes generally have mineral soils, although some settings allow for the accumulation of highly humified organic layers, sometimes indicating ongoing succession towards a peatland ecosystem. Salinity can vary depending on water sources, with brackish to saline conditions in some areas of groundwater discharge, or in coastal settings.
Permafrost Bogs are peatland ecosystems, although the peat thickness in cold climates is often relatively shallow. Permafrost Bogs have a seasonally thawed active layer that is 30 to 70 cm thick, with the remainder of the peat profile perennially frozen (i.e. permafrost). Excess ground ice and ice expansion often elevate Permafrost Bogs up to a few meters above their surroundings, and as such, they are ombrotrophic and relatively well drained (Fig. 3). Permafrost Bogs have moist to wet soil conditions, often with a water table that follows the base of the seasonally developing thawed soil layer. Ombrotrophic conditions cause nutrient-poor conditions, and the vegetation is dominated by lichens, Sphagnum mosses, woody shrubs, and sometimes stunted coniferous trees. Permafrost Bogs are often interspersed in a fine-scale mosaic (10 to 100 m) with other wetland classes, e.g. Bogs and Fens. Common Permafrost Bog landforms include palsas, peat plateaus, and the elevated portions of high- and low-center polygonal peatlands.
Tundra Wetlands are treeless ecosystems with saturated to inundated conditions, most commonly with near-surface permafrost (Fig. 3). Tundra Wetlands can have either mineral soils or shallow organic soils, and generally receive surface or near-surface waters from their surroundings, as permafrost conditions preclude connectivity to deeper groundwater sources. Vegetation is dominated by short emergent vegetation, including sedges and grasses, with mosses and shrubs in slightly drier sites. Tundra Wetlands have a lower maximum depth of standing water than Marshes, due to the shorter vegetation. Tundra Wetlands can be found in basin depressions, in low-center polygonal wetlands, and along rivers, deltas, lake shorelines, and on floodplains in regions of continuous permafrost. Despite the name, a limited number of wetlands with these characteristics (hydrology, permafrost conditions, and vegetation) can also be found within the continuous permafrost zone in boreal and sub-arctic regions (Virtanen et al., 2016).

Upland and Other Classes
The upland and other classes in BAWLD (Glaciers, Rocklands, Dry Tundra, and Boreal Forests) have in common that they are neither wetlands nor aquatic ecosystems. Glaciers are assumed to have neutral CH4 fluxes; however, to our knowledge there are no published studies with field data from the glacier surface. A handful of studies highlight lateral CH4 export and emission from glacial outflows and termini (Christiansen & Jørgensen, 2018; Burns et al. 2018; Lamarche-Gagnon et al. 2019); however, because atmospheric flux measurements and information on the spatial distribution of termini features are both limited, and because their areas are difficult to map at the circumpolar scale, we did not include these fluxes. Fluxes from glacial outflows and streams are considered riverine fluxes, which our flux synthesis does not include. Rocklands are also expected to have very low CH4 fluxes (Oh et al. 2020), potentially with more frequent CH4 uptake than release. No sites included in the database were described as Rocklands (Emmerton et al. 2014). There are five sites described as high polar desert or desert tundra, which were included as Dry Tundra sites.
The Dry Tundra class includes both lowland arctic tundra and alpine tundra, both of which are treeless ecosystems dominated by graminoid or shrub vegetation. Dry Tundra ecosystems generally have near-surface permafrost, with seasonally thawed active layers between 20 and 150 cm depending on climate, soil texture, and landscape position (van der Molen et al., 2007; Heikkinen et al., 2004). Near-surface permafrost in Dry Tundra prevents vertical drainage, but lateral drainage ensures predominantly oxic soil conditions. A water table is either absent or close to the base of the seasonally thawing active layer. Dry Tundra is differentiated from Permafrost Bogs by having thinner organic soil (<40 cm), and from Tundra Wetlands by its drained soils (average water table position >5 cm below the soil surface).

Boreal Forests are treed ecosystems with non-wetland soils. Coniferous trees are dominant, but the class also includes deciduous trees in warmer climates and landscape positions. Boreal Forests may have permafrost or non-permafrost ground, where the absence of permafrost often allows for better drainage. Overall, it is rare for anoxic conditions to occur in Boreal Forest soils, and CH4 uptake is prevalent, although low CH4 emissions have been observed during brief periods during snowmelt or following summer storms (Matson et al., 2009), or conveyed through tree stems and shoots (Machacova et al., 2016). The Boreal Forest class also includes the few agricultural/pasture ecosystems within the boreal biome.
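The two thresholds that separate Dry Tundra from Permafrost Bogs and Tundra Wetlands (organic soil thickness of 40 cm and an average water table 5 cm below the surface) can be expressed as a small decision rule. This is a simplified sketch that applies only to treeless sites with near-surface permafrost; the function name, argument names, and the use of None for an absent water table are assumptions made for illustration, and real classification also draws on vegetation and site descriptions.

```python
# Minimal sketch: separating three permafrost-affected classes using the
# two thresholds stated in the text. Names are hypothetical.

def classify_treeless_permafrost_site(organic_layer_cm, water_table_cm_below_surface):
    """Classify a treeless, near-surface-permafrost site.

    organic_layer_cm: organic soil thickness in cm.
    water_table_cm_below_surface: average water table depth in cm below the
        soil surface (negative if above the surface), or None if no water
        table is present.
    """
    # Dry Tundra has thinner organic soil (<40 cm) than Permafrost Bogs.
    if organic_layer_cm >= 40:
        return "Permafrost Bogs"
    # Dry Tundra has drained soils: water table absent or >5 cm below surface.
    if water_table_cm_below_surface is None or water_table_cm_below_surface > 5:
        return "Dry Tundra"
    return "Tundra Wetlands"


# Example: thin organic layer with a water table 20 cm below the surface.
site_class = classify_treeless_permafrost_site(15, 20)
```

Checking organic layer thickness first reflects the order of the contrasts in the text; a different ordering could be equally defensible for sites that sit near both thresholds.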

Aquatic Classes
Lakes in BAWLD include all lentic open-water ecosystems (herein referred to as aquatic ecosystems), regardless of surface area and depth of standing water. It is common in ice-rich permafrost lowlands and peatlands for open waterbodies to have shallow depths, often less than two meters, even when surface areas are up to hundreds of km² (Grosse et al., 2013). While small, shallow open waterbodies often are included in definitions of wetlands (Gunnarsson et al., 2014; Treat et al., 2018; Canada Committee on Ecological (Biophysical) Land Classification et al., 1997), we include them here within the lake classes, as controls on net CH4 emissions depend strongly on the presence or absence of emergent macrophytes. Further classification of lakes in BAWLD is based on lake size and lake genesis, where lake genesis influences lake bathymetry and sediment characteristics (Fig. 4). Previous global spatial inventories of lakes include detailed information on size and location of individual larger lakes (Downing et al., 2012; Messager et al., 2016), but do not include open-water ecosystems <0.1 km² in size, and do not differentiate between lakes of different genesis (e.g. tectonic, glacial, organic, and yedoma lakes). Small waterbodies are disproportionately abundant in some high-latitude environments (Muster et al., 2019), have high emissions of CH4 (Holgerson and Raymond, 2016), and therefore require explicit classification apart from larger waterbodies. Furthermore, lake genesis and sediment type have been shown to influence net CH4 flux from lakes (Wik et al., 2016a). In BAWLD we thus differentiate between large (>10 km²), midsize (0.1 to 10 km²), and small (<0.1 km²) lake classes, and further differentiate between three lake types for midsize and small lakes: peatland, yedoma, and glacial lakes (Fig. 4).
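The size and type rules above map directly onto the seven lentic classes. The sketch below uses the stated area thresholds (10 km² and 0.1 km²); the function name and argument names are hypothetical, and the lake type is assumed to have been assigned already from sediment and genesis information.

```python
# Minimal sketch: assigning a BAWLD lake class from surface area and lake type.
# Function and argument names are illustrative assumptions.

LAKE_TYPES = ("Peatland", "Yedoma", "Glacial")


def classify_lake(area_km2, lake_type=None):
    """Return the BAWLD lake class for a lentic waterbody.

    Large Lakes (>10 km2) are not differentiated by type; midsize
    (0.1 to 10 km2) and small (<0.1 km2) lakes require a lake type.
    """
    if area_km2 > 10:
        return "Large Lakes"
    if lake_type not in LAKE_TYPES:
        raise ValueError(f"lake_type must be one of {LAKE_TYPES}")
    size = "Midsize" if area_km2 >= 0.1 else "Small"
    return f"{size} {lake_type} Lakes"


# Example: a 0.05 km2 pool surrounded by peatland.
lake_class = classify_lake(0.05, "Peatland")  # "Small Peatland Lakes"
```

Treating the boundary values 0.1 and 10 km² as midsize follows the "0.1 to 10 km²" wording; the text does not state how exact boundary cases were handled, so that choice is an assumption.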
Small and Midsize Peatland Lakes are described as lakes with thick organic sediments that are mainly found adjacent to or surrounded by peatlands, or in lowland tundra regions with organic-rich soils. Small Peatland Lakes include the numerous small pools often found in extensive peatlands and lowland tundra regions, e.g. the open-water parts of string fens and polygonal peatlands. Peatland Lakes generally form as a result of interactions between local hydrology and the accumulation of peat, which can create open-water pools and lakes (Garneau et al., 2018; Harris et al., 2020), but can also form in peatlands as a result of permafrost dynamics (Sannel and Kuhry, 2011; Liljedahl et al., 2016). As such, these lakes with thick organic sediments are often shallow and have a relatively low shoreline development index. Peatland Lakes typically have dark waters with high concentrations of dissolved organic carbon.

Small and Midsize Yedoma Lakes are exclusive to non-glaciated regions of eastern Siberia, Alaska, and the Yukon where yedoma deposits accumulated during the Pleistocene (Strauss et al., 2017). Yedoma permafrost soils are ice-rich and contain fine-grained, organic-rich loess that was deposited by wind and accumulated upwards in parallel with permafrost aggradation, thus limiting decomposition and facilitating organic matter burial (Schirrmeister et al., 2013). Notable thermokarst features, including lakes, often develop when yedoma permafrost thaws, causing labile organic matter to become available for microbial mineralization (Walter Anthony et al., 2016). Small Yedoma Lakes are thus more likely to have actively thawing and expanding lake edges where CH4 emissions can be extremely high, largely driven by hot spot ebullition emissions (Walter Anthony et al., 2016; Fig. 4). Century-scale development of yedoma lakes can shift the main source of CH4 production from yedoma deposits to new organic-rich sediment that accumulated from allochthonous and autochthonous sources, resulting in such lakes here being considered as Peatland Lakes.

Small and Midsize Glacial Lakes include predominantly those formed through glacial or post-glacial processes, e.g. kettle lakes and bedrock depressions. However, due to similarities in CH4 emissions and controls thereof, we also include all other lakes with organic-poor sediments within these classes. Glacial Lakes typically have rocky bottoms or mineral sediments with limited organic content. Lakes in this class are abundant on the Canadian Shield and in Scandinavia but can be found throughout the boreal and tundra biomes. Many Glacial Lakes have a high shoreline development index, with irregular, elongated shapes. Generally, Glacial Lakes are deeper than lakes of similar area in the other classes and are more likely to stratify seasonally than Peatland Lakes (Fig. 4).
Large Lakes are greater than 10 km² in surface area. Most Large Lakes are glacial or structural/tectonic in origin. Lake genesis is not considered for further differentiation within this land cover class.

Terrestrial Methane Flux Dataset
The terrestrial CH4 flux dataset includes warm-season (~May-October depending on the location) fluxes and was compiled using data from studies published before February 2020. We identified relevant studies using 1) JSTOR™, Google Scholar™, and Web of Science™ searches with the terms (peatland OR wetland OR bog OR fen OR marsh OR upland) AND (north* OR boreal OR arctic OR sub-arctic) AND (methane OR CH4 OR greenhouse gas*); 2) references from published studies; and 3) contributions of unpublished data (n=1). If CH4 flux and water table measurements were reported for multiple years from one site, or if multiple studies reported fluxes from the same site, the data were entered as separate individual lines and each was considered its own "site". Sites that underwent manipulations (soil temperature, water table, nutrients, etc.) were not included in the dataset; however, any control or undisturbed sites within manipulation studies were included. Sites that had recently experienced disturbance from thermokarst processes were included. Winter flux measurements from terrestrial sites were excluded from this dataset (winter/ice-out emissions from aquatic ecosystems are included; see Sect. 2.3). A comprehensive synthesis of seasonal winter estimates of CH4 emissions from northern terrestrial ecosystems is presented in Treat et al. (2018).
The terrestrial dataset includes predominantly chamber measurements (n = 519) at the sub-meter scale, which allows for a detailed representation of specific land cover classes (i.e. one land cover class per chamber measurement). However, a handful of eddy covariance (EC) measurements were included if the authors could clearly partition fluxes based on specific land cover classes (n = 36). For more information on EC-based CH4 synthesis, we direct readers to the FLUXNET-CH4 Community Product (Delwiche et al. 2021; Knox et al. 2019) and additional FluxNet resources (fluxnet.org). We grouped chamber measurements from specific studies by "Site", which we defined as surfaces with similar vegetation composition (dominant, present, absent) and physical characteristics (including water table position, permafrost conditions, organic layer depth, soil moisture, and pH) within proximity to each other (typically 1-100 m radius). In most cases, chambers and sites were already classified by these standards; however, it was sometimes necessary to combine or split chamber measurements presented by the authors to fit our site definition and classifications. By combining and splitting sites this way, we were able to classify sites into BAWLD land cover classes. Average daily warm-season fluxes were then calculated from the average CH4 flux from each site over the study's measurement period.
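The site-level averaging described above can be sketched as follows. This is a minimal illustration, not the procedure used to build the dataset: the record layout (pairs of a site identifier and a chamber flux), the function name `site_mean_fluxes`, and the example values are all hypothetical.

```python
from collections import defaultdict
from statistics import mean


def site_mean_fluxes(records):
    """Average chamber fluxes per site.

    records: iterable of (site_id, flux) pairs, with flux in a consistent
    unit such as mg CH4 m-2 d-1 (layout assumed for illustration).
    Returns a dict mapping each site_id to its mean flux.
    """
    by_site = defaultdict(list)
    for site_id, flux in records:
        by_site[site_id].append(flux)
    return {site_id: mean(values) for site_id, values in by_site.items()}


# Hypothetical chamber measurements grouped into two sites.
example = [("fen_A", 40.0), ("fen_A", 60.0), ("bog_B", 5.0)]
site_means = site_mean_fluxes(example)
```

Each resulting site mean would then be assigned a single land cover class, which is what keeps the one-class-per-measurement property of the chamber data.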
In addition to CH4 flux data, we extracted various site descriptors and categorical and continuous environmental variables (see Table 1 for detailed attribute information and additional variables not discussed here). For all sites, we included information on the site name (Site), location (LatDec/LongDec, Country), the months measurements were taken (SampMonths), the flux measurement method (Meth), the author's description of the site (SiteDescrip), and vegetation composition. Most studies did not classify land cover types with criteria similar to those of BAWLD; therefore, we assigned BAWLD land cover classifications. Permafrost zone was assigned according to Brown et al. (2002). When reported by the authors, we also Eriophorum spp; referred to as "Sedge" in the dataset), sphagnum moss (Sphag), non-sphagnum moss (Moss), tree, and shrub species. Vegetation composition of the functional plant type was considered dominant if that type made up greater than 50% of the reported biomass or areal coverage or was one of only two species present at the site. Trees were assigned as the dominant vegetation type if the canopy was described as closed. Gridded (0.5 by 0.5 degrees) climate variables including mean annual temperature (referred to as GRID_T) and mean annual precipitation (CD_Pcp_An) were extracted from WorldClim2 (http://www.worldclim.com/version2).

Aquatic Methane Flux Dataset
The aquatic flux dataset includes ice-free season (~May-October depending on the location) and winter/ice-out fluxes and was compiled using data from studies published before February 2020. We identified new studies using 1) JStore™, Google Scholar™ and Web of Science™ searches with the terms (lake* OR pond*) AND (north* OR boreal OR arctic OR sub-arctic) AND (methane OR CH4 OR greenhouse gas*); 2) references from published studies; and 3) contributions of unpublished data (n = 1). If multiple yearly measurements were given for one site by the same study, we averaged the flux values (following the initial protocol taken by Wik et al. 2016a). If different studies reported fluxes from the same lake, these data were reported as separate entries. In instances where ice-free season fluxes and storage/ice-out fluxes were reported for the same lake, those data were entered on separate lines, but the number of lakes was designated as NA for the winter measurement so as not to add to the total lake count. We defined sites based on reported average CH4 fluxes. For example, some studies reported one average flux value for a group of lakes; this was considered one "site," but the number of lakes was noted. Studies that only reported CH4 concentrations and not a flux estimate were not included.
Similar to the terrestrial dataset, the aquatic dataset focuses on small-scale measurement techniques that allow for flux estimates to be attributed to one specific land cover class. Therefore, ice-free season diffusive fluxes included in this dataset were measured using dissolved CH4 concentrations and modeling approaches (n = 254) or floating chambers (n = 181), while ebullitive fluxes were measured by bubble trap (n = 187) or floating chamber (n = 34). Diffusive modeling approaches include an estimate of the gas transfer coefficient, k. Gas transfer velocity estimates are commonly calculated using established equations (e.g. Cole and Caraco, 1998). However, more recent efforts with EC systems, chambers, and either calculation or measurement of the near-surface turbulence that enables flux across the air-water interface indicate that Cole and Caraco's (1998) wind-based model of gas transfer velocities underestimates fluxes from non-sheltered and sheltered waterbodies by a factor of two to four (Heiskanen et al. 2014; Mammarella et al. 2015; MacIntyre et al. 2020). Highly sheltered waterbodies, such as small lakes surrounded by trees, may be an exception and can have reduced mean lake k values (Markfort et al. 2010). While we do not recalculate fluxes in this synthesis, we indicate which k calculations were used so that future studies can easily identify and recalculate fluxes when required. Only a handful of EC measurements (n = 5) were included in the dataset. We included a limited number of EC measurements due to difficulties that most studies had in attributing the fluxes to lakes specifically. We classified all EC fluxes as diffusive fluxes because it is hard to separate ebullition from diffusion with this measurement technique; for this reason, EC measurements were excluded from statistical analysis for ice-free season fluxes.
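To make the modeled diffusive flux concrete, the sketch below uses the commonly cited wind-speed relation of Cole and Caraco (1998), k600 = 2.07 + 0.215 U10^1.7 (cm h⁻¹), together with the usual boundary-layer form F = k (Cw − Ceq). It is an illustration under stated assumptions rather than the calculation used by any particular study in the dataset: the function names, the choice to pass the CH4 Schmidt number in as an input, the default Schmidt exponent of 2/3, and the unit conventions are all assumptions.

```python
CH4_MOLAR_MASS_MG_PER_MMOL = 16.04  # mg CH4 per mmol


def k600_cole_caraco(u10_m_s: float) -> float:
    """Gas transfer velocity normalised to a Schmidt number of 600 (cm h-1),
    from wind speed at 10 m height (m s-1)."""
    return 2.07 + 0.215 * u10_m_s ** 1.7


def diffusive_flux_mg_m2_d(c_water, c_eq, u10_m_s, schmidt_ch4, n=2.0 / 3.0):
    """Diffusive CH4 flux in mg CH4 m-2 d-1.

    c_water, c_eq: dissolved and air-equilibrium CH4 (mmol m-3, i.e. umol L-1).
    schmidt_ch4: Schmidt number of CH4 at the in situ water temperature,
                 supplied by the caller (assumed input).
    n: Schmidt exponent; 2/3 is an assumed default for low wind speeds.
    """
    k600_cm_h = k600_cole_caraco(u10_m_s)
    k_ch4_cm_h = k600_cm_h * (schmidt_ch4 / 600.0) ** (-n)
    k_ch4_m_d = k_ch4_cm_h * 0.24  # cm h-1 -> m d-1
    return k_ch4_m_d * (c_water - c_eq) * CH4_MOLAR_MASS_MG_PER_MMOL
```

Because the text notes that this wind-based model may underestimate fluxes by a factor of two to four, recording which k model a study used is what makes a later recalculation with a different k possible.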
We further delineated aquatic fluxes by transport pathway including ebullition (bubbles), diffusion (hydrodynamic flux), and winter storage/ice-out flux. Ebullition and diffusion measurements were averaged over the ice-free season to represent a mean daily flux estimate across a lake. In some cases, if measurements were only taken from one zone of the lake (i.e. just lake edge or just lake center), we averaged the fluxes and assumed whole-lake fluxes. Some studies only reported a seasonal ice-free flux estimate. If they also reported the number of days in the ice-free season, we then calculated the average daily flux rate. Storage/ice-out flux includes the annual release of CH4 that accumulates within and under the ice over the winter and is released during spring turnover, and also includes estimates from ice bubble surveys (IBS). Our storage flux estimate does not include estimates of spring or fall circulation fluxes, wherein CH4 that is stored in the deep portion of the water column is released upon seasonal turnover of the water column (Karlsson et al. 2013; Sepulveda-Jauregui et al. 2015). We also include an estimate of the ice-free season ebullition and diffusive fluxes if provided by the authors or if the authors provided the number of ice-free days. Note that flux measurements that include the transport of CH4 through littoral vascular plants were not included as aquatic fluxes, but as Marsh or Tundra Wetland fluxes within the terrestrial dataset.

In addition to aquatic CH4 flux data, we also collected various site descriptors and categorical and continuous environmental variables (See Table 2 specific categorical variables include the overarching lake genesis type (TYPE), binned waterbody size (SIZE), and BAWLD land cover class (CLASS). BAWLD land cover classes were assigned based on author descriptions of the waterbodies.
If the authors did not provide information indicating the lake type, we used the coordinates provided to find the waterbody on Google Earth™ and used yedoma permafrost (Strauss et al. 2017) and organic soil maps (Hugelius et al. 2014) to determine the land cover class. In a handful of cases where the land cover class could not be determined, we left the Class field blank. When reported, we extracted the following continuous variables: surface area (SA), waterbody depth (DEPTH), water temperature (TEMP), dissolved organic carbon concentration (DOC), and pH. Gridded (0.5 by 0.5 degrees) climate variables including mean annual temperature (GRID_T) and mean annual precipitation (CD_Pcp_An) were extracted from WorldClim2 (http://www.worldclim.com/version2).
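The conversion between a seasonal ice-free flux and a mean daily flux rate, described above for studies that reported only a seasonal total, amounts to a division by the length of the ice-free season. The sketch below shows both directions; the function names and the units (mg CH4 m⁻² per season and mg CH4 m⁻² d⁻¹) are assumptions chosen for illustration.

```python
def mean_daily_flux(seasonal_flux: float, ice_free_days: int) -> float:
    """Mean daily flux (e.g. mg CH4 m-2 d-1) from a seasonal total
    (e.g. mg CH4 m-2 per ice-free season)."""
    if ice_free_days <= 0:
        raise ValueError("ice_free_days must be positive")
    return seasonal_flux / ice_free_days


def seasonal_flux(daily_flux: float, ice_free_days: int) -> float:
    """Ice-free season total from a mean daily flux and the season length."""
    if ice_free_days <= 0:
        raise ValueError("ice_free_days must be positive")
    return daily_flux * ice_free_days
```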

Statistics
All statistical analyses were performed in R statistical software (Version 1.1.383; www.r-project.org). We tested for significant relationships between log-transformed warm-season (terrestrial sites) or ice-free season (aquatic sites) average CH4 fluxes and several covariates using a combination of linear regression and, when necessary, linear mixed-effects models (R Package 3.3.3; lme4 package; Bates et al. 2014). To include sites with CH4 uptake or near-zero fluxes, we added a constant of 10 (terrestrial fluxes) or 1 (aquatic fluxes) before log transformation. Mixed-effects modeling was used when a given model included sites with multiple yearly measurements or if multiple studies reported fluxes from the same site (R "nlme" package; Pinheiro et al., 2017). In these cases, site ID was included as a random effect in the analysis to help account for lack of independence across repeated measurements and to weight potential biases (Treat et al. 2018). Almost no studies in the terrestrial or aquatic datasets provided information on all of the variables; therefore, individual statistical analyses have different sample sizes. However, the same subset of data was used to select the best performing mixed models (n = 206 and n = 149 for the terrestrial and diffusive aquatic mixed models, respectively). The significance of individual predictor variables in the mixed models was evaluated using forward model selection. Model performance was evaluated using the size-corrected Akaike information criterion (AICc; "AICcmodavg" package; Mazerolle & Mazerolle, 2017), wherein a decrease in AICc of 2 or more was taken as an indication of a superior model (as in Olefeldt et al. 2013 and Dieleman et al. 2020). All models were tested against each other and the null model. The null model only included the random effects. Non-parametric Tukey's HSD post-hoc tests were performed to assess differences in median fluxes among sub-categories if the overall model was determined significant.
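Two of the steps above lend themselves to a short worked sketch: the offset log transformation and the AICc comparison. The analysis itself was run in R; the Python below is only an illustration under stated assumptions. The base of the logarithm is not given in the text, so base 10 is assumed here, and the function names are hypothetical. The AICc expression is the standard small-sample correction, AICc = AIC + 2k(k + 1)/(n − k − 1), with AIC = 2k − 2 ln L.

```python
import math


def log_flux(flux: float, offset: float) -> float:
    """Log-transform a flux after adding a constant offset
    (10 for terrestrial fluxes, 1 for aquatic fluxes, per the text).
    Base 10 is an assumption."""
    shifted = flux + offset
    if shifted <= 0:
        raise ValueError("flux + offset must be positive")
    return math.log10(shifted)


def aicc(log_likelihood: float, k: int, n: int) -> float:
    """Size-corrected AIC for a model with k estimated parameters
    fitted to n observations."""
    if n - k - 1 <= 0:
        raise ValueError("need n > k + 1 for the AICc correction")
    aic = 2 * k - 2 * log_likelihood
    return aic + (2 * k * (k + 1)) / (n - k - 1)


def is_superior(candidate_aicc: float, reference_aicc: float,
                threshold: float = 2.0) -> bool:
    """True if the candidate lowers AICc by at least the threshold."""
    return reference_aicc - candidate_aicc >= threshold
```

Under this rule, a candidate model whose AICc differs from the reference by less than 2 is not treated as an improvement, whatever its explained variance.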
All aquatic diffusive and ebullitive fluxes were analyzed separately. Eddy covariance CH4 flux estimates for aquatic ecosystems (n = 5) were not included in the statistical analysis since ebullitive and diffusive fluxes could not be partitioned. We modeled the temperature dependence (Q10) of CH4 fluxes following Rasilo et al. (2015).
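As a hedged illustration of what an apparent Q10 represents, the sketch below assumes a simple exponential temperature response, ln F = a + bT, fitted by ordinary least squares, so that Q10 = exp(10b). This is a generic formulation and may differ in detail from the procedure of Rasilo et al. (2015); the function name and the synthetic data are hypothetical, and fluxes must be positive (or offset before use, as described for the log transformation).

```python
import math


def apparent_q10(temps_c, fluxes):
    """Apparent Q10 from an OLS fit of ln(flux) against temperature (deg C).

    Assumes flux = exp(a + b * T), so Q10 = exp(10 * b).
    """
    if len(temps_c) != len(fluxes) or len(temps_c) < 2:
        raise ValueError("need at least two paired observations")
    logs = [math.log(f) for f in fluxes]  # raises ValueError if f <= 0
    n = len(temps_c)
    t_mean = sum(temps_c) / n
    l_mean = sum(logs) / n
    sxx = sum((t - t_mean) ** 2 for t in temps_c)
    if sxx == 0:
        raise ValueError("temperatures must not all be identical")
    sxy = sum((t - t_mean) * (l - l_mean) for t, l in zip(temps_c, logs))
    slope = sxy / sxx
    return math.exp(10.0 * slope)


# Synthetic fluxes that double for every 10 deg C of warming.
temps = [0.0, 5.0, 10.0, 15.0, 20.0]
synthetic = [5.0 * 2.0 ** (t / 10.0) for t in temps]
q10 = apparent_q10(temps, synthetic)
```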

Limitations
Due to limitations of the studies from which we extracted data, some parts of the annual period are not considered in our dataset. Thus, this dataset focuses on small-scale, surface-based spatial patterns in CH4 fluxes associated with specific land cover classes and does not represent temporal patterns in fluxes. For both terrestrial and aquatic datasets, we extracted data on the average CH4 fluxes over warm periods or ice-free periods. While we do include an estimate of ice-out/winter fluxes from aquatic ecosystems, our dataset does not include autumnal turnover fluxes from aquatic ecosystems, which may represent a substantial portion of annual emissions (Fernández et al. 2014; Klaus et al. 2018). Nor do we include shoulder season or winter fluxes from terrestrial ecosystems, which can represent substantial components of the annual flux (Treat et al. 2018; Zona et al. 2016). Furthermore, our data extraction methods were not designed to assess inter-annual changes in fluxes, as this dataset compiles the data of multiple studies over a large range of years. Despite data limitations, the datasets represent an important step forward regarding the spatial variability in fluxes among different land cover types.

Summary Statistics
In total, we extracted 555 site-year CH4 estimates from terrestrial (wetland and non-wetland) ecosystems. The majority of reported fluxes (site-years) were from Canada and Greenland (34%), followed by Russia (27%), Alaska (25%), and Scandinavia (14%) (Fig. 1a). Terrestrial fluxes followed a bimodal distribution, split by net positive fluxes (82% of all reported fluxes) and net uptake or zero-emission (18% of all reported fluxes; Fig. 5a). The median number of measurement days per site-year flux for chamber measurements was 10 and the median number of collars per site measurement was five (Fig. 6a).
Of the site-year fluxes reported from aquatic ecosystems, there were 441 diffusive and 175 ebullitive ice-free season estimates, and 125 estimates of winter/ice-out fluxes (including storage, winter ebullition, ice bubble surveys, or a combination of the three). Aquatic sites were distributed throughout the Boreal-Arctic region, with a greater density of sites in Alaska and eastern Canada (Fig. 1b). Diffusive fluxes showed a unimodal distribution, while ebullition showed bimodal peaks near 100 and 0 mg CH4 m⁻² d⁻¹ (Fig. 5b, 5c). The median number of measurement days per site-year flux was three and 15 for diffusion and ebullition, respectively (Fig. 6b, 6c). Winter/ice-out fluxes were reported as annual estimates and are shown in Table 5.
The best explanatory model for terrestrial CH4 emissions was an additive model that included site-level predictors of water table, soil temperature, and graminoid cover alongside the broader classification of land cover class (R²m = 0.69; P < 0.0001, df = 224; SI Table 2). Using interactive effects had no effect on model performance (ΔAICc = 0.84), although the R²m did increase to 0.73 (SI Table 2). Notably, on their own, individual models with just the site-level predictors or with just land cover type explained close to the same amount of variation in CH4 fluxes (R²m = 0.55 and 0.54, respectively). Methane uptake fluxes, when analyzed separately, were positively correlated with thaw depth (i.e. more uptake with greater thaw depths; R²m = 0.55, χ² = 19.61, P < 0.0001, df = 22; SI Fig. 1). No other continuous variables were correlated with CH4 uptake; however, sites where shrubs were present had significantly higher uptake than sites where shrubs were absent or dominant (Tukey post-hoc, P < 0.001 for both, df = 2; SI Fig. 2).

There were significant differences in fluxes among the BAWLD terrestrial land cover classes (χ² = 253.69, P < 0.001, df = 6; Fig. 9a, Table 3). Median fluxes were highest from Marshes, Tundra Wetlands, and Fens (mean water table = +2, -0.4, and -6 cm, respectively). Median fluxes from Bogs were lower than those from Marshes, Tundra Wetlands, and Fens, but higher than those from Permafrost Bogs, Dry Tundra, and Boreal Forests. Permafrost Bogs were the only wetland class that fell into the lowest emitting group of classes. The frozen and elevated nature of Permafrost Bogs, however, typically leads to lower water table conditions more similar to Dry Tundra and Boreal Forests (mean water table = -22, -15, and -40 cm, respectively). It must be noted that in most Boreal Forest sites the water table is not in the top two meters; therefore, water table is not commonly measured or reported.
The mean water table depth presented here is likely an overestimate that represents wetter Boreal Forest sites that had measurable water tables in the top two meters. Boreal Forest ecosystems were the only class to have a negative median CH4 flux for the entire class (net uptake). Permafrost Bogs and Dry Tundra classes also included net uptake site-year CH4 estimates (n = 17 and 31, respectively). One Tundra Wetland site in the Canadian High Arctic had net CH4 uptake for one of the three years it was measured (Emmerton et al. 2014). Notably, the apparent temperature sensitivity from the drier terrestrial sites (Boreal Forest, Dry Tundra, and Permafrost Bogs; Q10 = 3.7) was higher than from the wet terrestrial sites (Marshes, Tundra Wetlands, Bogs, and Fens; Q10 = 2.8).

Table 3. Characteristics of BAWLD terrestrial classes based on environmental variables. The number of sites (site years) and contributing studies are shown for each class. Also shown are the mean, median, and quartiles for site average CH4 flux, water table, soil temperature between 5 and 25 cm (TSoilB), sedge cover, pH, ecosystem respiration (ER), and gross primary productivity (GPP). *In some cases one study contributed flux data for multiple classes.
Ebullitive CH4 fluxes from aquatic ecosystems were positively correlated with logged DOC (F = 12.25, P = 0.0008, adj. R² = 0.14, df = 71; Fig. 10d), negatively correlated with surface area (F = 13.88, P = 0.0003, adj. R² = 0.08, df = 164; Fig. 10a) and latitude (F = 5.38, P = 0.02, adj. R² = 0.03, df = 160; Fig. 10c), and weakly correlated with water temperature (F = 5.55, P = 0.02, adj. R² = 0.06, df = 67; Fig. 10e). The apparent Q10 for ebullitive emissions was 2.4 (SI Table 1). There was no apparent relationship between lake depth and ebullitive fluxes (F = 0.02, P = 0.91, df = 151; Fig. 10b). There were no differences in ebullitive emissions between the permafrost zones, with the exception of lower ebullitive emissions from the continuous zone compared to the sporadic zone (Tukey's HSD, P < 0.001; Fig. 11). Similar to diffusive fluxes, ebullitive fluxes were higher from the small lake classes compared to midsize lakes (Wilcoxon Rank Sum, P = 0.0006; note that Large Lakes did not have a large enough sample size (n = 1) to be included in the post-hoc analysis). Finally, ebullitive fluxes were similarly higher from Peatland and Yedoma Lakes compared to Glacial Lakes (Tukey's HSD, P = 0.006 and 0.001, respectively). The best explanatory model for ebullitive fluxes using a subset of the data with complete information for predictor variables of interest (i.e. SA, log.CH4.E.FLUX.plus1, SITE, CLASS, SIZE, DOC, TYPE, LAT, GRID_T) included just waterbody surface area (continuous) as a predictor variable (F = 19.85, P = 0.0001, adj. R² = 0.21, df = 68).

Figure 10. Relationships between site-averaged ice-free diffusive and ebullitive CH4 fluxes (note the log scale) and environmental variables. Environmental variables include surface area, waterbody depth, latitude, dissolved organic carbon (DOC) concentration, water temperature, and pH. Regression lines and R-square values are shown for significant relationships.
Log diffusive CH4 flux was linearly related to surface area, depth, latitude, water temperature, and DOC. Log ebullitive fluxes were linearly related to surface area, latitude, DOC, and water temperature. * P < 0.05. ** P < 0.01. *** P < 0.001.

Figure 11. Ice-free season diffusion (left) and ebullitive (right) CH4 fluxes as described by categorical variables. Orange circles represent mean flux values. The number of sites for each category is represented in the column to the right (n) in the representative colors for diffusion (light blue) and ebullition (dark blue). The letters (Sig) indicate statistical differences among the categories, wherein bars with the same letters are not significantly different. Lake Size represents binned surface areas for < 0.1 km² (Small), 0.1-10 km² (Midsize), and > 10 km² (Large). Lake Type represents the BAWLD classification of waterbody types including Peatland, Yedoma, and Glacial lakes. Fluxes higher than 380 are not shown.
There were clear differences in diffusive CH4 fluxes among the aquatic class types, but few differences were observed for ebullitive fluxes. Diffusive fluxes were higher from the Peatland and Yedoma Lake classes (both small and midsize), associated with organic-rich sediments, compared to mineral-rich glacial and large lakes (χ² = 119.8, P < 0.001, df = 6; Fig. 9b; Table 4). While ebullition fluxes appear to follow a similar trend to diffusive fluxes, the only significant difference was between Small Yedoma Lakes and Midsize Glacial Lakes (Tukey's HSD, P < 0.001; Fig. 9b). However, the lack of statistical differences found for ebullition between lake classes may in part be due to fewer and more variable ebullition measurements compared to diffusion (Table 4). Reported winter ice-out emission estimates (including storage flux and Ice Bubble Survey (IBS) flux) were scarce in comparison to reported ice-free season emissions. Small Glacial Lakes and Midsize Glacial Lakes had the most reported winter ice-out emission estimates (n = 20 and 31, respectively). Average winter emissions (storage flux + IBS) generally were lower than annual estimates of ice-free diffusive and ebullitive emissions (Table 5); however, statistical tests were not performed across all of the classes due to low sample sizes from some of the classes. Winter ebullition estimates (i.e. direct ebullition emission to the atmosphere from seeps during the ice-cover winter season) were not included in winter emission sums because of the non-uniform spatial nature of these emission types (Sepulveda-Jauregui et al. 2015; Wik et al. 2016a), but are shown in Table 5. In the future, more estimates of winter emissions from aquatic systems are needed to more accurately estimate total annual emissions.

Table 4. Characteristics of the BAWLD aquatic classes based on CH4 and environmental variables. The number of sites and contributing studies are shown for each class and flux pathway.
Also shown are the mean, median, and quartiles for site average diffusive and ebullitive CH4 flux, waterbody surface area, waterbody depth, and dissolved organic carbon concentrations (DOC). *In some cases one study contributed flux data for multiple classes and pathway types. One ebullition outlier point (flux = 1815 mg CH4 m⁻² d⁻¹) was excluded from the Midsize Glacial class as it was influenced by beaver activity (Sepulveda-Jauregui et al. 2015).

Joint Analysis of Terrestrial and Aquatic Fluxes
We performed a joint analysis of fluxes from both the aquatic and terrestrial datasets with regional predictor variables (Class, MAAT, MAP, Permafrost Zone, and Biome) using mixed models to assess the potential for universal drivers across all Boreal-Arctic ecosystems. The best model included Class and MAAT (χ² = 345.6, P < 0.0001, R²m = 0.47, df = 18; SI Table 4). However, Class alone explained 44% of the variation in fluxes (compared to 47% in the best model; SI Table 4), suggesting that ecosystem classification based on CH4-emitting characteristics, alongside corresponding spatial extent, is one of the most important variables to consider when scaling CH4 fluxes across the Boreal-Arctic region.

Flux Variation Largely Explained by Land Cover Classes
In this review, we assessed the controls on CH4 emissions from 189 studies across terrestrial and aquatic ecosystems in the Boreal-Arctic region. A central component of this study was the inclusion of new land cover classes split by CH4-emitting characteristics common across terrestrial and aquatic ecosystems, respectively. Terrestrial classes were split by permafrost conditions and hydrology (and vegetation and nutrient conditions therein), whereas aquatic classes were split by size and lake genesis (i.e. type). We found that much of the observed CH4 flux variability from terrestrial and aquatic ecosystems could be explained by this land cover classification system (Fig. 9). When modeling fluxes for both aquatic and terrestrial ecosystems together with regional-level predictors (variables assigned to sites based on the gridded product, including Biome, Permafrost Zone, MAAT, and MAP), land cover class explained most of the variation (44%), with a significant but small contribution in explained variation from gridded MAAT (3% of 47% total variation explained; SI Table 2). This suggests that spatial differences in land cover classes are the most important consideration for estimating CH4 flux at this scale, with some influence of MAAT.
For terrestrial fluxes alone, land cover class as a predictor variable explained 55% of the flux variation. Site-level predictors, including water table, temperature, and vegetation conditions, explained 54% of the variation in the fluxes when analysed separately. The best model for terrestrial fluxes included these site-level variables and land cover class and explained 69-73% of the variation (depending on additive or interactive effects; SI Table 2). This model likely performed better than land cover class on its own because the extra information added from the continuous soil temperature and water table variables captured the variation in these conditions within each class. While permafrost presence came out as a non-significant term in our best model (SI Table 2), the effects of permafrost presence and absence, including confounding temperature effects, were already intertwined into the land cover classes.
For aquatic ecosystems, the best models for diffusive and ebullitive fluxes contained different predictor variables.
The best model for diffusive fluxes explained 41% of CH4 flux variability and included an interactive effect between surface area and lake type (Peatland, Yedoma, and Glacial) and water temperature. Land cover classes (i.e. lake types split by small and midsize categorical sizes) did not come out as significant in this model because the continuous variable of surface area captures the size variation within each lake type. However, land cover class modeled on its own explained 25% of the flux variation. The significant effect of surface area is consistent with previous global synthesis efforts that found small waterbodies tend to have higher CH4 fluxes, likely due to the compounding effects of higher substrate availability and warmer temperatures compared to larger waterbodies (Holgerson and Raymond, 2016; DelSontro et al. 2018). Notably, previous synthesis efforts also found that waterbody depth was a significant predictor variable of diffusive fluxes (Wik et al. 2016a). While depth did not come out as significant in our model, the effect of waterbody depth is taken into account with the lake types. For example, we found diffusive fluxes are typically higher from Peatland Lake types compared to Glacial Lakes, which have average depths of 1.6 meters and 6.7 meters, respectively. Waterbody depth is also an important factor contributing to waterbody temperature (i.e. warmer waters in shallower waterbodies); thus, the effect of waterbody depth may also be confounded with that of the temperature variable.

The best model for ebullition contained waterbody surface area as a predictor and explained 21% of the variation in the fluxes. Previous synthesis efforts have linked ebullition fluxes to both temperature (Aben et al. 2017) and waterbody depth (Wik et al. 2016a).
The weak or absent relation with temperature and depth here is not surprising, especially given the broad depth range of the lakes evaluated, and does not contradict the previously observed relationships, because it is likely that the influence of temperature and depth is clearer over time and space, respectively, in each specific system. In this dataset, such patterns may be masked by differences in measurement strategies (i.e. number of measurements per season or measurement distributions over the lake) or in overall system characteristics. There are a few potential explanations as to why we did not find similar relationships between ebullition and temperature or waterbody depth.
First, Aben et al. (2017) include global data that encompass sites across broad temperature ranges from the north to the tropics.
It is possible that the range of temperatures represented by our dataset is not wide enough to capture this relationship. It is also possible that the summary data collected, including average temperature and average flux over the ice-free season, are too coarse to show a relationship. It is likely that the influence of temperature and depth is clearer over time and space in each respective waterbody and that a higher resolution of data would show these relationships. Regarding waterbody depth, it is also possible that in the absence of detailed surveys, estimated mean and max depths may be less reliable. The effects of depth may also be confounded with surface area as the two metrics are highly correlated (SI Fig. 5). While this dataset represents one of the largest collections of ebullitive emissions from northern lakes so far, this emission pathway is still largely under-represented, and waterbody depth and temperature are not always reported with the flux estimates. Furthermore, we collected information on surface water temperature for this dataset because it was the most widely available temperature metric. Sediment temperature is a better metric to collect alongside ebullition because CH4 is produced in and transported directly from the sediments (Aben et al. 2017; Wik et al. 2013). Future studies should work to report sediment temperature and water column temperature alongside their flux measurements.

Directions for Future Research
While our small-scale, surface CH4 flux datasets for northern ecosystems are the most extensive datasets compiled to date for the Boreal-Arctic region, we identified key gaps in the data and areas of improvement that future studies should focus on. While the geographical gaps represented in Figure 1a suggest widespread geographic under-representation of terrestrial ecosystems, especially across central Russia and the Canadian Territories of Nunavut and Northwest Territories, these regions are composed primarily of Boreal Forest and Dry Tundra ecosystems, respectively (Fig. 12e, 12g). Study sites for many of the other land cover types, for example Bogs and Fens, were relatively well distributed across the Boreal and Arctic region (Fig. 12a, 12b). However, to assess how well or poorly represented a land cover class is, class area and flux magnitude must also be considered (Fig. 13a). For example, Fens are a high-emitting land cover class and are spatially abundant, leading to a high total flux contribution across the study region (~41%, Fig. 13a); however, the relative number of Fen sites represented in the available literature is not proportional to the total flux contribution (~26%). This, alongside the large spread of reported flux magnitudes (Fig. 9a), suggests future studies should focus on Fens to better constrain the flux magnitude. Conversely, Permafrost Bogs are low contributors to the total wetland flux (~4%) and sites are well represented throughout the literature (~19%), suggesting fewer direct flux measurements are needed from these ecosystems.

(Fig. 14a, 14d), despite this region containing the most lakes per unit area throughout the north (Messager et al. 2016). Notably, Large Lakes are the least represented of all of the aquatic classes (~2.4% of measurements), but could potentially contribute ~17% of the total flux, mostly from diffusive emissions.
Interestingly, while Small Peatland Lakes are well represented (~42% of measurements and 37% of potential total flux contribution), Midsize Peatland Lakes are under-represented (~5% of measurements) compared to their estimated flux contribution (~28%). Thus, Large Lakes and Midsize Peatland Lakes may be important focal points for future research; however, more empirical scaling-based uncertainty analyses should be explored.

There are fewer ebullition measurements compared to diffusive flux measurements from aquatic ecosystems (21% and 79% of ice-free fluxes, respectively). Average ebullitive fluxes were greater than diffusive estimates for all of the land cover classes except Large Lakes (Fig. 7b), and thus represent an important component of total CH4 fluxes from these systems; however, none of the models performed in this study could explain a large amount of the variation in ebullitive fluxes. More ebullition measurements, across all the land cover classes, will help to constrain our understanding of CH4 transport mechanisms and drivers. However, it is important to note that more representative ice-free season flux estimates are needed for both ebullition and diffusion. Wik et al. (2016b) suggest that ~11 days of diffusive flux measurements and ~39 days of ebullition flux measurements are required to calculate a mean ice-free flux estimate within 20% of the true value. In this dataset, 86% of diffusive estimates were under the recommended 11-day mark and 58% of ebullition estimates were below the recommended 39-day mark (Fig. 5b, 5c). Jansen et al. (2020) posit that an even higher frequency of sampling is required (14-22 days and 135 days for diffusion and ebullition, respectively). Further, Wik et al. (2016b) recommend that, in addition to the number of sampling days, measurements should be distributed spatially across the waterbody using a depth-stratified approach including ~3 and ~11 locations for diffusion and ebullition, respectively.
While we did not collect data on the number of sampling locations across each waterbody, it is likely that many of the average fluxes included in the dataset also represent spatially under-sampled measurements. Under-sampling potentially reduces the accuracy of mean CH4 flux estimates, contributing to the relatively poor fit and explanatory power of the aquatic regression analyses in this study compared to the terrestrial models. This is especially true for ebullitive emissions, which were poorly explained by the reported predictor variables available for this dataset. Calculating mean ice-free fluxes from a greater number of flux measurements is an important step towards better constraining CH4 fluxes from aquatic ecosystems. Finally, there are very few flux estimates for lakes over the shoulder seasons and winter/ice-out compared to the ice-free season (Table 5). While shoulder season flux estimates, including autumnal turnover, were not included in this dataset, winter/ice-out measurements make up only 7% of all aquatic flux measurements collected. Winter/ice-out emissions could potentially contribute a significant portion of annual fluxes from aquatic ecosystems (Karlsson et al. 2013; Sepulveda-Jauregui et al. 2014) and therefore represent an important gap in CH4 flux data.
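
The representation comparison used in this section (share of sites or measurements versus share of estimated total flux) can be expressed as a simple ratio. The following is a minimal sketch in Python, not the analysis code used for BAWLD-CH4. It uses only the approximate percentages quoted above; the names `SHARES`, `representation_ratio` and `rank_by_representation` are hypothetical. Note that the wetland classes are expressed relative to wetland totals and the lake classes relative to aquatic totals, so ratios are only loosely comparable between the two groups.

```python
# Approximate shares (%) quoted in this section:
# (share of sites or measurements, share of estimated total flux).
# Wetland classes refer to wetland totals, lake classes to aquatic totals.
SHARES = {
    "Fens": (26.0, 41.0),
    "Permafrost Bogs": (19.0, 4.0),
    "Large Lakes": (2.4, 17.0),
    "Small Peatland Lakes": (42.0, 37.0),
    "Midsize Peatland Lakes": (5.0, 28.0),
}


def representation_ratio(measured_share, flux_share):
    """Measurement share divided by flux share; values below 1 indicate under-representation."""
    return measured_share / flux_share


def rank_by_representation(shares):
    """Return (class, ratio) pairs sorted from most under- to most over-represented."""
    ratios = {name: representation_ratio(m, f) for name, (m, f) in shares.items()}
    return sorted(ratios.items(), key=lambda item: item[1])


ranking = rank_by_representation(SHARES)
for name, ratio in ranking:
    print(f"{name}: {ratio:.2f}")
```

With these inputs, Large Lakes and Midsize Peatland Lakes have the lowest ratios, followed by Fens, while Permafrost Bogs have the highest, which matches the qualitative priorities stated above.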

Conclusions
Methane fluxes from northern ecosystems represent an important component of the global CH4 cycle (Saunois et al. 2020). BAWLD-CH4 is a comprehensive flux dataset that uniquely represents flux data from both terrestrial and aquatic ecosystems across the Boreal-Arctic region. BAWLD-CH4 has many potential applications, including benchmarking process-based models, developing empirical scaling models and informing future research directions. Importantly, we show that land cover class, split by CH4-emitting ecosystem characteristics, is a significant predictor of flux across terrestrial and aquatic ecosystems, and we suggest that future studies should scale CH4 emissions based on CH4-emitting land cover characteristics.
We show that while land cover class explains most of the flux variation for wetland and aquatic ecosystems when analyzed jointly, MAAT explains a small but significant share (~3%) of the variation, which has important implications for future scaling efforts. Finally, we found that a higher percentage of terrestrial CH4 fluxes could be explained by land cover class and site-level variables than for diffusive and ebullitive fluxes from aquatic ecosystems (73% vs 41% and 21%, respectively). Under-sampling of aquatic ecosystems is likely responsible for the lower explained variation in our aquatic models compared to the terrestrial models.
Future studies should increase the number of sampling days for both diffusive and ebullitive fluxes from aquatic ecosystems to arrive at more representative ice-free flux estimates and total CH4 emissions from the Boreal-Arctic region.
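
As an illustration of how the sampling-day recommendation could be applied when screening flux estimates, the sketch below flags estimates that fall below the minimum number of sampling days suggested by Wik et al. (2016b): 11 days for diffusion and 39 days for ebullition. It is a minimal Python example under stated assumptions; the `FluxEstimate` record, the function names and the example records are hypothetical and are not taken from BAWLD-CH4.

```python
from dataclasses import dataclass

# Minimum sampling days per ice-free season suggested by Wik et al. (2016b)
# for a mean flux estimate within 20% of the true value.
MIN_SAMPLING_DAYS = {"diffusion": 11, "ebullition": 39}


@dataclass
class FluxEstimate:
    site: str           # hypothetical site label
    pathway: str        # "diffusion" or "ebullition"
    sampling_days: int  # number of days with flux measurements


def is_undersampled(estimate, thresholds=MIN_SAMPLING_DAYS):
    """True if the estimate rests on fewer sampling days than the threshold."""
    return estimate.sampling_days < thresholds[estimate.pathway]


def undersampled_fraction(estimates, pathway, thresholds=MIN_SAMPLING_DAYS):
    """Fraction of estimates for one pathway that fall below the threshold."""
    subset = [e for e in estimates if e.pathway == pathway]
    if not subset:
        return None
    return sum(is_undersampled(e, thresholds) for e in subset) / len(subset)


# Made-up records, for illustration only.
records = [
    FluxEstimate("lake_A", "diffusion", 4),
    FluxEstimate("lake_B", "diffusion", 15),
    FluxEstimate("lake_A", "ebullition", 20),
    FluxEstimate("lake_C", "ebullition", 45),
]

print(undersampled_fraction(records, "diffusion"))
print(undersampled_fraction(records, "ebullition"))
```

A stricter criterion, such as the lower bound of the ranges proposed by Jansen et al. (2020), could be tested by passing a different `thresholds` dictionary, for example `{"diffusion": 14, "ebullition": 135}`.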

Competing Interests
The authors declare that they have no conflict of interest.