River network and hydro-geomorphological parameters at 1/12° resolution for global hydrological and climate studies

Abstract. Global scale river routing models (RRMs) are commonly used in a variety of studies, including studies on the impact of climate change on extreme flows (floods and droughts), water resources monitoring or large scale flood forecasting. Over the last two decades, the increasing number of observational datasets, mainly from satellite missions, and the increasing computing capacities, have allowed better performances of RRMs, notably by increasing their spatial resolution. The spatial resolution of an RRM corresponds to the spatial resolution of its river network, which provides the flow direction of all grid cells. River networks may be derived at various spatial resolutions by upscaling high resolution hydrography data. This paper presents a new global scale river network at 1/12° derived from the MERIT-Hydro dataset. The river network is generated automatically using an adaptation of the Hierarchical Dominant River Tracing (DRT) algorithm, and its quality is assessed over the 70 largest basins of the world. Although this new river network may be used for a variety of hydrology-related studies, it is here provided with a set of hydro-geomorphological parameters at the same spatial resolution. These parameters are derived during the generation of the river network and are based on the same high resolution dataset, so that the consistency between the river network and the parameters is ensured. The set of parameters includes a description of river stretches (length, slope, width, roughness, bankfull depth), floodplains (roughness, sub-grid topography) and aquifers (transmissivity, porosity, sub-grid topography). The new river network and parameters are assessed by comparing the performances of two global scale simulations with the CTRIP model, one with the current spatial resolution (1/2°) and the other with the new spatial resolution (1/12°).
It is shown that CTRIP at 1/12° overall outperforms CTRIP at 1/2°, demonstrating the added value of the spatial resolution increase. The new river network and the consistent hydro-geomorphology parameters, freely available for download from Zenodo (https://doi.org/10.5281/zenodo.6475446), may be useful for the scientific community, especially for hydrology and hydro-geology modelling, water resources monitoring or climate studies.

1 Introduction
Global scale river routing models (RRMs) were primarily developed for climate studies. By simulating the flow routing processes through river networks, they allow climate models to close the water budget at the global scale. Since then, several applications have been developed based on RRMs, including studies on the impact of climate change on extreme flows (floods and droughts, see e.g., Hirabayashi et al. (2013); Yamazaki et al. (2018)), water resources monitoring (e.g., Makungu and Hughes, 2021) or large scale flood forecasting (e.g., GloFAS, Alfieri et al., 2013; Jafarzadegan et al., 2021).
Over the last two decades, the increasing number of observational datasets, mainly from satellite missions, and the increasing computing capacities, have allowed better performances of RRMs, either by improving the representation of some processes (e.g., Arora and Boer, 1999; Decharme et al., 2008; Getirana et al., 2021; Guimberteau et al., 2012; Schrapffer et al., 2020; Vergnes et al., 2014; Yamazaki et al., 2013), by integrating new ones, such as lake dynamics (Guinaldo et al., 2021; Tokuda et al., 2021) or dam operations (Dang et al., 2020; Zajac et al., 2017), or by increasing the spatial resolution. Several studies have demonstrated the benefit of increasing the spatial resolution in macroscale RRMs. For example, Mateo et al. (2017) showed that the river connectivity is better described at high spatial resolution, which improves the representation of the river flow dynamics within the river network. Nguyen-Quang et al. (2018) concluded that high streamflow simulation performances require a precise river catchment description, along with accurate forcing data (namely precipitation).
The river network, which mainly provides the flow direction of each cell, is the main component of an RRM. Higher spatial resolution makes it possible to represent narrower rivers and to better locate confluences, with potential positive impacts on streamflow simulations. The river networks of most RRMs are either grid-based or vector-based. The two approaches differ in their definition of unit-catchments. In grid-based approaches, the river network is discretized on a regular Cartesian grid, so that unit-catchments are rectangular pixels. On the other hand, vector-based river networks are based on irregular shapes of unit-catchments extracted from high resolution hydrography data. For instance, TRIP (Oki and Sud, 1998), CTRIP (Decharme et al., 2019) and LISFLOOD (Van Der Knijff et al., 2010) follow a grid-based approach, while CaMa-Flood, MGB-IPH (Collischonn et al., 2007), VIC (We et al., 2014) and RAPID (Lin et al., 2018) follow a vector-based approach.
For grid-based models, the spatial resolution is defined by the size of the grid pixels, while for vector-based models, the spatial resolution relies on a threshold catchment area. For both approaches, the river network is generally derived from the upscaling of high resolution hydrography data. The HydroSHEDS dataset (Lehner et al., 2008) has been the basis for many upscaled river networks used in RRMs. Recently, Yamazaki et al. (2019) released a new hydrography dataset, MERIT-Hydro, based on the Multi-Error-Removed Improved-Terrain DEM (MERIT DEM, Yamazaki et al., 2017) dataset. MERIT-Hydro has been used in a large number of recent studies (see, e.g., Lin et al., 2019; Shin et al., 2020; Eilander et al., 2021; Getirana et al., 2021), demonstrating its overall high quality for use in RRMs, among other purposes.
Generally, grid-based approaches follow a D8 convention, meaning that each grid cell may flow into one of the eight neighbouring grid cells. Vector-based approaches are more flexible and may follow a D∞ convention, for which the water in a unit-catchment may flow into any other unit-catchment (not necessarily a neighbouring one). This is particularly convenient when two rivers flow through the same grid cell without being connected. The vector-based approach allows a better representation of sub-basins than the grid-based approach does, leading to improved modelling performances. Yet, it is expected that the difference between both approaches should decrease when the spatial resolution increases. Moreover, grid-based RRMs are more easily coupled to Land Surface Models, which generally also follow a grid-based approach. Under these considerations, it seems worthwhile to continue developing high spatial resolution grid-based river networks.
Besides, along with the river network at the appropriate spatial resolution, RRMs also require parameters that are consistent with the river network. Some parameters depend on the river network itself, such as length and slope of river stretches, and vary with the spatial resolution. Other parameters, including roughness coefficient, river width or bankfull depth, may be calibrated or estimated using empirical relationships. In the latter case, these parameters may also indirectly depend on the spatial resolution. Finally, several RRMs use sub-grid approximations to represent fine scale processes. For instance, some flooding schemes (as, e.g., in Decharme et al., 2008; Yamazaki et al., 2011; Decharme et al., 2012) rely on Cumulative Distribution Functions (CDFs) of flood volume and depth within each grid cell. Vergnes and Decharme (2012) also used CDFs for a sub-grid representation of groundwater dynamics. Such CDFs are computed from high resolution topography data and also directly depend on the spatial resolution of the RRM.
Although some recent studies provide new upscaled river networks based on MERIT-Hydro (see, e.g., Eilander et al., 2021), only a limited set of hydro-geomorphology parameters consistent with the new river network has been derived (such as subgrid river length and slope). In this study, we propose to apply the Hierarchical Dominant River Tracing (DRT, Wu et al., 2011) algorithm on MERIT-Hydro to derive a new global scale river network at 1/12° (5 arcmin) along with a set of consistent hydro-geomorphological parameters. The choice of 1/12° as the spatial resolution for river routing modelling is a compromise between a fine scale representation of river dynamics and computing efficiency. It is also well suited for global to regional scale studies. New features presented in this paper include:
- river network at 1/12°
- river geomorphology (length, slope, depth, roughness)
- floodplain roughness and sub-grid topography
- aquifer characteristics and sub-grid topography
A direct quantitative assessment is not possible since there is, to our knowledge, no equivalent existing dataset at the same spatial resolution. As a consequence, to evaluate the new river network and the corresponding hydro-geomorphological parameters, we propose to use the CTRIP model (Decharme et al., 2019) and to compare results of two global scale simulations: the first one at the current CTRIP spatial resolution (1/2°) and the second one at the new spatial resolution (1/12°). The CTRIP model has been chosen because of its efficiency, robustness and overall performances (see, e.g., Schellekens et al., 2017; Decharme et al., 2019). In addition, it has been used in many global hydrological applications, some of which highlighted important results regarding global land hydrology (Douville et al. (2013); Cazenave et al. (2014); Padrón et al. (2020)). In any case, the river network and parameters provided in this study could benefit other similar large scale river routing models, or any other study requiring all or part of this dataset at a similar fine spatial resolution.
The main purpose of this paper is to present the global river network at 1/12° and the corresponding consistent hydro-geomorphological parameters. This dataset is mainly designed for global or regional scale grid-based RRMs, although it could be used in a variety of hydrology-related studies that need flow direction at a medium spatial resolution (e.g., Catalán et al., 2016; Robinne et al., 2018; Scherer et al., 2018; Wan et al., 2015; Zhou et al., 2015). A majority of large-scale RRMs use a gridded structure for global hydrological studies (see the technical review of Kauffeldt et al., 2016) and most of them are still running at a coarse spatial resolution. With the entire dataset described here (flow direction, river length, river slope, river bankfull depth, river roughness, floodplain roughness, major groundwater basin boundaries, aquifer transmissivity, and aquifer effective porosity), many hydrological models could improve their river routing module by increasing the spatial resolution. Moreover, this consistent and comprehensive dataset can help modellers to integrate some important processes (such as inundation and groundwater) that are still neglected in some models.
The paper is organized as follows. The derivation of the river network at 1/12° is described in section 2, which also provides a quality assessment. Section 3 describes how hydro-geomorphological parameters are derived, while section 4 presents the results of CTRIP simulations at 1/2° and 1/12°, with comparisons with a large dataset of gauged river discharges.
2 River network at 1/12° resolution
This section describes the methodology to derive the river network at 1/12° resolution at the global scale.

Background
River network datasets consist of flow direction maps generally derived from Digital Elevation Models (DEMs) corrected for hydrology. With the increasing amount of satellite observations during the last decades, several methods have been proposed to derive river networks at various spatial resolutions using upscaling algorithms (for the D8 method, see, e.g., Döll and Lehner, 2002; Reed, 2003; Shaw et al., 2005; Paz et al., 2006; Davies and Bell, 2009; Wu et al., 2011). All of them are based on a high resolution DEM and apply different upscaling strategies. Among them, the Hierarchical Dominant River Tracing (DRT, Wu et al., 2011) presents interesting features for deriving D8 river networks. First, it has been applied at different final resolutions (from 1/16° to 2°), showing its flexibility. Second, it is a fully automated algorithm and does not require any manual correction.

Finally, it is designed to preserve the river network structure by processing major rivers first and applying river diversion when necessary. DRT has been applied by Wu et al. (2011, 2012) using the high resolution hydrography network from HydroSHEDS (Lehner et al., 2008) and HYDRO1K (U.S. Geological Survey, 2000).
Recently, the Multi-Error-Removed Improved Terrain DEM (MERIT-DEM) has been proposed by Yamazaki et al. (2017).
MERIT-DEM relies on the SRTM3 DEM (Farr et al., 2007) and the AW3D-30 m DEM (Tadono et al., 2015) and integrates a variety of filters and ancillary datasets to remove major height error components. MERIT-DEM has been used to derive a high resolution (3 arcsec, about 90 m at the equator) global flow direction map, MERIT-Hydro, with good agreement with existing hydrography datasets such as HydroSHEDS (in terms of flow accumulation, river basin shape and location of river streamlines) and even significant improvements in some regions. Although MERIT-DEM and MERIT-Hydro are quite recent, they have been used in a large number of recent studies (see, e.g., Lin et al., 2019; Moudrý et al., 2018; Shin et al., 2019, 2020; Wing et al., 2020), generally showing the added value of these datasets.
Here we applied the DRT algorithm using MERIT-Hydro as the basis for the high resolution hydrography network, in order to benefit from the most recent available dataset.
The following notations and definitions are used throughout the article:
- HR for the high resolution (1/1200°) of MERIT
- 12D for 1/12° resolution
- HD for half-degree resolution
- pixel: unit element at HR
- cell: unit element at 12D

Figure 1. Example of estuary opening: the red mask is the HR land mask, the blue mask is the 12D land mask, and the green mask represents the 12D cells converted from land to ocean to connect the river basin delineated in red to the ocean.

The first step in the generation of the river network is the setup of a land mask at the final resolution (1/12°, hereafter denoted 12D). The land mask is used to ensure that no flow direction is given to cells in the ocean, which can happen during the diversion step (see below). The 12D mask relies on the HR mask from MERIT-Hydro. Cells are considered as land if at least 50 % of HR pixels within the cell are land pixels. Particular attention has been paid to estuaries and their effective connection to oceans and seas. For example, it may happen that a large river flows into a narrow estuary. In this case, the cell corresponding to the river outlet may be disconnected from the coast by cells considered as land (more than 50 % of HR land pixels). To ensure an effective connection to the coast, closed seas (water cells surrounded by land cells) counting fewer than 20 cells are first converted to land. Then the HR land mask is used to find the shortest path within the estuary from the river outlet to cells marked as ocean, and the cells covering this path are forced to ocean. In this process, only rivers with a flow accumulation greater than 10,000 pixels (HR) are considered. An example of an estuary in Ireland is presented in Figure 1.
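The 50 % land rule described above can be illustrated with a short sketch. It is only a minimal example under stated assumptions: the function name `upscale_land_mask` and its arguments are hypothetical, the HR mask is assumed to be a boolean array whose shape is an exact multiple of the aggregation factor (100 HR pixels per 12D cell side for 1/1200° to 1/12°), and the estuary opening and closed-sea steps are not covered.

```python
import numpy as np

def upscale_land_mask(hr_land, factor=100, threshold=0.5):
    """Aggregate a boolean HR land mask into coarse cells (hypothetical helper).

    A coarse cell is flagged as land when at least `threshold` of the
    HR pixels it contains are land pixels.
    """
    hr_land = np.asarray(hr_land, dtype=bool)
    ny, nx = hr_land.shape
    if ny % factor or nx % factor:
        raise ValueError("HR mask shape must be a multiple of the factor")
    # Group pixels into (factor x factor) blocks, one block per coarse cell.
    blocks = hr_land.reshape(ny // factor, factor, nx // factor, factor)
    land_fraction = blocks.mean(axis=(1, 3))
    return land_fraction >= threshold
```

With `factor=2`, a block containing two land pixels out of four is classified as land, since the rule is "at least 50 %".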
Using the land mask as a basis, the DRT algorithm is applied to upscale the MERIT-Hydro river network from 3 arcsec to 1/12° resolution. Details of the DRT algorithm may be found in Wu et al. (2011, 2012). The main steps are recalled below.
1. Rivers are first sorted by decreasing flow accumulation at the outlet. Rivers are treated in this hierarchical order to ensure the best possible representation of large rivers. The following steps are applied for each river.
2. Given the river outlet, the HR river route is defined as the longest upstream river. The headwater cell is given by the first cell with a flow accumulation larger than a given threshold (10 % of cell size, i.e. 1000 HR pixels).
5. If a downstream cell is already assigned (e.g., by a larger river), the river is diverted: a parallel route is created, as close to the real river as possible, until the HR route reaches an unassigned cell or an outlet cell.
6. If the outlet is reached, the presence of loops is checked and corrected if necessary, and the next largest river is treated (steps 2-6).
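The loop check mentioned in the last step is not detailed in the text. The sketch below shows one possible way to detect loops, not the procedure used in DRT. It assumes a hypothetical representation in which the network is stored as a list `downstream`, where `downstream[i]` is the index of the cell receiving the flow of cell `i`, and `-1` marks an outlet.

```python
def find_loops(downstream):
    """Return the loops of a flow network as lists of cell indices.

    `downstream[i]` is the receiving cell of cell i, or -1 at an outlet
    (hypothetical encoding chosen for this illustration).
    """
    n = len(downstream)
    state = [0] * n  # 0 = unvisited, 1 = on the current path, 2 = done
    loops = []
    for start in range(n):
        if state[start]:
            continue
        path = []
        cell = start
        # Follow the flow until an outlet or an already visited cell.
        while cell != -1 and state[cell] == 0:
            state[cell] = 1
            path.append(cell)
            cell = downstream[cell]
        if cell != -1 and state[cell] == 1:
            # The walk came back to a cell of the current path: a loop.
            loops.append(path[path.index(cell):])
        for visited in path:
            state[visited] = 2
    return loops
```

Each cell is visited once, so the check is linear in the number of cells. How a detected loop is corrected is left open here.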
River diversion (step 5) is an important feature of the algorithm as it preserves the structure of the network. However, it may also raise problems by changing the location of rivers (e.g., relative to gauge stations). To overcome this issue, it may be useful to keep track of the relationship between HR and 12D rivers, which is done here by identifying each processed river with a unique id in both the HR and 12D networks. Note that while river diversion is necessary with a D8 convention, it can be avoided with a D∞ convention. Also, river diversion may have an impact on the attribution of input fluxes such as runoff used to force the routing model. Yet, we estimate that this impact can be neglected as runoff is generally a quite smooth field at this resolution when modelled by a Land Surface Model (LSM).
An example of diversion in the Loire river basin (France) is shown in Fig. 3. The Loire river (dark blue) is the major river of the Loire basin and is treated first. Second is the Vienne river (light blue), followed by the Cher river (green), the Creuse river (orange) and the Indre river (red). The M3-to-H3 cells are occupied by the Loire river at 12D, so that the Cher river portion within these cells has to be diverted to the M4-to-H4 cells. Similarly, the cells within L4-to-E4 are occupied by the Cher river and the Loire river at 12D so that the Indre river is diverted (L5-to-E5). River diversion preserves the river network structure as much as possible, even when several rivers flow within the same cell. Without diversion (e.g., river merging at cell M3), both gauge stations (green squares) in the J3 cell would be associated with the Loire river, while one of them is located on the Cher river. River diversion preserves the location of gauges as well as river nodes (confluences) within the river network tree.
Hypsometry (elevation with respect to the longitudinal distance along the river) is computed during the process so that at the end of the process values of river length, slope and elevation are assigned to each cell. Fig. 4 shows the hypsometry curves of the rivers shown in Fig. 2. Hypsometry is interpolated in case of diversion. In addition, a unique identifier has been assigned to each upscaled river and its corresponding river in the original HR network (as shown by colors in Figs. 2 and 3). This identifier and the hypsometry are used hereafter to derive the hydro-geomorphology parameters.
The final river network at 12D at the global scale is represented in Fig. 5, and zooms over selected regions are also provided.
The type of river network required by most river routing models (especially those working with the D8 convention) has to provide a flow direction for each cell of the model. This ensures the closure of the global scale water budget in Earth System Models. The type of soil (nature, river, lake, cities etc.) and other characteristics (such as climate zone) are therefore not considered when setting up the global scale river network. As a consequence, flow directions are also derived over arid regions where no river exists or within cells where no headwater stream has developed. In that sense, the river network should be considered as a drainage network.

Quality assessment
The DRT algorithm has been designed to preserve the river network structure as much as possible. The hierarchical river selection and river diversion have been set up for that purpose. The quality of the resulting river network has been assessed by Wu [...] Their total area equals 65 × 10^6 km², which represents half of the total land area (excluding Antarctica and Greenland).

The qualitative assessment consists of visual comparisons of river networks from different sources, including the original MERIT-Hydro, the previous version of the CTRIP river network (CTRIP-HD) and Google Earth images. The shape of the basin boundaries has also been compared with those from CTRIP-HD, the original DRT network at 12D and the GRDC database (Lehner, 2012). For the latter, basin boundaries have been derived from the HydroSHEDS dataset at gauging stations within the largest 405 basins of the world. Basin boundary delineation has been carefully checked and is considered as high quality (Lehner, 2012).
The quantitative assessment first relies on the relative difference between the basin area from the newly developed 12D river network and from other datasets, including the original DRT, MERIT-Hydro and GRDC. In addition, to assess the basin shape and coverage, an Intersection-over-Union (IoU) index is computed. The IoU index is applied to the basin masks computed from the new 12D network and the original DRT network. It equals 0 in the perfect case where masks exactly overlap, and reaches 1 when the two masks do not intersect. Details of the statistics are gathered in Table S1 in the supplementary material.
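The expression of the index is not reproduced in this text. A form consistent with the stated bounds (0 for identical masks, 1 for disjoint masks) is one minus the ratio of the intersection to the union of the two masks. The sketch below assumes this form; the function name `iou_index` and the boolean-array inputs are hypothetical, and cells are counted with equal weight rather than weighted by their actual area.

```python
import numpy as np

def iou_index(mask_a, mask_b):
    """Assumed form of the index: 1 - |A ∩ B| / |A ∪ B| on boolean masks."""
    a = np.asarray(mask_a, dtype=bool)
    b = np.asarray(mask_b, dtype=bool)
    union = np.logical_or(a, b).sum()
    if union == 0:
        raise ValueError("both masks are empty")
    intersection = np.logical_and(a, b).sum()
    return 1.0 - intersection / union
```

On a regular latitude-longitude grid, weighting each cell by its surface area would be a natural refinement, since cell area varies with latitude.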
The developer of MERIT-Hydro chose the Nelson River to be the major outlet of the South Indian Lake, considering the existing diversion project. We decided to disconnect this outlet, preferring to preserve the natural river network. Fig. 7 zooms over the region surrounding the South Indian Lake. Yellow circles denote cells where the flow direction has been inverted to reconnect the lake to the Churchill River. Another noticeable difference can be seen in the Amur River basin (Asia), in which the Kherlen River appears disconnected from the Argun River, a tributary of the Amur River, while both are connected at Lake Hulun in the GRDC database. Lake Hulun is usually an inland lake without outlet, but in wet periods it may overflow and then join the Argun River (Brutsaert and Sugita, 2008). As for the South Indian Lake, the developer of MERIT-Hydro preferred to keep them separated, which is reflected in the 12D river network (Fig. S2).
Basin boundaries from the new algorithm, from DRT and from GRDC are drawn in green, magenta and orange, respectively. The overlapping blue mask represents arid regions. The IoU for this basin equals 14 %, and decreases to 8 % when removing arid regions.

Most of the differences with GRDC and DRT come from arid conditions characterizing parts of some basins (with a red background in Table S1). In such regions, the terrain is generally quite flat and often disconnected from the river network (endorheic). It is thus quite difficult to extract river networks, which explains the differences between the datasets (as for example within the basins of the Yellow River, Tigris-Euphrates, Senegal, Xi and Rufiji). Nevertheless, the small amount of precipitation that can fall in such regions is partly infiltrated and mostly evaporated. This volume of water never reaches the river network, so differences between river networks over arid regions can be neglected. This can be accounted for in the IoU index by removing arid regions from basin masks, arid regions being defined as regions where the mean annual runoff is below a threshold fixed to 1 mm/yr. Fig. 8 shows that the new 12D river network differs from GRDC in the southern part of the Tigris-Euphrates river system. Note that DRT is quite similar to GRDC in terms of basin delineation. This major difference can be neglected since it is within the arid region of the Arabian Peninsula. In most cases, the IoU significantly decreases (down to less than 10 %) when removing arid regions from the masks for basins showing large differences due to arid regions.
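Removing arid regions before computing the index can be sketched as follows. This is an illustration only: the function name is hypothetical, the index is assumed to take the form one minus intersection over union (consistent with the bounds given in the text), and the runoff field is assumed to be on the same grid as the masks, in mm/yr.

```python
import numpy as np

def iou_index_without_arid(mask_a, mask_b, mean_annual_runoff, threshold=1.0):
    """Index restricted to non-arid cells (runoff >= threshold, in mm/yr)."""
    non_arid = np.asarray(mean_annual_runoff, dtype=float) >= threshold
    # Copies are created here, so the input masks are left unchanged.
    a = np.asarray(mask_a, dtype=bool) & non_arid
    b = np.asarray(mask_b, dtype=bool) & non_arid
    union = np.logical_or(a, b).sum()
    if union == 0:
        raise ValueError("no non-arid cell in either mask")
    return 1.0 - np.logical_and(a, b).sum() / union
```

When two delineations disagree mostly over arid cells, those cells drop out of both the intersection and the union, which lowers the index.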

Finally, the upscaling algorithm produced a reliable and consistent global river network at 12D, very close to the GRDC database in terms of basin delineation for the 69 largest basins of the world. Since MERIT-DEM improved the HydroSHEDS high resolution river network, it is expected that the newly developed network improves on the original DRT network.

Derivation of hydro-geomorphology parameters
Large scale river routing models make use of a river network (flow direction) to propagate runoff within river basins to the oceans (in case of exorheic basins). But the propagation dynamics also depends on geomorphological characteristics. These include river geometry (length, slope, width) and roughness (friction coefficient). For models that simulate floodplains, the topography is also needed (generally given as relationships between the surface elevation, the area of the floodplain and the volume of water), as well as the roughness in the floodplains. Similarly, simulating the dynamics of groundwater and the ex-

River parametrization
A set of parameters related to rivers and describing the flow dynamics within the river network is derived in this subsection.
For each cell, the river slope and bed elevation parameters are directly derived from the MERIT-Hydro adjusted elevation during the upscaling of the river network, by considering the river reach at HR associated with each 12D cell. It should be noted that applying DRT is quite similar to the vector-grid-hybrid method for the extraction of the ground elevation of each cell, except that the representative area of each pixel is not computed from the HR sub-catchment but directly from the HR river stretch within the cell. Nevertheless, given the type of model the current dataset is developed for (simplified global scale routing model), and given the uncertainties at this resolution in runoff fields used to force the model, we assume that the difference in area is negligible, at least for catchments covering several grid cells (or equivalently with an area greater than 1000 km²).

For the river length within each cell, the computation relies on the HR route within the cell, contrary to other methods that use the flow direction to compute the distance between the center of the cell and the center of the following cell and that multiply this distance by a constant meandering ratio (e.g., CTRIP-HD). Here, meanders are accounted for in the computation of distances in the HR river route. The final river length within each cell is bounded between 1000 m and 20000 m. One may note that river reaches shorter than 1000 m correspond to headwaters, while reaches longer than 20000 m correspond to highly meandering rivers. River slope is also bounded between 10^-4 m/m in flat regions and 0.5 m/m in mountainous areas.
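As an illustration of the length computation and of the bounds given above, the following sketch sums the distances between consecutive HR pixels of a route and then clips length and slope. The function names are hypothetical, and the route is assumed to be given in projected coordinates in metres, which is a simplification: on a latitude-longitude grid, geodesic distances would be needed instead.

```python
import numpy as np

LENGTH_BOUNDS_M = (1000.0, 20000.0)  # bounds on river length within a cell
SLOPE_BOUNDS = (1e-4, 0.5)           # bounds on river slope (m/m)

def route_length_m(x_m, y_m):
    """Length of an HR route given as successive pixel coordinates in metres."""
    dx = np.diff(np.asarray(x_m, dtype=float))
    dy = np.diff(np.asarray(y_m, dtype=float))
    # Summing pixel-to-pixel distances follows the meanders of the route.
    return float(np.hypot(dx, dy).sum())

def bound_river_geometry(length_m, slope):
    """Clip river length and slope to the bounds stated in the text."""
    length = np.clip(np.asarray(length_m, dtype=float), *LENGTH_BOUNDS_M)
    slope = np.clip(np.asarray(slope, dtype=float), *SLOPE_BOUNDS)
    return length, slope
```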
The river width $W_{riv}$ is mainly derived from the Global River Width from Landsat dataset (GRWL, Allen and Pavelsky, 2018; Frasson et al., 2019). GRWL was developed by processing Landsat imagery at approximately mean annual flow. It provides high-resolution centerline locations alongside river width for global rivers wider than 30 m. Water body type is given for each river reach as metadata. Here, reaches corresponding to lake or reservoir, canal, or tidally influenced river are discarded. Since the location of river centerlines may not exactly match the river network at 12D, river centerlines have first been clipped on the MERIT-Hydro river network. Then the river identifier making the correspondence between HR and 12D is used to derive the river width for each cell at 12D (based on the median of HR river width within each 12D cell). For grid cells where no river width can be derived from GRWL, we used the empirical relationship developed for previous versions of the CTRIP model (Vergnes et al., 2014), based on the annual mean discharge $Q_{mean}$. $Q_{mean}$ is derived from the runoff field in the GG-HYDRO database (Cogley, 2003) propagated through the river network. A threshold of 30 m is chosen as the minimum width. Figure 10 presents the distribution of river width from GRWL with respect to the annual mean discharge. It shows a strong relationship that is well captured by the empirical relationship from Vergnes et al. (2014).
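The aggregation of HR widths to 12D cells can be sketched as below. This is a minimal example under assumptions: the function `cell_widths` and its arguments are hypothetical, HR width samples are assumed to be already matched to a 12D cell index, and the empirical width-discharge relationship is passed in as a user-supplied callable `fallback` because its expression is not reproduced in this text. Applying the 30 m minimum to all cells is also an assumption.

```python
import numpy as np

def cell_widths(hr_width_m, hr_cell_id, n_cells, q_mean, fallback,
                min_width_m=30.0):
    """Median HR width per cell, with an empirical fallback where no data.

    `fallback` maps the annual mean discharge of a cell to a width in metres
    (placeholder for the empirical relationship, supplied by the caller).
    """
    hr_width_m = np.asarray(hr_width_m, dtype=float)
    hr_cell_id = np.asarray(hr_cell_id)
    q_mean = np.asarray(q_mean, dtype=float)
    width = np.empty(n_cells)
    for cell in range(n_cells):
        samples = hr_width_m[hr_cell_id == cell]
        samples = samples[np.isfinite(samples)]
        if samples.size:
            width[cell] = np.median(samples)
        else:
            # No observed width in this cell: use the supplied relationship.
            width[cell] = fallback(q_mean[cell])
    # Assumption: the 30 m minimum width is applied to every cell.
    return np.maximum(width, min_width_m)
```

The median makes the cell value robust to a few outlying HR widths, for example near confluences.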

Finally, a smoothing is applied over each river (moving average with a 16-pixel width) to avoid unrealistic shrinkages.
The last parameter related to river hydro-geomorphology is the roughness coefficient. Here we used the same methodology [...] The weighting coefficient $\alpha_r$ varies linearly from 1 in the headwater cells to 0.5 at the outlet of the river basin, where $SO$ is the stream order within the river basin, and $SO_{min}$ and $SO_{max}$ are the minimum and maximum stream order values within the same basin. Figure 12 shows the different parameters over different regions of the globe (Amazon basin, USA and Europe).
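The two operations described above, the moving average along a river and the linear weighting coefficient, can be sketched as follows. Both functions are hypothetical illustrations. The linear form of the coefficient assumes that headwater cells carry the minimum stream order and the outlet the maximum one, and the smoothing assumes the values are ordered along the river; the treatment of river ends (averaging over the available pixels only) is a choice made for this example.

```python
import numpy as np

def weighting_coefficient(so, so_min, so_max):
    """Linear weight: 1 at the minimum stream order, 0.5 at the maximum."""
    so = np.asarray(so, dtype=float)
    if so_max == so_min:
        # Single stream order in the basin: treated as headwater everywhere.
        return np.ones_like(so)
    return 1.0 - 0.5 * (so - so_min) / (so_max - so_min)

def smooth_along_river(values, window=16):
    """Moving average along a river, averaging only over existing pixels."""
    values = np.asarray(values, dtype=float)
    window = min(window, values.size)
    kernel = np.ones(window)
    total = np.convolve(values, kernel, mode="same")
    # Number of pixels actually covered by the window at each position.
    count = np.convolve(np.ones_like(values), kernel, mode="same")
    return total / count
```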

Floodplains parametrization
Floods may occur when the water height within the river exceeds the river depth, causing lateral flows over the river banks. Floodplains, described as the area surrounding the river which can be flooded during heavy rain events, act as a water storage and directly impact the water propagation within the river network. High accuracy representation of flow dynamics within floodplains requires a highly accurate DEM and intensive computations to solve the 2D Saint-Venant equations. This can be done over small areas, such as urban areas, but not at regional scales. A number of large scale river models which account for floodplains and their impacts on the flow dynamics are based on subgrid approximations (Yamazaki et al., 2011; Decharme et al., 2012). The concept generally relies on computing the volume of water that flows outside the river (given the river maximum volume) and estimating the water level and the area of the flooded zone from subgrid distributions. Here, in order to ensure the consistency between the river network and the floodplain representation, the adjusted elevation from MERIT-Hydro is used as the baseline to compute the subgrid distributions of elevation, cell fraction (related to the area) and volume of water within the floodplain. The method to extract these distributions is described in Decharme et al. (2012).
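To make the idea of subgrid distributions concrete, the sketch below derives, for one cell, the flooded fraction and the stored volume as functions of water height from the HR elevations inside the cell. It is a simplified illustration, not the method of Decharme et al. (2012): the function name is hypothetical, all pixels are assumed to have the same area, and any pixel lying below the water surface is considered flooded regardless of its connection to the river.

```python
import numpy as np

def floodplain_distributions(hr_elevation_m, pixel_area_m2):
    """Water height, flooded fraction and volume for one coarse cell.

    Each returned array has one entry per HR pixel, ordered by elevation.
    """
    z = np.sort(np.asarray(hr_elevation_m, dtype=float).ravel())
    n = z.size
    rank = np.arange(1, n + 1)
    height = z - z[0]        # water height above the lowest pixel
    fraction = rank / n      # share of pixels at or below each level
    # Volume when the water surface reaches z[k]:
    # sum over flooded pixels j <= k of (z[k] - z[j]) * pixel area.
    volume = (rank * z - np.cumsum(z)) * pixel_area_m2
    return height, fraction, volume
```

Given a flood volume, the corresponding water height and flooded fraction can then be obtained by interpolating in these arrays, for example with `np.interp(v, volume, height)`, since `volume` is non-decreasing.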

Floodplain roughness to estimate the flow velocity between the river and the floodplain, using the Manning-Strickler equation. In addition, a floodplain roughness coefficient is estimated empirically as in Decharme et al. (2019). This coefficient is directly related to the type of land within the cell. The ECOCLIMAP-II land cover database (Faroux et al., 2013) was used to characterize the type of land. For each cell at 1/12°, we computed the fraction $f_i$ of each land type in Table 1. Then the floodplain roughness was computed as the weighted average of default values $n_i$ for each land type as given in Table 1, i.e. $\sum_i f_i n_i$.
3.3 Groundwater parametrization

Like floodplains, aquifers can significantly impact the propagation of water within rivers. Aquifers are usually recharged by the infiltration of water at the surface and can interact directly with rivers. The direction of the exchanges between rivers and aquifers depends on the water elevation in the river and the water table depth. As for floodplains, some large scale hydrology models (e.g., Döll and Fiedler, 2008; Vergnes and Decharme, 2012; Decharme et al., 2019) integrate a simplified representation of aquifers in order to better represent the continental water cycle, and more specifically the water propagation within the river network.
To delineate the main aquifers that could be represented in large scale hydrology models, Vergnes and Decharme (2012) used the global map from the Worldwide Hydrogeological Mapping and Assessment Programme (WHYMAP; http://www.whymap.org).
As in Vergnes and Decharme (2012), we considered two of the three categories included in this map, for which the two-dimensional diffusive solver is well adapted: the "major groundwater basin" category, which gathers sedimentary basins and alluvial plains with permeable materials, and the "complex hydrogeological structure" category, which includes (among others) alluvial aquifers formed by the deposition of weathered materials. The "local and shallow aquifers" category corresponds to the old geological platforms characterized by crystalline rocks with scattered, superficial aquifers, and is not considered here. Finally, mountainous cells were removed by using a criterion on terrain slopes, and the global lithology map from Dürr et al. (2005) was used to refine the delineation of aquifers. Examples of aquifer delineation are shown in Fig. 13(a).

In Vergnes and Decharme (2012), the groundwater dynamics is described by a two-dimensional diffusive equation which requires additional parameters characterizing the soil, such as the effective porosity and the transmissivity. These characteristics depend strongly on the lithology and can be estimated by mean values from the literature. Here, the lithology was derived from Dürr et al. (2005) and the mean values were taken from Table 1 in Vergnes and Decharme (2012). Note that values of porosity and transmissivity have been capped at 0.05 m³/m³ and 0.02 m²/s, respectively, in order to avoid excessive inertia within the corresponding aquifers. Values of both parameters are shown in Fig. 13(b-c) over different regions of the globe.
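The lookup and capping step can be sketched as follows. The two caps are those stated above; the lithology class names and their mean values are hypothetical placeholders and are not the values of Table 1 in Vergnes and Decharme (2012).

```python
POROSITY_CAP = 0.05        # m3/m3, cap stated in the text
TRANSMISSIVITY_CAP = 0.02  # m2/s, cap stated in the text

# Hypothetical mean values per lithology class: (porosity, transmissivity).
# Placeholders only, NOT the values of Vergnes and Decharme (2012), Table 1.
LITHOLOGY_MEANS = {
    "unconsolidated_sediments": (0.10, 0.05),
    "sandstone": (0.04, 0.005),
    "carbonate_rocks": (0.02, 0.01),
}


def aquifer_parameters(lithology):
    """Return (effective porosity, transmissivity) for a cell, after capping."""
    porosity, transmissivity = LITHOLOGY_MEANS[lithology]
    return min(porosity, POROSITY_CAP), min(transmissivity, TRANSMISSIVITY_CAP)
```

With these placeholder values, only the first class is affected by the caps; the other two keep their literature means.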
Lastly, to simulate the exchanges between aquifers and rivers, the piezometric head has to be simulated and compared to the water level within the river. The piezometric head may also be used to represent upward capillary fluxes up to the vegetation root layer (Vergnes et al., 2014). As for floodplains, a subgrid approach may be used, and the elevation from MERIT-Hydro is used to compute the corresponding subgrid distributions.

Modelling configuration
In this section, we set up a model configuration with the river network and the parameters described in sections 2 and 3. For this validation step, the CTRIP model is chosen with the same configuration as in Decharme et al. (2019). In the latter study, CTRIP is operated at 0.5° resolution (CTRIP-HD) and the groundwater and floodplain components are accounted for. CTRIP-HD has been extensively validated against various types of observations, including river discharge, flood extent, groundwater head and total water storage (Alkama et al., 2010; Decharme et al., 2012, 2019; Vergnes and Decharme, 2012; Vergnes et al., 2014). The whole set of hydro-geomorphological parameters derived in this paper at 12D is available for CTRIP-HD.
Consequently, the new set of parameters can be evaluated and compared to its lower resolution counterpart while keeping a consistent model configuration. The simulation time step is set at 3600 s and the time step for output river discharge is 24 h (daily). Finally, a 30-year spin-up period was used to let the groundwater storage state variable reach its equilibrium value.

Evaluation strategy
Here we compare the performances of the new configuration (CTRIP-12D) to those of the previous one (CTRIP-HD). The evaluation mainly relies on comparisons between simulated and observed discharge at more than 10,000 in situ gauge stations over the globe, extracted from various open access databases described in Table 2. A minimum of 3 years of records over 1979-2014 was imposed as a mandatory criterion, as well as the presence of localization and drainage area in the station metadata. A total of 13,516 stations were finally selected, with drainage areas ranging from 400 km² to 4.7 × 10⁶ km².
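The selection criteria can be expressed as a simple filter. In the sketch below, the `Station` structure, its field names and the way record length is counted (number of distinct calendar years with data inside the period) are assumptions made for illustration; the source does not specify how the 3 years are counted.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Station:
    station_id: str
    lon: Optional[float]                # None when missing from the metadata
    lat: Optional[float]
    drainage_area_km2: Optional[float]
    years_with_data: List[int]          # hypothetical: calendar years with records


def is_selected(station, start=1979, end=2014, min_years=3):
    """Apply the mandatory criteria: metadata present and enough years of records."""
    if station.lon is None or station.lat is None:
        return False
    if station.drainage_area_km2 is None:
        return False
    # Count distinct years that fall inside the evaluation period.
    n_years = len({y for y in station.years_with_data if start <= y <= end})
    return n_years >= min_years
```

For example, a station with records in 2013, 2014 and 2015 only would be rejected, since just two of these years fall within 1979-2014.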

Localization of gauge stations
For the comparison between observed and simulated discharges, one must first localize the gauge station within the river network of the model. A very common method consists of looking, among the grid pixels surrounding the station, for the one whose drainage area is the closest to the area reported in the station metadata. Yet, in some cases, this can lead to an erroneous selection of the CTRIP pixel corresponding to a given station. Such problems can happen fortuitously (see the example in Fig. 14) or for portions of rivers that have been diverted during the generation process (section 2). This highlights the necessity to improve the localization methodology.
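The common method described above can be sketched as follows, assuming that the model drainage areas are stored in a two-dimensional list and that the station falls in cell (`row`, `col`). The function name, the search radius and the absence of longitude wrap-around are simplifications of this illustration.

```python
def locate_by_area(area_grid, row, col, station_area, radius=1):
    """Pick the pixel around (row, col) whose drainage area is closest to the station's.

    area_grid: 2D list of model drainage areas (km2), hypothetical layout.
    Returns ((i, j), relative_error).
    """
    n_rows, n_cols = len(area_grid), len(area_grid[0])
    best, best_err = None, float("inf")
    for i in range(max(0, row - radius), min(n_rows, row + radius + 1)):
        for j in range(max(0, col - radius), min(n_cols, col + radius + 1)):
            err = abs(area_grid[i][j] - station_area) / station_area
            if err < best_err:
                best, best_err = (i, j), err
    return best, best_err
```

Because only the drainage area is compared, a neighbouring pixel that belongs to another river with a similar area can be selected, which is the kind of fortuitous error mentioned in the text.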

Since the coordinates and the drainage area of each station are known, it is possible to delineate the catchment related to the station from the MERIT-Hydro database. First, the pixel in the HR grid corresponding to the station is designated as the one minimizing a criterion that combines the distance to the station and the drainage area. At such a high resolution, the method can be considered robust enough to avoid mislocalization.
The second step consists of sorting the CTRIP pixels around the station (as in Fig. 14) by descending drainage area. For each pixel, the comparison between the catchment delineation obtained from MERIT-Hydro and from CTRIP is quantified by computing the IoU index (Eq. (1)). The area relative error ($a_{cost}$) and the mask overlapping relative error ($m_{cost}$) are finally combined to find the best candidate.
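A minimal sketch of this second step is given below. Catchment masks are represented as sets of cell identifiers on a common grid. The definitions used here, namely the mask error taken as one minus the IoU and the two errors combined by a plain sum, are assumptions of this illustration and not necessarily the exact expressions used by the authors.

```python
def iou(mask_a, mask_b):
    """Intersection over Union of two catchment masks given as sets of cell ids."""
    a, b = set(mask_a), set(mask_b)
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def best_candidate(ref_mask, ref_area, candidates):
    """Select the model pixel whose catchment best matches the reference one.

    ref_mask, ref_area: catchment delineated from the HR data (hypothetical inputs).
    candidates: list of (pixel_id, model_area, model_mask).
    Returns (pixel_id, cost).
    """
    best_id, best_cost = None, float("inf")
    # Candidates are examined by descending drainage area, as in the text.
    for pixel_id, area, mask in sorted(candidates, key=lambda c: c[1], reverse=True):
        a_cost = abs(area - ref_area) / ref_area  # area relative error
        m_cost = 1.0 - iou(ref_mask, mask)        # mask error (assumed form)
        cost = a_cost + m_cost                    # assumed combination
        if cost < best_cost:
            best_id, best_cost = pixel_id, cost
    return best_id, best_cost
```

With such a criterion, a candidate with a perfect drainage area but a poorly overlapping catchment is penalized, whereas an area-only criterion would retain it.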
Consequently, each station is assigned a CTRIP pixel more consistently than when using classical approaches. This process is applied for CTRIP-HD and CTRIP-12D. It also ensures that basins smaller than one grid pixel are excluded from the selection since $m_{cost}$ would be too high. Note also that the method is able to solve potential localization difficulties due to the river diversion allowed during the network generation process (section 2). Although the river diversion can foster this kind of situation, it allows at the same time the confluences to be correctly localized within the network. This avoids artificial confluences and consequently prevents the stations concerned from being discarded (due to a bad mask overlap).

Evaluation metrics
The main metric used to quantify the performances of each simulation is the modified Kling-Gupta Efficiency (KGE, Kling et al., 2012). The KGE is a combination of three factors describing the error in terms of relative bias, correlation and relative variability (Eq. (7)). KGE varies from −∞ to 1, the upper bound corresponding to simulation results that perfectly match the observations.

$$ \mathrm{KGE} = 1 - \sqrt{(r-1)^2 + (\beta-1)^2 + (\gamma-1)^2}, \qquad \beta = \frac{\mu_s}{\mu_o}, \qquad \gamma = \frac{\sigma_s/\mu_s}{\sigma_o/\mu_o} \qquad (7) $$

where r is the correlation coefficient between simulated and observed discharge, β is the bias ratio, γ is the variability ratio, µ is the mean discharge and σ the standard deviation, the subscripts s and o denoting simulated and observed values, respectively.

We also use the Normalized Information Contribution (NIC), which is particularly suited to quantify the improvement between two simulations, as in Albergel et al. (2018).

[...] the Niger basin, which strongly impacts the discharge ratio. Finally, the dynamics of lakes is neglected, which also impacts the quality of the results, mainly in terms of correlation and standard deviation.
To verify that poor performances are mainly due to these reasons and not to the new parametrization at 12D, the next section compares the performances of CTRIP-12D with those of CTRIP-HD, both run in the same configuration.
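For reference, the metrics used in this evaluation can be sketched in a few lines of Python. The KGE function follows the modified formulation of Kling et al. (2012), in which the variability ratio is a ratio of coefficients of variation. The NIC function uses a commonly used normalized form, (new − reference) / (1 − reference); this form and its scaling are assumptions of this sketch, since the corresponding expression is not reproduced in the text above.

```python
import math


def modified_kge(sim, obs):
    """Modified Kling-Gupta Efficiency for paired simulated/observed discharges.

    Assumes non-constant series with non-zero means (no missing values).
    """
    n = len(obs)
    mu_s, mu_o = sum(sim) / n, sum(obs) / n
    sd_s = math.sqrt(sum((s - mu_s) ** 2 for s in sim) / n)
    sd_o = math.sqrt(sum((o - mu_o) ** 2 for o in obs) / n)
    cov = sum((s - mu_s) * (o - mu_o) for s, o in zip(sim, obs)) / n
    r = cov / (sd_s * sd_o)                  # correlation coefficient
    beta = mu_s / mu_o                       # bias ratio
    gamma = (sd_s / mu_s) / (sd_o / mu_o)    # variability ratio (ratio of CVs)
    return 1.0 - math.sqrt((r - 1) ** 2 + (beta - 1) ** 2 + (gamma - 1) ** 2)


def nic(kge_new, kge_ref):
    """Normalized Information Contribution (assumed form, unscaled)."""
    return (kge_new - kge_ref) / (1.0 - kge_ref)
```

With this assumed form, a positive NIC indicates an improvement of the new simulation over the reference, and a negative NIC a degradation.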

Comparison with CTRIP-HD
Considering that the CTRIP-HD model in its current version has been extensively validated in the past (e.g., Alkama et al., 2010; Decharme et al., 2012; Vergnes and Decharme, 2012; Vergnes et al., 2014; Decharme et al., 2019), we here mainly focus on the comparison between this existing version of CTRIP and the new one at 12D developed in this article.

By applying the methodology to localize gauge stations within the river network (see section 4.2.2), 2,612 stations have been selected as correctly localized in the river networks of both CTRIP-HD and CTRIP-12D. For these stations, we computed the KGE values for both simulations as well as the NIC criterion that quantifies the improvement or degradation of CTRIP-12D compared to CTRIP-HD. As written in the previous section, despite the overall good quality of the CTRIP model, it may fail in reproducing observed discharges, in particular for stations highly influenced by human activities which are not represented in CTRIP. For these stations, we consider that the CTRIP model is not adapted due to processes not accounted for. Consequently, we consider that an improvement or degradation of model performances is not relevant and we discarded these stations.

Fig. 16 [...]

To get a closer look at the differences between performances of CTRIP-HD and CTRIP-12D, panels in Fig. 17 [...]

Better performances could be expected for smaller basins since these basins are represented by just a few cells at HD, and the difference between the basin delineation at HD and 12D could be relatively high, thus leading to different contributing areas.
The better performances of CTRIP-12D for larger basins are less expected. Indeed, processes and forcing are the same for both configurations and parameters are derived using similar strategies and relationships. The improvement of the correlation and variability demonstrates that a better defined river network improves the dynamics of river propagation within the basin and the interactions with floodplains and aquifers. Other potential sources of differences between both models include: (1) the reference HR dataset (HydroSHEDS for CTRIP-HD, MERIT-Hydro for CTRIP-12D), which impacts the generation of the sub-grid parametrization of floodplains and aquifers; and (2) the use of observation-based river width for CTRIP-12D.

Conclusions
This article presents a new global scale river network at 1/12° (12D) derived from the MERIT-Hydro high resolution hydrography data. We also provide a set of hydro-geomorphological parameters that are consistent with this new river network. The set of parameters includes: length, width, depth and roughness for rivers, roughness and sub-grid topography for floodplains, transmissivity, effective porosity and sub-grid topography for aquifers.
The new river network and hydro-geomorphological parameters have been implemented in a new version of the CTRIP model (Decharme et al., 2019) and assessed through a comparison of simulation performances with the previous version of CTRIP at 1/2° (HD). It is shown that river discharge is overall better estimated with the 12D version and that the improvement can be mainly attributed to the finer representation of the real river network. When increasing the resolution of CTRIP from HD to 12D, the total number of cells changes from 62 × 10³ to 2.2 × 10⁶, the total number of basins increases from 4,800 to 56,500 and the total river length increases from 2.5 × 10⁶ km to 21 × 10⁶ km.
As a perspective, it can be mentioned that the derivation of some parameters could be improved over some regions by using existing local or national data. For example, aquifers could be better described by the Référentiel Hydrogéologique Français (BDRHF) database available over France, or by hydrogeological maps from USGS over the United States.

6 Data availability
The river network and the hydro-geomorphology (including floodplains and aquifers parametrizations) data sets are freely available for download from Zenodo (https://doi.org/10.5281/zenodo.6475446; Munier and Decharme, 2021). The source code is also available in this repository.
Author contributions. S. Munier and B. Decharme both designed the study and contributed to the manuscript.