A hydroclimatological approach to predicting regional landslide probability using Landlab

. We develop a hydroclimatological approach to the modeling of regional shallow landslide initiation that integrates spatial and temporal dimensions of parameter uncertainty to estimate an annual probability of landslide initiation based on Monte Carlo simulations. The physically based model couples the inﬁnite-slope stability model with a steady-state subsurface ﬂow representation and operates in a digital elevation model. Spatially distributed gridded data for soil properties and vegetation classiﬁcation are used for parameter estimation of probability distributions that characterize model input uncertainty. Hydrologic forcing to the model is through annual maximum daily recharge to subsurface ﬂow obtained from a macroscale hydrologic model. We demonstrate the model in a steep mountainous region in northern Washington, USA, over 2700 km 2 . The inﬂuence of soil depth on the probability of landslide initiation is investigated through comparisons among model output produced using three different soil depth scenarios reﬂecting the uncertainty of soil depth and its potential long-term variability. We found elevation-dependent patterns in probability of landslide initiation that showed the stabilizing effects of forests at low elevations, an increased landslide probability with forest decline at mid-elevations (1400 to 2400 m), and soil limitation and steep topographic controls at high alpine elevations and in post-glacial landscapes. These dominant controls manifest themselves in a bimodal distribution of spatial annual landslide probability. Model testing with limited observations revealed similarly moderate model conﬁdence for the three hazard maps, suggesting suitable use as relative hazard products. The model is available as a component in Landlab, an open-source, Python-based landscape earth systems modeling environment, and is designed to be easily reproduced utilizing HydroShare cyberinfrastructure.


Introduction
In steep mountainous landscapes, episodic shallow landslides (generally < 2 m depth; Bordoni et al., 2015) and landslide-triggered debris flows are often the dominant form of hillside erosion and a major source of sediment into streams (Benda and Dunne, 1997a, b;Goode et al., 2012). Where the landslide processes intersect with human development, they cause property damage, disruption of infrastructure, injury, and loss of life (Taylor and Brabb, 1986;Baum et al., 2008a), contribute to sedimentation in reservoirs (Bathurst et al., 2005), and may even lead to dam failures (Ghirotti, 2012). Landslides provide punctuated sediment input to streams, affecting stream geomorphology (Benda and Dunne, 1997a, b) and ecosystem dynamics (Pollock, 1998;May et al., 2009). Landslide hazard maps are a common tool used to characterize the relative potential for landslide occurrence in space, either qualitatively (using susceptibility levels) or quantitatively (using modeled landslide probabilities) (van Westen et al., 2006;Raia et al., 2014).
Our objective is to develop a parsimonious probabilistic model of regional shallow landslide initiation that can be implemented with minimal calibration for landslide hazard mapping using regionally available, spatially distributed input data for soil, vegetation type, topography, and hydroclimatology. Based on the literature review presented below, we propose that a regional landslide hazard model should (1) be flexible enough to incorporate changes in intrinsic and extrinsic conditions, such as vegetation and climate; (2) account for spatial variability in model parameters and forcings; and (3) integrate spatial and temporal dimensions of uncertainty to quantify landslide probability. With these principles in mind, we develop a hydroclimatological approach to modeling regional landslide hazard using the Landlab earth surface modeling toolkit -an open-source, Python-based earth surface modeling framework that provides flexible model customization and coupling . Next, we provide a short literature review that guides the design of our landslide modeling approach.

Geomorphology and modeling background
Landslides occur when destabilizing forces due to gravity and pore-water pressure exceed the resisting forces of friction and cohesion over a failure plane. These forces are controlled by intrinsic hillslope conditions -including attributes of topography, such as local slope and upslope contributing area, and properties of rock, soil, and vegetation root cohesion -and extrinsic drivers of rainfall, snowmelt, and earthquakes (Crozier, 1986;Wu and Sidle, 1995;van Beek, 2002;Naudet et al., 2008). There are three primary components of a landslide: (1) a source area or landslide scar where the initial failure begins, (2) a transmission or scour zone, such as a debris flow channel, and (3) a toe or zone of deposition (Lu and Godt, 2013).
Landslide susceptibility can be identified through numerous methods, which can be broadly grouped into empirical methods and process-based numerical models (Hammond et al., 1992;Wu and Sidle, 1995;Sidle and Ochiai, 2006). Datadriven empirical approaches relate the number and frequency of historical landslide observations in a region to triggering events (Caine, 1980;Crozier, 1999;Glade, 2001), landscape attributes Chung et al., 1995;Lee et al., 2007), or a combination of both (Kirschbaum et al., 2012) using threshold relations and various statistical models such as logistic regression, fuzzy logic, artificial neural networks, and a support vector machine (Lee et al., 2007;Pardeshi et al., 2013;Chen et al., 2014).
Process-based models employ effective stress principles to characterize the destabilizing and resisting forces under hydrologic drivers (Iverson, 2000;Montrasio and Valentino, 2016), offering the ability to explore changes in environmental and climatic conditions, critical for areas with limited landslide inventories (Pardeshi et al., 2013). Recent processbased numerical models have largely focused on improving the characterization of the space-time dynamics of subsurface flow as a driver of pore-water pressure (e.g., Baum et al., 2008b;Raia et al., 2014;Anagnostopoulos et al., 2015;Montrasio and Valentino, 2016). Distributed hydrology models that use steady-state or transient solutions for subsurface flow depth were coupled with an infinite-slope stability model that solves the ratio of stabilizing to destabilizing forces on a failure plane parallel to the land surface (Montgomery and Dietrich, 1994;Miller, 1995;Wu and Sidle, 1995;Pack et al., 1998;Borga et al., 1998;Casadei et al., 2003;Tarolli and Tarboton, 2006;Baum et al., 2008b).
Steady-state models assume that, at each point on the landscape, lateral subsurface flow, driven by the topographic gradient, is in equilibrium with a steady-state recharge rate (Montgomery and Dietrich, 1994;Pack et al., 1998). The degree of soil saturation is predicted proportional to the ratio of upslope contributing area to local slope and a ratio of watershed recharge and local soil transmissivity, following TOPMODEL (TOPography based hydrological MODEL) assumptions (Beven and Kirkby, 1979;O'Loughlin, 1986;Pack et al., 1998). More recent efforts have focused on the development of transient flow models in various complexities by coupling vertical infiltration and redistribution processes in the unsaturated zone, using the Richards equation for unsaturated flow (Richards, 1931) or its variants, with lateral flow parameterizations such as kinematic wave in one and two dimensions (Iverson, 2000;Casadei et al., 2003;Baum et al., 2008b;Godt and McKenna, 2008;Raia et al., 2014;Alvioli et al., 2014;Anagnostopoulos et al., 2015).
While transient flow models have contributed to an improved understanding of the influence of weather forcing and temporal variability in precipitation on landslide initiation, they remain tools typically applied for relatively small-scale assessments (Iverson, 2000;Raia et al., 2014). Transient models require a large number of hydrologic soil and vegetation parameters that are highly variable, uncertain, and difficult to measure or estimate (Godt and McKenna, 2008;Baum et al., 2008b). In addition, in most steep forested mountains where landslide risk is high, the presence of macropores due to connected root structures, biological activity, fractures, large clasts, and lenses leads to preferential and funneled flows that violate the assumptions of most matrixflow models (Nimmo, 2005;Sidle et al., 2001;Gabet et al., 2003;Montrasio and Valentino, 2016;Beven and Germann, 2013). Numerical solutions to flow equations also present a major computational bottleneck in large-scale applications for probabilistic quantification of landslide hazard.
While using transient hydrologic models provided slight improvements in the prediction of landslide locations, overall, statistical comparisons of model outputs between steadystate and transient models revealed fairly similar degrees of success (Gorsevski et al., 2006;Zizioli et al., 2013;Anagnostopoulos et al., 2015;Bordoni et al., 2015;Formetta et al., 2016). In some applications, model complexity increased the accuracy of predicted landslide locations at the expense of overestimating instability on unsaturated hillslopes (e.g., Godt et al., 2008;Bellugi, 2011). In other cases, model precision increased while accuracy decreased (Gorsevski et al., 2006).
Data uncertainty due to spatial and temporal variability in parameters continues to be one of the major challenges in predicting landslides over broad regions (Crozier, 1986;Sidle and Ochiai, 2006;van Westen et al., 2006;Baum et al., 2014;Anagnostopoulos et al., 2015). Parameter uncertainties can develop from geological anomalies, inherent spatial heterogeneities in soil and vegetation properties and their changes over time, and sampling limitations (El-Ramly et al., 2002;Cho, 2007;Baum et al., 2014). Uncertainties in hydroclimatic variables, such as precipitation, air temperature, and resulting hydrologic fluxes, are particularly pronounced in steep, high mountain regions due to lack of observations to capture complex atmospheric processes (Roe, 2005;Wayland et al., 2016). Designating landslide hazard as a probability, rather than an index, systematically accounts for uncertainty and variability in stability analysis (Hammond et al., 1992;Simoni et al., 2008;Arnone et al., 2014) and more appropriately represents complex systems (Berti et al., 2012). Recently, some promising advances have been made in process-based models accounting for data uncertainty in landslide hazard mapping (e.g., Raia et al., 2014;Arnone et al., 2016a).
Lastly, most landslide hazard methods disregard a temporal dimension over which landslide probability is defined (Wu and Sidle, 1995;van Westen et al., 2006). As a result of this, instead of using estimated probabilities directly in the form of return periods of observed landslides or expected values for risks resulting from landslides, models use probability estimates as relative indices (e.g., Pack et al., 1998) that can be used for hazard zonation (Pardeshi et al., 2013). A lack of temporal dimension limits the incorporation of model results into risk assessments and the decision-making processes in high-risk regions.

Approach overview
We develop a process-based modeling approach for shallow landslide initiation that incorporates imprecision and uncertainty in hydroclimatological forcing, soils, and vegetation parameters using Monte Carlo simulation. Our approach aims to develop a spatially continuous probability of landslide initiation that can be updated as conditions and triggers change. The model evaluates a factor of safety using the infinite-slope stability equation on the scale of a grid cell from a digital elevation model (DEM) through Monte Carlo simulation and calculates the probability of landslide initiation (Hammond et al., 1992;Raia et al., 2014). A Landlab component (LandslideProbability) and a model driver that runs the component are written, and a workflow is developed for mapping shallow landslide probability. The model driver and data are deployed on HydroShare (www.hydroshare. org), an online collaboration environment for sharing data, models, and code Idaszak et al., 2016), and are made available for cloud computing via the HydroShare JupyterHub infrastructure using a web browser (see Sect. 2.5).
In this work we explore the following question using Landlab and regional landslide observations: how do spatial patterns in hydroclimatology, vegetation, and soil depth influence shallow landslide initiation over large geographic scales? We demonstrate our approach in a mountainous region of Washington, USA. This Pacific Northwest (PNW) region is naturally susceptible to landslides because of high and intense rainfall, steep mountains, active tectonics, and geologic and glacial history (Nadim et al., 2006;Sidle and Ochiai, 2006). The Oso landslide occurred in the vicinity of our study area in 2014, resulting in 43 fatalities and over USD 50 million in economic losses (Wartman et al., 2016).

Probabilistic approach to landslide initiation
The infinite-slope stability equation, derived from the Mohr-Coulomb failure law, predicts the factor of safety (FS) of an infinite plane from the ratio of stabilizing forces of cohesion and friction, reduced by pore-water pressure, to destabilizing forces of gravity (Hammond et al., 1992;Wu and Sidle, 1995). The model as given by Pack et al. (1998) is C * is a dimensionless cohesion (Eq. 1b) embodying the relative contribution of cohesive forces to slope stability. When C * > 1, cohesion is sufficient to hold the soil slab vertically (Pack et al., 1998). C r and C s are root and soil cohesion, respectively (Pa), h s is the soil depth perpendicular to slope (m), ρ s and ρ w are saturated soil bulk density and water density (kg m −3 ), respectively, g is acceleration due to gravity (m s −2 ), θ is slope angle of the ground, and ∅ is soil internal friction angle ( • ). Relative wetness, R w , is defined as the ratio of subsurface flow depth, h w , flowing parallel to the soil surface, to h s . Deterministically, a hillslope element is unstable if FS < 1 and stable if FS > 1 (Sidle and Ochiai, 2006;Selby, 1993). When FS = 1, the slope is "just-stable" or in a state of "limited equilibrium" (Lu and Godt, 2013). Relative wetness is arguably the most dynamic factor on short timescales, relating to water table depth and to recharge rate. Considering that hillslope hydrology is more likely to attain equilibrium conditions during prolonged wet conditions (e.g., Barling et al., 1994;Borga et al., 2002), a steadystate representation of subsurface flow is used. It is derived from local subsurface lateral flow, q s (m 2 d −1 ), represented by a 1-D (i.e., flow parallel to bedrock) form of the kinematic wave approximated by Darcy's law using the topographic 52 R. Strauch et al.: A hydroclimatological approach to predicting regional landslide probability gradient of hillslope, q s = K s h w sin θ (Wu and Sidle, 1995). Under a steady-state assumption, lateral flow is in balance with the rate of water input, q r (m 2 d −1 ), through a uniform rate of recharge, R (m d −1 ), defined across the upslope specific contributing area, a (m), as q r = R a. This assumption gives R a = K s h w sinθ , where K s is saturated hydraulic conductivity (m d −1 ). Solving this equation for h w and dividing both sides by h s gives R w (Montgomery and Dietrich, 1994;Pack et al., 1998): Here T is local soil transmissivity (m 2 d −1 ), which is depthintegrated saturated hydraulic conductivity, K s . For uniform K s within the soil profile overlying impermeable bedrock, T = K s h s . Ground saturates when R w = 1, the maximum value for R w (Montgomery and Dietrich, 1994). These assumptions are appropriate for steep topography to efficiently characterize wetness over large areas (Tarolli and Tarboton, 2006;van Westen et al., 2006). A Monte Carlo simulation is used with Eq. (1a) by assuming R, T , C (C = C r + C s ), h s , and ∅ as random variables represented by probability distributions (Tobutt, 1982;Hammond et al., 1992). The uncertainty in R is represented using a dataset of the maximum daily recharge in each year (e.g., Benda and Dunne, 1997a;Borga et al., 2002;Istanbulluoglu et al., 2004). The model includes both spatially uniform and spatially distributed options for sampling recharge (described further in Sect. 2.3). Using sampled random variables in Eq. (1a), FS is calculated in each model iteration, i, during the simulation. The annual probability of failure P (F ) and landslide return period (RP) at each grid cell are defined as (Hammond et al., 1992;Cullen and Frey, 1999) where n() is the number of conditions met in bracket and N is the number of iterations. Our model does not predict the size of a probable landslide at the initiation point, which can be smaller or larger than the size of a DEM grid. P (F ) gives a relative propensity that a landslide could initiate within the grid cell. If some random samples lead to a low deterministic FS, they contribute to an increase in the P (F ) within that cell. Sensitivity analysis of the infinite-slope stability model was shown in the literature (see Sidle 1984;Hammond et al., 1992).

Model development in Landlab
Landlab is a Python-based earth surface modeling toolkit (http://landlab.github.io). It provides a grid architecture, a suite of pre-built components for modeling surface or nearsurface processes, and utilities that handle data creation, management, and interoperability among process components (Tucker et al., 2016;Hobley et al., 2017;. The LandslideProbability component is written in Python and implemented with a model driver (written as a Jupyter Notebook) using the workflow shown in Fig. 1 of the component's user manual (Strauch et al. 2018). The driver imports Landlab and necessary Python libraries, loads and processes data, and executes the LandslideProbability component on a RasterModelGrid (RMG), which is a Landlab class for creating raster grid objects. A structured grid is generated by the RMG class that covers the model domain.
The spatial model parameters and model forcing data are completed in preprocessing steps outside of Landlab. They are loaded and stored on grid nodes (the central point of grid cells) of the RMG as Landlab data fields, composed of NumPy arrays. The LandslideProbability component is instantiated by passing four arguments: the grid, number of iterations, recharge distribution, and recharge parameters. Once the component has been instantiated, the component's method calculate_landslide_probability() is executed in a for loop that performs the calculations at each node. An iteration number in the range of 700 (Malkawi et al., 2000) to > 1200 (Abbaszadeh et al., 2011) was found to be sufficient in the literature. We used 3000 in this study. At each node the method generates unique model parameters, and calculates the relative wetness (Eq. 2) and FS (Eq. 1a) for each iteration. At the end of the iterations, the probability of saturation and the probability of failure are calculated at each node.
Slope angle and specific contributing area are static parameters derived from a DEM in preprocessing steps. Total cohesion, C (i.e., C r + C s ), ∅, h s , and T are treated as random variables following a triangular distribution specified with three parameters (minimum, mode, and maximum) (Cho, 2007;Dou et al., 2014). Options for user-provided T or K s are accepted by the component; although comparisons of resulting landslide probabilities were found to be similar given that the value of T was derived from h s . Triangular distributions give weight to the most likely value (i.e., mode) and have been proposed in other Monte Carlo simulation studies for slope stability (Hammond et al., 1992;El-Ramly et al., 2002;Strenk, 2010).
Mode parameters of the triangular distribution used for all soil and vegetation parameters are developed as raster grids as part of preprocessing steps, loaded to Landlab, and assigned to nodes of the RMG (Fig. 1). For root cohesion we used vegetation types from the National Land Cover Data (NLCD) (Jin et al., 2013;USGS, 2014b), with a lookup table for cohesion obtained from the literature (Table 1). Only for cohesion are minimum and maximum parameters also provided as raster grids to represent distributed variation with vegetation. The Gridded Soil Survey Geographic Database (SSURGO) (DOA-NRCS, 2016) is used to assign ∅, h s , and T (see Sect. 3.2.1 for details). The current model design assumes negligible correlation between C and ∅ as assumed

Model driver
• Import Landlab and Python libraries • Establish RasterModelGrid (e.g., read in esri ascii DEM) as "grid" • Load data and add data fields to grid in other studies (e.g., Abbaszadeh et al., 2011;Arnone et al., 2016a). Other spatial soil and vegetation datasets can be used in the preprocessing of the model. Exposed bedrock and glaciated surfaces can be excluded from the model domain by the user.
In each Monte Carlo iteration, we use the annual maximum daily recharge, R, which represents a steady-state uniform recharge rate defined for the upslope contributing area of each RMG node. Local recharge (i.e., the flux of water entering the saturated zone) within the upslope contributing area of RMG nodes can be incorporated from a variety of grid resolutions from hydrologic models, referred to as a hydrologic source domain (HSD). A source tracking algorithm (STA) is developed that uses spatially variable recharge data from an HSD, resampled to the grid resolution of slope stability calculations, and routes local recharge in the downstream direction following the steepest descent until a target cell is reached. Then it calculates the spatially averaged upslope recharge for each node of the RMG, used as R in the model. STA is described in more detail in the component's user manual (see Strauch et al., 2018).
Four options for sampling R are provided for Monte Carlo simulation at each node, identified in the model driver by selecting a probability distribution: uniform, lognormal, log-normal_spatial, and data_driven_spatial. The first two options assign spatially uniform random variables of R across the whole model domain with respective parameters of minimum and maximum and mean and standard deviation. The latter two spatial options are designed to represent spatial variability in R, constructed based on the statistics of annual maximum R obtained from an HSD using the STA utility. The lognormal_spatial option assigns mean and standard deviation of R at each node derived from the modeled R data, while the data_driven_spatial option uses a nonparametric sampling approach to sample from the cumulative distribution of R data produced for each node of the RMG. In this regional application of the landslide component, the Variable Infiltration Capacity (VIC) macroscale (1/16 • or 5 × 6 km grid cell) hydrology model is used as HSD (Liang et al., 1994).

Hydrologic data processing
A key aspect of the regional landslide modeling approach is the linking of landslide hazard to hydroclimatological forcing on regional scales. The Landlab LandslideProbability component is written with the capability to accept input from hydrologic model outputs, such as the VIC macroscale hydrologic model (Liang et al., 1994) we demonstrate in this paper. VIC is semi-distributed, predominantly physics-based macroscale hydrology model that characterizes elevationdependent differences in regional precipitation and temperature and their influence on recharge through regulating rain on snow, snow accumulation and snowmelt, evapotranspiration, and soil moisture (Hamlet et al., 2013).

54
R. Strauch et al.: A hydroclimatological approach to predicting regional landslide probability The steady-state subsurface model coupled with the infinite-slope stability equation in our model requires a steady-state recharge rate as input. Recharge refers to the input of water to subsurface flow from precipitation and snowmelt less evapotranspiration and soil storage. In a VIC model simulation, this condition can be obtained by adding base flow and surface runoff. Observations and model experiments suggest that widespread landslides are usually associated with the largest rainfall events (e.g., Page et al., 1994;Gorsevski et al., 2006). To characterize when the ground is likely to be the most saturated each year, daily base flow and surface runoff are summed at each VIC grid cell to represent daily recharge (mm d −1 ) and the annual maximum daily value is selected for each year of the dataset, similar to others (e.g., Benda and Dunne, 1997a;Borga et al., 2002;Istanbulluoglu et al., 2004). To obtain a steady-state average recharge rate in the upslope contributing area of each RMG, the Landlab STA utility is used (see Sect. 2.2. of this paper and Fig. 1. of Strauch et al., 2018)

Soil evolution model
Soil depth controls the temporal and spatial patterns of landsliding over geomorphic timescales and is considered one of the most significant variables controlling the FS stability index, especially at depths less than 1.5 m (Benda and Dunne, 1997a;Istanbulluoglu et al., 2004;Catani et al., 2010;Sidle and Ochiai, 2006). Soil depth can vary in space and time as a function of weathering and sediment transport in relation to climate, lithology, topographic position, and vegetation cover (Dietrich et al., 1995). Despite its fine grid resolution, the SSURGO database (DOA-NRCS, 2016) only broadly captures topographic controls on soil depth and reflects existing conditions in the field based on soil surveys. In an attempt to improve the representation of spatial granularity and local uncertainties of soil depth, a soil evolution model is used (Dietrich et al., 1995;Simoni et al., 2008;Pelletier and Rasmussen, 2009;Tesfa et al., 2009;Bellugi et al., 2015). The model is run to develop time series of soil depth from which triangular distribution parameters for soil depth (mode, minimum and maximum) can be obtained and used in the Landlab LandslideProbability component.
In the soil evolution model, a change in soil depth is modeled as the annual sum of local soil production, divergence of sediment flux due to soil creep, and soil removal by landslides (e.g., Tucker and Slingerland, 1997;Heimsath et al., 1997;Braun et al., 2001;Istanbulluoglu et al., 2004;Nicótina et al., 2011). The rate of soil production is related exponentially to local soil depth (Heimsath et al., 1997). Soil creep is linearly related to local elevation gradient (e.g., McKean et al., 1993). Soil removal by landslide initiation is modeled with the infinite-slope stability equation, implemented with representative parameters (Table 2). When FS ≤ 1, soil is removed to bedrock by setting it to a very small value (0.005 m). In each model iteration, C and T were randomly sampled and used in the FS Eq. (1a). Calibration of the soil evolution model is done by adjusting soil production rate and hillslope diffusivity parameter to obtain long-term soil loss consistent with long-term regional erosion rates. Details on model application and the utilization of model outputs are presented in Sect. 4.1.2.

Reproducibility
To publish a reproducible version of this research, we used the HydroShare (www.hydroshare.org) cyberinfrastructure platform, designed for reproducing, reusing, and sharing models (Tarboton et al., 2014;Horsburgh et al., 2016;Morsy et al., 2017). Steps that supported reproducibility included using the HydroShare sharing settings with a workflow that started with Private while data and models were developed, Discoverable while research was being shared with colleagues for review, and Public, once our results were accepted for publication. We used the Select a license function to add NonCommercial (NC) use to our Creative Commons license. We made use of the Groups social collaboration, by making early versions of our research results available to invited participants of workshops and tutorial demonstrations to our Landlab group in HydroShare. The data and model are accessed by launching Jupyter Notebooks that access Landlab installed on JupyterHub servers at the National Center for Supercomputing Applications (Yin et al., 2017;Castranova, 2017). HydroShare features enable our current and future researchers to use the Copy Resource function to replicate our published resource (i.e., the landslide model) in their own account with Derived from metadata that reference back to the published resource DOI, to serve as a starting point for their work. The Supplement provides instructions on how to access HydroShare and run a Jupyter Notebook that reproduces portions of the application below.

Study area
The model described above is applied within the geographical limits of the North Cascades National Park Complex (NOCA) in the state of Washington, USA, managed by the US National Park Service (Fig. 2). In recent decades, NOCA has experienced damaging and disruptive landslides that have impacted infrastructure and the public. Furthermore, the park area is covered by a recent soil survey between 2003 and 2009, including field investigation (DOA-NRCS and DOI-NPS, 2012), and has a complete map of mass wasting processes visually observed in the field (Riedel and Probala, 2005). The application is designed to demonstrate the potential capability of the Landlab LandslideProbability component using existing data in a real setting and to provide a site-specific stability analysis for landslide susceptibility for NOCA land management. NOCA is approximately 2757 km 2 , with 93 % wilderness, where motorized or mechanized devices are not allowed (DOI-NPS, 2012), which is ideal for studying naturally triggered landslides. The elevation ranges from about 100 to 2800 m (Fig. 2a). The terrain is composed of rock slopes at the highest elevations, short (< 100 m) soil-mantled hillslopes, and landslides upslope of relatively straight debris flow channels connected to the fluvial system. Over 300 glaciers occupy mountain peaks in NOCA. The influence of the Pacific Ocean, approximately 80 km to the west, provides a humid temperate climate. However, the north-southoriented Cascade Mountains create an effective orographic climate boundary, separating a wetter west side from a drier east side. Reported mean annual precipitation ranges from about 708 mm at the low elevations of the eastern slopes to 4575 mm at the highest mountain elevations west of the Cascade crest, with about 70 % falling in November through March (Fig. 2b). This spatial precipitation gradient is the result of orographically enhanced precipitation that leads to a strong rain shadow (Roe, 2005). Average annual air temperatures range from −2 to 11 • C, depending on elevation (DOA-NRCS and DOI-NPS, 2012).
In this study vegetation classes were grouped into herbaceous, shrubland, and forest using the 2014 NLCD data, based on the land-use-land-cover (LULC) classification of 2011 Landsat satellite imagery (Jin et al., 2013;USGS, 2014b). Other LULC types include water, wetland, snow/ice, barren, and developed (e.g., roads, campgrounds), covering about 13 % of NOCA. Based on this classification, forest, shrubland, and herbaceous vegetation represent 58, 17, and 12 % of the park, respectively. Elevation ranges for these vegetation classes are from 106 to 2363 m (forest), 110 to 56 R. Strauch et al.: A hydroclimatological approach to predicting regional landslide probability The park geology is composed of a complex mosaic that includes mostly faulted and folded sedimentary and volcanic rocks on the west side, unmetamorphosed sedimentary and volcanic rock on the eastern edge, and highly squeezed and recrystallized metamorphic rock originating from great depth in the middle (Haugerud and Tabor, 2009). Alpine and continental glaciation, along with rivers and mass wasting processes linking peaks with rivers, created the landscape. The glaciers eroded U-shaped valleys with steep valley walls prone to landslides and flat valley floors with gravel-bed rivers. The lower ends of many valleys on the east side with lower precipitation were not covered in alpine glaciers and have narrow, winding V-shaped canyons and steep, narrow rivers.
A park-wide landform mapping study identified six different types of mass wasting landforms: rock fall/topple, debris avalanche, debris torrent, slump/creep, sackung, and snow avalanche-impacted landforms (Riedel et al., 2015). Mass wasting landforms were identified using 1998 aerial photos on a scale of 1 : 12 000, 7.5 min topographic maps, bedrock geology maps, and field investigations. The minimum mapping unit was approximately 1000 m 2 , except for a few smaller slump units. In this study, we only used mapped debris avalanches for model confirmation as they often initi-ate by a shallow landslide. Debris avalanches typically represent a mixture of failed rock and debris. Their mapping included polygons that combine head scars, transport and scour channels, and deposition zones in a single polygon (Fig. 3a). We extract the highest 10 % of the elevations in the mapped debris avalanche polygons as landslide source areas through comparison to aerial imagery (Tarolli and Tarboton, 2006). This analysis located 75 % of landslide source areas at intermediate elevations from 1200 to 2000 m in NOCA (Fig. 3b).
Some areas in mountainous regions are covered by glaciers, permanent snowfields, and exposed bedrock, which are unsuitable locations to model landslides with the infiniteslope model (Borga et al., 2002). These landforms as well as wetlands and other water surfaces are excluded from our modeling domain. The total area excluded from the stability analysis accounts for about 21 % of NOCA's land area.

Model input fields
We used a grid resolution of 30 m from the National Elevation Dataset (NED) (USGS, 2014a). Evaluation of model performance was intended at this resolution for regional modeling as NASA's Shuttle Radar Topography Mission (SRTM) DEM is available globally at a 30 m resolution (USGS, 2017). The minimum mapping unit used for landslides is 30 m for NOCA (Riedel et al., 2015). Slope (S = tanθ ), total curvature (Curv) (i.e., both planar and profile), and contributing area (CA) attributes were derived from the DEM (Fig. 2a).

Vegetation and soil parameters
Vegetation classes are obtained from the NLCD at a 30 m resolution (Jin et al., 2013;USGS, 2014b). Parameters of a triangular distribution for C, ∅, T , and h s are provided in Table 1. In our case study, C represents root cohesion. Soils across the study domain are assumed cohesionless, due to low clay content (< 10 %) in this mountain substrate with large clasts (Kulhawy and Mayne, 1990). Estimating root cohesion is challenging because of temporal and spatial variability in root density and size, differential breakage or pullout mechanisms, interaction among roots, and difficulty in measuring on a field scale (Pollen and Simon, 2005;Schwarz et al., 2013). We developed spatial coverages for minimum, mode, and maximum C for NOCA by relating vegetation classes with corresponding published C values in the literature (Table 1), where field observations suggest a rightskewed distribution (Hammond et al., 1992;Schmidt et al., 2001;Gabet and Dunne, 2002;Hales et al., 2013). Based on ranges available in the literature, we selected a mode value as a commonly reported value, a minimum parameter as 30 % of the mode, representing death and loss of productivity (Sidle, 1991(Sidle, , 1992, and a maximum near the highest reported value for C. Forest have higher C than shrubland because of the greater root area and deeper roots (Arnone et In order to investigate the contribution of soil depth to mapping landslide probability, we developed and used two alternative soil depth products: one based on SSURGO and another based on a soil evolution model. The nationally available SSURGO database maintained by the Natural Resources Conservation Service (NRCS) is a readily available data source that includes data on depth to restrictive layer (DOA-NRCS, 2016), which we used to specify the mode of soil depth (Fig. 4c). Using the Soil Data Viewer of Esri ArcGIS (DOA-NRCS, 2015a), the weighted-average aggregation option is used to extract soil depth within each soil map unit (DOA-NRCS and DOI-NPS, 2012). SSURGO soil depth (SSURGO-SD) is uniform for each soil map unit and thus lacks finer-scale spatial heterogeneity and creates edge incongruities (Fig. 4c), a limitation identified previously for landslide modeling (Bordoni et al., 2015). A more spatially heterogeneous soil depth map is developed using the output of a soil evolution model. SSURGO-SD represents the recent conditions in soil depth. The difference between the actual soil depth in the field and the SSURGO-reported soil depth will likely be associated with the limited number of soil depth measurements used to develop SSURGO maps, measurement errors, and spatial interpolation assumptions. In addition, for the locations that have already produced landslides before SSURGO mapping, we assume that the maximum value of the triangular distribution represents the soil depth prior to a landslide. To represent uncertainty, minimum h s is assumed to be 70 % of the mode and maximum h s adds 10 % to the mode value. These values give a left-skewed triangular distribution, commonly observed for soil depth (Hammond et al., 1992). The selected skewed distribution was confirmed by the soil evolution model discussed in Sect. 4.1.2.
Transmissivity is derived as the product of weightedaverage aggregated K s of all soil layers above the restrictive layer and h s for each soil map unit (DOA-NRCS, 2015a). Similar to h s , this T value was considered the mode (Fig. 4d), and the minimum and maximum values needed for an asymmetrical triangular distribution were calculated as T min = T mode − 0.3 · T mode and T max = T mode + 0.1 · T mode . Closely related to soil depth, T is high in valley bottoms as well on plateaus because of deeper soils; thus, more water can move through the soil when saturated (Fig. 4d). T is low in the thin veneer soils below retreating glaciers as well on steeper side slopes.
The percentages of sand, silt, and clay for each soil map unit in NOCA were derived from SSURGO data using Soil Data Viewer (DOA-NRCS, 2015b). This revealed largely sandy loam or loamy sand soil textures, based on the USDA classification, across the NOCA. These soil textures corresponded to Unified Soil Classification System (USCS) soil types silty sand and well-graded (diverse particle size) fine to coarse sand, respectively. Reported ∅ values for these USCS soil types were assigned as the mode of ∅, ∅ mode used in triangular distribution. The developed land cover type was assigned an additional 3 • to the mode to compensate for higher soil density from development activity, such as compaction (Sidle and Ochiai, 2006). Given the mode and ranges of ∅ for these soil types, minimum and maximum ∅ were calculated to generate right-skewed distributions for both soil types as ∅ min = ∅ mode −0.18·∅ mode and ∅ max = ∅ mode +0.32·∅ mode , based on a literature review (i.e., Table 5.5 in Hammond et al., 1992, andTable 5.2 in Selby, 1993). The soil and water density terms in Eq. (1a) were assigned a constant value of 2000 and 1000 kg m −3 , respectively, similar to Pack et al. (2005). The factor of safety has been found to be insensitive to soil density (Hammond et al., 1992;Lepore et al., 2013).

Model recharge
We used existing VIC model runs for the PNW region developed through the Columbia Basin Climate Change Scenarios Project (Elsner et al., 2010;Hamlet et al., 2013). The project developed a calibrated implementation of VIC (1/16 • or 5 × 7 km grid resolution) covering the Columbia River basin in Washington to produce validated historical hydrologic simulations (water years 1916-2006) driven by spatially interpolated daily station observations of temperature and precipitation (Hamlet et al., 2013). Archived model output at a daily time step includes gridded base flow and runoff. Hydrologic simulations using VIC have also been run for all of the contiguous United States (CONUS) (data avail-able from Livneh et al., 2013Livneh et al., , 2015. We determined the maximum daily recharge for each year to generate a 91-year long time series to characterize the wettest ground saturation conditions for shallow landsliding. Modeling with maximum recharge provides an indicator of individual storm events that typically trigger shallow landslides (Lu and Godt, 2013), although lesser amounts of recharge may also be sufficient to trigger landslides in some locations. The average annual maximum daily recharge over NOCA is about 35 mm d −1 (±15 mm d −1 ), ranging from a low of 7 mm d −1 along the eastern edge of the park to a high of 79 mm d −1 on the western edge and at higher elevation peaks.

Geomorphic analysis and soil evolution
Understanding the spatial distribution of dominant geomorphic processes can aid the development of landslide hazard maps consistent with geomorphic theory. In this section, we discuss the mapping of dominant processes on the landscape on the slope and area domain and explore the proposed soil evolution model to develop modeled soil depth maps.

Investigation of process domains
Hillslope diffusion, landslide, debris flow, and fluvial transport processes leave unique imprints on landforms, manifested in the slope-contributing area (S-CA) domain as different scaling relationships (Montgomery and Dietrich, 1992;Tucker and Bras, 1998;Montgomery, 2001;Stock and Dietrich, 2003;Tarolli and Dalla Fontana, 2009). The infinite-slope factor-of-safety model is only applicable to the initiation of landslides. Therefore, hazards associated with debris flow scour and deposition cannot be predicted by this model. We used a S-CA plot and the infinite-slope stability theory to (1) identify process domains and limit the analysis of the landscape to slopes where there is shallow landslide potential, (2) evaluate observations of debris avalanches to identify landslide source areas, and (3) infer plausible ranges of the infinite-slope stability model parameters to corroborate those we compiled from the literature for NOCA (Table 1).
Our geomorphic analysis was based on plotting, on a loglog scale, S (as tan(θ ) and CA pairs of each DEM grid cell in NOCA, cells within mapped debris avalanches (including depositional areas), and the most likely source areas of landslides identified as the single highest-elevation grid cell within each mapped debris avalanche (Fig. 5). The general trend in the S-CA relationship is acquired for all grid cells of NOCA as well as debris avalanche (DA) cells by binning the data with respect to CA and calculating the mean S for each CA bin. The negative linear relation in the log-log plot suggest a power-law scaling in the form of S∼CA −B where B is the slope of the S-CA relation on the log-log domain, which reflects channel longitudinal profile concavity. Con- cavity is generally associated with fluid-driven processes, while the degree of concavity is tightly related to the nonlinearity of fluvial transport with respect to S and CA (Roering et al., 1999;Montgomery, 2001;Stock and Dietrich, 2003;Istanbulluoglu, 2009). Based on the scaling transitions that mark changes in concavity, process domains interpreted in Fig. 5 are (1) a hillslope zone where slope-dependent processes such as dry ravel and soil creep dominate, leading to convex slopes, (2) a landsliding zone where pore-pressuredriven slope failures introduce concavity as landslides arise with shallower slopes as recharge CA grows, (3) a debris flow or saturated landslide zone in headwater channels, where mass wasting processes in saturated ground evolve into highconcentration transport (Iverson et al., 1997), and (4) a fluvial region (Montgomery and Foufoula-Georgiou, 1993;Tucker and Bras, 1998). Debris-flow-dominated slopes were shown to exhibit reduced concavity relative to channels and porepressure-driven landslide zones in the S-CA domain (Montgomery and Foufoula-Georgiou, 1993;Tucker and Bras, 1998;Stock and Dietrich, 2003).
A threshold CA of approximately 1 km 2 and a slope threshold of θ = 17 • generally separates colluvial mass wasting and debris transport processes from fluvial processes ( Fig. 5; see also Legg et al., 2014). Nearly all grid cells within mapped debris avalanches plot to the left of the 1 km 2 dashed line. An average θ value of 17 • may also correspond to a low end of a slope threshold for landsliding. Fully saturated cohesionless soils are unconditionally stable at tan(θ) ≤ 1/2tan(∅) (i.e., half of ∅), assuming a ratio of water to saturated soil density of 0.5 (e.g., Montgomery and Dietrich, 1994). Solving for ∅ when θ = 17 • gives 34 • , generally consistent with selected ∅ values from soil texture (Table 1) (Hammond et al., 1992). Approximately 85 % of NOCA landscape lies above θ > 17 • , suggesting a dominant role of mass wasting processes in this landscape. We included areas above this slope threshold in our landslide model domain.
The red saturation curve is calculated as aR/T , where R/T is calibrated to 0.0005 m −1 (e.g., a/sinθ = 2000 m) to capture most landslide source cells (left of curve) and a scaling break in the binned S-CA plot (Fig. 5). The saturation curve partitions the landscape into partially saturated (left) and saturated (right) areas, which generally delineates the S-CA pairs separating landsliding from debris flow tracks that form under full soil saturation. For a T = 10 m 2 d −1 , R is 5 mm d −1 , which is within the range of the lowest maximum annual modeled recharge values in most of the study area, indicating that the plotted saturation line could reasonably map regions that experience saturation annually.
The three lines stacked vertically (i.e., cyan, green, and pink) plot the solution of S in the infinite-slope stability equation (Eqs. 1a and 2) as a function of CA, and given FS = 1, R/T = 0.0005 and ∅ = 34 • and select values of dimensionless cohesion, C * . Conditioned on the C * value, slopes that plot above the S-CA solution are unstable. Consistent with the binned S-CA data, the solution of the infinite-slope stability equation curves down as a function of CA, and following soil saturation, a constant instability S threshold is reached. Root cohesion is approximately 6 kPa for C * = 0.3 (middle green line) and 12 kPa for C * = 0.6 (upper pink line), assuming a soil depth of 1 m and cohesionless soil. These root cohesion values are reasonable for shrub and mature forest vegetation found in the literature (Table 1) and they facilitate stability with steeper slopes. When C * = 0 (bottom cyan line), landslides initiate at lower slopes than when cohesion is greater. This solution also envelops the low slope end of nearly all landslide source S-CA pairs identified from debris avalanche data. Only a small portion of the unstable areas plot above the C * = 0.6 solution of Eq. (1a), which implies areas with higher root cohesion.

Modeled soil depth
We ran the soil evolution model described in Sect. 2.4 at a population of representative topographic conditions and vegetation types (forest, shrub, herb) instead of running the simulations over the whole study domain. Capitalizing on the S-CA analysis (see Sect. 4.1.1), local θ ( • ), CA, and Curv triplets in each of the CA bins are used from the landscape dominated by colluvial transport processes (θ > 17 • and CA ≤ 1 km 2 ). In order to further classify landscapes within each CA bin, θ and Curv pairs are grouped into shallow (θ ≤ the 10th percentile θ ), moderately steep (between 10th and 90th percentiles of θ), and steep (θ ≥ the 90th percentile θ ) slope classes. Within each class, θ and Curv are averaged. This led to 53 triplets used for the soil evolution model, with the assumption that landslides do not significantly change local θ and Curv, implying long-term equilibrium conditions. The model is run for 10 000 years to represent the postglacial landscape (i.e., roughly the current interglacial period or Holocene) using the calibrated parameters listed in Table 2.
Local erosion is calculated within the soil evolution model. Calibration of the soil evolution model was performed by adjusting model parameters from the literature (e.g., Tucker and Slingerland, 1997;Nicótina et al., 2011) and comparing the mean annual rock erosion rate estimated by the model to long-term average rock erosion rates published for the Cascade Mountains, which range from 0.02 to 0.5 mm yr −1 over roughly the last several Myr (Reiners et al., 2002(Reiners et al., , 2003 and slightly higher rates over the last millennia of 0.08 to 0.57 mm yr −1 (Moon et al., 2011). In addition to published erosion rates, the resulting soil depths were compared to the SSURGO-SD, which ranged from 0.09 to 2.01 m across NOCA.
In Fig. 6 we show modeled mean annual erosion rates in relation to the mode of modeled soil depth (M-SD) for a steep and moderate slope class and illustrate the local variability in modeled soil depth under forest and shrub conditions. The relative frequency histogram of local soil depth resembles a triangular distribution, with mode values generally higher than mean values, indicating a negatively (left) skewed distribution for soil depth (Fig. 6a, c). Therefore, there is a higher frequency of deeper soil relative to shallower soils. Soil creep fills hollows, thickening soils, as FS gradually drops, leading to episodic landslides that evacuate sediment (Fig. 6b, d).
Both θ and Curv have been found to be correlated with soil depth (Heimsath et al., 1997;Braun et al., 2001;Mitchell and Montgomery, 2006;Hren et al., 2007). A multivariate nonlinear regression in the form of y = β 1 · x m 1 + β 2 · x 2 + C was fit to the mode of soil depth (predictand, y) given θ and Curv (predictors, x 1 and x 2 ) for each vegetation type with R 2 > 0.9 for all slope classes (not reported). Maps for the mode of M-SD were developed over the portion of the NOCA domain by applying the regression equations using the distributed θ and Curv and vegetation type at each grid cell. Minimum soil depth was set at 0.005 m, and maximum soil depth was set to 2 m. Outside the colluvial transport process domain are conditions outside the regression analysis; therefore, vegetated areas were assigned a depth of 0.5, 1, and 2 m for herbaceous, shrubland, and forest, respectively, to generate a contiguous soil depth map for NOCA consistent with SSURGO. Areas with barren land cover were assigned a soil depth of 0.05 m, representing the minimum range of modeled herbaceous areas. Developed areas were assigned a value of 0.5 m. Areas assigned fixed values are about 2 % of the model domain. The evolved soil depth was also used to revise T , using the K s provided by SSURGO, which provides a more distributed continuous field of T . The revised T map is used when Landlab is run based on the mode from M-SD.
M-SD exhibits substantially more spatial variability than the SSURGO-SD (Fig. 7). While both spatial soil depth distributions have similar median values, M-SD has a wider distribution with a higher proportion of shallower and deeper soils than SSURGO-SD. In general, the M-SD is shallower than SSURGO-SD on steeper, convex hillslopes with herbaceous or shrub vegetation and deeper on gentler, concave hillslope with forest vegetation. For both datasets, soil depth is deeper in the valleys and shallower near the ridge tops (Fig. 7c, d), consistent with other studies (Anagnostopoulos et al., 2015;Montgomery and Dietrich, 1994).
The maximum and minimum soil depth parameters of the triangular distribution were obtained by analyzing soil evolution model results. At most θ , CA, and Curv triplets used, a landslide occurred at least once over the modeled duration. As described in Sect. 3.2.1, given the negatively skewed nature of the temporally evolved soil depth (Fig. 6a, c), the maximum soil depth parameter of the triangular distribution was set equal to 10 % of the mode in all model simulations. Two scenarios for the minimum parameter of the triangular distribution were used to reflect soil depth uncertainty for contemporary and long-term conditions. In the first case, we set the minimum parameter as 70 % of the mode. The LandslideProbability model was run for this scenario for both SSURGO (SSURGO-SD) and M-SD input. In the long-term scenario, the minimum soil depth was set to 0.005 m, reflecting bedrock scour conditions by landslides. We argue that this assumption implicitly introduces a temporal uncertainty component to soil depth, which may be used to more accurately estimate the landslide return period over the long term. The model run was called M-SD LT for this case.

Probability of failure
The modeled annual probability of the failure of shallow landslides, P (F ), for NOCA simulated by the Landlab Land-slideProbability component using SSURGO-SD and two M-SD scenarios is shown in Fig. 8. In each model run, 3000 values were sampled (i.e., iterations) for model parameters at each grid cell in the Monte Carlo simulations. P (F ) derived from simulations exhibits low probabilities where slopes are moderate and cohesion is high (e.g., forest). Highly unstable areas largely correspond to steep barren landscape (13 % of the model domain) mostly located below retreating alpine glaciers, with steep glacial landforms, transitioning from glacier to colluvial processes (similar to Guthrie and Brown, 2008;Tarolli et al., 2008;Legg et al., 2014) (Fig. 9). These areas with a thin veneer colluvium, except for moraines, appear to be "continuously sliding" (Borga et al., 2002) or "chronically unstable" (Montgomery, 2001). Frequent slides impede the colonization of vegetation ( Dietrich et al., 1995;Istanbulluoglu and Bras, 2005). Slides in barren areas were not completely included in our landslide inventory as they do not pose major risks to humans and infrastructure.
Other locations of higher P (F ) are located in topographic hollows (Figs. 8,9). These converging areas accumulate deeper soils, which decreases the effectiveness of root cohesion, and enhance pore pressure through the convergence of subsurface flow (Dietrich et al., 1995). Converging areas often correspond to the upper portions of mapped debris avalanches, which clearly display higher landslide probabilities than the run-out portions downstream. Thus, the landslide probability visually appears to capture the source area of debris avalanches.
Substantial differences between P (F ) derived with different soil depth maps are evident (Figs. 8 and 10) and corroborate previous studies showing the influence of various soil depth estimates on landslide susceptibility (Dietrich et al., 1995;Okimura, 1989). In general, probabilities are higher and more spatially extensive when the model is parameterized using SSURGO-SD compared to both M-SD scenarios.
To investigate the spatial distribution of P (F ) in relation to soil depth, we plot the cumulative distribution of P (F ), referred to as the fraction of modeled area where P (F ) is less than or equal to a given value, for each simulation (Fig. 10a). We present our general observations of the spatial distribution of P(F) in the order of SSURGO-SD, M-SD, and M-SD LT as depicted in Fig. 8. Simulations show approximately 26, 38, and 49 % of the modeled domain (79 % of NOCA, where θ > 17 • ) as stable (i.e., P (F ) = 0) under the current vegetation cover and climate. We refer to these sites as unconditionally stable (i.e., stable even when saturated, and with minimum C and ∅ sampled) (Pack et al., 1998;Montgomery, 2001). A bimodal spatial distribution for P (F ) is evident (Fig. 10a, b). Areas with low probabilities, around P (F ) ≤ 0.1, dominate the spatial distribution of P (F ), manifested in a steep rise in the fraction of area from P (F ) = 0 to P (F ) = 0.1 (Fig. 10a). For P (F ) ≤ 0.1 (RP ≥ 10 years), the order of aerial cover for the model domain, including the stable regions, is 72, 85, and 87 %. When the unconditionally stable areas are excluded, the percentages become 46, 47, and 38 % for the three soil depth products used. This region approximately marks the first peak of the relative histogram of P (F ) (Fig. 10b).
In the broad 0.9 > P (F ) ≥ 0.1 range, the increase in the fraction of area with P (F ) is gradual especially for the two M-SD simulations (Fig. 10a). In the highly unstable regions, with P (F ) ≥ 0.9 (RP ≤ 1.1) as mapped in Figs. 8 and 9, the fractional area begins to rise again in all simulations (Fig. 10a). P (F ) = 1 occupies 11 and 7 % of the modeled area in the SSURGO-SD and M-SD simulations, which can be conceptually named as unconditionally unstable (i.e., un-stable even when dry and with the highest combinations of C and ∅ sampled) (Pack et al., 1998;Montgomery, 2001). The model run using the M-SD LT soil scenario shows a smaller area percentage, ∼ 6 %, with P (F ) ≥ 0.9, while SSURGO- SD and M-SD had 16 and 10 %. The M-SD LT soil scenario provides a more realistic estimate as some locations are not likely to produce slope failures annually due to limited soil development. The second peak of the relative frequency histogram of P (F ) appears when P (F ) > 0.9, largely associated with postglacial barren lands with steep mountain slopes and converging topography, especially in the case of SSURGO-SD (Fig. 10b). Dominant factors that control the relative fre-quency of P (F ) are labeled in Fig. 10b and further discussed in subsequent sections.
We expressed the annual probability of landsliding in the form of a RP, plotted with respect to the fraction of area for all three simulations, and mapped RPs for the M-SD LT scenario in Fig. 11. The M-SD LT reduces the probability and increases the return period estimates of landslide initiation, revealing the influence of long-term memory of landsliding on the probability distribution of soil thickness ob- tained from the soil evolution model. Therefore, the M-SD LT scenario would better suit the definition of RP, while the other two simulations provide reference for relative comparisons. In general and in concert with the P (F ), landslides at nearly all RPs affect a greater proportion of the domain when SSURGO-SD is used. Approximately 28 % of the model domain is simulated to have a landslide return period of less than or equal to 10 years (i.e., P (F ) ≥ 0.1 or frequent slides) based on SSURGO-SD, compared to half as much area, 15 %, for simulations using M-SD; M-SD LT had slightly less at 13 %. Low return periods (i.e., < 10 years) coincide with steep slopes in barren areas that show chronic landsliding, low-cohesion vegetation type, such as herbaceous plants, and some steep hollows.
At the high end of the return period, 46 % of the model domain was simulated to have landslides with a return period of ≥ 500 years for the SSURGO-SD scenario, including stable areas, compared to 52 and 70 % for model runs that used the M-SD and M-SD LT scenario, respectively (Fig. 11). High return periods (i.e., RP > 500 years, P (F ) < 0.002) are found where slopes are gentler, on divergent topography, and in forested areas. The fraction of the model domain with a landslide return period between 100 and 500 years is 10, 18, and 21 % for SSURGO-SD, M-SD, and M-SD LT, respectively, showing a larger fraction in the M-SD products. These landslide frequency rates relate to long-term averages, and the actual failures are likely to be clustered in space and time depending on triggering event and the time since the last landslide at the same location (Guthrie and Evans, 2004).
As soils in landslide locations are formed by sediment accumulation from surrounding hillsides and weathering of the local bedrock, landslides can be the main source of denudation across landslide-prone regions. The expected values of the mean annual denudation rate are approximated by the spatial mean of P (F ) · h s /(ρ r /ρ s ) for each simulation. This gives a spatial average of the long-term denudation rates due to landslides as 51.9 mm yr −1 , 7.06 mm yr −1 , and 5.04 mm yr −1 for SSURGO-SD, M-SD, and M-SD LT scenarios, respectively. While these rates are higher than the reported mean annual denudation rates in this region over the last millennia of 0.08 to 0.57 mm yr −1 (Moon et al., 2011), M-SD LT clearly gives the closest estimates to observations among the three soil depth scenarios. Over 1 order of magnitude variation in denudation rates is also common as part of long-term records of erosion rates (e.g., Molnar, 2004).
A critical question that remains is the following: what are the dominant controls that lead to the bimodal distribution of landslide probability in the modeled domain? First, we examined whether topography alone, represented by S and CA pairs, can explain this behavior. The S-CA data pairs from each model grid cell are colored by the value of P (F ) in the order from low to high value using output from the M-SD LT scenario (Fig. 12). As slopes get steeper (S > 0.45 or 24.2 • ), a relatively rapid increase in P (F ) in relation to slope from P (F ) = 0.4 to 1.0 can be seen, surrounded with lower probabilities. CA does not seem to impose a visually detectable increase in P (F ), which is likely largely due to the wet climate in the region. The landslide source cells identified from 66 R. Strauch et al.: A hydroclimatological approach to predicting regional landslide probability . This suggests that the majority of the domain with similar pairs of S and CA exhibits lower landslide probability, which can be largely attributed to the spatial distribution and influence of vegetation type and soil depth (e.g., Roering et al., 2003). We investigated the roles of vegetation, slope steepness, and soil depth on P (F ) in relation to elevation (Fig. 13). From low to high elevations, vegetation changes from predominantly forest (elevation < 1400 m) to coexisting shrub, herbaceous plants, and barren land (1400 to 2200 m) as a result of elevation-dependent ecoclimatic controls (e.g., temperature) on vegetation survival and growth (Fig. 13a). In this region of ecosystem transition, the mean P (F ) shows a persistent increase from 1400 m until a maximum is reached between 2200 and 2400 m, depending on simulation (Fig. 13b,  c). Observations of debris avalanche by elevation confirm the pattern of P (F ) dependence on elevation in relation to ecosystem change; 75 % of the extracted landslide initiation zones from mapped debris avalanches are located between 1200 and 2000 m (Fig. 3b). In the 1400 to 1900 m elevation range of the ecosystem transition zone, mean slope is relatively constant ∼ 0.75 m m −1 (∼ 37 • ) and rises up to 0.9 m m −1 (42 • ) between 1900 and 2200 m (Fig. 13c), consistent with the binned-averaged slopes of the landslide source area in the S-CA plot in Fig. 5. Mean soil depth begins to drop in both SSURGO and modeled soil depth products above 2200 m.
These model results confirm the strong control of ecosystem transition on landslide activity in the region. Below about 1400 m (∼ 40 % of NOCA), forested vegetation combined with deeper soils and moderate slopes keeps P (F ) low. In the 1400 to 2200 m range, loss of root cohesion with ecosystem transition combined with a gradual increase in landscape slopes contributes to increased P (F ). Above 2200 m elevation, soils become very shallow and slopes exhibit the steepest angles in the modeled domain. This combination leads to the largest variability in P (F ), combining the highest P (F ) values (P (F ) ≥ 0.9) mostly attributed to barren areas (∼ 6 % of the model domain in the M-SD LT scenario), with lower P (F ) values where thinner soils reduce the driving force within Eq. (1a). Total cohesion has been found to affect FS estimates more on thin soil than on thick soils (Hammond et al., 1992). The sensitivity of FS to cohesion is even more pronounced on steep slopes, especially when saturated (Sidle, 1984). Forest vegetation has also been found to stabilize slopes through the hydrological process or root water up-take and transpiration, which leads to drier soil conditions (Arnone et al., 2016b). In aggregate, thinner soils at higher elevations lead to lower mean P (F ), which we referred to as soil depth control (see also Sidle, 1984). The general contribution of elevation to the spatial organization of P (F ) is labeled in Fig. 10b.

Model evaluation
The performance of a landslide model is often based on its ability to capture existing mapped landslides. We statistically evaluated our model using receiver operating characteristics (ROC) (Fawcett, 2006) and success rate (SR) curves (Bellugi et al., 2015).
We limited our performance assessment to the source areas of the mapped debris avalanches. Source areas of debris avalanches were not mapped separately from the remaining debris avalanche features (i.e., transition and deposition zones), hindering the evaluation of model predictions (Tarolli and Tarboton, 2006). Source areas we identified in relation to elevation (4318 samples) were treated as observed landslide source cells during the validation of the landslide probability using ROC and SR performance metrics. In this validation, we excluded barren areas with slopes ≥ 37 • (∼ 5 % of the model domain), which characterizes slopes of active smallscale dry landslides (failure depth ≤ soil depth) more appropriately represented by nonlinear hillslope diffusion models (see Roering et al., 1999;DiBiase et al., 2010;Pelletier et al., 2013). For comparison of P (F ) with source area cells, we randomly sampled 50 000 grid cells outside mapped debris avalanches (∼ 2 % of the modeled domain), similar to the number of grid cells within the entire mapped debris avalanche areas. We recognize that the areas outside mapped debris avalanches have the potential to be unmapped landslides, other landslide types, or unstable areas lacking a triggering event; therefore, we interpret the test results conservatively.
ROC curves were used to examine how our model compares with randomly distributed landslides over the landscape. These curves are constructed from confusion matrices generated from comparisons between observed and modeled landslides, based on varying P (F ) thresholds (e.g., 0.1, 0.2, 0.3, etc.). Details on calculating the metrics used to generates these curves have been provided elsewhere (see Mancini et al., 2010;El-Ramly et al., 2002;Anagnostopoulos et al., 2015). A better-performing model will exhibit a curve toward the upper left of a false positive rate (x axis) and true positive rate (y axis) plot. A 1 : 1 line in the plot represents a trivial model that randomly assigns stable and unstable cells. The area under the curve (AUC) generated by the ROC curve quantifies the performance of a model for identifying landslide and non-landslide locations. The AUC statistic represents the probability of correctly ranking a landslide and non-landslide pair randomly selected from those two datasets (Hanley and McNeil, 1982). SR curves are similar to ROC curves, but plot the fraction of landscape predicted as unstable (x axis). Again, a relatively well-performing model would plots farther away from the 1 : 1 line, representing a trivial model.
For this comparison, we used the same datasets used in the cumulative probability analysis discussed Sect. 4.2. Both simulations using SSURGO and M-SD modeled 10 % source areas and non-landslide areas better than a random selection, as demonstrated by the curves plotted above the 1 : 1 line (Fig. 14). However, the model's strength in the classification is modest, as indicated by the AUC values of between 0.60 and 0.61 compared to an AUC of 1 representing a perfect classification. The Transient Rainfall Infiltration and Grid-Based Regional Slope-Stability Analysis -Probabilistic (TRIGRS-P) landslide model tested by Raia et al. (2014) found higher AUC results (i.e., 0.65 to 0.73). However, their study tested small areas (3 to 6 km 2 ) that were well-studied locations with detailed inventories of landslides resulting from one or two winter rainfall seasons, and the entire landslide was tested rather than source areas only.
ROC and SR curves provide an indication of how well the modeled simulations of P (F ) classify both observed landslide source cells and non-landslide grid cells compared to random classification. The crossing of ROC and SR curves in the simulations with M-SD (Fig. 14) implies that at higher probability thresholds, simulated probabilities delineate more false alarms (e.g., areas outside DAs that are unstable) than capturing source areas. This may be indicative of the high probability values at high elevations even outside the debris avalanches where vegetation is sparse, as was indicated above in the analysis of cumulative distribution plots. We found for our case study that the optimal probability threshold to maximizing landslides captured and minimizing false alarms (i.e., the point around the apex of the ROC curves) declines by half depending on the simulation: P (F ) ≥ 0.008 (i.e., RP ≤ 125 years) for SSURGO-SD, P (F ) ≥ 0.004 (i.e., RP ≤ 250 years) for M-SD, and P (F ) ≥ 0.002 (i.e., RP ≤ 500 years) for M-SD LT.
The modeled potentially unstable landscape has generally been greater than observed landslides when infinite-slope stability models are calibrated with limited observations (Sidle and Ochiai, 2006;Baum et al., 2010). As pointed out by Borga et al. (2002), concluding that there is an "overrepresentation" of areas potentially subject to shallow landsliding can be misleading because the absence of mapped landslides does not necessarily indicate an absence of landslide hazard over time across the landscape. Locations with high landslide probability outside mapped landslides in both simulations could be indicators of where to conduct additional investigations for missed landslides or areas on the verge of failing.
Validating hazard maps is challenging, especially in large areas of remote mountainous regions because inventories are typically incomplete and lack the date of landslide occurrence, different landslide types likely have different meteo- rological triggers, environmental conditions change after a landslide event, and unidentified high probability areas may fail in the near future even though they appear to be stable during an inventory (van Westen et al., 2006;Tarolli and Tarboton, 2006). Additional evaluation of model performance would benefit from field investigation in areas of high and low modeled P (F ) to identify any landslides or instability that may have been missed during the original inventory. Future work that couples the volume of sediment available for landsliding will lead to further improvements in estimating hazards and potential impacts from landslides.

Model limitations
For model design and computation efficiency, we made several simplifying assumptions. We neglect groundwater leakage to the bedrock in recharge estimation and apparent soil cohesion through the effect of surface tension in unsaturated zones (e.g., Lepore et al., 2013), both of which could be added to future updates to the component. Tree and snow surcharge is also disregarded, although it may have some stabilizing effect where soils are shallower than 1 m (Hammond et al., 1992). Our approach does not simulate the actual number of landslides, the landslide type, or the size of the landslide because the discretized nature of the failure field precludes specific knowledge of which and how many grid units may be involved in a failure at a particular time. These model omissions present opportunities for future customization of the component or coupling with other models. Modeled probability does not capture the run-out of debris avalanches, which can travel considerable distances in steep mountainous environments. Some unexpected results depicted higher probability in run-out portions of some debris avalanches when using SSURGO-SD, but these probabilities were lower when M-SD scenarios were used (e.g., Fig. 8, middle zoomed-in panels). Mis-mapping of probabilities of failure and observed landslide is likely attributed to variations in soil depth, material properties, and hydrologic routing (Schmidt et al., 2001). Model variables such as slope derived from DEMs developed with post-landslide mapping can also contribute to reduced probabilities in observed landslides where slope and soil depth were reduced. Furthermore, inventories over broad areas are challenging as landslides are isolated processes that may occur with regularity but may not be large in size (van Westen et al., 2006). Finally, steadystate flow, which we used for subsurface flow, neglects transient processes and roles of macropores. Macropores from decayed roots or animal activity can be important in transporting water relatively quickly from the surface to deeper soil layers and groundwater (Sidle et at., 2001;Gabet et al., 2003;Beven and Germann, 2013).

Conclusion
We develop a regional model of probabilistic shallow landslide initiation based on the infinite-slope stability equation coupled by steady-state subsurface hydrology driven by groundwater recharge. Uncertainty in model parameters is explicitly accounted for through Monte Carlo simulation. A geomorphic soil evolution model provides a spatially distributed soil depth alternative to homogeneous patches of soil depths provided by SSURGO. This feature allows the landslide model to be used where soil depth information is un-certain, sparse, or absent. Our model workflow developed in Landlab  is made up of a landsliding component, a Landlab utility for hydrologic data processing, and a model driver that runs the component. The model driver can be run on personal computers or online via HydroShare through cloud computing creating reproducible results. Our approach demonstrates the following points.
-Regional maps of landslide hazard produced with three different soil depth scenarios reveal alternative simulations of probability of landslide initiation, reflecting the importance in soil depth in landslide hazard prediction.
-Simulations using SSURGO-SD returned a higher probability of failures and shorter return periods than simulations using modeled soil depth products (M-SD and M-SD LT). The M-SD LT simulation further reduces the probability of failure and increases the return period. Mean annual denudation estimates from the M-SD LT scenario show closer estimates to published rates of denudation over the last millennia than the other simulations.
-The SSURGO-SD scenario provides a short-term tool for high-risk planning using conservative estimates of probability of failure, while M-SD LT provides longterm estimates arguably more consistent with landslide frequency in the region and useful for the management of ecosystems and aquatic habitats and the estimation of sediment budgets for watershed planning.
-Elevation-dependent patterns in the probability of landslide initiation show the stabilizing effects of forests at low elevations, an increased landslide probability with forest decline at mid-elevations (1400 to 2400 m), and soil limitation and steep topographic controls at high alpine elevations and in post-glacial landscapes. These dominant controls manifest themselves in a bimodal distribution of spatial annual landslide probability and peaks controlled by highly stable forested and chronically unstable post-glacial domains and other barren areas. This suggests that potential declines in forest cover with climate change could lead to widespread landslide activity.
-Model confirmation with limited observations revealed similar model confidence for the three hazard maps, suggesting suitable use as relative hazard products. Validation of the model with observed landslides is hindered by the completeness and accuracy of the inventory, estimation of source areas, and unmapped landslides.
-Our shallow landslide hazard model provides regionalscale estimates of the relative annual probability of shallow landslide initiation as well as the landslide return period, which is useful for civil protection through land use planning to minimize geohazard consequences from precipitation triggers.
Code and data availability. To facilitate ease of use of the landslide hazard model, we developed the landslide model as a component of Landlab, an open-source Python toolkit for two-dimensional numerical modeling of earth surface dynamics available at GitHub: http://github.com/landlab/landlab . Documentation, installation instructions, and software dependencies for the entire Landlab project can be found at http://landlab.github.io/. The Landlab project is tested on recent-generation Mac, Linux, and Windows platforms using Python versions 2.7, 3.4, and 3.5. The Landlab modeling framework is distributed under an MIT opensource license. A component user manual and driver scripts for the application of the Landlab LandslideProbability component can be found at https://github.com/RondaStrauch/pub_strauch_etal_esurf or in Strauch et al. (2018). Online access to the Landlab LandslideProbability model is freely provided through https://www.hydroshare.org, where data and code drivers are available to demonstrate and explore the model using interactive IPython notebooks in a JupyterHub. Thus, users can access, test, adapt, and apply the landslide model for their area of interest without downloading Landlab or the components. The data and driver code used in this analysis are available at Hy-droShare (Strauch et al., 2017(Strauch et al., , 2018. Existing demonstration driver codes can be adapted to fit data provided in raster format by the user to create distributed data fields used as variables in the component. Instructions for accessing HydroShare and the online demonstrations, codes, and data used in this paper are provided in the Supplement.
and Spatial Studies through their maintenance and support for our use of the ROGER Supercomputer at the National Center for Supercomputing Applications (NCSA) at the University of Illinois at Urbana-Champaign.
Edited by: Greg Hancock Reviewed by: two anonymous referees