Bumps in river profiles : uncertainty assessment and smoothing using quantile regression techniques

The analysis of longitudinal river profiles is an important tool for studying landscape evolution. However, characterizing river profiles based on digital elevation models (DEMs) suffers from errors and artifacts that particularly prevail along valley bottoms. The aim of this study is to characterize uncertainties that arise from the analysis of river profiles derived from different, near-globally available DEMs. We devised new algorithms – quantile carving and the CRS algorithm – that rely on quantile regression to enable hydrological correction and the uncertainty quantification of river profiles. We find that globally available DEMs commonly overestimate river elevations in steep topography. The distributions of elevation errors become increasingly wider and right skewed if adjacent hillslope gradients are steep. Our analysis indicates that the AW3D DEM has the highest precision and lowest bias for the analysis of river profiles in mountainous topography. The new 12 m resolution TanDEM-X DEM has a very low precision, most likely due to the combined effect of steep valley walls and the presence of water surfaces in valley bottoms. Compared to the conventional approaches of carving and filling, we find that our new approach is able to reduce the elevation bias and errors in longitudinal river profiles.


Introduction
Rivers play a dominant role in the topographic evolution of the Earth surface and possibly other planetary bodies (Hack, 1957;Howard, 1998;Whipple et al., 2013).They transfer sediment from mountains to depositional basins, set the base level for hillslopes, and convey tectonic and climatic signals across landscapes.The geometry of a river, specifically its longitudinal profile, reflects climatic and tectonic forcing, variations in base level and sediment transport processes, and differences in bedrock erodibility.Gradients along rivers, for example, reflect spatial variations in uplift rates (Whipple et al., 2013;Mudd et al., 2014;Scherler et al., 2014) and indicate the extent of past glaciations (Brardinoni and Hassan, 2006).Moreover, they can act as predictors for the zones of erosion and sediment accumulation during extreme events (Devrani et al., 2015) and reflect the repeated impact of mass-wasting events (Korup, 2006).Longitudinal river profiles and metrics derived from them (e.g., the normalized channel steepness metric k sn ; Wobus et al., 2006) have become important tools for studying the topographic evolution of mountain belts and deciphering changes in climate and tectonics (Bishop et al., 2005).
Data on longitudinal river profiles are usually derived from digital elevation models (DEMs; Gonga-Saholiariliva et al., 2011;Wobus et al., 2006).Different sensors and techniques, such as radar imaging, stereoscopy of optical imagery, terrestrial and airborne laser scanning, and structure from motion (SfM), provide DEMs at ever-increasing spatial resolution with unprecedented coverage.However, DEMs contain errors due to uncertainties in data acquisition, interpolation techniques, and spatial resolution (Fisher and Tate, 2006;Purinton and Bookhagen, 2017).Structures such as bridges, culverts, and reservoirs affect longitudinal river profiles derived from DEMs in ways that can either hide features present in reality or introduce patterns that do not represent the actual course of the profile (Schwanghart et al., 2013).Deviations from graded river profiles may reflect signals of climatic or tectonic perturbations, but often they relate to errors and artifacts that generate bumpy river profiles and thus introduce uncertainties that may compromise the interpretation of longitudinal river profiles (Harbor et al., 2005;Hayakawa and Oguchi, 2006;Wobus et al., 2006).
The objective of this study is to characterize and quantify the uncertainties of elevation values in longitudinal river profiles derived from near-globally and publicly available DEMs, including the new TanDEM-X DEM (Wessel, 2016).To achieve this goal, we devise new algorithms (quantile carving and constrained regularized smoothing) that use nonparametric quantile regression for assessing uncertainties and smoothing river profiles.Using lidar DEMs as benchmark data and our new algorithms, we study how longitudinal river profiles derived from globally available DEMs (Table 1) are affected by errors and how these errors depend on the topographic setting.Moreover, we examine the best choice of parameter values for our algorithms to guide their application.Our algorithms will aid the visual interpretation and automated analysis of longitudinal river profiles and have additional applications in hydrodynamic modeling.

DEM artifacts and the longitudinal river profile
DEMs are the product of a processing chain that converts spatially discrete altitude measurements into a simplified representation of the Earth surface (Wechsler and Kroll, 2006).The processing chain typically involves ground-based, airborne, or satellite-based data acquisition, post-processing, and georeferencing.Subsequent interpolation and filtering operations usually convert these measurements into gridded datasets (Fisher and Tate, 2006).Because each processing step contains assumptions and analytical uncertainties, DEMs are not perfect representations of the land surface but contain errors that propagate through DEM-derived products (Fisher and Tate, 2006;Holmes et al., 2000;Reuter et al., 2009;Schwanghart and Heckmann, 2012).
DEM errors are either random or systematic.Random errors typically occur due to uncertainties related to instrument precision.Radar-derived DEMs, for example, include speckle and thermal noise as well as timing and positioning errors (Falorni et al., 2005;Rizzoli et al., 2012).Random errors may also be autocorrelated; i.e., they cluster spatially (Oksanen and Sarjakoski, 2006).Systematic errors are specific to the type of acquisition method.Lens distortions may affect DEMs generated from both photogrammetric and SfM methods (James and Robson, 2014).Foreshortening, layover, and shadowing produce errors and voids in radarderived DEMs (Falorni et al., 2005).Systematic errors are also found in DEMs interpolated from topographic contour lines (Guth, 1999;Wobus et al., 2006).While random errors tend to hide deterministic structures in the data, systematic errors may lead to wrong interpretations inferred from structures not present in reality.
Providers of DEM data usually describe the accuracy of their product through comparison with ground control points.More detailed error analyses, however, reveal the spatial variability of height errors in DEMs (Gorokhovich and Voustianiouk, 2006;Scherler et al., 2008).Global accuracy statistics are thus often insufficient for information about the uncertainties of DEMs (Carlisle, 2005) and may lead to grave underestimation of the uncertainties in DEM-based modeling results (Canters et al., 2002;Hancock et al., 2006;Nardi et al., 2008).
Rivers present only a minor portion of the cells in a DEM, but this portion is particularly prone to certain types of errors (Guth, 2006;McMaster, 2002).Elevation anomalies along the valley bottoms of the Pieniny Mountains in Poland were found to range from −39 to +145 m in the X-band SRTM and in the ASTER GDEM from −52 to +88 m (Czubski et al., 2013;Ludwig and Schneider, 2006), thus vastly exceed-ing reported DEM accuracies (Jarvis et al., 2008;Tachikawa et al., 2011).Low DEM accuracy along rivers and valley bottoms can have several causes.In steep terrain, the shallow viewing angles of radar systems can lead to shadowrelated data loss.Although this problem can be mitigated by combining data from ascending and descending paths, it often cannot be completely eliminated (e.g., Rabus et al., 2003) and leads to systematically higher valley bottom altitudes in the C-Band SRTM DEM compared to the ASTER GDEM (Hayakawa et al., 2008).Difficulties in penetrating through thick riparian vegetation cause additional problems with DEMs derived from radar systems (Baade and Schmullius, 2014) and optical imagery (Gesch et al., 2012).As a consequence, patches of dense floodplain vegetation are likely to increase DEM values along river reaches, particularly where they are narrow.Although lidar-derived DEMs are often comprised of benchmark data in terms of accuracy and resolution, gridded elevations along valley bottoms may still contain errors due to specular reflection and signal absorption by water (Malinowski et al., 2016).
In addition, DEM resolution limits the accurate representation of valley bottom heights.Where valley bottom width is less than twice the pixel size, DEMs may not capture the actual cross-sectional shape correctly (Hengl, 2006).Finally, DEMs are simply unable to represent embankment underpasses such as bridges or culverts (Lindsay and Dhun, 2015).Unless manually or automatically removed, these features appear as positive excursions in the longitudinal river profiles, particularly at high spatial resolution (lidar-derived DEMs) and in human-altered landscapes (Schwanghart et al., 2013;Sofia et al., 2013).
Accurate representation of drainage structures can be obtained by algorithms that interpolate DEMs from scattered data points and contour lines (Hutchinson et al., 2011).Most of the abovementioned errors, however, cannot be avoided during DEM generation and require post-processing steps.Flow path enforcement (Lindsay, 2016) is a class of algorithms to establish flow connectivity between all pixels within a DEM and its edges.While these algorithms primarily adjust planform flow patterns, they can also be used to modify DEM values.For example, the flood filling of sinks ensures that each pixel has at least one neighboring pixel with equal or lower elevation (Fig. 1a).The drawback of this approach is that it creates potentially large flat areas where topographic information is lost, and it leads to unrealistic flow patterns imposed on the filled topography (Soille et al., 2003) that ultimately affect longitudinal river profiles (Watson et al., 2015).In contrast, carving (Fig. 1b) is a procedure to cut through blockages, typically by imposing the constraint that while moving downstream, no pixel is higher than its upstream neighbors (Lindsay, 2016;Schwanghart et al., 2013;Soille et al., 2003).In addition, there are combinations of carving and filling that minimize the costs (i.e., measured by the total amount of DEM modifications) of transforming the DEM during flow enforcement (Lindsay, 2016;Soille, 2004).However, no matter which flow enforcement approach is used, the derived river profiles will most certainly contain zero-gradient sections (Nardi et al., 2008) with abrupt steps at their downstream-facing side.Distinguishing such "artificial riffle-pool sequences" from actual ones is often not easy (Hayakawa and Oguchi, 2006).
Common approaches for removing errors in river profiles include running average filters (Hayakawa and Oguchi, 2006), contour interval subsampling (Wobus et al., 2006), smoothing splines (Harbor et al., 2005), and locally weighted regression (Aiken and Brierley, 2013).However, all of these approaches face similar problems: they fail to remove spikes if the length of the moving window is too small and average out potentially important features if the window is too large (Aiken and Brierley, 2013).Furthermore, conventional smoothing algorithms are typically applied to individual river reaches rather than entire river networks, which may lead to sharp discontinuities at river confluences.Finally, even robust approaches that aim at reducing the influence of individual outliers on the smoothed curve (Aiken and Brierley, 2013) may yield profiles that contain sections with downstream increasing elevation that results from long sections where valley bottom elevations are overestimated.An example of how outliers may generate downstream increasing elevations in profiles smoothed by a running average is shown in Fig. 1c.Harbor et al. (2005) used an iterative procedure that lowers downstream locations and raises upstream locations to correct such sections but thus produced river profiles may not be consistent with the data.
What is thus needed is an approach that (i) filters DEM artifacts from longitudinal river profiles while ensuring downstream decreasing elevations and that (ii) accounts for the fact that river elevations are usually overestimated.Moreover, the approach should be applicable to entire river networks instead of single rivers only and should be flexible enough to deal with breaks in along-river slopes where these actually exist.

The CRS algorithm
We wish to transform a bumpy river profile into a smoother profile so that it ideally contains elevations and along-river gradients that approximate those in reality.Common approaches such as the running average and local regression -possibly combined with flow enforcement techniques -insufficiently cope with systematic biases and outliers and may even produce profiles that increase in the downstream direction (Fig. 1).Our approach avoids these drawbacks and combines flow enforcement and smoothing in one procedure.
We developed a nonparametric quantile regression that we term the CRS (constrained regularized smoothing) algorithm.Our approach exploits the network topology of river networks (see Appendix A1) and estimates the τ th quantile function Q Z|X (τ ) of the random variate elevation Z con- ditional on the distance X in the upstream direction from the outlet.We chose a quantile regression rather than a least squares approach (see Appendix A2) despite the computational appeal of least squares.Quantile regression makes no assumptions about the underlying probability distribution of the observational noise and is more robust to outliers.Moreover, quantile regression provides the opportunity to obtain a comprehensive estimate of the distribution of Z (Koenker, 2010) and is therefore suited to applications that are interested in more than the mean as a single sample.
We propose a nonparametric approach to regression so that the river profile does not take a predetermined shape.River profiles are usually too variable to be characterized by a functional model (e.g., the stream power model) and any strict assumptions about profile form, for example a concave upward steady-state profile, would limit the range of potential profiles the algorithm could be applied to (Shepherd, 1985).In turn, a nonparametric free-form solution entails the inference of a very large number of parameters.In fact, there are as many parameters as elevation values in the profile such that the problem is ill posed.Regularization refers to the approach of including additional constraints to the solution of the problem.Here we use regularization to encode the preference for local smoothness and spatial autocorrelation (Sivia and Skilling, 2006) of the river profile, an assumption that is vital for any kind of spatial analysis (De Smith et al., 2007).Given that there is usually only one realization of Z for each value of X, assumptions about local smoothness are indeed necessary to make any statements about the distribution of Z(X).However, this requirement can be relaxed if we force profiles to decrease in the downstream direction.Here we first show how such a monotonicity constraint leads to an approach that we term "quantile carving" before we proceed with the CRS algorithm.
Commonly, quantiles are derived by sorting data and determining the values that separate the data into the desired proportions.However, quantiles can also be defined as an optimization problem of minimizing a sum of symmetrically (in the case of the median) or asymmetrically weighted absolute residuals (Koenker, 2010).If the τ th quantile function is Q z (τ ) = I z τ with I being the identity matrix, then z τ is found by minimizing the argument arg min where ρ τ is the loss function, and I is an indicator function that has a value of 1 if (z − Iz τ ) < 0 and 0 otherwise (Koenker, 2010).At this point, however, z τ is not conditional on distance x, but calculated independently for each point along the river profile.Unless multiple samples exist for each location, which is usually not the case, z τ equals z.However, we can explicitly condition z τ by adding prior knowledge about river geometry, i.e., by constraining elevations to be monotonously decreasing downstream (z τ (x) is greater than or equal to any downstream elevation).Formally, this is achieved by minimizing Eq. ( 1) subject to the inequality constraint z τ (x) ≥ z τ (x − δx), which is a minimization problem that can be solved by linear programming methods (Koenker, 2010; see Appendix A3).We refer to this approach of processing the longitudinal river profile as "quantile carving" and show examples of its application in Fig. 2. For quantiles approaching 0 and 100 %, the resulting profiles will follow those of the commonly used hydrological conditioning approaches carving and filling, respectively.Thus, carving and filling are extremal approaches and quantile carving provides a statistical framework that links the two.
A disadvantage of the commonly applied techniques of carving and filling is that they often introduce zero-gradient sections in the profile that are typically separated by artificial steps (Fig. 1).The same applies to quantile carving and disrupts our aim of a realistic representation of the river profile (Fig. 2).A smoother profile is characterized by less local variability and is devoid of sharp kinks.Given a longitudinal river profile, there are many ways of measuring how rough or wiggly the profile is.An intuitively appealing measure is the integrated second derivative z τ (x) 2 dx, which is not affected by the addition of a constant or linear function and has considerable computational advantages (Green et al., 1993).The CRS algorithm expands on quantile carving by determining the cost of a particular solution not only by its goodness of fit, but also by its roughness.This entails additionally minimizing the integrated second derivative so that the optimization involves a quadratic problem: The scalar s determines the weight of the integrated second derivative and is calculated by Eq. ( 4) where x is the spatial resolution of the DEM, n is the number of nodes in the network and p is the number of second derivative equations: Larger values of the smoothing parameter K result in smoother profiles (Fig. 3).We include the squared spatial resolution into the equation so that any linear transformation (e.g., distance scaling) of the river profile does not affect the smoothing results.Yet, this entails that profiles with different spatial resolutions must be smoothed with different values of K to obtain the same or at least similar results.If two profiles A and B have different spatial resolutions x A and x B , then smoothing both profiles will return similar results if the smoothing parameter K B is calculated from K A by Eq. ( 5).
The quadratic term in Eq. ( 3) renders the optimization problem suitable for quadratic programming (Takeuchi et al., 2005).By formulating the CRS algorithm as a quadratic programming problem (see Appendix A4), we obtain high flexibility for setting and relaxing constraints on the resultant profiles.For example, we may want to force elevations at particular locations to take on predefined values that were measured in the field and are precisely known.In other locations, such as waterfalls and rapids, dams, or river confluences, profiles may actually exhibit high curvatures that should remain in the smoothed profile.Provided that these locations are precisely known, the curvature constraint can be locally relaxed or entirely removed by setting variable values of K. Thus, unlike previous approaches, the CRS algorithm allows for the incorporation of prior knowledge about river-profile geometry and independent observations.Longitudinal river profiles of the Big Tujunga catchment (San Gabriel Mountains, USA) derived from the SRTM-1 are shown in Fig. 3.The profile values show large fluctuations that are particularly high in the central part of the profile where hillslope gradients adjacent to the river are highest.The results of the CRS algorithm for different values of K and τ are shown in Fig. 3b-e.Thus derived profiles are monotonously decreasing downstream while filtering the wiggles.The amount of smoothing is dictated by K.For K = 1 and τ = 0.5, the profile contains various steps and these steps remain for different values of τ .A higher value of www.earth-surf-dynam.net/5/821/2017/Earth Surf.Dynam., 5, 821-839, 2017 K more effectively filters the profile, but strongly influences the range between the 10th and 90th percentiles at some locations.The tributary in Fig. 3d and e lacks the small-scale variability such that the interquantile range derived with K = 1 is very low compared to K = 10.Any uncertainty estimate based on the interquartile range must thus depend on K.

Methods and data
Our goal in this study was to (1) assess the quality of near-globally and publicly available DEMs (hereafter termed global DEMs) for the analysis of longitudinal river profiles and to (2) examine the best choice of parameter values of the CRS algorithm and compare its performance with the commonly used approaches of filling and carving.This allows us to report ranges of parameter values that will guide the application of the CRS algorithm in other areas.
We addressed the first point by comparing river profiles obtained from global DEMs (SRTM-1, ALOS World 3D30 (AW3D), ASTER GDEM v2, TanDEM-X DEM; see Table 1, Fig. 4).Our test sites are the Yakima catchment in the undulating to hilly landscape of the Washington Mountains, USA, the Big Tujunga catchment in the hilly to mountainous landscape of the San Gabriel Mountains, USA, and the Kali Gandaki catchment in the mountainous landscape of the High and Lesser Himalayas, Nepal.The TanDEM-X DEM is available to us only for the Kali Gandaki catchment.We first compared the global DEM with those of lidar DEMs, where available, to obtain estimates of absolute errors.The lidar DEMs are bare earth DEMs derived from point clouds (> 3 points m −2 ) and were downloaded from the OpenTopography facility (Table 1).Due to their decimeter to subdecimeter accuracy, we consider the lidar data as our benchmark DEMs.The acquisition dates of the DEMs vary but we expect the temporal changes in the land surface to be minor.We used the reference ellipsoid defined by the world geodetic system WGS 84 as a basis for all DEMs and resampled them to the same spatial extent and resolution (30 m) using bilinear interpolation.We then derived flow directions and stream networks for selected channel heads from the downsampled lidar DEMs and obtained elevations along those networks from the global DEMs.This approach may overestimate errors in DEMs due to resampling and because lateral patterns of stream networks derived from different DEMs may differ.However, 2-D cross-correlation of global DEMs with the lidar DEMs did not reveal any systematic horizontal shifts between the DEMs.Due to the lack of benchmark data, we could not quantify absolute errors for the Kali Gandaki.Thus, we next determined relative errors (precision) along the river profiles by smoothing the profiles with the CRS algorithm.We chose the median (τ = 0.5) to obtain profiles along the local central tendency of the data and set the smoothing parameter K = 5, which we found by visual analysis to be suitable for filtering the data while sufficiently reproducing the structure of the profiles.We quantified the precision using the deviations of the profiles from the CRSsmoothed profiles.Because the study site in Nepal represents a cross section through the Himalayas with pronounced dif- ferences in topographic relief, we also assessed the spatial variability of precision moving down the river.Specifically, we tested whether median hillslope gradients within a distance of 1000 m of the Kali Gandaki exert any influence on the precision of river profiles.
To address the second point of our study, we used a derivative-free simplex search method to find values of the parameters K and τ that yield CRS-smoothed profiles that best approximate the lidar-derived profiles.Since good agreement of profiles is largely determined by both similar elevations and gradients, we defined the objective function f to be minimized as a function of the elevation (z i ) and gradient offsets (g i ) between the profiles: Again, we have to restrict our analysis to the Yakima and Big Tujunga catchments due to the unavailability of benchmark data for the Kali Gandaki catchment.Finally, we analyzed the sensitivity of profiles derived by the CRS algorithm to changes in the values K and τ .Specifically, we studied how variations in both parameters affect the goodness of fit between pixel-by-pixel elevations and alongriver gradients derived from the smoothed profiles and the lidar benchmark DEMs.
We implemented the proposed algorithms in TopoToolbox 2, which is software for the analysis of DEMs (Schwanghart and Scherler, 2014).The software is written in MAT-LAB and thus has direct access to a numerical library of optimization routines usually lacking in standard GIS software.To minimize Eqs. ( 1) and (3), we used the functions linprog and quadprog of the MATLAB optimization toolbox, which solve linear and quadratic problems, respectively (see also Appendix A1-A4).

Results
Elevation offsets between the global DEMs and the lidar data differ between sites and DEM types (Fig. 5).All river profiles are consistently biased towards higher altitudes compared to the lidar data (Table 2), but more so in the steeper terrain of the Big Tujunga catchment compared to the Yakima dataset.The observed bias does not affect the entire DEM but approaches zero on hillslopes (Fig. 6), so we infer that low accuracy particularly affects valley bottoms.Based on kernel-derived probability density distributions, the quantiles of the zero offset are less than the median (except for AW3D in Yakima) and lower in steep terrain.Despite predominant overestimation, negative offsets also exist and have minimum values that range from −3 to −45 m.We observe maximum absolute deviations of up to 49 m that are usually positive as indicated by positive skewness values for the SRTM-1 and AW3D datasets.An exception is the ASTER GDEM in which offsets are inconsistently skewed towards positive and negative values.In general, the differences between the ASTER-GDEM-derived and lidar-derived profiles are highly variable as indicated by a high standard deviation in both sites.
Deviations between the original and CRS-smoothed profiles from global DEMs further highlight errors in profiles obtained from different DEMs at different sites, including the site in central Nepal for which lidar data are unavailable (Fig. 7).With increasing topographic relief, residuals from the median profile are increasingly biased towards poswww.earth-surf-dynam.net/5/821/2017/Earth Surf.Dynam., 5, 821-839, 2017   itive values (Table 3).At the same time, the higher root mean square error (RMSE) shows that increasing relief results in reduced precision.Elevation values from the AW3D dataset show the highest precision at all sites, whereas the TanDEM-X DEM has the highest RMSE (Table 3, Fig. 8).Note that we only took elevation values from the TanDEM-X DEM into account that are flagged to be consistent, as indicated in the consistency mask that is provided together with the DEM data (Wessel, 2016).The above observations beg the question of what topographic parameters control the precision of longitudinal river profiles.Our data from central Nepal show that error magnitudes, estimated as the 95-5 % interquantile range, vary spatially (Fig. 9).In particular, there are sections along the Kali Gandaki where elevations derived from all DEMs have low precision (Fig. 9a, b).Hillslope gradients within 1000 m of distance from the river, measured and averaged within 10 km long river reaches, correlate with precision (Fig. 9c).This indicates that all DEMs have problems in precisely representing the bottom of valleys straddled by steep canyon walls.Again, the precision differs strongly between the DEMs, with  ASTER GDEM and TanDEM-X DEM having the lowest precision.
Based on the lidar data from the Yakima and Big Tujunga catchments, we found different optimal values of K and τ for the different DEMs (Table 4).The values of K and τ are particularly large for the ASTER GDEM, which is quite noisy along the Yakima River; τ values are always less than 0.5 and lower in the Big Tujunga, which is consistent with the percentile value τ 0 (0.01-0.55) determined previously (Table 2).In general, the CRS algorithm reduces the deviations from the benchmark profile by lowering the RMSE by a factor of 1.4-2.0.Compared to profiles derived from the carving and filling algorithms, the CRS algorithm yields considerably lower RMSE.
The improved agreement between the profiles derived from the CRS algorithm is also reflected by a higher pixelby-pixel correlation of the gradients of CRS-derived profiles and those of the benchmark DEMs. Figure 10 shows a comparison of gradients derived from the global DEMs and preprocessed by using different methods for the Yakima catchment.Profiles without preprocessing exhibit the largest scatter.The scatter is most effectively decreased by the CRS algorithm as it reduces the number of zero gradients and artificial steps compared to the methods of filling and carving.
Figure 11 illustrates the results from a sensitivity analysis of the CRS-derived profiles to changes in K and τ for the Yakima catchment and the AW3D DEM.The sensitivity analysis shows that increasing values of K can excessively www.earth-surf-dynam.net/5/821/2017/Earth Surf.Dynam., 5, 821-839, 2017  smooth the profile and thus impair the agreement between the profile elevations and gradients.Very low (< 0.01) and high (> 0.099) values of τ can also lead to grave mismatches of the smoothed profile and the benchmark profile as the quantile approach fails to penalize large deviations as opposed to a least squares approach.The analysis also shows that agreement between elevations or slopes is not necessarily derived from the same set of parameter values.Whereas agreement of elevations is largely obtained by varying τ in a rather broad range of K between 0.1 and 3 (Fig. 11a), agreement of gradients is largely within a narrow range of K and a broad range of τ between 0.3 and 0.8 (Fig. 11b).Fitting profiles to benchmark data may thus significantly depend on whether the objective function minimizes the differences between river elevations or gradients (Fig. 11c).

Positive bias of elevation in river profiles
Different methods of data acquisition and DEM generation have in common that DEM errors along valley bottoms are usually biased towards positive values (Hayakawa and Oguchi, 2006).Our analysis shows that this bias is more pronounced in steep terrain, which calls for caution when interpreting longitudinal river profiles in mountainous landscapes.
The bias may at least partly be due to the coarse resolution that averages over finer-scale topographic variability.As we resampled our benchmark lidar DEM to the resolution of our tested DEMs and there are still large differences to global DEMs, additional causes of the observed bias must exist that relate to the type of data acquisition and field site characteristics (Vaze et al., 2010).While we cannot resolve the underlying causes in detail, our results suggest that biases affect all DEMs, although to different degrees.

Precision of river profiles derived from global DEMs
In addition to the bias, the distribution of elevation errors in the longitudinal river profiles becomes increasingly wider and right skewed in landscapes of higher relief.As relief increases, the profiles obtained from all global DEMs have decreasing precision, but there is a more than fivefold difference between the precision of the different DEMs.In all three sites, the AW3D dataset has the highest precision, followed by the SRTM-1, which is a finding consistent with other DEM assessments (Purinton and Bookhagen, 2017).The precision of the ASTER GDEM is particularly low compared to SRTM-1 (Czubski et al., 2013) and AW3D.However, this may depend on location as the ASTER GDEM quality is sensitive to a number of issues that vary spatially.This includes the number of stacked stereo-pair scenes for the DEM generation, land and cloud cover, water masking, and registration quality (Purinton and Bookhagen, 2017).In our study site in Nepal, the precision of river profiles derived from the TanDEM-X DEM is much lower than the reported 10 m absolute and 2 m relative height accuracy (Gruber et al., 2012b;Purinton and Bookhagen, 2017).Low accuracy along river profiles may relate to water bodies that are generally incoherent areas in the underlying DEM scenes and have height estimates that are known to be noisy and inaccurate (Wessel, 2016).The resolution of 12 m and high accuracy on hillslopes of the TanDEM-X DEM (Purinton and Bookhagen, 2017) will offer new opportunities, but whether and how TanDEM-X DEMs will offer any advantage for the analysis of river profiles in high mountain areas compared to previous DEMs with lower resolution needs further study (Gruber et al., 2012a).

Error distribution and the CRS algorithm
The asymmetric distribution of residuals around the median (and mean) violates the assumptions of methods for filtering height errors that rely on least squares, such as local regression techniques.Moreover, the large absolute values of outliers further complicate the application of methods that rely on averaging (e.g., running average filters).Although positive bias is on average more common, our results show that 10-50 % of the profile values underestimate the actual river elevation.This underscores potential problems with us-ing carving to hydrologically trim DEMs.Our new quantile carving and CRS algorithms implement a quantile regression that remains largely unaffected by skewed data and outliers and thus prevents very low or high values from gaining excessive weight for shaping the longitudinal profile.
Our comparison of river profiles derived from global DEMs with those derived from lidar shows that the CRS smoothing of river profiles can decrease differences to actual river elevations and gradients compared to the commonly used methods of filling and carving.We derived optimal values of the smoothing parameter K and the quantile τ by comparing the smoothed profiles with lidar data that were available for subsets of the data, and this may be a common approach where high-resolution, precise, and accurate data are available for parts of the study area.Our results indicate, however, that optimal parameter values differ depending on whether the optimization's objective function minimizes errors between benchmark and smoothed elevations or gradients.Thus, the choice of objective function depends on is specific uses and is crucial for calibrating the parameters of the CRS algorithm.
Where benchmark data are unavailable, the most suitable value of K depends on the amount of scatter in the profiles, the true roughness (e.g., riffle-pool sequences in bedrock rivers vs. the relatively smooth profiles of alluvial rivers), and the type of application.One-dimensional hydraulic simulations, for example, may retain some of the variability of channel gradients while attempting to avoid the influence of erroneous data and thus choose a low value of K.In many ap-  4. Compared to the original profiles and those obtained by using the preprocessing techniques of filling and carving, the CRS algorithm is able to reduce the differences between along-river gradients as indicated by the narrow interquantile (90-10th percentiles) range.
plications, visual interpretation of profiles will aid the identification of suitable values of K and τ .More generally, it may be pragmatic to let K be as large as possible while satisfying the constraints of the data (Sivia and Skilling, 2006).Although there are ways to determine an optimal value of K by using cross-validation approaches (Garcia, 2010), qualitatively and visually cross-checking may be the most practical approach in many cases: too-small values of K will admit unnecessary and potentially erroneous structures, whereas toolarge values impede good agreement with the actual river profile (Sivia and Skilling, 2006).

Autocorrelated errors
River profiles are derived from measurements that give rise to errors of different types: random and systematic components and artifacts (Reuter et al., 2009).We have shown that the CRS algorithm can efficiently handle random errors and may reduce offsets that arise from systematic or artifactual deviations between actual river profiles and those derived from DEMs, thus improving an overall representation of longitudinal river elevations and gradients.Caution, however, must be taken with autocorrelated errors that we have not addressed here although they may significantly affect river profiles.Autocorrelation entails that if the true elevation at some location is overestimated in the DEM, then the elevation at a nearby pixel will likely also be overestimated (Temme et al., 2009).Autocorrelated errors have important consequences for the choice of smoothing parameters.In the case of short-range dependence, profiles may not be affected severely if a sufficiently large smoothness parameter is chosen.Long-range dependence, however, is neither easy to detect nor are its structures simply removed by using nonparametric regression.For example, it is difficult to ascribe the stepped patterns in Fig. 3b to actual riffle-pool sequences or to artifacts that should be smoothed (Fig. 3c).Although approaches to nonparametric regression exist that are able to cope with autocorrelated errors (Opsomer et al., 2001), their implementation and assessment were beyond the scope of this study.The CRS algorithm cannot differentiate actual steps in the profiles from artificial bumps.It smooths profiles to remove scatter and thus accentuates the actual patterns in the data.The quantile regression technique enables us to derive uncertainty bounds that can support data interpretation by quantifying DEM errors.

Applications and future developments
The CRS algorithm can be used in various fields of research.Primarily, in studies of river profiles one may wish to remove noise from profiles while preserving underlying patterns and quantifying uncertainty.In tectonic geomorphology, for example, knickzones in longitudinal river profiles are essential proxies for transient river adjustment (Bishop et al., 2005), but distinguishing actual knickzones from data artifacts is challenging (Neely et al., 2017).Interquantile ranges determined by the CRS algorithm provide an objective way to determine minimum elevation drops that a knickzone must have to be identified against the background noise.Hydrodynamic simulations in mountainous environments frequently use globally available DEMs, such as the SRTM or ASTER GDEM (Jarihani et al., 2015;Watson et al., 2015), but erroneous river profiles can introduce numerical instabilities (Paiva et al., 2011) and roughness, thus severely limiting the reliability of flood assessment, hazard zonation, and risk management (Watson et al., 2015).In a recent study on flash flood warning and hazard assessment in the Nepal Himalayas, a preliminary version of the CRS algorithm provided an important preprocessing step to improve the accuracy of estimating flow depth, flow speed, and flood wave arrival times (Bricker et al., 2017).
Applying quantile carving and the CRS algorithm is computationally more expensive compared to carving and filling.Run times depend largely on the number of nodes (or pixels) in a river network (Fig. 12).Thus, we applied our analysis to river networks only and not to entire DEMs, although this is possible in principle provided convergent flow networks (O'Callaghan and Mark, 1984) have been derived.Rapid preprocessing of DEMs with our algorithms should apply parallelization strategies to allocate individual drainage basins or reaches to different processors.Future studies may also want to make use of the high flexibility of the algorithms.For example, values of the quantile τ and the smoothing parameter K could be set for individual pixels or river reaches, thus varying the amount of smoothing spatially.This may be useful if the true variability in river gradients shall be retained where these are known to vary strongly, for example as a function of upslope area or hillslope relief (Fig. 9).Finally, the CRS algorithm is not restricted to elevation profiles, but can also be applied to other variables measured or calculated along river profiles, such as steepness (K sn ) or curvatures, both of which are usually prone to even larger uncertainties.

Conclusions
A comparison of longitudinal river profiles derived from coarse-resolution globally available DEMs and lidar data shows that DEMs commonly overestimate elevations along rivers in steep topography.Moreover, error distributions become increasingly wider and positively skewed if adjacent hillslope gradients are steep.Our analysis suggests that the AW3D DEM has the highest precision and lowest bias for the analysis of river profiles in mountain topography.In contrast, the 12 m resolution TanDEM-X DEM has a very low precision, most likely due to the combined effect of steep valley walls and the presence of water in valley bottoms.
To characterize and remove such systematic and random errors in DEMs, we developed the CRS algorithm for correcting and smoothing longitudinal river profiles.The approach adopts a nonparametric quantile regression, handles entire river networks instead of single river reaches, and ensures downstream decreasing elevations.Compared to the conventional approaches of carving and filling, we find that our new approach is able to reduce the elevation bias and errors in longitudinal river profiles.
Code availability.A software implementation of the CRS algorithm is available for the MATLAB software TopoToolbox at https: //github.com/wschwanghart/topotoolbox(Schwanghart and Scherler, 2014).The software allows for the tuning of several parameters.The parameter K dictates the degree of smoothing.The choice of the value of K involves a trade-off between filtering the river-profile variability present in reality and retaining data artifacts.Too-high values of K may introduce new artifacts.Estimating a suitable value of K depends on both the type and magnitude of errors in the data and the type of application.Benchmark data, such as lidar DEMs and precise (differential) global positioning system measurements, can be used to estimate optimal parameter values that provide the best fit between the benchmark data and the DEM.If benchmark data are unavailable, visually cross-checking the results of the algorithm with variable parameters can guide the choice of suitable parameter settings.We developed a graphical user interface in Topo-Toolbox (crsapp) that supports a trial-and-error approach and visual cross-checking.Additional examples and applications of the algorithms can be found at http://topotoolbox.wordpress.com.

Figure 1 .
Figure 1.Detail of a longitudinal river profile and the effects of different techniques to hydrologically correct and smooth the profile.

Figure 2 .
Figure 2. Quantile carving of the longitudinal profile of the Big Tujunga Creek.Quantile carving reconstructs the profile along different quantiles and thus continuously links the two common approaches carving and filling that run along the minima and maxima of the data, respectively.

Figure 3 .
Figure 3. Applying the CRS algorithm to selected streams in the Big Tujunga catchment (a).Panels (b)-(e) show details of the longitudinal river profile and the results for the median and 10-90th percentile for the smoothing parameters K = 1 and K = 10.

Figure 4 .
Figure 4. Sites and longitudinal river profiles in this study.Shaded reliefs are derived from SRTM-1 and colors indicate elevation in meters.

Figure 5 .
Figure 5. Elevation offsets between different DEMs and benchmark lidar DEMs calculated for all river pixels for the Yakima and Big Tujunga dataset (see Fig. 4).

Figure 6 .
Figure 6.The bias of the SRTM-1 in the Big Tujunga catchment approaches zero with increasing distance from the river.

Figure 7 .
Figure 7. Error distributions of longitudinal river profiles along the Yakima, Big Tujunga, and Kali Gandaki calculated by the offset between measured elevations and elevations obtained with the CRS algorithm with τ = 0.5 and K = 5.

Figure 8 .
Figure 8. Profiles derived from different global DEMs for the Kali Gandaki, Nepal.To avoid overlap, we added an offset of 100 m between the different profiles.Horizontal distances along rivers vary and were thus normalized to range between 0 and 1.The inset shows details from the TanDEM-X-DEM-derived profile where the dashed line is the original data, the solid line is the CRSderived profile (K = 10, τ = 0.5), and the shaded area delineates the interquartile range derived with the CRS algorithm (K = 10, τ = 0.25, and 0.75).

Figure 9 .
Figure 9. Planform patterns of DEM errors along the Kali Gandaki (a, b).The interquantile range (IQR) correlates with hillslope gradients adjacent to the river (within < 1000 m of distance; panel c).

Figure 10 .
Figure 10.Comparison of pixel-by-pixel along-river gradients derived from benchmark lidar DEMs (resampled to 30 m) and global DEMs in the Yakima catchment.Negative gradients in the benchmark and unprocessed DEMs are due to erroneously increasing river elevations in the downstream direction.The gray solid lines (thin lines: 10th and 90th percentiles, thick line: median) are derived from a quantile regression.The dashed line is a one-to-one reference line.The parameters of the CRS algorithm (τ and K) are those in Table4.Compared to the original profiles and those obtained by using the preprocessing techniques of filling and carving, the CRS algorithm is able to reduce the differences between along-river gradients as indicated by the narrow interquantile (90-10th percentiles) range.

Figure 11 .
Figure 11.Sensitivity analysis of the CRS algorithm and the smoothing parameter K and quantile τ for the Yakima catchment and the AW3D DEM.(a) Sensitivity of the root mean square error (RMSE) between smoothed elevations and elevations from the lidar-derived benchmark profile.A minimum RMSE is obtained for K = 0.26 and τ = 0.32 (white cross).(b) Sensitivity of the RMSE between gradients derived from the smoothed profiles and the benchmark profile.A minimum RMSE is obtained for K = 0.43 and τ = 0.63 (white cross).(c) Profile of the Yakima and its tributaries in detail (see inset).

Figure 12 .
Figure12.Run time of quantile carving and the CRS algorithm for river networks with different sizes.The number of nodes refers to the pixels in the DEM that constitute river pixels.Tests were run on a Windows 7 system with an Intel Xeon CPU (3.07 GHz, four cores) and 24 GB of installed memory (RAM).

Table 2 .
Statistics of offsets between river-profile elevations derived from different global DEMs and benchmark lidar DEMs; τ 0 refers to the quantile of the profile data in which the offset between lidar and global DEMs is zero.

Table 4 .
Optimal values of K and τ for the CRS algorithm applied to the Yakima and Big Tujunga catchments and different DEMs.The root mean square error (RMSE) is calculated from the deviations from the lidar data.