Supplementary material A: The Bayesian statistics procedure for calibrating the parameters in the modeling system

It has been well know that Bayesian statistics can evaluate the probabilistic risk of water quality management through calibrating unknown model parameters in terms of posterior joint distributions (Qian et al. 2003; Malve and Qian 2006). Bayesian statistics thus has been increasingly applied to calibrate parameters and address uncertainty in water quality model (Zobrist and Reichert 2006; Liu et al. 2008; Faulkner 2008; Shen and Zhao 2010; Chen et al. 2012).

In this study, the Bayesian statistics was applied to calibrate the unknown parameters α (the runoff effect coefficient, dimensionless), βj (TN export potential coefficient for jth land-use category, kg ha-1 per month), γ (the water depth effect coefficient, dimensionless), and η (in-stream TN retention potential coefficient, d-1) in the modeling system (see main article). These unknown parameters are treated as random variables with distributions derived from known information (i.e., ,, Tk, ,, Aj, rk, rm, dk and dm). The Bayes’ theorem can be written as (Qian et al. 2003; Shen and Zhao 2010):

(S1)

where P(θ/X) is the posterior distribution and represents the probability of the unknown model parameter (θ) values given the observed data (X). In our case, the unknown parameter θ refers to α, βj, γ, and η. P(X) is the expected value of the likelihood function over the parameter distribution as a normalizing constant. P(θ) is the prior distribution, the probability density function over all values of θ prior to the observed data. P(X/θ) is the probability density function (likelihood function), which describes the mechanistic and statistical relationships between the predictor (i.e., ) and response variables (i.e., α, βj, γ, and η). The Markov chain Monte Carlo (MCMC) algorithm was applied to obtain the numerical summarization of targeted parameters (Shen and Zhao 2010). There are three major steps in the Bayesian model using the MCMC sampling method (Malve and Qian 2006; Chen et al. 2012): i) formulating the prior probability distributions for targeted parameters, ii) specifying the likelihood function, and iii) MCMC sampling for the posterior probability distributions for targeted parameters.

To obtain the integrated prior distribution of targeted parameters α, βj, γ, and η, a database of the annual TN export coefficients (kg N ha-1 yr-1) for paddy field, dry farmland, residential land and forest and in-stream TN loss rate (d-1) was compiled from the worldwide literature. This resulted in 19–109 records from 45 published studies for the export coefficients and 94 records from 22 published studies for in-stream loss rate (Table 2) (reference sources see Supplementary material C). These studies for TN export coefficients were obtained from field plots, small catchments, large watersheds and national scales worldwide over the past 30 years, representing a broad climate (i.e. subtropical monsoon climate, monsoon climate of mid-latitudes, temperate marine climate, temperate continental climate, etc.) and land-use characteristics (i.e. catchments located in forest, agricultural, and urban areas with different dominant land-use types). These studies for in-stream TN loss rate were obtained from reach segments, single rivers, and entire river system scales, representing a broad hydrology (i.e. with average water depth ranging from <0.1 m to >10 m and average stream discharge ranging from < 0.34 m3 s-1 to > 283 m3 s-1), river morphology, and temperature characteristics. Collectively they incorporated a wide range of spatial and temporal scales. The database of the monthly TN export coefficients for paddy field, dry farmland, residential land and forest was derived from the annual values database through dividing by 12. According to the K–S statistical test, log-transformed monthly export coefficients and sqrt-transformed K for TN were conformably fitted with normal probability distributions (Table S1). These log-transformed normal probability distributions were used as the corresponding prior distributions of TN export potential coefficients for paddy field (βp), dry farmland (βd), residential land (βr) and forest (βf). The sqrt-transformed normal probability distribution was used as the corresponding prior distributions of in-stream TN retention potential coefficient η. There had no prior knowledge available for α and γ, thus their prior distributions were both assumed to follow uniform distributions with the range of (0, 100) (Liu et al. 2008; Shen and Zhao 2010).

Table S1 The prior distributions of seven parameters in the modeling system

Parameters / Mean / S.D. / Distribution / p / n
log(βp) [log (kg N ha-1 per month)] / -0.2710 / 0.5117 / Normal / <0.01 / 19
log(βd) [log (kg N ha-1 per month)] / -0.0987 / 0.5517 / Normal / <0.01 / 109
log(βr) [log (kg N ha-1 per month)] / -0.7160 / 0.4921 / Normal / <0.01 / 36
log(βf) [log (kg N ha-1 per month)] / -0.1047 / 0.5235 / Normal / <0.01 / 51
sqrt(η) [sqrt(d-1)] / 0.5288 / 0.2242 / Normal / <0.01 / 94
α / 0–100 / Uniform / - / -
γ / 0–100 / Uniform / - / -

To specify the likelihood function of α, βp, βd, βr, βf, γ, and η, Eqn. (3), Eqn. (4) and Eqn. (2) were incorporated into Eqn. (S1). The observed stream TN load (, kg per month) was taken as the modeled value with the consideration of random error ε (kg per month), which was formulated as:

(S2)

where ε follows a normal distribution with zero mean and variance of σ2. The variance σ2 was assumed to follow a standard non-informative diffuse inverse-Gamma distribution (1.0×10-3, 1.0×103) (Shen and Zhao 2010). Based on this normally and independently distributed stochastic error term ε, the likelihood function for deterministic function of the modeling system can be expressed as (Shen and Zhao 2010; Chen et al. 2012):

(S3)

where z=72, the number of months over the 6 year for each sub-catchment of the ChangLe River watershed in this case study. Eqn. (S3) describes the properties of the unknown parameters (i.e., α, βp, βd, βr, βf, γ, and η) when the model error (i.e., ε) was minimized, incorporating the uncertainty induced by the input variables. Finally, WinBUGS software (Spiegelhalter et al. 2003), which implements MCMC simulation using the Gibbs sampling method, was used for the Bayesian calibration (see Supplementary material B). To reduce the dependence of the magnitude of unknown parameters on the uncertainty and convergence efficiency, the prior mean values (Table S1) were adopted as the initial values in the WinBUGS software for implementing MCMC simulations with two chains, each with 10,000 iterations.

According to the Bayes’ theorem, it is expected that the modeling performance would be considerably influenced by the prior knowledge. To test the impact of prior knowledge on the parameter estimates, the modeling procedures for the sub-catchment mainstream were also performed under uniform prior distributions for targeted parameters, i.e., βp [0, 10], βd [0, 20], βr [0, 20], βf [0, 10], and η [0, 2], with 10,000 simulations of each chain. Unfortunately, the model seems to fail to be converged, as shown that Monte Carlo errors ranged in 9.5%–48.4% and Rhat ranged in 1.0729–1.7438 for different parameters (at convergence, Rhat≈1, Lunn et al. 2000). Therefore, the prior distributions of TN export coefficients and in-stream TN loss rate offered by in this study, which are derived from a wide range of literatures and incorporate a wide range of climatic and anthropogenic effects, are necessary for targeted parameter estimates in this case study and applicable for many other watershed.

Reference

Chen DJ, Dahlgren RA, Shen YN, Lu J (2012) A Bayesian approach for calculating variable total maximum daily loads and uncertainty assessment. Science of the Total Environment 430:59–67. doi: 10.1016/j.scitotenv.2012.04.042

Faulkner BR (2008) Bayesian modeling of the assimilative capacity component of nutrient total maximum daily loads, Water Resources Research 44:W08415, doi:10.1029/2007WR006638.

Liu Y, Yang P, Hu C, Guo H (2008) Water quality modeling for load reduction under uncertainty: A Bayesian approach. Water Research 42:3305–14. doi: 10.1016/j.watres.2008.04.007

Lunn DJ, Thomas A, Best N, Spiegelhalter D (2000). WinBUGS - a Bayesian modelling framework: concepts, structure, and extensibility. Statistics and Computing 10:325–37. doi: 10.1023/A:1008929526011

Malve O, Qian SS (2006) Estimating nutrients and chlorophyll a relationships in Finnish lakes. Environmental Science and Technology 40:7848–7853. doi: 10.1021/es061359b

Qian SS, Stow CA, Borsuk ME (2003) On Monte Carlo methods for Bayesian inference. Ecological Modelling 159:269–77. doi: 10.1016/S0304-3800(02)00299-5

Shen J, Zhao Y (2010) Combined Bayesian statistics and load duration curve method for bacteria nonpoint source loading estimation. Water Research 44:77–84. doi:10.1016/j.waters.2009.09.002

Spiegelhalter D, Thomas A, Best N, Lunn D (2003) WinBUGS user manual, Version 1.4. http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/manual14.pdf (accessed Dec 2008)

Zobrist J, Reichert P (2006) Bayesian estimation of export coefficients from diffuse and point sources in Swiss watersheds. Journal of Horology 329:207–223. doi: 10.1016/j.jhydrol.2006.02.014


Supplementary material B: The WinBUGS code for the Bayesian calibrating parameters of the modeling system in the ChangLe River watershed.

# Example for nitrogen model in sub-catchment MS of the ChangLe River watershed

model

{

for (j in 1:J)

{

LL[j]<-log(L[j])

LL[j]~dnorm(mu[j], tau)

mu[j]<-log((pow(10,βp[j])*6237.64+pow(10,βd[j])*2940.76+pow(10,βr[j])*1958.42+pow(10,βf[j])*6454.91)*exp(α*r[j]/rm)*exp(-pow(η[j],2)*exp(-γ*d[j]/dm)*t0[j]/2)+L1[j]*exp(-pow(η[j],2)*exp(-γ*d[j]/dm)*t1[j])+L2[j]*exp(-pow(η[j],2)*exp(-γ*d[j]/dm)*t2[j])+L3[j]*exp(-pow(η[j],2)*exp(-γ*d[j]/dm)*t3[j])+L4[j]*exp(-pow(η[j],2)*exp(-γ*d[j]/dm)*t4[j])+L5[j]*exp(-pow(η[j],2)*exp(-γ*d[j]/dm)*t5[j])+L6[j]*exp(-pow(η[j],2)*exp(-γ*d[j]/dm)*t6[j]))

βp[j]~dnorm(-0.2710, 3.5809)

βd[j]~dnorm(-0.0987, 3.2860)

βr[j]~dnorm(-0.7160, 3.6488)

βf[j]~dnorm(-0.1047, 4.1297)

α~dunif(0,100)

γ~dunif(0,100)

η[j]~dnorm(0.5288, 19.8943)I(0, 2)

tau~dgamma(0.01,0.01)

sigma<-1/sqrt(tau)

}

Here, “tau” means the stochastic nodes, which is gamma distribution with the shape and scale is 0.01. “I” means the value interval for the parameter η.


Supplementary material C: Reference sources for TN export coefficients of paddy field, dry land farms, residential land and forest and in-stream TN loss rate.

Alarcon VJ, McAnally WH, Ervin GN, Brooks CP (2010) Using MODIS land-use/land-cover data and hydrological modeling for estimating nutrient concentrations. In Taniar D, Gervasi O, Murgante B, Pardede E, Apduhan BO (Eds.), Computational Science and Its Applications. Springer, Berlin. pp 501–514

Alexander RB, Böhlke JK, Boyer EW, David MB, Harvey JW, Mulholland PJ, Seitzinger P, Tobias CR, Tonitto C, Wollheim WM (2009) Dynamic modeling of nitrogen losses in river networks unravels the coupled effects of hydrological and biogeochemical processes. Biogeochemistry 93:91–116. doi: 10.1007/s10533-008-9274-8

Alexander RB, Elliott AH, Shankar U, McBride B (2002) Estimating the sources and transport of nutrients in the Waikato River Basin, New Zealand. Water Resources Research 38:1268–1290. doi: 10.1029/2001WR000878

Alexander RB, Smith RA, Schwarz GE (2000) Effect of stream channel size on the delivery of nitrogen to the Gulf of Mexico. Nature 403:758–761. doi: 10.1021/es052281m

Alexander RB, Smith RA, Schwarz GE, Boyer EW, Nolan JV, Brakebill JW (2008) Differences in phosphorus and nitrogen delivery to the Gulf of Mexico from the Mississippi River Basin. Environmental Science and Technology 42:822–830. doi: 10.1021/es0716103

Beaulac MN, Reckhow KH (1982) An examination of land use – nutrient export relationships. Journal of the American Water Resources Association 18:1013–1024. doi: 10.1111/j.1752-1688.1982.tb00109.x

Broad ST, Corkrey R (2011) Estimating annual generation rates of total P and total N for different land use in Tasmania, Australia. Journal of Environmental Management 92:1609–1617. doi: 10.1016/j.jenvman.2011.01.023

Cai M, Li HE, Zhuang YT, Wang QH (2004) Application of modified export coefficient method in polluting load estimation of non-point source pollution. Journal of Hydraulic Engineering 7:40–45 (in Chinese)

Chen DJ, Lu J, Shen YN, Dahlgren RA, Jin SQ (2009) Estimation of critical nutrient amounts based on input–output analysis in an agriculture watershed of eastern China. Agriculture, Ecosystems & Environment 134:159–67. doi: 10.1016/j.agee.2009.06.011

Chen DJ, Lu J, Shen YN, Gong DQ, Deng OP (2011) Spatio-temporal variations of nitrogen in an agricultural watershed in eastern China: Catchment export, stream attenuation and discharge. Environmental Pollution 159:2989–2995. doi: 10.1016/j.envpol.2011.04.023

Chen M, Chen J, Sun F (2010) Estimating nutrient releases from agriculture in China: An extended substance flow analysis framework and a modeling tool. Science of the Total Environment 408:5123–5136. doi: 10.1016/j.scitotenv.2010.07.030

Cooke SE, Prepas FF (1998) Stream phosphorus and nitrogen export from agricultural and forested watersheds in the Boreal Plain. Canadian Journal of Fisheries and Aquatic Sciences 55:2292–2299. doi: 10.1139/f98-118

Cornish PS (2003) Prioritising locations, industries and abatement actions for environmental improvement in a heterogeneous catchment: The Sydney Basin Australia. Proc 7th International Specialised Conference on Diffuse Pollution and Basin Management, International Water Association, Dublin, I (2):7–12

Daley M, Potter J, Difranco E, McDowell WH (2010) Nitrogen assessment for the Lamprey River watershed, New Hampshire Water Resources Research Centre (NH WRRC), Department of Natural Resources, The State of New Hampshire, Retrieved at: http://des.nh.gov/organization/divisions/water/wmb/coastal/documents/unh_nitrogenassessment.pdf

Darracq A, Destouni G (2005) In-stream nitrogen attenuation: Model-aggregation effect sand implications for coastal nitrogen impacts. Environmental Science and Technology 39: 3716–3722. doi: 10.1021/es049740o

Ding XW, Liu RM, Shen ZY (2006) Method for obtaining parameters of export coefficient model using hydrology and water quality data and its application. Journal- Beijing Normal University Natural Science Edition 42:534–538 (in Chinese)

Ding XW, Shen ZY, Hong Q, Yang ZF, Wu X, Liu RM (2010) Development and test of the export coefficient model in the upper reach of the Yangtze River. Journal of Horology 383:233–244. doi: 10.1016/j.jhydrol.2009.12.039

Ding XW, Shen ZY, Liu RM, Qi J (2008) Improved export coefficient model considering precipitation as well as terrain and its accuracy analysis. Resources and Environment in the Yangtze Basin 17:306–309 (in Chinese)

Dodd RC, McMahon G, Stichter S (1992) Watershed planning in the Albemarlo-Pamlico Estuaries System, Report 1 – Annual Average Nutrient Budgets. Report No. 92-10. Albermarle-Pamlico Study, NC. Dept. of Environment, Health & Natural Resources, Raleigh, NC

Drewry JJ, Newham LTH, Greene RSB, Jakeman AJ, Croke BFW (2006) A review of nitrogen and phosphorus export to waterways: context for catchment modelling. Marine and Freshwater Research 57:757–774. doi: 10.1071/MF05166

Elliott AH, Alexander RB, Schwarz GE, Shankar Ude, Sukias JPS, McBride GB (2005) Estimation of nutrient sources and transport for New Zealand using the hybrid mechanistic-statistical model SPARROW. Journal of Hydrology (NZ) 44:1–27

Frink CR (1991) Estimating nutrient exports to estuaries. Journal of Environmental Quality 20:717–724. doi:10.2134/jeq1991.00472425002000040002x