Spatial weighting of land use and temporal weighting of antecedent discharge improves prediction of stream condition

Land management to protect streams requires knowing which parts of the landscape most strongly influence stream condition. Understanding how flow through landscapes and along streams affects such land-use impacts requires knowing the period of antecedent discharge that most strongly influences condition. Both considerations require determination of optimal weighting schemes for predictors of stream condition. We calculated forest cover weighted by flow-path distance to 572 urban, peri-urban, and rural sites in the Melbourne, Australia, region that were sampled for macroinvertebrates, and antecedent discharge weighted by time preceding each of 1,723 samples. Using mixed linear models that accounted for spatial dependence, we aimed to determine the weighting curve shape and length that best predicted macroinvertebrate assemblage composition. The best model was a function of mean annual discharge, weighted forest cover, weighted imperviousness, weighted antecedent discharge, and their interactions. Optimal weightings were exponential for forest cover, with half-decay distances of 35 m overland (plausible range 26–50 m) and 1.0 km in-stream (0.75–1.3 km), and linear over ≥4 years for antecedent discharge. Model plausibility was more affected by weighting distance than by the shape of the weighting function. Regardless of weighting curve shape, riparian forest effects on macroinvertebrate assemblages are strongest within $10^1$–$10^2$ m of the stream and $10^3$ m upstream. Although exponential weightings are only marginally more plausible, they are the most realistic representation of physical processes. While our conclusions should not be interpreted as recommendations for buffer widths, they provide insight into the scales of influence in the region and could be used to inform management decisions.


Introduction
Our ability to use and manage land while protecting streams and other water resources requires knowing which parts of the landscape most strongly influence stream condition. It is widely understood that land use closer to streams is likely to have a greater impact than more distant land (e.g. King et al. 2005). Analogously, the effect of more recent flow conditions has been shown to be greater than that of conditions in the more distant past (e.g. Bond et al. 2012; Rolls et al. 2012). Thus when modelling the effects of land use and recent flow on stream condition, it is important to quantify land cover and flow variables in ways that capture the spatial and temporal heterogeneity of effects.
Many studies have sought to quantify the effects of land use on stream ecosystems (e.g. Allan 2004; Stephenson and Morin 2009; Peterson et al. 2011; Thomson et al. 2012). A common shortcoming of many such studies has been their quantification of land-use impacts in ways that do not adequately account for the spatial arrangement of land cover, and how it is likely to influence the magnitude of effects on stream ecosystems (King et al. 2005). In most large-scale studies of the effects of land cover, land use as a proportion of catchment has been used as a candidate predictor, often assessed against a small number of alternative estimates of riparian land cover based on forest cover in stream buffers of one or more widths (from the stream), and lengths (upstream from the site of interest) (e.g. Roth et al. 1996; Strayer et al. 2003; Thomson et al. 2012). In addition to the statistical problems arising from likely collinearity among proportional land cover estimates (Van Sickle and Johnson 2008; an increase in proportional cover of one land cover class necessarily results in decreases in others: Stephenson and Morin 2009), it has been noted that the concept of an abruptly defined riparian buffer beyond which the landscape has no influence on the stream is mechanistically unrealistic (King et al. 2005).
A more mechanistically realistic approach is to continuously weight the influence of different land uses. King et al. (2005) and subsequent papers (Van Sickle and Johnson 2008; Peterson et al. 2011) have demonstrated that weighting of rural land use by its flow-path distance to and along streams, using decay functions with a mechanistic underpinning, increases the predictive power of models of stream condition compared to lumped catchment proportions of land use. Use of near-stream land, and land in drainage lines in which water accumulates (Peterson et al. 2011), is likely to be an important driver of stream condition in landscapes in which water drains to streams through natural flow paths. However, human alteration of drainage lines, which is particularly common in urban landscapes, can greatly increase the spatial extent of catchment land areas that have a strong influence on stream condition. Walsh and Kunapo (2009) demonstrated that proximity of the most downslope edge of impervious surfaces to urban stormwater drainage lines was a better distance measure for weighting the influence of such surfaces on stream macroinvertebrate assemblages than was distance to stream. They therefore argued that impervious runoff delivered through urban stormwater drainage networks is a likely primary driver of stream degradation, and that the influence of urban impervious surfaces is greatly extended beyond the riparian zone, potentially far into the catchment. Similar effects of drainage lines on transport of nutrients have been reported in agricultural landscapes with tile drains (Fraterrigo and Downing 2008).
The question of the optimal weighting schemes for different land uses is related to the question of how best to weight antecedent flow as a predictor of stream condition. The dynamic nature of stream ecosystems means that antecedent flow conditions are an important driver of biotic structure and function (Rolls et al. 2012). However, few large-scale studies have quantified antecedent flow conditions as a predictor of stream condition (Bond et al. 2012).
In this paper we aim to identify the most plausible weighting model (shape) and parameterization (length) for forest cover and antecedent discharge as predictors of macroinvertebrate assemblage condition in streams of the Melbourne region, Australia. We also aim to assess whether the optimal measure of weighted imperviousness derived by Walsh and Kunapo (2009) for an eastern portion of our study region remains a superior predictor to total imperviousness for the entire region. Assemblage condition was quantified using SIGNAL score, a macroinvertebrate biotic index (average score per taxon), which is widely used in south-eastern Australia. Although SIGNAL score was developed as an index of water quality impairment (Chessman 1995), it has been shown to be a strong univariate correlate of assemblage compositional similarity across gradients of land-use disturbance area in our study region (Walsh 2006). It is therefore a useful measure of the response of macroinvertebrate assemblages to human disturbance in the Melbourne region.
We used a large set of macroinvertebrate data across the region surrounding the metropolis of Melbourne in south-eastern Australia collected over 17 years. We used linear mixed models and tested optimal weightings across a wide range of potential model structures, including other variables that characterize the physiographic nature of sites to ensure the robustness of our conclusions. Our primary hypothesis was that the accuracy of the prediction of SIGNAL score would vary with the distance-weighting functions of forest cover and impervious area, as well as the temporal weighting function of antecedent discharge. Our results confirm this, showing that better predictions of stream ecological condition arise from models that take into account the greater influence of more proximate land use and more recent flow conditions. Incorporation of such information into predictive models will improve our understanding of land use impacts upon freshwaters.

Overview
We used multi-model inference (Burnham and Anderson 2002) to assess the relative plausibility of a wide range of models for predicting SIGNAL score across the Greater Melbourne region. The models used up to 10 predictor variables derived from multiple sources, which fall naturally into 4 different classes (Table 1). Three variables (process, number of riffle sample units per sample, number of spring sample units per sample) affect the likelihood of capture of different macroinvertebrate taxa. Four variables (mean annual discharge, catchment area, elevation, igneous geology) accounted for physiographic variation across the region likely to affect macroinvertebrate assemblages through effects on water chemistry and hydrology. Antecedent discharge accounted for temporal hydrologic variation among samples. We aimed to ensure that influences of human impact were restricted to the land-use variables, which were quantified as imperviousness (as an indicator of urbanization), and forest cover (as an indicator of land clearance). We sought optimal weightings for land use variables and antecedent discharge. The other variables were intended as classifiers of stream or sample type: thus we used modeled measures of discharge, assuming no human impact. We used mean annual discharge and antecedent discharge as indicators of the general hydrologic state of the streams in the absence of human impacts, and land use variables as indicators of potential changes to habitat quality and hydrologic patterns.

#Table 1 approximately here#
For forest cover, imperviousness and antecedent discharge, we tested the relative plausibility of different weighting schemes for these predictor variables in our models. For impervious runoff, we simply compared the plausibility of total imperviousness (TI, a simple measure of urban density) and attenuated imperviousness (AI, imperviousness weighted by flow distance to stormwater drains, and thus a more direct indicator of impervious runoff, as derived by Walsh & Kunapo, 2009). For forest cover, we compared the plausibility of attenuated forest cover calculated using 3 weighting models (exponential decay, linear decay and threshold), across a wide range of decay and threshold distances (to the nearest stream) for each model. For antecedent discharge, we compared its plausibility when calculated for different periods (6 mo to 5 y), either simply summed or weighted so that more recent monthly discharges have greater weight. Full methodological details are provided in Appendices 1 and 2.

Study region
The Melbourne region (Fig. 1) is physiographically diverse, ranging from dry western grassland-dominated basalt plains (mean rainfall 400-500 mm/y) with open Eucalyptus woodland along stream valleys, to increasingly wet uplands further east, rising to a mosaic of tall wet-sclerophyll Eucalyptus forests and Nothofagus rainforests in upland valleys (mean rainfall up to 2000 mm/y). Melbourne (population ~4.5 million) is surrounded by rural lands in which the dominant land use is pasture and non-irrigated cropping. Small areas of the region are used for intensive horticulture, totaling ~4% of the total land area (Melbourne Water et al. 2013). Forests, primarily with Eucalyptus spp. as the dominant trees, dominate the uplands of the east and northwest, but elsewhere occur as patches among agricultural and urban land uses (Fig. 1B). The upper Yarra River in the east of the region and some other upland streams of the region are impounded for Melbourne's water supply, but as the river or its tributaries are not used as water supply conduits, seasonal patterns of discharge in downstream waters are largely unchanged by abstraction (Walsh et al. 2007).
#Fig. 1 approximately here#

Data sources
We used five primary data sources:
1. The primary dataset and the dependent variable, SIGNAL score, were derived from the Melbourne Water macroinvertebrate database, which contains data collected from sites across the Greater Melbourne region (Fig. 1A) since 1992 (Melbourne Water, unpublished data).
4. Imperviousness measures and catchment area were derived from 2004 maps of impervious coverage and an associated nested dataset of stream reaches and catchments in the region (Grace Detailed-GIS Services 2012). While Melbourne is growing and expanding, urban infill and expansion primarily over the period of study resulted in only marginal increases in catchment imperviousness in already substantially urban streams, with no expansion into undeveloped catchments that were subsequently sampled.
5. A 10-m-resolution DEM for the Greater Melbourne region (J Kunapo, personal communication) was used to standardize the resolution of the multiple data sources for calculation of predictor variables.

Compilation of the biological dataset
We used samples collected before April 2009, the most recent date of hydrologic data in the geofabric: although parts of the region were affected by wildfires in February 2009, none of the samples in the data were from fire-affected sites. All samples were collected by rapid bioassessment methods (RBA: Anon. 1994) either from riffles or pool edges, and either in autumn (Feb-Jun) or spring (Sep-Dec). Of the samples, 84% were sorted using a standard 30-min sort in the field, and 16% were subsampled in the laboratory and sorted to 10% or 200 individuals, whichever was greater.
Typically two samples were collected at each site on a single day: either one from a riffle and a second from a pool edge, or two edge samples in the absence of a riffle. To reduce the effects of sampling error of RBA samples, data from pairs of samples were combined to produce presence-absence data for sample-pairs. In total, 1,382 sample-pairs were collected on the same date (or for 13 of them, within a month of each other). A further 655 sample-pairs were collected from one site in different seasons of the same year. Sample pairs are hereafter referred to as samples (and individual samples as sample units).
All data were standardized to family-level identification, except Chironomidae, which were identified to sub-family, and Acarina, Oligochaeta, Polychaeta and Nemertinea, which were not identified to finer levels. SIGNAL was calculated for each sample using the grades of EPA Victoria (2003). SIGNAL is the average of ratings (1-10) assigned to macroinvertebrate families found at a site, with higher ratings indicating more pollution-sensitive families.
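The calculation described above (combine a pair of sample units into presence-absence data, then average the grades of the taxa present) can be sketched as follows. This is a minimal illustration, not the authors' code: the grade values are invented placeholders rather than the EPA Victoria (2003) grades, and ignoring taxa without a grade is an assumption.

```python
def signal_score(taxa_present, grades):
    """Average grade of the graded taxa present in a sample (presence-absence)."""
    graded = [grades[t] for t in set(taxa_present) if t in grades]
    if not graded:
        raise ValueError("no graded taxa in sample")
    return sum(graded) / len(graded)

# Placeholder grades (1-10; higher = more pollution-sensitive).
# These are NOT the EPA Victoria (2003) values.
GRADES = {"Leptophlebiidae": 8, "Gripopterygidae": 8,
          "Chironominae": 3, "Oligochaeta": 2}

# Two sample units from one site visit, combined into one presence-absence sample.
riffle_unit = {"Leptophlebiidae", "Chironominae"}
edge_unit = {"Chironominae", "Oligochaeta", "Gripopterygidae"}
sample = riffle_unit | edge_unit

score = signal_score(sample, GRADES)
print(score)
```

Because the index uses presence-absence data, a taxon found in both sample units contributes to the average only once.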
While the distribution of sites was well spread across the region, many sites were close to other sites (Fig. 1). A temporal trend analysis of SIGNAL scores in a subset of our study sites found no evidence of spatial autocorrelation among them (Webb and King 2009), but this remains a possibility. To account for any effects of spatial autocorrelation in this data set, we allocated adjacent sites into groups along the same stream, conservatively ensuring that all members of each group were >5 river-km from sites in any other group (Lloyd et al. 2005). This grouping rule resulted in the exclusion of samples from 136 sites that fell between site groups separated by <10 km. This left 1,723 samples, spread uniformly over the study period (Fig. 2A), from 572 sites in 343 site groups, each containing 1-35 samples to be used in the statistical modeling.
3. Method by which the samples were sorted (process = field or lab). RBA samples sorted in the field are likely to be biased against some taxa (Humphrey et al. 2000), with potential effects on SIGNAL score.
Two variables were extracted from the geofabric:
1. Mean annual discharge depth in the absence of human impacts (meanQ, mm/y: mean annual accumulated surface water surplus, derived from a simple water balance model using long-term rainfall and potential evapotranspiration data and strongly correlated with discharge of unregulated rivers of the region (Stein et al. 2002), divided by catchment area). MeanQ, as a catchment-standardized measure of annual stream discharge, distinguishes flow regimes among streams that rise in the drier western and lowland parts of the region from those that rise in wetter eastern and upland parts (Fig. 1C).
2. Proportion of catchment area underlain by igneous rock (igneous geology). This variable is a correlate of electrical conductivity of stream waters across the Melbourne region (Walsh et al. 2001).

Compilation of variables for optimal weighting searches

a) Antecedent discharge
Antecedent discharge (of different periods) for each sample was calculated from the geofabric estimates of monthly accumulated surface water surplus in each segment in the period before each sample. We calculated two weighting schemes for antecedent discharge:
1. unweighted antecedent discharge for x months, calculated as $\sum_{i=1}^{x} Q_i \,/\, (Q_{mean.ann} \cdot x/12)$, where $Q_i$ is the discharge in the ith month before the sample date. Dividing the sum of those discharge values by the mean annual discharge ($Q_{mean.ann}$) adjusted by the number of months summed expresses the result as a proportion of the long-term mean.
2. linearly weighted antecedent discharge for x months, calculated similarly but with monthly discharges weighted so that more recent months have greater weight.
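The two antecedent discharge measures can be sketched as below. The unweighted form follows the description in the text. The linear weights (declining from 1 for the most recent month to 1/x for the oldest) and the normalisation of the weighted measure are assumptions made for illustration, since the exact formula is not given here; the function names are hypothetical.

```python
def antecedent_unweighted(monthly_q, mean_annual_q):
    """Sum of the last x monthly discharges as a proportion of the long-term mean.

    monthly_q[0] is the most recent month before the sample date.
    """
    x = len(monthly_q)
    return sum(monthly_q) / (mean_annual_q * x / 12)

def antecedent_linear(monthly_q, mean_annual_q):
    """Linearly weighted version: recent months count more (assumed weights)."""
    x = len(monthly_q)
    # Assumption: weight 1 for the most recent month, declining to 1/x for the oldest.
    weights = [(x - i) / x for i in range(x)]
    weighted_sum = sum(w * q for w, q in zip(weights, monthly_q))
    # Assumption: normalise by the weighted sum expected if every month
    # equalled the long-term monthly mean, so average conditions give 1.
    expected = (mean_annual_q / 12) * sum(weights)
    return weighted_sum / expected

# Hypothetical 24-month record: a dry recent year after an average year.
record = [0.0] * 12 + [10.0] * 12
print(antecedent_unweighted(record, 120.0))  # proportion of long-term mean
print(antecedent_linear(record, 120.0))      # lower, because the dry months are recent
```

Under these assumptions both measures equal 1 when every month matches the long-term monthly mean, and the weighted measure responds more strongly to recent dry or wet spells.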

b) Forest cover
Our decision to use forest cover as a variable portraying (the converse of) human land use in the region resulted from preliminary analyses in which we sought to predict SIGNAL score using a range of agricultural land classes. The patchiness and relative rarity of intensive agricultural practices in the region resulted in no greater predictive power from multiple agricultural classes than from a single class of cleared land (or its converse, forest cover).
The land-use data did not permit us to distinguish forest types across the region in our analyses.
To calculate weighted forest cover, we converted forest cover data from the MW land-use dataset and imperviousness data from the impervious dataset to 10-m rasters to match the flow-distance data calculated from the 10-m DEM (see Appendix 1 for more details). We used a distance weighting model after Van Sickle and Johnson (2008), assuming that the influence of forest cover in a grid cell is a non-increasing function of the flow path distance between that cell and the sampling site. Thus the influence of a cell at the sample site is 1, while a more distant cell has a fractional influence. We compared the plausibility of models using distance-weighted forest cover determined by one of three one-parameter weighting functions. These have been considered as explanatory models for distance-attenuated effects in past studies (Van Sickle and Johnson 2008), and span the range of attenuation shape. For example, curves with the most rapid decline near d = 0 could portray the mechanisms of pollutant or water uptake or loss along flow paths, whereas those with the least rapid decline near d = 0 are similar to those used widely in the assessment of fixed buffer widths.
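The three functions were 1) exponential decay, 2) linear decay, and 3) threshold.
Rather than reporting the values of the distance parameters for overland (subscript L) and in-stream (subscript W) flow, for exponential and linear weighting we report the half-decay distance (HD): the distance at which a grid cell would have a weighting of 0.5. For exponential decay, HD = -ln(0.5) divided by the decay parameter; for linear decay, HD = 0.5 times the decay parameter. For threshold, we report the threshold value.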
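The three weighting shapes and the resulting weighted cover can be sketched as follows, parameterised directly by half-decay distance (or by threshold distance for the threshold function). This is an illustrative sketch, not the authors' implementation: the function names are hypothetical, and multiplying the overland and in-stream weights to obtain a cell's overall weight is an assumption.

```python
import math

def w_exponential(d, hd):
    """Weight is 1 at d = 0 and halves every `hd` metres."""
    return math.exp(-math.log(2) * d / hd)

def w_linear(d, hd):
    """Weight declines linearly from 1 at d = 0 to 0 at d = 2 * hd."""
    return max(0.0, 1.0 - d / (2 * hd))

def w_threshold(d, threshold):
    """Weight is 1 up to the threshold distance and 0 beyond it."""
    return 1.0 if d <= threshold else 0.0

def weighted_cover(cells, weight_fn, param_land, param_stream):
    """Distance-weighted proportion of forest.

    cells: iterable of (overland_distance_m, instream_distance_m, is_forest).
    Assumption: a cell's weight is the product of its overland and
    in-stream weights.
    """
    num = den = 0.0
    for d_land, d_stream, forest in cells:
        w = weight_fn(d_land, param_land) * weight_fn(d_stream, param_stream)
        num += w * forest
        den += w
    return num / den if den > 0 else 0.0

# Hypothetical two-cell catchment: forest at the site, cleared land 35 m away.
cells = [(0.0, 0.0, 1), (35.0, 0.0, 0)]
print(weighted_cover(cells, w_exponential, 35.0, 1000.0))
```

With a 35 m overland half-decay distance, the cleared cell 35 m from the stream carries half the weight of the forested cell at the site, so the weighted forest cover is about 0.67 instead of the unweighted 0.5.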

c) Imperviousness
We compared (unweighted) TI with the most plausible value of (exponentially weighted) AI derived by Walsh and Kunapo (2009) for streams of eastern Melbourne (Fig. 3B). AI is a measure of the influence of impervious surfaces as determined by the anthropogenic stormwater drainage system associated with urban land. The formulation of AI differs from the weighting schemes we apply to forest cover here in two important ways.
First, the weighting distance used in AI is the overland flow distance from the most downstream point of an impervious surface to the nearest stormwater drain (or stream if the flow path does not cross a drain: Fig. 3A). AI is partly a measure of altered land use, but also a measure of alteration to the hydrologic network. Second, to capture differential effects of altered catchment flow paths among catchments, the denominator of AI is the total catchment area, rather than the weighted fractional influence of all grid cells as in eqn. 1.
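The effect of using total catchment area as the denominator can be illustrated with a small sketch. It assumes equal-area grid cells and an exponential weight; the half-decay distance of 10 m is a placeholder, not the value derived by Walsh and Kunapo (2009), and the function names are hypothetical.

```python
import math

def total_imperviousness(cells):
    """TI: unweighted proportion of impervious cells in the catchment."""
    return sum(imp for _, imp in cells) / len(cells)

def attenuated_imperviousness(cells, hd):
    """AI sketch: exponentially weighted impervious cells over total area.

    cells: list of (distance_to_drain_or_stream_m, is_impervious), one entry
    per equal-area grid cell in the catchment.
    """
    weighted = sum(math.exp(-math.log(2) * d / hd) * imp for d, imp in cells)
    # Denominator is the total catchment area (cell count), not the sum of weights.
    return weighted / len(cells)

# Hypothetical catchment: one impervious cell on a drain, one 10 m from it,
# and two pervious cells.
catchment = [(0.0, 1), (10.0, 1), (0.0, 0), (0.0, 0)]
print(total_imperviousness(catchment))             # 0.5
print(attenuated_imperviousness(catchment, 10.0))  # about 0.375
```

Because the denominator does not shrink with the weights, two catchments with the same TI can have very different AI if their impervious surfaces differ in proximity to drains.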

Selection of candidate models
We used mixed linear models that accounted for spatial dependence to assess whether models incorporating weighted measures of imperviousness, forest cover and antecedent discharge were more plausible than models with unweighted measures. We included other predictor variables as described earlier to account for climatic and topographic variability and differences among sample types, independent of the three variables to be weighted. We thus sought to compare the plausibility of a subset of models that combined up to ten predictor variables and their interactions.
Candidate models were selected on the following basis. We expected that SIGNAL scores would be predicted well by a combination of forest cover, imperviousness, meanQ (Fig. 1B, C), antecedent discharge (Fig. 2B), and interactions between these variables. We also expected that some variation in SIGNAL would be explained by stream size (indicated by catchment area), elevation (combining the effects of stream size and temperature), igneous geology and the characteristics of the samples (sample type variables in Table 1). Models tested therefore included all four of the main predictor variables and their interactions, at least one of catchment area, elevation and igneous geology, and at least one sample type variable (Table 2).

Comparisons of model plausibility
We used the Akaike Information Criterion adjusted for small sample size (AICc) to compare the plausibility of models. We took an iterative approach to testing the effect of different weighting schemes for forest cover, imperviousness and antecedent discharge. All models tested included meanQ, one variant of each of imperviousness, forest cover and antecedent discharge, and all interactions of these four variables (Table 2). All model structures were first tested using unweighted forest cover, total imperviousness, and unweighted 2-y antecedent discharge, as an arbitrary starting model. The most plausible model structure was then used to determine the optimal antecedent discharge measure. This measure was then used in all candidate models and the most plausible model structure was identified again. The most plausible model structure was then used to assess which of unweighted imperviousness (TI) or weighted imperviousness (AI) was the better measure of imperviousness, and that measure was used in all candidate models to identify the most plausible model structure again. This structure was then used to identify the most plausible forest measure, and finally this measure was used in all model structures, to assess the most plausible model structure.
At each iteration, we also assessed models with each of the third- and fourth-order interactions of the four main variables removed: if their removal did not increase AICc by >2, they were removed. The change in weighted variables did not change the interactions removed at this stage in any iteration.
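The comparison rule above can be sketched with the standard small-sample correction of Burnham and Anderson (2002). The log-likelihoods and parameter counts below are invented for illustration, and the helper names are hypothetical.

```python
def aicc(log_likelihood, k, n):
    """AIC with small-sample correction: k estimated parameters, n observations."""
    aic = -2.0 * log_likelihood + 2.0 * k
    return aic + (2.0 * k * (k + 1)) / (n - k - 1)

def can_drop_term(aicc_with, aicc_without, threshold=2.0):
    """True if removing a term does not increase AICc by more than `threshold`."""
    return (aicc_without - aicc_with) <= threshold

# Hypothetical fits to 1,723 samples: a full model and one without a 4-way interaction.
full = aicc(log_likelihood=-1200.0, k=22, n=1723)
reduced = aicc(log_likelihood=-1200.4, k=21, n=1723)
print(full, reduced, can_drop_term(full, reduced))
```

In this invented case the simpler model has a lower AICc, so the interaction would be removed.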
For antecedent discharge, we assessed a) unweighted and b) linearly weighted antecedent discharge for 6, 12, 24, 30, 36, 64, and 72 months. For each weighting function for forest, we calculated weighted forest cover for a wide range of values of the overland and in-stream distance parameters. We began with a range of values spaced approximately exponentially (e.g. $HD_L$ (for linear and exponential decay) or the overland threshold distance (for threshold) = 1, 2, 4, 8, 16, 32, 64, 128, 360, 640 m, and $HD_W$ or the in-stream threshold distance = 100, 300, 1000, 3000, 10000, 30000). We then used the region of lowest AICc to test ranges of the distance parameters at finer scales.
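A coarse-to-fine search of this kind can be sketched in one dimension as follows. The sketch is a simplification made for illustration: the search described above covers two distance parameters at once, the stand-in score function replaces refitting a mixed model and computing its AICc, and all names are hypothetical.

```python
import math

def coarse_to_fine(score, coarse_values, n_fine=9):
    """Find the value with the lowest score after one refinement step.

    score: function mapping a candidate distance to a criterion (lower is better).
    coarse_values: increasing list of candidate distances.
    """
    best_i = min(range(len(coarse_values)), key=lambda i: score(coarse_values[i]))
    # Refine between the neighbours of the best coarse value.
    lo = coarse_values[max(best_i - 1, 0)]
    hi = coarse_values[min(best_i + 1, len(coarse_values) - 1)]
    step = (hi - lo) / (n_fine - 1)
    fine = [lo + j * step for j in range(n_fine)]
    best = min(list(coarse_values) + fine, key=score)
    return best, score(best)

def toy_score(hd):
    """Stand-in for AICc with a minimum at 35 m (invented for this example)."""
    return 1000.0 + (math.log(hd) - math.log(35.0)) ** 2

coarse = [1, 2, 4, 8, 16, 32, 64, 128, 360, 640]
best_hd, best_score = coarse_to_fine(toy_score, coarse)
print(best_hd, best_score)
```

The refinement step could be repeated on progressively narrower ranges until candidate values within 2 AICc units of the minimum are resolved.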

Mixed model structures and assumptions
We transformed variables as necessary to reduce leverage of high values (Table 1).
The effect of sample groups was incorporated as a random effect. For all fixed effect model structures, we compared a simple linear model with two mixed models each with a different random component: a random intercept determined by site group with constant slope; and a random intercept determined by site group, and slope determined by antecedent discharge (Zuur et al. 2009). The first random structure assumes that SIGNAL score can vary among site groups, but that the modeled relationship with the fixed effects has the same slope. This model accounts for any potential spatial autocorrelation of samples within site groups. The second random structure assumes that the modeled relationship with fixed effects can vary with antecedent discharge among site groups. Models were calculated with the 'nlme' package of R (R Core Team 2013), using restricted maximum likelihood. We checked that model assumptions were met following the protocols of Zuur et al. (2009).
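In conventional mixed-model notation, the two random structures can be written as below. This is a sketch in generic notation, not taken from the source: $y_{ij}$ is the SIGNAL score of sample $i$ in site group $j$, $\mathbf{x}_{ij}$ the fixed-effect predictors, $A_{ij}$ antecedent discharge, and the normality and covariance assumptions are the usual defaults for such models.

```latex
% Random intercept by site group, constant slope:
y_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + b_{0j} + \varepsilon_{ij},
\qquad b_{0j} \sim N(0, \sigma_{0}^{2}), \quad \varepsilon_{ij} \sim N(0, \sigma^{2})

% Random intercept by site group, random slope for antecedent discharge:
y_{ij} = \mathbf{x}_{ij}^{\top}\boldsymbol{\beta} + b_{0j} + b_{1j} A_{ij} + \varepsilon_{ij},
\qquad (b_{0j}, b_{1j})^{\top} \sim N(\mathbf{0}, \boldsymbol{\Sigma})
```

The simple linear model used for comparison corresponds to dropping both $b_{0j}$ and $b_{1j}$.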

Most plausible model structures
Mixed models with a random intercept determined by site group were consistently much more plausible than linear models with the same fixed effect structure but not accounting for dependence among site groups (ΔAICc > 300 in all cases). Models with a random intercept determined by site group, and slope determined by antecedent discharge, were more plausible again (ΔAICc > 30 in all cases). The random intercept and slope structure was thus used in all models for assessment of optimal weighted variables.
The structure of fixed model effects had a relatively small effect on AICc compared to weighting of forest, imperviousness and antecedent discharge (Table 1). The most plausible model structure included elevation, igneous geology, and process in addition to the four main variables. The only 3- or 4-way interaction that improved AICc was the interaction between meanQ, antecedent discharge and weighted forest cover. This final model structure was substantially more plausible than any other model structure or any model that used unweighted variables, and was used for assessment of optimal weighted variables.

Most plausible weighting schemes for antecedent discharge, imperviousness and forest cover
The most plausible measures of antecedent discharge were linearly weighted, with all antecedent periods ≥4 y being equally plausible (Fig. 4). These weighted measures of antecedent discharge were all substantially more plausible than unweighted antecedent discharge of any period. Attenuated (weighted) imperviousness was a much more plausible predictor than total (unweighted) imperviousness in all models tested (Table 1). The most plausible measure of forest cover was exponentially weighted with half-decay distances of 35 m overland and 1.0 km in-stream, with ranges of equal plausibility (ΔAICc < 2) of 26-50 m and 0.75-1.3 km, respectively (Fig. 5, 6). Optimal distances for the three decay functions were consistent: optimal half-decay distances were similar for exponential and linear decay, and about half the threshold values. There was strong variation in AICc with varying distance parameters (Fig. 5, 6), but less difference between the weighting functions (Fig. 5).
The final model strongly predicted SIGNAL score, with a Pearson correlation coefficient of 0.955 between fitted and observed values.

Discussion
Our results suggest that the influence of forests on in-stream macroinvertebrate assemblage composition is greatest along a riparian corridor extending $10^1$–$10^2$ m inland and $10^3$ m upstream. Similarly, extending the inference of Walsh & Kunapo (2009) to the broader geographic context of this study, the influence of urban impervious runoff is strongest within $10^1$ m of a stream, but this influence usually is extended far into upland parts of catchments by extensive stormwater drainage systems (e.g. Fig. 2B). Thus while rural and urban land uses may have widely different scales of influence, the flow distances over which their impacts can be moderated are comparably short.
Our finding of an optimal antecedent discharge length of ≥4 years is likely to be at least partly a function of the timing of our study, over a period of increasing dryness, encompassing a decadal drought (Fig. 2; Bond et al. 2008). This prevents us from generalizing this finding to periods of increasing wetness. The interaction between meanQ, attenuated forest cover and antecedent discharge (data ranges provided in Table 1) in the best model suggests that the influence of riparian forests differs between streams with different annual discharges, and between periods of different dryness. Such interactions point to new approaches to be explored in future studies, described below.

Optimal weighting of land cover
Exponential attenuation was the most plausible weighting function, and arguably the most mechanistically meaningful of the three functions, particularly in modeling the behavior of contaminant transport (King et al. 2005). However, as found by Van Sickle and Johnson (2008), the optimal parameterizations of the three functions did not differ strongly in their plausibility. In contrast, all three functions varied widely in plausibility with different decay lengths (e.g. half-decay distances <~540 m and >~1750 m in-stream, and <~15 m and >~200 m overland were substantially less plausible than the best model). Such a strong influence of parameterization (i.e. weighting distance) suggests that comparing different weighting models (i.e. shape) using a single parameterization for each scheme (e.g. Peterson et al. 2011) is unlikely to provide a reliable discrimination of different mechanisms.
A decline in influence of forest cover to near-zero over $10^2$ m (as suggested by a half-decay distance of 26-50 m, Fig. 5) is consistent with the findings of Van Sickle and Johnson (2008), who used similar methods to predict fish assemblage composition. However, they found that the influence of land use on fish assemblages extended for many km upstream, further than we found for macroinvertebrate assemblages. As they posited, the extent of in-stream influence might be shorter for macroinvertebrate assemblages, which are likely to be less mobile and shorter lived than are fish.
The strong influence of riparian forest cover suggested by our analyses concurs not ). However, it contrasts with the conclusions of Stephenson and Morin (2009), who reported that catchment forest cover independently explained more variation in a range of structural in-stream ecological measures than did buffered riparian forest cover. Our model produced a much stronger prediction of ecological response than did any of the models of Stephenson and Morin (2009), pointing to the importance of including a range of predictor variables that are likely to be driving variation unexplained by forest cover, as the primary predictor variable of interest. Other studies have reported varied in-stream ecological and water quality responses to riparian vs catchment land uses (e.g. Strayer et al. 2003; Uriarte et al. 2011), but these studies are not easily comparable to ours because they did not clearly distinguish the effects of urban and agricultural land use, nor consider differences in drainage among land uses.
The continued widespread use of lumped measures of land use (e.g. TI), which do not adequately represent the effects of catchment drainage networks, hampers advancement of ecological knowledge and the ability of ecologists to advise managers on the most appropriate actions to reduce land-use impacts on stream ecosystems, particularly urban land use. For instance, arguments about threshold effects of urban land use (Cuffney et al. 2011; King and Baker 2011) are limited by the use of lumped measures that sub-optimally represent the stressor driving the threshold response. Sheldon et al. (2012) found that no single scale of urban land cover was a strongly preferred predictor of in-stream ecological health in southeast Queensland, which they attributed to the likely importance of the broader connectivity of urban catchments through stormwater drainage networks: a mechanism that was not captured by their urban land cover measures. The superiority of AI over TI as a predictor of SIGNAL score that we have demonstrated across the Melbourne region reinforces the importance of drainage networks in driving the urban degradation of stream ecosystems.

Optimal measure of antecedent discharge
The influence of antecedent discharge of ≥4 years on SIGNAL score suggests a shift in assemblage composition with increasing dryness over the period of study. The latter decade of the study period was the worst drought since European settlement of the region (Bond et al. 2008). While our large-scale assessment of antecedent discharge cannot provide information on the mechanisms by which drought affects stream biota, our determination of an optimal measure of antecedent discharge points to the temporal scale over which supra-seasonal droughts impact streams in our study region. Weighted antecedent discharge suggests that discharge of the most recent months is of greatest influence on macroinvertebrate assemblages. The optimal length of ≥4 years suggests that the cumulative effects of a supra-seasonal drought contribute to assemblage composition over at least 4 years.
As our period of study did not encompass a wetter period following the drought, we are unable to infer whether the recovery of macroinvertebrate assemblage composition is slower than the rate of change observed during the drying period (Bond et al. 2008). Thus, the broader relevance of a ≥4-year weighted measure of antecedent discharge requires reassessment using data collected during multiple supraseasonal periods of drying and wetting.
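
The idea of a linearly weighted antecedent-discharge measure can be illustrated with a minimal Python sketch. It is not the calculation used in this study: the monthly time step, the normalisation to a weighted mean, and the function names `linear_weights` and `weighted_antecedent_discharge` are all assumptions made for illustration. The only features taken from the text are that weights decline linearly with time before the sample and that the most recent flows therefore carry the most weight.

```python
def linear_weights(n_steps):
    """Weights for time steps 0 (most recent) to n_steps - 1 (oldest).

    Assumed form: weight falls linearly from 1 at the sample date
    to 0 at the end of the window (n_steps before the sample).
    """
    return [1.0 - k / n_steps for k in range(n_steps)]


def weighted_antecedent_discharge(discharge_recent_first, n_steps):
    """Linearly weighted mean discharge over the n_steps preceding a sample.

    discharge_recent_first: discharge values ordered from the most recent
    time step backwards (hypothetical input layout).
    """
    window = discharge_recent_first[:n_steps]
    if len(window) < n_steps:
        raise ValueError("discharge record is shorter than the weighting window")
    weights = linear_weights(n_steps)
    weighted_sum = sum(w * q for w, q in zip(weights, window))
    return weighted_sum / sum(weights)


# Hypothetical monthly record: a 4-year window is 48 monthly steps.
wet_then_dry = [0.2] * 24 + [1.0] * 24   # dry recent 2 years, wetter before
dry_then_wet = [1.0] * 24 + [0.2] * 24   # wet recent 2 years, drier before
print(weighted_antecedent_discharge(wet_then_dry, 48))
print(weighted_antecedent_discharge(dry_then_wet, 48))
```

The two hypothetical records have the same unweighted 4-year mean, but the linearly weighted measure is lower for the record with the drier recent period, which is the behaviour the weighting is intended to capture.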

Further research directions
The possibility that the influence of riparian forests differs between stream types and between periods of differing dryness points to a potential shortcoming of our work: the fixed definition of the stream network, on which to-stream and in-stream distances are based. Our chosen threshold (catchment area = 1 km²) was based on our observations of the catchment area of the smallest perennial streams in the eastern part of the region. It is likely that regional gradients of climate and stream discharge mean that the threshold for stream formation varies, just as it does with antecedent discharge (Baker et al. 2007). Peterson et al. (2011) addressed this shortcoming by adjusting weighting distances by flow accumulation, to allow for less attenuation of effect as flow paths lengthen. In the earlier development of this study, we found that similar flow-accumulation weighting approaches did not optimize within a defined range of decay distances. We hypothesise that weighting narrow (1 cell) drainage lines increases the sensitivity of results to minor errors in land-use maps.
To address the problems of defining the interface between overland and in-stream flow, we propose two future refinements. Firstly, to account for regional variability in threshold area, we propose using stream discharge rather than catchment area as the criterion for determining the threshold for stream definition. Secondly, to avoid potential problems with using flow accumulation to alter decay distance, we propose using a stream network with a very small threshold discharge, with each reach of the network classified into discharge ranges. To-stream decay distances can then be altered as a function of reach class.
Our primary objective was to quantify the most plausible average weighting schemes for a stream condition index, using an analysis across a physiographically diverse region.
Although we have accounted for broad physiographic patterns across the region, and our results accord well with similar research elsewhere, our model was developed only for a region encompassing several large watersheds. Hence, it remains possible that covariance between stream types and land use could have biased our results, or obscured variation across the region in the spatial extent of land-use influences. Moreover, the dimensions of riparian forests with greatest influence on stream ecosystems are likely to vary with stream size, topography, and long-term and antecedent climate (Richardson et al. 2012). Therefore our conclusions should not be interpreted as firm recommendations for a fixed buffer width for all streams across the region; further meta-scale analyses (e.g. Cuevas et al. 2006) would be needed to assess the generality of our results across larger scales, and their ability to be extrapolated to watersheds not included in the analysis.
In addressing our primary purpose, we developed a new approach to determining optimal spatial weights that produced excellent results, and is applicable to any stream network with reasonable land-use data, hydrological data, and digital elevation models. The performance of our optimal model in detecting the effects of land use on stream macroinvertebrates argues strongly for the uptake of our approach in other locations. Finally, the contribution of antecedent discharge to model performance reminds us that even with severe land-use impacts, stream macroinvertebrate assemblage composition is still largely driven by flow regime.

1. 'Unweighted' models used unweighted 2-y afi, TI and total F. 'Weighted afi; Unweighted I, F' models used optimal afi (4-y linearly weighted), TI and total F. 'Weighted afi, I; Unweighted F' models used optimal afi, AI (optimal fit as determined by Walsh and Kunapo 2009) and total F. 'Weighted afi, I, F' models used optimal afi, AI, and optimal F (exponentially attenuated, HDD_L = 50 m, HDD_W = 1500 m). meanQ * I * F * afi denotes each of these four effects and all their interactions. The final model is the best of the full-interaction models after removal of those 3- and 4-way interactions that did not improve AIC (colons denote simple interactions).

2. Hydrologic and geological variables were derived from the surface network and catchments data of the Australian hydrological geospatial fabric, a national dataset derived from a 9-second (~220 × 270 m in the study region) digital elevation model, detailing spatial relationships between streams and their catchments (hereafter termed the geofabric: Bureau of Meteorology 2011).

3. Forest cover data were derived from a land-use dataset for the Greater Melbourne region, compiled by Melbourne Water from two sources (Department of Planning and Community Development 2010; Department of Primary Industries 2011).

#Fig. 2 approximately here#

Compilation of predictor variables requiring no further calculation
Elevation of each site (m) was calculated from the 10-m DEM. Catchment area (km²) of each site was extracted from the imperviousness dataset. Three variables describing sample characteristics were derived from the macroinvertebrate database:
1. Number of spring sample units per sample (nspring = 0, 1 or 2). Season of collection potentially affects macroinvertebrate assemblage composition as a result of variability in phenology among taxa.
2. Number of riffle samples (nriff = 0, 1, or 2). Macroinvertebrate taxa differ in their tendency to occur in riffle and edge habitats, which potentially has an effect on SIGNAL score.
where f_L and f_W are fractions determined by to-stream flow distance d_L and in-stream flow distance d_W (Fig. 3), and the parameters L and W, which control the degree of attenuation over a set distance.

#Fig. 3 approximately here#

The cumulative distance-weighted forest cover for the catchment of each site is estimated by dividing the sum of the weighted fractional influence of all forested grid cells (C_i = 1 if forested, 0 if not) by the weighted fractional influence of all grid cells (eqn. 1).

… (AICc) to assess the relative plausibility of alternative models of SIGNAL score. We report ΔAICc, the difference between a model's AICc and that of the overall best model (i.e. ΔAICc of the best model = 0). Models with lower ΔAICc are more plausible, but models with ΔAICc ≤ 2 are considered equally plausible as the best model. ΔAICc of 4-7 indicates that the model with the lower AICc is superior, with ΔAICc > 10 indicating that the model with the lower AICc is strongly preferred (Burnham and Anderson 2002).
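
The distance-weighted forest cover described above can be sketched in a few lines of Python. This is an illustrative sketch rather than the code used in the study, and it rests on two assumptions that are not stated in the text here: that the exponential weighting takes the half-decay form f = 0.5^(d / HDD), and that the weighted fractional influence of a grid cell is the product f_L × f_W. The function names and the tuple layout of `cells` are hypothetical. The half-decay distances in the example (50 m to-stream, 1500 m in-stream) are those reported for the optimal forest measure.

```python
def half_decay_weight(distance_m, half_decay_m):
    """Exponential weight that halves every `half_decay_m` metres (assumed form)."""
    return 0.5 ** (distance_m / half_decay_m)


def weighted_forest_cover(cells, hdd_l, hdd_w):
    """Distance-weighted proportion of forest cover in a catchment.

    cells: iterable of (forested, d_l, d_w) tuples, one per grid cell, where
        forested is True/False (C_i = 1 or 0),
        d_l is the to-stream (overland) flow distance in metres,
        d_w is the in-stream flow distance to the site in metres.
    hdd_l, hdd_w: half-decay distances (m) for overland and in-stream flow.
    """
    weighted_forest = 0.0
    weighted_total = 0.0
    for forested, d_l, d_w in cells:
        # Assumption: a cell's influence is the product of its two weights.
        influence = half_decay_weight(d_l, hdd_l) * half_decay_weight(d_w, hdd_w)
        weighted_total += influence
        if forested:
            weighted_forest += influence
    return weighted_forest / weighted_total


# Hypothetical three-cell catchment: one forested cell at the site,
# one cleared cell 50 m overland, one cleared cell 1500 m upstream.
cells = [(True, 0.0, 0.0), (False, 50.0, 0.0), (False, 0.0, 1500.0)]
print(weighted_forest_cover(cells, hdd_l=50.0, hdd_w=1500.0))
```

In this toy catchment the unweighted forest cover is one third, whereas the weighted measure is one half, because the single forested cell lies at the site and the two cleared cells each carry half its weight.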
only with recent similar studies (Van Sickle and Johnson 2008; Sheldon et al. 2012), but with the broader field of ecological research identifying the strong influence of riparian zones on the ecological structure and function of streams (Naiman and Décamps 1997; Pusey and Arthington 2003

Fig. 1. A. Sites across the Greater Melbourne region from which macroinvertebrate

Fig. 3. Two alternative weighting schemes for catchment land use. A. The catchment

Fig. 5. Differences in Akaike Information Criterion (ΔAICc) for models with weighted

Fig. 6. Contour plot of difference in Akaike Information Criterion (ΔAICc) from the

Table 1. Fixed-effect variables, their classes as considered in the text, their codes (used in Table 2) and transformations used in mixed linear models, their source, derivation, and range among the macroinvertebrate samples. Sources: geofabric (Bureau of Meteorology 2011); DCI dataset (Grace Detailed-GIS Services 2012); 10-m DEM (J Kunapo personal communication); MW land-use data (Department of Planning and Community Development 2010; Department of Primary Industries 2011).

Table 2. Fixed-effect structures of mixed models (with random intercept determined by site group and slope determined by antecedent discharge) and their ΔAICc values relative to the best model (ΔAICc = 0, in bold). Variable codes and transformations are defined in Table