Analyzing time series

We will use the ‘xclim’ package and it’s powerful time-series analysis tools to analyze the streamflow observations of the Salmon River basin. We will compute a few indicators, but you can refer to the xclim documentation to see how you can best make use of it for your specific needs.

%matplotlib inline

import xarray as xr
import xclim
from pandas.plotting import register_matplotlib_converters

from ravenpy.utilities.testdata import get_file, open_dataset

register_matplotlib_converters()

# Get the file we will use to analyze flows
file = "hydro_simulations/raven-gr4j-cemaneige-sim_hmets-0_Hydrographs.nc"
ds = open_dataset(file)

Base flow index

The base flow index is the minimum 7-day average flow divided by the mean flow.

help(xclim.land.base_flow_index)

The base flow index needs as input arguments the link to a NetCDF file storing the stream flow time series, the name of the stream flow variable, and the frequency at which the index is computed (YS: yearly, QS-DEC: seasonally).

out = xclim.land.base_flow_index(ds.q_sim)
out.plot()

To compute generic statistics of a time series, use the stats process.

help(xclim.generic.stats)
# Here we compute the annual summer (JJA) minimum
out = xclim.generic.stats(ds.q_sim, op="min", season="JJA")
out.plot()

Frequency analysis

The process freq_analysis is similar to the previous stat in that it fits a series of annual maxima or minima to a statistical distribution, and returns the values corresponding to different return periods.

help(xclim.generic.return_level)

For example, computing the Q(2,7), the minimum 7-days streamflow with a two-year reoccurrence, can be done using the following.

out = xclim.generic.return_level(ds.q_sim, mode="min", t=2, dist="gumbel_r", window=7)
out

An array of return periods can be passed.

out = xclim.generic.return_level(
    ds.q_sim, mode="max", t=(2, 5, 10, 25, 50, 100), dist="gumbel_r"
)
out.plot()

Getting the parameters of the distribution and comparing the fit

It’s sometimes more useful to store the fitted parameters of the distribution rather than storing only the quantiles. In the example below, we’re first computing the annual maxima of the simulated time series, then fitting them to a gumbel distribution using the fit process.

import json

with xclim.set_options(
    check_missing="pct", missing_options={"pct": {"tolerance": 0.05}}
):
    ts = xclim.generic.stats(ds.q_sim, op="max")

ts
with xclim.set_options(check_missing="skip"):
    pa = xclim.generic.fit(ts.isel(nbasins=0), dist="gumbel_r")
pa