Distributed hydrological modelling

Using Ravenpy to build a distributed hydrological model

In this notebook, we will demonstrate how to build a distributed hydrological model using Raven as well as “Routing product” (Generated by BasinMaker), a database of subbasins and how they link to one another in a river network. Currently, Routing product is only available for North American catchments. However, if in time it becomes available on a larger scale, it would be trivial to change the setup apply it to other supported regions.

# Import the list of possible model templates for distributed hydrological modelling
from ravenpy.config.emulators import (
    GR4JCN,
    HBVEC,
    HMETS,
    HYPR,
    SACSMA,
    Blended,
    CanadianShield,
    Mohyse,
)
import datetime as dt
import tempfile
from pathlib import Path

import matplotlib.pyplot as plt
import xarray as xr

from ravenpy import Emulator
from ravenpy.config import commands as rc
from ravenpy.config.emulators import GR4JCN
from ravenpy.extractors.routing_product import (
    BasinMakerExtractor,
    GridWeightExtractor,
    open_shapefile,
    upstream_from_coords,
)
from ravenpy.utilities.testdata import get_file, open_dataset

tmp_path = Path(tempfile.mkdtemp())

In the next step, we will get the Routing product file for our catchment. These can be downloaded here: http://hydrology.uwaterloo.ca/basinmaker/download_regional.html

# Get path to pre-downloaded BasinMaker Routing product database for our catchment
shp_path = get_file("basinmaker/drainage_region_0175_v2-1/finalcat_info_v2-1.zip")

# Note that for this to work, the coordinates must be in the small
# BasinMaker example (drainage_region_0175)
df = open_shapefile(shp_path)

# Gauge station for observations at Matapedia
# SubId: 175000128
# -67.12542 48.10417
sub = upstream_from_coords(-67.12542, 48.10417, df)

# Extract the subbasins and HRUs (one HRU per sub-basin)
bm = BasinMakerExtractor(
    df=sub,
    hru_aspect_convention="ArcGIS",
)

# Get the .rvh file that we will provide to the config and that links HRUs/subbasins to the river network
rvh = bm.extract(hru_from_sb=True)

Now that we have the HRUs and river network all setup, let’s get the hydrometeorological data. We first get the database of streamflows and then do the same for weather. You can provide your own for your own catchments, here we are using our datasets to keep things tidy.

# Streamflow observations file
qobs_fn = get_file("matapedia/Qobs_Matapedia_01BD009.nc")

# Make an obervation gauge from the observed streamflow
qobs = rc.ObservationData.from_nc(qobs_fn, alt_names=("discharge",))

Now prepare the meteorological data using the Gauge format. Note that this dataset of stations is a combination of stations that we iterate on, making a Gauge object for each station in our dataset:

# Meteo observations file
meteo_grid_fn = get_file("matapedia/Matapedia_meteo_data_stations.nc")

# Alternate names for variables in the files
alt_names = {
    "TEMP_MIN": "tmin",
    "TEMP_MAX": "tmax",
    "PRECIP": "pr",
}

# Make virtual Gauges
meteo_forcing_stations = [
    rc.Gauge.from_nc(
        meteo_grid_fn,
        data_type=alt_names.keys(),
        station_idx=i + 1,
        alt_names=alt_names,
    )
    for i in range(6)  # Since we have 6 stations
]

Now that we have the data, we can run the distributed model as usual. Note that we must provide the AVG_ANNUAL_RUNOFF parameter to initialize the catchment’s hydrological states for distributed models:

# Prepare the model configuration
model_config = GR4JCN(
    params=[0.529, -3.396, 407.29, 1.072, 16.9, 0.947],
    StartDate=dt.datetime(1998, 1, 1),
    EndDate=dt.datetime(2020, 12, 31),
    ObservationData=[qobs],
    Gauge=meteo_forcing_stations,
    GlobalParameter={"AVG_ANNUAL_RUNOFF": 40.65},
    **rvh,
)

# Run the model with the configuration we just built
distributed_outputs = Emulator(model_config).run(overwrite=True)

Explore the results, just like for any other model. However, this time we have a few gauges because the Routing Product integrates some gauges already. We want data for the first gauge:

# Show the hydrographs object
display(distributed_outputs.hydrograph)
# Plot the resulting streamflow
distributed_outputs.hydrograph.q_sim.isel(nbasins=0).plot.line(
    x="time", label="Distributed model", color="blue", lw=1.5
)

# Plot the observed streamflow
qobs_data = open_dataset(qobs_fn)
qobs_data.discharge.plot.line(x="time", label="Observations", color="red", lw=1.5)

plt.legend()