{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 02 - Extract geographical watershed properties"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Extract geographical watershed properties automatically using PAVICS-Hydro's geospatial toolbox\n",
    "\n",
    "Hydrological models typically need geographical information about watersheds being simulated: latitude and longitude, area, mean altitude, land-use, etc. Raven is no exception. This notebook shows how to obtain this information using remote services that are made available for users in PAVICS-Hydro. These services connect to a digital elevation model (DEM) and a land-use data set to extract relevant information.\n",
    "\n",
    "The DEM used in the following is the [EarthEnv-DEM90](https://www.earthenv.org/DEM), while the land-use dataset is the [North American Land Change Monitoring System](http://www.cec.org/north-american-environmental-atlas/land-cover-30m-2015-landsat-and-rapideye/). Other data sources could be used, given their availability through the Web Coverage Service (WCS) protocol.\n",
    "\n",
    "Since these computations happen on a specific Geoserver hosted on PAVICS, we need to establish a connection to that service. While the steps are a bit more complex, the good news is that you only need to change a few items in this notebook to tailor results to your needs. For example, this first code snippet is boilerplate and should not be changed.\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# We need to import a few packages required to do the work\n",
    "import os\n",
    "\n",
    "os.environ[\"USE_PYGEOS\"] = \"0\"\n",
    "import geopandas as gpd\n",
    "import matplotlib as mpl\n",
    "import matplotlib.pyplot as plt\n",
    "import numpy as np\n",
    "import rasterio\n",
    "from birdy import WPSClient\n",
    "\n",
    "# Utility that simplifies working with test data hosted on GitHub\n",
    "from ravenpy.testing.utils import get_file\n",
    "\n",
    "# This is the URL of the Geoserver that will perform the computations for us.\n",
    "url = os.environ.get(\n",
    "    \"WPS_URL\", \"https://pavics.ouranos.ca/twitcher/ows/proxy/raven/wps\"\n",
    ")\n",
    "\n",
    "# Connect to the PAVICS-Hydro Raven WPS server\n",
    "wps = WPSClient(url)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "In the previous notebook, we extracted the boundaries of a watershed, which were saved in the \"input.geojson\" file. We also downloaded the file and re-uploaded it to the workspace, so it should be available now to this workbook, too!\n",
    "\n",
    "We can now plot the outline of the watershed by loading it into `GeoPandas`."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# The contour can be generated using notebook \"01_Delineating watersheds, where it would be placed\n",
    "# in the same folder as the notebooks and available in your workspace. The contour could then be accessed\n",
    "# easily by defining it as follows:\n",
    "\"\"\"\n",
    "feature_url = \"input.geojson\"\n",
    "\"\"\"\n",
    "# However, to keep things tidy, we have also prepared a version that can be accessed easily for demonstration purposes:\n",
    "feature_url = get_file(\"notebook_inputs/input.geojson\")\n",
    "df = gpd.read_file(feature_url)\n",
    "display(df)\n",
    "df.plot()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Generic watershed properties\n",
    "\n",
    "Now that we have delineated a watershed, lets find the zonal statistics and other properties using the `shape_properties` process. This process requires a `shape` argument defining the watershed contour, the exterior polygon. The polygon can be given either as a link to a geometry file  (e.g. a geojson file such as `feature_url`), or as data embedded in a string. For example, if variable `feature` is a `GeoPandas` geometry, `json.dumps(feature)` can be used to convert it to a string and pass it as the `shape` argument.\n",
    "\n",
    "Typically, we expect users will simply upload a shapefile and use this code to perform the extraction on the region of interest."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "shape_resp = wps.shape_properties(shape=feature_url)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Once the process has completed, we extract the data from the response, as follows. Note that you do not need to change anything here. The code will work and return the desired results."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "(properties,) = shape_resp.get(asobj=True)\n",
    "prop = properties[0]\n",
    "display(prop)\n",
    "\n",
    "area = prop[\"area\"] / 1000000.0\n",
    "longitude = prop[\"centroid\"][0]\n",
    "latitude = prop[\"centroid\"][1]\n",
    "gravelius = prop[\"gravelius\"]\n",
    "perimeter = prop[\"perimeter\"]\n",
    "\n",
    "shape_info = {\n",
    "    \"area\": area,\n",
    "    \"longitude\": longitude,\n",
    "    \"latitude\": latitude,\n",
    "    \"gravelius\": gravelius,\n",
    "    \"perimeter\": perimeter,\n",
    "}\n",
    "display(shape_info)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Note that these properties are a mix of the properties of the original file where the shape is stored, and properties computed by the process (area, centroid, perimeter and gravelius). Note also that the computed area is in m², while the \"SUB_AREA\" property is in km², and that there are slight differences between the two values due to the precision of HydroSHEDS and the delineation algorithm.\n",
    "\n",
    "## Land-use information\n",
    "\n",
    "Now we extract the land-use properties of the watershed using the `nalcms_zonal_stats` process. As mentioned, it uses a dataset from the [North American Land Change Monitoring System](http://www.cec.org/north-american-environmental-atlas), and retrieve properties over the given region.\n",
    "\n",
    "With the `nalcms_zonal_stats_raster` process, we also return the raster grid itself. Note that this is a high-resolution dataset, and to avoid taxing the system's resource, requests are limited to areas under 100,000km²."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "stats_resp = wps.nalcms_zonal_stats_raster(\n",
    "    shape=feature_url, select_all_touching=True, band=1, simple_categories=True\n",
    ")"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Here we will get the raster data and show it as a grid. Here the `birdy` client automatically transforms the returned `geotiff` file to a `DataArray` using either `gdal`, `rasterio`, or `rioxarray`, depending on what libraries are available in our runtime environment. Note that `pymetalink` needs to be installed for this to work."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "features, statistics, raster = stats_resp.get(asobj=True)\n",
    "grid = raster[0]\n",
    "grid.plot()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "From there, it's easy to calculate the ratio and percentages of each land-use component. This code should also be left as-is unless you really know what you are doing."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "lu = statistics[0]\n",
    "total = sum(lu.values())\n",
    "\n",
    "land_use = {k: (v / total) for (k, v) in lu.items()}\n",
    "display(\"Land use ratios\", land_use)\n",
    "\n",
    "land_use_pct = {k: f\"{np.round(v/total*100, 2)} %\" for (k, v) in lu.items()}\n",
    "display(\"Land use percentages\", land_use_pct)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### Display the land-use statistics\n",
    "Here we can display the land-use statistics according to the land cover map, as a function of land cover raster pixels over the catchment. Again, this does not need to be modified at all. It can also be simply deleted if the visualization tools are not required for your use-case."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "unique, counts = np.unique(grid, return_counts=True)\n",
    "print(\"The land-use categories available are: \" + str(unique))\n",
    "print(\"The number of occurrences of each land-use category is: \" + str(counts))\n",
    "\n",
    "# Pixels values at '127' are NaN and can be ignored.\n",
    "from matplotlib.colors import Normalize\n",
    "\n",
    "norm = Normalize()\n",
    "norm.autoscale(unique[:-1])\n",
    "cm = mpl.colormaps[\"tab20\"]\n",
    "plt.bar(unique[:-1], counts[:-1], color=cm(norm(unique[:-1])))\n",
    "\n",
    "\n",
    "# plt.bar(unique[:-1], counts[:-1])\n",
    "plt.xticks(np.arange(min(unique[:-1]), max(unique[:-1]) + 1, 1.0))\n",
    "plt.xlabel(\"Land-use categories\")\n",
    "plt.ylabel(\"Number of pixels\")\n",
    "plt.show()\n",
    "\n",
    "grid.where(grid != 127).sel(band=1).plot.imshow(cmap=\"tab20\")\n",
    "grid.name = \"Land-use categories\"\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "These values are not very helpful on their own, so the following relationship will be helpful to map the grid to specific land-uses. We can see from this example that we have mostly \"Temperate or sub-polar needleaf forest\" with some \"Sub-polar taiga needleleaf forest\" and a bit of \"Temperate or sub-polar boardleaf deciduous forest\". Exact percentages can be computed from the array of values as extracted and displayed above.\n",
    "\n",
    "- 0: Ocean\n",
    "- 1: Temperate or sub-polar needleleaf forest\n",
    "- 2: Sub-polar taiga needleleaf forest\n",
    "- 3: Tropical or sub-tropical broadleaf evergreen forest\n",
    "- 4: Tropical or sub-tropical broadleaf deciduous forest\n",
    "- 5: Temperate or sub-polar broadleaf deciduous forest\n",
    "- 6: Mixed forest\n",
    "- 7: Tropical or sub-tropical shrubland\n",
    "- 8: Temperate or sub-polar shrubland\n",
    "- 9: Tropical or sub-tropical grassland\n",
    "- 10: Temperate or sub-polar grassland\n",
    "- 11: Sub-polar or polar shrubland-lichen-moss\n",
    "- 12: Sub-polar or polar grassland-lichen-moss\n",
    "- 13: Sub-polar or polar barren-lichen-moss\n",
    "- 14: Wetland\n",
    "- 15: Cropland\n",
    "- 16: Barren lands\n",
    "- 17: Urban\n",
    "- 18: Water\n",
    "- 19: Snow and Ice\n"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "Since the GeoTiff object was opened as an `xarray.Dataset` with the `.open_rasterio()` method, this makes it very easy to spatially reproject it with the `cartopy` library. Here we provide a sample projection, but this would need to be adapted to your needs."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "import cartopy.crs as ccrs\n",
    "\n",
    "# Set a CRS transformation:\n",
    "crs = ccrs.LambertConformal(\n",
    "    central_latitude=49, central_longitude=-95, standard_parallels=(49, 77)\n",
    ")\n",
    "\n",
    "ax = plt.subplot(projection=crs)\n",
    "grid.name = \"Land-use categories\"\n",
    "grid.where(grid != 127).sel(band=1).plot.imshow(ax=ax, transform=crs, cmap=\"tab20\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Terrain information from the DEM\n",
    "\n",
    "Here we collect terrain data, such as elevation, slope and aspect, from the DEM. We will do this using the `terrain_analysis` WPS service, which by default uses DEM data from [EarthEnv-DEM90](https://www.earthenv.org/DEM).\n",
    "\n",
    "Note here that while the feature outline is defined above in terms of geographic coordinates (latitude, longitude), the DEM is projected onto a 2D cartesian coordinate system (here NAD83, the Canada Atlas Lambert projection). This is necessary to perform slope calculations. For more information on this, see: https://en.wikipedia.org/wiki/Map_projection\n",
    "\n",
    "The DEM data returned in the process response here shows `rioxarray`-like access but using the URLs we can open the files however we like."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "terrain_resp = wps.terrain_analysis(\n",
    "    shape=feature_url, select_all_touching=True, projected_crs=3978\n",
    ")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "properties, dem = terrain_resp.get(asobj=True)\n",
    "\n",
    "elevation = properties[0][\"elevation\"]\n",
    "slope = properties[0][\"slope\"]\n",
    "aspect = properties[0][\"aspect\"]\n",
    "\n",
    "terrain = {\"elevation\": elevation, \"slope\": slope, \"aspect\": aspect}\n",
    "display(terrain)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "crs = ccrs.LambertConformal(\n",
    "    central_latitude=49, central_longitude=-95, standard_parallels=(49, 77)\n",
    ")\n",
    "\n",
    "dem.name = \"Elevation\"\n",
    "dem.attrs[\"units\"] = \"m\"\n",
    "ax = plt.subplot(projection=crs)\n",
    "dem.where(dem != -32768).sel(band=1).plot.imshow(ax=ax, transform=crs, cmap=\"gnuplot\")\n",
    "plt.show()"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "# We can also access the files directly via their URLs:\n",
    "properties, dem = terrain_resp.get(asobj=False)\n",
    "display(properties, dem)\n",
    "\n",
    "# Let's read the data from band=1 as numpy array\n",
    "display(rasterio.open(dem).read(1))"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## Overview\n",
    "\n",
    "A synthesis of all watershed properties can be created by merging the various dictionaries created. This allows users to easily access any of these values, and to provide them to Raven as needed."
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {
    "tags": []
   },
   "outputs": [],
   "source": [
    "all_properties = {**shape_info, **land_use, **terrain}\n",
    "display(all_properties)"
   ]
  }
 ],
 "metadata": {
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.13.11"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}