API

basmati.downloader

class basmati.downloader.HydroshedsDownloader(hydrosheds_dir: Union[str, pathlib.Path], delete_zip: bool)

Downloads and unzips HydroSHEDS dataset files from Dropbox

download_hydrobasins_all_levels(region: str) → None

Download HydroBASINS dataset, levels 1-12.

Parameters

region – region to download dataset for

download_hydrosheds_dem_30s(region: str) → None

Download 30s Digital Elevation Model for region.

Parameters

region – region to download DEM for

exception basmati.downloader.UnrecognizedRegionError

Region is not one of the known 2-character codes in HYDROBASINS_REGIONS

basmati.downloader.download_file_wget(url: str, basedir: pathlib.Path, filename: pathlib.Path) → pathlib.Path

Downloads a file from a given URL to the desired basedir / filename.

Uses wget through a system command because the requests library cannot resolve the redirects of the stable Dropbox links used by HydroSHEDS. See here for the base Dropbox directory: https://www.dropbox.com/sh/hmpwobbz9qixxpe/AAAI_jasMJPZl_6wX6d3vEOla?dl=0

Parameters
  • url – URL where file can be downloaded

  • basedir – directory to download to

  • filename – filename of file to download

Returns

filepath of downloaded file

basmati.downloader.download_main(dataset: str, region: str, delete_zip: bool) → None

Entry point for downloading HydroSHEDS datasets for the given region

Relies on HYDROSHEDS_DIR env var being set.

e.g.: $ basmati download -d <dataset> -r <region>

Raises
  • BasmatiError – if HYDROSHEDS_DIR is not set

  • BasmatiError – if region or dataset is not recognized

Parameters
  • dataset – HydroSHEDS dataset to download

  • region – 2 character region code

  • delete_zip – delete downloaded zipfiles after extract
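The environment-variable check described above can be sketched as follows. This is a minimal illustration, not basmati's code: `BasmatiErrorSketch` and `resolve_hydrosheds_dir` are hypothetical names, and the error message is made up.

```python
import os
from pathlib import Path


class BasmatiErrorSketch(Exception):
    """Hypothetical stand-in for basmati's BasmatiError."""


def resolve_hydrosheds_dir(environ=None):
    """Return the directory named by HYDROSHEDS_DIR, or raise if it is unset."""
    # Allow a plain dict to be passed in so the check is easy to exercise.
    environ = os.environ if environ is None else environ
    value = environ.get("HYDROSHEDS_DIR")
    if not value:
        raise BasmatiErrorSketch("HYDROSHEDS_DIR env var not set")
    return Path(value)
```

In a shell, the variable would typically be exported before running `basmati download -d <dataset> -r <region>`.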

basmati.downloader.unzip_file(basedir: pathlib.Path, zipfilepath: pathlib.Path) → None

Completely extract a zip file to a given basedir

Parameters
  • basedir – directory to extract to

  • zipfilepath – path to zipfile
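A complete extraction of this kind can be done with the standard library alone. The sketch below assumes `zipfile.ZipFile.extractall` is sufficient; `unzip_file_sketch` is a hypothetical name, and the archive and file names in the demo are made up.

```python
import tempfile
import zipfile
from pathlib import Path


def unzip_file_sketch(basedir: Path, zipfilepath: Path) -> None:
    """Extract every member of the archive into basedir."""
    with zipfile.ZipFile(zipfilepath) as zf:
        zf.extractall(basedir)


# Demo on a throwaway archive created in a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    tmpdir = Path(tmp)
    archive = tmpdir / "example.zip"
    with zipfile.ZipFile(archive, "w") as zf:
        zf.writestr("data/readme.txt", "hello")
    outdir = tmpdir / "out"
    unzip_file_sketch(outdir, archive)
    extracted_text = (outdir / "data" / "readme.txt").read_text()
```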

basmati.hydrosheds

basmati.hydrosheds.load_hydrobasins_geodataframe(hydrosheds_dir: Union[str, pathlib.Path], region: str, levels: Iterable = range(1, 7), hydrobasins_file_tpl: str = 'hybas_{region}_lev{level:02}_v1c.shp') → geopandas.geodataframe.GeoDataFrame

Load all data for the desired region and levels.

Parameters
  • hydrosheds_dir – directory of HydroSHEDS datasets

  • region – 2 character region code

  • levels – Pfafstetter levels to load

  • hydrobasins_file_tpl – filename template

Returns

geodataframe containing all the data for the desired region and levels
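To see which files the defaults refer to, the filename template can be expanded by hand with `str.format`. The region code `"as"` below is only an illustrative value.

```python
# Default template and levels taken from the signature above.
hydrobasins_file_tpl = "hybas_{region}_lev{level:02}_v1c.shp"
levels = range(1, 7)  # levels 1 to 6 inclusive

# "{level:02}" zero-pads the level to two digits.
filenames = [
    hydrobasins_file_tpl.format(region="as", level=level) for level in levels
]
```

Here `filenames` runs from `hybas_as_lev01_v1c.shp` to `hybas_as_lev06_v1c.shp`.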

basmati.hydrosheds.load_hydrosheds_dem(hydrosheds_dir: Union[str, pathlib.Path], region: str, resolution: str = '30s', hydrosheds_dem_file_tpl: str = '{region}_dem_{resolution}.bil') → Tuple[numpy.ndarray, affine.Affine, numpy.ndarray, numpy.ndarray]

Load a HydroSHEDS Digital Elevation Model (DEM).

Parameters
  • hydrosheds_dir – directory of HydroSHEDS datasets

  • region – 2 character region code

  • resolution – resolution to load

  • hydrosheds_dem_file_tpl – filename template

Returns

bounds, affine transform, DEM and mask of the DEM

basmati.hydrosheds.is_downstream(pfaf_id_a: Union[int, str], pfaf_id_b: Union[int, str]) → bool

Determine whether pfaf_id_b is downstream of pfaf_id_a.

Follows the rule described in https://en.wikipedia.org/wiki/Pfafstetter_Coding_System#Properties and works even if pfaf_id_a and pfaf_id_b are at different levels.

Parameters
  • pfaf_id_a – first Pfafstetter id (upstream)

  • pfaf_id_b – second Pfafstetter id (downstream)

Returns

True if pfaf_id_b is downstream of pfaf_id_a; False otherwise, including when pfaf_id_a == pfaf_id_b
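The Pfafstetter rule can be sketched as below: B is downstream of A when, after their shared leading digits, the remaining digits of B are all odd and less than those of A. This is an illustration of that rule, not basmati's implementation. `is_downstream_sketch` is a hypothetical name, and treating a code that is a prefix of the other as "not downstream" is an assumption.

```python
def is_downstream_sketch(pfaf_id_a, pfaf_id_b) -> bool:
    """Return True if code B is downstream of code A under the Pfafstetter rule."""
    a, b = str(pfaf_id_a), str(pfaf_id_b)

    # Count the leading digits shared by both codes.
    n = 0
    while n < min(len(a), len(b)) and a[n] == b[n]:
        n += 1
    rest_a, rest_b = a[n:], b[n:]

    # Identical codes, or one code being a prefix of the other, are
    # treated as "not downstream" here (an assumption of this sketch).
    if not rest_a or not rest_b:
        return False

    # The first remaining digits differ, so comparing them decides "less than".
    b_is_less = rest_b[0] < rest_a[0]
    b_all_odd = all(int(digit) % 2 == 1 for digit in rest_b)
    return b_is_less and b_all_odd
```

With A = 8835, codes 8833 and 8811 pass both conditions, while 8832 and 8821 fail because they contain an even digit after the shared prefix.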

basmati.hydrosheds._find_downstream(gdf: geopandas.geodataframe.GeoDataFrame, start_basin_pfaf_id: int) → geopandas.geodataframe.GeoDataFrame

Find all downstream basins at the same level as the start basin.

Can also be used as a method on a gpd.GeoDataFrame: gdf.find_downstream(start_basin_pfaf_id)

Parameters
  • gdf – hydrobasins geodataframe to traverse

  • start_basin_pfaf_id – Pfafstetter id of start basin

Returns

geodataframe filtered to the basins at the start basin's level that are downstream of the start basin

basmati.hydrosheds._find_upstream(gdf: geopandas.geodataframe.GeoDataFrame, start_basin_pfaf_id: int) → geopandas.geodataframe.GeoDataFrame

Find all upstream basins at the same level as the start basin.

Can also be used as a method on a gpd.GeoDataFrame: gdf.find_upstream(start_basin_pfaf_id)

Parameters
  • gdf – hydrobasins geodataframe to traverse

  • start_basin_pfaf_id – Pfafstetter id of start basin

Returns

geodataframe filtered to the basins at the start basin's level that are upstream of the start basin

basmati.hydrosheds._find_next_level_larger(gdf: geopandas.geodataframe.GeoDataFrame, start_basin_pfaf_id: int) → geopandas.geodataframe.GeoDataFrame

Find basin one level lower (i.e. found basin is larger).

For example, if start_basin_pfaf_id == 913, returns basin 91. Can return 0 or 1 basins.

Can also be used as a method on a gpd.GeoDataFrame: gdf.find_next_level_larger(start_basin_pfaf_id)

Parameters
  • gdf – hydrobasins geodataframe to traverse

  • start_basin_pfaf_id – Pfafstetter id of start basin

Returns

filtered geodataframe with 0 or 1 basins at the next lower level

basmati.hydrosheds._find_next_level_smaller(gdf: geopandas.geodataframe.GeoDataFrame, start_basin_pfaf_id: int) → geopandas.geodataframe.GeoDataFrame

Find basins one level higher (i.e. found basins are smaller).

For example, if start_basin_pfaf_id == 91, returns basins 911, 912, …, 919. Can return 0-9 basins.

Can also be used as a method on a gpd.GeoDataFrame: gdf.find_next_level_smaller(start_basin_pfaf_id)

Parameters
  • gdf – hydrobasins geodataframe to traverse

  • start_basin_pfaf_id – Pfafstetter id of start basin

Returns

filtered geodataframe with 0-9 basins at the next higher level
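The id arithmetic behind the two examples (913 to 91, and 91 to 911…919) can be sketched with plain integers. The helper names are hypothetical. The library functions filter a geodataframe, so ids that are absent from it would simply yield fewer rows, or none.

```python
from typing import List


def next_level_larger_id(pfaf_id: int) -> int:
    """Drop the last digit to get the id of the containing basin."""
    return pfaf_id // 10


def next_level_smaller_ids(pfaf_id: int) -> List[int]:
    """Append digits 1 to 9 to get the candidate ids of the sub-basins."""
    return [pfaf_id * 10 + digit for digit in range(1, 10)]
```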

basmati.hydrosheds._area_select(gdf: geopandas.geodataframe.GeoDataFrame, min_area: float, max_area: float) → geopandas.geodataframe.GeoDataFrame

Select basins from lower to higher levels that are between min_area and max_area in area.

Starts at the lowest level (e.g. level 1) and works out which basins are selected there, then moves on to higher levels (smaller basins). At each level, a basin is added only if the basin containing it at the level below has not been added. For example, level 3 basins 411 to 419 will not be added if basin 41 was added at level 2.

Can also be used as a method on a gpd.GeoDataFrame: gdf.area_select(min_area, max_area)

Parameters
  • gdf – hydrobasins geodataframe to traverse

  • min_area – minimum area of basin

  • max_area – maximum area of basin

Returns

filtered geodataframe from any level (favouring lower levels) with area between min and max
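The selection order can be sketched on a plain dictionary of Pfafstetter ids and areas. This is a simplified illustration, not basmati's implementation. `area_select_sketch` is a hypothetical name, the areas are made up, and the inclusive bounds and the check against every containing basin (not just the immediate parent) are assumptions.

```python
from typing import Dict, List


def area_select_sketch(
    areas: Dict[int, float], min_area: float, max_area: float
) -> List[int]:
    """Select basin ids by area, favouring lower levels (larger basins)."""
    selected: List[int] = []
    # Shorter ids are lower levels, so visit those first.
    for pfaf_id in sorted(areas, key=lambda p: (len(str(p)), p)):
        code = str(pfaf_id)
        # Skip a basin when a basin that contains it was already selected.
        if any(code.startswith(str(chosen)) for chosen in selected):
            continue
        # Inclusive bounds are an assumption of this sketch.
        if min_area <= areas[pfaf_id] <= max_area:
            selected.append(pfaf_id)
    return selected


# Made-up areas, for illustration only.
areas = {4: 1000.0, 41: 400.0, 42: 600.0,
         411: 150.0, 412: 250.0, 421: 120.0, 422: 480.0}
selected = area_select_sketch(areas, min_area=100.0, max_area=500.0)
```

Basin 41 is selected at level 2, so 411 and 412 are skipped. Basin 42 is too large, so its sub-basins 421 and 422 are considered and selected at level 3.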

basmati.utils

basmati.utils.build_raster_from_geometries(geometries: Iterable[shapely.geometry.base.BaseGeometry], shape: Iterable[int], tx: affine.Affine) → numpy.ndarray

Build a 2D raster from the geometries (e.g. gdf.geometry)

Each geometry is assigned an index, which increments by one for each geometry.

Parameters
  • geometries – Individual geometries

  • shape – shape of desired raster

  • tx – affine transform to apply to each geometry before rasterizing

Returns

2D raster in which each cell holds the index of the geometry that covers it.

basmati.utils.coarse_grain2d(arr: numpy.ndarray, grain_size: List[int]) → numpy.ndarray

Coarse grain a 2D arr based on grain_size

Parameters
  • arr – array to coarse grain

  • grain_size – 2 value size of grain

Returns

coarse-grained array
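One common way to coarse grain a 2D array is to reshape it into blocks and reduce over each block. The sketch below assumes a block mean and an array shape divisible by grain_size; the reduction basmati actually uses is not stated here, and `coarse_grain2d_sketch` is a hypothetical name.

```python
import numpy as np


def coarse_grain2d_sketch(arr: np.ndarray, grain_size) -> np.ndarray:
    """Average arr over non-overlapping blocks of shape grain_size."""
    rows, cols = arr.shape
    g0, g1 = grain_size
    if rows % g0 or cols % g1:
        raise ValueError("array shape must be divisible by grain_size")
    # Split each axis into (number of blocks, block size), then average
    # over the two block-size axes.
    blocks = arr.reshape(rows // g0, g0, cols // g1, g1)
    return blocks.mean(axis=(1, 3))


arr = np.arange(16, dtype=float).reshape(4, 4)
coarse = coarse_grain2d_sketch(arr, [2, 2])
```

For the 4x4 input above, `coarse` has shape (2, 2), with each value the mean of one 2x2 block.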

basmati.utils.coarse_grain2d_ndim(arr: numpy.ndarray, grain_size: List[int]) → numpy.ndarray

Coarse grain an N-D arr based on grain_size

Parameters
  • arr – array to coarse grain

  • grain_size – N value size of grain

Returns

coarse-grained array

basmati.utils.sysrun(cmd: str) → subprocess.CompletedProcess

Run a system command

Gets all output (stdout and stderr). To access output: sysrun(cmd).stdout

Parameters

cmd – command to run

Raises

subprocess.CalledProcessError

Returns

result of cmd
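A wrapper with this behaviour can be sketched with `subprocess.run`. The exact arguments basmati passes are an assumption; `sysrun_sketch` is a hypothetical name.

```python
import subprocess as sp


def sysrun_sketch(cmd: str) -> sp.CompletedProcess:
    """Run cmd through the shell, capturing stdout and stderr as text."""
    # check=True raises CalledProcessError on a non-zero exit status.
    return sp.run(
        cmd,
        shell=True,
        check=True,
        stdout=sp.PIPE,
        stderr=sp.PIPE,
        encoding="utf8",
    )


result = sysrun_sketch("echo hello")
```

The captured output is then available as `result.stdout` and `result.stderr`.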