API

basmati.downloader

class basmati.downloader.HydroshedsDownloader(hydrosheds_dir: Union[str, pathlib.Path], delete_zip: bool)

Downloads and unzips HydroSHEDS dataset files from Dropbox

download_hydrobasins_all_levels(region: str) → None

Download HydroBASINS dataset, levels 1-12.

Parameters: region – region to download dataset for

download_hydrosheds_dem_30s(region: str) → None

Download 30s Digital Elevation Model for region.

Parameters: region – region to download DEM for

exception basmati.downloader.UnrecognizedRegionError

Region is not one of the known 2-character codes in HYDROBASINS_REGIONS.

basmati.downloader.download_file_wget(url: str, basedir: pathlib.Path, filename: pathlib.Path) → pathlib.Path

Downloads a file from a given URL to the desired basedir / filename.

Uses wget via a system command because requests cannot resolve the redirects of the stable Dropbox links used by HydroSHEDS. The base Dropbox directory is: https://www.dropbox.com/sh/hmpwobbz9qixxpe/AAAI_jasMJPZl_6wX6d3vEOla?dl=0

Parameters

url – URL where file can be downloaded
basedir – directory to download to
filename – filename of file to download

Returns

filepath of downloaded file

basmati.downloader.download_main(dataset: str, region: str, delete_zip: bool) → None

Entry point for downloading HydroSHEDS datasets for the given region.

Relies on the HYDROSHEDS_DIR environment variable being set.

e.g.: $ basmati download -d <dataset> -r <region>

Raises

BasmatiError – if HYDROSHEDS_DIR is not set
BasmatiError – if region or dataset is not recognized

Parameters

dataset – HydroSHEDS dataset to download
region – 2 character region code
delete_zip – delete downloaded zipfiles after extract
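
The environment-variable check described above can be sketched as follows. This is only an illustration, not the library's code: the function name and the exception class are hypothetical stand-ins (the real entry point raises BasmatiError), and the error message is invented.

```python
import os
from pathlib import Path
from typing import Mapping


class MissingEnvError(Exception):
    """Hypothetical stand-in for the library's BasmatiError."""


def hydrosheds_dir_from_env(environ: Mapping[str, str] = os.environ) -> Path:
    """Return the HydroSHEDS directory named by HYDROSHEDS_DIR.

    Raises MissingEnvError if the variable is unset or empty.
    """
    value = environ.get("HYDROSHEDS_DIR")
    if not value:
        raise MissingEnvError("HYDROSHEDS_DIR env var must be set")
    return Path(value)
```

Passing the environment as a mapping keeps the check easy to exercise without modifying the real process environment.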

basmati.downloader.unzip_file(basedir: pathlib.Path, zipfilepath: pathlib.Path) → None

Completely extract a zip file to a given basedir

Parameters

basedir – directory to extract to
zipfilepath – path to zipfile
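
Full extraction of a zip archive can be done with the standard library alone. The sketch below shows one way to do it, assuming the same argument order as the signature above; the function name is hypothetical and the library's own implementation may differ.

```python
import zipfile
from pathlib import Path


def unzip_file_sketch(basedir: Path, zipfilepath: Path) -> None:
    """Extract every member of the archive at zipfilepath into basedir."""
    with zipfile.ZipFile(zipfilepath) as zf:
        # extractall creates basedir and any sub-directories as needed.
        zf.extractall(basedir)
```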

basmati.hydrosheds

basmati.hydrosheds.load_hydrobasins_geodataframe(hydrosheds_dir: Union[str, pathlib.Path], region: str, levels: Iterable = range(1, 7), hydrobasins_file_tpl: str = 'hybas_{region}_lev{level:02}_v1c.shp') → geopandas.geodataframe.GeoDataFrame

Load all data for the desired region and levels.

Parameters

hydrosheds_dir – directory of HydroSHEDS datasets
region – 2 character region code
levels – Pfafstetter levels to load
hydrobasins_file_tpl – filename template

Returns

geodataframe containing all the data for the desired region and levels
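
To show how the default filename template expands, the sketch below formats it for each requested level. The helper name is hypothetical and "as" is only an example region code. Where the files sit under hydrosheds_dir is not stated here, so only bare filenames are built.

```python
from typing import Iterable, List

# Default template copied from the signature above.
HYDROBASINS_FILE_TPL = "hybas_{region}_lev{level:02}_v1c.shp"


def hydrobasins_filenames(region: str, levels: Iterable[int] = range(1, 7)) -> List[str]:
    """Expand the template once per level; {level:02} zero-pads to two digits."""
    return [HYDROBASINS_FILE_TPL.format(region=region, level=level) for level in levels]


if __name__ == "__main__":
    for name in hydrobasins_filenames("as"):
        print(name)
```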

basmati.hydrosheds.load_hydrosheds_dem(hydrosheds_dir: Union[str, pathlib.Path], region: str, resolution: str = '30s', hydrosheds_dem_file_tpl: str = '{region}_dem_{resolution}.bil') → Tuple[numpy.ndarray, affine.Affine, numpy.ndarray, numpy.ndarray]

Load a HydroSHEDS Digital Elevation Model (DEM).

Parameters

hydrosheds_dir – directory of HydroSHEDS datasets
region – 2 character region code
resolution – resolution to load
hydrosheds_dem_file_tpl – filename template

Returns

bounds, affine transform, DEM and mask of the DEM

basmati.hydrosheds.is_downstream(pfaf_id_a: Union[int, str], pfaf_id_b: Union[int, str]) → bool

Calculate if pfaf_id_b is downstream of pfaf_id_a.

Implemented as in https://en.wikipedia.org/wiki/Pfafstetter_Coding_System#Properties. Works even if pfaf_id_a and pfaf_id_b are at different levels.

Parameters

pfaf_id_a – first Pfafstetter id (upstream)
pfaf_id_b – second Pfafstetter id (downstream)

Returns

True if pfaf_id_b is downstream of pfaf_id_a, False otherwise or if a == b
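
As an illustration of the kind of digit test involved, here is a minimal sketch of one common statement of the Pfafstetter downstream rule: after the shared leading digits, the first remaining digit of B must be smaller than that of A, and all remaining digits of B must be odd (odd digits mark main-stem segments). The function name is hypothetical. Returning False when one id is a prefix of the other is a choice made for this sketch, so it may not match the library's behaviour in every edge case.

```python
from typing import Union


def is_downstream_sketch(pfaf_id_a: Union[int, str], pfaf_id_b: Union[int, str]) -> bool:
    """Return True if basin B is downstream of basin A under the rule above."""
    a, b = str(pfaf_id_a), str(pfaf_id_b)

    # Count the leading digits the two ids share.
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    rest_a, rest_b = a[n:], b[n:]

    # Identical ids, or one basin nested inside the other: treated as not downstream.
    if not rest_a or not rest_b:
        return False

    # B must branch off at a lower digit than A ...
    if int(rest_b[0]) >= int(rest_a[0]):
        return False
    # ... and lie on the main stem from there on (all odd digits).
    return all(int(digit) % 2 == 1 for digit in rest_b)
```

Because the comparison works on digit strings, ids of different lengths (levels) can be compared directly.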

basmati.hydrosheds._find_downstream(gdf: geopandas.geodataframe.GeoDataFrame, start_basin_pfaf_id: int) → geopandas.geodataframe.GeoDataFrame

Find all downstream basins at the same level as the start basin.

Can also be used as a method on a gpd.GeoDataFrame: gdf.find_downstream(start_basin_pfaf_id)

Parameters

gdf – hydrobasins geodataframe to traverse
start_basin_pfaf_id – Pfafstetter id of start basin

Returns

geodataframe filtered to the basins at the start basin's level that are downstream of the start basin

basmati.hydrosheds._find_upstream(gdf: geopandas.geodataframe.GeoDataFrame, start_basin_pfaf_id: int) → geopandas.geodataframe.GeoDataFrame

Find all upstream basins at the same level as the start basin.

Can also be used as a method on a gpd.GeoDataFrame: gdf.find_upstream(start_basin_pfaf_id)

Parameters

gdf – hydrobasins geodataframe to traverse
start_basin_pfaf_id – Pfafstetter id of start basin

Returns

geodataframe filtered to the basins at the start basin's level that are upstream of the start basin

basmati.hydrosheds._find_next_level_larger(gdf: geopandas.geodataframe.GeoDataFrame, start_basin_pfaf_id: int) → geopandas.geodataframe.GeoDataFrame

Find the basin one level lower (i.e. the found basin is larger).

If start_basin_pfaf_id == 913, basin 91 is returned. Can return 0 or 1 basins.

Can also be used as a method on a gpd.GeoDataFrame: gdf.find_next_level_larger(start_basin_pfaf_id)

Parameters

gdf – hydrobasins geodataframe to traverse
start_basin_pfaf_id – Pfafstetter id of start basin

Returns

filtered geodataframe with 0 or 1 basins at the next lower level

basmati.hydrosheds._find_next_level_smaller(gdf: geopandas.geodataframe.GeoDataFrame, start_basin_pfaf_id: int) → geopandas.geodataframe.GeoDataFrame

Find the basins one level higher (i.e. the found basins are smaller).

If start_basin_pfaf_id == 91, basins 911, 912, ..., 919 are returned. Can return 0-9 basins.

Can also be used as a method on a gpd.GeoDataFrame: gdf.find_next_level_smaller(start_basin_pfaf_id)

Parameters

gdf – hydrobasins geodataframe to traverse
start_basin_pfaf_id – Pfafstetter id of start basin

Returns

filtered geodataframe with 0-9 basins at level higher
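
The id arithmetic behind the two next-level lookups can be sketched with plain integers: dropping the last digit gives the enclosing basin, and appending 1 to 9 gives the candidate sub-basins. The helper names are hypothetical. The sketch only generates candidate ids, whereas the functions above filter a geodataframe and so may return fewer basins.

```python
from typing import List


def next_level_larger_id(pfaf_id: int) -> int:
    """Id of the enclosing basin one level lower, e.g. 913 -> 91.

    For a single-digit (level 1) id this yields 0, meaning there is no larger basin.
    """
    return pfaf_id // 10


def next_level_smaller_ids(pfaf_id: int) -> List[int]:
    """Candidate ids one level higher, e.g. 91 -> 911, 912, ..., 919."""
    return [pfaf_id * 10 + digit for digit in range(1, 10)]
```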

basmati.hydrosheds._area_select(gdf: geopandas.geodataframe.GeoDataFrame, min_area: float, max_area: float) → geopandas.geodataframe.GeoDataFrame

Select basins from lower to higher levels that are between min_area and max_area in area.

Start by working out if any basins at e.g. level 1 are selected. Then move on to higher levels (smaller basins). At each level, only add basins if the basin at the level below has not been added. e.g. level 3 basins 411 to 419 will not be added if at level 2 basin 41 was added.

Can also be used as a method on a gpd.GeoDataFrame: gdf.area_select(min_area, max_area)

Parameters

gdf – hydrobasins geodataframe to traverse
min_area – minimum area of basin
max_area – maximum area of basin

Returns

filtered geodataframe from any level (favouring lower levels) with area between min and max
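
The level-by-level selection described above can be illustrated on a plain mapping of Pfafstetter id to area. This is a sketch under stated assumptions rather than the library's code: the function name is hypothetical, the area bounds are treated as inclusive, and a basin is skipped if any enclosing basin (not only its immediate parent) was already selected.

```python
from typing import Dict, List


def area_select_sketch(areas: Dict[int, float], min_area: float, max_area: float) -> List[int]:
    """Pick basin ids whose area lies in [min_area, max_area], favouring lower levels."""
    selected: List[int] = []

    # Shorter ids are lower levels (larger basins), so visit those first.
    for pfaf_id in sorted(areas, key=lambda i: (len(str(i)), i)):
        code = str(pfaf_id)

        # Skip basins nested inside a basin that has already been selected.
        if any(len(str(s)) < len(code) and code.startswith(str(s)) for s in selected):
            continue

        if min_area <= areas[pfaf_id] <= max_area:
            selected.append(pfaf_id)
    return selected


if __name__ == "__main__":
    example_areas = {4: 1000.0, 41: 50.0, 42: 200.0, 411: 5.0, 412: 30.0, 421: 60.0, 422: 140.0}
    print(area_select_sketch(example_areas, 20.0, 100.0))
```

In the example data, basin 412 falls inside the area range but is skipped because its enclosing basin 41 was selected at the lower level.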

basmati.utils

basmati.utils.build_raster_from_geometries(geometries: Iterable[shapely.geometry.base.BaseGeometry], shape: Iterable[int], tx: affine.Affine) → numpy.ndarray

Build a 2D raster from the geometries (e.g. gdf.geometry)

Each geometry is assigned an index, which increments by one for each geometry.

Parameters

geometries – Individual geometries
shape – shape of desired raster
tx – affine transform to apply to each geometry before rasterizing

Returns

2D raster in which each geometry is rasterized using its own index value.

basmati.utils.coarse_grain2d(arr: numpy.ndarray, grain_size: List[int]) → numpy.ndarray

Coarse grain a 2D arr based on grain_size

Parameters

arr – array to coarse grain
grain_size – 2 value size of grain

Returns

coarse-grained array
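
The reduction applied within each grain is not stated above. As an illustration only, the sketch below assumes block averaging and requires the array shape to be an exact multiple of grain_size. The function name is hypothetical and the library may aggregate differently.

```python
from typing import Sequence

import numpy as np


def coarse_grain2d_sketch(arr: np.ndarray, grain_size: Sequence[int]) -> np.ndarray:
    """Average a 2D array over non-overlapping blocks of shape grain_size."""
    g0, g1 = grain_size
    n0, n1 = arr.shape
    if n0 % g0 or n1 % g1:
        raise ValueError("array shape must be a multiple of grain_size")

    # Split each axis into (number of blocks, block size), then average over the block axes.
    blocks = arr.reshape(n0 // g0, g0, n1 // g1, g1)
    return blocks.mean(axis=(1, 3))
```

The reshape avoids explicit loops: axes 1 and 3 of the intermediate array index positions inside a block, so averaging over them leaves one value per block.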

basmati.utils.coarse_grain2d_ndim(arr: numpy.ndarray, grain_size: List[int]) → numpy.ndarray

Coarse grain an N-D arr based on grain_size

Parameters

arr – array to coarse grain
grain_size – N value size of grain

Returns

coarse-grained array

basmati.utils.sysrun(cmd: str) → subprocess.CompletedProcess

Run a system command

Gets all output (stdout and stderr). To access output: sysrun(cmd).stdout

Parameters: cmd – command to run
Raises: subprocess.CalledProcessError
Returns: result of cmd
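
A wrapper with this behaviour can be written around subprocess.run. The sketch below is one possible shape, not the library's implementation: the function name is hypothetical, and it splits the command with shlex rather than going through a shell, which is an assumption.

```python
import shlex
import subprocess


def run_command_sketch(cmd: str) -> subprocess.CompletedProcess:
    """Run cmd, capturing stdout and stderr as text.

    check=True makes subprocess.run raise CalledProcessError on a non-zero exit status.
    """
    return subprocess.run(
        shlex.split(cmd),      # split the command string into an argument list
        check=True,
        capture_output=True,   # collect both stdout and stderr
        text=True,             # decode output to str
    )
```

The captured output is then available as run_command_sketch(cmd).stdout and .stderr.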