API¶
basmati.downloader¶
-
class
basmati.downloader.HydroshedsDownloader(hydrosheds_dir: Union[str, pathlib.Path], delete_zip: bool)¶ Downloads and unzips HydroSHEDS dataset files from Dropbox
-
download_hydrobasins_all_levels(region: str) → None¶ Download HydroBASINS dataset, levels 1-12.
- Parameters
region – region to download dataset for
-
download_hydrosheds_dem_30s(region: str) → None¶ Download 30s Digital Elevation Model for region.
- Parameters
region – region to download DEM for
-
-
exception
basmati.downloader.UnrecognizedRegionError¶ Region is not one of the known 2-character codes in HYDROBASINS_REGIONS
-
basmati.downloader.download_file_wget(url: str, basedir: pathlib.Path, filename: pathlib.Path) → pathlib.Path¶ Downloads a file from a given URL to the desired basedir / filename.
Uses wget via a system command because requests cannot resolve the redirects of the stable Dropbox links used by HydroSHEDS. The base Dropbox directory is: https://www.dropbox.com/sh/hmpwobbz9qixxpe/AAAI_jasMJPZl_6wX6d3vEOla?dl=0
- Parameters
url – URL where file can be downloaded
basedir – directory to download to
filename – filename of file to download
- Returns
filepath of downloaded file
-
basmati.downloader.download_main(dataset: str, region: str, delete_zip: bool) → None¶ Entry point for downloading HydroSHEDS datasets for the given region
Relies on HYDROSHEDS_DIR env var being set.
e.g.: $ basmati download -d <dataset> -r <region>
- Raises
BasmatiError – if HYDROSHEDS_DIR is not set, or if the region or dataset is not recognized
- Parameters
dataset – HydroSHEDS dataset to download
region – 2 character region code
delete_zip – delete downloaded zipfiles after extract
-
basmati.downloader.unzip_file(basedir: pathlib.Path, zipfilepath: pathlib.Path) → None¶ Completely extract a zip file to a given basedir
- Parameters
basedir – directory to extract to
zipfilepath – path to zipfile
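As an illustration of what "completely extract a zip file to a given basedir" can look like, here is a minimal sketch using only the standard library. The name unzip_file_sketch is hypothetical and this is not necessarily how basmati implements unzip_file; creating the target directory when it is missing is an assumption.

```python
import zipfile
from pathlib import Path


def unzip_file_sketch(basedir: Path, zipfilepath: Path) -> None:
    """Extract every member of zipfilepath into basedir (illustrative only)."""
    # Assumption: create the target directory if it does not exist yet.
    basedir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(zipfilepath) as zf:
        # extractall preserves the directory layout stored in the archive.
        zf.extractall(basedir)
```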
basmati.hydrosheds¶
-
basmati.hydrosheds.load_hydrobasins_geodataframe(hydrosheds_dir: Union[str, pathlib.Path], region: str, levels: Iterable = range(1, 7), hydrobasins_file_tpl: str = 'hybas_{region}_lev{level:02}_v1c.shp') → geopandas.geodataframe.GeoDataFrame¶ Load all data for the desired region and levels.
- Parameters
hydrosheds_dir – directory of HydroSHEDS datasets
region – 2 character region code
levels – Pfafstetter levels to load
hydrobasins_file_tpl – filename template
- Returns
geodataframe containing all the data for the desired region and levels
-
basmati.hydrosheds.load_hydrosheds_dem(hydrosheds_dir: Union[str, pathlib.Path], region: str, resolution: str = '30s', hydrosheds_dem_file_tpl: str = '{region}_dem_{resolution}.bil') → Tuple[numpy.ndarray, affine.Affine, numpy.ndarray, numpy.ndarray]¶ Load a HydroSHEDS Digital Elevation Model (DEM).
- Parameters
hydrosheds_dir – directory of HydroSHEDS datasets
region – 2 character region code
resolution – resolution to load
hydrosheds_dem_file_tpl – filename template
- Returns
bounds, affine transform, DEM and mask of the DEM
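The two filename templates documented above are ordinary Python format strings. The sketch below only shows which filenames the default templates expand to; the helper names are hypothetical, the region code "as" is used purely as an example, and where the files sit inside hydrosheds_dir is not covered here.

```python
from typing import Iterable, List

# Default templates copied from the signatures documented above.
HYDROBASINS_FILE_TPL = "hybas_{region}_lev{level:02}_v1c.shp"
HYDROSHEDS_DEM_FILE_TPL = "{region}_dem_{resolution}.bil"


def hydrobasins_filenames(region: str, levels: Iterable[int] = range(1, 7)) -> List[str]:
    """Expand the HydroBASINS template once per requested Pfafstetter level."""
    return [HYDROBASINS_FILE_TPL.format(region=region, level=level) for level in levels]


def dem_filename(region: str, resolution: str = "30s") -> str:
    """Expand the DEM template for one region and resolution."""
    return HYDROSHEDS_DEM_FILE_TPL.format(region=region, resolution=resolution)


print(hydrobasins_filenames("as")[0])  # hybas_as_lev01_v1c.shp
print(dem_filename("as"))              # as_dem_30s.bil
```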
-
basmati.hydrosheds.is_downstream(pfaf_id_a: Union[int, str], pfaf_id_b: Union[int, str]) → bool¶ Calculate if pfaf_id_b is downstream of pfaf_id_a
Implemented as in https://en.wikipedia.org/wiki/Pfafstetter_Coding_System#Properties. Works even if pfaf_id_a and pfaf_id_b are at different levels.
- Parameters
pfaf_id_a – first Pfafstetter id (upstream)
pfaf_id_b – second Pfafstetter id (downstream)
- Returns
True if pfaf_id_b is downstream of pfaf_id_a, False otherwise or if a == b
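The downstream rule from the linked Wikipedia section can be sketched as follows: strip the leading digits the two codes share, then require that the remaining digits of b are all odd and start with a smaller digit than the remaining digits of a. This is a sketch of that rule under stated assumptions, not basmati's own code; the name is_downstream_sketch is hypothetical, and returning False when one code is a prefix of the other is an assumption.

```python
from typing import Union


def is_downstream_sketch(pfaf_id_a: Union[int, str], pfaf_id_b: Union[int, str]) -> bool:
    """Return True if basin b is downstream of basin a (illustrative only)."""
    a, b = str(pfaf_id_a), str(pfaf_id_b)

    # Count the leading digits that both codes share.
    n = 0
    while n < len(a) and n < len(b) and a[n] == b[n]:
        n += 1
    rest_a, rest_b = a[n:], b[n:]

    # Identical codes are not downstream of each other. Treating a code that is
    # a prefix of the other (nested basins) the same way is an assumption.
    if not rest_a or not rest_b:
        return False

    # After stripping the shared prefix the first digits differ, so comparing
    # them decides "less than". Odd digits mark main-stem segments.
    return rest_b[0] < rest_a[0] and all(int(d) % 2 == 1 for d in rest_b)
```

With these rules, 8833 and 8811 are downstream of 8835, while 8832 (a tributary, because the differing digit is even), 8821 and 9135 are not.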
-
basmati.hydrosheds._find_downstream(gdf: geopandas.geodataframe.GeoDataFrame, start_basin_pfaf_id: int) → geopandas.geodataframe.GeoDataFrame¶ Find all downstream basins at the same level as the start basin.
Can also be used as a method on a gpd.GeoDataFrame: gdf.find_downstream(start_basin_pfaf_id)
- Parameters
gdf – hydrobasins geodataframe to traverse
start_basin_pfaf_id – Pfafstetter id of start basin
- Returns
filtered geodataframe at level of start basin based on which basins are downstream of start basin
-
basmati.hydrosheds._find_upstream(gdf: geopandas.geodataframe.GeoDataFrame, start_basin_pfaf_id: int) → geopandas.geodataframe.GeoDataFrame¶ Find all upstream basins at the same level as the start basin.
Can also be used as a method on a gpd.GeoDataFrame: gdf.find_upstream(start_basin_pfaf_id)
- Parameters
gdf – hydrobasins geodataframe to traverse
start_basin_pfaf_id – Pfafstetter id of start basin
- Returns
filtered geodataframe at level of start basin based on which basins are upstream of start basin
-
basmati.hydrosheds._find_next_level_larger(gdf: geopandas.geodataframe.GeoDataFrame, start_basin_pfaf_id: int) → geopandas.geodataframe.GeoDataFrame¶ Find basin one level lower (i.e. found basin is larger).
For example, if start_basin_pfaf_id == 913, basin 91 is returned. Can return 0 or 1 basins.
Can also be used as a method on a gpd.GeoDataFrame: gdf.find_next_level_larger(start_basin_pfaf_id)
- Parameters
gdf – hydrobasins geodataframe to traverse
start_basin_pfaf_id – Pfafstetter id of start basin
- Returns
filtered geodataframe with 0 or 1 basins at level lower
-
basmati.hydrosheds._find_next_level_smaller(gdf: geopandas.geodataframe.GeoDataFrame, start_basin_pfaf_id: int) → geopandas.geodataframe.GeoDataFrame¶ Find basins one level higher (i.e. found basins are smaller).
For example, if start_basin_pfaf_id == 91, basins 911, 912, … 919 are returned. Can return 0-9 basins.
Can also be used as a method on a gpd.GeoDataFrame: gdf.find_next_level_smaller(start_basin_pfaf_id)
- Parameters
gdf – hydrobasins geodataframe to traverse
start_basin_pfaf_id – Pfafstetter id of start basin
- Returns
filtered geodataframe with 0-9 basins at level higher
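The relationship between a Pfafstetter id and its neighbouring levels is plain digit arithmetic, which the sketch below illustrates using the 913 / 91 examples above. It works on ids only, since the GeoDataFrame column that stores the id is not documented here; both function names are hypothetical. The documented functions additionally filter the geodataframe, which is why they may return fewer basins than the ids listed here.

```python
from typing import List


def next_level_larger_id(pfaf_id: int) -> int:
    """Drop the last digit: 913 -> 91 (the enclosing, larger basin)."""
    return pfaf_id // 10


def next_level_smaller_ids(pfaf_id: int) -> List[int]:
    """Append digits 1-9: 91 -> 911 ... 919 (the candidate sub-basins)."""
    return [pfaf_id * 10 + digit for digit in range(1, 10)]


print(next_level_larger_id(913))         # 91
print(next_level_smaller_ids(91)[:3])    # [911, 912, 913]
```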
-
basmati.hydrosheds._area_select(gdf: geopandas.geodataframe.GeoDataFrame, min_area: float, max_area: float) → geopandas.geodataframe.GeoDataFrame¶ Select basins from lower to higher levels that are between min_area and max_area in area.
Start by working out if any basins at e.g. level 1 are selected. Then move on to higher levels (smaller basins). At each level, only add basins if the basin at the level below has not been added. e.g. level 3 basins 411 to 419 will not be added if at level 2 basin 41 was added.
Can also be used as a method on a gpd.GeoDataFrame: gdf.area_select(min_area, max_area)
- Parameters
gdf – hydrobasins geodataframe to traverse
min_area – minimum area of basin
max_area – maximum area of basin
- Returns
filtered geodataframe from any level (favouring lower levels) with area between min and max
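The selection procedure described above can be sketched on a plain mapping of Pfafstetter id to area. This is an illustration rather than basmati's implementation: the function name and the dict input are hypothetical, the area bounds are assumed to be inclusive, and a basin is skipped if any enclosing basin (not only the immediate one) was already selected.

```python
from typing import Dict, List


def area_select_sketch(areas: Dict[int, float], min_area: float, max_area: float) -> List[int]:
    """Pick basins within [min_area, max_area], favouring lower levels."""
    selected: List[int] = []
    # Shorter ids are lower levels (larger basins), so visit them first.
    for pfaf_id in sorted(areas, key=lambda i: (len(str(i)), i)):
        digits = str(pfaf_id)
        # Every proper prefix of the id names an enclosing basin.
        ancestors = {int(digits[:n]) for n in range(1, len(digits))}
        if ancestors.intersection(selected):
            continue  # an enclosing basin was already added
        if min_area <= areas[pfaf_id] <= max_area:
            selected.append(pfaf_id)
    return selected


# Made-up areas for one level 1 basin and its sub-basins.
example = {4: 1000.0, 41: 300.0, 42: 700.0, 411: 100.0, 412: 200.0, 421: 350.0, 422: 350.0}
print(area_select_sketch(example, 250, 800))  # [41, 42]
print(area_select_sketch(example, 90, 400))   # [41, 421, 422]
```

In the second call basin 42 is too large, so its sub-basins 421 and 422 are considered, while 411 and 412 are skipped because 41 was already added.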
basmati.utils¶
-
basmati.utils.build_raster_from_geometries(geometries: Iterable[shapely.geometry.base.BaseGeometry], shape: Iterable[int], tx: affine.Affine) → numpy.ndarray¶ Build a 2D raster from the geometries (e.g. gdf.geometry)
Each geometry is assigned an index, which increments by one for each geometry.
- Parameters
geometries – Individual geometries
shape – shape of desired raster
tx – affine transform to apply to each geometry before rasterizing
- Returns
2D raster in which each cell holds the index of the geometry rasterized there.
-
basmati.utils.coarse_grain2d(arr: numpy.ndarray, grain_size: List[int]) → numpy.ndarray¶ Coarse grain a 2D arr based on grain_size
- Parameters
arr – array to coarse grain
grain_size – 2 value size of grain
- Returns
coarse-grained array
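One common way to coarse grain a 2D array is to split it into grain_size blocks with a reshape and reduce each block. The sketch below assumes the reduction is a mean and that the array shape is an exact multiple of grain_size; neither is stated above, and the name coarse_grain2d_sketch is hypothetical.

```python
from typing import List

import numpy as np


def coarse_grain2d_sketch(arr: np.ndarray, grain_size: List[int]) -> np.ndarray:
    """Average arr over non-overlapping blocks of shape grain_size (illustrative only)."""
    gy, gx = grain_size
    ny, nx = arr.shape
    if ny % gy or nx % gx:
        raise ValueError("array shape must be a multiple of grain_size")
    # Axes become (block row, row in block, block column, column in block).
    blocks = arr.reshape(ny // gy, gy, nx // gx, gx)
    # Assumption: each block is reduced with a mean.
    return blocks.mean(axis=(1, 3))


arr = np.arange(16).reshape(4, 4)
print(coarse_grain2d_sketch(arr, [2, 2]))
# [[ 2.5  4.5]
#  [10.5 12.5]]
```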
-
basmati.utils.coarse_grain2d_ndim(arr: numpy.ndarray, grain_size: List[int]) → numpy.ndarray¶ Coarse grain an N-D arr based on grain_size
- Parameters
arr – array to coarse grain
grain_size – N value size of grain
- Returns
coarse-grained array
-
basmati.utils.sysrun(cmd: str) → subprocess.CompletedProcess¶ Run a system command
Gets all output (stdout and stderr). To access output: sysrun(cmd).stdout
- Parameters
cmd – command to run
- Raises
sp.CalledProcessError
- Returns
result of cmd
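A helper with the behaviour described above (all output captured, CalledProcessError on failure, output available as .stdout) can be sketched with subprocess.run. The name sysrun_sketch is hypothetical, and running through the shell, merging stderr into stdout and decoding as UTF-8 are assumptions about how such a helper might be written.

```python
import subprocess as sp


def sysrun_sketch(cmd: str) -> sp.CompletedProcess:
    """Run cmd through the shell and capture its output (illustrative only)."""
    return sp.run(
        cmd,
        shell=True,          # assumption: cmd is a single shell string
        check=True,          # raise sp.CalledProcessError on non-zero exit
        stdout=sp.PIPE,      # capture stdout
        stderr=sp.STDOUT,    # assumption: fold stderr into stdout
        encoding="utf8",     # return str rather than bytes
    )


print(sysrun_sketch("echo hello").stdout)  # hello
```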