sgsPy
structurally guided sampling

Functions

 sgspy.stratify.quantiles.quantiles.quantiles (SpatialRaster rast, int|list[float]|list[int|list[float]]|dict[str, int|list[float]] quantiles, bool map=False, str filename='', int thread_count=8, dict driver_options=None, float eps=.001, Optional[bool] plot=None, Optional[int] histogram_bins=None, Optional[bool] info=None)
 This function stratifies the given raster by generating quantile probabilities according to the user-supplied 'quantiles' argument.

Detailed Description

Function Documentation

quantiles()

sgspy.stratify.quantiles.quantiles.quantiles ( SpatialRaster rast,
int | list[float] | list[int|list[float]] | dict[str,int|list[float]] quantiles,
bool map = False,
str filename = '',
int thread_count = 8,
dict driver_options = None,
float eps = .001,
Optional[bool] plot = None,
Optional[int] histogram_bins = None,
Optional[bool] info = None )

This function stratifies the given raster by generating quantile probabilities according to the user-supplied 'quantiles' argument.

The quantiles may be defined as an integer, indicating the number of quantiles of equal size. Quantiles may also be defined as a list of probabilities between 0 and 1. In the case of a raster with a single band, the quantiles may be passed directly to the quantiles argument as either type: int | list[float].
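To illustrate how the two single-band forms relate, the sketch below converts an integer count into an equivalent list of interior probabilities and assigns in-memory values to strata. It uses only the Python standard library; `equal_probabilities` and `stratify_values` are hypothetical helper names, and the break and tie-handling conventions shown are assumptions rather than sgsPy's documented behaviour.

```python
from bisect import bisect_right
from statistics import quantiles as stat_quantiles


def equal_probabilities(n):
    """Interior probabilities that split [0, 1] into n equal parts.

    Assumed convention: the endpoints 0 and 1 are not listed.
    """
    if n < 2:
        raise ValueError("n must be at least 2")
    return [i / n for i in range(1, n)]


def stratify_values(values, n):
    """Assign each value a stratum index in 0..n-1 using n equal-sized quantiles."""
    # Break values at the n - 1 interior quantiles of the data.
    breaks = stat_quantiles(values, n=n, method="inclusive")
    # A value equal to a break falls into the upper stratum (an assumption).
    return [bisect_right(breaks, v) for v in values]


probabilities = equal_probabilities(4)       # [0.25, 0.5, 0.75]
strata = stratify_values(list(range(1, 101)), 4)
```

Under this convention, passing an integer n would be shorthand for passing the n - 1 evenly spaced probabilities.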

In the case of a multi-band raster image, the quantiles for each band can be given as a list containing one entry per band, where the index of each entry corresponds to the band index (list[int | list[float]]).

If not all raster bands should be stratified, specific bands can be selected with a dict where each key is the name of a raster band and each value is the quantiles for that band (dict[str, int | list[float]]).
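The accepted forms of the quantiles argument can be pictured as all reducing to a mapping from band name to a per-band specification. The sketch below shows one way to do that in plain Python; `normalize_quantiles` is a hypothetical helper, not part of sgsPy, and the error conditions it raises are assumptions.

```python
def normalize_quantiles(spec, band_names):
    """Return {band_name: int | list[float]} for any accepted form.

    Hypothetical helper for illustration only.
    """
    if isinstance(spec, dict):
        # dict form: only the named bands are stratified.
        unknown = [name for name in spec if name not in band_names]
        if unknown:
            raise ValueError(f"unknown band names: {unknown}")
        return dict(spec)

    # An int, or a list made up only of floats, describes a single band.
    is_single_spec = isinstance(spec, int) or all(
        isinstance(p, float) for p in spec
    )
    if is_single_spec:
        if len(band_names) != 1:
            raise ValueError("a single specification requires a single-band raster")
        return {band_names[0]: spec}

    # list form: one specification per band, matched by index.
    if len(spec) != len(band_names):
        raise ValueError("expected one specification per band")
    return dict(zip(band_names, spec))
```

For example, `normalize_quantiles([5, 5, [.5, .75]], ["a", "b", "c"])` pairs each entry with the band at the same index, while `normalize_quantiles({"zq90": 5}, ["zq90", "other"])` keeps only the named band.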

If the map parameter is given, an extra output band is added that combines the stratifications of all bands used. Each value in the mapped output band corresponds to a single combination of values from the preceding bands.
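One way to picture the mapped band is as a mixed-radix encoding of the per-band stratum indices, so that every distinct combination receives its own value. The sketch below is an assumption about how such a mapping could work, not a description of the numbering sgsPy actually uses; `combine_strata` is a hypothetical name.

```python
def combine_strata(band_strata, strata_counts):
    """Encode per-band stratum indices as one code per pixel.

    band_strata:   one list of stratum indices per band, all of equal length
    strata_counts: number of strata in each band, in the same order
    """
    mapped = []
    for pixel in zip(*band_strata):
        code = 0
        for stratum, count in zip(pixel, strata_counts):
            # Shift the code by this band's radix, then add its stratum.
            code = code * count + stratum
        mapped.append(code)
    return mapped


band_a = [0, 0, 1, 1]  # 2 strata
band_b = [0, 2, 0, 2]  # 3 strata
mapped = combine_strata([band_a, band_b], [2, 3])
```

With this encoding, two bands with 2 and 3 strata yield at most 6 distinct mapped values.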

The thread_count parameter specifies the number of threads this function will use in the case where the raster is large and may not fit in memory. If the full raster can fit in memory and does not need to be processed in blocks, this argument is ignored. The default is 8 threads, although the optimal number depends significantly on the hardware being used and may be more or less than 8.

The driver_options parameter specifies creation options for the output raster. See the options for the GTiff driver here: https://gdal.org/en/stable/drivers/raster/gtiff.html#creation-options The keys in the driver_options dict must be strings; the values are converted to strings. The options must be valid for the driver corresponding to the filename; if filename is not given, they must be valid for the GTiff format, as that is the format used to store temporary raster files. Note that if this parameter is given but filename is not, and the raster fits entirely in memory, the driver_options parameter is ignored.
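As an illustration of the key and value rules, the sketch below builds a driver_options dict from GTiff creation options (COMPRESS, TILED, BLOCKXSIZE, BLOCKYSIZE) and shows the kind of string conversion described above. `stringify_options` is a hypothetical helper, and raising TypeError for non-string keys is an assumption about how invalid input might be rejected.

```python
def stringify_options(options):
    """Check that keys are strings and convert every value to a string."""
    if options is None:
        return {}
    for key in options:
        if not isinstance(key, str):
            raise TypeError(f"driver option keys must be str, got {type(key).__name__}")
    return {key: str(value) for key, value in options.items()}


# Values may be given as strings or numbers; all end up as strings.
driver_options = {
    "COMPRESS": "LZW",
    "TILED": "YES",
    "BLOCKXSIZE": 256,
    "BLOCKYSIZE": 256,
}
creation_options = stringify_options(driver_options)
```

A dict like `driver_options` would be passed together with a filename such as "srast.tif", since the options only take effect when a raster file is actually written.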

The eps parameter is used only if batch processing is used to calculate the quantiles for a raster. Quantile streaming algorithms cannot be perfectly accurate, as this would require holding the entire raster in memory at once. A good approximation can be made, however, and its error is controlled by the epsilon (eps) value. The quantile streaming method is the one introduced by Zhang et al. and used by MKL: https://web.cs.ucla.edu/~weiwang/paper/SSDBM07_2.pdf https://www.intel.com/content/www/us/en/docs/onemkl/developer-reference-summary-statistics-notes/2021-1/computing-quantiles-with-vsl-ss-method-squants-zw.html
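In the quantile-summary literature, an eps-approximate quantile is usually defined by rank: for N values and probability p, the returned value must have a rank within eps * N of p * N. The sketch below checks that condition on a sorted in-memory list. It is an illustration of the error notion only, assuming sgsPy interprets eps in this standard way; `within_rank_error` is a hypothetical name.

```python
from bisect import bisect_left, bisect_right


def within_rank_error(sorted_values, estimate, p, eps):
    """Return True if `estimate` ranks within eps * N of the target rank p * N."""
    n = len(sorted_values)
    lowest_rank = bisect_left(sorted_values, estimate)    # values strictly below
    highest_rank = bisect_right(sorted_values, estimate)  # values at or below
    target = p * n
    return lowest_rank - eps * n <= target <= highest_rank + eps * n


data = list(range(1000))                         # already sorted
exact = within_rank_error(data, 500, 0.5, 0.001)
too_far = within_rank_error(data, 510, 0.5, 0.001)
looser = within_rank_error(data, 510, 0.5, 0.02)
```

With 1000 values, eps = 0.001 tolerates a rank error of about one position, while eps = 0.02 tolerates about twenty, which is why a smaller eps gives breaks closer to the exact quantiles.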

Additionally, the 'plot' parameter determines whether a histogram plot is made of the stratified bands. The histogram shows the distribution of each raster band that was stratified, with indicators at the break values. The 'histogram_bins' parameter sets the number of bins in each plotted histogram.

The 'info' parameter, when true, prints the quantile values of each raster band.

Examples

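import sgspy

# Single-band raster: five quantiles of equal size.
rast = sgspy.SpatialRaster('rast.tif')
srast = sgspy.stratify.quantiles(rast, quantiles=5)

# Single-band raster: explicit probabilities, written to a file.
rast = sgspy.SpatialRaster('rast.tif')
srast = sgspy.stratify.quantiles(rast, quantiles=[.1, .2, .3, .5, .7], filename="srast.tif")

# Multi-band raster: one specification per band, plus a mapped output band.
rast = sgspy.SpatialRaster('multi_band_rast.tif')
srast = sgspy.stratify.quantiles(rast, quantiles=[5, 5, [.5, .75]], map=True)

# Multi-band raster: stratify only the band named 'zq90'.
rast = sgspy.SpatialRaster('multi_band_rast.tif')
srast = sgspy.stratify.quantiles(rast, quantiles={'zq90': 5})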

Parameters

rast : SpatialRaster
raster data structure containing the raster to stratify

quantiles : int | list[float] | list[int|list[float]] | dict[str,int|list[float]]
specification of the quantiles to stratify

map : bool
whether to map the stratification of multiple raster bands onto a single band

filename : str
filename to write to or '' if no file should be written

thread_count : int
the number of threads to use when multithreading large images

driver_options : dict
the creation options as defined by GDAL which will be passed when creating output files

eps : float
the epsilon value, controlling the error of stream-processed quantiles

plot : Optional[bool]
whether or not to plot a histogram of the values in the bands with indicators on the breaks

histogram_bins : Optional[int]
The number of bins in the plotted histogram.

info : Optional[bool]
when true, print the quantile values of each band after calculation

Returns

a SpatialRaster object containing stratified raster bands.