sgsPy
structurally guided sampling
sgs::clhs::CLHSDataManager< T >

#include <clhs.h>

Public Member Functions

 CLHSDataManager (int nFeat, int nSamp, xso::xoshiro_4x64_plus *p_rng, size_t existingCount)
void addPoint (T *p_features, int x, int y)
void addExistingPoint (T *p_features, int x, int y)
void finalize (std::vector< std::vector< T > > &corr)
uint64_t randomIndex ()
void getRandomPoint (Point< T > &point)
T quantileObjectiveFunc (std::vector< int > &sampleCountPerQuantile)
T correlationObjectiveFunc (std::vector< std::vector< T > > &corr)
void getExistingFeatures (std::vector< T > &features, std::vector< int > &x, std::vector< int > &y)

Detailed Description

template<typename T>
class sgs::clhs::CLHSDataManager< T >

This class is responsible for managing the data for the clhs sampling method.

It contains a vector with the feature values of all added pixels, as well as the x and y values of those pixels, and can randomly return one of those pixels as desired. It also stores the correlation matrix of the raster.

Constructor & Destructor Documentation

◆ CLHSDataManager()

template<typename T>
sgs::clhs::CLHSDataManager< T >::CLHSDataManager ( int nFeat,
int nSamp,
xso::xoshiro_4x64_plus * p_rng,
size_t existingCount )
inline

Constructor. Sets nFeat, nSamp, and the random number generator, and initially sizes the vectors to hold 1,000,000 points. The vectors grow as required and are sized down once raster reading is complete.

Also sets the sizes and count values for existing pixels if required.

Parameters
int nFeat
int nSamp
xso::xoshiro_4x64_plus *p_rng
size_t existingCount

Member Function Documentation

◆ addExistingPoint()

template<typename T>
void sgs::clhs::CLHSDataManager< T >::addExistingPoint ( T * p_features,
int x,
int y )
inline

This function saves an existing sample point to the data manager. It takes a feature vector (array) as a parameter, alongside the x and y indices of the pixel containing those features.

The x, y, and features vectors are updated accordingly.

Parameters
T *p_features
int x
int y

◆ addPoint()

template<typename T>
void sgs::clhs::CLHSDataManager< T >::addPoint ( T * p_features,
int x,
int y )
inline

This function saves a point to the data manager. It takes a feature vector (array) as a parameter, alongside the x and y indices of the pixel containing those features.

The x, y, and features vectors are updated accordingly and resized if required.

Parameters
T *p_features
int x
int y
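
The storage pattern described for addPoint() could look roughly like the sketch below. It assumes features are kept row-major in one flat vector and that capacity doubles when full. `PointStore`, its members, and the growth policy are hypothetical illustrations and are not taken from clhs.h.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the point storage; not the real CLHSDataManager.
template <typename T>
struct PointStore {
    int nFeat;                 // number of features per pixel
    std::size_t count = 0;     // number of points stored so far
    std::vector<T> features;   // row-major: point i occupies [i*nFeat, (i+1)*nFeat)
    std::vector<int> x, y;     // pixel indices of each stored point

    PointStore(int nFeat, std::size_t initialCapacity)
        : nFeat(nFeat),
          features(initialCapacity * nFeat),
          x(initialCapacity),
          y(initialCapacity) {}

    void addPoint(const T *p_features, int px, int py) {
        if (count == x.size()) {
            // Assumed growth policy: double the capacity when full.
            std::size_t newCap = x.empty() ? 1 : x.size() * 2;
            features.resize(newCap * nFeat);
            x.resize(newCap);
            y.resize(newCap);
        }
        // Copy the feature array into this point's slot of the flat vector.
        for (int f = 0; f < nFeat; ++f) {
            features[count * nFeat + f] = p_features[f];
        }
        x[count] = px;
        y[count] = py;
        ++count;
    }

    // Size the vectors down to the points actually stored,
    // mirroring the resize described for finalize().
    void shrink() {
        features.resize(count * nFeat);
        x.resize(count);
        y.resize(count);
    }
};
```

A flat feature vector keeps all values contiguous, so the features of point i can be handed out as a single pointer into the vector.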

◆ correlationObjectiveFunc()

template<typename T>
T sgs::clhs::CLHSDataManager< T >::correlationObjectiveFunc ( std::vector< std::vector< T > > & corr)
inline

This function calculates the output of the objective function for the differences in the correlation matrices for the clhs method. It penalizes the sample for having features that are correlated differently than in the whole sample space.

The output of this function, along with the output of the quantile objective function, is used to determine whether a sample should be kept or not. Lower values are better.

Parameters
std::vector<std::vector<T>> &corr
Returns
T
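
As an illustration of the kind of penalty described here, the sketch below sums the absolute element-wise differences between a sample correlation matrix and the whole-raster correlation matrix. This is an assumption about the form of the penalty; the exact formula used in clhs.h may differ, and `correlationPenalty` is a hypothetical name.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

// Hypothetical helper: sum of |sampleCorr[i][j] - rasterCorr[i][j]|.
// Both matrices are assumed to be square and of the same size.
template <typename T>
T correlationPenalty(const std::vector<std::vector<T>> &sampleCorr,
                     const std::vector<std::vector<T>> &rasterCorr) {
    T total = 0;
    for (std::size_t i = 0; i < rasterCorr.size(); ++i) {
        for (std::size_t j = 0; j < rasterCorr[i].size(); ++j) {
            total += std::abs(sampleCorr[i][j] - rasterCorr[i][j]);
        }
    }
    return total;
}
```

Under this assumption, a sample whose correlation matrix matches the raster's exactly scores 0, and the score grows as the two matrices diverge.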

◆ finalize()

template<typename T>
void sgs::clhs::CLHSDataManager< T >::finalize ( std::vector< std::vector< T > > & corr)
inline

This function is called once the raster has been read, meaning no more points will be added to the data manager. The correlation matrix, which is calculated just after the raster reading finishes, is passed as a parameter so that it can be saved.

The x, y, and features vectors are resized.

A mask is generated, which is used along with the random number generator to produce random indices within the saved points. Every bit of the mask is set to 1, from the most significant bit that is set in the largest index down to the least significant bit.

When ANDed with a new random number, the mask yields a value that can be any of the indices and is quite likely not to be larger than the capacity. If it is larger than the capacity, a new value is simply generated.

Parameters
std::vector<std::vector<T>> &corr
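
One common way to build a mask of the kind described above is to smear the highest set bit of the largest index into every lower bit position. The sketch below shows that technique; `indexMask` is a hypothetical helper, and the source does not confirm that finalize() computes its mask this way.

```cpp
#include <cstdint>

// Hypothetical helper: returns the smallest all-ones mask that covers maxIndex,
// i.e. every bit from the highest set bit of maxIndex down to bit 0 is 1.
inline std::uint64_t indexMask(std::uint64_t maxIndex) {
    std::uint64_t mask = maxIndex;
    // Each step copies the set bits into the positions below them.
    mask |= mask >> 1;
    mask |= mask >> 2;
    mask |= mask >> 4;
    mask |= mask >> 8;
    mask |= mask >> 16;
    mask |= mask >> 32;
    return mask;
}
```

For a largest index of 999999 this gives a mask of 1048575 (2^20 - 1), so roughly 95% of masked random values would fall on a valid index and the rest would be redrawn.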

◆ getExistingFeatures()

template<typename T>
void sgs::clhs::CLHSDataManager< T >::getExistingFeatures ( std::vector< T > & features,
std::vector< int > & x,
std::vector< int > & y )
inline

Copy features, x, and y vectors from CLHS data manager.

Parameters
std::vector<T> &features
std::vector<int> &x
std::vector<int> &y

◆ getRandomPoint()

template<typename T>
void sgs::clhs::CLHSDataManager< T >::getRandomPoint ( Point< T > & point)
inline

This function sets the x, y, and features pointer for a given index.

Parameters
Point<T> &point
uint64_t index

◆ quantileObjectiveFunc()

template<typename T>
T sgs::clhs::CLHSDataManager< T >::quantileObjectiveFunc ( std::vector< int > & sampleCountPerQuantile)
inline

This function calculates the output of the continuous objective function for the clhs method. The goal of the clhs method is to produce a set of samples that forms a Latin hypercube, defined as having exactly one sample within each quantile of the hypercube for every feature. This function returns 0 if the sample is a Latin hypercube, and otherwise the number of samples by which it is off.

The output of this function, along with the output of the correlation objective function, is used to determine whether a sample should be kept or not. Lower values are better.

Parameters
std::vector<int> &sampleCountPerQuantile
Returns
T
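
A minimal sketch of such a penalty, assuming the count vector holds one entry per feature and quantile and that the "number of samples off" is the sum of each entry's distance from the ideal count of 1. `quantilePenalty` is a hypothetical name, and the real function may weight or aggregate the counts differently.

```cpp
#include <cstdlib>
#include <vector>

// Hypothetical helper: sum of |count - 1| over every (feature, quantile) entry.
// A perfect Latin hypercube has exactly one sample per entry, giving 0.
template <typename T>
T quantilePenalty(const std::vector<int> &sampleCountPerQuantile) {
    T total = 0;
    for (int c : sampleCountPerQuantile) {
        total += static_cast<T>(std::abs(c - 1));
    }
    return total;
}
```

Under this assumption, counts of {1, 1, 1, 1} score 0, while {2, 0, 1, 1} score 2: one quantile holds a surplus sample and another is empty.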

◆ randomIndex()

template<typename T>
uint64_t sgs::clhs::CLHSDataManager< T >::randomIndex ( )
inline

This function is for generating a random index among the points saved.

Because the random number generator and a mask are used directly (which is faster than something like std::uniform_int_distribution), some generated values may be larger than the largest occupied index. Indices are therefore generated until a valid one is produced.

The generator's output is first shifted right by 11 bits because this type of generator is less random in its 11 low-order bits.

Returns
uint64_t

The documentation for this class was generated from the following file: