Implementations

Included with this toolkit are a number of implementations for the interfaces described in the previous section. Unlike the interfaces, which declare operation and use case, implementations provide variations on how to satisfy the interface-defined use case, varying trade-offs, or results implications.

Image Perturbation

Class: RandomGrid

class xaitk_saliency.impls.perturb_image.random_grid.RandomGrid(n: int, s: Tuple[int, int], p1: float, seed: int | None = None, threads: int | None = None)

Generate masks using a random grid of set cell size. If the chosen cell size does not divide an image evenly, then the grid is over-sized and the resulting mask is centered and cropped. Each mask is also shifted randomly by a maximum of half the cell size in both x and y.

This method is based on RISE (http://bmvc2018.org/contents/papers/1064.pdf) but aims to address the changing cell size, given images of different sizes, aspect of that implementation. This method keeps cell size constant and instead adjusts the overall grid size for different sized images.

Parameters:

n – Number of masks to generate.
s – Dimensions of the grid cells in pixels. E.g. (3, 4) would use a grid of 3x4 pixel cells.
p1 – Probability of a grid cell being set to 1 (not occluded). This should be a float value in the [0, 1] range.
seed – A seed to use for the random number generator, allowing for masks to be reproduced.
threads – Number of threads to use when generating masks. If this is <=0 or None, no threading is used and processing is performed in-line serially.

get_config() → Dict[str, Any]

Return a JSON-compliant dictionary that could be passed to this class’s from_config method to produce an instance with identical configuration.

In the most cases, this involves naming the keys of the dictionary based on the initialization argument names as if it were to be passed to the constructor via dictionary expansion. In some cases, where it doesn’t make sense to store some object constructor parameters are expected to be supplied at as configuration values (i.e. must be supplied at runtime), this method’s returned dictionary may leave those parameters out. In such cases, the object’s from_config class-method would also take additional positional arguments to fill in for the parameters that this returned configuration lacks.

Returns:: JSON type compliant configuration dictionary.
Return type:: dict

perturb(ref_img: ndarray) → ndarray

Transform an input reference image into a number of mask matrices indicating the perturbed regions.

Output mask matrix should be three-dimensional with the format [nMasks x Height x Width], sharing the same height and width to the input reference image. The implementing algorithm may determine the quantity of output masks per input image. These masks should indicate the regions in the corresponding perturbed image that have been modified. Values should be in the [0, 1] range, where a value closer to 1.0 indicates areas of the image that are unperturbed. Note that output mask matrices may be of a floating-point type to allow for fractional perturbation.

Parameters:: ref_image – Reference image to generate perturbations from.
Returns:: Mask matrix with shape [nMasks x Height x Width].

Class: RISEGrid

class xaitk_saliency.impls.perturb_image.rise.RISEGrid(n: int, s: int, p1: float, seed: int | None = None, threads: int | None = 4)

Based on Petsiuk et. al: http://bmvc2018.org/contents/papers/1064.pdf

Implementation is borrowed from the original authors: https://github.com/eclique/RISE/blob/master/explanations.py

Generate a set of random binary masks

Parameters:

n – Number of random masks used in the algorithm. E.g. 1000.
s – Spatial resolution of the small masking grid. E.g. 8. Assumes square grid.
p1 – Probability of the grid cell being set to 1 (otherwise 0). This should be a float value in the [0, 1] range. E.g. 0.5.
seed – A seed to pass into the constructed random number generator to allow for reproducibility
threads – The number of threads to utilize when generating masks. If this is <=0 or None, no threading is used and processing is performed in-line serially.

get_config() → Dict[str, Any]

Return a JSON-compliant dictionary that could be passed to this class’s from_config method to produce an instance with identical configuration.

In the most cases, this involves naming the keys of the dictionary based on the initialization argument names as if it were to be passed to the constructor via dictionary expansion. In some cases, where it doesn’t make sense to store some object constructor parameters are expected to be supplied at as configuration values (i.e. must be supplied at runtime), this method’s returned dictionary may leave those parameters out. In such cases, the object’s from_config class-method would also take additional positional arguments to fill in for the parameters that this returned configuration lacks.

Returns:: JSON type compliant configuration dictionary.
Return type:: dict

perturb(ref_image: ndarray) → ndarray

Transform an input reference image into a number of mask matrices indicating the perturbed regions.

Output mask matrix should be three-dimensional with the format [nMasks x Height x Width], sharing the same height and width to the input reference image. The implementing algorithm may determine the quantity of output masks per input image. These masks should indicate the regions in the corresponding perturbed image that have been modified. Values should be in the [0, 1] range, where a value closer to 1.0 indicates areas of the image that are unperturbed. Note that output mask matrices may be of a floating-point type to allow for fractional perturbation.

Parameters:: ref_image – Reference image to generate perturbations from.
Returns:: Mask matrix with shape [nMasks x Height x Width].

Class: SlidingRadial

class xaitk_saliency.impls.perturb_image.sliding_radial.SlidingRadial(radius: Tuple[float, float] = (50, 50), stride: Tuple[int, int] = (20, 20), sigma: Tuple[float, float] | None = None)

Produce perturbation matrices generated by sliding a radial occlusion area with configured radius over the area of an image. When the two radius values are the same, circular masks are generated; otherwise, elliptical masks are generated. Passing sigma values will apply a Gaussian filter to the mask, blurring it. This results in a smooth transition from full occlusion in the center of the radial to no occlusion at the edge.

Due to the geometry of sliding radials, if the stride given does not evenly divide the radial size along the applicable axis, then the result plane of values when summing the generated masks will not be even.

Related, if the stride is set to be larger than the radial diameter, the resulting plane of summed values will also not be even, as there be increasingly long valleys of unperturbed space between masked regions.

The generated masks are boolean if no blurring is used, otherwise the masks will be of floating-point type in the [0, 1] range.

Parameters:

radius – The radius of the occlusion area in pixels as a tuple with format (radius_y, radius_x).
stride – The striding step in pixels for the center of the radial as a tuple with format (height_step, width_step).
sigma – The sigma values for the Gaussian filter applied to masks in pixels as a tuple with format (sigma_y, sigma_x).

get_config() → Dict[str, Any]

Return a JSON-compliant dictionary that could be passed to this class’s from_config method to produce an instance with identical configuration.

In the most cases, this involves naming the keys of the dictionary based on the initialization argument names as if it were to be passed to the constructor via dictionary expansion. In some cases, where it doesn’t make sense to store some object constructor parameters are expected to be supplied at as configuration values (i.e. must be supplied at runtime), this method’s returned dictionary may leave those parameters out. In such cases, the object’s from_config class-method would also take additional positional arguments to fill in for the parameters that this returned configuration lacks.

Returns:: JSON type compliant configuration dictionary.
Return type:: dict

classmethod get_default_config() → Dict[str, Any]

Generate and return a default configuration dictionary for this class. This will be primarily used for generating what the configuration dictionary would look like for this class without instantiating it.

By default, we observe what this class’s constructor takes as arguments, turning those argument names into configuration dictionary keys. If any of those arguments have defaults, we will add those values into the configuration dictionary appropriately. The dictionary returned should only contain JSON compliant value types.

It is not be guaranteed that the configuration dictionary returned from this method is valid for construction of an instance of this class.

Returns:: Default configuration dictionary for the class.
Return type:: dict

>>> # noinspection PyUnresolvedReferences
>>> class SimpleConfig(Configurable):
...     def __init__(self, a=1, b='foo'):
...         self.a = a
...         self.b = b
...     def get_config(self):
...         return {'a': self.a, 'b': self.b}
>>> self = SimpleConfig()
>>> config = self.get_default_config()
>>> assert config == {'a': 1, 'b': 'foo'}

perturb(ref_image: ndarray) → ndarray

Transform an input reference image into a number of mask matrices indicating the perturbed regions.

Output mask matrix should be three-dimensional with the format [nMasks x Height x Width], sharing the same height and width to the input reference image. The implementing algorithm may determine the quantity of output masks per input image. These masks should indicate the regions in the corresponding perturbed image that have been modified. Values should be in the [0, 1] range, where a value closer to 1.0 indicates areas of the image that are unperturbed. Note that output mask matrices may be of a floating-point type to allow for fractional perturbation.

Parameters:: ref_image – Reference image to generate perturbations from.
Returns:: Mask matrix with shape [nMasks x Height x Width].

Class: SlidingWindow

class xaitk_saliency.impls.perturb_image.sliding_window.SlidingWindow(window_size: Tuple[int, int] = (50, 50), stride: Tuple[int, int] = (20, 20))

Produce perturbation matrices based on hard, block-y occlusion areas as generated by sliding a window of a configured size over the area of an image.

Due to the geometry of sliding windows, if the stride given does not evenly divide the window size along the applicable axis, then the result plane of values when summing the generated masks will not be even.

Related, if the stride is set to be larger than the window size, the resulting plane of summed values will also not be even, as there be increasingly long valleys of unperturbed space between masked regions.

Parameters:

window_size – The block window size in pixels as a tuple with format (height, width).
stride – The sliding window striding step in pixels as a tuple with format (height_step, width_step).

get_config() → Dict[str, Any]

Return a JSON-compliant dictionary that could be passed to this class’s from_config method to produce an instance with identical configuration.

In the most cases, this involves naming the keys of the dictionary based on the initialization argument names as if it were to be passed to the constructor via dictionary expansion. In some cases, where it doesn’t make sense to store some object constructor parameters are expected to be supplied at as configuration values (i.e. must be supplied at runtime), this method’s returned dictionary may leave those parameters out. In such cases, the object’s from_config class-method would also take additional positional arguments to fill in for the parameters that this returned configuration lacks.

Returns:: JSON type compliant configuration dictionary.
Return type:: dict

classmethod get_default_config() → Dict[str, Any]

Generate and return a default configuration dictionary for this class. This will be primarily used for generating what the configuration dictionary would look like for this class without instantiating it.

By default, we observe what this class’s constructor takes as arguments, turning those argument names into configuration dictionary keys. If any of those arguments have defaults, we will add those values into the configuration dictionary appropriately. The dictionary returned should only contain JSON compliant value types.

It is not be guaranteed that the configuration dictionary returned from this method is valid for construction of an instance of this class.

Returns:: Default configuration dictionary for the class.
Return type:: dict

>>> # noinspection PyUnresolvedReferences
>>> class SimpleConfig(Configurable):
...     def __init__(self, a=1, b='foo'):
...         self.a = a
...         self.b = b
...     def get_config(self):
...         return {'a': self.a, 'b': self.b}
>>> self = SimpleConfig()
>>> config = self.get_default_config()
>>> assert config == {'a': 1, 'b': 'foo'}

perturb(ref_image: ndarray) → ndarray

Transform an input reference image into a number of mask matrices indicating the perturbed regions.

Output mask matrix should be three-dimensional with the format [nMasks x Height x Width], sharing the same height and width to the input reference image. The implementing algorithm may determine the quantity of output masks per input image. These masks should indicate the regions in the corresponding perturbed image that have been modified. Values should be in the [0, 1] range, where a value closer to 1.0 indicates areas of the image that are unperturbed. Note that output mask matrices may be of a floating-point type to allow for fractional perturbation.

Parameters:: ref_image – Reference image to generate perturbations from.
Returns:: Mask matrix with shape [nMasks x Height x Width].

Heatmap Generation

Class: DRISEScoring

class xaitk_saliency.impls.gen_detector_prop_sal.drise_scoring.DRISEScoring

This D-RISE implementation transforms black-box object detector predictions into visual saliency heatmaps. Specifically, we make use of perturbed detections generated using the RISEGrid image perturbation class and a similarity metric that captures both the localization and categorization aspects of object detection.

Object detection representations used here would need to encapsulate localization information (i.e. bounding box regions), class scores, and objectness scores (if applicable to the detector, such as YOLOv3). Object detections are converted into (4+1+nClasses) vectors (4 indices for bounding box locations, 1 index for objectness, and nClasses indices for different object classes).

If your detections consist of a single class prediction and confidence score instead of scores for each class, it is best practice to replace the objectness score with the confidence score and use a one-hot encoding of the prediction as the class scores.

Based on Petsiuk et al: https://arxiv.org/abs/2006.03204

generate(ref_dets: ndarray, perturbed_dets: ndarray, perturbed_masks: ndarray) → ndarray

Generate visual saliency heatmap matrices for each reference detection, describing what visual information contributed to the associated reference detection.

We expect input detections to come from a black-box source that outputs our minimum requirements of a bounding-box, per-class scores. Objectness scores are required in our input format, but not necessarily from detection black-box methods as there is a sensible default value for this. See the format_detection() helper function for assistance in forming our input format, which includes this optional default fill-in. We expect objectness is a confidence score valued in the inclusive [0,1] range. We also expect classification scores to be in the inclusive [0,1] range.

We assume that an input detection is coupled with a single truth class (or a single leaf node in a hierarchical structure). Detections input as references (ref_dets parameter) may be either ground truth or predicted detections. As for perturbed image detections input (perturbed_dets), we expect the quantity of detections to be decoupled from the source of reference image detections, which is why below we formulate the shape of perturbed image detections with nProps instead of nDets.

Perturbation mask input into the perturbed_masks parameter here is equivalent to the perturbation mask output from a xaitk_saliency.interfaces.perturb_image.PerturbImage.perturb() method implementation. These should have the shape [nMasks x H x W], and values in range [0, 1], where a value closer to 1 indicate areas of the image that are unperturbed. Note the type of values in masks can be either integer, floating point or boolean within the above range definition. Implementations are responsible for handling these expected variations.

Generated saliency heatmap matrices should be floating-point typed and be composed of values in the [-1,1] range. Positive values of the saliency heatmaps indicate regions which increase object detection scores, while negative values indicate regions which decrease object detection scores according to the model that generated input object detections.

Parameters:

ref_dets – Detections, objectness and class scores on a reference image as a float-typed array with shape [nDets x (4+1+nClasses)].
perturbed_dets – Object detections, objectness and class scores for perturbed variations of the reference image. We expect this to be a float-types array with shape [nMasks x nProps x (4+1+nClasses)].
perturb_masks – Perturbation masks numpy.ndarray over the reference image. This should be parallel in association to the detection propositions input into the perturbed_dets parameter. This should have a shape [nMasks x H x W], and values in range [0, 1], where a value closer to 1 indicate areas of the image that are unperturbed.

Returns:

A visual saliency heatmap matrix describing each input reference detection. These will be float-typed arrays with shape [nDets x H x W].

get_config() → dict

Return a JSON-compliant dictionary that could be passed to this class’s from_config method to produce an instance with identical configuration.

In the most cases, this involves naming the keys of the dictionary based on the initialization argument names as if it were to be passed to the constructor via dictionary expansion. In some cases, where it doesn’t make sense to store some object constructor parameters are expected to be supplied at as configuration values (i.e. must be supplied at runtime), this method’s returned dictionary may leave those parameters out. In such cases, the object’s from_config class-method would also take additional positional arguments to fill in for the parameters that this returned configuration lacks.

Returns:: JSON type compliant configuration dictionary.
Return type:: dict

iou(box_a: ndarray, box_b: ndarray) → ndarray

Compute the intersection over union (IoU) of two sets of boxes.

E.g.:

A ∩ B / A ∪ B = A ∩ B / (area(A) + area(B) - A ∩ B)

Parameters:

box_a – (np.array) bounding boxes, Shape: [A,4]
box_b – (np.array) bounding boxes, Shape: [B,4]

Returns:

iou(np.array), Shape: [A,B].

Class: OcclusionScoring

class xaitk_saliency.impls.gen_classifier_conf_sal.occlusion_scoring.OcclusionScoring

This saliency implementation transforms black-box image classification scores into saliency heatmaps. This should require a sequence of per-class confidences predicted on the reference image, a number of per-class confidences as predicted on perturbed images, as well as the masks of the reference image perturbations (as would be output from a PerturbImage implementation).

The perturbation masks used by the following implementation are expected to be of type integer. Masks containing values of type float are rounded to the nearest value and binarized with value 1 replacing values greater than or equal to half of the maximum value in mask after rounding while 0 replaces the rest.

generate(image_conf: ndarray, perturbed_conf: ndarray, perturbed_masks: ndarray) → ndarray

Generate a visual saliency heatmap matrix given the black-box classifier output on a reference image, the same classifier output on perturbed images and the masks of the visual perturbations.

Perturbation mask input into the perturbed_masks parameter here is equivalent to the perturbation mask output from a xaitk_saliency.interfaces.perturb_image.PerturbImage.perturb() method implementation. These should have the shape [nMasks x H x W], and values in range [0, 1], where a value closer to 1 indicate areas of the image that are unperturbed. Note the type of values in masks can be either integer, floating point or boolean within the above range definition. Implementations are responsible for handling these expected variations.

Generated saliency heatmap matrices should be floating-point typed and be composed of values in the [-1,1] range. Positive values of the saliency heatmaps indicate regions which increase class confidence scores, while negative values indicate regions which decrease class confidence scores according to the model that generated input confidence values.

Parameters:

image_conf – Reference image predicted class-confidence vector, as a numpy.ndarray, for all classes that require saliency map generation. This should have a shape [nClasses], be float-typed and with values in the [0,1] range.
perturbed_conf – Perturbed image predicted class confidence matrix. Classes represented in this matrix should be congruent to classes represented in the image_conf vector. This should have a shape [nMasks x nClasses], be float-typed and with values in the [0,1] range.
perturbed_masks – Perturbation masks numpy.ndarray over the reference image. This should be parallel in association to the classification results input into the perturbed_conf parameter. This should have a shape [nMasks x H x W], and values in range [0, 1], where a value closer to 1 indicate areas of the image that are unperturbed.

Returns:

Generated visual saliency heatmap for each input class as a float-type numpy.ndarray of shape [nClasses x H x W].

get_config() → dict

Return a JSON-compliant dictionary that could be passed to this class’s from_config method to produce an instance with identical configuration.

In the most cases, this involves naming the keys of the dictionary based on the initialization argument names as if it were to be passed to the constructor via dictionary expansion. In some cases, where it doesn’t make sense to store some object constructor parameters are expected to be supplied at as configuration values (i.e. must be supplied at runtime), this method’s returned dictionary may leave those parameters out. In such cases, the object’s from_config class-method would also take additional positional arguments to fill in for the parameters that this returned configuration lacks.

Returns:: JSON type compliant configuration dictionary.
Return type:: dict

Class: RISEScoring

class xaitk_saliency.impls.gen_classifier_conf_sal.rise_scoring.RISEScoring(p1: float = 0.0)

Saliency map generation based on the original RISE implementation. This version utilizes only the input perturbed image confidence predictions and does not utilize reference image confidences. This implementation also takes influence from debiased RISE and may take an optional debias probability, p1 (0 by default). In the original paper this is paired with the same probability used in RISE perturbation mask generation (see the p1 parameter in xaitk_saliency.impls.perturb_image.rise.RISEGrid).

Based on Hatakeyama et. al: https://openaccess.thecvf.com/content/ACCV2020/papers/Hatakeyama_Visualizing_Color-wise_Saliency_of_Black-Box_Image_Classification_Models_ACCV_2020_paper.pdf

Generate RISE-based saliency maps with optional p1 de-biasing.

Parameters:: p1 – De-biasing parameter based on the masking probability. This should be a float value in the [0, 1] range.
Raises:: ValueError – Input p1 was not in the [0,1] range.

generate(image_conf: ndarray, perturbed_conf: ndarray, perturbed_masks: ndarray) → ndarray

Generate a visual saliency heatmap matrix given the black-box classifier output on a reference image, the same classifier output on perturbed images and the masks of the visual perturbations.

Perturbation mask input into the perturbed_masks parameter here is equivalent to the perturbation mask output from a xaitk_saliency.interfaces.perturb_image.PerturbImage.perturb() method implementation. These should have the shape [nMasks x H x W], and values in range [0, 1], where a value closer to 1 indicate areas of the image that are unperturbed. Note the type of values in masks can be either integer, floating point or boolean within the above range definition. Implementations are responsible for handling these expected variations.

Generated saliency heatmap matrices should be floating-point typed and be composed of values in the [-1,1] range. Positive values of the saliency heatmaps indicate regions which increase class confidence scores, while negative values indicate regions which decrease class confidence scores according to the model that generated input confidence values.

Parameters:

image_conf – Reference image predicted class-confidence vector, as a numpy.ndarray, for all classes that require saliency map generation. This should have a shape [nClasses], be float-typed and with values in the [0,1] range.
perturbed_conf – Perturbed image predicted class confidence matrix. Classes represented in this matrix should be congruent to classes represented in the image_conf vector. This should have a shape [nMasks x nClasses], be float-typed and with values in the [0,1] range.
perturbed_masks – Perturbation masks numpy.ndarray over the reference image. This should be parallel in association to the classification results input into the perturbed_conf parameter. This should have a shape [nMasks x H x W], and values in range [0, 1], where a value closer to 1 indicate areas of the image that are unperturbed.

Returns:

Generated visual saliency heatmap for each input class as a float-type numpy.ndarray of shape [nClasses x H x W].

get_config() → Dict[str, Any]

Return a JSON-compliant dictionary that could be passed to this class’s from_config method to produce an instance with identical configuration.

In the most cases, this involves naming the keys of the dictionary based on the initialization argument names as if it were to be passed to the constructor via dictionary expansion. In some cases, where it doesn’t make sense to store some object constructor parameters are expected to be supplied at as configuration values (i.e. must be supplied at runtime), this method’s returned dictionary may leave those parameters out. In such cases, the object’s from_config class-method would also take additional positional arguments to fill in for the parameters that this returned configuration lacks.

Returns:: JSON type compliant configuration dictionary.
Return type:: dict

Class: SimilarityScoring

class xaitk_saliency.impls.gen_descriptor_sim_sal.similarity_scoring.SimilarityScoring(proximity_metric: str = 'euclidean')

This saliency implementation transforms proximity in feature space into saliency heatmaps. This should require feature vectors for the reference image, for each query image, and for perturbed versions of the reference image, as well as the masks of the reference image perturbations (as would be output from a PerturbImage implementation).

The resulting saliency maps are relative to the reference image. As such, each map denotes regions in the reference image that make it more or less similar to the corresponding query image.

Parameters:

proximity_metric –

The type of comparison metric used to determine proximity in feature space. The type of comparison metric supported is restricted by scipy’s cdist() function. The following metrics are supported in scipy.

‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘cityblock’, ‘correlation’, ‘cosine’, ‘dice’, ‘euclidean’, ‘hamming’, ‘jaccard’, ‘jensenshannon’, ‘kulsinski’, ‘mahalanobis’, ‘matching’, ‘minkowski’, ‘rogerstanimoto’, ‘russellrao’, ‘seuclidean’, ‘sokalmichener’, ‘sokalsneath’, ‘sqeuclidean’, ‘wminkowski’, ‘yule’.

generate(ref_descr: ndarray, query_descrs: ndarray, perturbed_descrs: ndarray, perturbed_masks: ndarray) → ndarray

Generate a matrix of visual saliency heatmaps given the black-box descriptor generation output on a reference image, several query images, perturbed versions of the reference image and the masks of the visual perturbations.

Perturbation mask input into the perturbed_masks parameter here is equivalent to the perturbation mask output from a xaitk_saliency.interfaces.perturb_image.PerturbImage.perturb() method implementation. We expect perturbations to be relative to the reference image. These should have the shape [nMasks x H x W], and values in range [0, 1], where a value closer to 1 indicates areas of the image that are unperturbed. Note the type of values in masks can be either integer, floating point or boolean within the above range definition. Implementations are responsible for handling these expected variations.

Generated saliency heatmap matrices should be floating-point typed and be composed of values in the [-1,1] range. Positive values of the saliency heatmaps indicate regions which increase image similarity scores, while negative values indicate regions which decrease image similarity scores according to the model that generated input feature vectors.

Parameters:

ref_descr – Reference image float feature vector, shape [nFeats]
query_descrs – Query image float feature vectors, shape [nQueryImgs x nFeats].
perturbed_descrs – Feature vectors of reference image perturbations, float typed of shape [nMasks x nFeats].
perturbed_masks – Perturbation masks numpy.ndarray over the query image. This should be parallel in association to the perturbed_descrs parameter. This should have a shape [nMasks x H x W], and values in range [0, 1], where a value closer to 1 indicates areas of the image that are unperturbed.

Returns:

Generated saliency heatmaps as a float-typed numpy.ndarray with shape [nQueryImgs x H x W].

get_config() → dict

Return a JSON-compliant dictionary that could be passed to this class’s from_config method to produce an instance with identical configuration.

In the most cases, this involves naming the keys of the dictionary based on the initialization argument names as if it were to be passed to the constructor via dictionary expansion. In some cases, where it doesn’t make sense to store some object constructor parameters are expected to be supplied at as configuration values (i.e. must be supplied at runtime), this method’s returned dictionary may leave those parameters out. In such cases, the object’s from_config class-method would also take additional positional arguments to fill in for the parameters that this returned configuration lacks.

Returns:: JSON type compliant configuration dictionary.
Return type:: dict

Class: SquaredDifferenceScoring

class xaitk_saliency.impls.gen_classifier_conf_sal.squared_difference_scoring.SquaredDifferenceScoring

This saliency implementation transforms black-box confidence predictions from a classification-style network into saliency heatmaps. This should require a sequence of classification scores predicted on the reference image, a number of classification scores predicted on perturbed images, as well as the masks of the reference image perturbations (as would be output from a PerturbImage implementation).

This implementation uses the squared difference of the reference scores and the perturbed scores to compute the saliency maps. This gives an indication of general saliency without distinguishing between positive and negative. The resulting maps are normalized between the range [0,1].

Based on Greydanus et. al: https://arxiv.org/abs/1711.00138

generate(reference: ndarray, perturbed: ndarray, perturbed_masks: ndarray) → ndarray

Generate a visual saliency heatmap matrix given the black-box classifier output on a reference image, the same classifier output on perturbed images and the masks of the visual perturbations.

Perturbation mask input into the perturbed_masks parameter here is equivalent to the perturbation mask output from a xaitk_saliency.interfaces.perturb_image.PerturbImage.perturb() method implementation. These should have the shape [nMasks x H x W], and values in range [0, 1], where a value closer to 1 indicate areas of the image that are unperturbed. Note the type of values in masks can be either integer, floating point or boolean within the above range definition. Implementations are responsible for handling these expected variations.

Generated saliency heatmap matrices should be floating-point typed and be composed of values in the [-1,1] range. Positive values of the saliency heatmaps indicate regions which increase class confidence scores, while negative values indicate regions which decrease class confidence scores according to the model that generated input confidence values.

Parameters:

image_conf – Reference image predicted class-confidence vector, as a numpy.ndarray, for all classes that require saliency map generation. This should have a shape [nClasses], be float-typed and with values in the [0,1] range.
perturbed_conf – Perturbed image predicted class confidence matrix. Classes represented in this matrix should be congruent to classes represented in the image_conf vector. This should have a shape [nMasks x nClasses], be float-typed and with values in the [0,1] range.
perturbed_masks – Perturbation masks numpy.ndarray over the reference image. This should be parallel in association to the classification results input into the perturbed_conf parameter. This should have a shape [nMasks x H x W], and values in range [0, 1], where a value closer to 1 indicate areas of the image that are unperturbed.

Returns:

Generated visual saliency heatmap for each input class as a float-type numpy.ndarray of shape [nClasses x H x W].

get_config() → dict

Return a JSON-compliant dictionary that could be passed to this class’s from_config method to produce an instance with identical configuration.

In the most cases, this involves naming the keys of the dictionary based on the initialization argument names as if it were to be passed to the constructor via dictionary expansion. In some cases, where it doesn’t make sense to store some object constructor parameters are expected to be supplied at as configuration values (i.e. must be supplied at runtime), this method’s returned dictionary may leave those parameters out. In such cases, the object’s from_config class-method would also take additional positional arguments to fill in for the parameters that this returned configuration lacks.

Returns:: JSON type compliant configuration dictionary.
Return type:: dict

End-to-End Saliency Generation

Image Classification

Class: PerturbationOcclusion

class xaitk_saliency.impls.gen_image_classifier_blackbox_sal.occlusion_based.PerturbationOcclusion(perturber: PerturbImage, generator: GenerateClassifierConfidenceSaliency, threads: int = 0)

Generator composed of modular perturbation and occlusion-based algorithms.

This implementation exposes a public attribute fill. This may be set to a scalar or sequence value to indicate a color that should be used for filling occluded areas as determined by the given PerturbImage implementation. This is a parameter to be set during runtime as this is most often driven by the black-box algorithm used, if at all.

Parameters:

perturber – PerturbImage implementation instance for generating masks that will dictate occlusion.
generator – Implementation instance for generating saliency masks given occlusion masks and classifier outputs.
threads – Optional number threads to use to enable parallelism in applying perturbation masks to an input image. If 0, a negative value, or None, work will be performed on the main-thread in-line.

classmethod from_config(config_dict: Dict, merge_default: bool = True) → C

Instantiate a new instance of this class given the configuration JSON-compliant dictionary encapsulating initialization arguments.

This base method is adequate without modification when a class’s constructor argument types are JSON-compliant. If one or more are not, however, this method then needs to be overridden in order to convert from a JSON-compliant stand-in into the more complex object the constructor requires. It is recommended that when complex types are used they also inherit from the Configurable in order to hopefully make easier the conversion to and from JSON-compliant stand-ins.

When this method does need to be overridden, this usually looks like the following pattern:

D = TypeVar("D", bound="MyClass")

class MyClass (Configurable):

    @classmethod
    def from_config(
        cls: Type[D],
        config_dict: Dict,
        merge_default: bool = True
    ) -> D:
        # Perform a shallow copy of the input ``config_dict`` which
        # is important to maintain idempotency.
        config_dict = dict(config_dict)

        # Optionally guarantee default values are present in the
        # configuration dictionary.  This is useful when the
        # configuration dictionary input is partial and the logic
        # contained here wants to use config parameters that may
        # have defaults defined in the constructor.
        if merge_default:
            config_dict = merge_dict(cls.get_default_config(),
                                     config_dict)

        #
        # Perform any overriding of `config_dict` values here.
        #

        # Create and return an instance using the super method.
        return super().from_config(config_dict,
                                   merge_default=merge_default)

Note on type annotations: When defining a sub-class of configurable and override this class method, we will need to defined a new TypeVar that is bound at the new class type. This is because super requires a type to be given that descends from the implementing type. If C is used as defined in this interface module, which is upper-bounded on the base Configurable class, the type analysis will see that we are attempting to invoke super with a type that may not strictly descend from the implementing type (MyClass in the example above), and cause an error during type analysis.

Parameters:

config_dict (dict) – JSON compliant dictionary encapsulating a configuration.
merge_default (bool) – Merge the given configuration on top of the default provided by get_default_config.

Returns:

Constructed instance from the provided config.

get_config() → Dict[str, Any]

Return a JSON-compliant dictionary that could be passed to this class’s from_config method to produce an instance with identical configuration.

In the most cases, this involves naming the keys of the dictionary based on the initialization argument names as if it were to be passed to the constructor via dictionary expansion. In some cases, where it doesn’t make sense to store some object constructor parameters are expected to be supplied at as configuration values (i.e. must be supplied at runtime), this method’s returned dictionary may leave those parameters out. In such cases, the object’s from_config class-method would also take additional positional arguments to fill in for the parameters that this returned configuration lacks.

Returns:: JSON type compliant configuration dictionary.
Return type:: dict

classmethod get_default_config() → Dict[str, Any]

Generate and return a default configuration dictionary for this class. This will be primarily used for generating what the configuration dictionary would look like for this class without instantiating it.

By default, we observe what this class’s constructor takes as arguments, turning those argument names into configuration dictionary keys. If any of those arguments have defaults, we will add those values into the configuration dictionary appropriately. The dictionary returned should only contain JSON compliant value types.

It is not be guaranteed that the configuration dictionary returned from this method is valid for construction of an instance of this class.

Returns:: Default configuration dictionary for the class.
Return type:: dict

>>> # noinspection PyUnresolvedReferences
>>> class SimpleConfig(Configurable):
...     def __init__(self, a=1, b='foo'):
...         self.a = a
...         self.b = b
...     def get_config(self):
...         return {'a': self.a, 'b': self.b}
>>> self = SimpleConfig()
>>> config = self.get_default_config()
>>> assert config == {'a': 1, 'b': 'foo'}

Class: RISEStack

class xaitk_saliency.impls.gen_image_classifier_blackbox_sal.rise.RISEStack(n: int, s: int, p1: float, seed: int | None = None, threads: int = 0, debiased: bool = True)

Encapsulation of the perturbation-occlusion method using specifically the RISE implementations of the component algorithms.

This more specifically encapsulates the original RISE method as presented in their paper and code. See references in the RISEGrid and RISEScoring documentation.

This implementation shares the p1 probability with the internal RISEScoring instance use, effectively causing this implementation to utilize debiased RISE.

Parameters:

n – Number of random masks used in the algorithm. E.g. 1000.
s – Spatial resolution of the small masking grid. E.g. 8. Assumes square grid.
p1 – Probability of the grid cell being set to 1 (otherwise 0). This should be a float value in the [0, 1] range. E.g. 0.5.
seed – A seed to pass into the constructed random number generator to allow for reproducibility
threads – The number of threads to utilize when generating masks. If this is <=0 or None, no threading is used and processing is performed in-line serially.
debiased – If we should pass the provided debiasing parameter to the RISE saliency map generation algorithm. See the RISEScoring() documentation for more details on debiasing.

get_config() → Dict[str, Any]

Return a JSON-compliant dictionary that could be passed to this class’s from_config method to produce an instance with identical configuration.

In the most cases, this involves naming the keys of the dictionary based on the initialization argument names as if it were to be passed to the constructor via dictionary expansion. In some cases, where it doesn’t make sense to store some object constructor parameters are expected to be supplied at as configuration values (i.e. must be supplied at runtime), this method’s returned dictionary may leave those parameters out. In such cases, the object’s from_config class-method would also take additional positional arguments to fill in for the parameters that this returned configuration lacks.

Returns:: JSON type compliant configuration dictionary.
Return type:: dict

Class: SlidingWindowStack

class xaitk_saliency.impls.gen_image_classifier_blackbox_sal.slidingwindow.SlidingWindowStack(window_size: Tuple[int, int] = (50, 50), stride: Tuple[int, int] = (20, 20), threads: int = 0)

Encapsulation of the perturbation-occlusion method using specifically sliding windows and the occlusion-scoring method. See the SlidingWindow and OcclusionScoring documentation for more details.

Parameters:

window_size – The block window size as a tuple with format (height, width).
stride – The sliding window striding step as a tuple with format (height_step, width_step).
threads – Optional number threads to use to enable parallelism in applying perturbation masks to an input image. If 0, a negative value, or None, work will be performed on the main-thread in-line.

get_config() → Dict[str, Any]

Return a JSON-compliant dictionary that could be passed to this class’s from_config method to produce an instance with identical configuration.

In the most cases, this involves naming the keys of the dictionary based on the initialization argument names as if it were to be passed to the constructor via dictionary expansion. In some cases, where it doesn’t make sense to store some object constructor parameters are expected to be supplied at as configuration values (i.e. must be supplied at runtime), this method’s returned dictionary may leave those parameters out. In such cases, the object’s from_config class-method would also take additional positional arguments to fill in for the parameters that this returned configuration lacks.

Returns:: JSON type compliant configuration dictionary.
Return type:: dict

classmethod get_default_config() → Dict[str, Any]

Generate and return a default configuration dictionary for this class. This will be primarily used for generating what the configuration dictionary would look like for this class without instantiating it.

By default, we observe what this class’s constructor takes as arguments, turning those argument names into configuration dictionary keys. If any of those arguments have defaults, we will add those values into the configuration dictionary appropriately. The dictionary returned should only contain JSON compliant value types.

It is not be guaranteed that the configuration dictionary returned from this method is valid for construction of an instance of this class.

Returns:: Default configuration dictionary for the class.
Return type:: dict

>>> # noinspection PyUnresolvedReferences
>>> class SimpleConfig(Configurable):
...     def __init__(self, a=1, b='foo'):
...         self.a = a
...         self.b = b
...     def get_config(self):
...         return {'a': self.a, 'b': self.b}
>>> self = SimpleConfig()
>>> config = self.get_default_config()
>>> assert config == {'a': 1, 'b': 'foo'}

Image Similarity

Class: PerturbationOcclusion

class xaitk_saliency.impls.gen_image_similarity_blackbox_sal.occlusion_based.PerturbationOcclusion(perturber: PerturbImage, generator: GenerateDescriptorSimilaritySaliency, fill: int | Sequence[int] | ndarray | None = None, threads: int | None = None)

Image similarity saliency generator composed of modular perturbation and occlusion-based algorithms.

This implementation exposes its fill attribute as public. This allows it to be set during runtime as this is most often driven by the black-box algorithm used, if at all.

Parameters:

perturber – PerturbImage implementation instance for generating occlusion masks.
generator – GenerateDescriptorSimilaritySaliency implementation instance for generating saliency masks given occlusion masks and image feature vector generator outputs.
fill – Optional fill for alpha-blending the occluded regions based on the masks generated by the given perturber. Can be a scalar value, a per-channel sequence or a shape-matched image.
threads – Optional number threads to use to enable parallelism in applying perturbation masks to an input image. If 0, a negative value, or None, work will be performed on the main-thread in-line.

classmethod from_config(config_dict: Dict, merge_default: bool = True) → C

Instantiate a new instance of this class given the configuration JSON-compliant dictionary encapsulating initialization arguments.

This base method is adequate without modification when a class’s constructor argument types are JSON-compliant. If one or more are not, however, this method then needs to be overridden in order to convert from a JSON-compliant stand-in into the more complex object the constructor requires. It is recommended that when complex types are used they also inherit from the Configurable in order to hopefully make easier the conversion to and from JSON-compliant stand-ins.

When this method does need to be overridden, this usually looks like the following pattern:

D = TypeVar("D", bound="MyClass")

class MyClass (Configurable):

    @classmethod
    def from_config(
        cls: Type[D],
        config_dict: Dict,
        merge_default: bool = True
    ) -> D:
        # Perform a shallow copy of the input ``config_dict`` which
        # is important to maintain idempotency.
        config_dict = dict(config_dict)

        # Optionally guarantee default values are present in the
        # configuration dictionary.  This is useful when the
        # configuration dictionary input is partial and the logic
        # contained here wants to use config parameters that may
        # have defaults defined in the constructor.
        if merge_default:
            config_dict = merge_dict(cls.get_default_config(),
                                     config_dict)

        #
        # Perform any overriding of `config_dict` values here.
        #

        # Create and return an instance using the super method.
        return super().from_config(config_dict,
                                   merge_default=merge_default)

Note on type annotations: When defining a sub-class of configurable and override this class method, we will need to defined a new TypeVar that is bound at the new class type. This is because super requires a type to be given that descends from the implementing type. If C is used as defined in this interface module, which is upper-bounded on the base Configurable class, the type analysis will see that we are attempting to invoke super with a type that may not strictly descend from the implementing type (MyClass in the example above), and cause an error during type analysis.

Parameters:

config_dict (dict) – JSON compliant dictionary encapsulating a configuration.
merge_default (bool) – Merge the given configuration on top of the default provided by get_default_config.

Returns:

Constructed instance from the provided config.

get_config() → Dict[str, Any]

Return a JSON-compliant dictionary that could be passed to this class’s from_config method to produce an instance with identical configuration.

In the most cases, this involves naming the keys of the dictionary based on the initialization argument names as if it were to be passed to the constructor via dictionary expansion. In some cases, where it doesn’t make sense to store some object constructor parameters are expected to be supplied at as configuration values (i.e. must be supplied at runtime), this method’s returned dictionary may leave those parameters out. In such cases, the object’s from_config class-method would also take additional positional arguments to fill in for the parameters that this returned configuration lacks.

Returns:: JSON type compliant configuration dictionary.
Return type:: dict

classmethod get_default_config() → Dict[str, Any]

Generate and return a default configuration dictionary for this class. This will be primarily used for generating what the configuration dictionary would look like for this class without instantiating it.

By default, we observe what this class’s constructor takes as arguments, turning those argument names into configuration dictionary keys. If any of those arguments have defaults, we will add those values into the configuration dictionary appropriately. The dictionary returned should only contain JSON compliant value types.

It is not be guaranteed that the configuration dictionary returned from this method is valid for construction of an instance of this class.

Returns:: Default configuration dictionary for the class.
Return type:: dict

>>> # noinspection PyUnresolvedReferences
>>> class SimpleConfig(Configurable):
...     def __init__(self, a=1, b='foo'):
...         self.a = a
...         self.b = b
...     def get_config(self):
...         return {'a': self.a, 'b': self.b}
>>> self = SimpleConfig()
>>> config = self.get_default_config()
>>> assert config == {'a': 1, 'b': 'foo'}

Class: SBSMStack

class xaitk_saliency.impls.gen_image_similarity_blackbox_sal.sbsm.SBSMStack(window_size: Tuple[int, int] = (50, 50), stride: Tuple[int, int] = (20, 20), proximity_metric: str = 'euclidean', fill: int | Sequence[int] | ndarray | None = None, threads: int | None = None)

Encapsulation of the perturbation-occlusion method using specifically the sliding window image perturbation and similarity scoring algorithms to generate similarity-based visual saliency maps. See the documentation of SlidingWindow and SimilarityScoring for details.

Parameters:

window_size – The block window size as a tuple with format (height, width).
stride – The sliding window striding step as a tuple with format (height_step, width_step).
proximity_metric –
The type of comparison metric used to determine proximity in feature space. The type of comparison metric supported is restricted by scipy’s cdist() function. The following metrics are supported in scipy.

‘braycurtis’, ‘canberra’, ‘chebyshev’, ‘cityblock’, ‘correlation’, ‘cosine’, ‘dice’, ‘euclidean’, ‘hamming’, ‘jaccard’, ‘jensenshannon’, ‘kulsinski’, ‘mahalanobis’, ‘matching’, ‘minkowski’, ‘rogerstanimoto’, ‘russellrao’, ‘seuclidean’, ‘sokalmichener’, ‘sokalsneath’, ‘sqeuclidean’, ‘wminkowski’, ‘yule’.
threads – Optional number threads to use to enable parallelism in applying perturbation masks to an input image. If 0, a negative value, or None, work will be performed on the main-thread in-line.

get_config() → Dict[str, Any]

Return a JSON-compliant dictionary that could be passed to this class’s from_config method to produce an instance with identical configuration.

In the most cases, this involves naming the keys of the dictionary based on the initialization argument names as if it were to be passed to the constructor via dictionary expansion. In some cases, where it doesn’t make sense to store some object constructor parameters are expected to be supplied at as configuration values (i.e. must be supplied at runtime), this method’s returned dictionary may leave those parameters out. In such cases, the object’s from_config class-method would also take additional positional arguments to fill in for the parameters that this returned configuration lacks.

Returns:: JSON type compliant configuration dictionary.
Return type:: dict

classmethod get_default_config() → Dict[str, Any]

Generate and return a default configuration dictionary for this class. This will be primarily used for generating what the configuration dictionary would look like for this class without instantiating it.

By default, we observe what this class’s constructor takes as arguments, turning those argument names into configuration dictionary keys. If any of those arguments have defaults, we will add those values into the configuration dictionary appropriately. The dictionary returned should only contain JSON compliant value types.

It is not be guaranteed that the configuration dictionary returned from this method is valid for construction of an instance of this class.

Returns:: Default configuration dictionary for the class.
Return type:: dict

>>> # noinspection PyUnresolvedReferences
>>> class SimpleConfig(Configurable):
...     def __init__(self, a=1, b='foo'):
...         self.a = a
...         self.b = b
...     def get_config(self):
...         return {'a': self.a, 'b': self.b}
>>> self = SimpleConfig()
>>> config = self.get_default_config()
>>> assert config == {'a': 1, 'b': 'foo'}

Object Detection

Class: PerturbationOcclusion

class xaitk_saliency.impls.gen_object_detector_blackbox_sal.occlusion_based.PerturbationOcclusion(perturber: PerturbImage, generator: GenerateDetectorProposalSaliency, fill: int | Sequence[int] | ndarray | None = None, threads: int | None = 0)

Generator composed of modular perturbation and occlusion-based algorithms.

This implementation exposes its fill attribute as public. This allows it to be set during runtime as this is most often driven by the black-box algorithm used, if at all.

Parameters:

perturber – PerturbImage implementation instance for generating occlusion masks.
generator – GenerateDetectorProposalSaliency implementation instance for generating saliency masks given occlusion masks and black-box detector outputs.
fill – Optional fill for alpha-blending the occluded regions based on the masks generated by the given perturber. Can be a scalar value, a per-channel sequence or a shape-matched image.
threads – Optional number threads to use to enable parallelism in applying perturbation masks to an input image. If 0, a negative value, or None, work will be performed on the main-thread in-line.

Class: DRISEStack

Encapsulation of the perturbation-occlusion method using the RISE image perturbation and DRISE scoring algorithms to generate visual saliency maps for object detections. See references in the RISEGrid and DRISEScoring documentation.

Parameters:

n – Number of random masks used in the algorithm.
s – Spatial resolution of the small masking grid. Assumes square grid.
p1 – Probability of the grid cell being set to 1 (not occluded). This should be a float value in the [0, 1] range.
seed – A seed to pass to the constructed random number generator to allow for reproducibility.
fill – Optional fill for alpha-blending the occluded regions based on the masks generated by the RISEGrid perturber. Can be a scalar value, a per-channel sequence or a shape-matched image.
threads – The number of threads to utilize when generating masks. If this is <=0 or None, no threading is used and processing is performed in-line serially.