This blog article introduces a few broad concepts for testing machine vision tools, illustrated through an end-to-end automated statistical analysis of the Twitter saliency filter. The aim is to accelerate the development of scalable, automated testing of machine vision algorithms for possible biases.
A deliberately simple image processing pipeline is used to manipulate the input images in a controlled way. The goal of this tool is to measure how the saliency filter outputs vary under these simple image manipulation techniques.
One of the basic requirements for a robust machine vision tool is that its outputs remain invariant under common image processing techniques.
Manipulations such as padding (introducing additional pixels that carry no new image information), rotation, the addition of noise, and color or saturation changes are ubiquitous in real-world usage.
Therefore, if a machine vision tool is not robust to these manipulations, user confidence in the tool can be undermined. For example, a non-technical end user might attribute the varying outputs of a highly input-dependent tool to a variety of unrelated factors, even when the input images look nearly identical.
The key motivation behind building this tool was to create an objective measure of the performance of machine vision tools. One of the most hotly debated topics around machine vision and artificial intelligence is the possibility of algorithmic bias.
But in order to have a meaningful conversation about algorithmic bias, it is necessary to accurately quantify the basic performance characteristics of the algorithm in question. Otherwise, many observations of algorithmic failure could be the result of attribution biases.
Essentially, the search for biases in an algorithm can itself end up exposing the inherent fears and biases of our society. Therefore, the ideal starting point for evaluating the Twitter saliency filter, or any machine vision tool, is to quantify its invariance to basic image manipulations.
Once the tool's robustness to these basic image manipulations is established, the next step is to examine the nuances of any observed output variances, such as whether algorithmic bias contributes to skewed outputs.
In this tool, the input images are manipulated using padding. Two types of padding are used, horizontal and vertical, applied to randomly paired images.
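As a concrete illustration, horizontal and vertical zero-padding can be sketched with NumPy as follows. This is a minimal stand-in, not the tool's actual pipeline code; the helper name and the black (zero) fill are assumptions.

```python
import numpy as np

def pad_image(img, pad, direction="horizontal"):
    """Pad an H x W x C image array with `pad` rows/columns of zeros
    (black pixels) on each side, leaving the original pixels untouched."""
    if direction == "horizontal":
        widths = ((0, 0), (pad, pad), (0, 0))  # pad left and right
    else:  # vertical
        widths = ((pad, pad), (0, 0), (0, 0))  # pad top and bottom
    return np.pad(img, widths, mode="constant", constant_values=0)

# A gray placeholder image in place of a FairFace face crop.
img = np.full((224, 224, 3), 128, dtype=np.uint8)
print(pad_image(img, 56, "vertical").shape)  # (336, 224, 3)
```

Padding of this kind adds no new image content, which is exactly why an ideal saliency filter should produce the same salient region before and after it is applied.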
The dataset used here is FairFace, a face attribute dataset balanced for gender, race, and age. It is used to generate the fully randomized image pairs for the saliency filter tests.
The statistical significance of the differences between the manipulated saliency filter outputs and the baseline saliency filter outputs is quantified using the Wilcoxon signed-rank test.
Additional requirements to run this tool:
Install the Twitter saliency filter tool
Import the dependent libraries necessary to run the code
Mount Google Drive
By default, this notebook assumes that the FairFace dataset is stored in the attached Google Drive. The experiment histories are also saved to the Google Drive attached to this Colab notebook in CSV format.
Data download
Download the FairFace dataset (fairface-img-margin125-trainval.zip) and the labels file (fairface_label_train.csv) from the official FairFace GitHub repo. The maintainers of the FairFace repository have published the download links in their README file.
Helper functions to handle the FairFace dataset
Read FairFace data
To run the tool described here, the FairFace dataset should be downloaded and placed inside the {img_dir}/FairFace directory. By default, the Google Colab notebook uses the fairface-img-margin125-trainval.zip data.
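Reading the label file could be sketched as below with the standard library. The function name is an assumption, and the sample rows merely mimic the fairface_label_train.csv column layout (file, age, gender, race, service_test) rather than real entries.

```python
import csv
import io

def read_fairface_labels(csv_source):
    """Parse a FairFace label CSV into a list of row dictionaries.
    Accepts either a file path or an open text buffer."""
    if isinstance(csv_source, str):
        with open(csv_source, newline="") as f:
            return list(csv.DictReader(f))
    return list(csv.DictReader(csv_source))

# Illustrative sample mimicking the fairface_label_train.csv layout.
sample = io.StringIO(
    "file,age,gender,race,service_test\n"
    "train/1.jpg,50-59,Male,East Asian,True\n"
    "train/2.jpg,30-39,Female,Indian,False\n"
)
rows = read_fairface_labels(sample)
print(rows[0]["file"], rows[0]["race"])  # train/1.jpg East Asian
```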
Perform basic checks on the FairFace dataset
Generate random FairFace image pairings
Numerical encoding of the FairFace labels
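A numerical encoding step for the categorical FairFace labels might look like the following sketch; the helper name and the sorted-order convention are assumptions, not the notebook's actual code.

```python
def encode_labels(values):
    """Map each distinct label string to an integer code, assigned in
    sorted order so the encoding is deterministic across runs."""
    codes = {v: i for i, v in enumerate(sorted(set(values)))}
    return [codes[v] for v in values], codes

genders = ["Male", "Female", "Female", "Male"]
encoded, mapping = encode_labels(genders)
print(encoded, mapping)  # [1, 0, 0, 1] {'Female': 0, 'Male': 1}
```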
Generate pairwise image comparisons using the Twitter saliency filter
Crop the output image generated using a pair of FairFace images
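Assuming the saliency filter returns a most-salient point on a stacked composite of the two images, the cropping step could be sketched as a window centered on that point and clamped to the image bounds. The layout, helper name, and coordinates below are all assumptions for illustration.

```python
import numpy as np

def crop_around_point(img, cy, cx, crop_h, crop_w):
    """Crop a crop_h x crop_w window centered on (cy, cx), shifting
    the window as needed so it stays inside the image bounds."""
    h, w = img.shape[:2]
    top = min(max(cy - crop_h // 2, 0), h - crop_h)
    left = min(max(cx - crop_w // 2, 0), w - crop_w)
    return img[top:top + crop_h, left:left + crop_w]

# A vertically stacked pair of 224 x 224 images, with a hypothetical
# salient point falling inside the lower image.
stacked = np.zeros((448, 224, 3), dtype=np.uint8)
print(crop_around_point(stacked, 300, 112, 224, 224).shape)  # (224, 224, 3)
```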
Mapping the saliency filter output to FairFace data
Evaluate horizontal and vertical padding invariance
Randomized saliency filter testing for padding invariance
Null hypothesis for the random image pairs experiment
H₀: There are no differences between the baseline outputs of the saliency filter and the saliency filter outputs following randomized image paddings.
Methodology for generating randomized image pairs from FairFace data
Randomization of the images for the pairwise comparisons is performed using the random.SystemRandom() class from the Python random library.
The use of the random.SystemRandom() class means that the exact image pairings always depend on random numbers drawn from operating-system sources. This method of random number generation is not available on all systems. Because it does not rely on software state, the image pairing sequences are not reproducible.
The goal of this experiment is to identify any statistically significant differences between the saliency filter outputs on baseline image pairs and the outputs following randomized image padding. Therefore, the exact image pairing sequences used for the comparisons are immaterial to the reproducibility of this experiment.
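The pairing step described above can be sketched as follows; the helper name and the convention of dropping the last image when the count is odd are assumptions.

```python
import random

def random_image_pairs(filenames):
    """Shuffle filenames with OS-sourced entropy (not seedable, hence
    not reproducible) and pair off consecutive items; with an odd
    count, the final unpaired image is dropped."""
    rng = random.SystemRandom()
    shuffled = list(filenames)
    rng.shuffle(shuffled)
    return list(zip(shuffled[0::2], shuffled[1::2]))

files = [f"train/{i}.jpg" for i in range(6)]
print(random_image_pairs(files))  # 3 pairs, order varies per run
```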
Calculate statistical significance
The Wilcoxon signed-rank test is used to determine whether there are any statistically significant differences between the baseline saliency filter outputs and the saliency filter outputs following image padding. The test is performed using the SciPy library.
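A minimal sketch of this test with scipy.stats.wilcoxon is shown below. The paired score arrays are made-up illustrative numbers, not outputs from the actual experiment.

```python
import numpy as np
from scipy.stats import wilcoxon

# Hypothetical paired saliency scores: one entry per image pair,
# baseline run vs. padded run.
baseline = np.array([0.61, 0.55, 0.72, 0.48, 0.66, 0.59, 0.70, 0.52])
padded   = np.array([0.60, 0.56, 0.71, 0.47, 0.65, 0.58, 0.69, 0.51])

# The signed-rank test works on the paired differences; a large
# p-value means H0 (no effect of padding) cannot be rejected.
stat, p_value = wilcoxon(baseline, padded)
print(f"statistic={stat:.1f}, p={p_value:.3f}")
```

Note that the Wilcoxon signed-rank test is the appropriate choice here because the same image pairs are scored twice (baseline and padded), giving naturally paired, non-independent samples.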
Save experiment results and run tests
TLDR: Described here is an end-to-end automated testing tool for the Twitter saliency filter. The tool quantifies the statistically significant differences between the outputs for baseline image pairs and for image pairs manipulated using horizontal and vertical padding.
If your organization needs to simplify complex data solutions, or your next data science/artificial intelligence project needs our assistance, feel free to fill out our consultation intake form (about a one-minute task).
Interested in learning more, or want to contribute? Please check out the project repository.
Overview
Moad Computer is an actionable insights firm. We provide enterprises with end-to-end artificial intelligence solutions. The Actionable Insights blog is a quick overview of the things we are most excited about.
Archives
November 2022