<!doctype html><html lang=en-uk><head><script data-goatcounter=https://ruivieira-dev.goatcounter.com/count async src=//gc.zgo.at/count.js></script><script src=https://unpkg.com/@alpinejs/intersect@3.x.x/dist/cdn.min.js></script><script src=https://unpkg.com/alpinejs@3.x.x/dist/cdn.min.js></script><script type=module src=https://ruivieira.dev/js/deeplinks/deeplinks.js></script><link rel=preload href=https://ruivieira.dev/lib/fonts/fa-brands-400.woff2 as=font type=font/woff2 crossorigin=anonymous><link rel=preload href=https://ruivieira.dev/lib/fonts/fa-regular-400.woff2 as=font type=font/woff2 crossorigin=anonymous><link rel=preload href=https://ruivieira.dev/lib/fonts/fa-solid-900.woff2 as=font type=font/woff2 crossorigin=anonymous><link rel=preload href=https://ruivieira.dev/fonts/firacode/FiraCode-Regular.woff2 as=font type=font/woff2 crossorigin=anonymous><link rel=preload href=https://ruivieira.dev/fonts/vollkorn/Vollkorn-Regular.woff2 as=font type=font/woff2 crossorigin=anonymous><link rel=stylesheet href=https://ruivieira.dev/css/kbd.css type=text/css><meta charset=utf-8><meta http-equiv=X-UA-Compatible content="IE=edge"><title>Fairness in Machine Learning · Rui Vieira</title>
<link rel=canonical href=https://ruivieira.dev/fairness-in-machine-learning.html><meta name=viewport content="width=device-width,initial-scale=1"><meta name=robots content="all,follow"><meta name=googlebot content="index,follow,snippet,archive"><meta property="og:title" content="Fairness in Machine Learning"><meta property="og:description" content="Machine Learning fairness is directly related to almost all fields where Machine Learning can be applied:
Autonomous machines Job application workflow Predictive models for the justice system Online shopping recommendation systems etc. Many of the causes of ML unfairness or bias can be traced to the original training data. Some common causes include:
Skewed observations Tainted observations Limited features Sample size disparity Proxies Some algorithms discussed in these pages:
Counterfactual Fairness (also how to create counterfactually fair models in Java) Group fairness: Group fairness metrics are measures that assess the fairness of a decision-making process or outcome for different groups within a population."><meta property="og:type" content="article"><meta property="og:url" content="https://ruivieira.dev/fairness-in-machine-learning.html"><meta property="article:section" content="posts"><meta property="article:modified_time" content="2023-10-01T20:46:29+01:00"><meta name=twitter:card content="summary"><meta name=twitter:title content="Fairness in Machine Learning"><meta name=twitter:description content="Machine Learning fairness is directly related to almost all fields where Machine Learning can be applied:
Autonomous machines Job application workflow Predictive models for the justice system Online shopping recommendation systems etc. Many of the causes of ML unfairness or bias can be traced to the original training data. Some common causes include:
Skewed observations Tainted observations Limited features Sample size disparity Proxies Some algorithms discussed in these pages:
Counterfactual Fairness (also how to create counterfactually fair models in Java) Group fairness: Group fairness metrics are measures that assess the fairness of a decision-making process or outcome for different groups within a population."><link rel=stylesheet href=https://ruivieira.dev/css/styles.css><!--[if lt IE 9]><script src=https://oss.maxcdn.com/html5shiv/3.7.2/html5shiv.min.js></script><script src=https://oss.maxcdn.com/respond/1.4.2/respond.min.js></script><![endif]--><link rel=icon type=image/png href=https://ruivieira.dev/images/favicon.ico></head><body class="max-width mx-auto px3 ltr" x-data="{currentHeading: undefined}"><div class="content index py4"><div id=header-post><a id=menu-icon href=#><i class="fas fa-eye fa-lg"></i></a>
<a id=menu-icon-tablet href=#><i class="fas fa-eye fa-lg"></i></a>
<a id=top-icon-tablet href=# onclick='$("html, body").animate({scrollTop:0},"fast")' style=display:none aria-label="Top of Page"><i class="fas fa-chevron-up fa-lg"></i></a>
<span id=menu><span id=nav><ul><li><a href=https://ruivieira.dev/>Home</a></li><li><a href=https://ruivieira.dev/blog/>Blog</a></li><li><a href=https://ruivieira.dev/draw/>Drawings</a></li><li><a href=https://ruivieira.dev/map/>All pages</a></li><li><a href=https://ruivieira.dev/search.html>Search</a></li></ul></span><br><div id=share style=display:none></div><div id=toc><h4>Contents</h4><nav id=TableOfContents><ul><li><a href=#group-fairness :class="{'toc-h2':true, 'toc-highlight': currentHeading == '#group-fairness' }">Group fairness</a></li><li><a href=#statistical-parity :class="{'toc-h3':true, 'toc-highlight': currentHeading == '#statistical-parity' }">Statistical Parity</a></li><li><a href=#group-statistical-parity :class="{'toc-h4':true, 'toc-highlight': currentHeading == '#group-statistical-parity' }">Group Statistical Parity</a></li><li><a href=#statistical-parity-difference :class="{'toc-h4':true, 'toc-highlight': currentHeading == '#statistical-parity-difference' }">Statistical parity difference</a></li><li><a href=#disparate-impact-ratio :class="{'toc-h4':true, 'toc-highlight': currentHeading == '#disparate-impact-ratio' }">Disparate Impact Ratio</a></li></ul></nav><h4>Related</h4><nav><ul><li class="header-post toc"><span class=backlink-count>1</span>
<a href>Index</a></li><li class="header-post toc"><span class=backlink-count>1</span>
<a href=https://ruivieira.dev/model-fairness.html>Model fairness</a></li><li class="header-post toc"><span class=backlink-count>1</span>
<a href=https://ruivieira.dev/machine-learning.html>Machine Learning</a></li></ul></nav></div></span></div><article class=post itemscope itemtype=http://schema.org/BlogPosting><header><h1 class=posttitle itemprop="name headline">Fairness in Machine Learning</h1><div class=meta><div class=postdate>Updated <time datetime="2023-10-01 20:46:29 +0100 BST" itemprop=datePublished>2023-10-01</time>
<span class=commit-hash>(<a href=https://ruivieira.dev/log/index.html#e23fe25>e23fe25</a>)</span></div></div></header><div class=content itemprop=articleBody><p>Machine Learning fairness is directly related to almost all fields where <a href=https://ruivieira.dev/machine-learning.html>Machine Learning</a> can be applied:</p><ul><li>Autonomous machines</li><li>Job application workflow</li><li>Predictive models for the justice system</li><li>Online shopping recommendation systems</li><li><em>etc.</em></li></ul><p>Many of the causes of ML unfairness or bias can be traced to the original training data. Some common causes include:</p><ul><li>Skewed observations</li><li>Tainted observations</li><li>Limited features</li><li>Sample size disparity</li><li>Proxies</li></ul><p>Some algorithms discussed in these pages:</p><ul><li><a href=https://ruivieira.dev/counterfactual-fairness.html>Counterfactual Fairness</a> (also how to create counterfactually fair models <a href=https://ruivieira.dev/counterfactual-fairness-in-java.html>in Java</a>)</li></ul><h2 id=group-fairness x-intersect="currentHeading = '#group-fairness'">Group fairness</h2><p>Group fairness metrics are measures that assess the fairness of a decision-making process or outcome for different groups within a population. These metrics are used to evaluate the fairness of systems or policies that have an impact on various groups, such as race, gender, age, or other characteristics. Group fairness metrics can help identify potential biases in decision-making processes and ensure that outcomes are just and equitable for all individuals.</p><p>Some common types of group fairness metrics include:</p><ul><li><a href=#statistical-parity>Statistical Parity</a>: This metric assesses whether the proportion of positive outcomes (<em>e.g.</em> being approved for a loan) is the same for all groups.</li><li>Demographic parity: This metric assesses whether the probability of a positive outcome is the same for all groups.</li><li>Equal opportunity: This metric assesses whether the probability of a positive outcome is the same for individuals from different groups who have the same qualifications or characteristics.</li><li>Equalized odds: This metric assesses whether the true positive rate and false positive rate are the same for all groups.</li><li>Predictive parity: This metric assesses whether the error rates for different groups are the same, given the same predicted probability of a positive outcome.</li></ul><p>It is important to note that group fairness metrics are not a substitute for addressing the root causes of inequality, but they can help identify and mitigate potential biases in decision-making processes.</p>
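<p>As a rough sketch of how these quantities are computed in practice (illustrative plain Python with made-up data, not tied to any particular fairness library), most of the metrics above reduce to comparing simple per-group rates:</p><pre><code class=language-python># Illustrative only: tiny made-up dataset with two groups, "a" and "b".
# y_true holds the actual outcomes, y_pred the model's decisions (1 = favourable).
y_true = [1, 0, 1, 1, 0, 1, 0, 0, 1, 0]
y_pred = [1, 0, 1, 0, 0, 1, 1, 0, 0, 0]
group  = ["a", "a", "a", "a", "a", "b", "b", "b", "b", "b"]

def rate(numerator, denominator):
    return numerator / denominator if denominator else float("nan")

def group_metrics(g):
    idx = [i for i, gi in enumerate(group) if gi == g]
    true = [y_true[i] for i in idx]
    positives = sum(y_pred[i] for i in idx)
    tp = sum(1 for i in idx if y_pred[i] == 1 and y_true[i] == 1)
    fp = sum(1 for i in idx if y_pred[i] == 1 and y_true[i] == 0)
    return {
        "selection_rate": rate(positives, len(idx)),   # statistical/demographic parity compares this
        "tpr": rate(tp, sum(true)),                    # equal opportunity compares this
        "fpr": rate(fp, len(true) - sum(true)),        # equalized odds compares tpr and fpr together
    }

for g in ("a", "b"):
    print(g, group_metrics(g))
</code></pre>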
<h3 id=statistical-parity x-intersect="currentHeading = '#statistical-parity'">Statistical Parity</h3><p>There are several different types of statistical parity metrics that can be used to assess the fairness of a decision-making process or outcome for different groups within a population. Some common types of statistical parity metrics include:</p><ul><li><a href=#group-statistical-parity>Group Statistical Parity</a>: This metric assesses whether the proportion of positive outcomes (e.g. being approved for a loan) is the same for all groups.</li><li><a href=#statistical-parity-difference>Statistical parity difference</a> (SPD): This metric measures the difference between the proportion of positive outcomes for two groups.</li><li><a href=#disparate-impact-ratio>Disparate Impact Ratio</a> (DIR): This metric measures the ratio between the proportion of positive outcomes for two groups.</li><li>Subgroup statistical parity: This metric assesses whether the proportion of positive outcomes is the same for subgroups within a larger group. For example, this could be used to assess the fairness of a hiring process for men and women within a particular job category.</li><li>Individual statistical parity: This metric assesses whether the probability of a positive outcome is the same for all individuals, regardless of their group membership.</li><li>Pairwise statistical parity: This metric assesses whether the probability of a positive outcome is the same for all pairs of groups. For example, this could be used to compare the probability of a positive outcome for men and women, as well as for men and people of other gender identities.</li></ul><p>It is important to note that no single statistical parity metric is a perfect measure of fairness, and different metrics may be more or less appropriate depending on the specific context and goals of the evaluation. It may also be helpful to use a combination of different statistical parity metrics in order to get a more comprehensive understanding of the fairness of a decision-making process or outcome.</p><h4 id=group-statistical-parity x-intersect="currentHeading = '#group-statistical-parity'">Group Statistical Parity</h4><p><strong>Statistical Parity Difference</strong> (SPD) and <strong>Group Statistical Parity</strong> are two different group fairness metrics that are used to assess the fairness of a decision-making process or outcome for different groups within a population.</p><p>Group statistical parity measures whether the proportion of positive outcomes (<em>e.g.</em> being approved for a loan) is the same for all groups. For example, if the proportion of race A applicants who are approved for a loan is 50%, and the proportion of race B applicants who are approved is also 50%, then the loan approval process could be considered fair according to this metric.</p>
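<p>A minimal sketch of this check (the records below are invented so that both groups are approved at the same 50% rate, mirroring the example above):</p><pre><code class=language-python># Illustrative sketch: per-group approval rates for group statistical parity.
records = [
    ("race A", 1), ("race A", 0), ("race A", 1), ("race A", 0),
    ("race B", 1), ("race B", 0), ("race B", 0), ("race B", 1),
]

def approval_rates(records):
    totals, approved = {}, {}
    for grp, outcome in records:
        totals[grp] = totals.get(grp, 0) + 1
        approved[grp] = approved.get(grp, 0) + outcome
    return {grp: approved[grp] / totals[grp] for grp in totals}

print(approval_rates(records))  # {'race A': 0.5, 'race B': 0.5} -> equal rates, so parity holds
</code></pre>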
<h4 id=statistical-parity-difference x-intersect="currentHeading = '#statistical-parity-difference'">Statistical parity difference</h4><p>Statistical parity difference (SPD), on the other hand, measures the difference between the proportion of positive outcomes for two groups. It is often used to assess the fairness of a decision-making process or outcome where there are two groups of interest, such as men and women or people of different racial groups. SPD is calculated as the difference between the proportion of positive outcomes for one group and the proportion of positive outcomes for the other group.</p><p>One key difference between group statistical parity and SPD is that group statistical parity assesses fairness for all groups within a population, while SPD is specifically designed to compare the fairness of two groups. Group statistical parity is also based on proportions, while SPD is based on the difference between proportions.</p><p>For example, consider a credit approval process where 60% of white applicants are approved and 50% of Black applicants are approved. According to group statistical parity, this process would not be considered fair, as the proportion of approved applicants is not the same for both groups. However, according to SPD, the difference between the proportions of approved applicants for the two groups is 10 percentage points, which may be considered acceptable depending on the specific context and goals of the evaluation.</p><p>The formal definition of SPD is</p><p>$$
SPD=p(\hat{y}=1|\mathcal{D}_u)-p(\hat{y}=1|\mathcal{D}_p),
$$</p><p>where $\hat{y}=1$ is the favourable outcome and $\mathcal{D}_u, \mathcal{D}_p$ are respectively the unprivileged and privileged group data.</p>
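<p>As a concrete (illustrative) check, plugging the assumed rates from the credit example above into this definition, with the Black applicants taken as the unprivileged group $\mathcal{D}_u$ and the white applicants as the privileged group $\mathcal{D}_p$:</p><pre><code class=language-python># Illustrative sketch of SPD using the assumed example rates from the text.
p_favourable_unprivileged = 0.50   # p(y_hat = 1 | D_u), Black applicants
p_favourable_privileged = 0.60     # p(y_hat = 1 | D_p), white applicants

spd = p_favourable_unprivileged - p_favourable_privileged
print(f"SPD = {spd:.2f}")  # SPD = -0.10: negative values mean the unprivileged
                           # group receives the favourable outcome less often
</code></pre>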
<h4 id=disparate-impact-ratio x-intersect="currentHeading = '#disparate-impact-ratio'">Disparate Impact Ratio</h4><p><strong>Disparate Impact Ratio</strong> (DIR) is specifically a ratio-based statistical parity metric, as it measures the ratio of the probability of a positive outcome for one group to the probability of a positive outcome for another group. It is often used to assess the fairness of a decision-making process or outcome where there are two groups of interest, such as men and women or people of different racial groups.</p><p>The formal definition of DIR is</p><p>$$
DIR=\frac{p(\hat{y}=1|\mathcal{D}_u)}{p(\hat{y}=1|\mathcal{D}_p)}.
$$</p>
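<p>Using the same assumed rates as in the SPD sketch above, the ratio works out as follows (a DIR of 1 indicates parity; in practice the ratio is often judged against a threshold such as the widely cited "four-fifths" rule of 0.8):</p><pre><code class=language-python># Illustrative sketch of DIR, reusing the assumed rates from the SPD example.
p_favourable_unprivileged = 0.50   # p(y_hat = 1 | D_u)
p_favourable_privileged = 0.60     # p(y_hat = 1 | D_p)

dir_value = p_favourable_unprivileged / p_favourable_privileged
print(f"DIR = {dir_value:.2f}")  # DIR = 0.83; values further below 1 indicate a
                                 # larger disadvantage for the unprivileged group
</code></pre>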
</div></article><div id=footer-post-container><div id=footer-post><div id=nav-footer style=display:none><ul><li><a href=https://ruivieira.dev/>Home</a></li><li><a href=https://ruivieira.dev/blog/>Blog</a></li><li><a href=https://ruivieira.dev/draw/>Drawings</a></li><li><a href=https://ruivieira.dev/map/>All pages</a></li><li><a href=https://ruivieira.dev/search.html>Search</a></li></ul></div><div id=toc-footer style=display:none><nav id=TableOfContents><ul><li><a href=#group-fairness>Group fairness</a><ul><li><a href=#statistical-parity>Statistical Parity</a></li></ul></li></ul></nav></div><div id=share-footer style=display:none></div><div id=actions-footer><a id=menu-toggle class=icon href=# onclick='return $("#nav-footer").toggle(),!1' aria-label=Menu><i class="fas fa-bars fa-lg" aria-hidden=true></i> Menu</a>
<a id=toc-toggle class=icon href=# onclick='return $("#toc-footer").toggle(),!1' aria-label=TOC><i class="fas fa-list fa-lg" aria-hidden=true></i> TOC</a>
<a id=share-toggle class=icon href=# onclick='return $("#share-footer").toggle(),!1' aria-label=Share><i class="fas fa-share-alt fa-lg" aria-hidden=true></i> share</a>
<a id=top style=display:none class=icon href=# onclick='$("html, body").animate({scrollTop:0},"fast")' aria-label="Top of Page"><i class="fas fa-chevron-up fa-lg" aria-hidden=true></i> Top</a></div></div></div><footer id=footer><div class=footer-left>Copyright © 2024 Rui Vieira</div><div class=footer-right><nav><ul><li><a href=https://ruivieira.dev/>Home</a></li><li><a href=https://ruivieira.dev/blog/>Blog</a></li><li><a href=https://ruivieira.dev/draw/>Drawings</a></li><li><a href=https://ruivieira.dev/map/>All pages</a></li><li><a href=https://ruivieira.dev/search.html>Search</a></li></ul></nav></div></footer></div></body><link rel=stylesheet href=https://ruivieira.dev/css/fa.min.css><script src=https://ruivieira.dev/js/jquery-3.6.0.min.js></script><script src=https://ruivieira.dev/js/mark.min.js></script><script src=https://ruivieira.dev/js/main.js></script><script>MathJax={tex:{inlineMath:[["$","$"],["\\(","\\)"]]},svg:{fontCache:"global"}}</script><script type=text/javascript id=MathJax-script async src=https://cdn.jsdelivr.net/npm/mathjax@3/es5/tex-mml-chtml.js></script></html>