FAIR MACHINE LEARNING
The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning
Sam Corbett-Davies, Stanford University
Sharad Goel, Stanford University
August 14, 2018
STRUCTURE
‣ MOTIVATION
‣ FAIRNESS
‣ ALGORITHMIC FAIRNESS
  ‣ RISK ASSESSMENT: BASICS & ASSUMPTIONS
  ‣ FORMAL DEFINITIONS OF FAIRNESS
  ‣ LIMITS
‣ PROBLEMS WITH DESIGNING FAIR ALGORITHMS
‣ SUMMARY
MOTIVATION
Source: https://www.newyorker.com/magazine/2019/04/15/who-belongs-in-prison
Source: https://www.wn.de/Service/Verbrauchertipps/Kredite-Die-Top-10-Verwendungszwecke-fuer-einen-Privatkredit
Source: https://de.wikipedia.org/wiki/Datei:Schufa_Logo.svg
FAIRNESS
‣ fairness = !(discrimination)
‣ fair algorithms are algorithms that do not discriminate
‣ disparate impact: a decision produces unjustified differences between groups
ALGORITHMIC FAIRNESS: RISK ASSESSMENT
RISK ASSESSMENT: FLOW
[Flow diagram: features X -> prediction of the outcome Y -> decision. The feature vector x = (xp, xu) splits into protected and unprotected features, e.g. x = (age, gender, race, credit history).]
RISK ASSESSMENT: ASSUMPTION
TRUE RISK: r(x) = Pr(Y = 1 | X = x)
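The true risk r(x) is never observed directly; in practice it is approximated by a model fit to (X, Y) pairs. A minimal sketch of that step, assuming synthetic data and logistic regression as the risk model (both my own illustrative choices, not prescribed by the slides):

```python
# Minimal sketch: approximating r(x) = Pr(Y = 1 | X = x) from data.
# The data-generating process below is purely illustrative.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 10_000
X = rng.normal(size=(n, 3))                       # hypothetical feature vectors
true_logit = 1.5 * X[:, 0] - 0.5 * X[:, 1]        # assumed ground truth
Y = rng.binomial(1, 1 / (1 + np.exp(-true_logit)))

model = LogisticRegression().fit(X, Y)
r_hat = model.predict_proba(X)[:, 1]              # estimated risk per individual
```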
RISK ASSESSMENT: RISK DISTRIBUTIONS
[Figure: two density plots of true risk, one per group; x-axis: TRUE RISK (0 to 1), y-axis: DENSITY.]
RISK ASSESSMENT: RISK DISTRIBUTIONS
‣ ASSUMPTIONS
  ‣ 2 groups
  ‣ same mean risk (fixed for all groups)
  ‣ different distributions
  ‣ the distribution depends on x
Source: S. Corbett-Davies, S. Goel, 2018. The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning. Computing Research Repository.
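A short simulation of the assumptions above; the Beta parameters are my own illustrative choice, picked only so that both groups share a mean risk of 0.2 while their distributions differ in shape:

```python
# Two groups with equal mean risk but different risk distributions.
import numpy as np

rng = np.random.default_rng(0)
risk_a = rng.beta(2.0, 8.0, size=100_000)    # group A: mean 2 / (2 + 8) = 0.2
risk_b = rng.beta(0.5, 2.0, size=100_000)    # group B: mean 0.5 / (0.5 + 2) = 0.2

print(risk_a.mean(), risk_b.mean())          # both ~0.2: same mean
print(risk_a.std(), risk_b.std())            # different spreads: different shapes
```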
RISK ASSESSMENT: FORMING DECISIONS
‣ DISTRIBUTION -> DECISION
  ‣ apply a threshold to the risk
  ‣ trade off costs and benefits
  ‣ the maximum utility of a decision is the sweet spot between costs and benefits
[Figure: risk density with a decision threshold marked at 0.5; x-axis: TRUE RISK (0 to 1), y-axis: DENSITY.]
RISK ASSESSMENT: UTILITY FUNCTIONS
‣ FIND THE OPTIMAL DECISION
  ‣ maximize the utility of the decision
  ‣ u(0) = b00 * (1 - r(x)) - c01 * r(x)
  ‣ u(1) = b11 * r(x) - c10 * (1 - r(x))
  ‣ b11 / b00: benefit of a correct positive / negative decision
  ‣ c10 / c01: cost of an incorrect positive / negative decision
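The two utility functions transcribed directly into code (same notation as above; the parameter meanings follow the slide's annotations):

```python
# Expected utility of each decision for an individual with risk r = r(x).
def u0(r, b00, c01):
    # decide 0: benefit b00 if Y = 0 (prob 1 - r), cost c01 if Y = 1 (prob r)
    return b00 * (1 - r) - c01 * r

def u1(r, b11, c10):
    # decide 1: benefit b11 if Y = 1 (prob r), cost c10 if Y = 0 (prob 1 - r)
    return b11 * r - c10 * (1 - r)
```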
RISK ASSESSMENT: THRESHOLD RULES
‣ THRESHOLD RULES
  ‣ from the utility functions:
    u(1) ≥ u(0) ↔ r(x) ≥ (b00 + c10) / (b00 + b11 + c01 + c10)
  ‣ the right-hand side is the threshold that produces optimal decisions
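The threshold rule as code, including the special case from the next slide where all benefits and costs equal 1:

```python
# Decide 1 iff u(1) >= u(0) iff r(x) >= (b00 + c10) / (b00 + b11 + c01 + c10).
def optimal_threshold(b00, b11, c01, c10):
    return (b00 + c10) / (b00 + b11 + c01 + c10)

def decide(r, b00=1.0, b11=1.0, c01=1.0, c10=1.0):
    return r >= optimal_threshold(b00, b11, c01, c10)

# With b00 = b11 = c01 = c10 = 1 the threshold is 2 / 4 = 0.5.
assert optimal_threshold(1, 1, 1, 1) == 0.5
```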
RISK ASSESSMENT: THRESHOLD RULES
‣ UTILITY FUNCTIONS & THRESHOLD RULES
  ‣ with b00 = b11 = c10 = c01 = 1, the threshold is t = (1 + 1) / (1 + 1 + 1 + 1) = 0.5
[Figure: the two risk densities with the threshold t = 0.5 marked on the TRUE RISK axis.]
PROBLEM?
ALGORITHMIC FAIRNESS: FORMAL DEFINITIONS
FORMAL DEFINITIONS: ANTI-CLASSIFICATION
‣ IDEA
  ‣ decisions should not explicitly depend on protected attributes
  ‣ forbids the use of protected features in X
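In a pipeline, anti-classification is usually enforced by dropping the protected columns before fitting. A minimal sketch, assuming a pandas DataFrame with hypothetical column names (gender, race, label Y):

```python
# Anti-classification as a preprocessing constraint: train on x_u only.
import pandas as pd
from sklearn.linear_model import LogisticRegression

PROTECTED = ["gender", "race"]   # hypothetical protected feature names

def fit_anticlassification(df: pd.DataFrame, label: str = "Y"):
    X_u = df.drop(columns=PROTECTED + [label])   # unprotected features only
    return LogisticRegression().fit(X_u, df[label])
```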
FORMAL DEFINITIONS: ANTI-CLASSIFICATION
‣ LIMITS OF ANTI-CLASSIFICATION
  ‣ implicit dependence on included features that act as proxies (see the sketch below)
  ‣ sometimes the explicit use of group membership is needed for a fair decision
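A tiny simulation of the first limit: an allowed feature that correlates with group membership (a hypothetical proxy; all numbers illustrative) recovers the protected attribute quite accurately even though the attribute itself was never used:

```python
# Implicit dependence: a permitted proxy feature predicts group membership.
import numpy as np

rng = np.random.default_rng(0)
group = rng.binomial(1, 0.5, size=100_000)            # protected attribute
proxy = group + rng.normal(scale=0.5, size=100_000)   # permitted, correlated feature

guess = proxy > 0.5                                   # classify from the proxy alone
print((guess == group).mean())                        # ~0.84 accuracy
```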
FORMAL DEFINITIONS: ANTI-CLASSIFICATION
‣ LIMITS OF ANTI-CLASSIFICATION
  ‣ excluding protected attributes can lead to unjustified disparate impact
  ‣ example: gender-neutral vs. gender-specific recidivism estimates; women reoffend less often than men with comparable criminal histories, so a gender-neutral score overstates women's risk
Source: S. Corbett-Davies, S. Goel, 2018. The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning. Computing Research Repository.
FORMAL DEFINITIONS: CLASSIFICATION PARITY
‣ IDEA
  ‣ all groups have the same classification errors
  ‣ classification errors: false positive / false negative rates, precision, recall, proportion of positive decisions
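A sketch of the per-group statistics that classification parity compares, assuming 0/1 NumPy arrays for true labels y, decisions d, and group ids g:

```python
# Per-group classification statistics (empty subgroups would yield NaN).
import numpy as np

def group_rates(y, d, g):
    rates = {}
    for grp in np.unique(g):
        m = g == grp
        rates[grp] = {
            "fpr": d[m & (y == 0)].mean(),          # false positive rate
            "fnr": 1 - d[m & (y == 1)].mean(),      # false negative rate
            "precision": y[m & (d == 1)].mean(),    # Pr(Y = 1 | d = 1)
            "positive_rate": d[m].mean(),           # proportion of positive decisions
        }
    return rates
```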
FORMAL DEFINITIONS: CLASSIFICATION PARITY
‣ CLASSIFICATION PARITY OF THE FALSE POSITIVE RATE
[Cartoon: two defendants with different features but the same (negative) label. One: "Look, we are different and have the same label!" The other: "I don't care. We still have the same probability of being wrongfully incarcerated!"]
FORMAL DEFINITIONS: CLASSIFICATION PARITY
‣ CLASSIFICATION PARITY
[Figure: the distribution of all decisions splits 40 % / 60 %; under classification parity, the decisions within each subset (WOMEN, MEN) show the same 40 % / 60 % split.]
FORMAL DEFINITIONS: CLASSIFICATION PARITY
‣ LIMITS OF CLASSIFICATION PARITY
  ‣ risk distributions differ among groups
  ‣ they depend entirely on X and on how well X describes the group
  ‣ threshold rules therefore produce unequal classification errors among groups (simulated below)
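A simulation of this limit (distribution parameters are illustrative): a single shared threshold applied to two different risk distributions yields clearly different false positive rates:

```python
# Same threshold, different risk distributions -> different FPRs.
import numpy as np

rng = np.random.default_rng(0)

def fpr_at_threshold(alpha, beta, t=0.5, n=200_000):
    r = rng.beta(alpha, beta, size=n)   # true risk in this group
    y = rng.binomial(1, r)              # outcomes drawn from the true risk
    d = r >= t                          # shared threshold rule
    return d[y == 0].mean()             # Pr(d = 1 | Y = 0)

print(fpr_at_threshold(2.0, 8.0))   # tight distribution: low FPR
print(fpr_at_threshold(0.5, 2.0))   # heavier upper tail: higher FPR
```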
FORMAL DEFINITIONS: CLASSIFICATION PARITY
‣ LIMITS OF CLASSIFICATION PARITY
  ‣ hypothetical risk distributions
  ‣ infra-marginal statistics differ: error rates aggregate over a group's whole risk distribution, not just the margin at the threshold, so they differ across groups even under identical optimal thresholds
Source: S. Corbett-Davies, S. Goel, 2018. The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning. Computing Research Repository.
FORMAL DEFINITIONS: CALIBRATION
‣ IDEA
  ‣ given any risk score, outcomes must be independent of protected attributes
  ‣ ensures that risk scores carry the same meaning for all groups
  ‣ Pr(Y = 1 | s(X), Xp) = Pr(Y = 1 | s(X))
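A sketch of an empirical calibration check (the binning scheme is my own assumption): within each score bin, the observed rate of Y = 1 should match across groups:

```python
# Compare Pr(Y = 1 | score bin, group) across groups.
import numpy as np

def calibration_table(scores, y, g, bins=10):
    edges = np.linspace(0.0, 1.0, bins + 1)
    idx = np.clip(np.digitize(scores, edges) - 1, 0, bins - 1)
    table = {}
    for grp in np.unique(g):
        m = g == grp
        # NaN marks bins with no members of this group.
        table[grp] = [y[m & (idx == b)].mean() for b in range(bins)]
    return table
```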
FORMAL DEFINITIONS: CALIBRATION
‣ IDEA
[Cartoon: two defendants, both with risk score 7. One: "Look, we got the same score!" The other: "Great! So we will get the same label!"]
FORMAL DEFINITIONS: CALIBRATION
‣ LIMITS OF CALIBRATION
  ‣ insufficient to guarantee:
    ‣ equitable decisions
    ‣ accurate risk scores
FORMAL DEFINITIONS: CALIBRATION
‣ LIMITS OF CALIBRATION
[Figure from the paper illustrating the limits of calibration.]
Source: S. Corbett-Davies, S. Goel, 2018. The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning. Computing Research Repository.
PROBLEMS WITH DESIGNING FAIR ALGORITHMS
PROBLEMS WITH DESIGNING FAIR ALGORITHMS
‣ MEASUREMENT ERROR
  ‣ decisions are based on the true risk
  ‣ the true risk is not observed, only approximated through X and Y
  ‣ label bias (errors in Y)
  ‣ feature bias (errors in X)
PROBLEMS WITH DESIGNING FAIR ALGORITHMS
‣ MEASUREMENT ERROR: LABEL BIAS
  ‣ the Y we want to predict != the observed Y:
    ‣ Pr(reoffend | released) != Pr(offend)
    ‣ e.g. pretrial: the observed Y covers only the crimes that we know about
  ‣ no solutions yet
  ‣ but: check the estimation strategy
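A small simulation of the pretrial version of this problem (all numbers illustrative): because only released defendants can be observed reoffending, the observed rate differs from the population offense rate:

```python
# Label bias: Pr(reoffend | released) != Pr(offend).
import numpy as np

rng = np.random.default_rng(0)
risk = rng.beta(2, 5, size=100_000)            # true offense probability
released = risk < 0.4                          # judges release the low-risk
offend = rng.binomial(1, risk).astype(bool)

print(offend.mean())                           # offense rate, full population
print(offend[released].mean())                 # observed rate among the released
```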
PROBLEMS WITH DESIGNING FAIR ALGORITHMS
‣ MEASUREMENT ERROR: FEATURE BIAS
  ‣ differences in the predictive power of features
  ‣ e.g. minorities are more likely to be arrested, so the feature "past criminal behaviour" can skew the data
  ‣ the feature vector captures fewer features than the real world offers
  ‣ solutions:
    ‣ include group membership in the predictive model
    ‣ use more data (more features)
PROBLEMS WITH DESIGNING FAIR ALGORITHMS
‣ SAMPLE BIAS
  ‣ sample data should reflect reality
  ‣ problems:
    ‣ reality: the true distribution is unknown
    ‣ time: the model might become outdated
  ‣ no perfect solution:
    ‣ try to use representative training data
SUMMARY
‣ risk assessment tools
  ‣ threshold rules aim to maximize utility conditional on the approximated true risk
‣ imperfect mathematical definitions of fairness
  ‣ anti-classification, classification parity, calibration
‣ designing fair algorithms bears many other problems
  ‣ e.g. sample bias, feature and label bias
SOURCES
‣ S. Corbett-Davies, S. Goel, 2018. The Measure and Mismeasure of Fairness: A Critical Review of Fair Machine Learning. Computing Research Repository.
‣ S. Goel, 2019. The Measure and Mismeasure of Fairness. Talk at the Simons Institute, UC Berkeley. https://simons.berkeley.edu/talks/measure-and-mismeasure-fairness. [last accessed Jan. 8th, 2020]
‣ C. Dwork, M. Hardt, T. Pitassi, O. Reingold, and R. Zemel, 2012. Fairness Through Awareness. In Proceedings of the 3rd Innovations in Theoretical Computer Science Conference (ITCS '12).
‣ M. Feldman, S. A. Friedler, J. Moeller, C. Scheidegger, and S. Venkatasubramanian, 2015. Certifying and Removing Disparate Impact. In Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD '15).
PROBLEMS WITH DESIGNING FAIR ALGORITHMS
‣ MODEL FORM AND INTERPRETABILITY
  ‣ best case: few features but much training data
    ‣ the statistical strategy has little effect on the estimates
  ‣ high-dimensional feature space or little training data
    ‣ the statistical strategy matters
  ‣ limited transparency of decisions
    ‣ addressed by the field of interpretable machine learning
PROBLEMS WITH DESIGNING FAIR ALGORITHMS
‣ EXTERNALITIES AND EQUILIBRIUM EFFECTS
  ‣ risk assessment tools can alter the populations to which they are applied
    ‣ populations/distributions change
    ‣ the model becomes outdated
    ‣ new training data is needed