Posted on 07-May-2020
Intro to Causality
David Madras
October 22, 2019
Simpson's Paradox
The Monty Hall Problem
1. Three doors – 2 have goats behind them, 1 has a car (you want to win the car)
2. You choose a door, but don't open it
3. The host, Monty, opens another door (not the one you chose), and shows you that there is a goat behind that door
4. You now have the option to switch your door from the one you chose to the other unopened door
5. What should you do? Should you switch?
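The lecture's answer can be checked empirically. A minimal Monte Carlo sketch (the door numbering, trial count, and seed are my own choices, not from the slides):

```python
import random

def play(switch, trials=100_000, seed=0):
    """Simulate the Monty Hall game and return the fraction of wins."""
    rng = random.Random(seed)
    wins = 0
    for _ in range(trials):
        car = rng.randrange(3)
        choice = rng.randrange(3)
        # Monty opens a door that is neither the player's choice nor the car
        opened = next(d for d in (0, 1, 2) if d != choice and d != car)
        if switch:
            # Switch to the remaining unopened door
            choice = next(d for d in (0, 1, 2) if d != choice and d != opened)
        wins += (choice == car)
    return wins / trials

stay = play(switch=False)
swap = play(switch=True)
print(f"stay: {stay:.3f}, switch: {swap:.3f}")  # stay ≈ 1/3, switch ≈ 2/3
```

Switching wins about two thirds of the time, which is the result the slides build toward.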
What's Going On?
Causation != Correlation
• In machine learning, we try to learn correlations from data
• "When can we predict X from Y?"
• In causal inference, we try to model causation
• "When does X cause Y?"
• These are not the same!
• Ice cream consumption correlates with murder rates
• Ice cream does not cause murder (usually)
Correlations Can Be Misleading
https://www.tylervigen.com/spurious-correlations
Causal Modelling
• Two options:
1. Run a randomized experiment
2. Make assumptions about how our data is generated
Causal DAGs
• Pioneered by Judea Pearl
• Describes the (stochastic) generative process of the data
Causal DAGs
• T is a medical treatment
• Y is a disease
• X are other features about patients (say, age)
• We want to know the causal effect of our treatment on the disease.
Causal DAGs
• Experimental data: randomized experiment
• We decide which people should take T
• Observational data: no experiment
• People chose whether or not to take T
• Experiments are expensive and rare
• Observations can be biased
• E.g. what if mostly young people choose T?
Asking Causal Questions
• Suppose T is binary (1: received treatment, 0: did not)
• Suppose Y is binary (1: disease cured, 0: disease not cured)
• We want to know: "If we give someone the treatment (T=1), what is the probability they are cured (Y=1)?"
• This is not equal to P(Y=1 | T=1)
• Suppose mostly young people take the treatment, and most were cured, i.e. P(Y=1 | T=1) is high
• Is this because the treatment is good? Or because they are young?
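The age story above is easy to reproduce in a simulation. In this sketch the treatment does nothing at all, yet P(Y=1 | T=1) looks impressive; all probabilities here are hypothetical numbers chosen for illustration:

```python
import random

# Hypothetical generative process: age (X) confounds both treatment (T) and cure (Y).
# The treatment has NO effect here -- cure depends only on age.
rng = random.Random(0)
samples = []
for _ in range(200_000):
    young = rng.random() < 0.5                  # X: age group
    t = rng.random() < (0.9 if young else 0.1)  # young people mostly choose T
    y = rng.random() < (0.8 if young else 0.2)  # cure depends only on age
    samples.append((t, y))

p_cured_treated = sum(y for t, y in samples if t) / sum(1 for t, y in samples if t)
p_cured_untreated = sum(y for t, y in samples if not t) / sum(1 for t, y in samples if not t)
print(p_cured_treated, p_cured_untreated)  # ~0.74 vs ~0.26: T "looks" great
```

The gap is entirely due to who chose the treatment, not what the treatment did.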
Correlation vs. Causation
• Correlation: P(Y=1 | T=1)
• In the observed data, how often do people who take the treatment become cured?
• The observed data may be biased!
Correlation vs. Causation
• Let's simulate a randomized experiment
• i.e. cut the arrow from X to T
• This is called a do-operation
• Then, we can estimate causation: P(Y=1 | do(T=1))
Correlation vs. Causation
• Correlation: P(Y=1 | T=1)
• Causation: P(Y=1 | do(T=1)) – treatment is independent of X
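The two quantities can be computed side by side on a small discrete model. This uses the standard backdoor adjustment P(Y=1 | do(T=1)) = Σ_x P(Y=1 | T=1, X=x) P(X=x); the specific probabilities are illustrative, not from the lecture:

```python
# Toy joint distribution over (X: young?, T, Y), chosen so T has no real effect.
p_x = {True: 0.5, False: 0.5}                    # P(X)
p_t_given_x = {True: 0.9, False: 0.1}            # P(T=1 | X)
p_y_given_xt = {(True, 1): 0.8, (True, 0): 0.8,  # P(Y=1 | X, T): depends on X only
                (False, 1): 0.2, (False, 0): 0.2}

# Observational quantity P(Y=1 | T=1):
num = sum(p_x[x] * p_t_given_x[x] * p_y_given_xt[(x, 1)] for x in p_x)
den = sum(p_x[x] * p_t_given_x[x] for x in p_x)
p_y_obs = num / den

# Interventional quantity via the backdoor adjustment:
# P(Y=1 | do(T=1)) = sum_x P(Y=1 | T=1, X=x) P(X=x)
p_y_do = sum(p_y_given_xt[(x, 1)] * p_x[x] for x in p_x)

print(p_y_obs, p_y_do)  # 0.74 vs 0.5
```

Observationally the treatment looks strong (0.74), but under the intervention it does nothing beyond the base rate (0.5).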
Inverse Propensity Weighting
• Can calculate this using inverse propensity scores
• Rather than adjusting for X, it is sufficient to adjust for the propensity score P(T | X)
• These are called stabilized weights
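A minimal IPW sketch on the same kind of confounded data: each treated sample is weighted by 1/P(T=1 | X). The setup (true propensities, probabilities) is hypothetical and assumes the propensity score is known rather than estimated:

```python
import random

rng = random.Random(1)
n = 200_000
# Hypothetical setup: age X confounds T and Y; T has no real effect.
e = {True: 0.9, False: 0.1}   # true propensity P(T=1 | X)
num = 0.0
for _ in range(n):
    x = rng.random() < 0.5
    t = rng.random() < e[x]
    y = rng.random() < (0.8 if x else 0.2)
    if t:
        num += y / e[x]       # inverse propensity weight
ipw_estimate = num / n         # estimates P(Y=1 | do(T=1))
print(ipw_estimate)            # ≈ 0.5, the confound-free quantity
```

Upweighting the rare treated old patients (weight 1/0.1 = 10) undoes the selection bias that made the naive estimate 0.74.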
Matching Estimators
• Match up samples with different treatments that are near to each other
• Similar to reweighting
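A nearest-neighbor matching sketch, assuming a hypothetical continuous age and an additive true effect of +0.1 on the cure probability (all numbers are my own, for illustration):

```python
import random

rng = random.Random(2)
# Young patients choose treatment more often; cure rate falls with age;
# the true treatment effect is +0.1 on the cure probability.
data = []
for _ in range(2_000):
    age = rng.uniform(20, 80)
    t = rng.random() < (0.8 if age < 50 else 0.2)
    y = rng.random() < (0.9 - 0.01 * (age - 20) + (0.1 if t else 0.0))
    data.append((age, t, y))

treated = [(a, y) for a, t, y in data if t]
control = [(a, y) for a, t, y in data if not t]

# For each treated unit, compare against the control unit with the nearest age.
diffs = []
for a, y in treated:
    _, y_match = min(control, key=lambda c: abs(c[0] - a))
    diffs.append(y - y_match)
att = sum(diffs) / len(diffs)
print(att)  # close to the true effect of 0.1
```

Because each treated patient is compared to a similarly aged control, the age confounding largely cancels, much like reweighting.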
Review: What to do with a causal DAG
The causal effect of T on Y is given by the adjustment formula:
P(Y | do(T)) = Σ_x P(Y | T, X=x) P(X=x)
This is great! But we've made some assumptions.
Simpson's Paradox, Explained
[Tables omitted from the transcript: outcomes (Y) by treatment (Trmt), stratified by Size]
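The slides' tables did not survive the transcript, but the paradox itself can be shown with the classic kidney-stone-style counts (illustrative numbers, not the lecture's):

```python
# (treatment, size) -> (cured, total); illustrative counts
counts = {
    ("A", "small"): (81, 87),   ("A", "large"): (192, 263),
    ("B", "small"): (234, 270), ("B", "large"): (55, 80),
}

def rate(trmt, size=None):
    """Cure rate for a treatment, optionally within one size stratum."""
    items = [v for (t, s), v in counts.items() if t == trmt and size in (None, s)]
    cured = sum(c for c, _ in items)
    total = sum(n for _, n in items)
    return cured / total

# Within EVERY stratum, treatment A beats B...
assert rate("A", "small") > rate("B", "small")
assert rate("A", "large") > rate("B", "large")
# ...yet B wins in aggregate -- Simpson's paradox.
assert rate("B") > rate("A")
print(rate("A"), rate("B"))  # 0.78 vs ~0.826
```

The reversal happens because A is given mostly to the hard (large) cases, so the aggregate mixes treatment quality with case severity.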
Monty Hall Problem, Explained
Boring explanation:
Monty Hall Problem, Explained
Causal explanation:
• My door location is correlated with the car location, conditioned on which door Monty opens!
[DAG: Car Location → Opened Door ← My Door]
https://twitter.com/EpiEllie/status/1020772459128197121
Monty Hall Problem, Explained
Causal explanation:
• My door location is correlated with the car location, conditioned on which door Monty opens!
• This is because Monty won't show me the car
• If he's guessing also, then the correlation disappears
[DAG: Car Location → Monty's Door ← My Door]
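The "if he's guessing, the correlation disappears" claim can be checked directly: compare an informed Monty against one who opens a random non-chosen door (discarding games where he accidentally reveals the car). Setup details are my own:

```python
import random

def switch_win_rate(monty_knows, trials=100_000, seed=3):
    """Win rate of switching, among games where Monty reveals a goat."""
    rng = random.Random(seed)
    wins = games = 0
    for _ in range(trials):
        car, choice = rng.randrange(3), rng.randrange(3)
        if monty_knows:
            # Informed Monty always opens a goat door
            opened = rng.choice([d for d in (0, 1, 2) if d != choice and d != car])
        else:
            # Guessing Monty may reveal the car; those games are discarded
            opened = rng.choice([d for d in (0, 1, 2) if d != choice])
            if opened == car:
                continue
        games += 1
        final = next(d for d in (0, 1, 2) if d != choice and d != opened)
        wins += (final == car)
    return wins / games

print(switch_win_rate(True))   # ≈ 2/3: Monty's informed choice carries information
print(switch_win_rate(False))  # ≈ 1/2: with a guessing Monty the advantage vanishes
```

Conditioning on an informed Monty's door is conditioning on a collider, which is exactly what creates the correlation between my door and the car.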
Structural Assumptions
• All of this assumes that our assumptions about the DAG that generated our data are correct
• Specifically, we assume that there are no hidden confounders
• Confounder: a variable which causally affects both the treatment (T) and the outcome (Y)
• No hidden confounders means that we have observed all confounders
• This is a strong assumption!
Hidden Confounders
• Cannot calculate P(Y | do(T)) here, since U is unobserved
• We say in this case that the causal effect is unidentifiable
• Even in the case of infinite data and computation, we can never calculate this quantity
[DAG: observed X and hidden confounder U both affect T and Y]
What Can We Do with Hidden Confounders?
• Instrumental variables
• Find some variable which affects only the treatment
• Sensitivity analysis
• Essentially, assume some maximum amount of confounding
• Yields a confidence interval
• Proxies
• Other observed features give us information about the hidden confounder
Instrumental Variables
• Find an instrument – a variable which only affects the treatment
• Decouples treatment and outcome variation
• With linear functions, solve analytically
• But can also use any function approximators
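For the linear case, the analytic solution is the classic IV ratio cov(Z, Y)/cov(Z, T). A sketch on simulated data with a hidden confounder (the model, coefficients, and true effect β = 2.0 are hypothetical):

```python
import random

rng = random.Random(4)
# Linear model: instrument Z -> T -> Y, hidden confounder U -> T and U -> Y.
n = 100_000
beta = 2.0                           # true causal effect of T on Y
Z, T, Y = [], [], []
for _ in range(n):
    z = rng.gauss(0, 1)              # instrument: affects only the treatment
    u = rng.gauss(0, 1)              # hidden confounder
    t = 1.5 * z + u + rng.gauss(0, 1)
    y = beta * t + 3.0 * u + rng.gauss(0, 1)
    Z.append(z); T.append(t); Y.append(y)

def cov(a, b):
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    return sum((x - ma) * (y - mb) for x, y in zip(a, b)) / len(a)

ols = cov(T, Y) / cov(T, T)   # naive regression: biased by the U -> Y path
iv  = cov(Z, Y) / cov(Z, T)   # IV estimate: Z moves T independently of U
print(ols, iv)                # ols is biased upward; iv ≈ 2.0
```

The instrument recovers β because its variation in T is uncorrelated with U, exactly the decoupling the slide describes.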
Sensitivity Analysis
• Determine the relationship between the strength of confounding and the causal effect
• Example: Does smoking cause lung cancer? (we now know: yes)
• There may be a gene that causes lung cancer and smoking
• We can't know for sure!
• However, we can figure out how strong this gene would need to be to result in the observed effect
• Turns out – very strong
[DAG: Gene affects both Smoking and Cancer, alongside Smoking → Cancer and features X]
Sensitivity Analysis
• The idea is: parametrize your uncertainty, and then decide which values of that parameter are reasonable
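A toy version of the smoking example: assume smoking has no effect at all, and ask how strong a gene would have to be to produce the observed smoker/non-smoker risk ratio on its own. All rates and the "strength" parametrization are hypothetical:

```python
def observed_rr(strength, p_g=0.3):
    """Risk ratio produced purely by a gene G of the given strength,
    when smoking itself has NO causal effect (hypothetical numbers)."""
    p_smoke  = {1: min(1.0, 0.10 * strength), 0: 0.10}  # G raises smoking rate
    p_cancer = {1: min(1.0, 0.01 * strength), 0: 0.01}  # G raises cancer rate
    def p_cancer_given(smoke):
        # Mix over gene status, weighted by P(G) and P(smoke-status | G)
        w1 = p_g * (p_smoke[1] if smoke else 1 - p_smoke[1])
        w0 = (1 - p_g) * (p_smoke[0] if smoke else 1 - p_smoke[0])
        return (w1 * p_cancer[1] + w0 * p_cancer[0]) / (w1 + w0)
    return p_cancer_given(True) / p_cancer_given(False)

for s in (1, 2, 5, 10):
    print(s, round(observed_rr(s), 2))
```

Weak confounding (strength near 1) produces a risk ratio near 1; only an implausibly strong gene approaches the large ratios seen in smoking data, which is the lecture's "turns out – very strong" point.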
Using Proxies
• Instead of measuring the hidden confounder, measure some proxies (V = f_prox(U))
• Proxies: variables that are caused by the confounder
• If U is a child's age, V might be height
• If f_prox is known or linear, we can estimate this effect
[DAG: hidden U affects T, Y, and the observed proxy V; X observed]
Using Proxies
• If f_prox is non-linear, we might try the Causal Effect VAE
• Learn a posterior distribution P(U | V) with variational methods
• However, this method does not provide theoretical guarantees
• Results may be unverifiable: proceed with caution!
Causality and Other Areas of ML
• Reinforcement learning
• Natural combination – RL is all about taking actions in the world
• Off-policy learning already has elements of causal inference
• Robust classification
• Causality can be a natural language for specifying distributional robustness
• Fairness
• If the dataset is biased, ML outputs might be unfair
• Causality helps us think about dataset bias, and mitigate unfair effects
Quick Note on Fairness and Causality
• Many fairness problems (e.g. loans, medical diagnosis) are actually causal inference problems!
• We talk about the label Y – however, this is not always observable
• For instance, we can't know if someone would return a loan if we don't give one to them!
• This means if we just train a classifier on historical data, our estimate will be biased
• Biased in the fairness sense and the technical sense
• General takeaway: if your data is generated by past decisions, think very hard about the output of your ML model!
Feedback Loops
• Takes us to part 2… feedback loops
• When ML systems are deployed, they make many decisions over time
• So our past predictions can impact our future predictions!
• Not good
Unfair Feedback Loops
• We'll look at "Fairness Without Demographics in Repeated Loss Minimization" (Hashimoto et al., ICML 2018)
• Domain: recommender systems
• Suppose we have a majority group (A=1) and a minority group (A=0)
• Our recommender system may have high overall accuracy but low accuracy on the minority group
• This can happen due to empirical risk minimization (ERM)
• Can also be due to repeated decision-making
Repeated Loss Minimization
• When we give bad recommendations, people leave our system
• Over time, the low-accuracy group will shrink
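A toy dynamics sketch of this shrinkage (a crude stand-in for the paper's setting; the squared-loss model, retention rule, and numbers are all my own assumptions):

```python
# A single model parameter serves two groups with different preferred targets.
# Each round: fit ERM on the current population, then users leave in
# proportion to their loss.
n_major, n_minor = 900.0, 100.0
target_major, target_minor = 0.0, 1.0   # the groups want different predictions
for step in range(20):
    # ERM minimizer of the population-weighted squared loss: the weighted mean
    theta = (n_major * target_major + n_minor * target_minor) / (n_major + n_minor)
    loss_major = (theta - target_major) ** 2
    loss_minor = (theta - target_minor) ** 2
    # higher-loss users are more likely to leave
    n_major *= (1 - loss_major)
    n_minor *= (1 - loss_minor)
minority_share = n_minor / (n_major + n_minor)
print(minority_share)  # collapses toward zero from an initial 0.10
```

ERM sits near the majority's target, so the minority's loss stays high, more of them leave, and the model drifts even further toward the majority: the feedback loop.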
Distributionally Robust Optimization
• Upweight examples with high loss in order to improve the worst case
• In the long run, this will prevent clusters from being underserved
• This ends up being equal to [equation not preserved in the transcript]
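Rerunning the same toy population dynamics, but optimizing for the worst group's loss instead of the average (a crude stand-in for DRO; same hypothetical setup as the ERM sketch):

```python
# Same two-group toy dynamics, but each round we minimize the WORST group's
# squared loss instead of the population average.
n_major, n_minor = 900.0, 100.0
target_major, target_minor = 0.0, 1.0
for step in range(20):
    theta = 0.5          # minimizer of max((theta - 0)^2, (theta - 1)^2)
    loss_major = (theta - target_major) ** 2   # 0.25
    loss_minor = (theta - target_minor) ** 2   # 0.25
    n_major *= (1 - loss_major)
    n_minor *= (1 - loss_minor)
minority_share = n_minor / (n_major + n_minor)
print(minority_share)  # stays at 0.10: neither group is driven out
```

Equalizing the worst-case loss keeps both groups' retention equal, so the minority share is stable instead of collapsing.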
Conclusion
• Your data is not what it seems
• ML models only work if your training/test set actually looks like the environment you deploy them in
• This can make your results unfair
• Or just incorrect
• So examine your model assumptions and data collection carefully!