![Page 1: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/1.jpg)
Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns
Lisa Friedland and David Jensen
Presented by Nick Mattei
![Page 2: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/2.jpg)
Introduction
Tribes – groups with similar traits in a large graph
Distinguish those that work together and move together intentionally
![Page 3: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/3.jpg)
Relationship Knowledge Discovery
Exploit connections among individuals to identify patterns and make predictions
Discover underlying dependencies
Links must be inferred
![Page 4: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/4.jpg)
Graph Mining
Discover Hidden Group Structures: Animal Herds, Webpages, Employees
Time Series Analysis: Co-integration (Economics)
Security and Intrusion Detection Dynamic Networks
![Page 5: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/5.jpg)
Motivation
National Association of Securities Dealers
Fraud
Collusion
4.8 Million Records
2.5 Million Reps at 560,000 Firms
100 Years of Data
![Page 6: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/6.jpg)
Complications
Jobs not necessarily in order (or singletons)
20% of employees hold more than one job at a time
10% begin multiple jobs (up to 16) on one day
Leave gaps between employment
Mergers and acquisitions
![Page 7: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/7.jpg)
Model
![Page 8: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/8.jpg)
Finding Anomalously Related Entities
Input:
Bipartite Graph: G = (R ∪ A, E)
Entities: R = {r1, r2, …, rn} (People)
Attributes: A = {a1, a2, …, am} (Orgs.)
Entities should connect several attributes
Model co-occurrence rates of pairs of attributes
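A minimal sketch of the input described above, assuming the bipartite graph arrives as a list of (entity, attribute) edges. The toy edge list and the names `attrs_of` and `pair_counts` are hypothetical and only illustrate counting how often pairs of attributes co-occur.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy edges (entity, attribute), i.e. (rep, organization).
edges = [
    ("r1", "a1"), ("r1", "a2"),
    ("r2", "a1"), ("r2", "a2"),
    ("r3", "a2"), ("r3", "a3"),
]

# Adjacency view of the bipartite graph: each entity maps to its attribute set.
attrs_of = {}
for rep, org in edges:
    attrs_of.setdefault(rep, set()).add(org)

# Count how many entities connect each unordered pair of attributes.
pair_counts = Counter()
for orgs in attrs_of.values():
    for pair in combinations(sorted(orgs), 2):
        pair_counts[pair] += 1
```

Dividing these counts by the number of entities would give one possible co-occurrence rate; the slides do not specify the normalization at this point.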
![Page 9: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/9.jpg)
Algorithm
![Page 10: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/10.jpg)
Simple Model Measures
JOBS = (Number of shared Jobs in the sequence)
YEARS = (Number of Years of overlap)
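The two measures can be sketched as follows, assuming each job is a hypothetical `(branch, start_year, end_year)` tuple, that JOBS counts distinct shared branches, and that YEARS sums the overlap in years at shared branches. The slides do not give the exact definitions, so treat this as one plausible reading.

```python
def jobs_score(career_a, career_b):
    """Number of distinct branches that appear in both careers."""
    return len({job[0] for job in career_a} & {job[0] for job in career_b})


def years_score(career_a, career_b):
    """Total years during which both reps were at the same branch."""
    total = 0
    for branch_a, start_a, end_a in career_a:
        for branch_b, start_b, end_b in career_b:
            if branch_a == branch_b:
                # Overlap of the two intervals, never negative.
                total += max(0, min(end_a, end_b) - max(start_a, start_b))
    return total


# Hypothetical toy careers.
rep1 = [("A", 2000, 2003), ("B", 2003, 2006)]
rep2 = [("A", 2001, 2004), ("B", 2004, 2005), ("C", 2005, 2007)]

shared_jobs = jobs_score(rep1, rep2)
shared_years = years_score(rep1, rep2)
```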
![Page 11: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/11.jpg)
Example Sequences
![Page 12: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/12.jpg)
Probabilistic Model
X = P(BrA -> BrB -> BrC -> BrD) = p_A * t_AB * t_BC * t_CD
Estimate:
p_i = P(start at branch i) = (# reps ever at i) / (# reps in database)
t_ij = P(rep moves from i to j | rep ever at i) = (# reps who leave i to go to j) / (# reps ever at i)
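A small sketch of these estimates, assuming each career is an ordered list of branch names. The toy `careers` data and the helper names `p`, `t`, and `sequence_probability` are hypothetical.

```python
from collections import Counter

# Hypothetical toy careers: rep -> ordered list of branches.
careers = {
    "r1": ["A", "B", "C"],
    "r2": ["A", "B"],
    "r3": ["A", "C"],
    "r4": ["B", "C"],
}

ever_at = Counter()  # number of reps ever at branch i
direct = Counter()   # number of reps who move directly from i to j
for seq in careers.values():
    for branch in set(seq):
        ever_at[branch] += 1
    for pair in set(zip(seq, seq[1:])):
        direct[pair] += 1


def p(i):
    """Estimated probability of starting at branch i."""
    return ever_at[i] / len(careers)


def t(i, j):
    """Estimated probability of a direct move from i to j."""
    return direct[(i, j)] / ever_at[i]


def sequence_probability(seq):
    """Start probability times the chain of direct-transition probabilities."""
    prob = p(seq[0])
    for i, j in zip(seq, seq[1:]):
        prob *= t(i, j)
    return prob


x = sequence_probability(["A", "B", "C"])  # p_A * t_AB * t_BC
```

A low value of `x` means the sequence is rare under independent movement, so two reps sharing it is more surprising.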
![Page 13: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/13.jpg)
Probabilistic Model
Null Hypothesis of Independent Movement
Movement Not Random
Split and Merge Markov Chains
![Page 14: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/14.jpg)
Probabilistic Model (Different Paths)
t_ij becomes v_ij
v_ij = P(move to branch j at any point after branch i | currently at i)
= (# reps who go to branch j at any point after working at i) / (# reps ever at i)
Now each v_ij >= t_ij, and the probabilities no longer sum to 1.
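To illustrate the relaxed estimate, the sketch below counts every later branch in a career, not only the next one. It assumes careers are ordered lists of branch names; the toy data and names are hypothetical.

```python
from collections import Counter

# Hypothetical toy careers: rep -> ordered list of branches.
careers = {
    "r1": ["A", "B", "C"],
    "r2": ["A", "B"],
    "r3": ["A", "C"],
    "r4": ["B", "C"],
}

ever_at = Counter()  # number of reps ever at branch i
later = Counter()    # number of reps at j at any point after i
for seq in careers.values():
    for branch in set(seq):
        ever_at[branch] += 1
    pairs = {
        (seq[a], seq[b])
        for a in range(len(seq))
        for b in range(a + 1, len(seq))
        if seq[a] != seq[b]
    }
    for pair in pairs:
        later[pair] += 1


def v(i, j):
    """Estimated probability of reaching j at any point after i."""
    return later[(i, j)] / ever_at[i]


# Outgoing values from A can exceed 1 in total because one rep can
# reach several later branches.
row_sum_from_a = v("A", "B") + v("A", "C")
```

In this toy data only one rep moves directly from A to C, but two reps reach C at some point after A, so the relaxed estimate is larger than the direct one.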
![Page 15: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/15.jpg)
Probabilistic Model (Different Paths)
v_ij becomes w_ij
w_ij = P(move to branch j at any point simultaneous to or after branch i | currently at i)
= (# reps who start at j at any point simultaneous to or after starting at i) / (# reps ever at i)
Now less precise with respect to direct transitions, but more general
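A sketch of this estimate, assuming each job carries a start year so that simultaneous starts can be detected. The `(branch, start_year)` representation and the toy data are hypothetical.

```python
from collections import Counter

# Hypothetical toy careers: rep -> list of (branch, start_year).
careers = {
    "r1": [("A", 2000), ("B", 2000), ("C", 2003)],
    "r2": [("A", 2001), ("B", 2002)],
    "r3": [("B", 1999), ("A", 2001)],
}

ever_at = Counter()      # number of reps ever at branch i
at_or_after = Counter()  # number of reps starting j at or after starting i
for jobs in careers.values():
    for branch in {b for b, _ in jobs}:
        ever_at[branch] += 1
    pairs = {
        (bi, bj)
        for bi, start_i in jobs
        for bj, start_j in jobs
        if bi != bj and start_j >= start_i
    }
    for pair in pairs:
        at_or_after[pair] += 1


def w(i, j):
    """Estimated probability of starting j at the same time as or after i."""
    return at_or_after[(i, j)] / ever_at[i]
```

Because r1 starts A and B in the same year, that rep counts toward both `w("A", "B")` and `w("B", "A")`, which is how this variant tolerates simultaneous jobs.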
![Page 16: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/16.jpg)
PROB-TIMEBINS
Bins of 1 year or more
10 people worked at each branch in a bin period
p_iX = (# reps ever at i during time X) / (# reps in DB)
y_iXjY = (# reps ever at i during time X and at j during time Y, where Y >= X) / (# reps ever at i during time X)
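A simplified sketch of the time-binned estimates. It assumes fixed-width bins of `BIN_WIDTH` years and does not enforce the 10-people-per-bin condition from the slide. Jobs are hypothetical `(branch, start_year, end_year)` tuples with inclusive years.

```python
from collections import Counter

BIN_WIDTH = 5  # assumption: fixed-width bins, in years

# Hypothetical toy careers: rep -> list of (branch, start_year, end_year).
careers = {
    "r1": [("A", 2000, 2003), ("B", 2006, 2008)],
    "r2": [("A", 2001, 2004), ("B", 2004, 2004)],
    "r3": [("A", 2006, 2009)],
}


def states(jobs):
    """All (branch, bin) combinations a rep occupies."""
    out = set()
    for branch, start, end in jobs:
        for time_bin in range(start // BIN_WIDTH, end // BIN_WIDTH + 1):
            out.add((branch, time_bin))
    return out


present = Counter()  # number of reps at branch i during bin X
joint = Counter()    # number of reps at i during X and at j during Y >= X
for jobs in careers.values():
    occupied = states(jobs)
    for state in occupied:
        present[state] += 1
    for i, x in occupied:
        for j, y in occupied:
            if i != j and y >= x:
                joint[(i, x, j, y)] += 1


def p(i, x):
    """Share of all reps who were at branch i during bin x."""
    return present[(i, x)] / len(careers)


def y_prob(i, x, j, y):
    """Share of reps at i during x who were also at j during y >= x."""
    return joint[(i, x, j, y)] / present[(i, x)]


early = 2000 // BIN_WIDTH  # bin covering 2000 to 2004
late = 2005 // BIN_WIDTH   # bin covering 2005 to 2009
```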
![Page 17: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/17.jpg)
PROB-NOTIME
Ignores order of job moves
Use original p_i
z_ij = raw number of reps who are at both branches i and j during their career
Transition Pr from i to j = (z_ij / # reps ever at i) != (z_ij / # reps ever at j) = transition Pr from j to i
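The asymmetry noted above can be seen in a short sketch that treats each career as an unordered set of branches. The toy data and helper names are hypothetical.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy careers: rep -> set of branches, order ignored.
careers = {
    "r1": {"A", "B", "C"},
    "r2": {"A", "B"},
    "r3": {"A", "C"},
    "r4": {"B", "C"},
    "r5": {"A"},
}

ever_at = Counter()  # number of reps ever at branch i
z = Counter()        # number of reps at both i and j, stored symmetrically
for branches in careers.values():
    for branch in branches:
        ever_at[branch] += 1
    for i, j in combinations(sorted(branches), 2):
        z[(i, j)] += 1
        z[(j, i)] += 1


def transition(i, j):
    """z_ij normalized by the number of reps ever at i."""
    return z[(i, j)] / ever_at[i]


a_to_b = transition("A", "B")  # 2 shared reps out of 4 ever at A
b_to_a = transition("B", "A")  # 2 shared reps out of 3 ever at B
```

The shared count is the same in both directions; only the denominator differs, which is why the two transition probabilities are unequal.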
![Page 18: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/18.jpg)
Tribe Size
![Page 19: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/19.jpg)
Pairs
![Page 20: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/20.jpg)
Commonality of Job Sequence
![Page 21: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/21.jpg)
Disclosure Scores
![Page 22: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/22.jpg)
Homogeneity and Mobility
![Page 23: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/23.jpg)
![Page 24: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/24.jpg)
![Page 25: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/25.jpg)
Discussion
JOBS, PROB, PROB-TIME, PROB-NOTIME create tribes with higher than average disclosure scores
PROB creates more cross zip code results
PROB-TIME has higher phi-squared than all others
PROB favors large firms
![Page 26: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/26.jpg)
Discussion
JOBS and YEARS compute larger connected components
JOBS and PROB find same number of tribes but pick different groups as tribes
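One way to turn pair scores into groups is sketched below. It assumes that pairs scoring above a threshold are linked and that each connected component of the resulting graph is treated as a tribe. The scores, the threshold, and the convention that a higher score means a stronger link are all hypothetical.

```python
def connected_components(pairs):
    """Group nodes into connected components with a simple union-find."""
    parent = {}

    def find(node):
        parent.setdefault(node, node)
        while parent[node] != node:
            parent[node] = parent[parent[node]]  # path halving
            node = parent[node]
        return node

    for a, b in pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a

    groups = {}
    for node in list(parent):
        groups.setdefault(find(node), set()).add(node)
    return sorted(sorted(group) for group in groups.values())


# Hypothetical pair scores; higher means more anomalously related.
scores = {
    ("r1", "r2"): 0.90,
    ("r2", "r3"): 0.80,
    ("r4", "r5"): 0.95,
    ("r5", "r6"): 0.10,
}
THRESHOLD = 0.5  # assumption: cutoff chosen for illustration only

linked = [pair for pair, score in scores.items() if score >= THRESHOLD]
tribes = connected_components(linked)
```

A measure that links more pairs at a given cutoff tends to merge groups, which is one way larger connected components can arise.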
![Page 27: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/27.jpg)
Conclusions
With no explicit knowledge we can discover:
Job transitions
Geography
Career track
![Page 28: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/28.jpg)
Conclusions
Needed:
Ongoing process
Multiple affiliations
Arbitrary times
Time is a paradox in domain
![Page 29: Finding Tribes: Identifying Close-Knit Individuals from Employment Patterns Lisa Friedland and David Jensen Presented by Nick Mattei](https://reader033.vdocuments.us/reader033/viewer/2022052701/56649cc05503460f94987247/html5/thumbnails/29.jpg)
Thanks!
Time for:
Questions
Comments
Smart Remarks