
Type Independent Correction of Sample Selection Bias via Structural Discovery and Re-balancing
Jiangtao Ren 1, Xiaoxiao Shi 1, Wei Fan 2, Philip S. Yu 2
1 Sun Yat-sen University, China; 2 IBM T.J. Watson; 3 University of Illinois at Chicago




What is sample selection bias?

Inductive learning: training data (x, y) is sampled from the universe of examples.
In many applications, training data (x, y) is not sampled randomly.
Insurance and mortgage data: you only know the people to whom you gave a policy.
School data: students self-select.

There are different possibilities for how (x, y) is selected (Zadrozny'04). S = 1 denotes that (x, y) is chosen.
1. S is independent of x and y: total random sample.
2. S depends on y but not on x: class bias.
3. S depends on x but not on y: feature bias.
4. S depends on both x and y: both class and feature bias.

Ubiquitous: loan approval, drug screening, weather forecasting, ad campaigns, fraud detection, user profiling, biomedical informatics, intrusion detection, insurance, etc.
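The four cases of S can be made concrete with a small simulation. The sketch below is only illustrative: the toy universe (x uniform on [0, 1], y an independent coin flip), the selection rates, and the helper names `selection_probability`, `draw_sample`, `positive_rate`, and `mean_x` are all assumptions introduced here, not taken from the slides.

```python
import random

def selection_probability(kind, x, y):
    """P(S=1 | x, y) for each case; the numeric rates are illustrative assumptions."""
    if kind == "random":    # S independent of x and y
        return 0.5
    if kind == "class":     # S depends on y only
        return 0.8 if y == 1 else 0.2
    if kind == "feature":   # S depends on x only
        return 0.8 if x > 0.5 else 0.2
    if kind == "both":      # S depends on both x and y
        return 0.9 if (x > 0.5 and y == 1) else 0.1
    raise ValueError(f"unknown selection type: {kind}")

def draw_sample(universe, kind, rng):
    """Keep each (x, y) with probability P(S=1 | x, y)."""
    return [(x, y) for x, y in universe
            if rng.random() < selection_probability(kind, x, y)]

def positive_rate(data):
    return sum(y for _, y in data) / len(data)

def mean_x(data):
    return sum(x for x, _ in data) / len(data)

rng = random.Random(0)
# Toy universe (assumption): x uniform on [0, 1], y an independent fair coin flip,
# so the universe has a positive rate near 0.5 and a mean x near 0.5.
universe = [(rng.random(), rng.randint(0, 1)) for _ in range(20000)]

samples = {kind: draw_sample(universe, kind, rng)
           for kind in ("random", "class", "feature", "both")}

for kind, sample in samples.items():
    print(f"{kind:8s} positive rate={positive_rate(sample):.2f} "
          f"mean x={mean_x(sample):.2f}")
```

In this toy setup, class bias shifts the label distribution but leaves the distribution of x alone, feature bias does the opposite, and the combined case shifts both.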


Our method

Key ideas: Original Dataset -> Structural Discovery -> Structural Rebalance -> Corrected Dataset
Structural Discovery: automatic clustering.
Structural Rebalance: 1. The same proportion 2. Select "trustful" ones 3. Label by neighbors
Advantages: 1. Type independent 2. Model independent 3. Straightforward
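One possible reading of the re-balancing step is sketched below. The slides give only the three bullet points, so everything else here is an assumption: cluster ids are taken as given (any clustering step could supply them), "trustful" is interpreted as "labeled and closest to the cluster centroid", and "label by neighbors" is interpreted as copying the label of the nearest labeled example. The function `rebalance` and its parameters are hypothetical names, not the authors' implementation.

```python
import math
from collections import defaultdict

def centroid(points):
    dim = len(points[0])
    return tuple(sum(p[d] for p in points) / len(points) for d in range(dim))

def rebalance(points, labels, clusters, rate):
    """Return a corrected training set [(point, label), ...].

    points   : feature tuples, labeled and unlabeled examples together
    labels   : labels, with None marking an unlabeled example
    clusters : cluster id per example (assumed output of a clustering step)
    rate     : fraction of every cluster to place in the corrected set
    Assumes at least one labeled example exists overall.
    """
    members = defaultdict(list)
    for i, c in enumerate(clusters):
        members[c].append(i)
    labeled_all = [i for i, y in enumerate(labels) if y is not None]

    corrected = []
    for idx in members.values():
        center = centroid([points[i] for i in idx])
        # 1. The same proportion: every cluster contributes the same fraction.
        quota = max(1, round(rate * len(idx)))
        by_center = sorted(idx, key=lambda i: math.dist(points[i], center))
        # 2. Select "trustful" ones (assumed: labeled members nearest the centroid).
        chosen = [i for i in by_center if labels[i] is not None][:quota]
        corrected += [(points[i], labels[i]) for i in chosen]
        # 3. Label by neighbors: fill any shortfall with unlabeled members,
        #    each given the label of its nearest labeled example.
        shortfall = quota - len(chosen)
        unlabeled = [i for i in by_center if labels[i] is None][:shortfall]
        for i in unlabeled:
            nearest = min(labeled_all, key=lambda j: math.dist(points[i], points[j]))
            corrected.append((points[i], labels[nearest]))
    return corrected

# Toy data: cluster 0 is heavily labeled, cluster 1 has a single label.
points = [(0.0,), (0.1,), (0.2,), (0.3,), (0.4,), (0.5,),
          (2.0,), (2.1,), (2.2,), (2.3,), (2.4,), (2.5,)]
labels = ["a", "a", "a", "a", None, None,
          None, None, None, None, None, "b"]
clusters = [0] * 6 + [1] * 6

corrected = rebalance(points, labels, clusters, rate=0.5)
for point, label in corrected:
    print(point, label)
```

In the toy data the labeled set holds four "a" examples and one "b", while the two clusters are the same size; after re-balancing each cluster contributes three examples.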
