Learning Data Manipulation for Augmentation and Weighting

Zhiting Hu*, Bowen Tan*, Ruslan Salakhutdinov, Tom Mitchell, Eric Xing

Carnegie Mellon University, Petuum Inc.

Learning Data Manipulation

• Data manipulation is often a crucial step to improve performance

  • Data augmentation for small data problems

    • Rule-based: image rotation, cropping; text synonym replacement, …

    • Learning-based [Ratner et al., 17; Cubuk et al., 19]

  • Data weighting for class-imbalance problems

    • Rule-based: inverse class frequency, …

    • Learning-based: meta-learning [Ren et al., 18]; self-paced learning [Jiang et al., 18]

[Figure: an example x with label y and its augmented version x′ (augmentation); examples with weights w = 1 and w = 10 (weighting)]

Specific to one manipulation type
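As an illustration of the rule-based weighting mentioned above, here is a minimal sketch of inverse class frequency weights. The function name and the normalization `n / (k * count)` are assumptions chosen for the example (one common convention); the slides do not specify a formula.

```python
from collections import Counter


def inverse_class_frequency_weights(labels):
    """Return one weight per class, inversely proportional to its frequency.

    Assumed normalization: weight_c = n / (k * count_c), so a perfectly
    balanced dataset gives every class a weight of 1.
    """
    counts = Counter(labels)
    n, k = len(labels), len(counts)
    return {cls: n / (k * count) for cls, count in counts.items()}


# Hypothetical imbalanced dataset: 1000 examples of one class, 20 of the other.
labels = ["pos"] * 1000 + ["neg"] * 20
weights = inverse_class_frequency_weights(labels)
print(weights)
```

The weights are fixed once the class counts are known, which is what makes this rule specific to one manipulation type and not learned.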

Can we have a more generic learning method?

Background

Connecting the Dots between MLE and RL [Tan et al., 2019]

Parameterize Manipulation

• Standard MLE

[Hu et al., NeurIPS 2019]

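$$\mathcal{L}(q, \theta) = \mathbb{E}_{q(x,y)}\left[ R_\delta(x, y \mid \mathcal{D}) \right] - \alpha\, \mathrm{KL}\left( q(x,y) \,\|\, p(x)\, p_\theta(y \mid x) \right) + \beta\, \mathrm{H}(q)$$

$$R_\delta(x, y \mid \mathcal{D}) = \begin{cases} 1 & \text{if } (x, y) \in \mathcal{D} \\ -\infty & \text{otherwise} \end{cases}$$

Get valid rewards only when samples match training data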
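The following short derivation, which is not on the slide, sketches why this reward recovers standard MLE. It assumes the objective has a KL weight $\alpha$ and an entropy weight $\beta$, and that MLE corresponds to the limit $\alpha \to 0$, $\beta = 1$; both are assumptions here, following the cited MLE/RL connection [Tan et al., 2019].

```latex
\begin{align*}
% Expand the KL term: -\alpha KL(q || p p_\theta) = \alpha E_q[\log p\,p_\theta] + \alpha H(q)
\mathcal{L}(q,\theta)
  &= \mathbb{E}_{q}\big[ R_\delta(x,y \mid \mathcal{D})
     + \alpha \log p(x)\,p_\theta(y \mid x) \big]
     + (\alpha+\beta)\,\mathrm{H}(q) \\
% Maximizing over q for fixed \theta gives a Gibbs distribution
q^\star(x,y)
  &\propto \exp\!\left\{
     \frac{\alpha \log p(x)\,p_\theta(y \mid x) + R_\delta(x,y \mid \mathcal{D})}
          {\alpha+\beta} \right\} \\
% In the assumed limit \alpha \to 0, \beta = 1, only the reward remains
q^\star(x,y)
  &\propto \exp\{ R_\delta(x,y \mid \mathcal{D}) \}
   = \begin{cases} e & (x,y) \in \mathcal{D} \\ 0 & \text{otherwise} \end{cases} \\
% q^\star is uniform over \mathcal{D}, so maximizing over \theta is MLE
\max_\theta\ \mathbb{E}_{q^\star}\big[ \log p_\theta(y \mid x) \big]
  &= \max_\theta\ \frac{1}{|\mathcal{D}|}
     \sum_{(x,y) \in \mathcal{D}} \log p_\theta(y \mid x)
\end{align*}
```

Under these assumptions $q^\star$ is the empirical data distribution, so the update of $\theta$ is the usual maximum-likelihood step.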


Relax it


• Data augmentation

$$R_\phi^{\text{aug}}(x, y \mid \mathcal{D}) = \begin{cases} 1 & \text{if } x \sim g_\phi(x \mid x^*, y),\ (x^*, y) \in \mathcal{D} \\ -\infty & \text{otherwise} \end{cases}$$

Any label-preserving transformations

• Data weighting

$$R_\phi^{w}(x, y \mid \mathcal{D}) = \begin{cases} \phi_i & \text{if } (x, y) = (x_i, y_i),\ (x_i, y_i) \in \mathcal{D} \\ -\infty & \text{otherwise} \end{cases}$$

Weight of the $i$-th training example
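One way to read the weighting reward is sketched below. It assumes the same limit that recovers MLE, so that the sampling distribution over training examples becomes a softmax of the per-example rewards and the objective becomes a weighted log-likelihood. The function names and the toy log-probabilities are hypothetical.

```python
import math


def softmax(phi):
    """Numerically stable softmax over per-example reward parameters."""
    m = max(phi)
    exps = [math.exp(p - m) for p in phi]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_nll(log_probs, phi):
    """Negative log-likelihood with example i weighted by softmax(phi)[i]."""
    weights = softmax(phi)
    return -sum(w * lp for w, lp in zip(weights, log_probs))


# Toy model log-probabilities log p(y_i | x_i) for three training examples.
log_probs = [math.log(0.9), math.log(0.6), math.log(0.1)]

# Equal rewards give uniform weights, i.e. the ordinary mean NLL (standard MLE).
uniform = weighted_nll(log_probs, [0.0, 0.0, 0.0])

# A much lower reward on the third example nearly removes it from the loss.
down_weighted = weighted_nll(log_probs, [0.0, 0.0, -5.0])
print(uniform, down_weighted)
```

In this reading, equal rewards reduce to standard MLE, and learning the rewards amounts to learning the example weights.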

Use a Reward Learning Algorithm

• Intrinsic reward learning [Zheng et al., 18]

  • Augment extrinsic reward for better performance: $R^{\text{ex+in}}(x, y) = R^{\text{ex}}(x, y) + R^{\text{in}}_\phi(x, y)$

  • Update model: $\theta' = \theta + \gamma \nabla_\theta \mathcal{L}^{\text{ex+in}}(\theta, \phi)$ (train model with augmented reward)

  • Update reward: $\phi' = \phi + \gamma \nabla_\phi \mathcal{L}^{\text{ex}}(\theta'(\phi))$ (new model $\theta'$ gets higher extrinsic reward)

• Map to data manipulation

  • Update model: $\theta' = \theta + \gamma \nabla_\theta \mathcal{L}^{\text{train}}(\theta, \phi)$ (train model on manipulated data)

  • Update data: $\phi' = \phi + \gamma \nabla_\phi \mathcal{L}^{\text{val}}(\theta'(\phi))$ (new model $\theta'$ gets better validation-set performance)
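The two updates can be sketched on a toy problem. This is a minimal illustration under stated assumptions, not the authors' implementation: the model is a single slope `theta` fit by squared error, the manipulation parameters `phi` are per-example weights passed through a softmax, and the gradient of the validation loss with respect to `phi` is written out by hand via the chain rule through the one-step update. The code minimizes losses, so its signs are flipped relative to gradient ascent on an objective. All names, data, and learning rates are hypothetical.

```python
import math


def softmax(phi):
    m = max(phi)
    exps = [math.exp(p - m) for p in phi]
    total = sum(exps)
    return [e / total for e in exps]


def example_grads(theta, data):
    # d/dtheta of the squared error (theta * x - y)^2 for each example.
    return [2.0 * x * (theta * x - y) for x, y in data]


def joint_step(theta, phi, train, val, lr_theta=0.05, lr_phi=1.0):
    """One model update followed by one update of the data weights."""
    w = softmax(phi)
    g = example_grads(theta, train)
    g_bar = sum(wi * gi for wi, gi in zip(w, g))

    # Update model: one gradient step on the weighted training loss.
    theta_new = theta - lr_theta * g_bar

    # Update data: differentiate the validation loss through theta_new(phi).
    val_g = example_grads(theta_new, val)
    dval_dtheta = sum(val_g) / len(val_g)
    # d theta_new / d phi_k = -lr_theta * w_k * (g_k - g_bar) (softmax Jacobian).
    dtheta_dphi = [-lr_theta * wk * (gk - g_bar) for wk, gk in zip(w, g)]
    phi_new = [p - lr_phi * dval_dtheta * d for p, d in zip(phi, dtheta_dphi)]
    return theta_new, phi_new


def run(steps=50):
    # True slope is 2; the third training example is a deliberate outlier.
    train = [(1.0, 2.0), (2.0, 4.0), (1.0, -5.0)]
    val = [(1.0, 2.0), (3.0, 6.0)]
    theta, phi = 0.0, [0.0, 0.0, 0.0]
    for _ in range(steps):
        theta, phi = joint_step(theta, phi, train, val)
    return theta, softmax(phi)


theta, weights = run()
print(theta, weights)
```

In this toy setting the weight on the outlier shrinks because down-weighting it moves the updated model toward lower validation loss. A practical implementation would typically rely on automatic differentiation in place of the hand-derived gradient.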

Parameterization of Text Data Augmentation

Raw data: x = "Food was good , but the service was very disappointing ." y = neg

Random mask: x = "Food [Mask] [Mask] , but the service was very disappointing ."

Augment data: x = "Food tasted nice , but the service was very disappointing ."

Fine-tune BERT to condition on the label y
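The masking step can be sketched in a few lines. This is an illustration only: the helper `random_mask` is hypothetical, it picks positions independently at random (the slide example happens to mask two adjacent tokens), and the step that fills the masks with a label-conditional BERT is not shown.

```python
import random

MASK = "[Mask]"


def random_mask(tokens, num_masks, rng):
    """Return a copy of tokens with num_masks random positions masked.

    Also returns the sorted masked positions, which a fill-in model
    would need in order to propose replacement tokens.
    """
    positions = sorted(rng.sample(range(len(tokens)), num_masks))
    masked = list(tokens)
    for pos in positions:
        masked[pos] = MASK
    return masked, positions


sentence = "Food was good , but the service was very disappointing ."
tokens = sentence.split()
masked, positions = random_mask(tokens, num_masks=2, rng=random.Random(0))
print(" ".join(masked))
```

Passing an explicit `random.Random` instance keeps the masking reproducible, which is convenient when comparing augmentation strategies.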


Low-Data Text Classification

[Figure: bar charts of accuracy by method (Base: BERT, Synonym, Fixed, Ours) on SST-5, IMDB, and TREC]

Train size: 40 per class; Val size: 2

Use a label-conditional BERT as the augmentation model $g_\phi(x \mid x^*, y)$

Class-Imbalance Image Classification (weighting)

CIFAR10 binary image classification

[Figure: accuracy vs. imbalance ratio (20:1000, 50:1000, 100:1000) for Base: ResNet, Proportion, Ren, and Ours]
