rapid: rating pictorial aesthetics using deep learning...

48
REVIEW TO: RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1) Rating Image Aesthetics Using Deep Learning (Paper #2) Authors: Xin Lu, Zhe Lin, Hailin Jin, Jianchao Yang, James Z. Wang By: Kairanbay Magzhan

Upload: others

Post on 08-May-2020

6 views

Category:

Documents


0 download

TRANSCRIPT

Page 1: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

REVIEW TO:

RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)Rating Image Aesthetics Using Deep Learning (Paper #2)

Authors: Xin Lu, Zhe Lin, Hailin Jin, Jianchao Yang, James Z. Wang

By: Kairanbay Magzhan

Page 2: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Paper #2

Paper #1

~90% same

Page 3: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Authors

Xin Lu

PhD student at the College of Information Science and

Technology, The Pennsylvania State

University, USA

Zhe Lin

Senior Research Scientist at Adobe Research, USA.PhD from University of

Maryland, USA

Hailin Jin

Principal Scientist at Adobe Systems Inc., Postdoctoral Researcher from University

of California, USA

Jianchao Yang

Researcher scientist at Adobe Technology

Laboratory, USA. PhD from University of Illinois at

Urbana-Champaign, USA

James Z. Wang

PhD from Stanford University, USA. Professor

and the Chair of Faculty Council at the College of Information Science and

Technology, The Pennsylvania State

University, USA

Page 4: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Handcrafted features

Low level High level

Edge distribution Rule of third

Color histogram Golden ratio

Page 5: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Handcrafted features

● Some aesthetic-relevant attributes may be unexplored and thus poorly defined

● Vagueness of certain photographic or psychologic rules. Difficulty in implementing them computationally, these handcrafted features are often merely approximations of such rules.

Page 6: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Comparison

Generic features (SIFT, Fisher Vector etc.)

Handcrafted features (Color histogram, rule

of third etc.)>

Page 7: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Motivation

“Image aesthetics depends on on a combination of local cues (e.g. sharpness and noise level) and global cues (e.g. the rule of third)”

Page 8: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

landscape portrait

Square

cnn

Page 9: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Normalized but lost important information

Resize (Warp)

Page 10: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Normalized but lost important information

Random crop

Page 11: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Global views

Local views

Page 12: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Data augmentation

Page 13: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

SCNN architectures

Page 14: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

WarpCenter crop Random cropPadding

Which input normalization to choose???

< <

Global view Local view

71.2%67.79%65.48%60.43%

Page 15: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Parallelization

Page 16: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

DCNN (Double column CNN)

Page 17: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

AVA provides semantic and style tag. Can these

information improve the accuracy???

Page 18: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

What if we train individual DCNN for each semantic

category???

Page 19: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Nope, this is time consuming

Page 20: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

RDCNN (Regularized double column CNN)

Page 21: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Remarks for RDCNN● Only 1.4K images out of 230K images in AVA dataset contains the style

labels● Due to small number of training example, the number of filters are reduced by

half in Style-SCNN● Style attributes are extracted from fc256 in Style-SCNN● The input for style column is random crop image● Fine tune only the parameters in aesthetic column in backpropagation● Learning process is supervised by aesthetic label only● Due to small number of training sample for semantic tag, the Image Net

model used as pre-trained column

Page 22: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Results (SCNN)

Page 23: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Results (DCNN)

Random crop Center crop Warp Padding

71.8% 72.27% 73.25%

Local column Global column

Page 24: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Results

● AVG_SCNN = the average result of SCNN where input was random crop and warp

Page 25: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Aesthetic filters look smoother and cleaner!

Aesthetic filters CIFAR filters

Page 26: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Correctly classified by DCNN but misclassified by SCNN

Input as local random crop. Most of misclassified images dominated by an object

Input as global warp. Most of misclassified images dominated by fine-grained details

Page 27: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Attempt to use quad-column network...

Random crop

Warp

Center crop

Padding

NETWORK Nvidia Tesla M2070/M2090 GPU

Page 28: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Attempt to use quad-column network...

Random crop

Warp

Center crop

Padding

SCNN 1

Fine tune last layers of quad

column network

SCNN 2

SCNN 3

SCNN 4

73.38%

Page 29: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Content based image aesthetics

● Cat = trains network using categorized images

● Generic = trains network using AVA training set

● Adapt= proposed network adaptation approach

● once an image is associated with an obvious semantic meaning, then the global view is more important than the local view in terms of assessing image aesthetics

● global view and the local view contribute to the aesthetic quality categorization of content-specific images.

Page 30: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Style-CNN

Page 31: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Style-CNN

● mAP - mean Average Precision● Average Precision

Page 32: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Style-CNN

Page 33: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Correctly classified by RDCNNstyle but misclassified by DCNN

● Rule-of-thirds● HDR● black and white● long exposure● complementary

colors● vanishing point● soft focus

Page 34: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

1.5 millionIAD (Image Aesthetics

Dataset)

1.2 millionPHOTO.NET

300KDPChallenge

Page 35: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Results on IAS dataset

SCNN (random crop)

SCNN (warp)

DCNN (random crop & warp)

73.21%

73.65%

74.6%

Page 36: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Results on IAS dataset

SCNN (random crop)

SCNN (warp)

DCNN (random crop & warp)

72.11%

72.65%

72.9%

Bottom 20% Top 20%

● AVA test set contains images with rating in the middle● Utilizing top and bottom 20% of images reduced the number of training data

Page 37: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Future studies (IAD)● Aperture/FNumber● ISO/ISOSpeedRatings● Shutter/ExposureTime● Lens/FocalLength

Page 38: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Details and tools● ConvNet● Logistic Regression cost function● Nvidia Tesla M2070/M2090 GPU

Page 39: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

Time

1 day

2 day

3 day

3-4 day

4 day

AVA AVA AVA AVA IAD

Style SCNN SCNN DCNN RDCNN SCNN

Page 40: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

thank youterima kasihধন বাদ謝謝

ध यवादРахмет

So, let’s start to discuss?

Page 41: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

My experiments

Page 42: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

My experiments

72.4%

Page 43: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

My experiments

72.86%

Page 44: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

My experiments

70.08%

Page 45: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

My experiments

73.08%

Page 46: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

My experiments

f1(1)...f1(4096) f2(1)...f2(4096) f3(1)...f3(4096) f4(1)...f4(4096) f5(1)...f5(4096)

1 2 3 4 5

f (1), f (2), f (3), …, f (20480)

Page 47: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

My experiments

74.52%

Page 48: RAPID: Rating Pictorial Aesthetics using Deep Learning ...viprlab.github.io/files/magzhan-RAPID-compressed.pdf · RAPID: Rating Pictorial Aesthetics using Deep Learning (Paper #1)

???We took these normalized inputs (random crop, warp, center crop, padding) for SCNN training (2023 page, last sentences of first paragraph)

Using the selected network architecture, we trained and evaluated SCNN with four types of inputs (random crop, warp, center crop, padding) on the AVA dataset (2024 page, first sentences of second paragraph)