AudeoSynth: Music-Driven Video Montage. Zicheng Liao (Zhejiang University), Bingchen Gong (Zhejiang University), Lechao Cheng (Zhejiang University), Yizhou Yu (University of Hong Kong). ACM SIGGRAPH 2015.

Page 1: AudeoSynth: Music-Driven Video Montage - zichengl.netzichengl.net/stuff/montage-SG15talk.pdf · [Michel Chion 1994] Principle II: Cut-to-the-Beat

AudeoSynth: Music-Driven Video Montage

Zicheng Liao

Zhejiang University

Bingchen Gong

Zhejiang University

Lechao Cheng

Zhejiang University

Yizhou Yu

University of Hong Kong

ACM SIGGRAPH 2015

Page 2

The success of visual media synthesis

Video textures [2000] Animating pictures [2005] De-animating video [2012]

Progressive video loop [2013] Cinemagraphs [2012] Cliplets [2012]

Video synopsis [2008]

Page 3

The success of visual media synthesis

Image analogy [2001] Graph cut texture synth [2003] Texture synthesis [1999 & 2001] Pyramid blending [1983]

Gradient domain editing [2003] Digital photo montage [2004] Stitching & panorama [2003]

Page 4

*Silent* pixels

Other dimensions of human sensation are absent

- hearing, touch, smell, or taste

- design for the five senses [Jinsop Lee 2013]

Add sound to the game

- why sound?

Source: www.MontblancOneSecond.com [#NOT paper result]

Page 5

Music-Driven Video Montage

[Diagram labels: Content Analysis, Optimization, Video Montage]

Page 6

Page 7

Applications

Video summary and online sharing

Timelapse photography [Louie Schwartzberg 2011]

Hyperlapse videos [Joshi et al. 2015, Kopf et al. 2014]

Smartphone app in the Apple App Store or Google Play

Page 8

A challenging new task

How to formulate this task?

How to write an objective function?

How to find a solution?

How to evaluate?

How to translate the subtleties of an artistic process into a machine algorithm?

Page 9

Principle I: Synchronization

The timing and pace of visual activities should follow the music

Audio-Visual Synchresis

- Mental fusion when sound and visual occur at the same time

- A survival instinct developed since ancient times

- Footsteps synchronized with the music beat, popping in time with the drum

- Film editing, animations, dancing (“dance to the beat”).

[Michel Chion 1994]

Page 10

Principle II: Cut-to-the-Beat

Montage: A language of visual expression

Timing is KING

- Music transition points

- Beginning of music bars

“Mosaic, assembling, or a juxtaposition of imagery, … an orchestration” - Alfred Hitchcock

“to separate and punctuate an idea from what follows” - Walter Murch [Walter Murch 2001]

Alfred Hitchcock

Page 11

Formulation

[Figure labels: Music; Video clips; scaling factor; segment 1, segment 2, segment 3, segment 4]

Music-Driven Imagery

Page 12

Energy function

[Figure labels: music; videos; segment 1, segment 2, segment 3, segment 4; pairs; synchronization]

Page 13

Overview

[Pipeline diagram labels]

- Input: Music, Video clips

- Analysis: motion, frequency, dynamism, segments, note onsets, saliency

- Pre-compute

- Optimization: Energy function, MCMC optimization

- Output: Rendering

Page 14

Music analysis

MIDI: Musical Instrument Digital Interface

- Music industry standard protocol (1983)

- Connects instruments, sequencers and software

- Online databases (free-midi.org; 8notes.com)

- Semantic encoding language of music

[Image: A MIDI controller. Source: http://wikipedia.org/MIDI]

Page 15

MIDI format

MIDI event: TIME, EVENT ID, channel, P1, P2

Event types:

Event               ID    P1            P2
Note off            0x8   pitch         velocity
Note on             0x9   pitch         velocity
Note aftertouch     0xA   note #        value
Controller          0xB   controller #  value
Program change      0xC   program #     channel
Channel aftertouch  0xD   value         NA
Pitch Bend          0xE   value 1       value 2

Program change event: <0xC program# channel>

Program #

01 – 08: Piano Timbres

09 – 16: Chromatic percussion

17 – 24: Organ Timbres

25 – 32: Guitar Timbres

105 – 112: Ethnic Timbres

113 – 128: Sound Effects (Tinkle Bell, Breath noise, Bird Tweet, etc)

Music metadata

Clef, meter and tempo
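The event IDs listed on this slide can be illustrated with a small decoder. This is a minimal sketch, assuming the standard MIDI channel-voice layout in which the high nibble of the status byte is the event ID and the low nibble is the channel. The function names `decode_status` and `note_onsets` and the `(time, status, p1, p2)` tuple layout are hypothetical, not taken from the talk.

```python
# Event IDs as listed on the slide (high nibble of a MIDI status byte).
EVENT_NAMES = {
    0x8: "Note off",
    0x9: "Note on",
    0xA: "Note aftertouch",
    0xB: "Controller",
    0xC: "Program change",
    0xD: "Channel aftertouch",
    0xE: "Pitch bend",
}


def decode_status(status):
    """Split a channel-voice status byte into (event name, channel 0-15)."""
    if not 0x80 <= status <= 0xEF:
        raise ValueError(f"not a channel-voice status byte: {status:#x}")
    event_id = status >> 4     # high nibble: event type
    channel = status & 0x0F    # low nibble: channel number
    return EVENT_NAMES[event_id], channel


def note_onsets(events):
    """Return (time, pitch, velocity) for every sounding note.

    `events` is a hypothetical list of (time, status, p1, p2) tuples.
    A "Note on" with velocity 0 is conventionally treated as a "Note off".
    """
    onsets = []
    for time, status, p1, p2 in events:
        name, _channel = decode_status(status)
        if name == "Note on" and p2 > 0:
            onsets.append((time, p1, p2))
    return onsets


events = [(0, 0x90, 60, 100), (480, 0x80, 60, 0), (480, 0x91, 64, 0), (960, 0x92, 67, 90)]
print(decode_status(0x93))   # ('Note on', 3)
print(note_onsets(events))   # [(0, 60, 100), (960, 67, 90)]
```

Note onsets extracted this way are the kind of raw material a music analysis stage could score and segment.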

Page 16

Music segmentation

Bottom up hierarchical segmentation

“Agglomerative image segmentation with superpixels”

Music bars as “superpixels”

Bar 1 Bar 2 Bar 3 Bar 4 Bar 5
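The bottom-up idea on this slide can be sketched as a greedy merge of neighboring bars. This is only an illustration under stated assumptions: the per-bar feature vectors, the squared Euclidean distance, and the stopping rule (a target segment count) are hypothetical choices, and `merge_bars` is not a name from the talk.

```python
def merge_bars(bar_features, num_segments):
    """Greedily merge adjacent bars until `num_segments` segments remain.

    bar_features: one feature vector (list of floats) per bar. The features
    are hypothetical, e.g. mean pitch and note density.
    Returns a list of segments, each a list of bar indices.
    """
    if not 1 <= num_segments <= len(bar_features):
        raise ValueError("num_segments must be between 1 and the number of bars")
    segments = [[i] for i in range(len(bar_features))]
    means = [list(f) for f in bar_features]

    def dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    while len(segments) > num_segments:
        # Only temporally adjacent segments may merge, like neighboring superpixels.
        k = min(range(len(segments) - 1), key=lambda i: dist(means[i], means[i + 1]))
        na, nb = len(segments[k]), len(segments[k + 1])
        # The merged segment's feature is the size-weighted mean of its parts.
        means[k] = [(na * x + nb * y) / (na + nb) for x, y in zip(means[k], means[k + 1])]
        segments[k] += segments[k + 1]
        del segments[k + 1], means[k + 1]
    return segments


bars = [[60.0, 4.0], [61.0, 4.0], [72.0, 8.0], [73.0, 8.0], [60.0, 2.0]]
print(merge_bars(bars, 3))   # [[0, 1], [2, 3], [4]]
```

Restricting merges to adjacent segments keeps every music segment contiguous in time, which is what makes the bars behave like superpixels.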

Page 17

Music temporal saliency

For audio-visual alignment (synchronization)

8 note onset scores

pitch-peak, pitch-shift, deviated-pitch, before-a-long-interval, after-a-long-interval, start-of-a-bar, start-of-a-new-bar, start-of-a-different-bar

Convolve with Gaussian kernel

[Figure labels: MIDI, saliency]
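The Gaussian convolution step can be sketched as follows. This is a minimal illustration that assumes the eight onset scores have already been combined into one value per time step. How the scores are combined, the kernel width `sigma`, and the function names are assumptions, not details from the talk.

```python
import math


def gaussian_kernel(sigma, radius):
    """Discrete Gaussian kernel, normalized to sum to 1."""
    weights = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-radius, radius + 1)]
    total = sum(weights)
    return [w / total for w in weights]


def saliency_curve(onset_scores, sigma=2.0):
    """Convolve a sparse onset-score sequence with a Gaussian kernel.

    onset_scores[t] is the combined score of the onsets at time step t
    (0 where no note starts). Samples outside the sequence count as 0.
    """
    radius = max(1, int(3 * sigma))
    kernel = gaussian_kernel(sigma, radius)
    n = len(onset_scores)
    out = [0.0] * n
    for t in range(n):
        for k, w in enumerate(kernel):
            src = t + k - radius
            if 0 <= src < n:
                out[t] += w * onset_scores[src]
    return out


scores = [0.0] * 20
scores[5] = 1.0    # a strong onset
scores[12] = 0.5   # a weaker onset
curve = saliency_curve(scores, sigma=1.5)
print(max(range(len(curve)), key=curve.__getitem__))   # 5
```

Smoothing turns isolated onset spikes into a continuous curve, so a visual event that lands slightly off an onset can still earn partial credit.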

Page 18

Visual temporal saliency

Optical flow as generic visual descriptor [Liu et al. 2005]

Motion change rate (MCR)

Iterative back propagation [Yang et al. 2011]

Page 19

Video analysis cont’d

Motion frequency

- Project motions onto discretized directions

- Power spectral density (PSD) analysis over a time window

- Take the frequency with the largest PSD

Flow peak and dynamism

[Figure labels: d = 0, d = 1, d = 2, d = 3, …]
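The three motion-frequency steps on this slide can be sketched with a plain discrete Fourier transform. This is an illustration under assumptions: one mean flow vector per frame, directions spread evenly over a half circle, and a periodogram as the PSD estimate. None of these choices, nor the function names, are confirmed by the talk.

```python
import cmath
import math


def project(flow, num_directions, d):
    """Project 2-D motion vectors onto discretized direction d."""
    angle = math.pi * d / num_directions   # assumed: directions cover [0, pi)
    ux, uy = math.cos(angle), math.sin(angle)
    return [vx * ux + vy * uy for vx, vy in flow]


def dominant_frequency(signal, fps):
    """Return (frequency in Hz, power) of the strongest non-zero frequency bin."""
    n = len(signal)
    mean = sum(signal) / n
    centered = [x - mean for x in signal]   # remove the DC component
    best_freq, best_power = 0.0, -1.0
    for k in range(1, n // 2 + 1):
        coeff = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(centered))
        power = abs(coeff) ** 2 / n         # periodogram estimate of the PSD
        if power > best_power:
            best_freq, best_power = k * fps / n, power
    return best_freq, best_power


def motion_frequency(flow, fps, num_directions=4):
    """Take the frequency with the largest PSD over all directions."""
    best_freq, best_power = 0.0, -1.0
    for d in range(num_directions):
        freq, power = dominant_frequency(project(flow, num_directions, d), fps)
        if power > best_power:
            best_freq, best_power = freq, power
    return best_freq


# Two seconds of horizontal motion oscillating at 2 Hz, sampled at 30 fps.
fps = 30
flow = [(math.sin(2 * math.pi * 2 * t / fps), 0.0) for t in range(60)]
print(motion_frequency(flow, fps))
```

The resulting frequency is a per-clip pace estimate that could be compared against the pace of a music segment.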

Page 20

Matching cost

- Synchronization cost

- Pace/frequency cost

Page 21

Transition cost

- Pace/velocity compatibility

- # tracks/dynamism compatibility

Page 22

Optimization

A combination of continuous and discrete optimization

Non-convex

Cannot do gradient descent

Two-Stage Optimization

Page 23

Stage 1

[Figure labels: music; segment 1, segment 2, segment 3, segment 4]

Stage 2

Page 24

Stage I

Global alignment

Temporal snapping

[Figure labels: MCR; scalable sliding window; frame (video timeline); start frame; scaling factor; end frame; music timeline; music temporal saliency]
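The scalable sliding window can be sketched as an exhaustive search over start frame and scaling factor. This is a rough illustration, assuming the alignment score is the dot product between the resampled video MCR curve and the music saliency curve. The score, the candidate scales, nearest-frame resampling, and the function names are all assumptions, and temporal snapping is not modeled.

```python
def resample(values, start, scale, length):
    """Sample `length` points from `values`, starting at `start` with step `scale`."""
    out = []
    for i in range(length):
        idx = int(round(start + i * scale))   # nearest-frame lookup
        if idx >= len(values):
            return None                        # window runs past the end of the clip
        out.append(values[idx])
    return out


def best_alignment(mcr, saliency, scales=(0.5, 1.0, 2.0)):
    """Return (score, start frame, scaling factor) of the best-matching window."""
    best = (float("-inf"), None, None)
    for scale in scales:
        for start in range(len(mcr)):
            window = resample(mcr, start, scale, len(saliency))
            if window is None:
                break                          # later starts overrun the clip as well
            score = sum(v * s for v, s in zip(window, saliency))
            if score > best[0]:
                best = (score, start, scale)
    return best


# Video motion peaks every 4 frames, music saliency peaks every 2 steps.
mcr = [0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 1, 0, 0, 0, 0]
saliency = [1, 0, 1, 0, 1]
print(best_alignment(mcr, saliency))   # (3, 3, 2.0)
```

In this toy case the search plays the clip at double speed from frame 3, so all three motion peaks land on music saliency peaks.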

Page 25

Stage II

Metropolis-Hastings algorithm

- Two mutation options

- Node label update

- Two-node label swap

- Reversibility constraint

- Uniform distribution for label update

[Figure labels: music; videos; segment 1, segment 2, segment 3, segment 4]
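The two mutations can be sketched as a small Metropolis-Hastings sampler over segment-to-video labels. This is a minimal illustration, not the authors' implementation: the toy energy, the 50/50 choice between mutations, the fixed temperature, and all names are assumptions. Both proposals are symmetric, which is one simple way to satisfy a reversibility constraint.

```python
import math
import random


def metropolis_hastings(energy, num_segments, num_videos,
                        iterations=5000, temperature=1.0, seed=0):
    """Search for a low-energy assignment of one video label per music segment."""
    rng = random.Random(seed)
    labels = [rng.randrange(num_videos) for _ in range(num_segments)]
    current = energy(labels)
    best, best_energy = labels[:], current
    for _ in range(iterations):
        proposal = labels[:]
        if rng.random() < 0.5:
            # Mutation 1: update one node's label, drawn from a uniform distribution.
            proposal[rng.randrange(num_segments)] = rng.randrange(num_videos)
        else:
            # Mutation 2: swap the labels of two nodes.
            i, j = rng.randrange(num_segments), rng.randrange(num_segments)
            proposal[i], proposal[j] = proposal[j], proposal[i]
        candidate = energy(proposal)
        # Symmetric proposals: the acceptance ratio reduces to exp(-dE / T).
        if candidate <= current or rng.random() < math.exp((current - candidate) / temperature):
            labels, current = proposal, candidate
            if current < best_energy:
                best, best_energy = labels[:], current
    return best, best_energy


# Hypothetical toy energy: a per-segment matching cost plus a transition
# penalty when neighboring segments show the same video.
MATCH_COST = [[0, 1, 1], [1, 0, 1], [1, 1, 0], [0, 1, 1]]


def toy_energy(labels):
    match = sum(MATCH_COST[s][v] for s, v in enumerate(labels))
    transition = sum(2 for a, b in zip(labels, labels[1:]) if a == b)
    return match + transition


best, best_energy = metropolis_hastings(toy_energy, num_segments=4, num_videos=3)
print(best, best_energy)
```

Tracking the best labeling seen so far turns the sampler into an optimizer: occasional uphill moves help it escape local minima that a purely greedy search would get stuck in.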

Page 26

Result: Wild

Input: 35 videos of wildlife scenes; Music: Exploration (excerpt)

Page 27

Result: Aurora

Input: 36 aurora videos; Music: Someone like you (excerpt)

Page 28

Result: City timelapse

Input: 55 city timelapse videos; Music: Clocks (excerpt)

Page 29

Comparison: sync vs. no-sync

Feature turned off / Feature turned on

Page 30

Comparison: cut-to-the-beat

Without cut-to-the-beat / With cut-to-the-beat

Page 31

User study

Experiment setup

- 5 groups: ours, ours without cut-to-the-beat, ours without sync, Avg User, Expert User

- 6 examples: Aurora, City timelapse, Happy birthday, Adventure, Ballet, Wild

- 29 participants

- Random order; each result rated from 1 to 5

- Subpopulation analysis by questionnaire

Page 32

User study results

[Chart legend: Ours; w/out sync.; w/out cut-to-beat; Manual edits (Expert User, Avg user)]

[Chart titles: Average rating of different methods; Average rating of different examples; Fraction of best vs. worst ratings for each method; Fraction of higher vs. lower ratings in pairs of methods]

Page 33

Limitations

Manual selection of music

Storyline is not preserved

The use of MIDI: bad side and good side

Page 34

Future work

Replace MIDI with .wav and .mp3

Put a human in the loop

Video-based music recommendation

...

Page 35

http://web.engr.illinois.edu/~liao17/montage.html

Thank You