-
MuseGAN: Multi-track Sequential Generative Adversarial Networks for Symbolic Music Generation and Accompaniment
Hao-Wen Dong*, Wen-Yi Hsiao*, Li-Chia Yang, Yi-Hsuan Yang
Research Center of IT Innovation, Academia Sinica
*these authors contributed equally to this work
-
Outline
。Goals & Challenges
。Data
。Proposed Model
。Results & Evaluation
。Future Works
[Source Code] https://github.com/salu133445/musegan
[Demo Page] https://salu133445.github.io/musegan/
-
Goals & Challenges
-
Goals
Generate pop music
。of multiple tracks
。in piano-roll format
。using GAN with CNNs
-
Challenge I: Multitrack Interdependency
[Figure: an example song split into five tracks: vocal, piano, bass, drums, strings; music & clip by phycause]
Multi-track GAN
-
Challenge II: Music Texture
melody
chord (harmony)
Convolutional Neural Networks
-
Challenge III: Temporal Structure
[Figure: temporal hierarchy of a song in 4/4 time]
song: paragraph 1, paragraph 2, paragraph 3
paragraph: phrase 1, phrase 2, phrase 3, phrase 4
phrase 2: bar 1, bar 2, bar 3, bar 4
bar: beat 1, beat 2, beat 3, beat 4
beat: step 1, step 2, ···, step 24
Fixed structure within a phrase: 4 bars, 4 beats per bar, 24 steps per beat
Convolutional Neural Networks
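The fixed layout within a phrase can be turned into simple index arithmetic. The following is a minimal sketch, assuming a phrase of 4 bars, 4 beats per bar (4/4 time), and 24 time steps per beat as drawn on the slide. The function name `locate` and the constant names are illustrative and are not taken from the MuseGAN code.

```python
BARS_PER_PHRASE = 4
BEATS_PER_BAR = 4       # 4/4 time
STEPS_PER_BEAT = 24     # "step 1 ... step 24" on the slide
STEPS_PER_BAR = BEATS_PER_BAR * STEPS_PER_BEAT  # 96 time steps per bar


def locate(step):
    """Map a 0-based time step within a phrase to 1-based (bar, beat, step)."""
    if not 0 <= step < BARS_PER_PHRASE * STEPS_PER_BAR:
        raise ValueError("step lies outside the 4-bar phrase")
    bar, within_bar = divmod(step, STEPS_PER_BAR)
    beat, within_beat = divmod(within_bar, STEPS_PER_BEAT)
    return bar + 1, beat + 1, within_beat + 1


print(locate(0))    # first step of the phrase
print(locate(96))   # first step of the second bar
```

Because every bar has the same number of steps, a phrase maps onto a regular grid, which is the property that makes convolutional layers a natural fit here.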
-
Data
-
Data Representation
Piano-roll: polyphonic, multi-track (with symbolic timing)
[Figure: a piano-roll with pitch on one axis and time (in time steps) on the other, spanning Bar 1 to Bar 4]
[Figure: the same piano-roll with a single note marked at pitch A3, lasting from t0 to t1]
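A piano-roll can be stored as a binary matrix indexed by time step and pitch. The sketch below marks one note, A3 from t0 to t1, in a single bar. It assumes 96 time steps per bar and 84 pitch rows (the sizes given later in the deck), and it assumes the lowest row corresponds to MIDI note 24; that offset, the helper names, and the chosen t0 and t1 are illustrative only.

```python
N_STEPS = 96        # time steps in one bar
N_PITCHES = 84      # pitch rows in the piano-roll
LOWEST_PITCH = 24   # assumed MIDI number of the lowest row
A3 = 57             # MIDI number of A3 when middle C (60) is named C4


def empty_bar():
    """Return an all-silent bar as a list of time steps, each a list of pitches."""
    return [[0] * N_PITCHES for _ in range(N_STEPS)]


def add_note(bar, midi_pitch, t0, t1):
    """Mark midi_pitch as sounding for every time step t with t0 <= t < t1."""
    row = midi_pitch - LOWEST_PITCH
    if not 0 <= row < N_PITCHES:
        raise ValueError("pitch outside the piano-roll range")
    if not 0 <= t0 < t1 <= N_STEPS:
        raise ValueError("invalid note interval")
    for t in range(t0, t1):
        bar[t][row] = 1
    return bar


# A3 held for the first beat (24 steps) of the bar.
bar = add_note(empty_bar(), A3, 0, 24)
active_cells = sum(sum(frame) for frame in bar)
```

Polyphony needs no extra machinery in this representation: several pitches can simply be active in the same time step.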
Multi-track Piano-roll
[Figure: piano-rolls stacked along a third axis, giving the axes pitch, time, and tracks]
A 4-bar phrase is a 4×96×84×5 tensor: 4 bars × 96 time steps per bar × 84 pitches × 5 tracks
Tracks: Drums, Guitar, Piano, Strings, Bass
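To make the tensor layout concrete, the following sketch builds an empty phrase as nested Python lists and checks its shape. The axis order (bar, time step, pitch, track) follows the 4×96×84×5 description; the track order and the helper names are assumptions made for illustration. In practice an array library would be used instead of nested lists.

```python
N_BARS, N_STEPS, N_PITCHES, N_TRACKS = 4, 96, 84, 5
TRACKS = ["Drums", "Guitar", "Piano", "Strings", "Bass"]  # assumed order


def empty_phrase():
    """Return a silent phrase indexed as phrase[bar][step][pitch][track]."""
    return [[[[0] * N_TRACKS for _ in range(N_PITCHES)]
             for _ in range(N_STEPS)]
            for _ in range(N_BARS)]


def shape(nested):
    """Return the dimensions of a regular nested list."""
    dims = []
    while isinstance(nested, list):
        dims.append(len(nested))
        nested = nested[0]
    return tuple(dims)


phrase = empty_phrase()
# Switch on one cell: bass track, second bar, first time step, pitch row 33.
phrase[1][0][33][TRACKS.index("Bass")] = 1

n_entries = N_BARS * N_STEPS * N_PITCHES * N_TRACKS
print(shape(phrase))  # (4, 96, 84, 5)
print(n_entries)      # 161280
```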
-
Data
LPD (Lakh Pianoroll Dataset)
。>170,000 multi-track piano-rolls
。Derived from Lakh MIDI Dataset
。Mainly pop songs
Pypianoroll (Python package)
。Manipulation & Visualization
。Efficient I/O
。Parse/Write MIDI files
。On PyPI (pip installable)
[Dataset]https://salu133445.github.io/lakh-pianoroll-dataset
[Pypianoroll]https://salu133445.github.io/pypianoroll/
-
Proposed Model
-
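Generative Adversarial Networks
[Figure: random noise z ~ p_z is fed to the generator G to produce fake samples G(z); the discriminator D receives fake samples G(z) and real samples X ~ p_X and outputs 1/0]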
Generator: make G(z) indistinguishable from real data for D
  loss: log(1 - D(G(z)))
Discriminator: tell fake data G(z) apart from real data X
  loss: log(1 - D(X)) + log(D(G(z)))
Real samples X: 4-bar phrases of 5 tracks
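The two expressions on this slide can be read as quantities that each network tries to minimize. The sketch below evaluates them for single discriminator outputs, assuming D returns a probability strictly between 0 and 1. The function names are hypothetical; a practical implementation would average over a batch and differentiate through the networks, and the critic loss reported in the results suggests the trained model may use a different (Wasserstein-style) objective.

```python
import math


def discriminator_loss(d_real, d_fake):
    """log(1 - D(X)) + log(D(G(z))): low when D(X) is near 1 and D(G(z)) near 0."""
    return math.log(1.0 - d_real) + math.log(d_fake)


def generator_loss(d_fake):
    """log(1 - D(G(z))): low when the discriminator rates fake data as real."""
    return math.log(1.0 - d_fake)


# A discriminator that separates real from fake well has a lower loss
# than one that outputs 0.5 for everything.
confident = discriminator_loss(d_real=0.9, d_fake=0.1)
undecided = discriminator_loss(d_real=0.5, d_fake=0.5)

# The generator's loss drops as D(G(z)) rises.
fooled = generator_loss(d_fake=0.9)
caught = generator_loss(d_fake=0.1)
```

The two losses pull D(G(z)) in opposite directions, which is the adversarial part of the training.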
-
MuseGAN – An Overview
[Figure: 1 random noise → Temporal Generator (Gtemp) → 4 latent variables → Bar Generator (Gbar) → 4 piano-roll matrices]
-
Generator
[Figure: the bar generator; latent vectors z are fed to five generators G, one per track]
[Figure: two kinds of latent input to the bar generator]
。track-dependent z (one per track): no coordination
。track-independent z (shared by all tracks): coordination
[Figure: a temporal generator is placed before the bar generator and produces the latent vectors z for each bar]
Roles of the four latent inputs
                    Time-dependent   Time-independent
Track-dependent     Melody           Groove
Track-independent   Chords           Style
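The four roles (melody, groove, chords, style) can be illustrated by showing which latent vectors are shared across tracks and which are shared across bars. This is a rough sketch under several assumptions: the vector size is arbitrary, the four vectors are simply concatenated, and `temporal_generator` is a stand-in stub rather than a trained network. None of the names come from the MuseGAN code.

```python
import random

N_TRACKS, N_BARS, Z_DIM = 5, 4, 4  # Z_DIM is arbitrary for illustration


def noise(rng, dim=Z_DIM):
    """Draw one latent vector from a standard normal distribution."""
    return [rng.gauss(0.0, 1.0) for _ in range(dim)]


def temporal_generator(z, n_bars=N_BARS):
    """Stub: expand one vector into one vector per bar (a network in practice)."""
    return [[v + t for v in z] for t in range(n_bars)]


def bar_generator_inputs(rng):
    """Return {(track, bar): latent input} for every track and bar."""
    style = noise(rng)                                   # shared by tracks, fixed over time
    chords = temporal_generator(noise(rng))              # shared by tracks, varies per bar
    groove = [noise(rng) for _ in range(N_TRACKS)]       # per track, fixed over time
    melody = [temporal_generator(noise(rng))             # per track, varies per bar
              for _ in range(N_TRACKS)]
    return {
        (i, t): style + chords[t] + groove[i] + melody[i][t]
        for i in range(N_TRACKS)
        for t in range(N_BARS)
    }


inputs = bar_generator_inputs(random.Random(0))
```

Each of the 20 (track, bar) pairs receives its own input, but the style and chord segments are identical across tracks, which is what lets the tracks stay coordinated.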
-
MuseGAN
-
Network Architecture
Bar Generator
Layer       Filters  Kernel  Stride  Output
Input                                32
dense       1024
reshape to 2, 1 × 512 channels       2, 1, 512
transconv   512      2 × 1   2, 1    4, 1, 512
transconv   256      2 × 1   2, 1    8, 1, 256
transconv   256      2 × 1   2, 1    16, 1, 256
transconv   128      2 × 1   2, 1    32, 1, 128
transconv   128      3 × 1   3, 1    96, 1, 128
transconv   64       1 × 7   1, 7    96, 7, 64
transconv   M        1 × 12  1, 12   96, 84, M
Output      96, 84 × M channels
Temporal Generator
Input 32
transconv 1024 2 1 3, 1024
transconv 32 3 1 4, 32
Output 32
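The output sizes listed for the bar generator can be checked with the usual size formula for a transposed convolution. The sketch below assumes no padding, so that out = (in - 1) × stride + kernel; that padding choice is an assumption, since the slide does not state it. The layer list is copied from the bar generator rows, starting after the reshape to (2, 1, 512), and the names are illustrative.

```python
def transconv_out(size, kernel, stride):
    """Output length of a transposed convolution without padding."""
    return (size - 1) * stride + kernel


# (filters, kernel (time, pitch), stride (time, pitch)) for each transconv layer
LAYERS = [
    (512, (2, 1), (2, 1)),
    (256, (2, 1), (2, 1)),
    (256, (2, 1), (2, 1)),
    (128, (2, 1), (2, 1)),
    (128, (3, 1), (3, 1)),
    (64, (1, 7), (1, 7)),
    ("M", (1, 12), (1, 12)),
]


def trace(shape=(2, 1, 512)):
    """Return the (time, pitch, channels) shape after each layer."""
    shapes = [shape]
    for filters, kernel, stride in LAYERS:
        time, pitch, _ = shapes[-1]
        shapes.append((
            transconv_out(time, kernel[0], stride[0]),
            transconv_out(pitch, kernel[1], stride[1]),
            filters,
        ))
    return shapes


for s in trace():
    print(s)
```

The early layers grow only the time axis (2 to 96 steps) and the last two grow only the pitch axis (1 to 7 to 84), so time and pitch are upsampled separately.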
-
Results
-
Results
[Figure: two generated samples (Sample 1, Sample 2) shown as five-track piano-rolls: Bass, Drums, Guitar, Strings, Piano]
[Figure: a generated five-track piano-roll at training steps 0, 700, 2500, and 7900, with the drum pattern, chords, and bass line annotated]
More samples available on the demo page: https://salu133445.github.io/musegan/
-
Objective Metrics
Monitor the Training
[Figure: three curves plotted against training step: negative critic loss (log scale), number of used pitch classes, and qualified note rate]
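The slide names the two musical metrics without defining them, so the following sketch rests on stated assumptions. A used pitch class is taken to be any of the 12 pitch classes with at least one active cell. A note is taken to be a maximal run of consecutive active time steps at one pitch, and it counts as qualified when it lasts at least 3 time steps. The threshold, the MIDI offset of the lowest row, and all function names are assumptions.

```python
def used_pitch_classes(bar, lowest_pitch=24):
    """Count distinct pitch classes (0 to 12) active anywhere in the bar."""
    classes = set()
    for frame in bar:
        for row, active in enumerate(frame):
            if active:
                classes.add((row + lowest_pitch) % 12)
    return len(classes)


def note_lengths(bar):
    """Return the length of every run of consecutive active steps, per pitch."""
    lengths = []
    for row in range(len(bar[0])):
        run = 0
        for frame in bar:
            if frame[row]:
                run += 1
            elif run:
                lengths.append(run)
                run = 0
        if run:
            lengths.append(run)
    return lengths


def qualified_note_rate(bar, min_steps=3):
    """Fraction of notes lasting at least min_steps time steps."""
    lengths = note_lengths(bar)
    if not lengths:
        return 0.0
    return sum(length >= min_steps for length in lengths) / len(lengths)


# Example bar: 96 time steps × 84 pitches with three notes.
bar = [[0] * 84 for _ in range(96)]
for t in range(0, 24):
    bar[t][33] = 1      # long note
for t in range(30, 32):
    bar[t][0] = 1       # 2-step fragment, not qualified
for t in range(40, 50):
    bar[t][12] = 1      # same pitch class as row 0, one octave higher
```

In this example the rows 0 and 12 share a pitch class, so two pitch classes are used, and two of the three notes are qualified. Tracking such values during training gives a cheap signal of whether the generator produces overly fragmented notes.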
-
User Study
[Figure: user study ratings for composer, jamming, and hybrid]
。H: harmonious
。R: rhythmic
。MS: musically structured
。C: coherent
。OR: overall rating
-
Accompaniment System
                         Input         Output
Generation from Scratch  nothing       5-track
Accompaniment System     single-track  5-track
Conditional GAN
-
Summary
。MuseGAN
  。a novel GAN for multi-track sequence generation
  。multi-track, polyphonic music
  。human-AI cooperative scenario
。Lakh Pianoroll Dataset (LPD) (new dataset)
。Pypianoroll (new Python package)
-
Future Works
-
Future Works
Full Song Generation
Challenges
。hierarchical temporal structure
。variable-length sequence generation
[Figure: hierarchical temporal structure of a song: song, paragraph, phrase, bar, beat, step]
-
Future Works
Cross-modal Generation
Challenge
。cross-modal temporal interdependency
Applications in Music
。music + lyrics
。music + video
-
Q&A