Semantic Soft Segmentation
YAĞIZ AKSOY (MIT) et al., ACM Transactions on Graphics 2018
Copyright of figures and other materials in the paper belongs to the original authors.
Presented by Qi-Meng Zhang
2019. 10. 24
Computer Graphics @ Korea University
Qimeng Zhang | 2019. 10. 24 | # 2 | Computer Graphics @ Korea University
What is Semantic Soft Segmentation?
Semantic Segmentation
Image from: https://towardsdatascience.com/semantic-segmentation-the-easiest-possible-implementation-in-code-193bf27b86b8
Input
Hard segmentation
+Soft transitions
Soft segmentation
SSS
• Accurate representation of soft transitions between image regions is essential for high-quality image editing and compositing.
Especially when fuzzy boundaries and transparency are involved
• Existing tools rely heavily on user interaction
• Tedious task of object selection
Ex: Photoshop tools (magnetic lasso, magic wand)
1. Introduction
https://codebridgeplus.com/photoshop-magic-wand/
• In this work
Provide distinct segments of the image, while also representing the soft transitions between them accurately
Done fully automatically
1. Introduction
Contribution
Fig. 1.
• Based on SPECTRAL DECOMPOSITION
• Graph structure considering both
Low-level information
• Texture and color
High-level information
• Semantic cues
By deep learning
• Corresponding Laplacian Matrix
Reveals the soft transitions between semantic objects in eigenvectors
• Spatially varying model of layer sparsity
Generates high-quality layers from the eigenvectors
1. Introduction
How?
• Definition of digital matting
1. Introduction
Matting
𝐼 = 𝛼 × 𝐹 + (1 − 𝛼) × 𝐵
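The compositing equation can be checked numerically for a single pixel. This is a minimal sketch: the function name `composite` and the example colors are illustrative assumptions, not taken from the paper.

```python
def composite(alpha, fg, bg):
    """Blend one pixel per channel: I = alpha * F + (1 - alpha) * B."""
    if not 0.0 <= alpha <= 1.0:
        raise ValueError("alpha must lie in [0, 1]")
    return tuple(alpha * f + (1.0 - alpha) * b for f, b in zip(fg, bg))


red = (1.0, 0.0, 0.0)    # hypothetical foreground color F
blue = (0.0, 0.0, 1.0)   # hypothetical background color B

print(composite(1.0, red, blue))   # fully opaque: the foreground color
print(composite(0.0, red, blue))   # fully transparent: the background color
print(composite(0.25, red, blue))  # soft transition: a mix of both
```

Matting is the inverse problem: given only I, estimate 𝛼 (and possibly F and B) at every pixel.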
1. Introduction
Matting Interface
Region labels: 𝛼 = 1, 𝛼 = 0, 𝛼 ∈ [0,1]
• Laplacian Matrix
1. Introduction
Graph Laplacian
𝐿 = 𝐷 − 𝑊
𝐷: degree matrix, 𝑊: adjacency matrix*
Example graph: nodes 1, 2 and 3 are mutually connected; nodes 4 and 5 are connected to each other
Laplacian Matrix
 2 -1 -1  0  0
-1  2 -1  0  0
-1 -1  2  0  0
 0  0  0  1 -1
 0  0  0 -1  1
Properties:
1. Symmetric
2. Positive semi-definite
3. Smallest eigenvalue of L is 0
*Also called affinity matrix
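The example matrix can be reproduced from the graph's edge list. The sketch below assumes an unweighted, undirected graph and indexes the slide's nodes 1..5 as 0..4; the helper names are illustrative.

```python
def laplacian(n, edges):
    """Return L = D - W for an undirected, unweighted graph on n nodes."""
    W = [[0] * n for _ in range(n)]
    for i, j in edges:
        W[i][j] = W[j][i] = 1          # symmetric adjacency (affinity) matrix
    L = [[-W[i][j] for j in range(n)] for i in range(n)]
    for i in range(n):
        L[i][i] = sum(W[i])            # node degree on the diagonal
    return L


def quadratic_form(L, x):
    """Compute x^T L x, which equals the sum of (x_i - x_j)^2 over all edges."""
    n = len(x)
    return sum(x[i] * L[i][j] * x[j] for i in range(n) for j in range(n))


# Nodes 1..5 of the slide are indexed 0..4 here.
EDGES = [(0, 1), (0, 2), (1, 2), (3, 4)]
L = laplacian(5, EDGES)
for row in L:
    print(row)
```

Every row sums to zero, so the all-ones vector is an eigenvector with eigenvalue 0, and the quadratic form is a sum of squares, which is why L is positive semi-definite.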
1. Introduction
Laplacian Matrix in Spectral Clustering
Example graph: nodes 1, 2 and 3 are mutually connected; nodes 4 and 5 are connected to each other
Laplacian Matrix
 2 -1 -1  0  0
-1  2 -1  0  0
-1 -1  2  0  0
 0  0  0  1 -1
 0  0  0 -1  1
First K Eigenvectors
f1 f2
1 0
1 0
1 0
0 1
0 1
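A quick check shows why f1 and f2 identify the clusters: each is the indicator vector of one connected component and is mapped to zero by L, so both are eigenvectors with eigenvalue 0. This is a minimal sketch in plain Python; note that a numerical eigensolver may return any basis of this null space, so the clean 0/1 form is the idealized case.

```python
L = [
    [2, -1, -1, 0, 0],
    [-1, 2, -1, 0, 0],
    [-1, -1, 2, 0, 0],
    [0, 0, 0, 1, -1],
    [0, 0, 0, -1, 1],
]


def matvec(M, v):
    """Multiply matrix M (list of rows) by vector v."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]


f1 = [1, 1, 1, 0, 0]   # indicator of the component {1, 2, 3}
f2 = [0, 0, 0, 1, 1]   # indicator of the component {4, 5}

print(matvec(L, f1))   # all zeros: L f1 = 0 * f1
print(matvec(L, f2))   # all zeros: L f2 = 0 * f2

# Each node's row in [f1 f2] is its embedding; here the larger entry gives the cluster.
labels = [0 if a > b else 1 for a, b in zip(f1, f2)]
print(labels)
```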
1. Introduction
Matrix with Image
Eigendecomposition
512×512
Ex: keep only the first (largest) 50 eigenvalues and set the others to 0
512×512
1. Introduction
Eigenvector with Segmentation
Smallest eigenvectors of the matting Laplacian
Spectral Matting [Anat Levin (MIT) et al. / CVPR 2007]
2. Related Work
Soft Segmentation (1/2)
Unmixing-Based Soft Color Segmentation for Image Manipulation [Yagiz Aksoy (Simon Fraser University) et al. / TOG 2017]
Color-based
2. Related Work
Soft Segmentation (2/2)
Spectral Matting [Anat Levin (MIT) et al. / CVPR 2007]
A Closed-Form Solution to Natural Image Matting [Anat Levin (MIT) et al. / CVPR 2006]
Spatially connected soft segments
Matting Laplacian Matrix
Spectral Analysis
• Estimation of per-pixel opacities of a user-defined foreground region
Affinity-based method
2. Related Work
Natural Image Matting
KNN Matting [Chen, Li, and Tang (The Hong Kong University of Science and Technology). TPAMI, 2013.]
k nearest neighbors
2. Related Work
Targeted Edit Propagation
DeepProp: Extracting Deep Features from a Single Image for Edit Propagation [Yuki Endo (University of Tsukuba) et al. / EUROGRAPHICS 2016]
2. Related Work
Semantic Segmentation
Pyramid Scene Parsing Network [Zhao (The Chinese University of Hong Kong) et al. / CVPR 2017]
Semantic Instance Segmentation via Deep Metric Learning [Alireza Fathi (Google) et al. / arXiv 2017]
• The problem description
Automatically generate a soft segmentation of the input image
• A decomposition into layers that represent the objects in the scene
• Including transparency and soft transitions
3. Method
𝛼: opacity value, 𝛼 ∈ [0,1]; 𝛼 = 0: fully transparent, 𝛼 = 1: fully opaque
3. Method
Overview
Relaxed sparsification
3.1 Background
Spectral Matting (Brief Summary)
Slide by Levon
• Recall the compositing equation
3.1 Background
The Matting Laplacian (1/3)
Slide by: CVFX @ NTHU
3.1 Background
The Matting Laplacian (2/3)
Slide by: CVFX @ NTHU
3.1 Background
The Matting Laplacian (3/3)
Slide by: CVFX @ NTHU
Rewrite as matrix
(𝑎𝑘𝐼1 + 𝑏𝑘 − 𝛼1)²
Least squares problem
……..
More detail in: http://ocw.nthu.edu.tw/ocw/index.php?page=course&cid=125&
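For reference, the least-squares problem behind the matting Laplacian can be written as below. This follows the grayscale formulation of the closed-form matting paper cited in the related work; the window symbol w_k and the regularization weight ε are notation assumed here, since the slide's own equations did not survive extraction.

```latex
J(\alpha, a, b) = \sum_{k} \left( \sum_{i \in w_k} \left( a_k I_i + b_k - \alpha_i \right)^2 + \epsilon\, a_k^2 \right),
\qquad
\min_{a, b} J(\alpha, a, b) = \alpha^{\top} L\, \alpha
```

Eliminating the per-window coefficients a_k and b_k leaves a quadratic form in 𝛼 alone, and the matrix L of that quadratic form is the matting Laplacian.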
3.1 Background
From Eigenvectors to Matting Components
𝛼𝑖𝑝: opacity of pixel p in soft segment i
𝐸: matrix containing the K eigenvectors of L with the smallest eigenvalues
𝑦𝑖: linear weights on the eigenvectors that define the soft segments
𝛾: parameter that controls the strength of the sparsity prior
Build the corresponding normalized Laplacian matrix
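The slide's equation did not survive extraction. Assuming the sparsity prior has the per-pixel form |𝛼|^𝛾 + |1 − 𝛼|^𝛾 used in Spectral Matting, the sketch below shows why an exponent 𝛾 < 1 pushes opacities toward 0 or 1. The function name and the default 𝛾 = 0.8 (the value quoted later for constrained sparsification) are choices made for this illustration.

```python
def sparsity_cost(alpha, gamma=0.8):
    """Per-pixel sparsity penalty, assumed form |alpha|^gamma + |1 - alpha|^gamma."""
    return abs(alpha) ** gamma + abs(1.0 - alpha) ** gamma


for a in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(f"alpha={a:.2f}  cost={sparsity_cost(a):.3f}")
```

With 𝛾 < 1 the penalty is lowest at 𝛼 = 0 and 𝛼 = 1 and highest at 𝛼 = 0.5; with 𝛾 = 1 every value in [0, 1] costs the same, so no sparsity is encouraged.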
• Define an additional low-level affinity term
Represents color-based longer-range interactions
• Propose a guided sampling based on an over-segmentation of the image
Generate 2500 superpixels
• Using SLIC (simple linear iterative clustering)
Estimate the affinity between each superpixel and all the superpixels within a radius that corresponds to 20% of the image size
3.2 Nonlocal Color Affinity
SLIC Superpixels Compared to State-of-the-Art Superpixel Methods
[R. Achanta (EPFL) et al. / IEEE Trans. Pattern Anal. Mach. Intell. 2012]
• The color affinity between the centroids of two superpixels s and t:
𝑐𝑠, 𝑐𝑡: average colors of superpixels s and t
• Lies in [0,1]
erf: Gauss error function
𝑎𝑐, 𝑏𝑐: control how quickly the affinity degrades and the threshold where it becomes zero
• 𝑎𝑐 = 50, 𝑏𝑐 = 0.05
3.2 Nonlocal Color Affinity
Equation
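The equation itself is missing from the transcript. Assuming it has the form w = (erf(𝑎𝑐 (𝑏𝑐 − ‖𝑐𝑠 − 𝑐𝑡‖)) + 1) / 2, which is consistent with the definitions on this slide (erf, range [0,1], parameters 𝑎𝑐 and 𝑏𝑐), a minimal Python sketch looks like this. Colors are assumed to be RGB triples in [0, 1], and the function name is illustrative.

```python
import math


def color_affinity(c_s, c_t, a_c=50.0, b_c=0.05):
    """Nonlocal color affinity between two superpixels (assumed erf form).

    c_s, c_t: average colors of the two superpixels, RGB in [0, 1].
    a_c: how quickly the affinity degrades; b_c: color-distance threshold.
    """
    dist = math.dist(c_s, c_t)
    return (math.erf(a_c * (b_c - dist)) + 1.0) / 2.0


same = color_affinity((0.2, 0.4, 0.6), (0.2, 0.4, 0.6))
near = color_affinity((0.2, 0.4, 0.6), (0.22, 0.4, 0.6))
far = color_affinity((0.0, 0.0, 0.0), (1.0, 1.0, 1.0))
print(same, near, far)
```

Under this form the affinity stays close to 1 for nearly identical colors, passes through 0.5 when the color distance equals 𝑏𝑐, and drops to effectively 0 soon after.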
• This affinity essentially makes sure the regions with very similar colors stay connected in challenging scene structures
3.2 Nonlocal Color Affinity (cont’)
Fig. 3.
• A term that encourages the grouping of pixels that belong to the same scene object
The feature vectors are generated such that, for two pixels 𝑝 and 𝑞 that belong to the same object, 𝑓𝑝 and 𝑓𝑞 are similar.
𝑎𝑠, 𝑏𝑠: control the steepness of the affinity function
𝑓̃𝑠, 𝑓̃𝑡: average feature vectors of superpixels s and t
3.3 High-level Semantic Affinity
3.3 High-level Semantic Affinity
Fig. 4.
Fig. 5.
• Form the Laplacian matrix L by adding the affinity matrices together
𝑊𝐿: matrix with the matting affinities
𝑊𝐶: matrix with the nonlocal color affinities
𝑊𝑆: matrix with the semantic affinities
𝜹𝑺, 𝜹𝑪: weights of the semantic and nonlocal color affinities, both set to 0.01
3.4 Creating the layers
Forming the Laplacian Matrix
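The combination step can be sketched on a toy example. The exact equation is not in the transcript, so this assumes a weighted sum W = 𝑊𝐿 + 𝜹𝑺 𝑊𝑆 + 𝜹𝑪 𝑊𝐶 followed by the symmetric normalization D^(-1/2) (D − W) D^(-1/2); dense lists stand in for the large sparse matrices a real image would need, and all names are illustrative.

```python
import math


def combine_affinities(W_L, W_S, W_C, delta_s=0.01, delta_c=0.01):
    """Weighted sum of matting, semantic and nonlocal color affinities (assumed form)."""
    n = len(W_L)
    return [[W_L[i][j] + delta_s * W_S[i][j] + delta_c * W_C[i][j]
             for j in range(n)] for i in range(n)]


def normalized_laplacian(W):
    """Return D^(-1/2) (D - W) D^(-1/2), where D holds the row sums of W."""
    n = len(W)
    deg = [sum(row) for row in W]
    inv_sqrt = [1.0 / math.sqrt(d) if d > 0 else 0.0 for d in deg]
    return [[inv_sqrt[i] * ((deg[i] if i == j else 0.0) - W[i][j]) * inv_sqrt[j]
             for j in range(n)] for i in range(n)]


# Hypothetical 3-pixel example: matting links neighbors, the other terms link pixels 0 and 2.
W_L = [[0.0, 1.0, 0.0], [1.0, 0.0, 1.0], [0.0, 1.0, 0.0]]
W_S = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]
W_C = [[0.0, 0.0, 1.0], [0.0, 0.0, 0.0], [1.0, 0.0, 0.0]]

W = combine_affinities(W_L, W_S, W_C)
L = normalized_laplacian(W)
for row in L:
    print([round(v, 4) for v in row])
```

The small weights keep the matting affinities dominant locally, while the semantic and color terms add weak long-range links between pixels that the matting term alone would leave unconnected.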
• Extract the eigenvectors corresponding to the 100 smallest eigenvalues of L
Sparsity parameter 𝛾 = 0.8
• Group the soft segments with the k-means algorithm on their feature vectors
Result: 5 layers
3.4 Creating the layers
Constrained Sparsification
Fig. 7. Before grouping / After grouping
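The grouping step can be illustrated with a small k-means sketch. It assumes one feature vector per soft segment is already available and merges segments in the same cluster by summing their opacities; the toy data, the deterministic seeding and the helper names are assumptions made for this illustration, not the authors' implementation.

```python
import math


def kmeans(points, k, iters=20):
    """Plain k-means; the first k points seed the centroids (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest centroid for every point.
        labels = [min(range(k), key=lambda c: math.dist(p, centroids[c]))
                  for p in points]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(x) / len(members) for x in zip(*members)]
    return labels


def group_layers(alphas, labels, k):
    """Sum the per-pixel opacities of all soft segments sharing a label."""
    n_pixels = len(alphas[0])
    layers = [[0.0] * n_pixels for _ in range(k)]
    for alpha, lab in zip(alphas, labels):
        for p, a in enumerate(alpha):
            layers[lab][p] += a
    return layers


# Hypothetical toy data: 4 soft segments over 3 pixels, one 2-D feature per segment.
features = [(0.0, 0.1), (0.9, 1.0), (0.1, 0.0), (1.0, 0.9)]
alphas = [[0.5, 0.0, 0.0], [0.0, 0.6, 0.3], [0.5, 0.1, 0.0], [0.0, 0.3, 0.7]]

labels = kmeans(features, k=2)
layers = group_layers(alphas, labels, k=2)
print(labels)
print(layers)
```

Because the input opacities sum to 1 at every pixel, the merged layers do as well, so the grouped result is still a valid soft segmentation.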
• We define an energy function that promotes matte sparsity on the pixel-level while respecting the initial soft segment estimates from the constrained sparsification and the image structure
3.4 Creating the layers
Relaxed Sparsification (Energy 1, 2)
Energy 1: 𝛼̂ denotes the layers created with the constrained sparsification
Energy 2: relaxed version of Eqn. (1)
3.4 Creating the layers
Relaxed Sparsification (Energy 3, 4)
Energy 3: defines the spatial propagation of information in Eqn. (6)
Energy 4: 𝛻𝑐𝑝 is the color gradient in the image at pixel p, computed using separable kernels
• Differentiation of Discrete Multidimensional Signals [H. Farid (NYU) and E. P. Simoncelli (CNS) / Image Process 2004]
• Final energy
• Matrix form
Solve this equation using preconditioned conjugate gradient optimization
3.4 Creating the layers
Relaxed Sparsification (Final Energy)
Fig. 6.
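Preconditioned conjugate gradient solves a symmetric positive-definite linear system iteratively, which suits the large sparse systems that arise from the matrix form of the energy. The sketch below is a generic textbook version on a tiny dense system; the Jacobi (diagonal) preconditioner is an assumption, since the transcript does not say which preconditioner the authors use.

```python
def pcg(A, b, tol=1e-10, max_iter=100):
    """Solve A x = b for symmetric positive-definite A with Jacobi-preconditioned CG."""
    n = len(b)

    def matvec(v):
        return [sum(a * x for a, x in zip(row, v)) for row in A]

    def dot(u, v):
        return sum(x * y for x, y in zip(u, v))

    inv_diag = [1.0 / A[i][i] for i in range(n)]    # Jacobi preconditioner M^-1
    x = [0.0] * n
    r = list(b)                                      # residual b - A x for x = 0
    z = [m * ri for m, ri in zip(inv_diag, r)]       # preconditioned residual
    p = list(z)                                      # first search direction
    rz = dot(r, z)
    for _ in range(max_iter):
        if dot(r, r) ** 0.5 < tol:                   # converged: residual norm is small
            break
        Ap = matvec(p)
        step = rz / dot(p, Ap)
        x = [xi + step * pi for xi, pi in zip(x, p)]
        r = [ri - step * api for ri, api in zip(r, Ap)]
        z = [m * ri for m, ri in zip(inv_diag, r)]
        rz_new = dot(r, z)
        p = [zi + (rz_new / rz) * pi for zi, pi in zip(z, p)]
        rz = rz_new
    return x


# Hypothetical 2x2 system; the exact solution is x = (1/11, 7/11).
A = [[4.0, 1.0], [1.0, 3.0]]
b = [1.0, 2.0]
x = pcg(A, b)
print(x)
```

Each iteration needs only one matrix-vector product, so for a sparse system the cost per iteration scales with the number of nonzero entries rather than with the square of the number of unknowns.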
3.4 Creating the layers
Matrix Form of the Energy Function (1, 2)
𝑁𝑖: number of layers
𝑁𝑖𝑝: pixels in layer i
𝐶: 𝑁𝑖 × 𝑁𝑖𝑝
3.4 Creating the layers
Matrix Form of the Energy Function (3, 4)
𝐷𝑢: diagonal matrix built with 𝑢𝑖𝑝
𝐷𝑣: diagonal matrix built with 𝑣𝑖𝑝
• Use DeepLab-ResNet-101 as the feature extractor
3.5 Semantic Feature Vectors
• Train this network on the semantic segmentation task of the COCO-Stuff dataset
• Refine the feature map generated by this network to be well-aligned to image edges using the guided filter
• Use principal component analysis (PCA) to reduce the dimensionality to three
3.5 Semantic Feature Vectors
Fig. 8.
• Sparse eigendecomposition
MATLAB
640×480 image
• This step takes around 3 minutes
• Relaxed sparsification
Preconditioned conjugate gradient optimization(MATLAB)
50~80 iterations
This step takes around 30 seconds
• The run-time of our algorithm grows linearly with the number of pixels
3.6 Implementation Details
4. Experiment analysis
4.1 Spectral Matting & Semantic Segmentation
Fig. 9.
4. Experiment analysis
4.1 Spectral Matting & Semantic Segmentation
Fig. 10.
4. Experiment analysis
4.2 Natural Image Matting
PSPNet [Zhao et al. 2017] Mask R-CNN [He et al. 2017]
(f)
Fig. 11.
4. Experiment analysis
4.2 Natural Image Matting
Fig. 12.
4. Experiment analysis
4.3 Soft Color Segmentation
Fig. 13.
4. Experiment analysis
4.4 Using SSS for Image Editing
Fig. 14.
• Not optimized for speed
• One object may be divided into several layers
• Does not provide instance-aware semantic information
• Fails at the initial constrained sparsification step when the object colors are very similar
• Grouping of soft segments may fail due to unreliable semantic feature vectors around large transition regions
5. Limitations and Future Work
Fig. 15.
• Proposed a method that generates soft segments that correspond to semantically meaningful regions in the image
Fusing the high-level information with low-level image features fully automatically
• Showed that soft segments with semantic boundaries can be revealed by spectral analysis of the constructed Laplacian matrix.
• The proposed relaxed sparsification method for the soft segments can generate accurate soft transitions while also providing a sparse set of layers.
6. Conclusion