Upload: erin-mccormick

Post on 11-Jan-2016

Sparse Representation, Building Dictionaries, and Church Street

Lily Chan

                             

          

                             

          

                             

[Figure: fully sampled; 6X undersampled; 6X undersampled with CS reconstruction]


Overview

A. Basic Compressed Sensing Theory

B. Building Good Dictionaries

C1. Background Subtraction

C2. Estimating Crowd Size


A. Basic Compressed Sensing Theory

Compressed Sensing

Concepts from multiple academic disciplines

– Linear Algebra and Systems
– Statistics and Probability
– Signals and Systems
– Computer Science
– Mathematics

Motivations for CS

– Faster sampling
– Larger dynamic range
– Higher-dimensional data
– Lower energy consumption
– New sensing modalities

Applications

– Photography
– Infrared Cameras
– Facial Recognition
– Pediatric MRI (time reduced by ~10x)
– Etc.

Compressed Sensing

• Compressive sensing is based on the observation that many real-world signals and images are either sparse themselves or sparse in some basis or frame (i.e., compressible).

• It acquires and reconstructs signals using a mathematical theory focused on measuring finite-dimensional signals in R^n.

• It enables data to be sensed directly in compressed form (at a lower sampling rate than the traditional Nyquist rate), providing a sparse or compressible representation for signals.


Compressed Sensing

In CS we seek to recover an n x 1 vector x from m measurements y, with m << n, given an m x n dictionary A:

y = Ax
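Because m << n, the system y = Ax is underdetermined and generally has many solutions; sparsity is the extra assumption used to pick one. A minimal Python sketch of this, using a made-up 2 x 4 dictionary (the matrix and both vectors are illustrative assumptions, not taken from the slides):

```python
def matvec(A, x):
    """Multiply matrix A (a list of rows) by vector x."""
    return [sum(a * b for a, b in zip(row, x)) for row in A]


def count_nonzero(x):
    """Number of nonzero entries in x."""
    return sum(1 for v in x if v != 0)


# Hypothetical dictionary: m = 2 measurements, n = 4 unknowns.
A = [
    [1, 0, 1, 1],
    [0, 1, 1, -1],
]

x_sparse = [0, 0, 2, 0]  # one nonzero entry
x_dense = [2, 2, 0, 0]   # two nonzero entries

y_sparse = matvec(A, x_sparse)
y_dense = matvec(A, x_dense)

# Both candidates explain the same measurements.
print(y_sparse, count_nonzero(x_sparse))
print(y_dense, count_nonzero(x_dense))
```

Both vectors produce the same y, so the measurements alone cannot tell them apart; preferring the candidate with fewer nonzero entries is what selects x_sparse.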


L0 Minimization (L0 Norm)

• The L0 norm returns the number of nonzero elements in each potential solution.

• Finding the sparsest solution (the one with the fewest nonzero elements) by minimizing the L0 norm gives exactly the result we want for our system.

• Though this method sounds straightforward, it is very expensive: it requires examining all possible arrangements of the k nonzero elements of the signal.

» Very impractical
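To get a feel for why the exhaustive search is impractical, the number of possible arrangements of k nonzero elements among n positions is the binomial coefficient "n choose k". A small Python sketch (the values of n and k are arbitrary illustrations, not from the slides):

```python
import math


def num_supports(n, k):
    """Number of ways to place k nonzero entries among n positions."""
    return math.comb(n, k)


# Each candidate support would need its own small solve in a brute-force search.
for n, k in [(4, 2), (100, 5), (1000, 10)]:
    print(f"n={n}, k={k}: {num_supports(n, k)} candidate supports")
```

The count grows so quickly with n and k that checking every arrangement is out of reach for signals of realistic size.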


Sparsity and the L1 Norm

• Norms measure the strength of a signal (or the size of the error / residual of the system).

• Goal: Obtain the sparsest approximation of a point x in 2-D space by a point in a 1-D subspace A.

• That is, find the x* in A that minimizes ||x - x*||_p, the approximation error measured in an Lp norm.

• The larger p is, the more evenly the error is spread between the two coefficients.

• After L0, L1 is the most practical choice for obtaining a sparse approximation.
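The effect of p on the approximation error can be checked numerically. The sketch below approximates a 2-D point by a point on a line through the origin and compares p = 1, 2, and infinity. The point, the line direction, and the simple grid search are all illustrative assumptions, not the method used in the slides:

```python
import math


def p_norm(v, p):
    """Lp norm of a vector; p may be math.inf."""
    if p == math.inf:
        return max(abs(c) for c in v)
    return sum(abs(c) ** p for c in v) ** (1.0 / p)


def best_on_line(x, d, p, t_max=3.0, steps=3000):
    """Grid-search t in [0, t_max] so that t*d is closest to x in the Lp norm."""
    def error(t):
        return [xi - t * di for xi, di in zip(x, d)]

    best_i = min(range(steps + 1),
                 key=lambda i: p_norm(error(i * t_max / steps), p))
    t = best_i * t_max / steps
    return t, error(t)


x = (1.0, 2.0)   # hypothetical point to approximate
d = (2.0, 1.0)   # hypothetical direction spanning the 1-D subspace

for p in (1, 2, math.inf):
    t, err = best_on_line(x, d, p)
    print(f"p={p}: t={t:.3f}, error={err}")
```

For this particular point and line, the L1 fit leaves one error coefficient at zero (a sparse error), while the L-infinity fit splits the error equally between the two coefficients, with L2 in between.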


CS Software Available

Open source software is now available for many applications of different CS methods.

– Most of this software is written in C/C++ and Matlab.
– L1-magic is a popular Matlab-based collection of CS algorithms based on standard interior-point methods.
– Other available software includes:

• NESTA
• TFOCS
• SURE for Matrix Estimation
• CurveLab
• ChirpLab
• SPARCO
• TWIST
• SparseLab
• etc.


SMALLbox

• SMALLbox: Sparse Models, Algorithms and Learning for Large-scale data

• Purpose: To provide a unifying interface that enables an easy way of comparing dictionary learning algorithms through an API that enables interoperability between existing toolboxes.

• Current Functional Examples

– Image Denoising (with comparisons of different algorithms)
– Automatic Music Transcription
– Representation of image with patches from another one (Pierre Villars)
– Incoherent Dictionary Learning

• Download SMALLbox at: https://code.soundsoftware.ac.uk/projects/smallbox


Image Denoising Example

Denoising Problem:

Given N noisy measurements,

y = Ax + v,

where v is noise, build the dictionary A and recover x.

[Figure: RLSDLA Denoised Image, PSNR = 32.38 dB, Time = 7.60 s; KSVD Denoised Image, PSNR = 32.35 dB, Time = 8.24 s]
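The denoised images are scored by PSNR (peak signal-to-noise ratio). The slides do not show how it was computed; the sketch below uses the standard definition, 10 * log10(peak^2 / MSE), and assumes 8-bit pixel values with a peak of 255. The pixel lists are made-up illustrations:

```python
import math


def psnr(reference, test, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists."""
    if len(reference) != len(test):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


clean = [52, 55, 61, 59]   # hypothetical reference pixels
noisy = [54, 50, 63, 58]   # hypothetical noisy pixels
print(f"PSNR = {psnr(clean, noisy):.2f} dB")
```

A higher PSNR means the test image is closer to the reference.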


Denoising Flow Chart

[Flow chart boxes: Generate initial dictionary A; Denoise by orthogonal pursuit (Patch Denoising); Update Dictionary A (Dictionary Learning); Image reconstruction]

Note: The dictionary update stage is done one atom (column) at a time. Data samples that do not use the atom are held fixed.
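The patch-denoising step represents each patch with a few dictionary atoms. The flow chart names orthogonal pursuit for this; as a simpler stand-in, the sketch below uses plain matching pursuit, which skips the least-squares re-fit that the orthogonal variant performs after each atom is selected. The dictionary and patch are hypothetical, and the atoms are assumed to have unit norm:

```python
import math


def dot(u, v):
    """Inner product of two vectors."""
    return sum(a * b for a, b in zip(u, v))


def matching_pursuit(atoms, y, n_iter):
    """Greedy sparse coding of y over unit-norm atoms (plain matching pursuit)."""
    coeffs = [0.0] * len(atoms)
    residual = list(y)
    for _ in range(n_iter):
        # Pick the atom most correlated with the current residual.
        corr = [dot(a, residual) for a in atoms]
        j = max(range(len(atoms)), key=lambda i: abs(corr[i]))
        # Add its contribution and remove it from the residual.
        coeffs[j] += corr[j]
        residual = [r - corr[j] * a for r, a in zip(residual, atoms[j])]
    return coeffs, residual


s = 1.0 / math.sqrt(2.0)
atoms = [(1.0, 0.0), (0.0, 1.0), (s, s)]  # hypothetical dictionary columns
patch = (3.0, 3.0)                        # hypothetical signal to encode

coeffs, residual = matching_pursuit(atoms, patch, n_iter=1)
print(coeffs, residual)
```

Here a single atom (the diagonal one) explains the whole patch, so one iteration leaves an essentially zero residual and only one nonzero coefficient.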


Image Denoising Results: KSVD vs. RLSDLA

[Figure panels: Original; Noisy Image, PSNR = 22.23 dB; RLSDLA Denoised Image, PSNR = 32.38 dB, Time = 7.60 s; KSVD Denoised Image, PSNR = 32.35 dB, Time = 8.24 s; RLSDLA Dictionary; KSVD Dictionary]


C1. Background Subtraction


Background Subtraction

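Under rather weak assumptions, the Principal Component Pursuit (PCP) estimate solving

minimize ||L||_* + λ ||S||_1 subject to L + S = M

exactly recovers the low-rank L0 and the sparse S0, where M = L0 + S0 is the observed matrix, ||L||_* is the nuclear norm (the sum of the singular values of L), and ||S||_1 is the sum of the absolute values of the entries of S.*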

* Candès  E., X. Li, Y. Ma, and J. Wright, “Robust Principal Component Analysis”, Journal of the ACM, volume 58, no. 3, May 2011.


Background Subtraction

• If we stack the video frames as columns of a matrix M, then the low-rank component L0 naturally corresponds to the stationary background and the sparse component S0 captures the moving objects in the foreground.

• Foreground objects, such as cars or pedestrians, generally occupy only a fraction of the image pixels and hence can be treated as sparse errors.
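Stacking frames as columns of M can be sketched as follows. This is a minimal pure-Python illustration with tiny made-up 2 x 2 grayscale frames; the function name and data are assumptions, and the project itself worked in Matlab:

```python
def frames_to_matrix(frames):
    """Stack frames (each a list of pixel rows) as the columns of M."""
    # Flatten every frame into one long column of pixels.
    columns = [[pixel for row in frame for pixel in row] for frame in frames]
    n_pixels = len(columns[0])
    if any(len(col) != n_pixels for col in columns):
        raise ValueError("all frames must have the same size")
    # M[i][j] is pixel i of frame j.
    return [[col[i] for col in columns] for i in range(n_pixels)]


# Three hypothetical frames: a constant background (10) with one
# bright "moving object" pixel (99) in frames 2 and 3.
frames = [
    [[10, 10], [10, 10]],
    [[10, 99], [10, 10]],
    [[10, 10], [10, 99]],
]

M = frames_to_matrix(frames)
for row in M:
    print(row)
```

In this toy M, the constant value 10 repeated across columns plays the role of the low-rank background, and the isolated 99 entries play the role of the sparse foreground.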


Background Subtraction

An augmented Lagrange multiplier (ALM) algorithm is used in the TFOCS toolbox to solve the convex PCP problem.*

* Candès E., X. Li, Y. Ma, and J. Wright, “Robust Principal Component Analysis”, Journal of the ACM, volume 58, no. 3, May 2011.
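In the ALM iteration described in the cited paper, the sparse component is updated by entrywise soft thresholding (shrinkage), which pushes small entries to exactly zero. The sketch below shows only that one operator in plain Python; it is not the TFOCS code, and the input matrix and threshold are made-up values:

```python
import math


def soft_threshold(x, tau):
    """Shrink x toward zero by tau; values within [-tau, tau] become zero."""
    return math.copysign(max(abs(x) - tau, 0.0), x)


def shrink_matrix(X, tau):
    """Apply soft thresholding to every entry of a matrix (list of rows)."""
    return [[soft_threshold(v, tau) for v in row] for row in X]


# Hypothetical residual: small entries are treated as noise and zeroed,
# large entries survive (reduced by tau) as candidate foreground.
residual = [[0.2, -3.0], [5.0, -0.4]]
S = shrink_matrix(residual, 1.0)
print(S)
```

Entries smaller than the threshold in magnitude vanish, which is what keeps the estimated foreground sparse.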


Background Subtraction

• ALM achieves much higher accuracy than APG (Accelerated Proximal Gradient), in fewer iterations. *

• It works stably across a wide range of problem settings with no tuning of parameters. *

• ALM has an appealing (empirical) property: the rank of the iterates often remains bounded by rank(L0) throughout the optimization, allowing them to be computed especially efficiently. APG, on the other hand, does not have this property. *

* Candès  E., X. Li, Y. Ma, and J. Wright, “Robust Principal Component Analysis”, Journal of the ACM, volume 58, no. 3, May 2011.


C2. Estimating Crowd Size Using Background Subtraction

• Objective: Estimate the number of objects passing through a video

• Video Locations
– UVM Davis Center
– Church Street Marketplace

• Total Video Time Analyzed: 119 minutes

• Total Actual Objects in all videos analyzed: 2638

• Concepts Used
– Compressed Sensing
  • Dictionary Learning
  • Background Subtraction
– Kalman Filters for object tracking

• Toolboxes
– TFOCS (Templates for First-Order Conic Solvers)
– Computer Vision System Toolbox from MATLAB


Estimating Crowd Size Using Background Subtraction

[Figure: Automatic Object Counter. Bar chart of Percent Accurate (0 to 100) per Video (UVM_1, UVM_3, UVM_4, UVM_5, UVM_6, UVM_7, L1020306, L1020307), comparing "Analysis without BS" and "Analysis with BS".]


Estimating Crowd Size Using Background Subtraction

• Background subtraction significantly increases counting accuracy in videos with background objects that are constantly moving:

– Natural environments with unpredictable factors
– Trees
– Escalators

• If an object (or a group of objects) enters the video but stops moving, the algorithm will count it as part of the background after a few frames. When it starts moving again, it is counted as a new object.

• If a group of people walk at the same pace and travel in a tight pack, the current program will consider them one big object travelling through the video.

• Tracking accuracy is greatly improved when there are fewer inanimate objects in the video that could occlude the moving objects.

• As of yet, no commercially available technology counts large crowds with reliable accuracy.


Estimating Crowd Size Using Background Subtraction: How to Run the Program

The current automatic object counter is designed to analyze a folder of videos and output a comma-separated value file with the name of each video and the count from the analysis.

Steps:

1) Install the TFOCS toolbox onto your computer: http://cvxr.com/tfocs/
2) Run AutoObjectCounter.m
3) Choose the folder to be analyzed
4) The analysis takes about 1 minute per second of video analyzed for .avi formatted videos.
5) Once the analysis is complete, a VidCountRslts.csv file will be in the folder from step 3 containing the names of the videos in the folder with the corresponding counts of each video.
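The rate in step 4 makes it easy to budget processing time before starting a run. A back-of-the-envelope sketch in Python, assuming the "about 1 minute per second of video" figure holds on your machine (the function name is hypothetical and not part of AutoObjectCounter.m):

```python
def estimated_processing_minutes(video_seconds, minutes_per_video_second=1.0):
    """Rough processing time, using the rate quoted in step 4."""
    return video_seconds * minutes_per_video_second


clip_minutes = estimated_processing_minutes(20)           # one 20-second clip
total_minutes = estimated_processing_minutes(119 * 60)    # 119 minutes of video
print(f"20 s clip: about {clip_minutes:.0f} minutes")
print(f"119 min of video: about {total_minutes / 60:.0f} hours")
```

At that rate a 20-second clip takes roughly 20 minutes, and the 119 minutes of video analyzed in this project correspond to roughly 119 hours of processing.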


Crowd Estimation Lessons Learned

• Video accuracy is best when the video taken is stable, hence a tripod is highly recommended.

• Taking video with a digital camera that outputs .avi format takes less memory, processes faster, and is easier to convert than video from an iPhone with .mov format output.

• Ensure the computer being used for processing has at least 8GB of RAM.

• Video segments longer than about 25 seconds may crash Matlab and your OS, depending on the processing power of the computer.

• Recommended video segment time is between 10 and 20 seconds.

• Shorter video segments allow for easier manual counting of moving objects.

• Talk to the mall administrators before taking videos inside the Church Street mall; otherwise the mall police will kick you out. Be discreet about taking videos, as some people may become aggressive if they find you recording them.
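The segment-length recommendation above can be applied when planning how to cut a longer recording. The helper below is a hypothetical sketch that only computes cut points in seconds (it does not read or write video files), splitting a recording into equal pieces no longer than 20 seconds:

```python
import math


def split_into_segments(duration_s, max_len_s=20.0):
    """Split [0, duration_s] into equal segments no longer than max_len_s."""
    if duration_s <= 0:
        return []
    n = math.ceil(duration_s / max_len_s)
    length = duration_s / n
    # Each tuple is (start, end) in seconds.
    return [(i * length, (i + 1) * length) for i in range(n)]


segments = split_into_segments(65)  # a hypothetical 65-second recording
for start, end in segments:
    print(f"{start:.2f} s to {end:.2f} s")
```

Using equal pieces avoids a very short leftover segment at the end; a recording shorter than 20 seconds comes back as a single segment.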


Future Improvements

• Coding Efficiency
– Improving the Matlab code for efficiency would save computing time and potentially allow for longer video segments without crashing the computer or requiring large amounts of processing power.

• Integrate Feature Recognition
– The tracking of people would be more accurate for crowds if feature recognition were integrated to enable tracking of individual people instead of blobs.

• Frame to Frame Shading Stabilization
– Stabilization of background color and shading of the video from frame to frame would eliminate false counts.


References

Papers

• Candès E., “Compressive Sampling,” Proceedings of International Congress of Mathematicians, Madrid, Spain, 2006.

• Fornasier M., and Rauhut H., “Compressive sensing,” Handbook of Mathematical Methods in Imaging, Springer, Heidelberg, Germany, 2011.

• J. Wright, et al., “Robust Face Recognition via Sparse Representation”, IEEE TRANS. PAMI, Mar 2006.

• M.A. Davenport, M.F. Duarte, Y.C. Eldar, and G. Kutyniok, “Introduction to Compressed Sensing,” Compressed Sensing: Theory and Applications, Cambridge University Press, 2012.

• D. Barchiesi and M. Plumbley, “Learning Incoherent Dictionaries for Sparse Approximation Using Iterative Projections and Rotations,” IEEE Trans. Signal Process., vol. 61, no. 8, pp. 2065, Apr. 2013.

• Y. Zhang, “Theory of Compressive Sensing via L1 Minimization: A Non-Rip Analysis and Extensions,” Rice University, Houston, TX, Tech. Rep., 2008.

• I. Ram, M. Elad, and I. Cohen, “The RTBWT Frame – Theory and Use for Images”, working draft to be submitted soon.


• Z. Lin, M. Chen, L. Wu, and Y. Ma, “The augmented Lagrange multiplier method for exact recovery of corrupted low-rank matrices,” Mathematical Programming, submitted, 2009.

• D.L. Donoho, “Compressed Sensing,” IEEE Trans. Info. Theory, vol. 52, no. 4, pp. 1289–1306, 2006.

• I. Ram, M. Elad, and I. Cohen, “Image Processing using Smooth Ordering of its Patches”, to appear in IEEE Transactions on Image Processing.

• M. Elad, “Sparse and Redundant Representation Modeling — What Next?”, IEEE Signal Processing Letters, Vol. 19, No. 12, Pages 922-928, December 2012.

• A.M. Bruckstein, D.L. Donoho, and M. Elad, “From Sparse Solutions of Systems of Equations to Sparse Modeling of Signals and Images”, SIAM Review, Vol. 51, No. 1, Pages 34-81, February 2009.

• Candès  E., X. Li, Y. Ma, and J. Wright, “Robust Principal Component Analysis”, Journal of the ACM, volume 58, no. 3, May 2011.


Resources

• Compressed Sensing Audio Demonstration: http://sunbeam.ece.wisc.edu/csaudio/

• SMALLbox: https://code.soundsoftware.ac.uk/projects/smallbox

• Compressed Sensing Video Lectures
– Low-rank modeling: http://videolectures.net/mlss2011_candes_lowrank/
– Matrix Completion via Convex Optimization: Theory and Algorithms: http://videolectures.net/mlss09us_candes_mccota/
– An Overview of Compressed Sensing and Sparse Signal Recovery via L1 minimization: http://videolectures.net/mlss09us_candes_ocsssrl1m/
– L1 Minimization: http://videolectures.net/nips09_bach_smm/
– Basics of probability and statistics for Machine Learning: http://videolectures.net/bootcamp07_keller_bss/

• Least Squares Estimates: http://www.khanacademy.org

• Compressive Sensing Resources: http://dsp.rice.edu/cs

• TFOCS Toolbox: http://cvxr.com/tfocs/

• Computer Vision System Toolbox from MATLAB: http://www.mathworks.com/products/computer-vision/
