Machine Learning and Quantum Computing: A Look at Quantum Support Vector Machines
DESCRIPTION
Support vector machines on quantum computers can be exponentially faster. We take a look at the connections between machine learning and quantum computing, trying to understand learning in a quantum context. We focus on least squares support vector machines.
TRANSCRIPT
Machine Learning and Quantum Computing: A Look at Quantum Support Vector Machines
Seminar at the Centre for Quantum Technologies
Peter Wittek
University of Borås
September 19, 2013
Outline: Machine Learning, Quantum Computing and Machine Learning, Support Vector Machines, Quantum SVMs, Conclusions
What Machine Learning Is Not
It is not statistics: data-driven, but with strict assumptions on the underlying distributions.
It is not AI: model-driven; uncertainty is addressed.
It is not data mining, although there is considerable overlap.
Peter Wittek Quantum Support Vector Machines
What Machine Learning Should Be About
Data-driven.
Looking for patterns: classes, groups of similar objects.
Mainly quantitative, but can also be qualitative.
Robust, tolerates noise.
Generalizes well beyond the training data.
Characteristics
A loose collection of algorithms with no common ground.
Few assumptions; parameters can be a major obstacle.
Computationally intensive.
Not easy to parallelize: N:N access patterns are common, or N:K through a proxy.
Nature-Inspired Methods
Many nature-inspired methods fall under computational intelligence: neural networks, flocking algorithms, genetic algorithms, chemical reactions, etc.
There are also methods inspired by quantum mechanics.
Others: manifold learning, density-based clustering, support vector machines, etc.
Learning Approach
Supervised: biomedical (recognizing cancer cells), recognizing handwriting, spam detection.
Unsupervised: recommendation engines, finding groups of similar patents, identifying trends in a dynamic environment.
Ensembles
High-Performance Machine Learning
Petabytes of data: sparse, noisy, possibly with missing elements.
There should be as few assumptions as possible.
Large scale may not entail a need for quick learning methods.
Examples: Blood Pressure Monitoring
A simple, SVM-based pipeline achieved 5% accuracy, using a cell phone camera.
(Figure: coefficients over time for three measurements. (a) Systolic blood pressure of 92 mm Hg; (b) systolic blood pressure of 107 mm Hg; (c) systolic blood pressure of 127 mm Hg.)
Examples: Self-organizing Maps
Main Research Directions
Learning a unitary transformation.
Adiabatic quantum computing.
Other methods.
Learning a Unitary Transformation
A black-box approach to learning: observe inputs and outputs, and learn the mapping function.
A form of quantum process tomography: an unknown function corresponds to an unknown quantum channel.
Adiabatic Quantum Computing
Find the global minimum of a given function f : {0,1}ⁿ → (0,∞), where min_x f(x) = f₀ and f(x) = f₀ iff x = x₀.
Consider the Hamiltonian H₁ = ∑_{x∈{0,1}ⁿ} f(x)|x〉〈x|. Its ground state is |x₀〉.
To find this ground state, consider the Hamiltonian H(λ) = (1 − λ)H₀ + λH₁.
Already demonstrated: search engine ranking and binary classification.
Allows a nonconvex loss function.
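The interpolation H(λ) can be checked classically on a toy problem. The sketch below is an illustration, not from the talk: it diagonalizes H(λ) for an assumed two-bit cost function f with numpy, using a transverse-field mixer as H₀ (any H₀ with a known ground state would do).

```python
import numpy as np

# Toy check of the adiabatic idea: H(lam) = (1 - lam) * H0 + lam * H1.
# H1 is diagonal with f(x) on the diagonal, so its ground state is |x0>.
f = np.array([3.0, 1.0, 4.0, 0.5])   # f(00), f(01), f(10), f(11); minimum at x0 = 11
H1 = np.diag(f)

# H0: transverse-field mixer; its ground state is the uniform superposition.
X = np.array([[0.0, 1.0], [1.0, 0.0]])
I2 = np.eye(2)
H0 = -(np.kron(X, I2) + np.kron(I2, X))

for lam in (0.0, 0.5, 1.0):
    H = (1 - lam) * H0 + lam * H1
    vals, vecs = np.linalg.eigh(H)       # ascending eigenvalues
    probs = vecs[:, 0] ** 2              # ground-state weight on each basis state
    print(lam, np.argmax(probs))
# At lam = 1 the ground state is concentrated on x0 = 3 (binary 11).
```

On real hardware λ is swept slowly enough that the adiabatic theorem keeps the system in the instantaneous ground state; here we simply diagonalize each H(λ).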
Other Methods
Quantum Bayesian inference.
Pattern matching: matching an unknown state to a known target template state.
Quantum particle swarm optimization.
Support Vector Machines: Risk Minimization and Generalization
Training example set: {(x₁, y₁), ..., (x_M, y_M)}.
x_i ∈ Rᴺ are the data points; y_i ∈ {−1, 1} are the binary class labels.
Minimize
(1/2) uᵀu + C ∑_{i=1}^{M} ξ_i
subject to
y_i(uᵀx_i + b) ≥ 1 − ξ_i, ξ_i ≥ 0, i = 1, ..., M.
The output is a hyperplane classifier: f(x) = sgn(uᵀx + b).
Support vectors are the training data that lie on the margin.
Support Vector Machines: Nonlinear Embedding
A problem is made linearly separable by embedding it into a feature space with a nonlinear map φ. Only the constraints change:
y_i(uᵀφ(x_i) + b) ≥ 1 − ξ_i, ξ_i ≥ 0, i = 1, ..., M.
The decision function becomes f(x) = sgn(uᵀφ(x) + b).
Support Vector Machines: KKT Conditions and Dual Formulation
Introduce Lagrange multipliers. The partial derivatives in u, b, and ξ define a saddle point of the Lagrangian.
Maximize
∑_{i=1}^{M} α_i − (1/2) ∑_{i=1}^{M} ∑_{j=1}^{M} α_i y_i α_j y_j K(x_i, x_j)
subject to
∑_{i=1}^{M} α_i y_i = 0, α_i ∈ [0, C], i = 1, ..., M.
K(x_i, x_j) is the kernel function.
There is no need to know the embedding function φ.
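That the embedding φ is never needed explicitly can be shown with a small numeric check. This is an illustration, not from the talk; the quadratic kernel and its explicit feature map below are standard textbook choices.

```python
import numpy as np

# The kernel trick: for the homogeneous quadratic kernel K(x, y) = (x.y)^2
# in R^2, the explicit embedding is phi(x) = (x1^2, sqrt(2) x1 x2, x2^2),
# and K(x, y) = phi(x).phi(y) -- the Gram matrix never needs phi.

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 2))               # five 2-D training points

K = (X @ X.T) ** 2                        # kernel matrix, no embedding needed

def phi(x):
    return np.array([x[0]**2, np.sqrt(2) * x[0] * x[1], x[1]**2])

Phi = np.vstack([phi(x) for x in X])      # explicit feature map
assert np.allclose(K, Phi @ Phi.T)        # identical Gram matrices
```

For kernels such as the Gaussian RBF the implicit feature space is infinite-dimensional, so working with K(x_i, x_j) directly is the only practical option.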
Least Squares Support Vector Machines
Use the ℓ2 norm for the error terms in the objective.
Minimize
(1/2) uᵀu + (γ/2) ∑_{i=1}^{M} e_i²
subject to the equality constraints
y_i(uᵀφ(x_i) + b) = 1 − e_i, i = 1, ..., M.
We obtain the following least-squares problem:

( 0   1ᵀ        ) ( b )     ( 0 )
( 1   K + γ⁻¹I  ) ( α )  =  ( y )    (1)

The trade-off: sparsity is lost, since a nonzero error term e_i implies a nonzero α_i.
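System (1) is just an (M+1)x(M+1) linear system, so the classical baseline is a direct solve. A minimal sketch with numpy follows; the two Gaussian blobs and γ = 10 are assumptions for illustration, not data from the talk.

```python
import numpy as np

# Least-squares SVM, system (1):
#   [ 0   1^T           ] [ b     ]   [ 0 ]
#   [ 1   K + gamma^-1 I] [ alpha ] = [ y ]
# solved directly on toy two-class data.

rng = np.random.default_rng(1)
M, gamma = 40, 10.0
X = np.vstack([rng.normal(-1.5, 1.0, (M // 2, 2)),
               rng.normal(+1.5, 1.0, (M // 2, 2))])
y = np.r_[-np.ones(M // 2), np.ones(M // 2)]

K = X @ X.T                                # linear kernel
F = np.zeros((M + 1, M + 1))
F[0, 1:] = 1.0
F[1:, 0] = 1.0
F[1:, 1:] = K + np.eye(M) / gamma

sol = np.linalg.solve(F, np.r_[0.0, y])
b, alpha = sol[0], sol[1:]

pred = np.sign(K @ alpha + b)              # classify the training set
print((pred == y).mean())                  # training accuracy
```

The classifier is f(x) = sgn(∑_i α_i K(x_i, x) + b); here we evaluate it on the training set itself.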
The Outline of Quantum SVMs
Computing the kernel matrix: O(M²N).
Solving the least-squares formulation: O(M³).
The quantum variant: O(log(MN)).
Calculating the Gram Matrix
Generate two states, |ψ〉 and |φ〉, with an ancilla variable.
Estimate the parameter Z = |x_i|² + |x_j|², the sum of the squared norms of the two instances.
Perform a projective measurement on the ancilla alone, comparing the two states.
Calculating the Gram Matrix
We calculate the dot product in the linear kernel as x_iᵀx_j = (Z − |x_i − x_j|²)/2.
|ψ〉 = (1/√2)(|0〉|x_i〉 + |1〉|x_j〉), prepared from QRAM.
|φ〉 = (1/√Z)(|x_i| |0〉 − |x_j| |1〉) is created simultaneously with Z.
To get |φ〉 and Z, evolve (1/√2)(|0〉 − |1〉) ⊗ |0〉 with the Hamiltonian H = (|x_i| |0〉〈0| + |x_j| |1〉〈1|) ⊗ σ_x, then measure the ancilla bit.
Perform a swap test on |ψ〉 and |φ〉.
Overall complexity: O(ε⁻¹ log N).
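The dot-product identity and the swap-test success probability can both be verified classically. This is an illustration, not from the talk; the vectors below are arbitrary.

```python
import numpy as np

# Check the identity used for the linear kernel:
#   x_i . x_j = (Z - |x_i - x_j|^2) / 2,  with  Z = |x_i|^2 + |x_j|^2.

rng = np.random.default_rng(2)
xi, xj = rng.normal(size=4), rng.normal(size=4)

Z = xi @ xi + xj @ xj
dot_via_Z = (Z - np.sum((xi - xj) ** 2)) / 2
assert np.isclose(dot_via_Z, xi @ xj)

# The swap test measures the ancilla in |0> with probability
# 1/2 + |<psi|phi>|^2 / 2, from which the overlap is read off.
psi = np.array([1.0, 1.0]) / np.sqrt(2)
phi = np.array([0.6, 0.8])
p0 = 0.5 + 0.5 * abs(psi @ phi) ** 2
print(p0)
```

Repeating the swap test and estimating p₀ from the measurement statistics recovers |〈ψ|φ〉|², which is where the O(ε⁻¹) accuracy factor comes from.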
Solving the Linear Equation
Core ideas:
Quantum matrix inversion is fast.
Simulation of sparse matrices is efficient.
Non-sparse density matrices reveal the eigenstructure exponentially faster than classical algorithms.
Solving the Linear Equation
F = ( 0   1ᵀ        ) = J + K_γ,
    ( 1   K + γ⁻¹I  )

J = ( 0   1ᵀ )
    ( 1   0  ),

K_γ = ( 0   0         )
      ( 0   K + γ⁻¹I  ).
Solving the Linear Equation
1. We approximate the matrix exponential of F with the Baker-Campbell-Hausdorff formula:

e^{−iFΔt/tr K_γ} = e^{−iJΔt/tr K_γ} e^{−iγ⁻¹IΔt/tr K_γ} e^{−iKΔt/tr K_γ} + O(Δt²). (2)

2. We use quantum phase estimation on this exponential to obtain the eigenstructure.
The sparse matrix J and the constant multiple of the identity matrix are easy to simulate.
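The O(Δt²) error of the splitting in (2) can be checked numerically. The sketch below is an illustration with random Hermitian matrices, ignoring the tr K_γ normalization used in the talk; it confirms that shrinking Δt tenfold reduces the splitting error about a hundredfold.

```python
import numpy as np

# Lie-Trotter splitting error: for Hermitian J and Kg,
#   exp(-i(J + Kg) dt) = exp(-iJ dt) exp(-iKg dt) + O(dt^2).

def expmh(H, t):
    """exp(-i H t) for Hermitian H via eigendecomposition."""
    vals, vecs = np.linalg.eigh(H)
    return (vecs * np.exp(-1j * vals * t)) @ vecs.conj().T

rng = np.random.default_rng(3)
A = rng.normal(size=(4, 4)); J = (A + A.T) / 2     # random Hermitian "J"
B = rng.normal(size=(4, 4)); Kg = (B + B.T) / 2    # random Hermitian "K_gamma"
F = J + Kg

for dt in (0.1, 0.01):
    err = np.linalg.norm(expmh(F, dt) - expmh(J, dt) @ expmh(Kg, dt))
    print(dt, err)
# Error scales as dt^2: ten times smaller dt, roughly 100 times smaller error.
```

The same second-order behavior is what makes repeated short-time steps with the split exponentials an accurate simulation of e^{−iFt}.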
Solving the Linear Equation
The kernel matrix K is not sparse.
Quantum self analysis: given multiple copies of a density matrix ρ, perform e^{−iρt}. The state plays an active role in its own measurement: through exponentiation it functions as a Hamiltonian.
K is a normalized Hermitian matrix, which makes it a prime candidate for quantum self analysis.
The exponentiation is done in O(log N).
Open Questions
The kernel function is restricted.
What about sparse data?
O(log(MN)) states are required. The model is not sparse, so we do not overcome this limitation of least squares SVMs.
Summary
Machine learning algorithms are diverse.
Growing data sets need both faster execution and a better fit to unseen instances.
Quantum approaches can help:
Nonconvex loss functions, nonclassical correlations, ...
Exponential speedup.