Machine Learning for NLP
Lecture 2: Basic linear algebra and optimization
UNIVERSITY OF
GOTHENBURG
Richard Johansson
September 4, 2015
math in machine learning
- machine learning is a "mathy" subject...
- the most important branches of mathematics used in ML:
  - probability and statistical theory
  - linear algebra
  - optimization
- in this lecture, we'll have a look at the latter two
overview
basic linear algebra and its implementation in Python
basic optimization
recap: mapping features to numerical vectors
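- we convert symbolic features to numbers when we use scikit-learn's classifiers
from sklearn.feature_extraction import DictVectorizer
# X: a list of attribute-value dicts, one per instance
vec = DictVectorizer()
Xe = vec.fit_transform(X)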
types of vectorizers
- a DictVectorizer converts from attribute-value dicts
- a CountVectorizer converts from texts (after applying a tokenizer) or lists
- a TfidfVectorizer is like a CountVectorizer, but also uses TF*IDF
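The idea behind the first two vectorizers can be sketched in plain Python. This is a hand-rolled illustration under simplifying assumptions (string-valued attributes, whitespace tokenization), not scikit-learn's implementation; the function names `fit_dict_vocabulary`, `dict_to_vector` and `text_to_counts` and the example data are hypothetical.

```python
# Hand-rolled sketch of dict-based and count-based vectorization.
# It mimics the idea only; it is not how scikit-learn is implemented.

def fit_dict_vocabulary(dicts):
    """Assign a column index to every attribute=value pair seen in the data."""
    names = sorted({f"{key}={value}" for d in dicts for key, value in d.items()})
    return {name: idx for idx, name in enumerate(names)}

def dict_to_vector(d, vocabulary):
    """One-hot encode a single attribute-value dict."""
    vector = [0] * len(vocabulary)
    for key, value in d.items():
        name = f"{key}={value}"
        if name in vocabulary:              # unseen pairs are ignored
            vector[vocabulary[name]] = 1
    return vector

def text_to_counts(text, vocabulary):
    """Count how often each vocabulary word occurs (whitespace tokenizer)."""
    vector = [0] * len(vocabulary)
    for token in text.lower().split():
        if token in vocabulary:
            vector[vocabulary[token]] += 1
    return vector

# Hypothetical example data.
instances = [{"word": "prices", "pos": "NOUN"}, {"word": "fall", "pos": "VERB"}]
dict_vocab = fit_dict_vocabulary(instances)
dict_vectors = [dict_to_vector(d, dict_vocab) for d in instances]

word_vocab = {"fall": 0, "prices": 1}
count_vector = text_to_counts("Prices fall and prices fall", word_vocab)
```

Each distinct attribute=value pair (or word) gets its own column, which is why the resulting vector spaces become very high-dimensional for real data.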
vectors
- a tuple consisting of n numbers is called a vector
- the set of all possible tuples of length n is called an n-dimensional vector space
- for instance: (1, 2) is a 2-dimensional vector
- they can be interpreted geometrically, either as a point in a coordinate system
[Figure: a point in a 2-dimensional coordinate system, with axis ticks at 1 and 2]
- ... or as a direction (e.g. of motion or force)
basic linear algebra
the basic operations on vectors:
- scaling: $\alpha \cdot v = \alpha \cdot (v_1, \ldots, v_n) = (\alpha \cdot v_1, \ldots, \alpha \cdot v_n)$
- addition and subtraction: $v + w = (v_1, \ldots, v_n) + (w_1, \ldots, w_n) = (v_1 + w_1, \ldots, v_n + w_n)$
- scalar product or dot product: $v \cdot w = (v_1, \ldots, v_n) \cdot (w_1, \ldots, w_n) = v_1 \cdot w_1 + \ldots + v_n \cdot w_n$
- vector length or norm: $|v| = |(v_1, \ldots, v_n)| = \sqrt{v_1 \cdot v_1 + \ldots + v_n \cdot v_n} = \sqrt{v \cdot v}$
examples: basic linear algebra
- $0.5 \cdot (1, 0, 0, 1) = (0.5, 0, 0, 0.5)$
- $(1, 0, 0, 1) + (0, 0, 1, 1) = (1, 0, 1, 2)$
- $(1, 0, 0, 1) \cdot (0, 0, 1, 1) = 1 \cdot 0 + 0 \cdot 0 + 0 \cdot 1 + 1 \cdot 1 = 1$
- $|(1, 0, 0, 1)| = \sqrt{1 \cdot 1 + 0 \cdot 0 + 0 \cdot 0 + 1 \cdot 1} = \sqrt{2}$
simple linear algebra implementation
- naively, we could implement the basic vector operations in Python:
    import math
    def scale(a, v): return [a*vk for vk in v]
    def vsum(v, w): return [vk+wk for (vk, wk) in zip(v, w)]
    def dot(v, w): return sum(vk*wk for (vk, wk) in zip(v, w))
    def vlength(v): return math.sqrt(dot(v, v))
- however, this is inefficient if the dimension of the vector space is high
linear algebra implementation: better
- NumPy and SciPy are Python libraries containing many mathematical functions
  - they are interlinked and typically installed together
  - scikit-learn relies on both of them
- they use specialized math libraries to make computations faster
  - e.g. BLAS for your processor or graphics card
- example with a 100-million-dimensional random vector:
  - my simple function dot(v, v) takes 81 seconds
  - numpy.dot(v, v) takes 0.15 seconds
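A comparison of this kind could be reproduced roughly as follows. This is only a sketch: the vector size `n` is an assumption, chosen much smaller than the 100 million entries mentioned above so that it runs quickly, and the measured times will vary with the machine and the NumPy build.

```python
import random
import time

import numpy

def dot(v, w):
    """Naive pure-Python dot product over two sequences."""
    return sum(vk * wk for vk, wk in zip(v, w))

n = 200_000                        # assumed size, far below 100 million
v_list = [random.random() for _ in range(n)]
v_array = numpy.array(v_list)      # the same numbers stored as a NumPy array

start = time.perf_counter()
naive_result = dot(v_list, v_list)
naive_seconds = time.perf_counter() - start

start = time.perf_counter()
numpy_result = numpy.dot(v_array, v_array)
numpy_seconds = time.perf_counter() - start

print(f"naive: {naive_seconds:.4f} s, numpy: {numpy_seconds:.4f} s")
```

Both calls compute the same number up to rounding; only the time spent differs.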
NumPy linear algebra examples
>>> import numpy
>>> v1 = numpy.array([1, 0, 0, 1, 0])
>>> v2 = numpy.array([0, 2, 1, -2, 1])
>>> v1
array([1, 0, 0, 1, 0])
>>> v2
array([ 0, 2, 1, -2, 1])
>>> v1 + v2
array([ 1, 2, 1, -1, 1])
>>> 100 * v1
array([100, 0, 0, 100, 0])
>>> numpy.dot(v1, v2)
-2
>>> v1.dot(v2)
-2
>>> numpy.linalg.norm(v1)
1.4142135623730951
sparse vectors
- in NLP, feature vectors are a bit peculiar compared to some other fields (e.g. speech and image processing):
  - the vector spaces often have a very high dimension
  - in each feature vector, most of the entries are zero
  - ["prices", "fall"] → (0, 1, 0, ..., 0, 1, 0, ..., 0, 0, 0)
- sparse vector: keep track of non-zero entries only, as (index, value) pairs: [(2, 1), (10, 1)]
- in some cases, this saves memory and is much faster
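A minimal sketch of how such a list of (index, value) pairs can be used directly, without ever building the full vector. The helper names `sparse_dot` and `to_dense`, the second example vector and the dimension 13 are assumptions for illustration; indices are treated as 1-based, matching the example above.

```python
def sparse_dot(v, w):
    """Dot product of two sparse vectors given as lists of (index, value) pairs."""
    w_lookup = dict(w)                        # index -> value for the second vector
    return sum(value * w_lookup.get(index, 0) for index, value in v)

def to_dense(sparse, dimension):
    """Expand a sparse vector into a plain list (indices are 1-based here)."""
    dense = [0] * dimension
    for index, value in sparse:
        dense[index - 1] = value
    return dense

prices_fall = [(2, 1), (10, 1)]               # the sparse example vector
prices_rise = [(2, 1), (7, 1)]                # hypothetical second vector
shared = sparse_dot(prices_fall, prices_rise)
dense = to_dense(prices_fall, 13)             # 13 is an assumed dimension
```

The work done by `sparse_dot` depends on the number of non-zero entries, not on the dimension of the vector space.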
sparse vectors in Python
- SciPy includes several different types of sparse matrices, which are also used to represent sparse vectors
- in scikit-learn, DictVectorizer and CountVectorizer create sparse matrices of the class csr_matrix
- more on this when we discuss classifier implementation
- see also http://docs.scipy.org/doc/scipy/reference/sparse.html
matrices
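- a matrix is a 2-dimensional array of numbers: a "list of lists"
  $\begin{bmatrix} 1 & 2 & 0 \\ -2 & 1 & 0 \end{bmatrix}$
- note that a vector can be seen as a special case of a matrix: a row or a column
  $\begin{bmatrix} -2 & 1 & 0 \end{bmatrix} \qquad \begin{bmatrix} -2 \\ 1 \\ 0 \end{bmatrix}$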
reasons for using matrices
- matrices have a geometric interpretation, as we'll see in a moment
- however, in this context we mainly care about them to speed up our programs
  - we can see matrices as collections of vectors
  - in Python, it's more efficient to carry out a small number of operations on large matrices than on many small vectors
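As a small illustration of this point, the sketch below scores several feature vectors against one weight vector, first with one dot product per vector and then with a single matrix operation. The data `X` and `w` are hypothetical; with only three rows the speed difference is not measurable, but the two approaches give the same numbers.

```python
import numpy

# Hypothetical data: three feature vectors (the rows of X) and a weight vector w.
X = numpy.array([[1, 0, 0, 1],
                 [0, 0, 1, 1],
                 [1, 1, 0, 0]])
w = numpy.array([0.5, -1.0, 2.0, 1.0])

# Many small operations: one dot product per row, driven by a Python loop.
scores_loop = [float(numpy.dot(row, w)) for row in X]

# One large operation: a single matrix-vector product.
scores_matrix = X.dot(w)
```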
basic matrix operations
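the basic elementwise operations on matrices, similar to what we did for the vectors:
- scaling: multiply all the cells by some number
  $10 \cdot \begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} = \begin{bmatrix} 10 & 20 \\ 30 & 40 \end{bmatrix}$
- addition / subtraction:
  $\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} + \begin{bmatrix} 10 & 20 \\ 30 & 40 \end{bmatrix} = \begin{bmatrix} 11 & 22 \\ 33 & 44 \end{bmatrix}$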
matrix multiplication
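- matrix multiplication is an extension of the dot product for vectors
- each cell in the new matrix is computed as the dot product between a row and a column:
  $\begin{bmatrix} 1 & 2 \\ 3 & 4 \end{bmatrix} \cdot \begin{bmatrix} 10 & 20 \\ 30 & 40 \end{bmatrix} = \begin{bmatrix} 70 & 100 \\ 150 & 220 \end{bmatrix}$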
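The row-times-column rule can be spelled out in a few lines of plain Python. This is a naive sketch for illustration (the function name `matmul` is an assumption), not something to use on large matrices.

```python
def matmul(A, B):
    """Multiply two matrices given as lists of rows."""
    columns_of_B = list(zip(*B))              # transpose B to get its columns
    return [[sum(a * b for a, b in zip(row, column)) for column in columns_of_B]
            for row in A]

A = [[1, 2], [3, 4]]
B = [[10, 20], [30, 40]]
C = matmul(A, B)
```

For instance, the top-left cell is the first row of A times the first column of B: 1 · 10 + 2 · 30 = 70.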
geometric interpretation of matrix multiplication
- as mentioned, we use matrix multiplication (and other matrix operations) mainly for efficiency in this course
  - a matrix multiplication instead of many dot products
- however, in geometry, matrix multiplication can be used to express many useful transformations
  - scaling
  - rotation
  - projection from 3D to 2D
  - ...
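The three transformations listed above can each be written as a matrix applied to a vector. The sketch below uses plain Python; the helper names `matvec` and `rotation` and the example vectors are assumptions for illustration.

```python
import math

def matvec(M, v):
    """Multiply a matrix (list of rows) by a vector: one dot product per row."""
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def rotation(angle):
    """2D rotation matrix for a counterclockwise rotation by `angle` radians."""
    return [[math.cos(angle), -math.sin(angle)],
            [math.sin(angle),  math.cos(angle)]]

scaling = [[2, 0], [0, 2]]                    # doubles both coordinates
projection = [[1, 0, 0], [0, 1, 0]]           # drops the third coordinate (3D -> 2D)

scaled = matvec(scaling, [1, 2])
rotated = matvec(rotation(math.pi / 2), [1, 0])
projected = matvec(projection, [3, 4, 5])
```

Rotating (1, 0) by a quarter turn gives (0, 1) up to floating-point rounding.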
matrix multiplication in NumPy
import numpy
A = numpy.array([[1, 2], [3, 4]])
B = numpy.array([[10, 20], [30, 40]])
print(A.dot(B))    # the product matrix, with rows (70, 100) and (150, 220)
optimization
- what is optimization?
- unconstrained optimization: find the x that gives us the minimal (or maximal) value of some function f:
  $\min_x f(x)$
- constrained optimization: find the x that gives us the minimal (or maximal) value of f, where x satisfies some extra conditions:
  $\min_x f(x) \quad \text{such that } x > 0$
- today: unconstrained optimization only
optimization in machine learning
- as we will see in the next lecture, several ML models are formulated as optimization of some mathematical function:
  - support vector machine
  - logistic regression
  - neural networks
  - ...
- typically, we want to optimize a goodness of fit (how well we handle the training set) and a regularizer (simplicity of the classifier)
one-variable example
[Figure: plot of a one-variable function for x from −3 to 3, with values from −0.5 to 1.5; the minimum is marked]
two-variable example
remember your high school calculus...
- in your early school days, you might have seen the derivative of a function
- intuition: the derivative measures the slope
[Figure: plot of a one-variable function for x from −3 to 3, with the minimum marked]
- if a "nice" function has a maximum or minimum, then the derivative will be zero there
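The slope intuition can be checked numerically with a finite-difference approximation. The function `g` below is a hypothetical example with its minimum at x = 1, not the function plotted in the lecture, and `derivative` is an assumed helper name.

```python
def g(x):
    """Hypothetical 'nice' function with its minimum at x = 1."""
    return (x - 1) ** 2 - 0.5

def derivative(f, x, h=1e-6):
    """Central-difference approximation of the slope of f at x."""
    return (f(x + h) - f(x - h)) / (2 * h)

slope_left = derivative(g, 0.0)      # negative: the function is going downhill
slope_at_min = derivative(g, 1.0)    # approximately zero at the minimum
slope_right = derivative(g, 2.0)     # positive: the function is going uphill
```

The sign of the slope on each side tells us in which direction the minimum lies, which is the idea that gradient descent builds on.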
the gradient
- the multidimensional equivalent of the derivative is called the gradient
- if f is a function of n variables, then the gradient is an n-dimensional vector, often written ∇f(x)
- intuition: the gradient points in the uphill direction
- again: the gradient is zero if we have an optimum
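For reference, the standard definition collects the partial derivatives of f into one vector. The two-variable example function below is an assumption chosen for illustration, not one used in the lecture.

```latex
\nabla f(x) = \left( \frac{\partial f}{\partial x_1}, \ldots, \frac{\partial f}{\partial x_n} \right)

% Hypothetical example: f(x_1, x_2) = x_1^2 + x_2^2
\nabla f(x_1, x_2) = (2 x_1, \; 2 x_2), \qquad \nabla f(0, 0) = (0, 0)
```

In this example the gradient vanishes at (0, 0), which is where the function has its minimum.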
computing the gradient
gradient descent
- as we saw, the gradient points in the uphill direction
- this intuition leads to a simple idea for finding the minimum:
  - take a small step in the direction opposite to the gradient
  - repeat until the gradient is close enough to zero
- this is called gradient descent
gradient descent, pseudocode
- the same thing again, in pseudocode:
  1. set x to some initial value, and select a suitable step size c
  2. compute the gradient ∇f(x)
  3. if ∇f(x) is small enough, we are done
  4. otherwise, subtract c · ∇f(x) from x and go back to step 2
- conversely, to find the maximum we can do gradient ascent: then we instead add c · ∇f(x) to x
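The four steps can be sketched for a function of any number of variables as follows. The example function, the default step size and threshold, and the `max_steps` safety cap are assumptions added for illustration; they are not part of the pseudocode.

```python
import math

def gradient_descent(gradient, x_init, c=0.1, threshold=1e-6, max_steps=10_000):
    """Minimize a function, given a function that computes its gradient."""
    x = list(x_init)                                  # step 1: initial value
    for _ in range(max_steps):                        # safety cap (an addition)
        g = gradient(x)                               # step 2: compute the gradient
        if math.sqrt(sum(gk * gk for gk in g)) < threshold:
            break                                     # step 3: small enough, done
        x = [xk - c * gk for xk, gk in zip(x, g)]     # step 4: move downhill
    return x

def bowl_gradient(x):
    """Gradient of the hypothetical function (x1 - 3)^2 + (x2 + 1)^2."""
    return [2 * (x[0] - 3), 2 * (x[1] + 1)]

minimum = gradient_descent(bowl_gradient, [0.0, 0.0])
```

For this example the result ends up close to (3, −1), the minimum of the hypothetical function. Replacing the subtraction in step 4 with an addition turns the same loop into gradient ascent.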
in Python
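import numpy

def gradient_ascent(x_init, y_init,
                    threshold=0.001,
                    steplength=0.01):
    # gradient_of_my_function is a placeholder: it must return the
    # gradient (df/dx, df/dy) of the function being maximized
    x = x_init
    y = y_init
    done = False
    while not done:
        gxy = gradient_of_my_function(x, y)
        # ascent: take a step in the direction of the gradient
        x += steplength * gxy[0]
        y += steplength * gxy[1]
        # stop once the gradient is close enough to zero
        if numpy.linalg.norm(gxy) < threshold:
            done = True
    return (x, y)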
gradient ascent example
- let's optimize this function:
import math
def f(x, y):
    return math.exp(-(x-2)**2 - (y+1)**2)
- its gradient is
def gradient_of_f(x, y):
    return (-2*(x-2)*f(x, y), -2*(y+1)*f(x, y))
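Putting the pieces together, a self-contained run might look like the sketch below. It restates f and its gradient, passes the gradient function in as a parameter, and uses `math.hypot` for the vector length; the starting point (1.0, 0.0) and the `max_steps` cap are assumptions, not values from the lecture.

```python
import math

def f(x, y):
    return math.exp(-(x - 2) ** 2 - (y + 1) ** 2)

def gradient_of_f(x, y):
    return (-2 * (x - 2) * f(x, y), -2 * (y + 1) * f(x, y))

def gradient_ascent(gradient, x, y, threshold=0.001, steplength=0.01,
                    max_steps=100_000):
    """Follow the gradient uphill until it is shorter than `threshold`."""
    for _ in range(max_steps):                # safety cap, not in the lecture code
        gx, gy = gradient(x, y)
        if math.hypot(gx, gy) < threshold:
            break
        x += steplength * gx
        y += steplength * gy
    return (x, y)

# Assumed starting point. Far away from (2, -1) the gradient is already tiny,
# so the threshold test would stop the search almost immediately.
x_max, y_max = gradient_ascent(gradient_of_f, 1.0, 0.0)
```

From this starting point the search ends near (2, −1), where f reaches its maximum value of 1. Starting from (−1, −3) instead, the gradient is already below the threshold and the function returns the starting point unchanged, which shows that the stopping rule alone cannot tell a flat region from a true optimum.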
gradient ascent example
[Figure: plot of f(x, y) for x from −1 to 3 and y from −3.0 to 0.5, with level labels from 0.150 to 0.900]
![Page 36: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/36.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 37: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/37.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 38: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/38.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 39: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/39.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 40: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/40.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 41: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/41.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 42: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/42.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 43: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/43.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 44: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/44.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 45: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/45.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 46: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/46.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 47: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/47.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 48: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/48.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 49: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/49.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 50: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/50.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 51: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/51.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 52: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/52.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 53: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/53.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 54: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/54.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 55: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/55.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 56: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/56.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 57: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/57.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
![Page 58: Machine Learning for NLP Lecture 2: Basic linear algebra and … · -20pt UNIVERSITY OF GOTHENBURG linear algebra implementation: better I NumPy and SciPy are Python libraries containing](https://reader034.vdocuments.us/reader034/viewer/2022051811/6028d57822c5b57847071be1/html5/thumbnails/58.jpg)
-20pt
UNIVERSITY OF
GOTHENBURG
gradient ascent example
−1 0 1 2 3−3.0
−2.5
−2.0
−1.5
−1.0
−0.5
0.0
0.5
0.150
0.300
0.450
0.600
0.750
0.900
will we always reach the top?
- yes, if
  - there is actually a top
  - the step is short enough
  - the surface isn't too jumpy
- smarter versions of gradient ascent/descent try to adapt the step length so that we don't go too slow in the beginning, or bounce around the top at the end
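A minimal sketch of why the step length matters, assuming a simple one-dimensional function g(x) = -(x - 1)^2 chosen only for illustration (it is not the function plotted in the slides). The names `g_grad` and `gradient_ascent` and the step values are hypothetical.

```python
def g_grad(x):
    # derivative of g(x) = -(x - 1)**2, which has its single top at x = 1
    return -2.0 * (x - 1.0)

def gradient_ascent(grad, x0, step, n_iter=50):
    # plain gradient ascent with a fixed step length
    x = x0
    for _ in range(n_iter):
        x = x + step * grad(x)
    return x

x_short = gradient_ascent(g_grad, x0=-1.0, step=0.1)  # short step: creeps towards the top
x_long = gradient_ascent(g_grad, x0=-1.0, step=1.1)   # too long: overshoots further each time

print(abs(x_short - 1.0))  # distance from the top after 50 short steps
print(abs(x_long - 1.0))   # distance from the top after 50 long steps
```

With the short step the distance to the top shrinks by a constant factor in every iteration; with the long step each update jumps past the top and lands further away than before.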
gradient ascent example (2)
- let's optimize another function:
import math

def f(x, y):
    # sum of two bell-shaped bumps, centered at (2, -1) and (-1, 1)
    return (math.exp(-(x - 2)**2 - 0.5 * (y + 1)**2)
            + 0.7 * math.exp(-0.7 * (x + 1)**2 - 0.8 * (y - 1)**2))
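One way to run gradient ascent on this function is sketched below. The gradient is worked out by hand with the chain rule; the names `grad_f` and `gradient_ascent`, the starting point, the step size and the tolerance are all assumptions made for the example, not values taken from the slides.

```python
import math

def f(x, y):
    return (math.exp(-(x - 2)**2 - 0.5 * (y + 1)**2)
            + 0.7 * math.exp(-0.7 * (x + 1)**2 - 0.8 * (y - 1)**2))

def grad_f(x, y):
    # partial derivatives of f with respect to x and y
    a = math.exp(-(x - 2)**2 - 0.5 * (y + 1)**2)
    b = 0.7 * math.exp(-0.7 * (x + 1)**2 - 0.8 * (y - 1)**2)
    dx = -2.0 * (x - 2) * a - 1.4 * (x + 1) * b
    dy = -1.0 * (y + 1) * a - 1.6 * (y - 1) * b
    return dx, dy

def gradient_ascent(x, y, step=0.1, tol=1e-8, max_iter=10000):
    # repeatedly move a short step in the direction of the gradient
    for _ in range(max_iter):
        dx, dy = grad_f(x, y)
        if math.hypot(dx, dy) < tol:   # gradient is (almost) 0: stop
            break
        x, y = x + step * dx, y + step * dy
    return x, y

top = gradient_ascent(1.0, 0.0)
print(top, f(*top))
```

Starting from (1, 0), the search should climb the bump centered at (2, -1).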
[Figure: contour plot of f(x, y); axis ticks from −3 to 2 on both axes, contour levels from 0.150 to 0.900]
local and global maxima/minima
- some functions have local maxima or minima
- these functions are harder to optimize because the local (but not global) optima also have a gradient of 0
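The effect can be seen on the two-bump function from "gradient ascent example (2)". The sketch below is illustrative only: it approximates the gradient with central differences to stay short, and the helper names, starting points and step size are assumptions.

```python
import math

def f(x, y):
    # same function as in "gradient ascent example (2)"
    return (math.exp(-(x - 2)**2 - 0.5 * (y + 1)**2)
            + 0.7 * math.exp(-0.7 * (x + 1)**2 - 0.8 * (y - 1)**2))

def numerical_gradient(x, y, h=1e-6):
    # central differences: an approximation of the true gradient
    dx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dx, dy

def climb(x, y, step=0.1, n_iter=2000):
    # fixed-step gradient ascent
    for _ in range(n_iter):
        dx, dy = numerical_gradient(x, y)
        x, y = x + step * dx, y + step * dy
    return x, y

right = climb(1.0, 0.0)   # starts on the slope of the higher bump
left = climb(-2.0, 2.0)   # starts on the slope of the lower bump

for name, (x, y) in (("right", right), ("left", left)):
    print(name, round(x, 3), round(y, 3), round(f(x, y), 3))
```

Both runs stop where the gradient is approximately 0, but only the run that starts on the higher bump ends at the global maximum; the other one is stuck on the lower bump.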
convex and concave functions
- a function is convex if it always curves upwards, like a bowl
- equivalently, if we draw a line between two points of the surface, the surface is always on or below the line
[Figure: two example plots over x from −3 to 3, with vertical axes from 0.0 to 1.0 and from 0.0 to 1.2]
- the point of this: if we find a local optimum (gradient is 0) of a convex function, this is guaranteed to be the global minimum
- conversely, a function is concave if it always curves downwards; a local optimum of a concave function is then the global maximum
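The "line between two points" test can be tried out numerically. The sketch below samples pairs of points and compares the function value at the midpoint with the midpoint of the line; it is only a sampled check, not a proof of convexity, and the helper name `chord_violations` and the two test functions are assumptions for illustration.

```python
import math

def chord_violations(func, lo, hi, n=41):
    # sample pairs of points (a, b) and look at the midpoint of each line
    xs = [lo + (hi - lo) * i / (n - 1) for i in range(n)]
    count = 0
    for a in xs:
        for b in xs:
            mid = 0.5 * (a + b)
            if func(mid) > 0.5 * (func(a) + func(b)) + 1e-12:
                count += 1   # the surface is above the line here
    return count

bowl = chord_violations(lambda x: x**2, -3, 3)             # bowl shape
bell = chord_violations(lambda x: math.exp(-x**2), -3, 3)  # bell shape

print(bowl, bell)
```

The bowl-shaped function produces no violations, while the bell-shaped one does, so it is not convex.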
stochastic gradient descent
- in some cases it is cumbersome to compute the gradient
  - because it depends on all the data in the training set
- stochastic gradient descent: simplify the computation by computing the gradient using just a small part of the training set
  - typically, a single training example
- pseudocode:
  1. set w to some initial value, and select a suitable step size c
  2. select a single training instance x
  3. compute the gradient ∇f(w) using x only
  4. if we are "done", stop
  5. otherwise, subtract c · ∇f(w) from w and go back to step 2
- (stopping criterion shouldn't be based on just a single instance)
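The pseudocode can be sketched in plain Python as follows. Everything specific here is an assumption made for the example: the objective is the squared error of a linear model, the training set is synthetic, and the step size, tolerance and helper names are hypothetical. The stopping test is evaluated on the whole training set once per pass, rather than on a single instance.

```python
import random

random.seed(0)

# synthetic training set (assumption): y = 3*x1 - 2*x2 plus a little noise
true_w = [3.0, -2.0]
data = []
for _ in range(200):
    x = [random.uniform(-1, 1), random.uniform(-1, 1)]
    y = sum(wi * xi for wi, xi in zip(true_w, x)) + random.gauss(0, 0.01)
    data.append((x, y))

def instance_gradient(w, x, y):
    # gradient of the squared error (w·x - y)**2 for one training instance
    err = sum(wi * xi for wi, xi in zip(w, x)) - y
    return [2.0 * err * xi for xi in x]

def mean_loss(w):
    # average squared error over the whole training set
    total = 0.0
    for x, y in data:
        total += (sum(wi * xi for wi, xi in zip(w, x)) - y) ** 2
    return total / len(data)

w = [0.0, 0.0]   # step 1: initial value
c = 0.05         # step 1: step size
prev = mean_loss(w)
for epoch in range(200):
    for _ in range(len(data)):
        x, y = random.choice(data)                  # step 2: one instance
        g = instance_gradient(w, x, y)              # step 3: gradient from it only
        w = [wi - c * gi for wi, gi in zip(w, g)]   # step 5: subtract c * gradient
    loss = mean_loss(w)
    if abs(prev - loss) < 1e-5:                     # step 4: "done", judged on all data
        break
    prev = loss

print(w, loss)
```

Each update is cheap because it touches one instance, at the price of a noisier path than the full gradient would give.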
next lecture
- linear classifiers
  - expressed as a dot product w · x
- we'll use the concepts discussed today to go into the details of a few different learning algorithms:
  - perceptron
  - support vector classifier
  - logistic regression
- implementations in NumPy/SciPy similar to the classifiers in scikit-learn
- preparation for the second assignment