mat lab script

Upload: autechr

Post on 04-Jun-2018


  • 8/13/2019 Mat Lab Script


    1 Week 1

    Covered are:

    - Matlab session basics
    - Variables, elementary operations, and functions
    - Vectors and vector operations in Matlab
    - Function and vector plotting
    - Floating-point arithmetic and numerical data types
    - Basic programming constructs (loops, branching)
    - Matlab scripts and functions

    1.1 A quick first session

    Create a directory where you keep your Lab-related files, switch to it, and type

    matlab

    You can later switch to other directories and create new ones from within Matlab using the commands cd and mkdir. If you want Matlab to include specific directories in its search path, use the path command.

    To help you find available commands and what they do, Matlab has four basic ways of providing assistance:

    - The command help provides a short description of all functions. For instance, to get help on the diary command, type help diary. Further, you can obtain a list of all topics Matlab provides help on by typing simply help. From here, you can for instance continue with help ops, and so on. This process is useful if you are not 100% sure what you are looking for (this is what happens most of the time).

    - The lookfor command: When using this command, the keyword need not be a Matlab command. For example, type lookfor conjugate. This takes longer than using help because the headers of all files are searched for the keyword.

    - Using Help from Matlab's menu bar (can also be opened by typing doc at the command line).

    - Using the Matlab web site at www.mathworks.com

    Open an editor. It is convenient to invoke the Matlab editor from the menu bar but you can also call any other editor, e.g., type

    ! emacs

    Add at the top of the file a comment with your name, like this

    % 110112 Lab, Week 1, Name: ...

    Save the file as Week1.txt, and exit Emacs. This returns you to Matlab.

    Now execute the following commands:

    diary Week1.txt

    a=2,b=[3;-1],c=[1,2],d=0.023;e=a+i*c

    format long;pi

    format compact

    sin(b)

    e=f

    whos

    diary off

    Before discussing what happened, open the file Week1.txt. Note that everything you saw on the screen has been copied into the file opened by the diary command. Now you can edit it to your liking (e.g., you could turn it into a Matlab script or m-file, see later). If you want to continue writing into this file, just repeat diary Week1.txt, and the further Matlab dialog is appended.

    Note: This is also the recommended way of creating your report on the weekly problem sets: Start your problem solving session with a diary filename.ext command, end it with diary off, then edit and shorten the file accordingly, add answers to additional questions, etc.

    Aspects of the above sequence of commands to be discussed:

    - Role and impact of , ; inside and outside of [ ]
    - Declaration of variables via assignments
    - Complex numbers
    - Strange behavior of + (see matrix operations later)
    - Built-in constants and functions (help elfun)
    - Formatting (format) and workspace (who, whos, clear)

    Try some basic mathematical operations on your own. For example, adding, subtracting, multiplying two numbers.

    Now try:

    diary Week1.txt

    clear a

    who

    whos

    format short; 1/3

    format long;a=ans

    format bank;a

    format rat;[a d]

    What does format short e do? How does it differ from format long e? Write your answer by first typing:

    Eformat=here_you_put_your_answer

    Use the command format to go back to the default format.

    Next run:

    c = sqrt(5)

    c^2+c^4

    sin(pi*a)+b

    Note: Any sequence of letters, digits, and underscores can be used as the name of a variable, as long as the first character is a letter. Variable names are case sensitive, so variable_two, VARIABLE_TWO and Variable_Two are three different variables. Good style: Use names that are self-explaining.


    1.2 Vectors

    Matlab's core strength is the efficient handling of linear algebra problems, which is based on efficient implementations of vector and matrix operations. We start with vectors, i.e., one-dimensional numerical arrays. In Matlab, vectors are a special case of matrices: $1 \times N$ matrices are row vectors, $N \times 1$ matrices are column vectors. The format is important (see below).

    Row vectors (similarly for column vectors) can be defined in various ways: The following commands

    u(1)=1;u(2)=2;u(3)=3;u(4)=5;u(5)=8

    v = [ 1 2 3 5 8 ]

    w1 = [1,2];w2 = [3,5,8];w=[w1 w2]

    w1=1;w2=1;x=1;while w2


    p(end-1)

    p(1:5)

    p([1 5 3 5 8])

    p(end:-1:1)

    fliplr(p)

    [p_sort,ind_sort]=sort(p)

    [min(p) p_sort(1) p(ind_sort(1));max(p) p_sort(L) p(ind_sort(L))]

    p([1 end])=[]

    Discussion:

    - The given examples of frequently used functions for vectors (length, size, max, min, sort, fliplr; many more such as find, sum, diff, mean, norm etc. are available) have meaning for matrices as well. For detailed descriptions, use help.

    - Indexing, reshuffling, and extracting subvectors. More examples (using logical expressions and find) will follow.

    Addition of matrices and vectors works pretty much as you expect:

    p = [3;0;-1;2]; q = 1:4; r = [0 10 10 0];

    q+r

    p+q

    q+2

    q*2

    q/2

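    Thus, linear operations on vectors/matrices (as needed in many applications) require objects of the same size; the only exception is the addition of a scalar to a vector: scalar + array = scalar*ones(size(array)) + array. (Matlab releases from R2016b on additionally expand arrays of compatible sizes, so there p+q returns a 4-by-4 matrix instead of an error message.)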
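
    As a quick check of the scalar rule above, the following sketch compares both sides of scalar + array = scalar*ones(size(array)) + array. The names s, lhs and rhs are chosen here for illustration only.

```matlab
q = 1:4;                      % row vector from the example above
s = 2;                        % any scalar (illustrative value)
lhs = s + q;                  % the scalar is added to every entry
rhs = s*ones(size(q)) + q;    % the same operation written out explicitly
isequal(lhs, rhs)             % both sides agree entry by entry
```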

    The operator * denotes matrix multiplication, as we will discuss in more detail later,

    p*q

    q*p

    q*r

    (can you explain what happened?). If you want to do element-by-element multiplication, you have to use the .* operator.


    q.*r

    The scalar product $(x, y) := \sum_{i=1}^{N} x_i y_i$ of two real vectors of length N can therefore be computed in at least two ways (here we do it for the two row vectors q, r defined above):

    scal_prod_qr=sum(q.*r)

    scal_prod_qr=q*r'

    For complex vectors, the second version is the correct one, since the operator ' is the conjugate transpose. Scalar products are useful, e.g., in linear algebra, analytic geometry (length of vectors, angles), and statistics (covariance).
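
    The remark on complex vectors can be illustrated with a small sketch. It assumes the usual convention that the second vector is conjugated in the complex scalar product; the vectors x and y are made-up values.

```matlab
x = [1+2i, 3-1i];             % two complex row vectors (illustrative values)
y = [2-1i, 1i];
s1 = sum(x.*y)                % no conjugation takes place
s2 = x*y'                     % ' is the conjugate transpose: sum(x.*conj(y))
s3 = x*x'                     % squared length of x, real and non-negative
```

    Only the second form gives a real, non-negative number when a vector is multiplied with itself, which is what is needed to define a length.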

    The above remark also applies to division and exponentiation. As we already saw, most built-in Matlab functions of a single argument (such as sin, sqrt, atan etc.), when called with a vector argument, are applied componentwise. Funny things happen: Text strings may behave like vectors, see below.

    1 ./ (1:7)

    sqrt(1:7)

    x='funny'

    3*x

    Note: Vectorization of function evaluations using .*, ./, .^, and other vector commands is generally recommended if you start writing your own Matlab functions! Many more advanced Matlab routines expect user-defined function arguments, and assume that those in turn can be called with vector arguments.

    Note: You can browse through the list of commands which you entered before by pressing the up and down arrow keys. Another useful feature is the incremental search: if you want to recall the previous command calculating some square root, press Control-R and type sqrt, and the computer will display the previous command which contains the text sqrt. If you press Control-R again, the computer will look for earlier commands with the text sqrt. Some other standard Unix idioms are also supported.

    1.3 Some plotting

    Let us conclude the first session with some basic plotting. We will use ezplot (easy-plot) and plot.


    help ezplot

    ezplot('sin(t)');

    ezplot('tan(x)',[0,pi]);ylabel('y');

    ezplot('x^2+y^3-1',[-5 5 -3 1]);ylabel('');

    title('Implicitly given algebraic curve');

    ezplot('cos(phi)','sin(3*phi)',[-pi,pi]);

    This command can plot explicit, implicit, and parametric curves in the plane. Note some automatic features (title and axis labels are added automatically). The function can be defined as above (i.e., as expressions enclosed by single quotes), as inline functions (see help inline), or using function handles:

    help function_handle

    f=@(a,b)a-b.^2;ezplot(f);

    You can improve the appearance using the buttons on the Figure menu bar, or the Matlab commands xlabel, ylabel, title, legend, axis etc., and save/export the edited figure in various formats or print it.

    You may have noticed that each new plot command has overwritten the previous one. With hold on you can change this: Until the next hold off or figure command is issued, new plot-related commands will not delete existing graphs. This can be used to draw several graphs in one figure, or to mark points on the graph.

    We will try this next in connection with plot, which displays (interpolated) discrete functions.

    help plot;

    x1=[0:pi/2:2*pi];x2=[0:pi/200:2*pi];

    fig1=figure;

    plot(x1,sin(x1));

    plot(x2,sin(x2),'k--');hold on;

    plot(x1,sin(x1),'ro');

    legend('sin(x)','zeros, maximum and minimum');

    axis tight;axis([0 pi 0 1.2]);hold off;

    figure;

    plot(x2,cos(x2)+0.05*randn(size(x2)),'g');

    title('Cosine function with additive white noise (\sigma=0.05)');


    figure(fig1);

    diary off;

    Note: Matlab provides two basic functions for generating pseudo-random numbers, rand (numbers uniformly distributed in the interval (0, 1)) and randn (Gaussian white noise with zero mean and unit variance).
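
    The two generators can be compared with a short sketch. The sample size n is an arbitrary choice, and the printed statistics only approximate the theoretical values because of sampling error.

```matlab
n = 1e5;                      % sample size (illustrative choice)
u = rand(1,n);                % uniformly distributed numbers
g = randn(1,n);               % normally distributed numbers
[min(u) max(u)]               % both values lie strictly between 0 and 1
[mean(u) mean(g) var(g)]      % close to 0.5, 0 and 1, respectively
```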

    1.4 Floating-point, data types, logical ops

    On most modern CPUs, numerical computations are done in floating-point arithmetic based on the IEEE-754 standard. I encourage you to read about the standard (e.g., use the link available from the course webpage). What is important to keep in mind is that all numbers that Matlab works with are stored in 64-bit words (= 8-byte words). In the double-precision format (which is Matlab's default), these 64 bits are used as follows:

    - The first bit gives the sign s of the number (0 = +, 1 = −),

    - the next 11 bits encode the available exponents $-1022 \le e \le 1023$ with a binary number (corresponding to $e + 1023$) which is strictly between 0 (= 00000000000) and $2^{11} - 1 = 2047$ (= 11111111111); the two extreme values are reserved for special numbers (denormalized numbers close to 0, and Inf and NaN, representing infinity and undefined results, respectively), and

    - the remaining 52 bits are used to describe the fraction or mantissa m (the valid binary digits of the number), and correspond to 15 to 16 valid decimal digits.

    In binary representation, a typical machine number x is given by

    $$x = (-1)^s \cdot 2^e \cdot 1.m = (-1)^s \left(2^e + m_1 2^{e-1} + m_2 2^{e-2} + \dots + m_{52} 2^{e-52}\right).$$

    As a consequence, there are only finitely many so-called machine numbers available. In particular, there is a biggest and a smallest (non-zero) positive number, and there are little gaps between any two neighboring machine numbers. The gaps become larger the bigger the numbers become in absolute value.

    realmax

    realmin

    eps


    The last number (the gap near 1) is called machine accuracy (in double precision). Here are the consequences you have to keep in mind.

    - Whatever number we type at the prompt or compute using operations and functions is matched to the closest machine number. This act is called rounding; how rounding is actually carried out is regulated by the standard but of no concern to you. This leads to rounding errors. The absolute value of this unavoidable error grows with the absolute value of the numbers; its relative value, however, is bounded by eps.

    - Since we typically do repeated operations on the input numbers in a numerical process, the rounding errors will propagate. They may in the worst case accumulate and lead to invalid results (numerical instability). Examples will follow. However, most computations will leave you with only a modest loss of accuracy, so you can usually trust the first 10 decimals of your results (don't trust the last 4 digits).

    - This loss of accuracy has many side-effects. E.g., testing two numbers for equality by the logical operation == (see below) might not be a good idea; a better way is to say that two numbers are equal if their (relative) difference is smaller than a certain problem-dependent tolerance (for obvious reasons, this tolerance should not be smaller than eps).

    - You may exceed the numerical range during a computation (underflow or overflow). In the more dangerous situation of overflow, Matlab usually continues with Inf and NaN in a way that is regulated by the standard (in the latter case, error messages are displayed). A wise thing is to think a bit about the expected range of numbers, and scale it if necessary.

    realmax+1

    2*realmax

    1.0001*realmax-realmax

    realmin*pow2(-53);

    Inf-1000

    Inf-Inf
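
    The growing gaps between machine numbers and the advice on equality tests can be made concrete with the following sketch. The factor 10 in the tolerance tol is an arbitrary, problem-dependent choice made here for illustration.

```matlab
gaps = [eps(1) eps(1000) eps(1e10)]   % spacing of machine numbers near 1, 1000, 1e10
a = 0.1 + 0.2;  b = 0.3;              % equal in exact arithmetic
a == b                                % false: the rounded values differ
tol = 10*eps;                         % tolerance (illustrative choice)
abs(a-b) <= tol*max(abs(a),abs(b))    % true: equal up to the relative tolerance
```

    The call eps(x) with an argument returns the distance from abs(x) to the next larger machine number, so the first line shows directly how the gaps widen.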

    In contrast to C, Matlab does not bother you with data typing. Whatever you input typically has double precision by default. If an operation leads to a complex number, Matlab automatically detects this, doubles the storage, and stores each complex number as two real doubles (real and imaginary part). Other data types are available but not so often invoked; try help datatypes for information. Going to single precision allows you to store two numbers in one 64-bit word and to save memory space. The machine accuracy is reduced to about 7 trustworthy decimal digits, and error propagation becomes a bigger issue. People who need to do a lot of text processing have to learn about char and string manipulation; we will not do it here.
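
    A minimal sketch of what changes in single precision; the value pi is used only as a convenient test number.

```matlab
x = single(pi);               % convert to single precision (32 bits)
class(x)                      % 'single'
eps('single')                 % machine accuracy in single precision, 2^-23
eps('double')                 % for comparison: 2^-52
y = x + 1;                    % mixing single and double gives a single result
class(y)
err = abs(double(x) - pi)     % rounding error of the order 1e-7
```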

    Logical operations such as >, ==, >=, ~=, & (logical and), | (logical or), and so on return zero-one vectors of type logical. So, if you see a zero-one vector x and you are not sure, try islogical(x). Logical comparisons of vectors, matrices, etc. require equal sizes, unless one of the objects is a scalar. The answer 1 stands for True, and 0 for False.

    1~=0

    x=[-1 2 0.1 -pi 15];y=abs(x);

    x>=-1

    y>x

    x(y>x)

    z=rand(1,1000);z=[z 1 -z];sum(z)==1

    If you index a vector with another vector (of type logical) of the same length, then the entries corresponding to the ones in the index vector are selected. What does the last example tell you?

    1.5 Some programming constructs

    Constructs for branching and loops are similar to the ones used in general purpose programming languages. The if statement has the following structure:

    x=1.05;

    if (x<0)

    disp('x is negative')

    elseif (x>0)

    disp('x is positive')

    else

    disp('x is zero')

    end


    This is the way you would write the statement in an editor. On the command line, just type all commands in order, separated by ; or ,. The logical expressions can be replaced by any expressions that evaluate to numbers or logicals. Although the purpose of the above construct is obviously to find out the sign of a real scalar x, it would be executed for any array x. Try with a vector x. If you type help if, the system explains what happens in the general case: The expression after if resp. elseif is evaluated; if all the real parts of the results are non-zero, then the following statement is executed (and all remaining elseif and else, if there are any left, are ignored). If at least one of the real parts is zero, then the next elseif is checked in the same manner, and so on. If none of these checks is positive, we eventually get to else (and execute the following statement) or to end, in which case nothing happens.
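
    The rule for array-valued conditions described above can be tried with the following sketch; the variable names r1 and r2 are introduced here only to record which branch was taken.

```matlab
x = [1 2 0];                  % one entry is zero
if x
    r1 = 'executed';
else
    r1 = 'skipped';           % taken: not all entries are non-zero
end
x = [1 2 3];                  % all entries are non-zero
if x
    r2 = 'executed';          % taken
else
    r2 = 'skipped';
end
disp([r1 ' / ' r2])
```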

    You can combine logical expressions with |, &, or ~. Try this:

    v = [ 1 4 5 0 -3 0 7 ];

    w=(v >= 0) & ~(rem(v,2) == 0)

    v(w)

    Describe in words what these commands do. Next, try the functions any and all.

    if any(v==0)

    disp('Some elements of v are zero')

    end

    if all(v==0)

    disp('All elements of v are zero')

    else

    disp('Some elements of v are non-zero')

    end

    There are two more branching constructs: while and switch.
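
    As a minimal sketch of the switch construct (see help switch), the sign test from above could be rewritten as follows. The variable msg and the value of x are illustrative assumptions.

```matlab
x = -0.3;                     % illustrative value
switch sign(x)                % sign returns -1, 0 or 1
    case -1
        msg = 'x is negative';
    case 1
        msg = 'x is positive';
    otherwise
        msg = 'x is zero';
end
disp(msg)
```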

    Now about loops. What does the following loop do?

    x = 1 ;while (x


    A fancy way of doing the same is to write pow2(ceil(log2(1000))). The command ceil rounds to the next larger integer. There are similar functions to round to the next smaller (floor) or closest integer (round). For non-negative x, floor(x) is the same as computing the integer part of x; for negative x, the integer part is obtained with fix(x), which rounds towards zero.
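
    The rounding functions can be compared side by side in a short sketch; the test vector x is an arbitrary choice that includes negative numbers and half-integers.

```matlab
x = [-2.5 -1.2 0.4 1.5 2.7];
R = [floor(x); ceil(x); round(x); fix(x)]   % one row per rounding function
p = pow2(ceil(log2(1000)))                  % smallest power of 2 that is >= 1000
```

    Note that floor and fix differ only for negative, non-integer entries.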

    Finally, the for construct looks like this

    for var = vec

    body

    end

    Note a principal difference between while and for loops: In a for loop the vector vec is evaluated at the beginning, and for each of its elements the body is evaluated (changing elements in the vector vec in body will have no influence). In a while loop, the condition is checked each time end is reached, and changing inside the loop the variables occurring in the condition enables us to exit the loop depending on the results of the computation. Here is an example of a for loop:

    x=[-30:30];

    my_e=1;a=1;

    for k=1:100

    a=a.*x/k;

    my_e=my_e+a;

    end

    plot(x,abs((my_e-exp(x))./(exp(x)+1)));

    The above for loop computes an approximation to the value $e^x$ by using the Taylor polynomial of degree 100:

    $$e^x \approx T_{100}(x) := \sum_{k=0}^{100} \frac{x^k}{k!} = 1 + x + \frac{x^2}{2} + \frac{x^3}{6} + \dots + \frac{x^{100}}{100!}$$

    which, according to your Calculus class, should be reasonably good for |x| 0)

    [i,j]=find(A==2)

    The last example gives you two index lists, in this case the row and column indices of all elements of A with value 2. How would you, using a few Matlab operations, delete all zero rows and zero columns from a matrix X? Test your solution on the matrix B.

    2.2 Matrix operations and linear systems

    Most Matlab functions, when called with a matrix argument, act either element-wise or on columns. All elementary functions belong to the first type. Others, such as sum, max, min, sort, belong to the second type. Element-wise multiplication/division/exponentiation is defined for matrices of the same size in the usual way (using a period in front of the operation symbol). E.g., if you want to find the sum of the squares of all entries of a matrix A, you could type sum(sum(A.^2)). If you forget the period, then either an error message appears (if A is not square), or the sum of all elements of $A^2 = A \cdot A$ is displayed. Define suitable matrices and play around with these commands. For instance, what is the result of max(X,Y) resp. [v,ind]=sort(X)? Use help max or help sort if unsure.
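
    The sum of squares of all entries can be written in several equivalent ways, as the following sketch shows. The matrix A is a made-up non-square example, so that forgetting the period in A.^2 would produce an error.

```matlab
A = [1 2 3; 4 5 6];           % 2-by-3, so A^2 is not defined
s1 = sum(sum(A.^2));          % column sums first, then their sum
s2 = sum(A(:).^2);            % same, after stacking all entries into one column
s3 = norm(A,'fro')^2;         % squared Frobenius norm, equal up to rounding
[s1 s2 s3]
```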

    However, matrices are most prominently used to represent linear maps between finite-dimensional vector spaces, or to work systematically with linear systems. For this, matrix multiplication, denoted in Matlab by *, is crucial. The product C=A*B of two matrices is well-defined only if the number of columns of A coincides with the number of rows in B, i.e. if size(A,2) and size(B,1) return the same value. Then C has the same number of rows as A and the same number of columns as B, and its elements are defined by scalar products of the rows of A with the columns of B according to

    $$C_{ij} = \sum_k A_{ik} B_{kj}.$$

    Matlab returns errors if the dimension condition is not satisfied.
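
    To see the definition of the matrix product at work, the following sketch computes each entry as a scalar product of a row of A with a column of B and compares the result with A*B. The matrices are small made-up examples.

```matlab
A = [1 2 3; 4 5 6];           % 2-by-3
B = [1 0; 0 1; 2 -1];         % 3-by-2, so size(A,2)==size(B,1)
C = zeros(size(A,1), size(B,2));
for i = 1:size(A,1)
    for j = 1:size(B,2)
        C(i,j) = A(i,:)*B(:,j);   % row i of A times column j of B
    end
end
isequal(C, A*B)               % the loop reproduces the built-in product
```

    In practice one would always use A*B directly; the loop version is only meant to spell out the formula.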

    From now on, we will deal with square matrices of the same size $N \times N$. For those, some special functions are defined.

    A = [ 1 3; 5 6 ]

    B = [ -3 2;6 -4]

    A*A

    A^2

    A*B-B*A

    sqrt(A)

    C=sqrtm(A)

    C*C

    det(A)

    det(B)

    Note: Just like the function $\sqrt{x}$ is complex for negative x, its matrix analogue $\sqrt{X}$ returns a real-valued result only for a certain class of square matrices; A is not from this class. Since Matlab automatically continues with complex results, you will not be warned about this.

    Suppose we want to solve the following system of linear equations.

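    $$\begin{aligned} 2x_1 + x_2 + x_3 &= 5 \\ 4x_1 - 6x_2 &= -2 \\ -2x_1 + 7x_2 + 2x_3 &= 9 \end{aligned}$$

    We can rewrite this system as a matrix equation of the form $Ax = b$:

    $$\begin{pmatrix} 2 & 1 & 1 \\ 4 & -6 & 0 \\ -2 & 7 & 2 \end{pmatrix} \begin{pmatrix} x_1 \\ x_2 \\ x_3 \end{pmatrix} = \begin{pmatrix} 5 \\ -2 \\ 9 \end{pmatrix}$$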

    Mathematically speaking, the solution of the equation $Ax = b$ is found by pre-multiplying both sides with the inverse matrix $A^{-1}$ (if it exists), giving $x = A^{-1}b$. Hence, the solution of the above example is given by

    $$x = \begin{pmatrix} 2 & 1 & 1 \\ 4 & -6 & 0 \\ -2 & 7 & 2 \end{pmatrix}^{-1} \begin{pmatrix} 5 \\ -2 \\ 9 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 2 \end{pmatrix}$$

    We can copy this strategy directly to Matlab, which provides the function inv to do matrix inversion.

    A = [2 1 1; 4 -6 0; -2 7 2]; b = [5 -2 9]';

    C = inv(A)

    x = C*b

    inv(B)

    rank(B)

    The existence of an inverse matrix $A^{-1}$ is equivalent to the unique solvability of the linear system $Ax = b$ (for any right-hand side vector b), or to $\det(A) \neq 0$, or to A having maximal rank (if A is $N \times N$ then this can be checked by rank(A)==N). This explains why inv(B) returns a warning and an infinity matrix. Due to rounding effects, Matlab is not 100% sure when to say that A is singular resp. rank-deficient resp. has zero determinant. In those cases, it only warns us but continues with the computation.

    There is a powerful alternative that Matlab provides for the solution of a system of linear equations $Ax = b$ that avoids the explicit computation of inverses. The expression


    x = A \ b

    also returns the solution vector x. The command does something meaningful even if $A^{-1}$ does not exist (in which case the system has no solution or infinitely many solutions) or if A is rectangular (over- and under-determined systems of linear equations), see below for a typical example. Similarly, c/A gives the vector x which solves the equation $c = xA$, so $x = cA^{-1}$ (for this to make sense, c must be a row vector of length N). For example,

    c = [8 -4 2];

    x = c / A

    max(abs(x*A-c))

    Note: For small-sized problems, the solvers in Matlab are pretty accurate. For large problems, due to rounding errors and instabilities of the algorithms, on the one hand, and ill-conditioning of the problem itself (to be discussed later), on the other hand, larger discrepancies are possible. Double-checking does not hurt!
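
    As a sketch of the rectangular case mentioned above, the backslash operator applied to an over-determined system returns a least-squares solution. The data t, y are hypothetical measurements invented for this illustration (roughly following 1 + 2*t), and the names M, p, r are likewise assumptions.

```matlab
t = (0:4)';                   % sample points (synthetic data)
y = [1.1 2.9 5.2 7.1 8.8]';   % hypothetical measurements
M = [ones(size(t)) t];        % 5-by-2 matrix: more equations than unknowns
p = M \ y                     % least-squares fit: intercept and slope
r = M*p - y;                  % residual, non-zero in general
M'*r                          % close to zero: r is orthogonal to the columns of M
```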

    2.3 Efficiency issues

    It should be noted that, even though A\b gives the same result as inv(A)*b (up to rounding), it is calculated in a different way (for insiders: it is based on a matrix factorization instead of the inverse; for general square matrices Matlab uses an LU factorization with pivoting, for rectangular matrices a QR factorization $A = QR$ into an orthogonal matrix Q and an upper triangular matrix R). Let us compare both strategies to solve a random system of 500 linear equations in 500 unknowns:

    A = rand(500);

    b = rand(500,1);

    tic; inv(A) * b; toc

    tic; A \ b; toc

    Issuing a tic-toc pair of commands is a quick-and-dirty way of measuring the elapsed time in seconds between them. Since many processes may concurrently share your CPU, this elapsed time measurement will vary from occasion to occasion but still give a reasonable orientation. Repeat the last two lines a couple of times. If you want to have a more stable indicator, it is worth running a loop and averaging afterwards:

    n=10;

    t0=clock; for k=1:n; inv(A)*b; end; t_aver=etime(clock,t0)/n

    tic; for k=1:n; A\b; end; t_aver=toc/n


    So the expression A\b is evaluated approximately twice as fast as inv(A)*b.

    We can distil a general rule from this: Avoid inverting matrices!

    It turns out that in practice it is almost never necessary to compute the inverse of a matrix. It is not the inverse itself that we are after, but we want to use it for something (typically to solve a linear system), and in general, there is a faster way to do this something which does not go via the matrix inverse. E.g., if a linear system has to be solved over and over again for different right-hand sides but with the same matrix A, it seems advantageous to have $C = A^{-1}$, as the matrix-vector multiplication C*b is much faster than A\b if N becomes larger. However, the same advantage can be achieved by using the LU- or QR-factorization provided by the Matlab functions lu and qr.
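
    A minimal sketch of this idea, assuming a synthetic, well-conditioned test matrix: the matrix is factorized once with lu, and each further right-hand side then only requires two triangular solves. The size 200 and the names b1, b2, x1, x2 are illustrative choices.

```matlab
A = rand(200) + 200*eye(200); % well-conditioned test matrix (synthetic)
[L,U,P] = lu(A);              % factorize once: P*A = L*U
b1 = rand(200,1);  b2 = rand(200,1);
x1 = U\(L\(P*b1));            % forward and backward substitution only
x2 = U\(L\(P*b2));            % reuses the same factors
[norm(A*x1-b1) norm(A*x2-b2)] % residuals near rounding level
```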

    The study of how much execution time is spent overall (and which parts of a numerical procedure consume most of it) and how this depends on the parameters of the problem (here the dimension N of the linear system and the number of right-hand side vectors b to be treated) is important for numerical practice, especially if time is a constraint (like in real-time systems or optimal design of complex systems such as integrated circuits). In other applications, storage requirements might be a bottleneck. In textbooks on numerical linear algebra, one often finds formulas for the number of arithmetic operations that an algorithm needs. Most of the algorithms for solving linear systems with dense (= fully populated, in contrast to sparse = containing a high percentage of zero entries) matrices need $CN^3 + O(N^2)$ arithmetic operations with an algorithm-dependent constant C, and thus scale like $N^3$ with the dimension N of the system. You can see that this is approximately true by comparing the execution times of inv(A) resp. of A\b for N = 150, 300, 600, 1200:

    N=150*pow2(0:3);

    for k=1:length(N)

    tt=0;A=rand(N(k));B=zeros(N(k));

    for m=1:3

    tic;inv(A);tt=tt+toc;

    end

    t(k)=tt/3;

    end


    t

    t(2:end)./t(1:end-1)

    Indeed, enlarging the dimension of A by a factor of 2 leads to a growth of the execution time by a factor close to $8 = 2^3$. This helps you to decide how big a problem can be solved in a given time. Note that the execution time in Matlab also depends on other factors (as we can see, the programming overhead kicks in for small N, while for large N and with limited swap space memory allocation becomes a severe limiting factor and slows down execution further). Test this on your own.

    A good recommendation for setting up numerical computations for larger problems is: Know your resources and constraints, always aim at an efficient and robust solution within these limits, check the results and the efficiency on carefully designed test problems, and work on the bottlenecks if necessary.

    Using tic-toc or clock-etime or cputime, time the following three different versions of creating the vector

    $$c = [1, 4, 9, 16, 25, \dots, (n-1)^2, n^2]:$$

    clear c;for k=1:n; c(k)=k*k; end;

    clear c;c=zeros(1,n);for k=1:n; c(k)=k*k; end;

    clear c;c=(1:n).^2;

    Get measurements for various $n \le 10^7$ (try wisely, don't start with $n = 10^7$). E.g., use $n = 2^k$, $k = 8, \dots, 20$. How do the measured times scale with n, e.g., linearly with n? What observations can you make, and what conclusions can be drawn?


    2.4 Eigenvalues and singular value decomposition

    For square matrices, the concept of eigenvectors and eigenvalues is crucial. A (complex) number $\lambda$ is called an eigenvalue of an $N \times N$ matrix A if there exists a non-zero vector $x \neq 0$ (called an eigenvector associated with $\lambda$) such that $Ax = \lambda x$. If A is interpreted as the representation of a linear map on the vector space $V_N$ of $N \times 1$ vectors, then such x give the directions in the vector space that remain invariant under the linear map (i.e., Ax is a multiple of x itself). From a linear algebra point of view, the eigenvalues are those numbers $\lambda$ for which the resolvent $R_\lambda(A) := A - \lambda I$ is singular, i.e., they can be computed as the zeros of the characteristic polynomial

    $$p_A(\lambda) := \det(A - \lambda I) = 0.$$

    That $p_A(\lambda)$ is a polynomial of exactly degree N follows directly from the definition of the determinant. Thus, A has exactly N (complex) eigenvalues counting multiplicity. The Matlab function eig returns these eigenvalues as a column vector if it is called in the following form:

    A=[2 1 0;0 2 1;0 0 -3]

    eA=eig(A)

    B=rand(5)

    eB=eig(B)

    prod(eB)-det(B)

    det(B-eB(3)*eye(5))

    Three comments: 1) For upper triangular matrices such as A, the eigenvalues coincide with the diagonal entries. 2) The product of the eigenvalues of a matrix equals its determinant. In particular, a matrix is invertible iff all eigenvalues are non-zero. 3) Matlab does not compute the eigenvalues of a matrix A by finding the zeros of the characteristic polynomial but in a numerically more stable and efficient way.

    From a linear algebra point of view, the set of all eigenvalue/vector pairs is of interest. It is easy to prove that eigenvectors associated with the same eigenvalue $\lambda$ form a linear subspace of $V_N$ (the so-called eigenspace or invariant subspace associated with $\lambda$); invariant subspaces for different $\lambda$ intersect only at 0, in other words eigenvectors corresponding to different eigenvalues are linearly independent. An important consequence is that the maximal number of linearly independent eigenvectors (for all eigenvalues together) is always $\le N$. If it equals N, then A is called diagonalizable: Indeed, in this case, if $(\lambda_j, x_j)$, $j = 1, \dots, N$, is such a maximal system of eigenvalue/vector pairs, one can form a diagonal matrix D with the $\lambda_j$ on the main diagonal, and the matrix

    $$X = [x_1 \; x_2 \; \dots \; x_N]$$

    with the eigenvectors as columns. Since $A x_j = \lambda_j x_j$ for every j, by definition of matrix multiplication we therefore have

    $$A X = X D.$$

    The linear independence of the eigenvectors implies that the rank of X is maximal, in other words, X is non-singular, and the inverse $X^{-1}$ exists. Thus, we can also write

    $$A = X D X^{-1}.$$

    This is one of the important factorizations with a lot of uses in matrix algebra: It says that if we have enough linearly independent eigenvectors then, using a similarity transform (= change of basis) facilitated by X, the matrix can be transformed into a diagonal one: $X^{-1} A X = D$. An important special case arises if A is symmetric (Hermitian), i.e., if A==A' using Matlab notation. Then A is diagonalizable, all eigenvalues are real, and X can be chosen orthogonal (unitary), i.e., X'==inv(X) using Matlab notation. In this case

    $$A = X D X', \qquad X' A X = D.$$

    Matlab's function eig finds D, X satisfying $A X = X D$, also in the case that A is not diagonalizable (you can find out about this by computing the determinant or the rank of X). If A is diagonalizable (or even symmetric) then X can be inverted and satisfies all these properties. Check it out!

    [X,D]=eig(A)

    A*X-X*D

    rank(X)

    det(X)

    [X,D]=eig(B)

    rank(X)

    inv(X)*B*X-D

    C=B+B'

    C==C'

    [X,D]=eig(C)

    X'-inv(X)

    c=rand(size(C,1),1);x=C\c;y=X*((X'*c)./diag(D));x-y

    The last line of commands should convince you that, at least in principle, the eigenvalue decomposition for symmetric matrices $A = X D X'$ can also be used to solve linear systems with coefficient matrix A. However, other factorizations (e.g., LU-factorization) are more favorable in this respect. Please compare the cputime for [X,D]=eig(A) with the one for B=inv(A) for matrices of various sizes N.

    Note: eig and other matrix algebra commands become less useful (due to storage problems, execution time, and instability issues) if the dimension N becomes larger (say, larger than 2000 on our systems). Really large problems $N \gg 10^4$ can still be treated if the matrices are sparse. Students with interest in such problems should read about the sparse matrix format supported by Matlab, and consult help eigs if the computation of eigenvalues/vectors of large matrices is of concern.

    There is another important matrix factorization, the so-called singular value decomposition (SVD, the Matlab command is svd, as expected), which is close in spirit to the eigenvalue decomposition but with two important formal differences: It always exists, and it even exists for arbitrary rectangular $m \times n$ matrices. The SVD is closely related to the least-squares solution of linear systems, and to things like pseudo-inverses or finding the best orthonormal bases for representing a linear map between two vector spaces. The basic statement about SVD is as follows: For any $m \times n$ matrix A, there is a decomposition

    $$A = U S V'$$

    with an $m \times n$ diagonal matrix S with non-negative diagonal elements arranged in decreasing order (these are the singular values of A), an $m \times m$ unitary matrix U and an $n \times n$ unitary matrix V. While U and V are closely related to the eigenvectors of the symmetric matrices $A A'$ and $A' A$, respectively, the non-zero diagonal entries of S coincide with the square roots of the positive eigenvalues of either of these matrices.

    [U,S,V]=svd(A)
    U*S*V'-A
    [U1,D]=eig(A*A')
    C= [0.62 0.42 0.38 0.58;
        0.40 0.62 0.58 0.40;
        0.40 0.58 0.62 0.40;
        0.58 0.40 0.30 0.62];
    [U,S,V]=svd(C)

    The typical practical application of the SVD can be explained on the last example. The matrix C has two relatively small singular values compared to the two dominant ones. Not much would change if they were set to zero. Check this by computing

    C1=U(:,1:2)*S(1:2,1:2)*(V(:,1:2))'
    norm(C1-C)/norm(C)

    Make sure that you understand why the formula for computing C1 is really doing what was intended. From the formula you see that computing C1*x (which is probably a reasonably good approximation to C*x) just needs two columns of U and V and the two dominating singular values. Now think of big $m \times n$ matrices C and apply the same strategy: Compute the SVD, keep the large singular values only and neglect the small ones, and store the significant columns of U and V only. Then, the amount of arithmetic operations and storage to do the multiplication $C_1 x \approx C x$ is proportional to the product of the number of significant singular values and the dimension $N = \max(m, n)$, which is often much less than what the original multiplication Cx would take! Run your own example. Statistics and data processing make use of this fact.
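    The savings can be checked on a larger synthetic matrix. The following is a minimal sketch under assumptions: the sizes m, n, the rank K, and the noise level are arbitrary illustration values, and the variable names Uk, Sk, Vk are not part of the script.

```matlab
% Low-rank approximation via truncated SVD (illustrative sizes, synthetic data)
m=300; n=200; K=5;                       % assumed dimensions and underlying rank
C=randn(m,K)*randn(K,n)+1e-3*randn(m,n); % rank-K matrix plus small noise
[U,S,V]=svd(C);
s=diag(S);                               % singular values in decreasing order
Uk=U(:,1:K); Sk=S(1:K,1:K); Vk=V(:,1:K); % keep the K dominant singular triplets
C1=Uk*Sk*Vk';                            % truncated SVD of C
rel_err=norm(C1-C)/norm(C)               % equals s(K+1)/s(1) in the 2-norm
x=randn(n,1);
y1=Uk*(Sk*(Vk'*x));                      % needs about K*(m+n) multiplications
mult_err=norm(y1-C*x)/norm(C*x)
storage_ratio=K*(m+n+1)/(m*n)            % stored numbers: truncated vs. full matrix
```

    Multiplying from right to left, as in the line for y1, is what keeps the operation count proportional to K(m+n); forming C1 explicitly would give away the advantage.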

    2.5 Applications

    Today's applications are still a bit academic. They are aimed at explaining where matrices and linear systems come from, and how the basic Matlab routines such as \ (for solving a linear system), eig and svd help in the exploration. In some of the examples, we will use synthetic data (i.e., data generated in a reasonable fashion by Matlab itself), and not data that come from outside sources.

    Linear systems. A typical problem in applied sciences is to find an analytical expression of a function (between input and output parameters of a certain process or device) that is only accessible through noisy measurements. That is, for an unknown function $f : X \to Y$, we have data pairs $(x_i, y_i) \in X \times Y$, $i = 1, \dots, M$, which are interpreted through

    $$y_i = f(x_i) + w_i$$

    (this is an additive noise model, $w_i$ is considered random and independently distributed with expectation 0 and following some probability distribution, e.g., the normal or Gaussian distribution). The aim is to find a close enough approximation to f. Since one does not know f beforehand, one needs to use some ansatz, e.g.,

    $$f(x) \approx \tilde f(x) := c_1 f_1(x) + c_2 f_2(x) + \dots + c_N f_N(x)$$

    where the $f_j(x)$ are a priori chosen shape functions and the $c_j$ unknown coefficients that we will determine using the available noisy data such that $\tilde f(x) \approx f(x)$ can be expected. Often, the $f_j(x)$ are monomials or power functions, exponentials, Gaussian bell-shaped functions or splines (the choice will depend on the available a priori information on f(x)).

    Without going into details, we will present the common strategy (also known as least-squares fit): We will assume that $M \ge N$, and try to find the $c_j$ by minimizing the least-squares functional

    $$F(c_1, \dots, c_N) := \frac{1}{2} \sum_{i=1}^{M} \Big( \underbrace{\sum_{j=1}^{N} c_j f_j(x_i)}_{\tilde f(x_i)} - y_i \Big)^2$$

    for fitting $\tilde f$ to the given data. This is reasonable also from a statistics point of view, and the easiest approach numerically (one could replace the sum of the squares of the point-wise deviations $|\tilde f(x_i) - y_i|$ by their sum or their maximum which could be plausible in some applications but leads to a more difficult optimization problem). It turns out that one does not have to use optimization routines to find the optimal $c_j$ but just a little matrix algebra. For this, introduce the matrix A and the column vector b given by

    $$A_{ij} = f_j(x_i), \quad b_i = y_i, \quad i = 1, \dots, M, \quad j = 1, \dots, N.$$

    Also, organize the unknown coefficients $c_j$ into a column vector c. Then the above functional F has the short-hand expression

    $$F(c_1, \dots, c_N) = \frac{1}{2} \| b - Ac \|_2^2 = \frac{1}{2} (b - Ac)^T (b - Ac),$$

    and leads to another interpretation of the minimization problem: Find a vector c such that the residual vector $r = b - Ac$ of the over-determined linear system $Ac = b$ is minimal in the Euclidean norm: $\| r \|_2 \to \min$. Exactly this problem is solved by the generic Matlab command c=A\b (check with the help command!). Alternatively, using Calculus of functions of several variables, one checks that any extreme point c of F needs to satisfy the so-called system of normal equations

    $$A^T A c = A^T b.$$

    Check that both approaches lead to the same result (for larger problems, never use the normal equations!) by creating a Matlab script. An example with synthetic data could be as follows.

    M=100;w=0.1;x=pi*sort(rand(1,M));y=sin(x)+w*randn(1,M);
    plot(x,y,'r.');

    Understand what this means: We have created data that satisfy the additive white noise model. Assume that you want to use 4-th order polynomials to find a fit $\tilde f$ for the function $f(x) = \sin(x)$ sitting in the background. I.e., choose N=5 and $f_j(x) = x^{j-1}$. Form A, b, solve for c using both approaches, then compute and plot $\tilde f(x)$ resp. $\tilde f(x) - \sin(x)$ using a sufficiently dense grid. Now play with different noise levels w, and other values for N or M.

    Note: What happens if N gets close to M? For large values of N and M, to create A, use Matlab's command for Vandermonde matrices. If the class of ansatz functions $\tilde f(x)$ is the set of polynomials of a certain degree, the least-squares fit can be done more economically using special Matlab commands such as polyfit.
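    One possible way to organize such a script is sketched below. It is an illustration under assumptions, not a reference solution: the names xc, c_bs, c_ne, xg, fg are chosen here, and the matrix A is built column by column rather than with a Vandermonde command.

```matlab
% Least-squares fit of noisy sin(x) data by a polynomial of degree N-1
M=100; w=0.1; N=5;
x=pi*sort(rand(1,M)); y=sin(x)+w*randn(1,M);   % synthetic data as in the text
xc=x(:); b=y(:);                               % column vectors
A=zeros(M,N);
for j=1:N
    A(:,j)=xc.^(j-1);                          % A(i,j)=f_j(x_i)=x_i^(j-1)
end
c_bs=A\b;                                      % least-squares solution via backslash
c_ne=(A'*A)\(A'*b);                            % solution of the normal equations
diff_c=norm(c_bs-c_ne)/norm(c_bs)
p=polyfit(x,y,N-1);                            % same fit, highest power first
diff_p=norm(c_bs-flipud(p(:)))/norm(c_bs)
xg=linspace(0,pi,400)';                        % dense grid for evaluation
fg=zeros(size(xg));
for j=1:N
    fg=fg+c_bs(j)*xg.^(j-1);                   % evaluate the fitted polynomial
end
max_dev=max(abs(fg-sin(xg)))                   % deviation from the noise-free sin(x)
```

    Adding plot(x,y,'r.',xg,fg,'b-',xg,sin(xg),'k--') displays the data, the fit, and the underlying function in one figure.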

    Eigenvalue problems. The use of eigenvalues/eigenvectors is manifold. E.g., they can be used to solve matrix equations. For instance, if A is diagonalizable then

    A=rand(5);[X,D]=eig(A);
    B=X*diag(sqrt(diag(D)))*inv(X)
    sqrtm(A)
    B*B-A

    the above computed B solves $X^2 = A$, i.e., represents a square root of A. This way other matrix functions can be defined. E.g., check,

    B=X*diag(exp(diag(D)))*inv(X)
    expm(A)

    With this example in mind, how would you define a matrix sine function sinm?

    A more serious application is the investigation of initial value problems for linear differential equations with constant coefficients, such as

    $$x''(t) + d\,x'(t) + k^2 x(t) = 0, \quad x(0) = x_0, \quad x'(0) = v_0,$$

    describing a linear oscillator with damping d. The standard approach is to write any such problem as a linear system of first order equations,

    $$\begin{aligned}
    u_1'(t) &= A_{11}u_1(t) + A_{12}u_2(t) + \dots + A_{1N}u_N(t), & u_1(0) &= u_{10},\\
    u_2'(t) &= A_{21}u_1(t) + A_{22}u_2(t) + \dots + A_{2N}u_N(t), & u_2(0) &= u_{20},\\
    &\;\;\vdots\\
    u_N'(t) &= A_{N1}u_1(t) + A_{N2}u_2(t) + \dots + A_{NN}u_N(t), & u_N(0) &= u_{N0},
    \end{aligned}$$

    or, using vector functions and matrix notation,

    $$u'(t) = A u(t), \quad u(0) = u_0.$$

    In the above example, one would set $u(t) = [x(t); x'(t)]$, and determine A=[0 1;-k*k -d], u0=[x0;v0]. Now, two observations. The uniquely existing solution of this homogeneous problem (no outer forces) is given by $u(t) = e^{tA} u_0$, i.e., the matrix exponential describes the evolution of the trajectory. On the other hand, whether all the trajectories are damped, i.e., tend to zero as $t \to \infty$, can be found out solely from the eigenvalues of A: Damping occurs iff all eigenvalues have negative real part! Check it out for the toy example: Choose a parameter k > 0 and various d, positive and negative. Compare with a plot of the trajectory (compute it using expm). This type of question is of crucial interest for electrical circuit evaluation and modelling. We will deal with the numerical solution of linear and nonlinear differential equations later.
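    A possible starting point for this experiment is sketched below. The parameter values k=2, x0=1, v0=0, the two damping values, and the final time T=20 are assumptions chosen for illustration.

```matlab
% Damped linear oscillator x''+d*x'+k^2*x=0 written as first-order system u'=A*u
k=2; x0=1; v0=0;                 % assumed toy parameters
T=20; tt=linspace(0,T,201);      % time grid for the trajectory
dvals=[0.5 -0.5];                % one positive and one negative damping value
re_max=zeros(size(dvals)); u_end=zeros(size(dvals));
for m=1:length(dvals)
    d=dvals(m);
    A=[0 1;-k*k -d]; u0=[x0;v0];
    re_max(m)=max(real(eig(A))); % negative iff all eigenvalues have negative real part
    u=zeros(2,length(tt));
    for n=1:length(tt)
        u(:,n)=expm(tt(n)*A)*u0; % u(t)=e^{tA}*u0
    end
    u_end(m)=norm(u(:,end));     % size of the state at the final time T
end
re_max
u_end
```

    For d=0.5 the largest real part is -0.25 and the state has almost died out at t=T, while for d=-0.5 it is +0.25 and the state has grown by roughly a factor of e^5. Plotting tt against u(1,:) inside the loop shows the corresponding trajectories.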

    SVD. Here is a possible use of the SVD. Suppose you get the results of measurements represented in the form of $M \times 1$ vectors from a large number N of probes that belong to a small number of types (say, represent certain types of organisms or stars which would return similar, distinctive measurement results) but you don't know how many different types are covered in the given set of measurements, nor which probe belongs to which type. The problem of finding this out is a classification problem. Here is how SVD comes in. Suppose you arrange all the incoming measurements column by column into an $M \times N$ matrix A. If there were measurements coming from K different types, many of these columns are similar (due to noise they are not identical, so applying commands such as rank is not helping). Typically, SVD will reveal the number of classes with high fidelity: Since A has rank K (or less) up to noise, it will have about K large singular values, and the rest will be very small.

    Check this idea by a simulation, this is best done by writing a script file rather than inputting commands line by line at the Matlab prompt. To this end, fix a small number K K and perturbation levels.

    How would you proceed with finding the type of each measurement vector in an efficient way? Hint: You may want to use the SVD result to replace the original $M \times 1$ measurement vectors by auxiliary $K \times 1$ vectors containing still enough of the relevant information, and use them for the classification into types. Note that this approach leads to an efficient classification algorithm which, with high probability, will recognize if a new measurement belongs to one of the existing types (or corresponds to a completely different type that was not among the ones represented by the initially available measurements).
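    A simulation along these lines could start as follows. This is only a sketch under assumptions: the data model (each measurement is one of K random prototype vectors plus Gaussian noise), the sizes, and the names P, labels, K_est, Y are all chosen here for illustration.

```matlab
% Simulated classification data: N noisy copies of K prototype vectors
M=50; N=400; K=3; noise=0.01;         % assumed sizes and perturbation level
P=randn(M,K);                         % one prototype column per type
labels=mod(0:N-1,K)+1;                % true type of each measurement (unknown in practice)
A=P(:,labels)+noise*randn(M,N);       % measurements arranged column by column
[U,S,V]=svd(A);
s=diag(S);
s(1:K+2)'                             % K large singular values, then a sharp drop
[gap,K_est]=max(s(1:end-1)./s(2:end)) % position of the largest drop estimates the rank
Y=U(:,1:K_est)'*A;                    % K_est x N reduced vectors for the classification
size(Y)
```

    The columns of Y carry the coordinates of the measurements with respect to the K_est dominant left singular vectors, so any grouping of the measurements can work with vectors of length K_est instead of M.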


    3 Week 3

    Covered are

    - Condition of a problem and stability of an algorithm
    - Integration
    - Solving differential equations
    - Saving and loading of variables and data

    3.1 Condition and stability

    We already know that errors and error propagation are inevitable in numerical computations. In addition to modeling and discretization errors that we will not discuss here, there are two major sources of errors in a computational problem: Errors that lie in the nature of the problem at hand (e.g., ill-posed problems, where small perturbations in the input will lead to large changes in the output), and additional errors that can be attributed to the error propagation in the concrete algorithm to obtain the output from the input numerically.

    The first source of trouble is described by the condition of a problem. The condition measures the largest ratio of output error to input error, assuming that the input error is arbitrary but small enough. We illustrate this on the example of solving a system of linear equations $Ax = b$, where the input is the right-hand side b and the output is the solution $x = A^{-1}b$, and errors (as is often the case in floating-point arithmetics) are relative errors with respect to the Euclidean vector norm.

    e=0.00001;N=5;M=1000;
    format rat;
    A=hilb(N)
    Ainv=invhilb(N)
    format;
    [Z,D]=eig(A);[eig_max,i]=max(diag(D));
    b=Z(:,i)
    x=Ainv*b
    % M random perturbation directions of unit norm
    Rb=randn(N,M);Rb=Rb./(ones(N,1)*sqrt(sum(Rb.*Rb)));
    B=b*ones(1,M)+e*Rb;
    X=Ainv*B;Rx=X-x*ones(1,M);
    Experimental_Condition=sqrt(max(sum(Rx.*Rx))/(x'*x))/e
    cond(A)

    As A, we have chosen the symmetric Hilbert matrix hilb(N) of dimension N = 5. The test uses the exact inverse invhilb(N). To see that this matrix is indeed badly conditioned (= resists the accurate solution of the associated linear system), we have chosen a special right-hand side b of unit norm (the eigenvector of A associated with the largest eigenvalue), and created M = 1000 random perturbations $\tilde b$ of relative error e = 0.00001, and organized them in a matrix B. The columns of the matrix Rx are the error vectors $\tilde x - x = A^{-1}\tilde b - A^{-1}b$ of the corresponding solutions, and Experimental_Condition computes the worst-case relative norm error of the solutions among the 1000 tested cases divided by the relative error level e of the input perturbations. The built-in Matlab function cond(A) returns the theoretical value which is in good agreement with this experimental result. For symmetric matrices, the condition number can be defined as the ratio of the largest to the smallest absolute value of the eigenvalues of A, which again shows how crucial eigenvalues are. The interpretation of cond(A) is as follows: If right-hand sides of a linear system with this A are computed with relative error e then the computed solution x could have a relative error that is as much as 500000 times larger! This is a loss of about 5-6 digits! Unfortunately, many large-size matrices appearing in applications are ill-conditioned. Checking the condition of matrices appearing in your applications is worthwhile!
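    The statement about symmetric matrices can be verified directly for the Hilbert matrix. The short sketch below only uses hilb, eig, and cond; the variable names lam, ratio, c are chosen here.

```matlab
% Condition number of the symmetric Hilbert matrix from its eigenvalues
N=5;
A=hilb(N);
lam=eig(A);                          % real eigenvalues, since A is symmetric
ratio=max(abs(lam))/min(abs(lam))    % largest over smallest absolute eigenvalue
c=cond(A)                            % 2-norm condition number computed by Matlab
rel_diff=abs(ratio-c)/c
digits_at_risk=log10(c)              % rough number of decimal digits that can be lost
```

    Repeating this for N=10 or N=15 shows how quickly the condition of hilb(N) deteriorates with the dimension.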

    Another ill-conditioned operation which one needs to be aware of is the subtraction of positive numbers of approximately the same magnitude (or, what is the same, addition of numbers of the same magnitude with opposite signs). Test this on the formula $z = 1000(x + y)$, where $x = 1.000$ and $y = -1.001$ lead to the exact result $z = -1$.

    e=0.00001;
    x=1.000+2*e*(rand(1,100)-0.5);
    y=-1.001+2*e*(rand(1,100)-0.5);
    z=1000*(x+y);
    Experimental_Condition=max(abs(z+1))/e

    Multiplication is well-conditioned in a floating-point environment (unless you force under- or overflow), so the multiplication by 1000 is not the problem. Also, you get the same condition for other error levels e. Most of the core computational problems have instances where they are well-conditioned, and instances where they are ill-conditioned. For the addition of two numbers, we have

    $$\mathrm{cond}(x + y) = \frac{|x| + |y|}{|x + y|} \quad (|x + y| \gg \mathrm{eps}),$$

    so addition of two positive numbers is always safe (i.e., the condition number is 1, or close to one).
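    The formula is easily evaluated for the test case above. In the following sketch the function handle cond_add is a name introduced here for illustration.

```matlab
% Condition of the addition x+y according to (|x|+|y|)/|x+y|
cond_add=@(x,y) (abs(x)+abs(y))./abs(x+y);
c_cancel=cond_add(1.000,-1.001)      % opposite signs, nearly equal magnitude
c_safe=cond_add(1.000,1.001)         % same sign
```

    The first value is about 2001, which is consistent with the experiment above, where the observed ratio cannot exceed 2000 because each of the two inputs is perturbed by at most e. The second value is exactly 1.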

    However, the algorithm we choose may contribute additional errors. A stable (= good) algorithm will not return a numerical result that is less accurate than predicted by the condition of the problem (we cannot, in general, expect that it does better either!). E.g., most of the algorithms for linear algebra problems discussed last week are reasonably stable (they may fail in this respect for some larger systems, however, it is hard to force them to do so). Examples that are easier to explain are the addition of many positive numbers and the computation of elementary functions. The following example shows what goes wrong with addition:

    format long;

    x=10^8;j=0;while j

    by its power series (essentially, this problem has already been dealt with in Week 1). Since

    $$\frac{|e^{\tilde x} - e^x|}{e^x} = |e^{\tilde x - x} - 1| \approx |\tilde x - x| = |x| \, \frac{|\tilde x - x|}{|x|}$$

    for small perturbations, the (relative) condition of this mapping $x \mapsto y = e^x$ is about $|x|$, i.e., grows linearly with the magnitude of the input. Now, determine the behavior of the following two algorithms for $x \in [-30, 30]$ implemented in the following function (you can retype it and save it as my_exp.m, or download it from the course webpage).

    function y=my_exp(x,method);
    % Two algorithms for evaluation of exp(x) using power series
    % method==0: Evaluation using power series for all x
    % method~=0: As above for x>=0, alternative

    % exp(x)=1/exp(|x|) for x
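    The difference between the two evaluation strategies can also be tried out directly at the prompt. The following is a minimal sketch and not the course's my_exp.m: the argument x=-30, the fixed number of 200 series terms, and the names s_direct, s_pos, s_recip are assumptions made for illustration.

```matlab
% Power series for exp(x) at a negative argument: direct vs. reciprocal evaluation
x=-30; nterms=200;                    % assumed argument and number of series terms
s_direct=1; term=1;
for n=1:nterms
    term=term*x/n;                    % x^n/n!, the terms alternate in sign
    s_direct=s_direct+term;
end
s_pos=1; term=1;
for n=1:nterms
    term=term*abs(x)/n;               % |x|^n/n!, all terms are positive
    s_pos=s_pos+term;
end
s_recip=1/s_pos;                      % uses exp(x)=1/exp(|x|) for x<0
err_direct=abs(s_direct-exp(x))/exp(x)
err_recip=abs(s_recip-exp(x))/exp(x)
```

    In the direct sum, terms of size up to about 10^12 have to cancel down to a result of size 10^-13, so rounding errors of the large terms dominate the result. The reciprocal variant adds positive numbers only and stays close to machine accuracy.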


    3.2 Integration

    Numerical integration for finding the value of an integral $\int_\Omega f(x)\,dx$ is necessary if the function y = f(x) is not available via nice formulas but only through measurements or as a black box evaluation routine, or, especially in higher dimensions, if the integration domain $\Omega$ is not trivial. There are two basic approaches.

    - In composite quadrature rules such as implemented in quad, quadl for integrals over intervals (or dblquad for integrals over rectangles), the domain $\Omega$ is divided into many small intervals (or rectangles) $\Omega_i$, on each of these the function is replaced by a polynomial $p_i(x)$ via local interpolation using a certain number of points $x_{ij}$ in each $\Omega_i$, and the $p_i(x)$ are integrated exactly. The results are then added up:

    $$\int_\Omega f(x)\,dx = \sum_i \int_{\Omega_i} f(x)\,dx \approx \sum_i \int_{\Omega_i} p_i(x)\,dx = \sum_i \sum_j w_{ij} f(x_{ij})$$

    The weights $w_{ij}$ can be pre-computed. The error depends on the size of the $\Omega_i$, the degree of the polynomials $p_i(x)$ used locally, and the smoothness of f(x).

    - Monte-Carlo type methods where pseudo-randomly or quasi-randomly but more or less uniformly located points $x_j$ in $\Omega$ are generated one by one, and the integral is approximated by the simple formula

    $$\int_\Omega f(x)\,dx \approx \frac{|\Omega|}{N} \sum_{j=1}^{N} f(x_j), \quad N = 1, 2, \dots,$$

    where $|\Omega|$ denotes the volume of $\Omega$ (the factor drops out for a domain of unit volume). It can be shown that under weak conditions on f(x), the error of this approximate integration process decays with order $1/\sqrt{N}$ (or close to $1/N$ for quasi-random point sequences). This is relatively slow but sometimes the only option in higher-dimensional applications.
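    To make the first approach concrete, here is a hand-written composite Simpson rule, where each $p_i$ is the quadratic interpolating f at the end points and the midpoint of the i-th subinterval. It is a sketch for illustration only (quad and quadl work adaptively and are more refined); the integrand, the interval, and the number n of subintervals are assumptions.

```matlab
% Composite Simpson rule for the integral of f over [a,b]
f=@(x) exp(x);                        % integrand (assumed example)
a=-15; b=0; n=200;                    % interval and number of subintervals
h=(b-a)/n;
I=0;
for i=1:n
    xl=a+(i-1)*h; xr=xl+h; xm=(xl+xr)/2;    % nodes x_ij on the i-th subinterval
    I=I+h/6*(f(xl)+4*f(xm)+f(xr));          % weights w_ij = h/6, 4h/6, h/6
end
I
I_exact=1-exp(-15)
err=abs(I-I_exact)
```

    Doubling n should reduce err by a factor of about 16, since the error of the composite Simpson rule behaves like h^4 for smooth integrands.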

    For example, if we want to compute the integral $\int_{-15}^{0} e^x\,dx$, where $f(x) = e^x$ is smooth, one could use

    format long;
    [I,Fev]=quadl('my_exp(x,1)',-15,0,1.e-5)
    f=inline('my_exp(x,P1)',1); [I,Fev]=quadl(f,-15,0,1.e-14,[],0)
    g=@my_exp; [I,Fev]=quadl(g,-15,0,1.e-14,[],1)
    [I,Fev]=quadl('my_exp',-15,0,1.e-5,[],0)
    I=1-exp(-15)
    format

    The example also demonstrates the typical ways of submitting function names to other Matlab functions: via expressions, inline objects, function handles, or by their name directly. It also shows the impact of changing tolerance and parameters. Note that the actual error may be much smaller than the prescribed tolerance (which is a desirable upper bound for the error submitted by the user), and that the number of function evaluations required by quadl dramatically increases if the tolerance is close to machine accuracy eps. In order to fully understand the different calling options of quadl, use help quadl. An important feature which most of the built-in Matlab functions have is parameter handling for functions to be called (in this case, the method parameter of my_exp), and the option to be called with a variable number of input/output arguments. If you want to program such functions yourself, this can be organized using the Matlab commands nargin (= number of input arguments) and nargout (= number of output arguments).

    Just for fun, let us evaluate the area inside the ellipse

    $$x^2 + y^2/4 = 1$$

    (according to Calculus, this area is exactly $2\pi$, i.e., an ellipse with half-axes a = 1 and b = 2 fits two unit circles) using the Monte-Carlo technique. Here is a slow version of the method using Matlab's rand function which computes the integral of the characteristic function of the quarter-ellipse located in the intersection with the first quadrant over the rectangle $[0, 1] \times [0, 2]$ containing it. Try to understand what the program does.

    N=10000;clf;

    I=zeros(1,N);s=0;

    for n=1:N

    x=rand;y=2*rand;

    if x^2+y^2/4


    loglog(1:N,abs(I-pi/2),'r.',1:N,1./sqrt(1:N),'k--');
    legend('Error','N^{-1/2}');xlabel('Number of points');
    title('Error behavior of Monte Carlo integration (loglog plot)');
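    A faster, vectorized variant of the same experiment can be sketched as follows. It is not the script above but an alternative under the same setup; the names inside and I are chosen here, and I(n) holds the estimate after n points.

```matlab
% Vectorized Monte Carlo estimate of the quarter-ellipse area (exact value: pi/2)
N=10000;
x=rand(1,N); y=2*rand(1,N);           % uniform points in [0,1]x[0,2]
inside=double(x.^2+y.^2/4<1);         % characteristic function of the quarter-ellipse
I=2*cumsum(inside)./(1:N);            % area of the rectangle (=2) times fraction of hits
final_error=abs(I(end)-pi/2)
ellipse_area=4*I(end)                 % estimate of the full area 2*pi
```

    The vector I can be passed to the loglog command above without changes. The typical size of final_error for N=10000 is about 0.01, in line with the $1/\sqrt{N}$ behavior.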

    3.3 Solving differential equations

    Differential equations are an important modeling tool. We mentioned the equation for a damped linear oscillator (and systems of linear ordinary differential equations with constant coefficients) in Week 2. Although most large-scale systems of differential equations are better solved with specialized software, Matlab has some tools for the initial exploration of ordinary differential equations (ODEs), where the unknown function or function vector depends on only one independent variable often interpreted as time t (some functions for the more complex partial differential equations (PDEs) are available from Matlab's PDE toolbox but we will not talk about them). ODE solvers, and special visualization tools for them, can be found under help funfun. We will consider a standard model problem of a nonlinear oscillator, the van der Pol equation (there is a demo program vdpode available for this, just check with the help function from Matlab's menu bar):

    $$x''(t) + d\,(x(t)^2 - 1)\,x'(t) + x(t) = 0, \quad x(0) = x_0, \quad x'(0) = v_0.$$

    Here x(t) is the displacement of the oscillator from its equilibrium point x = 0, $v(t) = x'(t)$ is the velocity, t is time, d > 0 a characteristic constant (all quantities have been made dimensionless), and the above equation represents Newton's law of motion. The proportionality factor $d(x(t)^2 - 1)$ (in front of $x'(t)$) depends nonlinearly on the motion of the oscillator: For large displacement values $|x(t)| \gg 1$ it damps the oscillation, for small $|x(t)|$


    and rewriting the above initial value problem as explicit (in the derivatives) ODE system of first order

    $$\begin{aligned}
    y_1'(t) &= y_2(t), & y_1(0) &= x_0,\\
    y_2'(t) &= d\,(1 - y_1(t)^2)\,y_2(t) - y_1(t), & y_2(0) &= v_0.
    \end{aligned}$$

    Matlab also has routines for implicit ODEs, mixed systems of ODEs and algebraic equations, and many other types of dynamical systems depending on one continuous independent variable but in many cases we can reduce problems to the principal explicit form

    $$y'(t) = f(t, y(t)), \quad y(t_0) = y_0,$$

    where y maps the time interval $[t_0, t_e]$ of interest into $\mathbb{R}^n$, and the right-hand side f has as arguments the scalar $t \in \mathbb{R}$ and the vector $y \in \mathbb{R}^n$, and returns a vector with n entries. To solve such a system by Matlab's ode??? functions, we first have to create a Matlab function for the right-hand side function f(t, y). The convention is that the vector y is a row or column vector but the output vector dy = f(t, y) is always a column vector of the same length n. Additional parameters can be included. In our case, the following function would do:

    function dy=F_VanDerPol(t,y,d);
    % Computes the right hand-side of the ODE system
    % corresponding to the nonlinear van der Pol oscillator
    % described by
    % y'' + d*(y^2-1)*y' + y = 0
    % The parameter d>0 influences the amount of damping
    % (for large |y|>1) resp. instability (for small |y|<1).
    dy=[y(2);d*(1-y(1)^2)*y(2)-y(1)];

    You can skip the comments when typing the function, don't forget to save it in a file F_VanDerPol.m.

    All ODE solvers have the typical calling pattern

    [t,y,...]=ode???(odefun,tspan,y0,options,p1,p2,...);

    where odefun (the function coding the right-hand side, e.g., as function handle), tspan (the vector describing the interval of computation $[t_0, t_e]$), and y0 (the vector of initial values) are mandatory. The contents of the options structure can be found out by typing help odeset, advanced users can change it by using the odeset command. The remaining arguments are parameter values passed to odefun (in our case, this would be the parameter d). The output t is a $k \times 1$ column vector of intermediate time points $t_j$ (automatically generated by the ODE solver), and y is a $k \times n$ matrix whose rows contain approximations to the components of the solution vector y(t) at these $t_j$. It can be used to plot graphs of the solution resp. certain solution components. Try it out, first using a small parameter d = 1:

    [t,y]=ode45(@F_VanDerPol,[0 30],[0.01 -0.01],[],1); clf;
    subplot(1,2,1);plot(t,y(:,1),'g',t,y(:,2),'r'); xlabel('t');
    legend('Deviation x','Velocity v'); title('Solution components');
    subplot(1,2,2);plot(y(:,1),y(:,2),'b');
    xlabel('x');ylabel('v');
    title('Trajectories in phase space');
    hold on;
    [t,y]=ode45(@F_VanDerPol,[0 30],[-5; 10],[],1);
    plot(y(:,1),y(:,2),'k');
    subplot(1,2,2); legend('Small y_0','Large y_0');

    You can clearly see that solutions starting from close to the origin of the phase space (the x-v plane) become periodic, and wind from the inside to a closed limit curve (the so-called limit cycle). The same holds for trajectories starting from large initial values: they wrap around the limit cycle from the outside. The shape of the limit cycle is slightly asymmetric but still close to the one for d = 0 when the oscillator is harmonic and undamped, and all trajectories are circles. The period of the limit cycle (the time it takes to wrap around once) is always larger than $2\pi$, it grows with d. Other ODE Matlab routines will give similar results. There are some automatic plotting options, e.g., omitting the output [t,y]= will automatically (and dynamically) plot solution components, trajectories in two- and three-dimensional phase spaces can be plotted by the functions odephas2 and odephas3, respectively. They can be invoked by changing the options structure, see below.
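    If you prefer not to create a separate function file, the right-hand side can also be supplied as an anonymous function that captures the parameter d. The following sketch assumes this alternative calling style; the names rhs, late, and amplitude are introduced here.

```matlab
% Van der Pol oscillator with the parameter d captured in an anonymous function
d=1;
rhs=@(t,y) [y(2); d*(1-y(1)^2)*y(2)-y(1)];   % same right-hand side as F_VanDerPol
[t,y]=ode45(rhs,[0 30],[0.01; -0.01]);
late=(t>=20);                                 % part of the run close to the limit cycle
amplitude=max(abs(y(late,1)))                 % displacement amplitude on the limit cycle
```

    For d=1 the computed amplitude is close to 2, which matches what the phase space plot shows for the limit cycle.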

    The interesting switching feature of this oscillator is exhibited more clearly for large d. In that case, the non-stiff solvers will, as a rule, fail, and ode45 is better replaced by ode15s:

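    clf;
    ode15s(@F_VanDerPol,[0 4000],[0.01 -0.01],[],1000);
    axis([0 4000 -5 5]);
    % my_opt is not defined yet at this point, so use default options
    ode15s(@F_VanDerPol,[0 4000],[3 0],[],1000);
    axis([0 4000 -5 5]);
    % plot the trajectory in phase space instead of the components
    my_opt=odeset('OutputFcn',@odephas2);
    ode15s(@F_VanDerPol,[0 4000],[3 0],my_opt,1000);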

    The blue circles indicate the displacement component, the green circles the velocity component. For large d, the destabilizing factor $d(1 - x(t)^2)$ for $|x(t)| < 1$ is so severe that trajectories starting near the origin are almost immediately moved very close to the limit cycle. Trajectories coming from the outside need some time. Close to the limit cycle, the velocity is almost zero if $|x(t)|$ is in the range from 2 to about 1.3. If the absolute value of the displacement decreases further then the oscillator's position is almost instantaneously moved to the opposite side of the equilibrium point x = 0, i.e., from 1.3 to -2 resp. from -1.3 to 2. This can be seen from the enormous values of v(t). This switching back and forth phenomenon is not possible in linear systems. Nonlinear systems can have more complicated behavior the more components the system has.

    Note: The plotting functions odeplot, odephas2 etc. plot points as they become available. The result is some kind of movie (a not very professional one, if you want to produce better movies illustrating your computations of dynamically changing objects, study the movie, getframe, etc. commands!). The speed is dictated by the internal time steps of the algorithm. As you can see from the plots, the numerical ODE solvers are intelligent in that they choose step-sizes (the distances between two subsequent $t_j$) according to the desired accuracy in an automatic way, and not equi-spaced. This feature makes it possible to compute reasonable solutions in an efficient way. E.g., the last ode15s call used only 507 internal time steps to get from $t_0 = 0$ to $t_e = 4000$, and the minimal time-step was about 0.000005 time units and the maximal time-step was about 280 time units. You can force the code to compute (and output) approximate solutions at prescribed $t_j$ by replacing the two-component vector tspan by a longer vector, e.g., by setting tspan=t0:h:te, where $h = (t_e - t_0)/N$ is the step-size of a uniform output mesh with N subintervals in $[t_0, t_e]$.
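    The effect of a longer tspan vector can be checked as follows. This is a small sketch with assumed values t0=0, te=30, N=300 and the van der Pol right-hand side for d=1 written as an anonymous function.

```matlab
% Output at prescribed, equally spaced time points t_j = t0 + j*h
t0=0; te=30; N=300;
h=(te-t0)/N;
tspan=t0:h:te;                                % N+1 requested output points
rhs=@(t,y) [y(2); (1-y(1)^2)*y(2)-y(1)];      % van der Pol right-hand side, d=1
[t,y]=ode45(rhs,tspan,[0.01; -0.01]);
size(y)                                       % one row per requested time point
max_spacing_dev=max(abs(diff(t)-h))           % returned times are the requested ones
```

    The solver still chooses its internal steps adaptively; the prescribed points only determine where the solution is reported.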

    3.4 Saving/loading of data

    Larger projects require you to deal with data that come from other applications, and save your own computational results for later use. The command for saving variables is save, followed by the filename in which to save the variables. By default, all variables are saved, but you can also specify exactly which variables should be saved. For instance, save comp a b* saves the variable a and all variables whose name starts with b in the file comp. Matlab usually stores the variables in a binary file with extension .mat, which most other programs cannot read. It is also possible to save the numbers in a vector or matrix in plain text. The load command comes with similar functionality. Check this out:

    clear;
    A=sqrt([ 1 2 3;0 -1 3;-4 -9 0]);B=A(:,2:3);C=hilb(20);
    d='What the hell does all this mean?'
    save result1;save -ascii result2.txt;
    save -ascii result3.dat C;
    clear;
    ! emacs result1.mat
    ! emacs result2.txt
    load result1 B d
    who
    clear;load result1
    load -ascii result2.txt
    load result3.dat

    Pay attention to the peculiarities of saving in plain text with the -ascii option: Imaginary parts are lost, double precision numbers are saved in single precision, the saved file cannot be easily read (unless it is logically a rectangular array), and variable names are lost (if a matrix is stored in plain text then the load command assigns it the filename of the file it was saved to as variable name).

    Saving to and reading from ascii files that have mixed contents requires more work (and C-style techniques). Some flexibility is provided by functions like fprintf (which extends in many ways the format when writing to the screen) for writing formatted text and fscanf for reading it. Here is an example of influencing how matrix elements appear in a file they are written to (it is modified from Matlab's help).

    x = 0.1:0.1:2; y = [x; exp(x); log(x)];
    fid = fopen('exp.txt','w');
    fprintf(fid,'%3.1f %12.4f %10.6e\n',y);
    fclose(fid);
    y=load('exp.txt')
    fid = fopen('exp.txt','r');
    A=fscanf(fid,'%g %g %g',[3 inf]);A=A'
    fclose(fid);

    We will talk about the formatting conventions and the actions observed during class, for in-depth information, take a C manual and use Matlab's help support. One can also output to the screen (set fid=1 or omit this file identifier), similar to the disp command for strings:

    L=101.2345;datatype='height above sea level';dimension='meters';
    fprintf('\n\n The current height above sea level is %8.2f meters. \n',L);
    fprintf(1,'\n\n The current %s is %1.2f %s. \n',datatype,L,dimension);
    s=sprintf('\n\n The current %s is %d %s. \n',datatype,L,dimension)
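    The role of the width and precision fields in a format specifier such as %8.2f is easiest to see with sprintf, which returns the formatted text as a string. The following lines are a small sketch; the variable names s1 to s4 are chosen here.

```matlab
% Effect of width and precision in format specifiers
L=101.2345;
s1=sprintf('%8.2f',L)        % fixed point, width 8, 2 decimals: '  101.23'
s2=sprintf('%1.2f',L)        % a width that is too small is enlarged as needed: '101.23'
s3=sprintf('%10.6e',L)       % exponential notation with 6 decimals
s4=sprintf('%s: %5.1f m','height',L)
```

    The first number in a specifier is the minimal field width, the second the number of digits after the decimal point; %s inserts a string unchanged.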

    There are other commands that cause interaction of the user with the Matlab program: pause in an m-file will stop execution until you hit enter, pause(10) will stop the program for 10 seconds. Interactive input of variables to a program can be organized as follows:

    B=input('Input a 3x3 matrix of integers: ')
    filename=input('Enter filename (including extension) as string: ')
    y=load(filename)
    filename=input('Enter filename (including extension): ','s')
    y=load(filename)

    Each time the system waits until you submit input from the keyboard (it will not check if your input is correct, you are responsible for programming such checks). Using the 's' argument tells the system not to evaluate the input but to treat it as text (so, if you type ABC after an input call with the 's' argument, the string 'ABC' is stored; if there is no 's' argument, Matlab searches for a variable ABC, tries to evaluate it according to its rules, and stores the result). Try this out.

    4 Week 4

    Covered are

    - Data analysis: An example
    - Nonlinear equations and optimization

    4.1 Data analysis: An example

    A time series is a sequence of measurements over time, here denoted by $X = \{X_i\}$. In this section we use Matlab in an exemplary way to analyse it. Please download the ascii file ozon.dat from the course webpage (for later use download also auto.m, the binary mat-file data.mat, lsq_bell.m, and nlsq_datafit.m). Load it as vector named ozon into the workspace using an appropriate method (see the previous subsection). The numbers in this vector represent measurements of the amount of ozone in the atmosphere above Norrköping, Sweden. The measurements were taken daily in the years from 1991 to 2000, so there are 3653 of them, and the timescale is equi-spaced (in units of days). Now, plot this time series.

    Our goal here will be to split the time series into three parts:

    Seasonal part: A pattern that repeats itself over regular intervals, for instance every year.

    Trend: The overall trend over time; this should be a slowly varying curve.

    Noise: The remainder, which cannot be explained. This is modeled as a random perturbation.

    Let us try to estimate the seasonal part of this data set. That is, suppose we know that the data is periodic with period $p$; then the seasonal part is the mean of $\{X_i, X_{i+p}, X_{i+2p}, \ldots\}$. Now let us try to find out what $p$ is. To do this, we calculate the autocorrelation vector of ozon. This vector contains the correlation coefficients of the given vector $X = (x_1, \ldots, x_N)$ with its periodic shifts

    $$X_j = (x_{j+1}, \ldots, x_N, x_1, \ldots, x_j), \qquad j = 0, \ldots, N-1.$$

    The correlation coefficient, denoted by $\rho(X, X_j)$, is a number between $-1$ and $1$. High correlation coefficients ($\rho(X, X_j)$-values close to 1) indicate that $X$ and $X_j$ are very close to each other; thus, the first significant maximum of $\rho(X, X_j)$ for $j > 0$ is a good guess for the period hidden in the data. Use the downloaded function auto.m for computing the autocorrelation vector of ozon (core Matlab provides the function corrcoef for computing correlation coefficients and matrices).

    help auto;

    ac_ozon=auto(ozon);

    auto(ozon);

    [m,i]=max(ac_ozon(350:380))

    [m,i]=max(ac_ozon(720:750))

    [m,i]=max(ac_ozon(1080:1110))

    We see that there is a first peak in the autocorrelation vector with a lag of around $j = 370$. For good reason, we guess that the period is in fact 365, the number of days in a year (we are not worrying about leap years). However, you can also proceed with any other reasonable number (e.g., the averaged lag between subsequent peak values).
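
    The idea behind the autocorrelation vector can be sketched with core Matlab functions only. This is not the downloaded auto.m (its contents are not shown here), but a minimal illustration that assumes a synthetic signal with known period 50 in place of ozon, and uses circshift and corrcoef to correlate the signal with its periodic shifts.

```matlab
% Synthetic test signal (assumption): period 50, ten periods, small disturbance
N = 500; p_true = 50;
t = (1:N)';
X = sin(2*pi*t/p_true) + 0.02*cos(0.37*t);

% Correlation coefficient of X with each periodic shift X_j, j = 0..N-1
ac = zeros(N,1);
for j = 0:N-1
    Xj = circshift(X, -j);          % (x_{j+1},...,x_N,x_1,...,x_j)
    R  = corrcoef(X, Xj);           % 2x2 correlation matrix
    ac(j+1) = R(1,2);               % ac(j+1) belongs to lag j
end

% Search for the maximum in a window around the expected lag
lags = 40:60;
[m, i] = max(ac(lags+1));
p_est = lags(i)
```

    Note that max returns the position i inside the search window, so the lag itself has to be recovered from the window, here via lags(i).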

    With the period at hand, we can now compute the seasonal component. It makes sense to smooth it out using some kind of sliding filter (or convolution). The simplest approach is a so-called moving average of length $2k+1$, where the entry $X_i$ of the time series is replaced by

    $$\tilde{X}_i = \frac{1}{2k+1}\,(X_{i-k} + X_{i-k+1} + \ldots + X_{i+k}).$$

    Apply it with, say, $k = 4$ or $k = 5$ to the vector s=ozon(1:p) (we expand it by k elements to the left and right in a periodic fashion, to get a smooth vector of length p after the averaging):

    k=4;p=365;s=ozon(1:p);

    se=[s(p-k+1:p);s;s(1:k)];...

    (Writing a little loop that does the averaging is left to you!) Denote the result again by s. By the way, instead of starting from ozon(1:p), one could also have used ozon(p+1:2*p) or some average of several such sections of the time series, and then smoothed it (or smoothed the original time series before extracting the seasonal component s). However, the results will not look significantly different.
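
    For comparison with your own solution, here is one possible way to write the averaging loop. It is only a sketch: the vector s is a synthetic stand-in (an assumption) for ozon(1:p), and the result is stored in a new variable sm rather than overwriting s.

```matlab
% Stand-in data (assumption): one synthetic "year" instead of ozon(1:p)
k = 4; p = 365;
s = 300 + 50*sin(2*pi*(1:p)'/p) + 10*cos(0.9*(1:p)');

% Periodic extension by k elements on both sides
se = [s(p-k+1:p); s; s(1:k)];

% Moving average of length 2k+1: entry i of s sits at position i+k of se,
% so its window covers se(i), ..., se(i+2k)
sm = zeros(p,1);
for i = 1:p
    sm(i) = sum(se(i:i+2*k)) / (2*k+1);
end
```

    Because of the periodic extension, every entry of s enters exactly 2k+1 windows, so the smoothed vector has the same mean as s.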

    Finally, we put ten copies of s one after another, and we plot the original time series together with the seasonal component given by s.

    ozon_s = [];

    for k = 1:10

    ozon_s = [ ozon_s; s ];

    end

    plot(ozon)

    hold on

    plot(ozon_s, 'r', 'LineWidth', 2);

    hold off

    Let us take a look at the deseasonalized time series, the part that remains after the seasonal component is eliminated.

    ozon_d = ozon(1:3650) - ozon_s;

    plot(ozon_d, '.');

    Continue by computing the autocorrelation coefficients of the time series ozon_d to see whether there are more seasonal cycles. After that, you can try to see whether there is a trend in the data. E.g., one could look for a linear trend obtained by a least squares fit to ozon_d.

    A=[ones(3650,1) (1:3650)'];a=A\ozon_d

    trend=A*a;ozon_dt=ozon_d-trend;

    plot(ozon_dt, 'r.');hold on;

    plot(trend, 'b', 'LineWidth', 2);

    hold off

    save ozon1 ozon ozon_s ozon_d ozon_dt
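
    The least-squares fit via the backslash operator can be tried out on data where the answer is known. The sketch below assumes synthetic data (a straight line plus a yearly oscillation) instead of ozon_d, and compares the result with polyfit, which solves the same problem for degree 1.

```matlab
% Synthetic data (assumption): line 2 - 0.001*t plus a zero-mean oscillation
n = 3650;
t = (1:n)';
d = 2 - 0.001*t + 0.5*sin(2*pi*t/365);

% Least-squares fit d ~ a(1) + a(2)*t via the backslash operator
A = [ones(n,1) t];
a = A\d;

% polyfit returns the same coefficients, highest power first
q = polyfit(t, d, 1);

trend = A*a;
res   = d - trend;     % residual, orthogonal to the columns of A
```

    The fitted slope a(2) is close to, but not exactly, -0.001, since the oscillation is slightly correlated with t over the finite time interval.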

    Obviously, the linear trend is very weak compared to the size of the fluctuations in ozon_dt. Finally, we may take a look at ozon_dt, i.e., the part that remains after seasonal components and trend are subtracted from the time series: are we just left with random perturbations? What is the distribution? E.g., we can estimate expectation and standard deviation from ozon_dt, using the formulas

    $$EX = \bar{X} = \frac{1}{N}\sum_i X_i, \qquad \sigma_X = \sqrt{\frac{1}{N-1}\sum_i (X_i - EX)^2},$$

    and test if the noise is normally distributed with these parameters. $EX$, $\sigma_X$, and the variance $\sigma_X^2$ can be computed by the built-in Matlab functions mean, std, and var respectively. These and other available functions for data analysis and statistical explorations can be found with help datafun. Since we cannot venture too far into statistics, we will just rely on a visual check.

    EX=mean(ozon_dt)

    SX=std(ozon_dt)

    clf;

    subplot(2,1,1)

    plot(ozon_dt,'r.'); title('Remainder of time series');

    subplot(2,1,2)

    plot(EX+SX*randn(3650,1),'b.');

    title('Gaussian noise with same mean and variance');

    It looks like the noise question needs further study.
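
    One simple numerical check that goes slightly beyond the visual comparison is to count how many values fall within one and two standard deviations of the mean. This is only a rough sketch, not a statistical test; it assumes a Gaussian stand-in sample r, which you would replace by ozon_dt.

```matlab
% Stand-in sample (assumption): Gaussian noise; use r = ozon_dt in the Lab
r = 3 + 2*randn(100000,1);

EX = mean(r);
SX = std(r);

% Fractions of the sample within one and two standard deviations of the mean
f1 = mean(abs(r - EX) <= SX);
f2 = mean(abs(r - EX) <= 2*SX);

% For a normal distribution these fractions are about 0.683 and 0.954
fprintf('within 1 std: %.3f, within 2 std: %.3f\n', f1, f2);

% Histogram counts in 40 bins; bar(ctr, cnt) would display them
[cnt, ctr] = hist(r, 40);
```

    Fractions that differ clearly from 0.683 and 0.954 would indicate that the remainder is not normally distributed.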

    4.2 Nonlinear equations and optimization

    Solving nonlinear equations is the last typical task we will briefly discuss in this introductory Lab. In contrast to linear equations (a subject of linear algebra), existence, number, and finding of solutions of nonlinear equations are very difficult questions. Most of the current codes for the numerical solution of nonlinear equations are local in nature: The user has to specify a qualified initial guess of the solution or a small region containing it, and the program returns one of the solutions nearby (if there exist any!).

    Finding the roots of a polynomial is a classical example. Matlab provides some tools to deal with polynomials, among them roots (computes all n

    real and complex roots, including multiple roots, of a polynomial of degree n), poly (with matrix input computes the characteristic polynomial, resp. reverses the action of roots), polyval (for the evaluation of a polynomial at a set of points), and polyfit (for finding a least-squares polynomial fit to a set of data points). The convention of submitting and returning the coefficients of a polynomial $p(x) = a_n x^n + \ldots + a_1 x + a_0$ is via a vector [an ... a1 a0]. E.g., the examples below show you an alternative way of computing eigenvalues of small matrices (not recommended, always use eig!), and compute and display a smoother, polynomial replacement for the seasonal component of ozon:

    A=hilb(5)

    p=poly(A)

    x=roots(p)

    e=eig(A)

    load ozon1

    t=(1:365)';p=polyfit(t,ozon(1:365),20);

    ozon_sp20=polyval(p,t);

    clf;

    plot(t,ozon(1:365),'b.',t,ozon_s(1:365),'r--',t,ozon_sp20,'k');
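
    The coefficient convention is easiest to see on a polynomial whose roots are known. The following sketch uses a made-up cubic and a made-up parabola, and shows roots, poly, polyval, and polyfit acting on them.

```matlab
% p(x) = x^3 - 6x^2 + 11x - 6 = (x-1)(x-2)(x-3), highest power first
p = [1 -6 11 -6];

r  = sort(roots(p));       % roots of p as a column vector
p2 = poly(r);              % poly reverses roots: back to the coefficients
v  = polyval(p, [0 1 4]);  % values p(0), p(1), p(4)

% A degree-2 least-squares fit recovers a parabola from exact samples
t = 0:0.5:5;
q = polyfit(t, 2*t.^2 - t + 3, 2);
```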

    There is one general message: Polynomial fits (interpolation, approximation, ...) to general data have poor properties near the interval ends, especially for higher degree; use them with caution or switch to fits with spline functions.

    More general scalar nonlinear equations can be treated by fzero. We illustrate this by solving the equation

    $$e^{-x} = ax \qquad (a > 0)$$

    for $x$. Plot $h(x) = e^{-x}$ and $g(x) = ax$ for some $a > 0$, and convince yourself that there is a unique solution $x = x(a)$ which belongs to the interval $[0, 1/a]$. To use fzero, you have to transform the equation into the form

    $$f(x, p_1, p_2, \ldots) = 0 \qquad (\text{here, } f(x, a) = e^{-x} - ax),$$

    and write a function f=my_fun(x,p1,p2,...) that computes this function.

    f=inline('exp(-x)-a*x','x','a');

    inta=[0.1:0.1:10];x=[];

    for a=inta

    xa=fzero(f,[0 1/a],[],a);x=[x xa];

    end

    plot(inta,x);

    res=max(abs(exp(-x)-inta.*x))
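
    In current Matlab versions, anonymous functions are the recommended replacement for inline objects. The same computation can then be sketched as follows; the parameter a is fixed by wrapping f in a one-argument function handle instead of passing it as an extra argument to fzero. The loop index n and the preallocation of x are choices made for this sketch.

```matlab
% Anonymous function with two arguments, replacing the inline object
f = @(x,a) exp(-x) - a*x;

inta = 0.1:0.1:10;
x = zeros(size(inta));
for n = 1:length(inta)
    a = inta(n);
    % Fix the parameter a by wrapping f in a one-argument function handle
    x(n) = fzero(@(xx) f(xx,a), [0 1/a]);
end

res = max(abs(exp(-x) - inta.*x))
```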

    The second argument of fzero is either a value x0 near the suspected solution, or a two-vector representing an interval where f has opposite signs at the two end-points. Thus, writing xa=fzero(f,1/a,[],a) probably works equally well. The empty matrix [] argument is for the option structure (i.e., the default values are used). The result of computing the maximum residual error res looks convincing: The found $x = x(a)$ satisfies the equation almost up to machine accuracy!

    Unfortunately, core Matlab does not provide you with similar tools for solving systems of nonlinear equations (the function fsolve is part of the optimization toolbox, see help optim). Those arise at every (numerical) corner. E.g., the stiff ODE solvers use so-called implicit methods, where at each time-step one or several nonlinear systems of the same size as the ODE system have to be solved. The simplest implicit scheme for ODEs, the backward Euler method, would lead to the system (written in vector notation)

    $$y_{n+1} = y_n + h f(t_{n+1}, y_{n+1})$$

    for the next approximate solution vector $y_{n+1}$. Here, $f(t, y)$ is the vector function from the right-hand side of the 1st-order system of ODEs.
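
    For a single (scalar) ODE, the nonlinear equation of the backward Euler method can be solved with fzero at each time-step. The sketch below assumes the test problem $y' = -y^3$, $y(0) = 1$, which is not taken from the Lab; its exact solution $1/\sqrt{1+2t}$ allows us to check the result.

```matlab
% Test problem (assumption): y' = -y^3, y(0) = 1, exact solution 1/sqrt(1+2t)
f = @(t,y) -y.^3;
h = 0.01; T = 1;
t = 0:h:T;
y = zeros(size(t));
y(1) = 1;

for n = 1:length(t)-1
    % Residual of the backward Euler equation for the unknown Y = y_{n+1}
    g = @(Y) Y - y(n) - h*f(t(n+1), Y);
    % g(0) < 0 < g(y(n)) for this problem, so [0 y(n)] brackets the root
    y(n+1) = fzero(g, [0 y(n)]);
end

err = abs(y(end) - 1/sqrt(1 + 2*T))
```

    Backward Euler is a first-order method, so the error at the final time shrinks roughly in proportion to h.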

    For an autonomous ODE system, i.e., if $f(t, y) \equiv f(y)$, solving the algebraic system

    $$f(y) = 0 \quad\Longleftrightarrow\quad \begin{cases} f_1(y_1, y_2, \ldots, y_N) = 0 \\ f_2(y_1, y_2, \ldots, y_N) = 0 \\ \quad\ldots \\ f_N(y_1, y_2, \ldots, y_N) = 0 \end{cases} \tag{4.1}$$

    determines all the equilibrium points of the ODE system, i.e., all those $y^*$ for which the initial value problem

    $$y'(t) = f(y(t)), \qquad y(t_0) = y^*,$$

    possesses constant solutions $y(t) \equiv y^*$ in time. These often determine the long-time behavior of solutions (if such a $y^*$ is stable, then trajectories that

    come sufficiently close to $y^*$ will be attracted to this point, i.e., $y(t) \to y^*$ for $t \to \infty$).

    The other big source of nonlinear systems is smooth optimization. Here is a data analysis problem.

    clear;load data;who

    figure;

    mesh(datax,datay,dataz);

    title('Original data');

    Given are data triples $(x_i, y_j) \mapsto z_{ij}$, and their display using the mesh function shows a noisy surface over a rectangular parameter region (meshed by datax and datay). Since we see clearly two bell-shaped surface parts (a hill and a dip), a qualified guess for a smoothed function representation $z = f(x, y)$ of the data could be of the form

    $$f(x, y; p) := b_0 + b_1 x + b_2 y + c_1 e^{-a_1((x-x_1)^2+(y-y_1)^2)} + c_2 e^{-a_2((x-x_2)^2+(y-y_2)^2)},$$

    a linear combination of $n = 2$ Gaussians with unknown centers (described by the four parameters $x_1, y_1, x_2, y_2$) and unknown widths ($a_1, a_2$) and heights ($c_1, c_2$) (the linear function $b_0 + b_1 x + b_2 y$ is added for convenience, it could catch any potential linear trend in the data). We collect all these parameters in a vector $p$ of length 11.

    Obviously, this 11-parameter function $f(x, y; p)$ cannot fit the data exactly (and we are not really interested in this either), i.e., we cannot achieve $z_{ij} = f(x_i, y_j; p)$ for all pairs $(i, j)$ by choosing appropriate parameters $p$. Instead we try to minimize the nonlinear least-squares functional

    $$F(p) = \frac{1}{2} \sum_i \sum_j \left(z_{ij} - f(x_i, y_j; p)\right)^2 \to \min.$$

    Matlab's optimization toolbox has the special routine lsqnonlin to solve this problem; if this toolbox is not installed, then fminsearch is the only choice since the parameter space of the optimization problem is multi-dimensional.
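
    To get a feeling for fminsearch before turning to the full problem, one can fit a single bell to data where the true parameters are known. The sketch below is not the course script: the mesh, the parameter vector ptrue, the starting guess, and the tightened tolerances are all assumptions made for this illustration, and the data is noise-free.

```matlab
% Synthetic data (assumption): one noise-free bell on a rectangular mesh
x = linspace(-1, 1, 21);          % row vector of x mesh
y = linspace(-1, 1, 25)';         % column vector of y mesh
ptrue = [0.2 -0.3 4 2];           % [center x, center y, width a, height c]
bell = @(p) p(4) * exp(-p(3)*(y - p(2)).^2) * exp(-p(3)*(x - p(1)).^2);
z = bell(ptrue);

% Least-squares functional F(p) = 0.5 * sum_ij (z_ij - f(x_i,y_j;p))^2
F = @(p) 0.5 * sum(sum((z - bell(p)).^2));

% Derivative-free minimization from a nearby starting guess
opts = optimset('TolX', 1e-8, 'TolFun', 1e-12, ...
                'MaxFunEvals', 5000, 'MaxIter', 5000);
p0 = [0.1 -0.1 3 1.5];
pfit = fminsearch(F, p0, opts)
```

    With a starting guess far away from the bell, fminsearch may stop at a poor local minimum, which is why the choice of starting values matters.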

    The following Matlab script uses this routine in a particular fashion: It solves the 11-dimensional problem by first fitting one by one the two bell-shaped parts, and then after subtracting them a linear least-squares problem is solved to find the linear part. This loop can be repeated several times to iteratively improve the result. Such a process is called sequential optimization, the main

    advantage is that the partial problems solved at each step involve fewer variables which makes the optimization method more robust, and sometimes even more cpu-time efficient. In addition to providing a function for computing the optimization functional to be minimized

    $$\hat{F}(\hat{x}, \hat{y}, a, c) = \frac{1}{2} \sum_i \sum_j \left(z_{ij} - c\, e^{-a((x_i-\hat{x})^2+(y_j-\hat{y})^2)}\right)^2,$$

    appropriate starting values for the parameters $a, c, \hat{x}, \hat{y}$ need to be read from the data (the choice of the starting values is crucial for the success of numerical schemes for higher-dimensional problems). The downloaded m-files look as follows:

    % Sequential optimization of nlsq problem for two-dimensional data

    % fitting by two Gaussians and linear fit

    x=datax;y=datay;z=dataz;[M,N]=size(z);

    A=ones(M*N,1);

    p0=ones(M,1)*x;A=[A p0(1:M*N)'];

    p0=y*ones(1,N);A=[A p0(1:M*N)'];

    p0=[0.2 0.4 4 2];p1=[-1 -0.2 10 -1];p2=(A\(z(1:M*N)'));

    dp=1;n=0;

    while (dp>0.001) & (n

    z0=p0(4)*exp(-p0(3)*(y-p0(2)).^2)*exp(-p0(3)*(x-p0(1)).^2);

    z0=z0+p1(4)*exp(-p1(3)*(y-p1(2)).^2)*exp(-p1(3)*(x-p1(1)).^2);

    z0=z0+p2(1)+p2(2)*ones(M,1)*x+p2(3)*y*ones(1,N);

    figure;

    mesh(datax,datay,z0); title('Smooth approximation to data');

    figure;

    mesh(datax,datay,z-z0); title('Remainder/noise');

    %%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%%

    function hatF=lsq_bell(p,x,y,z);

    % p=[hatx haty a c] is treated as independent argument of hatF

    % x (row vector of x mesh),

    % y (column vector of y mesh),

    % z (matrix of data over rectangular xy-mesh)

    % are treated as parameters for hatF

    % To be efficient, vectorization has been used as much as possible.

    x1=exp(-p(3)*(x-p(1)).^2);

    y1=exp(-p(3)*(y-p(2)).^2);

    hatF=0.5*sum(sum((z-p(4)*y1*x1).^2));
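
    The vectorization in lsq_bell rests on the fact that the product of the column vector y1 and the row vector x1 is a matrix whose entry (j,i) is the bell evaluated at the mesh point (x(i), y(j)). The following sketch, with a small made-up mesh and made-up parameters, compares the vectorized value with an explicit double loop.

```matlab
% Small made-up mesh and parameters, only to compare the two formulations
x = linspace(0, 1, 4);            % row vector of x mesh
y = linspace(0, 2, 5)';           % column vector of y mesh
z = y * x;                        % arbitrary data matrix of size 5x4
p = [0.3 0.7 2 1.5];              % [hatx haty a c]

% Vectorized form, as in lsq_bell: outer product of column y1 and row x1
x1 = exp(-p(3)*(x - p(1)).^2);
y1 = exp(-p(3)*(y - p(2)).^2);
F_vec = 0.5 * sum(sum((z - p(4)*y1*x1).^2));

% Same functional with explicit loops over the mesh points
F_loop = 0;
for i = 1:length(x)
    for j = 1:length(y)
        fij = p(4) * exp(-p(3)*((x(i) - p(1))^2 + (y(j) - p(2))^2));
        F_loop = F_loop + 0.5 * (z(j,i) - fij)^2;
    end
end
```

    Note the index order z(j,i): rows of the data matrix belong to the y mesh, columns to the x mesh.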

    Next, execute nlsq_datafit. Change the starting values (e.g., to zero values for all parameters), and experiment with the stopping criteria (currently, n>=20 and dp

    part of the optimization toolbox). On the other hand, critical points of smooth nonlinear optimization problems (minima/maxima/saddle points of a general differentiable optimization functional $F(y)$) are characterized as the solutions of the system

    $$\nabla F(y) = 0 \quad\Longleftrightarrow\quad \begin{cases} \dfrac{\partial F}{\partial y_1}(y_1, y_2, \ldots, y_N) = 0 \\ \dfrac{\partial F}{\partial y_2}(y_1, y_2, \ldots, y_N) = 0 \\ \quad\ldots \\ \dfrac{\partial F}{\partial y_N}(y_1, y_2, \ldots, y_N) = 0 \end{cases} \tag{4.3}$$

    (the so-called first-order necessary conditions in unconstrained optimization).

    Thus, optimization and nonlinear system solving have a lot in common. E.g., our data smoothing problem can be recast as a system of 11 nonlinear equations for the 11 parameters in the vector $p$ (which in turn can be solved using a nonlinear least-squares formulation). The partial derivatives of the functional $F(p)$ needed in (4.3) can easily be determined.
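
    As an illustration of how such partial derivatives are obtained, consider the simpler one-bell functional $\hat{F}(\hat{x}, \hat{y}, a, c) = \frac{1}{2}\sum_i\sum_j (z_{ij} - c\,e^{-a((x_i-\hat{x})^2+(y_j-\hat{y})^2)})^2$ computed by lsq_bell. The abbreviations $r_{ij}$, $e_{ij}$, and $\rho_{ij}$ are introduced here only for this derivation; each line follows from the chain rule.

```latex
% Abbreviations introduced for this derivation only:
%   r_{ij}    = (x_i - \hat{x})^2 + (y_j - \hat{y})^2
%   e_{ij}    = e^{-a r_{ij}}
%   \rho_{ij} = z_{ij} - c\, e_{ij}
\begin{align*}
\hat{F} &= \tfrac{1}{2} \sum_i \sum_j \rho_{ij}^2, \\
\frac{\partial \hat{F}}{\partial c}
  &= -\sum_i \sum_j \rho_{ij}\, e_{ij}, \\
\frac{\partial \hat{F}}{\partial a}
  &= c \sum_i \sum_j \rho_{ij}\, r_{ij}\, e_{ij}, \\
\frac{\partial \hat{F}}{\partial \hat{x}}
  &= -2ac \sum_i \sum_j \rho_{ij}\, (x_i - \hat{x})\, e_{ij}, \\
\frac{\partial \hat{F}}{\partial \hat{y}}
  &= -2ac \sum_i \sum_j \rho_{ij}\, (y_j - \hat{y})\, e_{ij}.
\end{align*}
```

    Setting these four expressions to zero gives the system of type (4.3) for the one-bell problem; the derivatives of the full 11-parameter functional are obtained in the same way.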