18.03SC Differential Equations, Fall 2011

The Exponential Function

Of primary importance in this course is the exponential function

$$x(t) = e^{at},$$

where a is a constant. We will assume you are completely familiar with the properties and graphs of this function.

Properties:

1. $e^0 = 1$.

2. $e^{at+c} = e^c e^{at}$.

3. $e^{at}$ is never 0.

4. If $a > 0$ then $\lim_{t\to\infty} e^{at} = \infty$ and $\lim_{t\to-\infty} e^{at} = 0$.

5. If $a < 0$ then $\lim_{t\to\infty} e^{at} = 0$ and $\lim_{t\to-\infty} e^{at} = \infty$.

6. For any positive $a$, $e^{at}$ grows much faster than any polynomial.

Examples. $\lim_{t\to\infty} e^t/t^3 = \infty$, $\lim_{t\to\infty} t e^{-t} = 0$.
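Property 6 and the two example limits can be illustrated numerically. The following is a minimal Python sketch; the helper name `ratio` and the sample values of t are illustrative choices, not part of the notes.

```python
import math

def ratio(t, n=3):
    """Return e^t / t^n; for large t the exponential dominates the power."""
    return math.exp(t) / t**n

# Tabulate e^t / t^3 (grows without bound) and t e^{-t} (tends to 0).
for t in (10, 50, 100):
    print(t, ratio(t), t * math.exp(-t))
```

A table of values is only suggestive, of course; it does not prove the limits.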

Graphs

[Figure: two panels plotting y against t, one for $y = e^t$ and one for $y = e^{-t}$.]

Fig. 1. Graphs of $e^t$ and $e^{-t}$.

Variables and Parameters

1. Independent and Dependent Variables

When we write a function such as

$$f(x) = 3x^2 + 2x + 1$$

we say that x is an independent variable: it can be freely set to any value (or any value within the given domain) and the value of the function is then computed.

When we give a name to the value of the function, such as

$$y = 3x^2 + 2x + 1 \quad \text{or} \quad y = f(x)$$

we say that y is a dependent variable. That is, the value of y depends on the value we choose for x.

We can have systems of equations with more than one dependent variable. For example,

$$x = t^2 - 1, \qquad y = 3e^t.$$

Here the dependent variables x and y depend on the independent variable t.

We can have functions with more than one independent variable. For example,

$$x = st^2 - t - s.$$

Here the independent variables are t and s and the dependent variable is x.

And, of course, we can have more than one of each:

$$x = st^2 - t - s, \qquad y = 3e^{t+s}.$$

As a matter of notation (often referred to by mathematicians as “abuse of notation”) we can use the dependent variable to also denote the function. So, for example, we can write

$$x = x(t) = t^2 - 1.$$

Most of what we do will involve ordinary differential equations. These have only one independent and one dependent variable. Differential equations arise from many sources, and the independent variable can signify many different things. Nonetheless, very often it represents time, and the dependent variable is some dynamical quantity which depends upon time. For this reason, in this course we will often use t for the independent variable.

2. Parameters

Parameters are similar to variables (that is, they are letters that stand for numbers) but have a different meaning. We use parameters to describe a set of (usually) similar things. Parameters can take on different values, with each value of the parameter specifying a member of this set of similar objects.

An example should make this clear. In calculus you learned to find the antiderivative (integral) of $t^2$. There are many functions whose derivative is $t^2$. For example,

$t^3/3 + 2$ or $t^3/3 + \pi$.

So, to give the full answer we write

$$\int t^2 \, dt = t^3/3 + c,$$

where c is called the constant of integration. We call c the parameter of the set of all the antiderivatives of $t^2$: each value of c specifies a single antiderivative.

Sets are written formally using curly braces, e.g., $\{t^3/3 + c : c \text{ any number}\}$, but we will rarely do this. Instead we will write, for example,

$$x(t) = t^3/3 + c, \quad \text{where } c \text{ is an arbitrary constant.} \tag{1}$$

This means a set of functions x = x(t) parametrized by c.

Sets can depend on more than one parameter. For example,

$$x(t) = c_1 e^{-t} + c_2 e^{-7t}, \quad \text{where } c_1, c_2 \text{ are arbitrary constants.} \tag{2}$$

Because the functions in (1) are all similar (they have a family resemblance), we say equation (1) gives a 1-parameter family of functions. Likewise, we say (2) gives a 2-parameter family of functions. You see the pattern!


Notations for Derivatives

We will write

$$\frac{dy}{dx}, \quad y' \quad \text{and} \quad Dy$$

to all mean the derivative of y with respect to x. Only the first one specifies the independent variable x. In the other two you can only determine the independent variable from context.

When the independent variable is time t we will usually adopt the physicists’ notation $\dot{x}$ for the derivative.

For second derivatives, the notations

$$\frac{d^2y}{dx^2} = y'' = D^2y$$

all mean the second derivative of y with respect to x. If x = x(t) is a function of time we will also write $\ddot{x}$.

For higher derivatives we will use the notations

$$\frac{d^ny}{dx^n} = y^{(n)} = D^ny$$

to mean the nth derivative.


Differential Equations

1. Definition of Differential Equations

A differential equation is an equation expressing a relation between a function and its derivatives. For example, we might know that x is a function of t and

$$\ddot{x} + 8\dot{x} + 7x = 0 \tag{1}$$

or perhaps the relation is more complicated, like

$$x x^{(5)} + \cos(t)\, e^{tx} + (x'' x' x)^6 = \sin(5t). \tag{2}$$

When the function in the differential equation has a single independent variable we call it an ordinary differential equation. That is, the derivatives are ordinary derivatives, not partial derivatives. This course is almost exclusively concerned with ordinary differential equations.

The Order of a Differential Equation

The order of a differential equation is the order of the highest derivative appearing in it. Equation (1) is a second order differential equation. Equation (2) is a fifth order equation since the highest derivative is $x^{(5)}$ (in the first term).

2. Solving a Differential Equation

Solving a differential equation means finding a function that satisfies the equation. For many equations it can be hard or impossible to find a solution. One thing that is easy however is to check a proposed solution. We demonstrate with a few examples.

Example 1. Checking a Solution by Substitution

Verify that $y(t) = e^{3t}$ is a solution to the differential equation

$$\dot{y} = 3y. \tag{3}$$

Solution. To do this we simply substitute $y = e^{3t}$ into (3), and check that the equation holds. On the left hand side of (3) we have $\dot{y} = 3e^{3t}$. On the right hand side we have $3y = 3e^{3t}$. Since both sides are equal, $y = e^{3t}$ is a solution.

Example 2. Rejecting a Solution by Substitution

Show that $y(t) = t^3$ is not a solution to the differential equation

$$\dot{y} = y/t. \tag{4}$$

Solution. Again, we substitute the expression for y into (4). Left hand side: $\dot{y} = 3t^2$. Right hand side: $y/t = t^2$. Since the two sides are not equal, $y = t^3$ is not a solution.
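The substitution check in Examples 1 and 2 can also be done numerically. The following is a minimal Python sketch, not a method from the notes: the helper `residual` is a hypothetical name, and it replaces the exact derivative by a central-difference approximation, so it can only suggest (not prove) that a function is a solution.

```python
import math

def residual(y, rhs, t, h=1e-5):
    """Left side minus right side of y' = rhs(t, y) at the point t.

    The derivative y'(t) is approximated by a central difference.
    """
    dydt = (y(t + h) - y(t - h)) / (2 * h)
    return dydt - rhs(t, y(t))

# Example 1: y = e^{3t} in y' = 3y; the residual should be close to 0.
r1 = residual(lambda t: math.exp(3 * t), lambda t, y: 3 * y, t=0.5)

# Example 2: y = t^3 in y' = y/t; at t = 2 the sides are 12 and 4.
r2 = residual(lambda t: t**3, lambda t, y: y / t, t=2.0)

print(r1, r2)
```

A residual near zero at a few sample points is consistent with being a solution; a clearly nonzero residual at any single point is enough to reject the candidate.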

3. Parametrizing the Set of Solutions of a Differential Equation

Differential equations usually have more than one solution. We can describe them all at once using a parameter.

Example. Find all the solutions to

$$\ddot{x} = 2t. \tag{5}$$

This is a standard calculus problem. Integrating twice and remembering to include the constants of integration gives

$$x(t) = \frac{t^3}{3} + c_1 t + c_2,$$

where $c_1$ and $c_2$ are arbitrary constants. This expression gives a parametrization of the set of solutions to equation (5). The constants $c_1$ and $c_2$ are parameters. Every choice of $c_1$ and $c_2$ gives a different solution to (5). For example, $x = t^3/3 + 2t + 1$ and $x = t^3/3 + \pi t + 2.718$ are both solutions.

4. Initial Value Problems

Sometimes we have a differential equation and initial conditions. Together they make up an initial value problem. The meaning of the term initial conditions is best illustrated by example.

Example. Solve the initial value problem $\ddot{x} = 2t$ with the initial conditions $x(1) = 1$, $\dot{x}(1) = 2$.

Solution. In the previous example we found the general solution of this differential equation:

$$x(t) = \frac{t^3}{3} + c_1 t + c_2.$$

We use the initial conditions to find the values of $c_1$ and $c_2$.

$$\dot{x}(t) = t^2 + c_1 \;\Rightarrow\; \dot{x}(1) = 1 + c_1 = 2.$$

$$x(t) = t^3/3 + c_1 t + c_2 \;\Rightarrow\; x(1) = 1/3 + c_1 + c_2 = 1.$$

Solving for $c_1$ and $c_2$ we get $c_1 = 1$, $c_2 = -1/3$. Thus, the solution to the initial value problem is

$$x(t) = t^3/3 + t - 1/3.$$
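The arithmetic for the two constants can be reproduced with exact fractions. This is a small Python sketch under the assumption that the general solution $x(t) = t^3/3 + c_1 t + c_2$ found above is used; the function names `x` and `x_dot` are illustrative.

```python
from fractions import Fraction

def x(t, c1, c2):
    """General solution x(t) = t^3/3 + c1*t + c2."""
    return Fraction(t) ** 3 / 3 + c1 * t + c2

def x_dot(t, c1):
    """Its derivative x'(t) = t^2 + c1."""
    return Fraction(t) ** 2 + c1

# Initial conditions x(1) = 1 and x'(1) = 2.
t0, x0, v0 = 1, 1, 2

c1 = v0 - Fraction(t0) ** 2                   # from x'(t0) = v0
c2 = x0 - Fraction(t0) ** 3 / 3 - c1 * t0     # from x(t0) = x0

print(c1, c2)
```

Solving for $c_1$ first works here because the condition on $\dot{x}$ involves only $c_1$.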


5. Acronyms

It will be convenient at times to allow ourselves to use acronyms. Some of the most common are

1. Differential equation (DE).

2. Ordinary differential equation (ODE).

3. Initial value problem (IVP).

4. Initial conditions (IC).


Solution to an ODE

Quiz: Solution to an ODE. Which of the following is a solution to the ODE $dy/dx = 2y + 1$?

Choices:

a) $y = ce^{2x} - 1$.

b) $y = x^2 + x + c$.

c) $y = e^{x/2} + c$.

d) $y = ce^{2x} - 1/2$.

e) $y = e^{2x} + c$.

f) None of the above

Answer: (d)

This is a little long because at this point our only strategy is to check each potential solution by substitution. (We will remedy that soon!) Briefly:

a) Left side: $dy/dx = 2ce^{2x}$. Right side: $2y + 1 = 2ce^{2x} - 2 + 1 = 2ce^{2x} - 1$. ⇒ Not equal.

b) Left side: $dy/dx = 2x + 1$. Right side: $2y + 1 = 2x^2 + 2x + 2c + 1$. ⇒ Not equal.

c) Left side: $dy/dx = \frac{1}{2} e^{x/2}$. Right side: $2y + 1 = 2e^{x/2} + 2c + 1$. ⇒ Not equal.

d) Left side: $dy/dx = 2ce^{2x}$. Right side: $2y + 1 = 2ce^{2x} - 1 + 1 = 2ce^{2x}$. ⇒ Equal! This is the answer.

e) Left side: $dy/dx = 2e^{2x}$. Right side: $2y + 1 = 2e^{2x} + 2c + 1$. ⇒ Not equal.

Introduction: The Most Important DE

1. The Most Important DE

The most important differential equation we will study is

$$\dot{y} = ay. \tag{1}$$

In words the equation says

the rate of change of y is proportional to y.

It is hardly an exaggeration to say that much of 18.03 is an elaboration on this fundamental equation. In any case, we need to understand this DE and its solutions thoroughly.

Because of its importance we will write down some other ways you might see it:

$$\dot{y} = ay; \qquad \frac{dy}{dt} = ay(t); \qquad y' = ay; \qquad \dot{y} - ay = 0. \tag{2}$$

You should recognize all these as the same equation.

The solution to this equation is

$$y(t) = Ce^{at},$$

where C is any constant. This is easily checked by substitution. Again, because this equation is so important we show the details.

Left side of (1): $\dot{y} = aCe^{at}$.

Right side of (1): $ay = aCe^{at}$.

Since, after substitution, the left side equals the right side, $y(t) = Ce^{at}$ is indeed a solution of (1).

Because of the exponential in the solution, equation (1) is said to model exponential growth (when a > 0) or decay (when a < 0). The constant a is known as the growth or decay constant.

In this course we will learn many techniques for solving differential equations. We will test almost all of them on equation (1). You should, of course, understand how to use these techniques to solve (1). However: whenever you see this equation you should remind yourself it models exponential growth or decay and know the solution without computation.

Other Basic Examples

1. Other Basic Examples

Here are some basic examples of DE’s taken from math and science. Except for example 1 we will not give solutions. We will do that and more with these DE’s as we go through the course.

Example 1. (From Calculus) Solve for y satisfying

$$\frac{dy}{dx} = 2x.$$

Solution. This problem is just asking for the anti-derivative of 2x:

$$y(x) = x^2 + c.$$

Notice that there are many solutions, parametrized by c. An expression like this, which parametrizes all the solutions, is called the general solution.

Example 2. (Heat Diffusion) A body at temperature T sits in an environment of temperature $T_E$. Newton’s law of cooling models the rate of change in temperature by

$$T' = -k(T - T_E),$$

where k is a positive constant. Note that the minus sign guarantees that the temperature T is always heading towards the temperature of the environment $T_E$.

Example 3. (Newton’s Law of Motion: Constant Gravity) Near the earth a body falls according to the law

$$\frac{d^2y}{dt^2} = -g,$$

where y is the height of the body above the Earth and g is the acceleration due to gravity, 9.8 m/sec$^2$.

Example 4. (Newton’s Law of Gravitation) Newton’s law of gravity says that the acceleration due to gravity of a body at distance r from the center of the Earth is

$$\frac{d^2r}{dt^2} = -\frac{GM_E}{r^2},$$

where $M_E$ is the mass of the Earth and G is the universal gravitational constant.

Example 5. (Simple Harmonic Oscillator: Hooke’s Law) Suppose a body of mass m is attached to a spring. Let x be the amount the spring is stretched from its unstretched equilibrium position. Hooke’s law combined with Newton’s law of motion says

$$m\ddot{x} = -kx \;\Leftrightarrow\; m\ddot{x} + kx = 0,$$

where k is the spring constant. The minus sign indicates that the force always points back towards equilibrium, as it does in the real world.

Example 6. (Damped Harmonic Oscillator) If we add a damping force proportional to velocity to the spring-mass system in example 5, we get

$$m\ddot{x} = -kx - b\dot{x} \;\Leftrightarrow\; m\ddot{x} + b\dot{x} + kx = 0,$$

where $-b\dot{x}$ is the damping force and b is called the damping constant.

Example 7. (Damped Harmonic Oscillator with an External Force) If we add a time varying external force F(t) to the system in example 6, we get

$$m\ddot{x} = -kx - b\dot{x} + F(t) \;\Leftrightarrow\; m\ddot{x} + b\dot{x} + kx = F(t).$$


Separation of Variables

1. Separable Equations

We will now learn our first technique for solving differential equations. An equation is called separable when you can use algebra to separate the two variables, so that each is completely on one side of the equation. We illustrate with some examples.

Example 1. Solve $y' = x(y - 1)$.

Solution. We rewrite the equation as $\frac{dy}{dx} = x(y - 1)$. Then separate the variables:

$$\frac{dy}{y - 1} = x\,dx.$$

Next we integrate both sides:

$$\int \frac{dy}{y - 1} = \int x\,dx \;\Leftrightarrow\; \ln|y - 1| + c_2 = \frac{x^2}{2} + c_1.$$

We can amalgamate the two constants of integration into one constant:

$$\ln|y - 1| = \frac{x^2}{2} + c_3.$$

(We label the constant of integration $c_3$ so we’ll have c still available later.) Next we solve for y as a function of x:

$$|y - 1| = e^{x^2/2 + c_3} = e^{c_3} e^{x^2/2}.$$

The absolute value signs can be removed, but then the right hand side might be positive or negative. We write this as

$$y - 1 = \pm e^{c_3} e^{x^2/2} \;\Leftrightarrow\; y = 1 \pm e^{c_3} e^{x^2/2}.$$

Finally we replace the constant $\pm e^{c_3}$ by C to get the solution

$$y(x) = 1 + Ce^{x^2/2}.$$

Note. For the more rigorously minded an expression like $\frac{dy}{y - 1} = x\,dx$ might give pause. However, this formal method is justified by the chain rule, in the same way change of variable (u-substitution) is justified for integration.


1.1. Lost Solutions

Example 2. Find all the solutions to the DE

$$y' = 2x(1 - y)^2. \tag{1}$$

Solution. First, note there is a constant solution: y(x) = 1. It is easy to see this is a solution by substituting it into (1): both sides of the equation become 0. We need to note this because, as we will see, the separation of variables method will not find this particular solution.

Now let’s solve the DE by separation of variables.

1. Separate variables: $\frac{dy}{(1 - y)^2} = 2x\,dx$.

2. Integrate: $\frac{1}{1 - y} = x^2 + C$.

3. Solve for y: $y = 1 - \frac{1}{x^2 + C}$.

Notice that the constant solution y(x) = 1 is not in the parametrized family found by separation of variables. We call this a lost solution because it is lost by separation of variables.

How did it get lost? The answer is in step (1) above, where the term $\frac{dy}{(1 - y)^2}$ is only valid if $y \neq 1$.

In general, for the separable DE $y' = f(x)g(y)$, all the roots of g(y) give lost (constant) solutions.

Example 3. Find all the lost solutions of $y' = (x + 1)e^x(y^2 - 8y + 7)$.

Solution. The factor $y^2 - 8y + 7$ has roots y = 1 and y = 7. Therefore the lost solutions are the constant functions y(x) = 1 and y(x) = 7.
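The roots in Example 3 can be checked with the quadratic formula. The following Python sketch is illustrative only; the names `f`, `g` and `roots` are not from the notes. It also confirms that each constant function makes the right hand side of the DE vanish, which is what a constant solution (with $y' = 0$) requires.

```python
import math

def f(x):
    """The x-dependent factor (x + 1) e^x."""
    return (x + 1) * math.exp(x)

def g(y):
    """The y-dependent factor y^2 - 8y + 7."""
    return y**2 - 8 * y + 7

# Roots of g via the quadratic formula with a = 1, b = -8, c = 7.
a, b, c = 1, -8, 7
disc = b * b - 4 * a * c
roots = sorted([(-b - math.sqrt(disc)) / (2 * a),
                (-b + math.sqrt(disc)) / (2 * a)])
print(roots)

# For a constant function y(x) = r the right side f(x) g(r) is 0 for all x.
for r in roots:
    print(r, [f(x) * g(r) for x in (-2.0, 0.0, 3.0)])
```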

1.2. The Most Important DE

Even though we already know the solution, we should test our new technique on the DE for exponential growth/decay.

Example 4. Solve $\dot{y} = ky$.

Solution. Separate variables: $\frac{dy}{y} = k\,dt$.

Integrate: $\ln|y| = kt + c_1$.

Exponentiate: $|y| = e^{kt + c_1} = e^{c_1} e^{kt}$.

Remove absolute value: $y = \pm e^{c_1} e^{kt}$.

Let the constant $\pm e^{c_1} = C$: $y = Ce^{kt}$.

All solutions to the DE are $y(t) = Ce^{kt}$.

If you look carefully you’ll see we did one rather sneaky thing. The solution y(t) = 0 is a lost solution, yet it appears to have been found by the separation of variables (set C = 0). What happened is that when we renamed $\pm e^{c_1}$ as C we should have noticed that the exponential is never 0, so $C \neq 0$. Essentially, we included the lost solution by being a little sloppy and then getting lucky. We do not recommend this technique as a way to do mathematics!



Quiz: Separation of Variables. What is the general solution to the ODE $dy/dx = 2y + 1$? (Use separation of variables.)

Choices:

a) $y = Ce^{2x} - 1$.

b) $y = Ce^{x/2} - 2$.

c) $x = y^2 + y + C$.

d) $y = e^{x/2} + C$.

e) $y = Ce^{2x} + 1$.

f) $y = Ce^{2x} - 1/2$.

g) $y = e^{2x} + C$.

h) None of the above.

Answer: Separate variables: dy/(2y + 1) = dx

Integrate both sides: (1/2)ln|2y + 1| + c1 = x + c2.

Amalgamate the constants: ln |2y + 1| = 2x + c3.

Exponentiate and solve (if possible) for y in terms of x:

$|2y + 1| = e^{c_3} e^{2x} \;\Rightarrow\; 2y + 1 = \pm e^{c_3} e^{2x} \;\Rightarrow\; y = Ce^{2x} - 1/2$, where $C = \pm e^{c_3}/2$.

So the answer is: (f)



Is it Separable?

Quiz: Is it Separable? Is $y' + xy = x$ separable?

Choices:

a) Yes.

b) No.

Answer: Well, $y' = x - xy = x(1 - y)$ so $dy/(1 - y) = x\,dx$: Yes, the equation is separable. We could go on to solve this, but you can do that on your own.


Solutions that Blow Up: The Domain of a Solution

Example 1. Solve the IVP $\frac{dy}{dx} = y^2$, y(0) = 1.

Solution. We can solve this using separation of variables.

Separate: $\frac{dy}{y^2} = dx$.

Integrate: −1/y = x + C.

Solve for y: y = −1/(x + C).

Find C using the IC: y(0) = 1 = −1/C, therefore C = −1.

Solution: y = −1/(x − 1) = 1/(1 − x).

The graph has a vertical asymptote at x = 1.

[Figure: the two branches of the graph, one on each side of x = 1.]

Fig. 1. Graph of $y = 1/(1 - x)$.

Starting at x = 0 the graph goes to infinity as $x \to 1$. Informally, we say y blows up at x = 1. The graph has two pieces. One is defined on $(-\infty, 1)$ and the other is defined on $(1, \infty)$. For technical reasons we prefer to say that we actually have two solutions to the DE. We indicate this by carefully specifying the domain of each.

$$y(x) = 1/(1 - x), \quad x \text{ in the interval } (-\infty, 1) \tag{1}$$

$$y(x) = 1/(1 - x), \quad x \text{ in the interval } (1, \infty). \tag{2}$$

Thus, the solution to the IVP in this example is solution (1).
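To get a feel for the blow-up, one can tabulate solution (1) as x approaches 1 from the left. This is only an illustrative Python sketch; the function name `y` and the sample points are arbitrary choices, and the function deliberately refuses arguments outside the domain $(-\infty, 1)$.

```python
def y(x):
    """Solution (1) of the IVP: y = 1/(1 - x), defined only for x < 1."""
    if x >= 1:
        raise ValueError("solution (1) is only defined on (-inf, 1)")
    return 1 / (1 - x)

# The values grow without bound as x approaches 1 from the left.
for x in (0, 0.5, 0.9, 0.99, 0.999):
    print(x, y(x))
```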


The rule being followed here is that solutions to ODE’s have a domain consisting of a single interval. The example shows one reason for this: starting at (0, 1) on solution (1) there is no way to follow the solution continuously to solution (2).


Modeling by First Order Linear ODE’s

1. Introduction

If we have a DE which models a situation involving a physical quantity y(t) then solving the DE means finding the unknown function y. Knowing the possible solutions y allows us to understand the physical system. Of course, someone needs to build the DE, that is, do the modeling. In this note we will see how to do this for some real systems.

2. The savings account model

Modeling a savings account gives a good way to understand the significance of many of the features of a general first order linear ordinary differential equation.

Write x(t) for the number of dollars in the account at time t. It accrues interest at an interest rate r. The interest rate has units of percent/year. The more money in the account the more interest you earn. At the end of an interest period of Δt years (e.g. Δt = 1/12, or Δt = 1/365) the bank adds r · x(t) · Δt dollars to your account. This means the change Δx in your account is

$$\Delta x = r\,x(t)\,\Delta t.$$

r has units of $(\text{years})^{-1}$. Mathematicians and some bankers like to take things to the limit. Rewrite our equation as

$$\frac{\Delta x}{\Delta t} = r\,x(t),$$

and suppose that the interest period is made to get smaller and smaller. In the limit as $\Delta t \to 0$, we get the differential equation

$$\dot{x} = rx.$$

One of the beautiful facts about this type of modeling is that it covers more complicated situations. In our computation, there was no assumption that the interest rate was constant in time; it could well be a function of time, r(t). In fact it could have been a function of both time and the existing balance, r(x, t). Banks often do make such a dependence: you get a better interest rate if you have a bigger bank account. If x is involved, however, the equation is what is called nonlinear and we will not consider that case in this session.


Now suppose we make contributions to this savings account. We’ll record this by giving the rate of savings, q. This rate has units dollars per year, so if you contribute every month then the monthly payments will be q Δt with Δt = 1/12. This payment also adds to your account, so, when we divide by Δt and take the limit, we get

$$\dot{x} = rx + q.$$

Once again, your rate of payment into the account may not be constant in time; we might have a function q(t). Also, we can allow q(t) to be negative, which corresponds to withdrawing money from the account.

What we have, then, is the general first order linear ODE:

$$\dot{x} - r(t)x = q(t). \tag{1}$$
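The limiting process behind this model can be illustrated by simulating the discrete bank account. The Python sketch below is an illustration under stated assumptions: constant r and q, the update Δx = (r x + q) Δt once per interest period, and hypothetical figures (1000 dollars, r = 0.05 per year, 10 years). For q = 0 it compares the result with $x_0 e^{rt}$, the solution of $\dot{x} = rx$ seen earlier in the notes.

```python
import math

def balance(x0, r, q, years, periods_per_year):
    """Balance after `years`, adding (r*x + q)*dt at the end of each period."""
    dt = 1 / periods_per_year
    x = x0
    for _ in range(round(years * periods_per_year)):
        x += (r * x + q) * dt
    return x

x0, r, years = 1000.0, 0.05, 10

monthly = balance(x0, r, 0.0, years, 12)
daily = balance(x0, r, 0.0, years, 365)
continuous = x0 * math.exp(r * years)   # solution of x' = r x with x(0) = x0

print(monthly, daily, continuous)
```

As the interest period shrinks, the discrete balance approaches the continuous value from below; passing a positive q models regular deposits and a negative q models withdrawals.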

3. Linear insulation

Here is another example of a linear ODE. The linear model here is not as precise as in the bank account example.

A cooler insulates my lunchtime root beer against the warmth of the day, but ultimately heat penetrates. Let’s see how you might come up with a mathematical model for this process. You can jump right to equation (2) if you want, but we would like to spend some time talking about how one might get there, so that you can carry out the analogous process to model other situations.

The first thing to do is to identify relevant parameters and give them names. Let’s write t for the time variable, x(t) for the temperature inside the cooler, and y(t) for the temperature outside.

Let’s assume that the insulating properties of the cooler don’t change over time. (We’re not going to watch this process for so long that the aging of the cooler itself becomes important!) However, the insulating properties probably do depend on the inside and outside temperatures. Insulation affects the rate of change of the temperature: the rate of change at time t of temperature inside depends upon the temperatures inside and outside at time t. This gives us a first order differential equation of the form

$$\dot{x} = F(x, y).$$

Now it’s time for the next simplifying assumption, namely that this rate of change depends only on the difference y − x between the temperatures, and not on the temperatures themselves. This means that

$$\dot{x} = f(y - x)$$

for some function f of one variable. If the temperature inside the cooler equals the temperature outside, we expect no change. This means that f (0) = 0.

Now, any reasonable function has a tangent line approximation, and since f (0) = 0 we have

f (z) ≈ kz .

That is, when |z| is fairly small, f(z) is fairly close to kz. (From calculus you know that $k = f'(0)$, but we won’t use that here.) When we replace f(y − x) by k(y − x) in the differential equation, we are linearizing the equation. We get the ODE

$$\dot{x} = k(y - x).$$

The final assumption we are making, in justifying this last simplification, is that we will only use the equation when z = y − x is reasonably small, that is, small enough so that the tangent line approximation is reasonably good. For large temperature differences the linearized model will not generally give realistic results.

We can write this equation as

$$\dot{x} + kx = ky. \tag{2}$$

This is Newton’s law of cooling.

The constant k is called the coupling constant. It mediates between the two temperatures. It will be large if the insulation is poor, and small if it’s good. If the insulation is perfect, then k = 0. The factor of k on the right might seem odd, but you can see that it is forced on us by checking units: the left hand side is measured in degrees per hour, so k must be measured in units of $(\text{hours})^{-1}$.

We can see some general features of insulating behavior from this equation. For example, the times at which the inside and outside temperatures coincide are the times at which the inside temperature is at a critical point:

$$\dot{x}(t_1) = 0 \text{ exactly when } x(t_1) = y(t_1). \tag{3}$$


Geometric Methods: Introduction

In studying the first order ODE $y' = f(x, y)$ the main emphasis is on learning different ways of finding explicit solutions. However, you should realize that most first order equations cannot be solved explicitly. For such equations, one tool we can resort to is graphical methods. Mostly we will use the computer to make the visualizations. But we will also learn to carry them out by hand to give rough qualitative information about how the graphs of solutions to the differential equation look geometrically.

We first introduce the basic protagonists of graphical methods: direction fields, isoclines and integral curves; the latter correspond to solutions. Look out for the intersection principle, a key theorem about them. The isoclines applet is a great tool for getting a feel for geometric methods; make sure to familiarize yourself with it. It will be particularly important for illustrating the power of graphical methods for understanding long-term behavior of solutions, the final topic in this session.

Direction Fields, Isoclines, Integral Curves

Graphical methods are based on the construction of what is called a direction field for the equation $y' = f(x, y)$. To get this, we imagine that through each point (x, y) of the plane is drawn a little line segment whose slope is f(x, y).

Example. $\frac{dy}{dx} = 2x$.

[Figure: direction field for $dy/dx = 2x$ in the (x, y)-plane.]

Notice that the slope f (x, y) does not depend on y here. It is invariant under vertical translation.

In practice, the segments are drawn in at a representative set of points in the plane; if a computer draws them, the points are (usually) evenly spaced in both directions. If drawn by hand, however, they are not, because a different procedure is used, better adapted to human speed. To construct a direction field by hand, draw in lightly (or in dashed lines) what are called the isoclines for the equation $y' = f(x, y)$. These are the one-parameter family of curves given by the equations

f (x, y) = m, m constant.

Along a given isocline, the line segments all have the same slope m; this makes it easy to draw in those line segments, and you can put in as many as you want. (Note: “iso-cline” = “equal slope”.)

Example. The figure below shows a direction field for the equation

$$y' = x - y.$$

The isoclines are the lines x − y = m, two of which are shown in dashed lines, corresponding to the values m = 0, −1.

[Figure: direction field for $y' = x - y$ on the region −1 ≤ x ≤ 4, −1 ≤ y ≤ 4, with the isoclines m = 0 and m = −1 drawn as dashed lines.]

The m = 0 isocline is of special interest, as the direction field is horizontal along it; it is called the nullcline.

Once you have sketched the direction field for the equation $y' = f(x, y)$ by drawing some isoclines and drawing in little line segments along each of them, the next step is to draw in (with a solid line) curves which are at each point tangent to the line segment at that point. Such curves are called integral curves or solution curves for the direction field. Their significance is this:

The integral curves are the graphs of the solutions to $y' = f(x, y)$. By definition, this is the curve y = y(x) defined so that its slope at the point (x, y) is f(x, y).

Two integral curves (in solid lines) have been drawn for the equation $y' = x - y$. Notice they have the same slope as the direction field at every point they pass through.
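The computer's version of a direction field, slopes evaluated on an evenly spaced grid, can be sketched in a few lines of Python. This is an illustration only: the helper `slope_grid` and the integer grid from −1 to 4 are assumptions chosen to match the example $y' = x - y$. Grid points sharing the same slope m lie on the same isocline.

```python
def f(x, y):
    """Right hand side of y' = x - y."""
    return x - y

def slope_grid(f, xs, ys):
    """Map each grid point (x, y) to the slope of the direction field there."""
    return {(x, y): f(x, y) for x in xs for y in ys}

grid = slope_grid(f, range(-1, 5), range(-1, 5))

# Grid points on the nullcline (m = 0) and on the isocline m = -1.
nullcline = sorted(p for p, m in grid.items() if m == 0)
isocline_m1 = sorted(p for p, m in grid.items() if m == -1)

print(nullcline)
print(isocline_m1)
```

The points found for m = 0 all lie on the line y = x, and those for m = −1 on the line y = x + 1, in agreement with the isoclines x − y = m.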

Remark. While it is not compulsory to use light or dashed lines for isoclines, and solid lines for integral curves, it is a very good habit, especially when working by hand.


Isoclines

Exercise. What are the isoclines for $y' = y$? Make a large diagram, and draw the isoclines for m = −2, −1, 0, 1, 2; use these to sketch the direction field. Draw some integral curves; how many different types of behaviors do there seem to be?

Answer.

[Figure: direction field for $y' = y$ in the (x, y)-plane, with the isoclines m = −2, −1.5, −1, −0.5, 0, 0.5, 1, 1.5, 2 labeled.]

The isoclines are horizontal lines y = m. We can see in the figure three types of behavior for the integral curves. We know by solving the DE that they are given by $y(x) = Ce^x$, and these types are classified by the sign of C: positive, zero, or negative.

Remark. As the slope field is invariant under horizontal translation, integral curves are horizontal translations of each other. This will be discussed in much greater detail in the session on autonomous equations.

Numerical Methods: Introduction

The study of differential equations has three main facets:

• Analytic methods (also known as exact or symbolic methods).

• Geometric methods.

• Numerical methods.

In the first two sessions we introduced some of the tools from the first two categories; in this session, some methods from the third are presented.

Before proceeding, one should stress that most differential equations cannot be solved exactly; the importance of geometric and numerical methods should not be underestimated.

Most of the session is devoted to one of the most basic numerical techniques, Euler’s method. We will learn how to implement it, and work through several examples. However, the method has limitations: without sufficient caution, it can often return answers with high errors. We will learn how to try to control and estimate these errors, and about some pitfalls to avoid. In the final part of this session, we will be introduced briefly to some more sophisticated and accurate numerical techniques.

Motivation and Implementation of Euler’s Method

1. What Would One Use Numerical Methods For?

The graphical methods described in the previous session give one a quick feel for how the solutions to a differential equation behave; they can also be very accurate at predicting long-term behaviour, e.g. in the presence of funnels. However, when the ’medium range’ solutions must be known accurately, and the equation cannot be solved exactly, numerical methods are usually the best option.

Also, even when an equation can be solved with an exact formula, it still might not be straightforward to compute values of a solution. For example, the equation $y' = y$ with initial condition y(0) = 1 can be solved exactly: $y(x) = e^x$. The number e is the value y(1). But how do you find out that in fact e = 2.718281828459045...? Here, too, you would use some kind of numerical method. For ODE’s the simplest numerical method is called Euler’s method.

2. Review: The Tangent Line Approximation

Consider a function y(x), and a point (a, y(a)) on its graph. Call $T_a(x)$ the function describing the tangent line to the graph at this point. The slope of the tangent line is $y'(a)$ and it satisfies the equation

$$T_a(a + h) = y(a) + y'(a) \cdot h.$$

Suppose we are given a differential equation $y' = f(x, y)$, with initial condition y(a) = c. One simple way to approximate y near a is to use the tangent line at a:

$$y(a + h) \approx y(a) + y'(a) \cdot h.$$

For example, consider the differential equation $y' = y$, with initial condition y(0) = 1. The tangent line approximation gives:

$$y(1) \approx y(0) + y'(0) \times 1 = 1 + 1 = 2.$$

Of course, we know $y(x) = e^x$, so $y(1) = e \approx 2.718$. So, our estimate of 2 is rather crude. Suppose instead we were trying to approximate $\sqrt{e}$ (= 1.649...). We get

$$\sqrt{e} = y(1/2) \approx y(0) + y'(0) \times 0.5 = 1 + 0.5 = 1.5,$$

a better estimate. Graphically, we can see the tangent line separates from the curve. The bigger the step h in the x direction the farther away the tangent line is from the curve.

The point is that the tangent line approximation works best for small h. The idea behind Euler’s method is to use a sequence of successive tangent line approximations, each of them with a fairly ’small’ h.
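The effect of the step size on a single tangent line estimate can be seen numerically. The Python sketch below is illustrative: it assumes the example $y' = y$, y(0) = 1 (so that y(0) = 1 and $y'(0) = 1$), and the helper name `tangent_line_estimate` is made up for this purpose.

```python
import math

def tangent_line_estimate(h, y0=1.0, slope0=1.0):
    """Tangent line approximation y(0 + h) ~ y(0) + y'(0) * h."""
    return y0 + slope0 * h

# Compare with the exact solution y(x) = e^x for shrinking step sizes.
errors = []
for h in (1.0, 0.5, 0.1):
    exact = math.exp(h)
    estimate = tangent_line_estimate(h)
    errors.append(exact - estimate)
    print(h, estimate, exact, exact - estimate)
```

The error shrinks quickly as h decreases, which is what motivates taking many small steps rather than one large one.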

3. Euler’s method

We want to estimate the solution (integral curve) to $y' = f(x, y)$ passing through $(x_0, y_0)$. It is shown as a curve in the picture. We first choose a step size, denoted h. Starting at $(x_0, y_0)$ we approximate the integral curve over the interval $[x_0, x_0 + h]$ by the tangent line, which has slope $f(x_0, y_0)$. (This is the slope of the integral curve, since $y' = f(x, y)$.) This takes us as far as the point $(x_1, y_1)$, which is calculated by the equations (see the picture)

$$x_1 = x_0 + h, \qquad y_1 = y_0 + h f(x_0, y_0).$$

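[Figure, left: graph of $y = e^x$ with the tangent-line estimates 1.5 at $x = 1/2$ and 2 at $x = 1$. Right: one Euler step from $(x_0, y_0)$ to $(x_1, y_1)$ along the line of slope $f(x_0, y_0)$, with horizontal step $h$ and rise $y_1 - y_0$.]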

Now we are at $(x_1, y_1)$. We repeat the process, using as the new approximation to the integral curve the line segment having slope $f(x_1, y_1)$. This takes us as far as the next point $(x_2, y_2)$, where

$$x_2 = x_1 + h, \qquad y_2 = y_1 + h\,f(x_1, y_1).$$

We continue in the same way. The general formulas telling us how to get from the $(n-1)$-st point to the $n$-th point are

$$x_n = x_{n-1} + h, \qquad y_n = y_{n-1} + h\,f(x_{n-1}, y_{n-1}). \qquad (1)$$

In this way, we get an approximation to the integral curve consisting of line segments joining the points $(x_0, y_0), (x_1, y_1), \ldots$ as shown in the figure below.
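The recursion (1) translates almost line for line into code. The following is a minimal sketch, not part of the original notes; the function name `euler` and its argument names are chosen here for illustration only.

```python
def euler(f, x0, y0, h, n_steps):
    """Return the vertices (x0, y0), ..., (xn, yn) of the Euler polygon.

    f       -- right-hand side of y' = f(x, y)
    x0, y0  -- initial point
    h       -- step size
    n_steps -- number of steps to take
    """
    points = [(x0, y0)]
    x, y = x0, y0
    for _ in range(n_steps):
        # Formula (1): the slope is evaluated at the *previous* point.
        y = y + h * f(x, y)
        x = x + h
        points.append((x, y))
    return points


# Two steps of size 0.5 for y' = y, y(0) = 1.
for x, y in euler(lambda x, y: y, 0.0, 1.0, 0.5, 2):
    print(x, y)
```

Returning every vertex, rather than only the last one, makes it easy to plot the polygon or to tabulate the intermediate values.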

[Figure: the integral curve and its Euler approximation, a polygon with vertices above $x_0$, $x_1$, $x_2$, $x_3$.]

We will call the line segments "Euler struts", and their union the Euler polygon. It is an approximation to the integral curve y = y(x).

In doing a few steps of Euler’s method by hand, as you are asked to do in some of the exercises to get a feel for the method, it’s best to arrange the work systematically in a table.

Example. For the IVP: $y' = x^2 - y^2$, $y(1) = 0$, use Euler’s method with step size .1 to find $y(1.2)$.

Solution. We use $f(x, y) = x^2 - y^2$, $h = .1$, and (1) above to find $x_n$ and $y_n$.

n    x_n    y_n    f(x_n, y_n)    h f(x_n, y_n)
0    1      0      1              .1
1    1.1    .1     1.20           .12
2    1.2    .22
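When the hand computation gets longer, the same table can be generated mechanically. This is an illustrative sketch only; the helper name `euler_table` and the column layout are assumptions made here, not something the notes prescribe.

```python
def euler_table(f, x0, y0, h, n_steps):
    """Return rows (n, x_n, y_n, f(x_n, y_n), h*f(x_n, y_n)).

    The last row has None in the two slope columns, matching the
    hand-made table, where the final row only records x_n and y_n.
    """
    rows = []
    x, y = x0, y0
    for n in range(n_steps):
        slope = f(x, y)
        rows.append((n, x, y, slope, h * slope))
        x, y = x + h, y + h * slope
    rows.append((n_steps, x, y, None, None))
    return rows


# The example above: y' = x^2 - y^2, y(1) = 0, h = 0.1, two steps.
rows = euler_table(lambda x, y: x**2 - y**2, 1.0, 0.0, 0.1, 2)
for n, x, y, slope, incr in rows:
    if slope is None:
        print(f"{n:>2} {x:6.2f} {y:8.4f}")
    else:
        print(f"{n:>2} {x:6.2f} {y:8.4f} {slope:8.4f} {incr:8.4f}")
```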


Exercise: An Example of Euler’s Method

Exercise. Consider the first order ODE $y' = f(x, y) = y^2 - x$ with initial condition $y(0) = -1$. Estimate $y(1)$, using Euler’s method with $h = 0.5$. Organize your answer in a table.

Answer.

n    x_n    y_n      f(x_n, y_n) = y_n^2 - x_n    h f(x_n, y_n)
0    0.0    -1.00    1.00                         0.50
1    0.5    -0.50    -0.25                        -0.13
2    1.0    -0.63

Euler’s Method: Exercises and Exploration

Here are some problems for you to further explore the Euler’s method applet. Take time to play around with all three of them. Some of the phenomena you encounter will be explained in the next note in this session: Errors in Euler’s method.

Start by opening the Euler’s Method applet.

Concave and convex functions: overshooting and undershooting

1. Call up the DE $y' = 0.5y + 1$, use the initial point $(-2, -1)$, and construct the Euler solutions for increasing $x$ using the stepsizes $h = 0.5$, $0.25$, and $0.125$. Then compare with the actual solution. At each step, which is the best approximation? Is it too high or too low?

2. Now do the same for $y' = y^2 - 2y + 1$, with initial point $(-1, -1)$. Do you think you can predict when the approximation is going to be too high or too low?

A cautionary tale: separatrix crossing

3. Call up $y' = y^2 - x$, use as the initial point $(-0.98, 0)$. Select ’Actual’ and ’Euler 0.50’, and have the Applet draw the solutions. Then use the same initial condition with ’All Euler’. What happened? Can you describe the problem in the language of graphical solutions?

Errors In Euler’s Method

As we have seen with the applet, Euler’s method is rarely exact. In this section we try to understand potential sources of error, and find ways to estimate or bound it.

1. Common Error Sources

Let us stress this again: the Euler polygon is not an exact solution; the direction field at its vertices usually differs more and more from the direction field along the actual solution. At places where the direction field is changing rapidly, this can quickly produce very bad approximations: the variation of the direction field causes the integral curve to bend away from its approximating Euler strut. One trick ODE solvers use is to take smaller step sizes when the direction field is steep.

As a general rule Euler’s method becomes more accurate the smaller the step-size h is taken. However, if h is too small, round-off errors can appear and will accumulate, particularly on a pocket calculator; whenever in doubt, try to do all computations on a computer, keeping a high number of significant figures.

2. Estimating the Sign of the Error: Concave and Convex Functions

Can we predict whether the Euler approximation is too big or too small? In our first example when exploring the Euler Method applet they were too small. Pictorially, this was because the solution was curving upwards, leaving the polygons below it. How would we know without a picture whether a solution is “curving up” or “curving down”?

The mathematical concept corresponding to curving up is convexity: a function $y(x)$ is called convex on an interval if $y''(x) > 0$ on that interval. Curving down is called concavity; the corresponding condition is $y''(x) < 0$. The tangent line to the graph of a convex function at any point lies below the graph. Thus, intuitively, if $y''(x_0) > 0$, Euler’s estimate at $x_1$ is likely to be too low. Similarly, if $y''(x_0) < 0$, it is likely to be too high.

Let us do a worked example.

Example. We’ll use Euler’s method to estimate the value at $x = 1.5$ of the solution to $y' = f(x, y) = y^2 - x^2$ with $y(0) = -1$, using $h = 0.5$. We get the table:

n    x_n    y_n       f(x_n, y_n)    h f(x_n, y_n)
0    0      -1        1              0.5
1    0.5    -0.5      0              0
2    1.0    -0.5      -0.75          -0.375
3    1.5    -0.875

We want to use the intuition developed in the paragraph above; how can we compute $y''$? Differentiate both sides of the equation $y' = y^2 - x^2$ with respect to $x$, using the chain rule for the term $\frac{d}{dx}y^2$, to get:

$$y'' = 2yy' - 2x.$$

Thus $y''(0) = 2(-1)(1) - 0 = -2$. This means that the estimate is likely to be too large.
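The same computation works for any right-hand side: by the chain rule, along a solution of $y' = f(x, y)$ we have $y'' = f_x + f_y\,f$. The sketch below, which is an illustration added here and not part of the notes, approximates the two partial derivatives by central differences; the name `second_derivative` and the step `eps` are assumptions.

```python
def second_derivative(f, x, y, eps=1e-6):
    """Approximate y'' at (x, y) for a solution of y' = f(x, y).

    Uses y'' = f_x + f_y * f, with the partial derivatives f_x and f_y
    estimated by central finite differences of width eps.
    """
    fx = (f(x + eps, y) - f(x - eps, y)) / (2 * eps)
    fy = (f(x, y + eps) - f(x, y - eps)) / (2 * eps)
    return fx + fy * f(x, y)


# The worked example: f(x, y) = y^2 - x^2 at the initial point (0, -1).
ypp = second_derivative(lambda x, y: y**2 - x**2, 0.0, -1.0)
print(ypp)  # negative: the solution curves down, so Euler likely overshoots
```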

We can check this strut by strut, comparing the slope $y' = y^2 - x^2$ of the direction field along each Euler strut with the slope of the strut itself. First: $y = x - 1$ with slope 1, so $y' = y^2 - x^2 = (x - 1)^2 - x^2 = -2x + 1$. When $x$ is in the interval $[0, 0.5]$ we have $-2x + 1 \le 1$. Second: $y = -0.5$ with slope zero, so $y' = y^2 - x^2 = 0.25 - x^2$. When $x$ is in $[0.5, 1]$, this is nonpositive. Third: $y = -0.75x + 0.25$ with slope $-0.75$, so $y' = y^2 - x^2 = -0.4375x^2 - 0.375x + 0.0625$. We would like to compare this with $-0.75$ in the interval $[1, 1.5]$. At $x = 1$, we have equality, so it suffices to show that the first derivative of $y'$ on the segment, or $y'' = -0.875x - 0.375$, is nonpositive for $x$ in $[1, 1.5]$, and it is.

3. Two cautionary tales

3.1. Dramatic overshoot

In the third example in our Euler Applet exploration we looked at the DE $y' = y^2 - x$ starting at $(-0.98, 0)$. The actual solution was drastically different from any of the Euler polygons. This phenomenon is best explained in the language of graphical solutions. There is a separatrix for $y' = y^2 - x$ that passes just above $(-0.98, 0)$; you can test this by playing around with the applet a little. The integral curve through $(-0.98, 0)$ remains below the separatrix; however, all the Euler polygons cross it, and subsequently behave like integral curves from the other side of the separatrix.

Depending on the initial point, this sort of phenomenon can be very difficult to avoid; one strategy can be to first study the equation using geometric techniques.


3.2. Divergent estimates

Consider the IVP $y' = y^2$, $y(0) = 1$. Let us try to estimate $y(1)$ using Euler’s method. For $h = 0.2$, we get

n    x_n    y_n
0    0      1
1    0.2    1.2
2    0.4    1.49
3    0.6    1.93
4    0.8    2.68
5    1.0    4.11

(We omit the columns with $f(x_n, y_n)$ and $f(x_n, y_n)h$.)

For smaller step sizes, we get the following estimates:

h       Estimate for y(1)
0.1     37.6
0.05    91.25
0.02    238.21

What is going on? We can actually solve this equation explicitly, for instance with the separation of variables method of session one. The solution is:

y(x) = 1/(1 − x).

This is not defined for $x = 1$: as $x \to 1^-$, $y \to +\infty$.

The lesson is that in practice, one should never simply choose a step size and accept the answer. You should try smaller and smaller h until the answer settles down. If it does, you have one good bit of evidence to accept the approximation; if it doesn’t, the method has failed. The computer does not eliminate the need to think!
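The advice to shrink $h$ until the answer settles down can be automated. The following is a rough sketch under assumptions made here: the names `euler_estimate` and `halve_until_settled`, the tolerance `tol`, and the cap `max_halvings` are all illustrative choices, and agreement between two successive estimates is only evidence of convergence, not a proof.

```python
import math


def euler_estimate(f, x0, y0, x_end, n_steps):
    """Euler estimate of y(x_end) using n_steps equal steps."""
    h = (x_end - x0) / n_steps
    x, y = x0, y0
    for _ in range(n_steps):
        y += h * f(x, y)
        x += h
    return y


def halve_until_settled(f, x0, y0, x_end, tol=1e-3, max_halvings=20):
    """Halve the step size until two successive estimates agree within tol.

    Returns (estimate, n_steps), or (None, n_steps) if the estimates
    never settle within max_halvings halvings.
    """
    n = 1
    prev = euler_estimate(f, x0, y0, x_end, n)
    for _ in range(max_halvings):
        n *= 2
        cur = euler_estimate(f, x0, y0, x_end, n)
        if abs(cur - prev) < tol:
            return cur, n
        prev = cur
    return None, n


# y' = y, y(0) = 1: the estimates of y(1) settle near e.
est, n = halve_until_settled(lambda x, y: y, 0.0, 1.0, 1.0)
print(est, n, math.e)

# y' = y^2, y(0) = 1: the estimates of y(1) keep growing, so None is returned.
print(halve_until_settled(lambda x, y: y * y, 0.0, 1.0, 1.0, max_halvings=10))
```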


Further Numerical Methods

Euler’s method is a first order method (no relation to first order equations). It is possible to show theoretically that for small enough h, the error in Euler’s method is at most C1h, where C1 is a constant that depends on the IVP. It is very hard to know ahead of time what C1 will be.

In the previous section, we saw that making h smaller was a way to decrease the error caused by the variability of the direction field. However, there are some more sophisticated methods that turn out to be even better.

1. General Approach

Looked at broadly, Euler’s method is a way of stepping discretely from one point to the next to approximate the integral curve. The general formula for stepping from $(x_n, y_n)$ to $(x_{n+1}, y_{n+1})$ is

$$x_{n+1} = x_n + h, \qquad y_{n+1} = y_n + mh,$$

where $h$ is the stepsize in the $x$ direction and $m$ is the slope of the line we step along. In Euler’s method $h$ is fixed ahead of time and $m = f(x_n, y_n)$. (It would be more precise to write $m_n$ instead of $m$. We’ll use the simpler looking notation, with the understanding that $m$ changes with each step.)

Other methods use other (and better) ways of choosing $h$ and $m$. We start with some fixed stepsize methods. As the name suggests, we fix the stepsize $h$ ahead of time and put all the work into finding $m$.

2. The Improved Euler method

This is also called the Runge-Kutta 2 method or RK2, or the Heun method.

We start with the same data as for Euler’s method: an initial value problem $y' = f(x, y)$, $y(x_0) = y_0$, and a step size $h$. We construct an RK2 polygon, made out of segments called RK2 struts, with endpoints $(x_n, y_n)$. As before, $x_{n+1} = x_n + h$. The difference between RK2 and Euler’s method is the rule for choosing the slope $m$ for each strut. At each step we start by constructing the Euler strut. We let $m$ be the average of the slope field at the two ends of the strut.

Example. Consider the differential equation $y' = f(x, y) = y^2 - x$ with initial condition $y(0) = -1$. Let us compute one step for the RK2 polygon with $h = 1/2$. (Because $m$, $x$ and $y$ are reserved we’ll use the letters $k$, $a$ and $b$ for intermediate slopes and points.)

1. Compute the slope at $(x_0, y_0)$: $k_1 = f(0, -1) = 1$.

2. Take an Euler step from $(x_0, y_0)$ to $(a, b)$: $a = x_0 + h = .5$, $b = y_0 + k_1 h = -.5$.

3. Compute the slope at $(a, b)$: $k_2 = f(a, b) = f(.5, -.5) = -.25$.

4. Average $k_1$ and $k_2$ to get $m$: $m = (k_1 + k_2)/2 = .375$.

5. Use $m$ and $h$ to take a step from $(x_0, y_0)$ to $(x_1, y_1)$: $x_1 = x_0 + h = .5$, $y_1 = y_0 + mh = -.8125$.

You can check, e.g. by using the applet, that this brings us down closer to the actual solution curve than Euler’s method.
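The five steps of the example map directly onto a short function. This is a sketch for illustration; the name `rk2_step` is an assumption made here, while the variable names `k1`, `a`, `b`, `k2` and `m` follow the example.

```python
def rk2_step(f, x0, y0, h):
    """Take one Improved Euler (RK2, Heun) step from (x0, y0)."""
    k1 = f(x0, y0)            # 1. slope at the start of the strut
    a, b = x0 + h, y0 + k1 * h  # 2. end of the ordinary Euler strut
    k2 = f(a, b)              # 3. slope at the end of the Euler strut
    m = (k1 + k2) / 2         # 4. average the two slopes
    return x0 + h, y0 + m * h   # 5. step along the averaged slope


# The example above: y' = y^2 - x, y(0) = -1, h = 1/2.
x1, y1 = rk2_step(lambda x, y: y**2 - x, 0.0, -1.0, 0.5)
print(x1, y1)
```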

RK2 is a second order method: for small enough $h$, the error is at most $C_2 h^2$, where the constant $C_2$ depends on the IVP.

Each evaluation of the direction field takes time, which usually costs money. Euler’s method uses one evaluation per step, whereas RK2 uses two; therefore, if we want to compare efficiencies, we should compare Euler’s method with step size $h$ to RK2 with step size $2h$. In those cases, the error for Euler’s method is around $C_1 h$, whereas it is around $C_2(2h)^2 = 4C_2 h^2$ for RK2. Even if $C_2$ is larger than $C_1$, for small enough $h$, the RK2 error will be significantly smaller than the Euler error. Besides, $C_2$ is usually smaller than $C_1$, which gives a second advantage to using RK2 over Euler’s method.

3. Runge-Kutta 4 method

This is usually shortened to RK4. It is a refinement of RK2; we start with the same data, and also build a polygon, whose segments are called RK4 struts. Again, at each step, the difference is in choosing the slope of the segment.

In RK4 you evaluate the direction field slope four times for each step. We won’t give the details; they are easy enough to look up.

Remark 1. While it’s straightforward to compute by hand, most people leave the computations in RK4 to a computer.

Remark 2. You might have noticed a pattern in the numbering of the Runge-Kutta techniques; Euler’s method is sometimes referred to as RK1.

RK4 is a fourth order method. For small enough $h$, its error is approximately $C_4 h^4$. Again, the constant $C_4$ depends on the IVP.

It is fair to compare the errors for Euler’s method with step size $h$, RK2 with step size $2h$, and RK4 with $4h$. Regardless of the values of $C_1$, $C_2$ and $C_4$, for sufficiently small $h$, the RK4 error of $C_4(4h)^4$ will be significantly less than the RK2 error of $C_2(2h)^2$ or the Euler error of $C_1 h$. Besides, $C_4$ itself is usually smaller than $C_2$ and $C_1$.

Example. Let us go back to our original problem: estimating $e$ by viewing it as the value at 1 of the solution to the initial value problem $y' = y$, $y(0) = 1$. We compare the errors of our three methods. In all cases, we use 1000 evaluations of the direction field.

Method         Step size    Error
RK1 = Euler    0.001        $1.3 \times 10^{-3}$
RK2 = Heun     0.002        $1.8 \times 10^{-6}$
RK4            0.004        $5.8 \times 10^{-12}$

We can also estimate the constants $C_i$ for this particular IVP: $C_1 \approx 1.3$; $C_2 \approx 0.45$; $C_4 \approx 0.023$.
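The comparison can be reproduced with a few lines of code. The sketch below is an illustration added here, not the program used for the table. The notes do not spell out RK4, so `rk4_step` uses the standard classical fourth-order Runge-Kutta formula; all function names are assumptions.

```python
import math


def euler_step(f, x, y, h):
    return y + h * f(x, y)


def rk2_step(f, x, y, h):
    k1 = f(x, y)
    k2 = f(x + h, y + h * k1)
    return y + h * (k1 + k2) / 2


def rk4_step(f, x, y, h):
    # Classical RK4: four slope evaluations, weighted 1:2:2:1.
    k1 = f(x, y)
    k2 = f(x + h / 2, y + h * k1 / 2)
    k3 = f(x + h / 2, y + h * k2 / 2)
    k4 = f(x + h, y + h * k3)
    return y + h * (k1 + 2 * k2 + 2 * k3 + k4) / 6


def integrate(step, f, x0, y0, h, n_steps):
    """Apply a fixed-stepsize method n_steps times and return the final y."""
    x, y = x0, y0
    for i in range(n_steps):
        y = step(f, x, y, h)
        x = x0 + (i + 1) * h
    return y


def errors_for_e(evaluations=1000):
    """Error in estimating e = y(1) with a fixed budget of slope evaluations."""
    f = lambda x, y: y
    methods = [("Euler", euler_step, 1), ("RK2", rk2_step, 2), ("RK4", rk4_step, 4)]
    result = {}
    for name, step, evals_per_step in methods:
        n = evaluations // evals_per_step
        result[name] = abs(math.e - integrate(step, f, 0.0, 1.0, 1.0 / n, n))
    return result


for name, err in errors_for_e().items():
    print(f"{name:6s} {err:.2e}")
```

Giving every method the same budget of slope evaluations, rather than the same step size, is what makes the comparison fair in the sense described above.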

The (short) moral is that Euler’s method often offers poor precision, and that RK4 is essentially always the most accurate.

As you might have guessed, there are plenty of methods of higher order still; however, they also involve more overhead. Experience has shown that RK4 is a good compromise.

Remark. The initial value problem $y' = f(x)$, $y(a) = y_0$ has solution $y(x) = y_0 + \int_a^x f(t)\,dt$. Our numerical methods for approximating $y(x)$ correspond to integration approximation techniques:

• Euler’s method gives the left end-point Riemann sum;

• RK2 gives the trapezoidal rule;

• RK4 gives Simpson’s rule.
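The first two correspondences are easy to confirm numerically. In the sketch below, which is an illustration with names chosen here, Euler’s method and RK2 are applied to $y' = g(x)$, $y(a) = 0$ and compared with the left end-point Riemann sum and the trapezoidal rule on the same grid.

```python
import math


def euler(f, x0, y0, h, n):
    x, y = x0, y0
    for i in range(n):
        y += h * f(x, y)
        x = x0 + (i + 1) * h
    return y


def rk2(f, x0, y0, h, n):
    x, y = x0, y0
    for i in range(n):
        k1 = f(x, y)
        k2 = f(x + h, y + h * k1)
        y += h * (k1 + k2) / 2
        x = x0 + (i + 1) * h
    return y


def left_riemann(g, a, b, n):
    h = (b - a) / n
    return h * sum(g(a + i * h) for i in range(n))


def trapezoid(g, a, b, n):
    h = (b - a) / n
    interior = sum(g(a + i * h) for i in range(1, n))
    return h * (g(a) / 2 + interior + g(b) / 2)


g = math.cos                  # integrand; the ODE is y' = g(x)
a, b, n = 0.0, 1.0, 50
h = (b - a) / n
rhs = lambda x, y: g(x)       # right-hand side does not depend on y

print(euler(rhs, a, 0.0, h, n), left_riemann(g, a, b, n))
print(rk2(rhs, a, 0.0, h, n), trapezoid(g, a, b, n))
```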

4. Variable Stepsize methods

We saw that it is not wise to pick a single stepsize and accept the results of the Euler method. Likewise RK2 and RK4 can be fooled.

With the fixed stepsize methods you need to choose a value of $h$; do your computation; then redo it using stepsize $h/2$. You keep cutting the stepsize in half until the answers stop changing. The variable step-size techniques carry this out at each step. There are an enormous number of such methods. What they have in common is estimating at each step whether the stepsize needs to be made smaller or can safely be made larger. In general, these provide the most accurate numerical methods at an acceptable cost in additional computation.
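To make the idea concrete, here is one very simple variable stepsize scheme, sketched under assumptions made here: it is Euler’s method with "step doubling", where each step of size $h$ is compared with two steps of size $h/2$ and their difference is used as an error estimate. The name `adaptive_euler`, the tolerance `tol`, and the rules for halving and doubling $h$ are illustrative choices, not a method described in these notes.

```python
import math


def adaptive_euler(f, x0, y0, x_end, tol=1e-6, h=0.1, h_min=1e-12):
    """Euler's method with a step size adjusted at every step."""
    x, y = x0, y0
    while x < x_end:
        h = min(h, x_end - x)                 # do not step past x_end
        slope = f(x, y)
        full = y + h * slope                  # one step of size h
        mid = y + 0.5 * h * slope             # two steps of size h/2
        two_half = mid + 0.5 * h * f(x + 0.5 * h, mid)
        err = abs(two_half - full)            # local error estimate
        if err > tol:
            h *= 0.5                          # reject the step and retry
            if h < h_min:
                raise RuntimeError("step size became too small")
            continue
        x, y = x + h, two_half                # accept the finer estimate
        if err < tol / 4:
            h *= 2.0                          # error is small: try a larger step
    return y


# y' = y, y(0) = 1: estimate e = y(1).
print(adaptive_euler(lambda x, y: y, 0.0, 1.0, 1.0), math.e)
```

Production ODE solvers use the same accept/reject idea with higher order formulas and more careful step size rules.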


First Order Linear ODE’s: Introduction

Linear equations are the most basic and probably the most important class of differential equations. They will be the main focus of this course.

In this session we will introduce first order linear ordinary differential equations. That’s a long name, so we will typically shorten it to “first order linear equations” or “first order linear ODE’s”. In later sessions we will look at higher order linear equations.

After defining first order linear ODE’s, we will spend some time introducing the terminology that is used to describe them. In particular, we will borrow the language of systems and signals from engineering. Along the way we will see some examples of physical systems modeled by first order linear equations. Finally, we will state and prove the superposition principle. This principle is not difficult, but it is very important. We will show you several examples of superposition and remind you of its importance several times in this session.


First order Linear Differential Equations

To start we will define first order linear equations by their form. Soon, we will understand them by their properties. In particular, you should be on the lookout for the statement of the superposition principle and in later sessions a conceptual definition of linearity.

Definition. The general first order linear ODE in the unknown function x = x(t) has the form:

$$A(t)\frac{dx}{dt} + B(t)x(t) = C(t). \qquad (1)$$

As long as $A(t) \neq 0$ we can simplify the equation by dividing by $A(t)$:

$$\frac{dx}{dt} + p(t)x(t) = q(t). \qquad (2)$$

We’ll call (2) the standard form for a first order linear ODE.

1. Terminology and Notation

The functions A(t), B(t) in (1), and p(t) in (2), are called the coefficients of the ODE. If A and B (or p) are constants (i.e. do not depend on the variable t) we say the equation is a constant coefficient DE.

We use the familiar notations $y'$ or $\dot{y}$ for the derivative of $y$. With some exceptions, we’ll use $\dot{y} = \frac{dy}{dt}$ to mean the derivative with respect to time and $y'$ for derivatives with respect to some other variable, e.g. $y' = \frac{dy}{dx}$. If there is any danger of confusion we’ll revert to the unambiguous Leibniz notation: $\frac{dy}{dt}$, $\frac{dy}{dx}$, etc.

2. Homogeneous/Inhomogeneous

If $C(t) = 0$ in (1) the resulting equation

$$A(t)\dot{x} + B(t)x = 0$$

is called homogeneous¹. Likewise, in standard form, $\dot{x} + p(t)x = 0$ is homogeneous. Otherwise the equation is inhomogeneous.

¹ Homogeneous is not the same as homogenous (or homogenized). The syllable “ge” has a long e and is stressed in homogeneous, while the syllable “mo” is stressed in homogenous.

3. Examples

We will give two examples where we construct models that give first order linear ODE’s.

Example 1. In session 1 we modeled an oryx population $x$ with natural growth rate $k$ and harvest rate $h$:

$$\dot{x} = kx - h, \quad\text{or}\quad \dot{x} - kx = -h.$$

Fig. 1. Oryx. Image courtesy of Cape Town Craig on flickr. http://www.flickr.com/photos/cdstrachan/4487656010/

We repeat the argument leading to this model. We start with the population $x(t)$ at time $t$. A natural growth rate $k$ means that after a short time $\Delta t$ we would expect there to be approximately $kx(t)\Delta t$ more oryx. However, in that same time $h\Delta t$ oryx are harvested. So we have the net change in the oryx population:

$$\Delta x \approx kx(t)\Delta t - h\Delta t \;\Rightarrow\; \frac{\Delta x}{\Delta t} \approx kx(t) - h.$$

Now, letting the time interval $\Delta t$ approach 0 we get the ODE

$$\frac{dx}{dt} = kx(t) - h.$$

Note: if the rates $k$ and $h$ are not constant, but vary with time, the modeling process will lead to the same differential equation:

$$\frac{dx}{dt} = k(t)x(t) - h(t) \quad\text{or}\quad \frac{dx}{dt} - k(t)x(t) = -h(t).$$

Example 2. (Bank account) I have a bank account. It has x(t) dollars in it, i.e., x is a function of time. I can deposit money in the account and make withdrawals from it. The bank pays me interest for the money in my account. We will call the interest rate r; it has units of 1/year.


In the old days a bank would pay interest at the end of the month on the balance at the beginning of the month. We can model this mathematically.

With Δt = 1/12, the statement at the end of the month will read:

x(t + Δt) = x(t) + rx(t)Δt + [deposits − withdrawals between t and t + Δt].

These days r is typically very small, say 1%/year = 0.01/year. And, you don’t get 1% each month! You get 1/12 of that.

You can think of a withdrawal as a negative deposit, so I will call everything a ’deposit’ and allow the sign to be positive or negative.

Nowadays interest is usually computed daily. This is a step on the path to the enlightenment afforded by calculus, in which Δt → 0 and the interest is computed continuously.

In order to reach enlightenment, I want to record deposits minus withdrawals as a rate, in dollars per year. Suppose I contribute $100 sometime every month, and make no withdrawals. My total deposits up to time t, that is, my cumulative total deposit Q(t) has a graph like the following figure.

Fig. 2. With periodic deposits Q(t) is a step function. [Figure: Q against t, stepping up by 100 at each of t = 1/12, 2/12, 3/12, 4/12, 5/12.]

In keeping with letting Δt → 0, we should imagine that I am making this contribution continually at the constant rate of $1200/year. Then the graph of Q(t) is a straight line with slope 1200, shown in the figure below. In this case, the derivative Q′(t) = q(t) is constant.

Fig. 3. With continuous deposits the graph of Q(t) is a straight line. [Figure: Q against t, a straight line through the values 100, 200, 300, 400 at t = 1/12, 2/12, 3/12, 4/12.]


In general, say I deposit at the rate of q(t) dollars per year. The value of q(t) might vary over time, and might be negative from time to time, because, with our convention, withdrawals are merely negative deposits.

So, (assuming q(t) is continuous),

x(t + Δt) ≈ x(t) + rx(t)Δt + q(t)Δt.

Now subtract x(t) and divide by Δt:

(x(t + Δt) − x(t)) / Δt ≈ rx + q.

Next, let the interest period Δt tend to zero:

ẋ = rx + q.

Note: q(t) can certainly vary in time. The interest rate can too. In fact the interest rate might depend upon x as well: a larger account will probably earn a better interest rate. Neither feature affects the derivation of this equation, but if r does depend upon x as well as t , then the equation we are looking at is no longer linear. So, for this example, let’s say r = r(t) and q = q(t).

We can put the linear ODE into standard form:

ẋ − r(t)x = q(t).
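To see the limit Δt → 0 at work, one can iterate the discrete statement formula with shorter and shorter interest periods and compare with the continuous model. The sketch below is an illustration under assumptions made here: r and q are taken to be constant, the sample figures (opening balance 1000, r = 0.01/year, deposits of 1200/year, 10 years) are invented for the demonstration, and for constant r and q the ODE ẋ = rx + q has the solution x(t) = (x(0) + q/r)e^(rt) − q/r, which can be checked by substitution.

```python
import math


def balance_discrete(x0, r, q, years, periods_per_year):
    """Iterate x(t + dt) = x(t) + r*x(t)*dt + q*dt with dt = 1/periods_per_year."""
    dt = 1.0 / periods_per_year
    x = x0
    for _ in range(int(round(years * periods_per_year))):
        x = x + r * x * dt + q * dt
    return x


def balance_continuous(x0, r, q, years):
    """Solution of x' = r*x + q with constant r and q (assumed here)."""
    return (x0 + q / r) * math.exp(r * years) - q / r


x0, r, q, years = 1000.0, 0.01, 1200.0, 10
monthly = balance_discrete(x0, r, q, years, 12)
daily = balance_discrete(x0, r, q, years, 365)
continuous = balance_continuous(x0, r, q, years)
print(monthly, daily, continuous)
```

The daily figure lies closer to the continuous one than the monthly figure does, which is the sense in which daily compounding is "a step on the path" to the continuous model.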


Is it Linear?

Quiz: Is it Linear?

We will develop a theory of linear equations, complete with an algorithm for solving them. It’s important to recognize them when you see them.

Which of the following are linear ODE’s?

1. $\dot{x} + x^2 = t$

2. $\dot{x} = (t^2 + 1)(x - 1)$

3. $\dot{x} + x = t^2$

Choices: a) None b) (1) only c) (2) only d) (3) only e) All f) All but (1) g) All but (2) h) All but (3)

Answer: Equations (2) and (3) are linear; (1) is not: the correct answer is (f).


Interpret the Graph

Quiz: Interpret the Graph

Referring to the bank account example, at the point A on the graph of Q(t), what sort of transaction am I making?

[Figure: graph of Q against t, with a point A marked on the curve.]

Choices: a) deposits b) withdrawals

Answer: (a), deposits. When the slope is positive Q is increasing; i.e. I am making deposits.

Solutions by Integrating Factors: Introduction

Our goal in this session is to derive formulas for solving both homogeneous and inhomogeneous first order linear ODE’s. For the inhomogeneous equations we will use what are called integrating factors.

The method of integrating factors is a beautiful technique for solving the general first order linear equations as well as some other types of DE’s. In later sessions we will learn other techniques that are easier and apply to higher order equations, but only apply in specific cases.

The main example in this session will be about the diffusion of heat. This is an interesting physical application and we will return to it several times in the course.

Finally we will connect the formula for the solution to the inhomogeneous equation with the superposition principle. The main point here will be to write the general solution to the inhomogeneous equation as the sum of a particular solution to that equation and the general solution to the corresponding homogeneous equation. In symbols,

x = xp + C xh.

A word of advice: pay special attention to what we mean by the phrase particular solution. The meaning differs slightly from the standard English sense and can be a little confusing at first.


Solutions to Linear First Order ODE’s

1. First Order Linear Equations

In the previous session we learned that a first order linear inhomogeneous ODE for the unknown function x = x(t), has the standard form

$$\dot{x} + p(t)x = q(t). \qquad (1)$$

(To be precise we should require $q(t)$ is not identically 0.)

We saw a bank example where $q(t)$, the rate money was deposited in the account, was called the input signal. We also saw an RC circuit example where the input signal was the voltage $V(t)$ and $q(t) = V'(t)$.

A first order linear homogeneous ODE for $x = x(t)$ has the standard form

$$\dot{x} + p(t)x = 0. \qquad (2)$$

We will call this the associated homogeneous equation to the inhomogeneous equation (1).

In (2) the input signal is identically 0. We will call this the null signal. It corresponds to letting the system evolve in isolation without any external ’disturbance’.

• In the bank example: if there are no deposits and no withdrawals the input is 0.

• In the RC circuit example: if the power source is turned off and not providing any voltage increase then the input is 0.

2. Solutions to the Homogeneous Equations

The homogeneous linear equation (2) is separable. We can find the solution as follows:

• Separate variables: $\frac{dx}{x} = -p(t)\,dt$.

• Integrate: $\ln|x| = -\int p(t)\,dt + c_1$. (We use $c_1$ to save $C$ for later.)

• Exponentiate: $|x| = e^{c_1} e^{-\int p(t)\,dt}$.

• Rename $e^{c_1}$ as $C$: $|x| = C e^{-\int p(t)\,dt}$; $C > 0$.

• Drop the absolute value and recover the lost solution $x(t) = 0$: this gives the general solution to (2)

$$x(t) = C e^{-\int p(t)\,dt} \quad\text{where } C = \text{any value}. \qquad (3)$$

A useful notation is to choose one specific solution to equation (2) and call it xh(t). Then the solution (3) shows the general solution to the equation is

x(t) = Cxh(t). (4)

There is a subtle point here: formula (4) requires us to choose one solution to name $x_h$, but it doesn’t matter which one we choose. We can say this somewhat awkwardly as ’choose an arbitrary specific solution.’ A typical choice is to set the parameter $C = 1$, but this is not necessary.

Example. Solve $\dot{x} + 2tx = 0$.

Solution.

• Separate variables: $\frac{dx}{x} = -2t\,dt$.

• Integrate: $\ln|x| = -\int 2t\,dt = -t^2 + c_1$.

• Exponentiate and substitute $C$ for $e^{c_1}$: $|x| = e^{c_1} e^{-t^2} = C e^{-t^2}$.

• Drop the absolute value and also recover the lost solution: $x(t) = C e^{-t^2}$.

In this example an obvious choice for $x_h$ is $x_h(t) = e^{-t^2}$. It is clear the general solution to the example is

$$x(t) = C x_h(t) \quad\text{where } C = \text{any number}.$$
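A quick numerical sanity check of this answer, added here for illustration: the sketch below estimates the derivative of $C e^{-t^2}$ by a central difference and confirms that the residual of the equation is essentially zero for several values of $C$ and $t$. The names `x_general` and `residual`, and the difference step `eps`, are assumptions.

```python
import math


def x_general(t, C):
    """Candidate general solution x(t) = C * exp(-t^2)."""
    return C * math.exp(-t * t)


def residual(t, C, eps=1e-6):
    """Approximate x'(t) + 2*t*x(t); should be near 0 for a true solution."""
    dx = (x_general(t + eps, C) - x_general(t - eps, C)) / (2 * eps)
    return dx + 2 * t * x_general(t, C)


for C in (-3.0, 0.0, 1.0, 2.5):
    for t in (-1.0, 0.0, 0.5, 2.0):
        print(C, t, residual(t, C))
```

Note that $C = 0$ is included: it corresponds to the "lost" solution $x(t) = 0$.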

3. Solution to Inhomogeneous DE’s Using Integrating Factors

We start with the integrating factors formula: the general solution to the inhomogeneous first order linear ODE (1) ($\dot{x} + p(t)x = q(t)$) is

$$x(t) = \frac{1}{u(t)}\left(\int u(t)q(t)\,dt + C\right), \quad\text{where } u(t) = e^{\int p(t)\,dt}. \qquad (5)$$

The function u is called an integrating factor.

This method, due to Euler, is easy to apply. We deduce it by the method of optimism, i.e., we introduce an integrating factor u and hope that it will help us.

Proof: We start with the product rule for differentiation

$$\frac{d}{dt}(ux) = u\dot{x} + \dot{u}x$$

and the equation (1):

$$\dot{x} + p(t)x = q(t).$$

Multiply both sides of the equation by some function $u(t)$, whose value we will determine later:

$$u\dot{x} + upx = uq. \qquad (6)$$

In order to be able to apply the product rule we want the sum on the left hand side of the equation to have the form $\frac{d}{dt}(ux) = u\dot{x} + \dot{u}x$. There may be many functions $u$ for which the left hand side has this form; we only need to find one of them. To do this, note that

$$\frac{d}{dt}(ux) = u\dot{x} + upx \;\Leftrightarrow\; u\dot{x} + \dot{u}x = u\dot{x} + upx \;\Leftrightarrow\; \dot{u} = up.$$

The last equation is a separable DE for the unknown function $u$:

$$\frac{du}{u} = p(t)\,dt$$

and so:

$$\ln|u| = \int p(t)\,dt, \qquad u = e^{\int p\,dt}. \qquad (7)$$

Remember, we are looking for just one u, so any choice of anti-derivative of p(t) in equation (7) will do.

Now replace the left-hand side of (6) by $\frac{d}{dt}(ux)$ and solve for $x$:

$$u\dot{x} + upx = uq$$

$$\frac{d}{dt}(ux) = uq$$

$$u(t)x(t) = \int u(t)q(t)\,dt + c$$

$$x(t) = \frac{1}{u(t)}\left(\int u(t)q(t)\,dt + c\right)$$

This last equation is exactly the formula (5) we want to prove.

Example. Solve the ODE $\dot{x} + 2x = e^{3t}$ using the method of integrating factors.

Solution. Until you are sure you can rederive (5) in every case it is worthwhile practicing the method of integrating factors on the given differential equation. (At the end, we will model a solution that just plugs into (5).)

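Multiply both sides by $u$:

$$u\dot{x} + 2u(t)x(t) = u(t)\,e^{3t}. \qquad (8)$$

Next, find an integrating factor $u$ so that the left-hand side is equal to $\frac{d}{dt}(ux)$ (which equals $u\dot{x} + \dot{u}x$).

$$u\dot{x} + \dot{u}x = u\dot{x} + 2ux \;\Rightarrow\; \dot{u} = 2u \;\Rightarrow\; u(t) = e^{2t} \quad\text{(we choose any one } u \text{ that works)}.$$

Now substitute $u(t) = e^{2t}$ into (8), then replace the left-hand side by $\frac{d}{dt}(ux)$ and solve for $x$.

$$\frac{d}{dt}(e^{2t}x) = e^{2t}e^{3t}$$

$$\Rightarrow\; e^{2t}x = \tfrac{1}{5}e^{5t} + C \quad\text{(integrate the previous equation)}$$

$$\Rightarrow\; x(t) = \tfrac{1}{5}e^{3t} + Ce^{-2t} \quad\text{(solve for } x(t)\text{)}.$$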

Here is a model of the same solution using (5) directly.

Integrating factor: $u(t) = e^{\int 2\,dt} = e^{2t}$ (choose any one possibility).

Solution:

$$x(t) = \frac{1}{u(t)}\int u(t)e^{3t}\,dt = e^{-2t}\int e^{5t}\,dt = e^{-2t}\left(\tfrac{1}{5}e^{5t} + C\right) = \tfrac{1}{5}e^{3t} + Ce^{-2t}.$$
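As a quick check, added here and not part of the original solution, we can substitute the result back into the equation $\dot{x} + 2x = e^{3t}$:

```latex
\begin{align*}
x(t) &= \tfrac{1}{5}e^{3t} + Ce^{-2t}, \\
\dot{x}(t) &= \tfrac{3}{5}e^{3t} - 2Ce^{-2t}, \\
\dot{x} + 2x &= \left(\tfrac{3}{5} + \tfrac{2}{5}\right)e^{3t} + (-2C + 2C)\,e^{-2t} = e^{3t}.
\end{align*}
```

The terms involving $C$ cancel for every value of $C$, which is why $Ce^{-2t}$ can be added freely: it solves the associated homogeneous equation.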


4. Comparing the Integrating Factor u and xh

Recall that in section 2 we fixed one solution to the homogeneous equation (2) and called it xh. The formula for xh is

$$x_h(t) = e^{-\int p(t)\,dt},$$

where we can pick any one choice for the antiderivative. Comparing this with the formula for the integrating factor

$$u = e^{\int p(t)\,dt}$$

we get the following relationship between the two functions:

$$x_h(t) = \frac{1}{u(t)}.$$

The solution to the homogeneous equation (or for short the homogeneous solution) xh will play an extremely prominent role in the rest of the course.


Example: Heat Diffusion

Example. Heat Diffusion

Here we will model heat diffusion with a first order linear ODE. We will solve the DE using the method of integrating factors.

Every summer I put my root beer in a cooler, but after a while it still gets warm. Let’s model its temperature by an ODE. First we need to name the function that measures the temperature:

x(t) = root beer temperature at time t.

The greater the temperature difference between inside and outside, the faster x(t) changes. The simplest (linear) model of this is:

$$\dot{x}(t) = k\,(T_{ext}(t) - x(t)),$$

where $k > 0$ and $T_{ext}(t)$ is the external temperature.

One check that this makes sense (with $k > 0$) is to note that when the outside temperature $T_{ext}$ is greater than the inside temperature $x(t)$, then $\dot{x}(t) > 0$ (since $k > 0$), so the temperature is increasing. Likewise, $T_{ext} < x(t) \Rightarrow \dot{x}(t) < 0 \Rightarrow$ the temperature is decreasing.

This is indeed how heat behaves!

Rearranging the DE, we get the linear equation in standard form:

$$\dot{x} + kx = kT_{ext}(t). \qquad (1)$$

This is Newton’s law of cooling; k could depend upon t and we would still have a linear equation, but let’s suppose that we are not watching the process for so long that the insulation of the cooler starts to break down!

Systems and signals analysis:

• The system is the cooler.

• The input signal is the external temperature Text(t).

• The output signal or system response is x(t), the temperature inside the cooler.


Note that the right-hand side of equation (1) is k times the input signal, not the input signal itself. As usual, what constitutes the input and output signals is a matter of the interpretation of the equation, not of the equation itself.

To take a specific example, let

$$x(0) = 32^\circ\text{F}, \quad k = \tfrac{1}{3} \quad\text{and}\quad T_{ext}(t) = 60 + 6t \text{ in } {}^\circ\text{F},$$

where $t$ denotes hours after 10AM. (That is, outside temperature is rising linearly.) We get the following differential equation and initial value:

$$\dot{x} + \tfrac{1}{3}x = 20 + 2t, \qquad x(0) = 32. \qquad (2)$$

Solution. Again, until you can do it every time you should practice rederiving the integrating factors formula. Here we will use it directly.

Integrating factor: $u(t) = e^{\int \frac{1}{3}\,dt} = e^{t/3}$ (choose any one possibility).

Solution:

$$x(t) = \frac{1}{u(t)}\left(\int u(t)\,(20 + 2t)\,dt + C\right)$$

$$= e^{-t/3}\left(\int e^{t/3}(20 + 2t)\,dt + C\right)$$

$$= e^{-t/3}\left(60e^{t/3} + 6te^{t/3} - 18e^{t/3} + C\right) \quad\text{(using integration by parts)}$$

$$= 60 + 6t - 18 + Ce^{-t/3}$$

$$= 42 + 6t + Ce^{-t/3}.$$

All that’s left is to use the initial condition to find C. We plug in t = 0, x(0) = 32 and solve for C.

32 = x(0) = 42 + C ⇒ C = −10.

The equation describing the temperature inside my cooler is:

$$x(t) = 42 + 6t - 10e^{-t/3}.$$

We can use this to find how long it will take for my root beer to reach 60°F. (I don’t like it any warmer than that.) We need to solve

$$42 + 6t - 10e^{-t/3} = 60.$$


It is probably easiest to graph this function and read the correct value off the graph.

[Figure: temperature in °F against t in hours; the curve reaches 60 at about t = 3.52.]

Fig. 1. I have roughly 3.5 hours to enjoy my root beer.
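Instead of reading the value off a graph, the equation can also be solved numerically. The sketch below, added here for illustration, uses bisection, which is safe because the temperature is increasing in $t$ (its derivative $6 + \tfrac{10}{3}e^{-t/3}$ is positive). The function names and the bracketing interval [0, 5] are assumptions.

```python
import math


def temperature(t):
    """Root beer temperature x(t) = 42 + 6t - 10 e^(-t/3), t in hours."""
    return 42 + 6 * t - 10 * math.exp(-t / 3)


def time_to_reach(target, lo=0.0, hi=5.0, tol=1e-9):
    """Bisection for temperature(t) = target on [lo, hi].

    Assumes temperature(lo) < target < temperature(hi) and that the
    temperature is increasing on the interval.
    """
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if temperature(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


t60 = time_to_reach(60.0)
print(t60)  # about 3.5 hours, in line with the figure
```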

Remark: At this point the method of integrating factors is the only technique we have to solve this problem. For many problems it is the only technique, but as mentioned in the session introduction, we will eventually learn easier methods that work in this case.


The Meaning of k

Quiz: The meaning of k. In the root beer cooling example the DE was:

$$\dot{x}(t) = k\,(T_{ext}(t) - x(t)).$$

What does it mean for k to be large?

Choices: 1. good insulation 2. bad insulation 3. nothing to do with insulation

Answer: (2) bad insulation. When the insulation is good, k is small; when the insulation is bad, k is large. When the insulation is perfect, k is zero.

k is a coupling constant; when it is zero, the temperature inside the cooler is decoupled from the temperature outside. In the construction industry a number like k is pasted on windows; it’s called the U-value of the window.

Units

Quiz: Units. Let x(t) be the temperature of my house in degrees Celsius with t in hours. Suppose it satisfies the ODE:

$\frac{dx}{dt} + kx = kT_e(t).$

1. What are the units on k?

2. What are the units on $T_e$?

Choices: 1. Units on k:

a) degrees/hour b) degrees Celsius c) 1/hour d) k is dimensionless

2. Units on $T_e$:

a) degrees/hour b) degrees Celsius c) 1/hour d) $T_e$ is dimensionless

Answer:

1. The units on k are 1/hour: Since x is in degrees Celsius and t is in hours, dx/dt has units degrees/hour. Thus, kx has units degrees/hour, which implies k has units 1/hour.

2. The units on Te are degrees Celsius: From the equation we see that Te has the same units as x.


Superposition and the Integrating Factors Solution

1. Another Proof of the Superposition Principle

The superposition principle is so important a concept that it is worth reviewing yet again. Here we will use the integrating factors formula for the solution to first order linear ODE’s to give another simple proof of this principle.

Recall, the standard first order linear ODE is

$\dot{x} + p(t)x(t) = q(t).$ (1)

We derived the integrating factors solution

$x(t) = \frac{1}{u(t)} \left( \int u(t)q(t)\,dt + c \right)$, where $u(t) = e^{\int p(t)\,dt}$, (2)

and where the integral is any specific choice of the antiderivative and c is the constant of integration.

The superposition principle says that if

$x_1$ is a solution to $\dot{x} + p(t)x(t) = q_1(t)$

and

$x_2$ is a solution to $\dot{x} + p(t)x(t) = q_2(t)$

then for any constants a and b, $ax_1 + bx_2$ is a solution to $\dot{x} + p(t)x(t) = aq_1(t) + bq_2(t)$.

More briefly, we can write

$q_1 \rightsquigarrow x_1$ and $q_2 \rightsquigarrow x_2$ $\Rightarrow$ $aq_1 + bq_2 \rightsquigarrow ax_1 + bx_2.$ (3)

To provide another way of thinking about this key principle, we’ll rephrase it again in physical terms. If equation (1) models a physical situation and we consider q(t) to be the input then the principle shown in (3) says superposition of inputs leads to superposition of outputs.

In fact, the proof takes only a few lines. Given the separate inputs $q_1(t)$ and $q_2(t)$, formula (2) gives the separate outputs

$x_1(t) = \frac{1}{u(t)} \left( \int u(t)q_1(t)\,dt + c_1 \right)$ and $x_2(t) = \frac{1}{u(t)} \left( \int u(t)q_2(t)\,dt + c_2 \right).$

Now we use (2) to find the output for input $q = aq_1 + bq_2$. We will be able to choose any constant of integration, so, ahead of time, we choose the constant of integration to be of the form $ac_1 + bc_2$. Using the standard properties of integrals, the output is then

$x(t) = \frac{1}{u(t)} \left( \int u(t)(aq_1(t) + bq_2(t))\,dt + ac_1 + bc_2 \right)$

$= \frac{a}{u(t)} \left( \int u(t)q_1(t)\,dt + c_1 \right) + \frac{b}{u(t)} \left( \int u(t)q_2(t)\,dt + c_2 \right)$

$= ax_1(t) + bx_2(t)$ (which is what needed to be proved).

2. General = Particular + Homogeneous

In the general solution (2) we made a specific choice of the integral. Setting c = 0 then leads to a specific choice of the solution

$x_p = \frac{1}{u(t)} \int u(t)q(t)\,dt.$

We call xp a particular solution, but this is a very poor name because there is nothing particularly particular about it. It is simply one specific solution. We could have chosen any other.

In the first note of this session we saw that the solution to the homogeneous equation (i.e., when q(t) ≡ 0) is related to the integrating factor u by $x_h(t) = 1/u(t)$. Using $x_p$ and $x_h$ we can rewrite the general solution (2) as

x(t) = xp(t) + cxh(t).

This tells us something interesting: one way to fully solve the inhomogeneous equation (1) is to first solve the homogeneous equation and then find any one solution, i.e., a particular solution, to the inhomogeneous equation.

We can use any method we want to find xp. One method is the method of integrating factors, but for many equations we will have easier methods.

Example 1. Find the general solution to $\dot{x} + \frac{1}{t}x = t^2$ by finding an $x_h$ and an $x_p$.

Solution. First we find $x_h$. The associated homogeneous equation is $\dot{x} + \frac{1}{t}x = 0$. This is separable and we easily solve it as follows.

Separate variables: $\frac{dx}{x} = -\frac{dt}{t}$

Integrate: $\ln|x| = -\ln|t| + c$

Set c = 0, drop absolute values and exponentiate: $x_h(t) = \frac{1}{t}$.


Next, we use an integrating factor to find $x_p$. Formula (2) says $u(t) = e^{\int 1/t\,dt} = e^{\ln(t)} = t$. (Of course, we knew this since $u = 1/x_h$.) Thus, (again arbitrarily choosing the constant of integration to be 0)

$x_p(t) = \frac{1}{u(t)} \int u(t)\,t^2\,dt = \frac{1}{t} \int t^3\,dt = \frac{t^3}{4}.$

The general solution to the problem is therefore

$x(t) = x_p(t) + cx_h(t) = \frac{t^3}{4} + \frac{c}{t}.$ (4)

Notice, if we were a computer that didn’t know any better, we might have chosen a different $x_p(t)$, say $x_p(t) = \frac{t^3}{4} + \frac{1}{t}$. We know this is a solution to our DE and so it has every right to be called a particular solution. In this case we would write our general solution as

$x(t) = x_p(t) + cx_h(t) = \frac{t^3}{4} + \frac{1}{t} + \frac{c}{t}.$ (5)

Equations (4) and (5) are both valid as general solutions. This is because both equations really represent a whole family of solutions (you get a different family member for each value of c) and each family contains the same set of solutions. For example, we get the same solution if we take c = 5 in equation (4) or if we take c = 4 in equation (5).
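The claim that (4) and (5) describe the same family can be spot-checked numerically. This is only a sketch: the names `x4`, `x5` and `residual` are hypothetical, and the derivative is approximated by a central difference rather than computed exactly.

```python
def x4(t, c):
    """General solution in the form (4): t^3/4 + c/t."""
    return t**3 / 4 + c / t

def x5(t, c):
    """General solution in the form (5): t^3/4 + 1/t + c/t."""
    return t**3 / 4 + 1 / t + c / t

def residual(x, t, c, h=1e-5):
    """Approximate x'(t) + x(t)/t - t^2, which should vanish for a solution."""
    dx = (x(t + h, c) - x(t - h, c)) / (2 * h)  # central-difference derivative
    return dx + x(t, c) / t - t**2

for t in (0.5, 1.0, 2.0):
    print(t, residual(x4, t, 5), residual(x5, t, 4), x4(t, 5) - x5(t, 4))
```

All three printed quantities should be zero up to rounding and discretization error, reflecting that c = 5 in (4) and c = 4 in (5) give the same solution.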

Example 2. Find the general solution to $\dot{x} + 2x = 4$.

Solution. The associated homogeneous equation is $\dot{x} + 2x = 0$. This models exponential decay and has a solution $x_h(t) = e^{-2t}$.

We’ll use the method of optimism to find a particular solution. Since the right-hand side is a constant we guess a constant solution. By inspection we see that xp(t) = 2 is one solution.

Combining the homogeneous and particular solutions, we get that the general solution is

$x(t) = 2 + ce^{-2t}.$

Example 3. Use the superposition principle to explain why x(t) = xp(t)+ cxh(t) is a solution to (1).

Solution. Let’s use the language of inputs/outputs and call the right-hand side of (1) the input. The superposition principle says a superposition of inputs leads to a superposition of outputs.

That is, since $x_p$ is a solution to the ODE with input q(t) and $x_h$ is a solution with input 0, we get the superposition $x_p + cx_h$ is a solution with input $q(t) + c \cdot 0 = q(t)$. This is exactly what we were asked to show.

Is it Particular?

Quiz: The first order linear DE $\dot{x} + kx = t$ has general solution

$x(t) = t/k - 1/k^2 + ce^{-kt}.$

Which of the following could be chosen as a particular solution to the DE?

a. $t/k - 1/k^2$

b. $t/k - 1/k^2 + 3e^{-kt}$

c. $t/k - 1/k^2 + ce^{-kt}$

d. $e^{-kt}$

Choices: 1. (a) only 2. (b) only 3. (d) only 4. (a) and (b) only 5. (a), (b) and (c) only 6. All of them.

Answer: (4): (a) and (b). (a) and (b) are both specific solutions so they can be particular solutions. (c) is the general solution, so it is not a particular solution. (We will accept the argument that c could be a specific constant and therefore this could be a particular solution.) (d) is a homogeneous solution not an inhomogeneous one.


Complex Arithmetic and Exponentials: Introduction

Complex numbers will be a fundamental part of the toolkit for this course. Using the complex roots of polynomials and allowing exponentials with complex exponents will simplify and unify our study of constant coefficient linear ODE’s. The key will be Euler’s formula

$e^{it} = \cos(t) + i\sin(t).$

This formula will allow us to replace trigonometric functions, which have hard-to-remember, hard-to-manipulate identities, by complex exponentials, which have rules that are easy to remember and manipulate.

Complex Arithmetic

1. History

Most people think that complex numbers arose from attempts to solve quadratic equations, but actually they first appeared in connection with cubic equations. Everyone knew that certain quadratic equations, like

$x^2 + 1 = 0$ or $x^2 + 2x + 5 = 0$

had no solutions. The problem was with certain cubic equations, for example

$x^3 - 6x + 2 = 0.$

This equation was known to have three real roots, given by simple combinations of the expressions

$A = \sqrt[3]{-1 + \sqrt{-7}}, \quad B = \sqrt[3]{-1 - \sqrt{-7}}.$ (1)

For instance, one of the roots is A + B; it may not look like a real number, but it turns out to be one.

What was to be made of the expressions A and B? They were viewed as some sort of “imaginary numbers” which had no meaning in themselves, but which were useful as intermediate steps in calculations which would ultimately lead to the real numbers you were looking for (such as A + B).

This point of view persisted for several hundred years. But as more and more applications for these “imaginary numbers” were found, they gradually began to be accepted as valid “numbers” in their own rights, even though they did not measure the length of any line segment. Nowadays we are fairly generous in our use of the word “number”: numbers of one sort or another don’t have to measure anything, but to merit the name they must belong to a system in which some type of addition, subtraction, multiplication, and division is possible, and where these operations obey those laws of arithmetic one learns in elementary school and has usually forgotten by high school — the commutative, associative, and distributive laws.

2. Definitions

To describe the complex numbers, we use a formal symbol i representing $\sqrt{-1}$; then a complex number is an expression of the form:

$a + ib$, a, b real numbers. (2)

If a = 0 or b = 0, they are omitted (unless both are 0); thus we write

a + i0 = a, 0 + ib = ib, 0 + i0 = 0

The definition of equality between two complex numbers is

$a + ib = c + id \ \Leftrightarrow\ a = c,\ b = d$ (a, b, c, d real numbers). (3)

This shows that the numbers a and b are uniquely determined once the complex number a + ib is given; we call them respectively the real and imaginary parts of a + ib. (It might seem logical to call ib the imaginary part, but this would be less convenient.) In symbols,

a = Re(a + ib) b = Im(a + ib) (4)

Addition and multiplication of complex numbers are defined in the familiar way, making use of the fact that $i^2 = -1$:

Addition

(a + ib) + (c + id) = (a + c) + i(b + d) (5)

Multiplication

(a + ib)(c + id) = (ac − bd) + i(ad + bc). (6)

Division is a little more complicated; what is important is not so much the final formula as the procedure that produces it; assuming $c + id \neq 0$, it is:

Division

$\frac{a + ib}{c + id} = \frac{a + ib}{c + id} \cdot \frac{c - id}{c - id} = \frac{ac + bd}{c^2 + d^2} + i\,\frac{bc - ad}{c^2 + d^2}.$ (7)

This division procedure made use of complex conjugation: if z = a + ib, we define the complex conjugate of z to be the complex number

$\bar{z} = a - ib$ (note that $z\bar{z} = a^2 + b^2$). (8)

The size of a complex number is measured by its absolute value, or modulus, defined by:

$|z| = |a + ib| = \sqrt{a^2 + b^2}$; (thus: $z\bar{z} = |z|^2$). (9)

Complex Arithmetic Examples

In the following we let z = 2 + 3i and w = 4 + 5i.

1. Real and Imaginary Parts

Re(z) = 2, Im(z) = 3, Re(w) = 4, Im(w) = 5.

Note: the imaginary part does not include i.

2. Addition and Subtraction

z + w = (2 + 3i) + (4 + 5i) = 6 + 8i z − w = (2 + 3i) − (4 + 5i) = −2 − 2i.

3. Multiplication

$z \cdot w = (2 + 3i)(4 + 5i) = 8 - 15 + i(10 + 12) = -7 + 22i.$

4. Complex Conjugate and Magnitude

$\bar{z} = \overline{2 + 3i} = 2 - 3i$

$|z| = \sqrt{4 + 9} = \sqrt{13}$

$z + \bar{z} = 2 + 3i + 2 - 3i = 4 = 2\,\text{Re}(z)$

$z \cdot \bar{z} = (2 + 3i)(2 - 3i) = 4 + 9 = 13 = |z|^2$

5. Division

Multiply numerator and denominator by the complex conjugate of the denominator:

$\frac{w}{z} = \frac{4 + 5i}{2 + 3i} = \frac{4 + 5i}{2 + 3i} \cdot \frac{2 - 3i}{2 - 3i} = \frac{8 + 15 + i(-12 + 10)}{13} = \frac{23}{13} - \frac{2}{13}i.$
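These hand computations can be cross-checked with Python's built-in complex type, where the imaginary unit is written `j`. This is just a sketch for checking arithmetic; the variable names mirror the z and w used in the examples.

```python
z = 2 + 3j
w = 4 + 5j

print(z.real, z.imag)           # real and imaginary parts (the imaginary part excludes i)
print(z + w, z - w)             # addition and subtraction
print(z * w)                    # multiplication
print(z.conjugate(), abs(z))    # complex conjugate and magnitude
print(w / z)                    # division
```

The quotient is printed in floating point, so it agrees with 23/13 − (2/13)i only up to rounding.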

Multiplication by i

Quiz: Multiplication by i. Multiplication by i has what effect on a complex number in the complex plane?

Choices:

a) It rotates the number around the origin by 90 degrees counterclockwise.

b) It rotates the number around the origin by 90 degrees clockwise.

c) It takes a number to the number pointing in the opposite direction with the same distance from the origin.

d) It reflects the number across the imaginary axis.

e) It reflects the number across the real axis.

f) None of the above.

g) I don’t know.

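Answer: (a).

We compute: i(a + bi) = −b + ai, which is a + bi rotated by 90 degrees counterclockwise.

[Figure: the points a + bi and −b + ai plotted in the complex plane.]

For example, i · 1 = i and i · i = −1.

[Figure: the points 1, i, and −1 in the complex plane, each obtained from the previous one through multiplication by i.]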


Complex Conjugation

Quiz: Complex Conjugation. If $z = -\bar{z}$, what does that tell us about the value of z = a + bi?

Choices:

a) z is purely imaginary.

b) z is real.

c) z has length 1.

d) z = 0.

e) None of the above.

Answer: (a)

a + bi = −(a − bi) implies a = −a, so a = 0.


Euler’s Formula, Polar Representation

1. The Complex Plane

Complex numbers are represented geometrically by points in the plane: the number a + ib is represented by the point (a, b) in Cartesian coordinates. When the points of the plane are thought of as representing complex numbers in this way, the plane is called the complex plane.

By switching to polar coordinates, we can write any non-zero complex number in an alternative form. Letting as usual

x = r cos(θ), y = r sin(θ)

we get the polar form for a non-zero complex number: assuming $x + iy \neq 0$,

$x + iy = r(\cos(\theta) + i\sin(\theta)).$ (1)

When the complex number is written in polar form,

$r = |x + iy| = \sqrt{x^2 + y^2}.$ (absolute value, modulus).

We call θ the polar angle or the argument of x + iy. In symbols, one sometimes sees:

θ = arg(x + iy). (polar angle, argument).

The absolute value is uniquely determined by x + iy but the polar angle is not, since it can be increased by any integer multiple of 2π. (The complex number 0 has no polar angle.) To make θ unique, one can specify

0 ≤ θ < 2π. (principal value).

This so-called principal value of the angle is sometimes indicated by writing Arg(x + iy). For example,

Arg(−1) = π, arg(−1) = ±π, ±3π, ±5π, · · ·

Changing between Cartesian and polar representation of a complex number is essentially the same as changing between Cartesian and polar coordinates: the same equations are used and the same triangle appears in the plane. The figure below shows this. (You will learn what $e^{i\theta}$ means in the next section.)

[Figure: the point $z = x + iy = re^{i\theta}$ in the complex plane, at distance $r = |z|$ from the origin and polar angle $\theta$, together with its conjugate $\bar{z} = x - iy = re^{-i\theta}$.]

Fig 1. The complex plane.

Example 1. Give the polar form for: $-i$, $1 + i$, $1 - i$, $-1 + i\sqrt{3}$.

Solution.

$-i = i\sin(3\pi/2)$

$1 + i = \sqrt{2}(\cos(\pi/4) + i\sin(\pi/4))$

$-1 + i\sqrt{3} = 2(\cos(2\pi/3) + i\sin(2\pi/3))$

$1 - i = \sqrt{2}(\cos(-\pi/4) + i\sin(-\pi/4)).$

2. Euler’s Formula

The abbreviation cis θ is sometimes used for cos(θ) + i sin(θ); for students of science and engineering, however, it is important to get used to the exponential form for this expression:

$e^{i\theta} = \cos(\theta) + i\sin(\theta)$ Euler’s formula. (2)

Equation (2) should be regarded as the definition of the exponential of an imaginary power. A good justification for it is found in the infinite series:

$e^t = 1 + \frac{t}{1!} + \frac{t^2}{2!} + \frac{t^3}{3!} + \cdots$

If we substitute $i\theta$ for t in the series and collect the real and imaginary parts of the sum (remembering that $i^2 = -1$, $i^3 = -i$, $i^4 = 1$, $i^5 = i$, and so on), we get:

$e^{i\theta} = \left( 1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots \right) + i \left( \theta - \frac{\theta^3}{3!} + \frac{\theta^5}{5!} - \cdots \right)$

$= \cos(\theta) + i\sin(\theta)$

in view of the infinite series representations for cos(θ) and sin(θ).

Since we only know that the series expansion for $e^t$ is valid when t is a real number, the above argument is only suggestive; it is not a proof of (2). What it shows is that Euler’s formula (2) is formally compatible with the series expansions for the exponential, sine, and cosine functions.
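The formal compatibility described here can be illustrated numerically by summing the first few terms of the exponential series at an imaginary argument. This is a sketch, not a proof; the function name `exp_series`, the choice of 30 terms, and the sample angle are assumptions for illustration.

```python
import math

def exp_series(z, terms=30):
    """Partial sum 1 + z/1! + z^2/2! + ... with the given number of terms."""
    total = 0
    term = 1                      # current term z^n / n!, starting at n = 0
    for n in range(terms):
        total += term
        term = term * z / (n + 1)
    return total

theta = 2.0
approx = exp_series(1j * theta)                      # series with t replaced by i*theta
exact = complex(math.cos(theta), math.sin(theta))    # right-hand side of Euler's formula
print(approx, exact, abs(approx - exact))
```

For moderate values of θ the two results agree to within floating-point rounding.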

3. Polar Representation

Using the complex exponential, the polar representation (1) is written:

$x + iy = re^{i\theta}.$ (3)

The most important reason for polar representation is that multiplication of complex numbers is particularly simple when they are written in polar form. Indeed, by using Euler’s formula (2) and the trigonometric addition formulas, it is not hard to show that

$e^{i\theta_1} e^{i\theta_2} = e^{i(\theta_1 + \theta_2)}.$ (4)

This gives another justification for the definition (2): the complex exponential follows the same exponential addition rules as the real exponential. The law (4) leads to the simple rules for multiplying and dividing complex numbers written in polar form:

Multiplication Rule

$r_1 e^{i\theta_1} \cdot r_2 e^{i\theta_2} = r_1 r_2\, e^{i(\theta_1 + \theta_2)}.$ (5)

To multiply two complex numbers, you multiply the absolute values and add the angles.

Reciprocal Rule

$\frac{1}{re^{i\theta}} = \frac{1}{r} e^{-i\theta};$ (6)

Division Rule

$\frac{r_1 e^{i\theta_1}}{r_2 e^{i\theta_2}} = \frac{r_1}{r_2} e^{i(\theta_1 - \theta_2)}.$ (7)

To divide by a complex number, divide by its absolute value and subtract its angle.

The reciprocal rule (6) follows from (5), which shows that

$\frac{1}{r} e^{-i\theta} \cdot re^{i\theta} = 1.$


Using (5), we can raise x + iy to a positive integer power by first using $x + iy = re^{i\theta}$:

$(x + iy)^n = r^n e^{in\theta};$ (8)

DeMoivre’s formula: The special case when r = 1 is called DeMoivre’s Formula:

$(\cos(\theta) + i\sin(\theta))^n = \cos(n\theta) + i\sin(n\theta).$ (9)

Example 2. Express:

a) $(1 + i)^6$ in Cartesian form;

b) $\frac{1 + i\sqrt{3}}{\sqrt{3} + i}$ in polar form.

Solution.

a) Change to polar form, use (8), then change back to Cartesian form:

$(1 + i)^6 = (\sqrt{2}e^{i\pi/4})^6 = (\sqrt{2})^6 e^{i6\pi/4} = 8e^{i3\pi/2} = -8i.$

b) Changing to polar form, $\frac{1 + i\sqrt{3}}{\sqrt{3} + i} = \frac{2e^{i\pi/3}}{2e^{i\pi/6}} = e^{i\pi/6}$, using the division rule (7).

You can check the answer to (a) by applying the binomial theorem to $(1 + i)^6$ and collecting the real and imaginary parts; to (b) by doing the division in the Cartesian form then converting the answer to polar form.
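A third way to check both answers is numerically, using Python's standard `cmath` module; `cmath.polar` returns the pair (absolute value, polar angle). The variable names below are chosen only for this sketch.

```python
import cmath
import math

# (a) (1 + i)^6 in Cartesian form
power = (1 + 1j) ** 6
print(power)

# (b) (1 + i*sqrt(3)) / (sqrt(3) + i) in polar form
quotient = (1 + 1j * math.sqrt(3)) / (math.sqrt(3) + 1j)
r, theta = cmath.polar(quotient)
print(r, theta, math.pi / 6)
```

Up to rounding, the first result is −8i and the second has absolute value 1 and polar angle π/6.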

3.1. Combining pure oscillations of the same frequency.

The equation which does this is widely used in physics and engineering; it can be expressed using complex numbers:

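$a\cos(\lambda t) + b\sin(\lambda t) = A\cos(\lambda t - \phi)$, where $a + bi = Ae^{i\phi}$; (10)

in other words, $A = \sqrt{a^2 + b^2}$, $\phi = \tan^{-1}(b/a)$. To prove (10), we have:

$a\cos(\lambda t) + b\sin(\lambda t) = \text{Re}((a - bi) \cdot (\cos(\lambda t) + i\sin(\lambda t)))$

$= \text{Re}(Ae^{-i\phi} \cdot e^{i\lambda t})$

$= \text{Re}(Ae^{i(\lambda t - \phi)})$

$= A\cos(\lambda t - \phi).$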
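Identity (10) can be spot-checked numerically. In the sketch below the helper name `amplitude_phase` and the sample values of a, b and λ are assumptions for illustration. The polar angle is taken from `cmath.polar`, which picks the angle in the correct quadrant, whereas $\tan^{-1}(b/a)$ on its own gives the right φ only when a > 0.

```python
import cmath
import math

def amplitude_phase(a, b):
    """Return (A, phi) with a + bi = A e^{i phi}, i.e. the polar coordinates of (a, b)."""
    return cmath.polar(complex(a, b))

a, b, lam = 3.0, -4.0, 2.0
A, phi = amplitude_phase(a, b)

max_diff = 0.0
for t in (0.0, 0.7, 1.9, 5.3):
    lhs = a * math.cos(lam * t) + b * math.sin(lam * t)
    rhs = A * math.cos(lam * t - phi)
    max_diff = max(max_diff, abs(lhs - rhs))

print(A, phi, max_diff)
```

Here A comes out as 5, and the two sides of (10) agree at every sample time up to rounding.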

Complex Powers

Quiz: Complex Powers. $(1 + i)^4 = ?$

Choices:

a) −1

b) 4

c) −4

d) $-\sqrt{2}$

e) 4i

Answer: In polar form

$1 + i = \sqrt{2}e^{i\pi/4}.$

Thus, $(1 + i)^4 = (\sqrt{2}e^{i\pi/4})^4 = 4e^{i\pi} = -4.$

The powers of 1 + i all lie on a spiral emanating from the origin.

The answer is (c). (Thus, (1 + i) is a fourth root of −4.)


Applet Exploration: Complex Exponential

Start by opening the Complex Exponential applet.

The Unit Circle

1. Set a = 0 and b = 1. As you change t notice that $e^{ibt}$ always lies on the unit circle. What happens if you change the value of b leaving a = 0? Explain this in terms of sin and cos.

2. When you increase a from 0 what happens to the circle? Explain this by expanding $e^{(a+bi)t}$.

Complex Exponentials

Because of the importance of complex exponentials in differential equations, and in science and engineering generally, we go a little further with them. Euler’s formula defines the exponential to a pure imaginary power. The definition of an exponential to an arbitrary complex power is:

$e^{a+ib} = e^a e^{ib} = e^a(\cos(b) + i\sin(b)).$ (1)

We stress that the equation (1) is a definition, not a self-evident truth, since up to now no meaning has been assigned to the left-hand side. From (1) we see that

$\text{Re}(e^{a+ib}) = e^a\cos(b), \quad \text{Im}(e^{a+ib}) = e^a\sin(b).$ (2)

The complex exponential obeys the usual law of exponents:

$e^{z+z'} = e^z e^{z'},$ (3)

as is easily seen by combining (1) with the multiplication rule for complex numbers.

The complex exponential is expressed in terms of the sine and cosine functions by Euler’s formula. Conversely, the sine and cosine functions can be expressed in terms of complex exponentials. There are two important ways of doing this, both of which you should learn:

$\cos(x) = \text{Re}(e^{ix}), \quad \sin(x) = \text{Im}(e^{ix});$ (4)

$\cos(x) = \frac{1}{2}(e^{ix} + e^{-ix}), \quad \sin(x) = \frac{1}{2i}(e^{ix} - e^{-ix}).$ (5)

The equations in (5) follow easily from Euler’s formula; their derivation is left as an exercise. Here are some examples of their use.

Example. Express $\cos^3(x)$ in terms of the functions cos(nx), for suitable n.

Solution. We use (5) and the binomial theorem, then (5) again:

$\cos^3(x) = \frac{1}{8}(e^{ix} + e^{-ix})^3$

$= \frac{1}{8}(e^{3ix} + 3e^{ix} + 3e^{-ix} + e^{-3ix})$

$= \frac{1}{4}\cos(3x) + \frac{3}{4}\cos(x).$

As a preliminary to the next example, we note that a function like

$e^{ix} = \cos(x) + i\sin(x)$

is a complex-valued function of the real variable x. Such a function may be written as

$u(x) + iv(x)$, u, v real-valued

and its derivative and integral with respect to x are defined to be

a) $D(u + iv) = Du + iDv$, b) $\int (u + iv)\,dx = \int u\,dx + i\int v\,dx.$ (6)

From this it follows easily that

$D(e^{(a+ib)x}) = (a + ib)e^{(a+ib)x},$

and therefore

$\int e^{(a+ib)x}\,dx = \frac{1}{a + ib}\,e^{(a+ib)x}.$ (7)

Example. Calculate $\int e^x\cos 2x\,dx$ by using complex exponentials.

Solution. The usual method is a tricky use of two successive integrations by parts. Using complex exponentials instead, the calculation is straightforward. We have

$e^{(1+2i)x} = e^x\cos(2x) + ie^x\sin(2x)$, by (1);

therefore by (6b)

$\int e^x\cos(2x)\,dx = \text{Re}\left( \int e^{(1+2i)x}\,dx \right).$

Calculating the integral,

$\int e^{(1+2i)x}\,dx = \frac{1}{1 + 2i}\,e^{(1+2i)x}$ by (7)

$= \left( \frac{1}{5} - \frac{2}{5}i \right)(e^x\cos(2x) + ie^x\sin(2x)),$

using (1) and complex division. According to the second line above, we want the real part of this last expression. Multiply and take the real part; you get the answer

$\int e^x\cos(2x)\,dx = \frac{1}{5}e^x\cos(2x) + \frac{2}{5}e^x\sin(2x).$
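The antiderivative can be sanity-checked by comparing it with a numerical integral over a sample interval. This is a sketch only: the helper names `antiderivative` and `simpson`, the interval [0, 1.5], and the use of Simpson's rule are choices made for this illustration.

```python
import math

def antiderivative(x):
    """The result found above: (1/5) e^x cos(2x) + (2/5) e^x sin(2x)."""
    return math.exp(x) * (math.cos(2 * x) + 2 * math.sin(2 * x)) / 5

def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * f(a + k * h)
    return total * h / 3

numeric = simpson(lambda x: math.exp(x) * math.cos(2 * x), 0.0, 1.5)
closed_form = antiderivative(1.5) - antiderivative(0.0)
print(numeric, closed_form)

# The complex division used in the solution: 1/(1 + 2i) = 1/5 - (2/5)i
print(1 / (1 + 2j))
```

The two printed integrals should agree to many decimal places.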


In this differential equations course, we will make free use of complex exponentials in solving differential equations, and in doing formal calculations like the ones above. This is standard practice in science and engineering, and it’s well worth getting used to.



Quiz: Complex Exponentials. The magnitude of $e^{(a+bi)t}$ is $e^{at}$, and the argument of $e^{(a+bi)t}$ is bt. When a > 0 and b > 0, we can think of $e^{(a+bi)t}$ as a point in the complex plane which traces out a path as t varies.

The curve in the complex plane traced out by

$e^{(1+2\pi i)t}$

most closely resembles which of the following?

Choices:

a) A straight ray along the positive real axis

b) A circle with radius e and center at the origin

c) A circle with radius 1 and center at the origin

d) A spiral moving inwards and counterclockwise

e) A spiral moving outwards and counterclockwise

f) A spiral moving inwards and clockwise

g) A spiral moving outwards and clockwise

Answer: The magnitude of $e^{(1+2\pi i)t}$ is $e^t$ and the argument is $2\pi t$. Both increase with t, so the curve spirals outwards and counterclockwise: the answer is (e).





Finding n-th Roots

To solve linear differential equations with constant coefficients, we need to be able to find the real and complex roots of polynomial equations. Though a lot of this is done today with calculators and computers, one still has to know how to do an important special case by hand: finding the roots of

$z^n = \alpha,$

where α is a complex number, i.e., finding the n-th roots of α. Polar representation will be a big help in this.

Let’s begin with a special case: the n-th roots of unity: the solutions to

$z^n = 1.$

To solve this equation, we use polar representation for both sides, setting $z = re^{i\theta}$ on the left, and using all possible polar angles on the right; using the exponential law to multiply, the above equation then becomes

$r^n e^{in\theta} = 1 \cdot e^{2k\pi i}, \quad k = 0, \pm 1, \pm 2, \cdots.$

Equating the absolute values and the polar angles of the two sides gives

$r^n = 1, \quad n\theta = 2k\pi, \quad k = 0, \pm 1, \pm 2, \cdots,$

from which we conclude that

$r = 1, \quad \theta = \frac{2k\pi}{n}, \quad k = 0, 1, \cdots, n - 1.$ (1)

In the above, we get only the value r = 1, since r must be real and nonnegative. We don’t need any integer values of k other than $0, \cdots, n - 1$, since they would not produce a complex number different from the above n numbers. That is, if we add an, an integer multiple of n, to k, we get the same complex number:

$\theta' = \frac{2(k + an)\pi}{n} = \theta + 2a\pi$; and $e^{i\theta'} = e^{i\theta}$, since $e^{2a\pi i} = (e^{2\pi i})^a = 1.$

We conclude from (1) therefore that

the n-th roots of 1 are the numbers $e^{2k\pi i/n}$, $k = 0, \cdots, n - 1$. (2)


This shows there are n complex n-th roots of unity. They all lie on the unit circle in the complex plane, since they have absolute value 1; they are evenly spaced around the unit circle, starting with the root z = 1; the angle between two consecutive roots is 2π/n. These facts are illustrated for the case n = 6 in the figure below

[Figure: the six 6th roots of unity, evenly spaced around the unit circle in the complex plane.]

Fig. 1. The six solutions to the equation $z^6 = 1$ lie on a unit circle in the complex plane.

From (2), we get another notation for the roots of unity (ζ is the Greek letter “zeta”):

the n-th roots of 1 are $1, \zeta, \zeta^2, \cdots, \zeta^{n-1}$, where $\zeta = e^{2\pi i/n}$. (3)

We now generalize the above to find the n-th roots of an arbitrary complex number w. We begin by writing w in polar form:

$w = re^{i\theta}$; $\theta = \text{Arg}\,w$, $0 \le \theta < 2\pi$,

i.e., θ is the principal value of the polar angle of w. Then the same reasoning as we used above shows that if z is an n-th root of w, then

$z^n = w = re^{i\theta}$, so $z = \sqrt[n]{r}\,e^{i(\theta + 2k\pi)/n}, \quad k = 0, 1, \cdots, n - 1.$ (4)

Comparing this with (3), we see that these n roots can be written in the suggestive form

$\sqrt[n]{w} = z_0, z_0\zeta, z_0\zeta^2, \cdots, z_0\zeta^{n-1}$, where $z_0 = \sqrt[n]{r}\,e^{i\theta/n}$. (5)

As a check, we see that all of the n complex numbers in (5) satisfy $z^n = w$. Writing j for the integer exponent, to avoid confusion with the imaginary unit:

$(z_0\zeta^j)^n = z_0^n\zeta^{nj} = z_0^n \cdot 1^j$, since $\zeta^n = 1$, by (3);

$= w$, by the definition (5) of $z_0$ and (4).

Example. Find in Cartesian form all values of a) $\sqrt[3]{1}$ b) $\sqrt[4]{i}$.

Solution. a) According to (3), the cube roots of 1 are 1, ω, and $\omega^2$, where

$\omega = e^{2\pi i/3} = \cos(2\pi/3) + i\sin(2\pi/3) = -\frac{1}{2} + i\frac{\sqrt{3}}{2}$

$\omega^2 = e^{-2\pi i/3} = \cos(-2\pi/3) + i\sin(-2\pi/3) = -\frac{1}{2} - i\frac{\sqrt{3}}{2}.$

The Greek letter ω (“omega”) is traditionally used for this cube root. Note that for the polar angle of $\omega^2$ we used $-2\pi/3$ rather than the equivalent angle $4\pi/3$, in order to take advantage of the identities

$\cos(-x) = \cos(x), \quad \sin(-x) = -\sin(x).$

Note that $\omega^2 = \bar{\omega}$. Another way to do this problem would be to draw the position of $\omega^2$ and ω on the unit circle and use geometry to figure out their coordinates.

b) To find $\sqrt[4]{i}$, we can use (5). We know that $\sqrt[4]{1} = 1, i, -1, -i$ (either by drawing the unit circle picture or by using (3)). Therefore by (5), we get

$\sqrt[4]{i} = z_0, z_0 i, -z_0, -z_0 i$, where $z_0 = e^{\pi i/8} = \cos(\pi/8) + i\sin(\pi/8)$;

$= a + ib, -b + ia, -a - ib, b - ia$, where $z_0 = a + ib = \cos(\pi/8) + i\sin(\pi/8)$.
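Formula (4) translates directly into a few lines of code. The sketch below uses the hypothetical helper name `nth_roots`; note also that `cmath.polar` returns a polar angle in (−π, π] rather than in [0, 2π), which can change the order in which the roots are listed but not the set of roots.

```python
import cmath
import math

def nth_roots(w, n):
    """Return the n complex n-th roots of a nonzero number w, following formula (4)."""
    r, theta = cmath.polar(w)     # w = r e^{i theta}
    return [r ** (1 / n) * cmath.exp(1j * (theta + 2 * math.pi * k) / n)
            for k in range(n)]

cube_roots_of_1 = nth_roots(1, 3)       # 1, omega, omega^2
fourth_roots_of_i = nth_roots(1j, 4)    # z0, z0*i, -z0, -z0*i

for z in cube_roots_of_1:
    print(z, z ** 3)
for z in fourth_roots_of_i:
    print(z, z ** 4)
```

Raising each computed root back to the n-th power recovers w up to rounding, which mirrors the check carried out by hand above.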

Example. Solve the equation $x^6 - 2x^3 + 2 = 0$.

Solution. Treating this as a quadratic equation in $x^3$, we solve the quadratic by using the quadratic formula; the two roots are 1 + i and 1 − i (check this!), so the roots of the original equation satisfy either

$x^3 = 1 + i$ or $x^3 = 1 - i.$

This reduces the problem to finding the cube roots of the two complex numbers 1 ± i. We begin by writing them in polar form:

$1 + i = \sqrt{2}e^{\pi i/4}, \quad 1 - i = \sqrt{2}e^{-\pi i/4}.$

(Once again, note the use of the negative polar angle for 1 − i, which is more convenient for calculations.) The three cube roots of the first of these are (by (4)),

$\sqrt[6]{2}\,e^{\pi i/12} = \sqrt[6]{2}(\cos(\pi/12) + i\sin(\pi/12))$

$\sqrt[6]{2}\,e^{3\pi i/4} = \sqrt[6]{2}(\cos(3\pi/4) + i\sin(3\pi/4))$, since $\frac{\pi}{12} + \frac{2\pi}{3} = \frac{3\pi}{4}$;

$\sqrt[6]{2}\,e^{-7\pi i/12} = \sqrt[6]{2}(\cos(7\pi/12) - i\sin(7\pi/12))$, since $\frac{\pi}{12} - \frac{2\pi}{3} = -\frac{7\pi}{12}$.

The second cube root can also be written as $\sqrt[6]{2}\left(\frac{-1 + i}{\sqrt{2}}\right) = \frac{-1 + i}{\sqrt[3]{2}}$.

This gives three of the six roots. The other three are the cube roots of 1 − i, which may be found by replacing i by −i everywhere (i.e., taking the complex conjugate).

The cube roots can also be described according to (5) as

$z_1, z_1\omega, z_1\omega^2$ and $z_2, z_2\omega, z_2\omega^2$, where $z_1 = \sqrt[6]{2}\,e^{\pi i/12}$, $z_2 = \sqrt[6]{2}\,e^{-\pi i/12}$.
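As a numerical check on this example, the six roots can be built from the polar forms of 1 ± i and substituted back into the polynomial. The names `p` and `roots` are assumptions made for this sketch.

```python
import cmath
import math

def p(x):
    """The polynomial x^6 - 2x^3 + 2 from the example."""
    return x ** 6 - 2 * x ** 3 + 2

roots = []
for sign in (1, -1):                          # cube roots of 1 + i, then of 1 - i
    r, theta = cmath.polar(1 + sign * 1j)     # r = sqrt(2), theta = +/- pi/4
    for k in range(3):
        roots.append(r ** (1 / 3) * cmath.exp(1j * (theta + 2 * math.pi * k) / 3))

for z in roots:
    print(z, abs(p(z)))
```

Each value of |p(z)| should be zero up to rounding, and the second root in the list matches $(-1 + i)/\sqrt[3]{2}$.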

Sinusoidal Functions: Introduction

In ways that will become clear as the course progresses sinusoidal functions play a key role. In particular, for constant coefficient equations they are the most important type of input.

In this session we will use Euler’s formula to gain a more detailed understanding and description of sinusoidal functions in terms of their amplitudes and phase angles. These notions will be used throughout the remainder of the course.

Sinusoidal Functions

1. Definitions

A sinusoidal function (or sinusoidal oscillation or sinusoidal signal) is one that can be written in the form

f (t) = A cos(ωt − φ). (1)

The function f (t) is a cosine function which has been amplified by A, shifted by φ/ω, and compressed by ω.

• A > 0 is its amplitude: how high the graph of f (t) rises above the t-axis at its maximum values;

• φ is its phase lag: the value of ωt for which the graph has its maximum (if φ = 0, the graph has the position of cos(ωt); if φ = π/2, it has the position of sin(ωt));

• τ = φ/ω is its time delay or time lag: how far along the t-axis the graph of cos(ωt) has been shifted to make the graph of (1); (to see this, write A cos(ωt − φ) = A cos(ω(t − φ/ω)))

• ω is its angular frequency: the number of complete oscillations f (t) makes in a time interval of length 2π; that is, the number of radians per unit time;

• ν = ω/2π is the frequency of f (t): the number of complete oscillations the graph makes in a time interval of length 1; that is, the number of cycles per unit time;

• P = 2π/ω = 1/ν is its period, the t-interval required for one complete oscillation.

One can also write (1) using the time lag τ = φ/ω

f (t) = A cos (ω(t − τ)) .

2. Discussion

Here are the instructions for building the graph of (1) from the graph of cos(t). First scale, or vertically stretch, cos(t) by a factor of A; then shift the result to the right by φ units (if φ < 0 the shift will actually be to the left); and finally scale it horizontally by a factor of 1/ω.

In the figure below the dotted curve is cos(t) and the solid curve is 2.5 cos(πt − π/2). The solid curve has

A = 2.5, ω = π, φ = π/2, τ = 1/2.

Vertically, the solid curve is 2.5 times the dotted one. Horizontally, the solid curve is 1/π times the dotted one. (The dotted curve takes 2π units of time to go through one cycle and the solid curve takes only 2 units of time.) The solid curve hits its first maximum at t = 1/2, i.e., at t = τ, the time lag.

[Figure: the dotted curve cos(t), with one period = 2π marked, and the solid curve 2.5 cos(πt − π/2), with amplitude = 2.5, time lag τ = 1/2 and one period = 2 marked, for −3 ≤ t ≤ 7.]

Fig. 1. Features of the graph of a sinusoid.
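The quantities defined in this note are simple functions of A, ω and φ, so they are easy to tabulate. The sketch below does this for the solid curve in the figure; the function names `sinusoid_features` and `f` are hypothetical.

```python
import math

def sinusoid_features(A, omega, phi):
    """Quantities defined above for f(t) = A cos(omega*t - phi)."""
    return {
        "amplitude": A,
        "time_lag": phi / omega,              # tau = phi / omega
        "frequency": omega / (2 * math.pi),   # nu = omega / (2 pi)
        "period": 2 * math.pi / omega,        # P = 2 pi / omega = 1 / nu
    }

def f(t, A=2.5, omega=math.pi, phi=math.pi / 2):
    """The solid curve 2.5 cos(pi*t - pi/2)."""
    return A * math.cos(omega * t - phi)

features = sinusoid_features(2.5, math.pi, math.pi / 2)
print(features)
print(f(features["time_lag"]))   # the value at t = tau is the maximum, A
```

Evaluating f at the time lag returns the amplitude, consistent with the first maximum of the solid curve occurring at t = τ.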


Mystery Sinusoid

Quiz: Mystery Sinusoid

[Figure: graph of a sinusoid for −2 ≤ t ≤ 8, with the vertical axis running from −2.5 to 2.5.]

Fig. 1. Mystery sinusoid.

The graph of a sinusoidal function is displayed. The problem is to express it in the standard form

f (t) = A cos(ωt − φ).

Choices:

a) 2 cos(4πt + π/4)   b) 2 cos(πt/4 + π/4)   c) 2 cos(4πt − π/4)

d) 2 cos(πt/4 − π/4)   e) 2 cos(4t + 1)   f) 2 cos(4t − 1)

Answer: The answer is (b).

The graph runs vertically between 2 and -2, so the amplitude is A = 2.

There are consecutive peaks at -1 and 7, so the period P = 8. Therefore, the angular frequency ω = 2π/P = π/4.

The curve has a time lag of τ = −1 (see the peak at -1). Since τ = φ/ω, we have φ = −ω = −π/4.

Hence the equation of the sinusoid is:

A cos(ωt − φ) = 2 cos(πt/4 + π/4).
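The reasoning in this answer can be sketched in a few lines of code. The helper `sinusoid_from_graph` is a hypothetical name; it assumes we can read off the maximum and minimum values and the positions of two consecutive peaks, and it takes the first peak as the time lag.

```python
import math

def sinusoid_from_graph(y_max, y_min, peak1, peak2):
    """Read A, omega, phi off a graph with consecutive peaks at peak1 < peak2."""
    A = (y_max - y_min) / 2      # amplitude
    P = peak2 - peak1            # period = distance between consecutive peaks
    omega = 2 * math.pi / P      # angular frequency ω = 2π/P
    tau = peak1                  # time lag: location of a peak
    phi = omega * tau            # phase lag, from τ = φ/ω
    return A, omega, phi

# Values read from the mystery sinusoid: range [-2, 2], peaks at t = -1 and t = 7
A, omega, phi = sinusoid_from_graph(2, -2, -1, 7)
print(A, omega / math.pi, phi / math.pi)  # compare with A = 2, ω = π/4, φ = −π/4
```

Note that φ is only determined up to multiples of 2π; choosing a different peak as the time lag shifts φ by a multiple of 2π and gives the same function.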


Applet Exploration: Trigonometric Identity

Start by opening the Trigonometric Identity applet.

This mathlet illustrates sinusoidal functions and the trigonometric identity

a cos(ωt) + b sin(ωt) = A cos(ωt − φ), where a + ib = A e^{iφ}.

That is, (A, φ) are the polar coordinates of (a, b).

The sinusoidal function A cos(ωt − φ) is drawn here in red. A and φ are the amplitude and phase lag of the sinusoid. They are both controlled by sliders.

1. The phase lag φ measures how many radians the sinusoid falls behind the standard sinusoid, which we take to be the cosine. So when φ = π/2 you have the sine function. Verify this in the applet.

2. The final parameter is ω, the angular frequency. High frequency means the waves come faster. Frequency zero means constant. Play with the ω slider and understand this statement. Return the angular frequency to 2.

3. The trigonometric identity shows the remarkable fact that the sum of any two sinusoidal functions of the same frequency is again a sinusoid of the same frequency.

Use the a and b sliders to select coefficients for cos(ωt) and sin(ωt). The a slider modifies the yellow cosine curve in the window at bottom and the b slider modifies the blue sine curve. Notice that the sum of a cos(ωt) and b sin(ωt) is displayed in the top window in green (which is a combination of blue and yellow). There it is! The linear combination is again sinusoidal, or at least appears to be.

4. The window at the right shows the two complex numbers a + ib and A e^{iφ}. The sinusoidal identity says that the green and red sinusoids will coincide exactly when the complex numbers a + ib and A e^{iφ} coincide. Verify this on the applet by picking values of A and φ, and then adjusting a and b until the green and red sinusoids are the same.

The Sinusoidal Identity

The sum of two sinusoidal functions of the same frequency is another sinusoidal function with that frequency! For any real constants a and b,

a cos(ωt) + b sin(ωt) = A cos(ωt − φ) (1)

where A and φ can be described in at least two ways:

A = √(a² + b²), φ = tan⁻¹(b/a); (2)

a + bi = A e^{iφ}. (3)

Conversely, we have

a = A cos(φ) and b = A sin(φ). (4)

Geometrically this is summarized by the triangle in the figure below.

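[Figure: right triangle with horizontal leg a, vertical leg b, hypotenuse A and angle φ at the origin.]

Fig. 1. a + bi = A e^{iφ}.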

One proof of (1) is a simple application of the cosine addition formula

cos(α − β) = cos(α) cos(β) + sin(α) sin(β).

We will now give an equivalent proof using Euler’s formula and complex arithmetic: The triangle in Figure 1 is the standard polar coordinates triangle. It shows a + ib = A e^{iφ} or a − ib = A e^{−iφ}. Thus

A cos(ωt − φ) = Re(A e^{i(ωt−φ)})

= Re(e^{iωt} · A e^{−iφ})

= Re((cos(ωt) + i sin(ωt)) · (a − ib))

= Re(a cos(ωt) + b sin(ωt) + i(a sin(ωt) − b cos(ωt)))

= a cos(ωt) + b sin(ωt).

We should stress the importance of the trigonometric identity (1). It shows that any linear combination of cos(ωt) and sin(ωt) is not only periodic of period 2π/ω, but is also sinusoidal. If you try to add cos(ωt) to sin(ωt) “by hand”, you will probably agree that this is not at all obvious.

We will call A cos(ωt − φ) amplitude-phase form and a cos(ωt) + b sin(ωt) rectangular or Cartesian form. You should be familiar with amplitude-phase form; we usually prefer it because both amplitude and phase have geometric and physical meaning for us.
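To see the identity at work numerically, here is a minimal sketch that converts rectangular form to amplitude-phase form using the polar coordinates of a + ib. The values a = 3, b = 4, ω = 2 are arbitrary illustrative choices, and the function name `amplitude_phase` is hypothetical.

```python
import cmath
import math

def amplitude_phase(a, b):
    """Return (A, phi) with a + ib = A e^{i phi}; phi lies in (-π, π]."""
    return cmath.polar(complex(a, b))

a, b, omega = 3.0, 4.0, 2.0
A, phi = amplitude_phase(a, b)

def rectangular(t):
    return a * math.cos(omega * t) + b * math.sin(omega * t)

def amp_phase(t):
    return A * math.cos(omega * t - phi)

# Largest difference between the two forms on a grid of sample times
max_gap = max(abs(rectangular(0.1 * n) - amp_phase(0.1 * n)) for n in range(100))
print(A, phi, max_gap)
```

Using the polar form of a + ib, rather than tan⁻¹(b/a) alone, picks the angle in the correct quadrant even when a is negative or zero.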


Amplitude-Phase Form of a Sinusoid

Quiz: Amplitude-Phase Form of a Sinusoid. Put cos(ωt) + √3 sin(ωt) into amplitude-phase form A cos(ωt − φ).

Choices:

a) 2 cos(ωt − π/4)

b) √3 cos(ω(t − π/3))

c) 2 cos(ω(t − π/3))

d) 2 cos(ωt − π/3)

e) √3 cos(ωt − π/3)

f) √3 cos(ωt − π/4)

Answer: The answer is (d) because A = √(1² + (√3)²) = 2, and φ = tan⁻¹(√3/1) = π/3.

First Order Constant Coefficient Linear ODE’s

In this session we start our transition to studying constant coefficient linear ODE’s by looking at constant coefficient first order equations. Here we will continue to solve them using integrating factors. Starting in the next session we will learn some simpler methods for solving constant coefficient equations.

We will see that for many physical systems we can split the output (i.e. the solution to the DE) into pieces called the steady-state and the transient. This will correspond exactly to the way we have already written solutions as a particular solution plus the general homogeneous solution:

x(t) = x_p(t) + x_h(t).

Here the steady-state solution is x_p and the transient is x_h.

We have already encountered examples of constant coefficient equations when modeling exponential growth (a bank account), heat diffusion (root beer cooling) and an RC circuit. To finish the session we will add examples of radioactive decay and mixing tanks to this list.


Solution to the Constant Coefficient First Order Equation

In this section we will consider the constant coefficient equation

ẏ + ky = q(t). (1)

(Constant coefficient means k is a constant.)

Solving is easy using the integrating factor u(t) = e^{∫k dt} = e^{kt} from session 5. We get the solution

y = e^{−kt} (∫ e^{kt} q(t) dt + c) (2)

= e^{−kt} ∫ e^{kt} q(t) dt + c e^{−kt}. (3)

As usual, we have a particular solution and a homogeneous solution, respectively

y_p(t) = e^{−kt} ∫ e^{kt} q(t) dt and y_h(t) = e^{−kt}.

The general solution to (1) is then

y(t) = y_p(t) + c y_h(t).

The Case k > 0. If k > 0 the system in (1) models exponential decay. That is, when the input is 0 the system response is y(t) = c e^{−kt}, which decays exponentially to 0 as t goes to ∞.

In the general solution we call c e^{−kt} the transient because it goes to 0.

The term e^{−kt} ∫ e^{kt} q(t) dt is called the steady-state or long-term solution.

That is, c y_h is the transient and y_p is the steady-state solution.

The value of c in (2) is determined by the initial value y(0). The initial condition only affects the transient and not the long-term behavior of the solution. No matter what the initial condition, every solution goes asymptotically to the steady-state. That is, all solution curves approach the steady-state as t → ∞.
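A small sketch may make the split into transient and steady-state concrete. It assumes, purely for illustration, a constant input q(t) = q0, for which formula (3) gives y(t) = q0/k + c e^{−kt}; the values k = 0.5 and q0 = 3 are hypothetical.

```python
import math

def solution(t, y0, k=0.5, q0=3.0):
    """Solution of y' + k*y = q0 with y(0) = y0 (constant input is an assumption)."""
    steady = q0 / k                                # particular solution y_p
    transient = (y0 - steady) * math.exp(-k * t)   # c*y_h, with c fixed by y(0)
    return steady + transient

# Three different initial conditions, sampled at increasing times
for t in (0, 5, 10, 20):
    print(t, [round(solution(t, y0), 4) for y0 in (-4.0, 0.0, 10.0)])
```

Each row of the printout should show the three solutions drawing closer to the common value q0/k, since the gap between any two solutions is a multiple of e^{−kt}.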

[Figure: solution curves of y' = −y − 1/(x + 1) + 1 in the xy-plane, with the steady-state solution and other solutions labeled.]

Fig. 1. In the case k > 0 all solutions go asymptotically to the steady-state.

Since all the solutions approach each other, there is no precise way to choose the one we call the steady-state. In fact, we can choose any one to be the steady-state solution. Generally, we just choose the simplest looking solution.

The case k ≤ 0. When k ≤ 0 the homogeneous solution e^{−kt} does not go asymptotically to 0. In other words it is not transient. In this case it does not make sense to talk about the steady-state solution.


Examples of Constant Coefficient First order Equations

We have seen a number of physical systems that are modeled by first order constant coefficient linear ODE’s. Here we will recall some of them and introduce examples of radioactive decay and of mixing tanks.

Example 1. Radioactive Decay. Suppose we have some radioactive matter with decay constant k_1. This means that the rate at which the radioactive material decays is proportional to the amount present at any given time. Let A(t) be the amount of matter at time t. The rate equation modeling this system is

dA/dt = −k_1 A(t) ⇔ dA/dt + k_1 A = 0.

The solution to this is easily seen to be A(t) = A_0 e^{−k_1 t}.

Now, suppose that when A decays it becomes a different radioactive substance B, which has its own decay constant k_2. The rate equation for the amount B(t) of B is

dB/dt = k_1 A(t) − k_2 B(t),

i.e. dB/dt = (rate B is created from A) − (rate B decays). Using the solution A(t) = A_0 e^{−k_1 t} this becomes

dB/dt + k_2 B(t) = k_1 A_0 e^{−k_1 t}.
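The text stops at the differential equation for B. As a sketch only, the code below uses the closed form that the integrating factor method gives under two extra assumptions not made in the text: B(0) = 0 and k_1 ≠ k_2. It then checks numerically that this formula satisfies the equation. All numerical values are hypothetical.

```python
import math

def amount_B(t, A0=100.0, k1=0.3, k2=0.1):
    """B(t) for B' + k2*B = k1*A0*exp(-k1*t), assuming B(0) = 0 and k1 != k2."""
    return k1 * A0 / (k2 - k1) * (math.exp(-k1 * t) - math.exp(-k2 * t))

def residual(t, A0=100.0, k1=0.3, k2=0.1, h=1e-5):
    """Left side minus right side of the DE, using a centered difference for B'."""
    dB = (amount_B(t + h, A0, k1, k2) - amount_B(t - h, A0, k1, k2)) / (2 * h)
    return dB + k2 * amount_B(t, A0, k1, k2) - k1 * A0 * math.exp(-k1 * t)

for t in (0.5, 2.0, 10.0):
    print(t, amount_B(t), residual(t))   # residuals should be close to 0
```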

Example 2. Temperature-Concentration or Conduction-Diffusion. Newton’s law of cooling gives the equation

dT/dt + kT = k T_e(t).

Here, T(t) is the temperature of a body and T_e(t) is the temperature of the surrounding environment. (See the heat-diffusion example in session 5.)

Example 3. Mixing Tanks. A tank containing a volume V of salt solution has solution entering the tank at rate r (in, say, liters/minute). The incoming solution has a concentration C_e(t) of salt. Solution leaves the tank at the same rate r. (This means the volume of solution in the tank is constant; we say the inflow and outflow rates are balanced.)

[Figure: a tank of volume V with inflow at rate r and outflow at rate r.]

Fig. 1. A mixing tank with balanced inflow and outflow.

We will assume the concentration of salt stays uniform throughout the tank. If the solution is continuously stirred this is a reasonable assumption.

Question. Let x(t) be the amount of salt in the tank. How do we write a differential equation modeling x(t)?

Answer. The rate of change of salt in the tank is

the rate salt flows in - the rate it flows out.

rate in = inflow rate × inflow concentration = r C_e(t).

rate out = outflow rate × outflow concentration = r x/V, since the outflow concentration is the concentration in the tank, which is x/V. Therefore,

dx/dt = r C_e(t) − (r/V) x ⇔ ẋ + (r/V) x = r C_e(t). (1)

Notes: 1. In building the model it is best to let the dependent variable be the amount of salt and not the concentration. One good reason for this is that amounts add, but concentrations do not. For example, if I combine a solution with 2 grams of salt and one with 3 grams, I will have a solution with 5 grams of salt. But, if I combine a 2 g/liter solution with a 3 g/liter solution, the new solution will have concentration somewhere between 2 and 3 g/liter, depending on how much of each solution is combined.

2. If we choose we can write the DE in terms of the concentration C(t) = x(t)/V. Simply divide the equation by V:

ẋ/V + (r/V)(x/V) = (r/V) C_e(t) ⇒ Ċ + (r/V) C(t) = (r/V) C_e(t).

If we let k = r/V this DE becomes Ċ + kC = kC_e, which looks like the conduction-diffusion equation in example 2.

3. If the tanks are not balanced we can use the same logic to build the ODE modeling the mixing tanks. It becomes the linear DE

ẋ + (r_2/V) x = r_1 C_e,

where r_1, r_2 are the inflow and outflow rates respectively. We can also write this in terms of concentration

Ċ + (r_2/V) C = (r_1/V) C_e.
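For the balanced tank of equation (1), a short simulation shows the model in action. This is a sketch under the added assumption that the inflow concentration C_e is constant, in which case (1) has the closed form x(t) = V C_e + (x_0 − V C_e) e^{−rt/V}. The tank size, flow rate and concentration below are hypothetical numbers.

```python
import math

def salt_exact(t, x0, r, V, Ce):
    """Closed-form x(t) for x' + (r/V) x = r*Ce with constant Ce (an assumption)."""
    return V * Ce + (x0 - V * Ce) * math.exp(-r * t / V)

def salt_euler(t_end, x0, r, V, Ce, steps=100000):
    """Euler's method applied to dx/dt = rate in - rate out."""
    dt = t_end / steps
    x = x0
    for _ in range(steps):
        x += dt * (r * Ce - r * x / V)
    return x

# Hypothetical numbers: 100 L tank, 5 L/min flow, 2 g/L inflow, no salt initially
x0, r, V, Ce = 0.0, 5.0, 100.0, 2.0
print(salt_exact(30.0, x0, r, V, Ce), salt_euler(30.0, x0, r, V, Ce))
```

The two printed values should agree closely, and for large t the amount of salt should approach V C_e, i.e. the tank concentration approaches the inflow concentration.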

Example 4. RC Circuits. In session 4 we discussed a circuit with a resistor, capacitor and input voltage. It satisfies the ODE

R dI/dt + (1/C) I = E′. (2)

Here, R is the resistance, C is the capacitance, E is the voltage source and I is the current through the resistor.

[Figure: circuit diagram with voltage source E, resistor R, current I and a capacitor.]

Fig. 2. RC circuit with input voltage E.

Let q(t) be the charge on the capacitor; then I = dq/dt and equation (2) can be written in terms of q as

R dq/dt + (1/C) q = E(t). (3)

We’ll take a moment to remind you that the right-hand side of the DE is not always exactly the same as the input. In both (2) and (3) we consider E to be the input to the system, but in (2) the right-hand side of the equation is E′.


Response to Discontinuous Input

We will continue looking at the constant coefficient first order linear DE

ẏ + ky = q(t).

It has the integrating factors solution

y = e^{−kt} (∫ e^{kt} q(t) dt + c). (1)

In this note we want to do an example where the input q(t) is discontinuous.

The most basic discontinuous function is the unit-step function at a point a, defined by:

u_a(t) = 0 for t < a, and u_a(t) = 1 for t > a. (2)

(We leave its value at a undefined, though some books give it the value 0 there, others the value 1 there.)

Example 1. We’ll look again at Newton’s law of cooling and my root beer cooler:

ẏ + ky = k f(t),

where y(t) is the temperature inside the cooler and f(t) is the temperature of the air. It’s a nice, cool morning with constant temperature. Suddenly the sun comes out and the air warms up to a higher constant temperature. What’s the response of my cooler to this signal?

We’ll assume the sun comes out at time t = a, my cooler starts at t = 0 with temperature 0 and (somewhat idealized) the air temperature jumps instantly from 0 to 20 at time t = a. So f(t) = 20 u_a(t) and our IVP is

ẏ + ky = 20k u_a(t), y(0) = 0.

Solution. For t < a the input is 0. Since y(0) = 0, the response is y(t) = 0.

For t ≥ a the DE becomes ẏ + ky = 20k with y(a) = 0. The solution (which we have found before) is y(t) = 20 + c e^{−kt}. Now we use the initial condition y(a) = 0 to find the value of c. We get c = −20 e^{ka}, so y(t) = 20 − 20 e^{ka} e^{−kt} for t ≥ a.


We can now assemble the results for t < a and t ≥ a into one expression; for the latter, we also put the exponent into a more suggestive form.

input = 20 u_a(t) ⟶ response = y(t), where

y(t) = 0 for 0 < t < a,
y(t) = 20 − 20 e^{−k(t−a)} for t ≥ a. (3)

Note that the response is just the translation a units to the right of the response to the unit-step input at 0.
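Formula (3) translates directly into a small function. The sketch below checks two things claimed in the text: the response is continuous at t = a, and it is the translate by a of the response for a step at 0. The values k = 0.5 and a = 2 are hypothetical.

```python
import math

def step_response(t, a, k, height=20.0):
    """Response of y' + k*y = k*height*u_a(t), y(0) = 0, as in formula (3)."""
    if t < a:
        return 0.0
    return height - height * math.exp(-k * (t - a))

k, a = 0.5, 2.0
print(step_response(1.0, a, k))     # before the jump: 0
print(step_response(a, a, k))       # at the jump: still 0, so y is continuous
print(step_response(5.0, a, k), step_response(5.0 - a, 0.0, k))   # translation
print(step_response(100.0, a, k))   # long-term value approaches 20
```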

Our next example continues the temperature model with a different discontinuous input. In this case, the physical input is an external bath which is initially ice-water at 0 degrees, then replaced by water held at a fixed temperature for a time interval, then replaced once more by ice-water. To model the input we need the unit box function on [a, b]:

u_ab(t) = 1 for a ≤ t ≤ b, and u_ab(t) = 0 otherwise, where 0 ≤ a < b. (4)

Example 2. Find the response of the system ẏ + ky = kq, with IC y(0) = 0, to input q(t) = 20 u_ab(t).

Solution. There are at least three ways to do this:

a) Express u_ab as a sum of unit step functions and use (3) together with superposition of inputs;

b) Use the function u_ab directly in a definite integral expression for the response;

c) Find the response in two steps: first use (3) to get the response y(t) for the input 20 u_a(t); this will be valid up till the point t = b.

Then, to continue the response for values t > b, evaluate y(b) and find the response for t > b to the input 0, with initial condition y(b).

We will follow (c), leaving the first two as exercises.

By (3), the response to the input 20 u_a(t) is:

y(t) = 0 for 0 ≤ t < a,
y(t) = 20 − 20 e^{−k(t−a)} for t ≥ a.


This is valid up to t = b, since u_ab(t) = u_a(t) for t ≤ b. Evaluating at b,

y(b) = 20 − 20 e^{−k(b−a)}. (5)

For t > b we have u_ab = 0, so the DE is just ẏ + ky = 0. This models exponential decay (our most important DE) and we know the solution:

y(t) = c e^{−kt}. (6)

We determine c from the initial value (5). Equating the initial values y(b) from (5) and (6), we get:

c e^{−kb} = 20 − 20 e^{−kb+ka}

from which: c = 20 e^{kb} − 20 e^{ka}.

By (6): y(t) = 20(e^{kb} − e^{ka}) e^{−kt}, t ≥ b. (7)

After combining exponents in (7) to give an alternative form for the response we assemble the parts, getting:

y(t) = 0 for 0 ≤ t ≤ a,
y(t) = 20 − 20 e^{−k(t−a)} for a < t < b, (8)
y(t) = 20 e^{−k(t−b)} − 20 e^{−k(t−a)} for t ≥ b.
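As a numerical cross-check of (8), the sketch below compares it with method (a): writing u_ab = u_a − u_b (away from the jump points) and subtracting the two step responses given by (3). The values a = 1, b = 3, k = 0.7 are hypothetical, and the function names are illustrative.

```python
import math

def step_response(t, a, k):
    """Formula (3): response to 20*u_a(t) with y(0) = 0."""
    return 0.0 if t < a else 20 - 20 * math.exp(-k * (t - a))

def box_response(t, a, b, k):
    """Formula (8): response to 20*u_ab(t) with y(0) = 0."""
    if t <= a:
        return 0.0
    if t < b:
        return 20 - 20 * math.exp(-k * (t - a))
    return 20 * math.exp(-k * (t - b)) - 20 * math.exp(-k * (t - a))

def by_superposition(t, a, b, k):
    """Method (a): response to 20*u_a minus response to 20*u_b."""
    return step_response(t, a, k) - step_response(t, b, k)

a, b, k = 1.0, 3.0, 0.7
gap = max(abs(box_response(0.05 * n, a, b, k) - by_superposition(0.05 * n, a, b, k))
          for n in range(200))
print(gap)   # should be at the level of rounding error
```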


Exponential Input; Gain and Phase Lag: Introduction

The case of sinusoidal input is of great importance in applications. A sinusoidal function is a pure oscillation like cos(ωt) or sin(ωt), or more generally, A cos(ωt − φ). (As you can see, the last form includes both of the previous two by letting A = 1 and φ = 0 or π/2).

In the temperature model, sinusoidal input could represent the diurnal (day and night) varying of outside temperature. In the concentration model it could represent the diurnal varying of the level of some hormone in the bloodstream, or the varying concentration in a sewage line of some waste product produced periodically by a manufacturing process.

In this session we are going to restrict our attention to first order constant coefficient ODE’s. Before looking at sinusoidal input we will look at exponential input. For constant coefficient equations we will be able to use the method of optimism to find a solution without having to compute integrals. For sinusoidal input, we will use Euler’s formula to convert sinusoids to complex exponentials, and so our solutions for exponential input will apply to sinusoids as well.

First Order Response to Exponential Input

We start with an example of a linear constant coefficient ODE with exponential input signal.

Example 1. Solve ẋ + 2x = 4e^{3t}.

Solution. We could use our integrating factor, but instead let’s use the method of optimism, i.e., the inspired guess. The inspiration here is based on the fact that differentiation reproduces exponentials:

d/dt e^{rt} = r e^{rt}.

Since the right hand side is an exponential, maybe the output signal x(t) will be also. Let’s try

x_p(t) = A e^{3t}.

This is not going to be the general solution, so we use the subscript p to indicate it is just one particular solution. We don’t know what A is yet, but we will be led to its value by substitution. Substituting x_p into the DE we get

Left hand side: ẋ_p + 2x_p = 3Ae^{3t} + 2Ae^{3t} = 5Ae^{3t}.
Right hand side: 4e^{3t}.

Equating the two sides we get

5Ae^{3t} = 4e^{3t} ⇒ 5A = 4 ⇒ A = 4/5.

So, we were led to the value of A and we have that one solution to the DE is

x_p(t) = (4/5) e^{3t}.

The associated homogeneous equation ẋ + 2x = 0 has general solution x_h(t) = Ce^{−2t}. By the superposition principle, we add x_p and x_h to get the general solution to our DE:

x(t) = x_p(t) + x_h(t) = (4/5) e^{3t} + Ce^{−2t}.
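The substitution step can be carried out with letters instead of numbers. As a sketch (this generalization is our assumption, not a formula stated in the text), for ẋ + kx = B e^{at} the guess x_p = A e^{at} gives (a + k)A = B. The code applies this to Example 1 and checks the result against the DE with a numerical derivative.

```python
import math

def exponential_response(k, B, a):
    """Coefficient A of the particular solution A*e^{a t} of x' + k x = B e^{a t}.

    Substituting x_p = A e^{a t} gives (a + k) A = B; this needs a + k != 0.
    """
    if a + k == 0:
        raise ValueError("a = -k: the guess A*e^{a t} does not work")
    return B / (a + k)

A = exponential_response(k=2, B=4, a=3)   # Example 1

def x_p(t):
    return A * math.exp(3 * t)

def residual(t, h=1e-6):
    """x_p' + 2 x_p - 4 e^{3t}, with a centered difference for the derivative."""
    dx = (x_p(t + h) - x_p(t - h)) / (2 * h)
    return dx + 2 * x_p(t) - 4 * math.exp(3 * t)

print(A, residual(0.3))   # A should equal 4/5; the residual should be near 0
```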

First Order Response to Sinusoidal Input

1. Introduction

We are going to solve a first order constant coefficient DE with sinusoidal input:

ẋ + kx = B cos(ωt). (1)

Our strategy will be to use Euler’s formula to replace cos(ωt) by the complex exponential e^{iωt}. We call this technique complex replacement. We illustrate it with an example and then rework the example in the general case.

2. Illustrative Example

Solve the ODE

ẋ + 2x = 2 cos(2t). (2)

Solution. We will go through this example very carefully. After sufficient practice many of the steps can be done in your head.

The key is to introduce a new variable y with its own related ODE

ẏ + 2y = 2 sin(2t). (3)

Now we combine x and y to make a complex variable z = x + iy. Combining equations (2) and (3) in the same manner we get

ż + 2z = 2 cos(2t) + 2i sin(2t) = 2e^{2it}. (4)

We note that x = Re(z), so once we’ve found z(t) we can easily find x(t).

Equation (4) has exponential input and we know how to solve it: try a solution of the form z_p(t) = Ae^{2it}. Substituting this into the equation gives

Left hand side: ż_p + 2z_p = 2iAe^{2it} + 2Ae^{2it} = (2 + 2i)Ae^{2it}.
Right hand side: 2e^{2it}.

Equating the two sides we get

(2 + 2i)Ae^{2it} = 2e^{2it} ⇒ A = 1/(1 + i).

Thus, z_p(t) = e^{2it}/(1 + i).

The problem asks for x, which is the real part of z. We can find x using polar or Cartesian coordinates. We will do it both ways. For most students using polar coordinates is less familiar. You should therefore learn it well because polar coordinates are easier to interpret and are generally preferred. The sinusoidal identity can be used to convert one form to the other.

Polar Coordinates. In polar coordinates, 1 + i = √2 e^{iπ/4}. Using this in the formula for z_p:

z_p(t) = e^{2it}/(1 + i) = e^{2it}/(√2 e^{iπ/4}) = e^{i(2t−π/4)}/√2 = (1/√2)(cos(2t − π/4) + i sin(2t − π/4)).

Taking the real part we get

x_p(t) = Re(z_p(t)) = (1/√2) cos(2t − π/4).

Finally, as always, we add the homogeneous solution to this to get the general solution (here k = 2):

x(t) = x_p(t) + Ce^{−2t} = (1/√2) cos(2t − π/4) + Ce^{−2t}.

Cartesian Coordinates. We use the complex conjugate to handle the denominator:

z_p(t) = e^{2it}/(1 + i) = (cos(2t) + i sin(2t))/(1 + i) · (1 − i)/(1 − i) = (cos(2t) + sin(2t) + i(sin(2t) − cos(2t)))/2.

x_p(t) = (cos(2t) + sin(2t))/2.

Exercise. Use the sinusoidal identity to show that the two solutions given in the previous example are, in fact, identical.
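Before doing the exercise by hand, it may help to confirm numerically (this is a check, not a proof) that the polar and Cartesian answers agree. The sketch below uses Python's built-in complex numbers; the function names are illustrative.

```python
import cmath
import math

A = 2 / (2 + 2j)                # complex amplitude from (2 + 2i) A = 2
gain, angle = cmath.polar(A)    # A = gain * e^{i*angle}

def x_polar(t):
    # angle comes out negative, so this is gain * cos(2t − |angle|)
    return gain * math.cos(2 * t + angle)

def x_cartesian(t):
    return (math.cos(2 * t) + math.sin(2 * t)) / 2

def x_complex(t):
    return (A * cmath.exp(2j * t)).real   # Re(z_p(t)) computed directly

print(gain, angle)   # compare with 1/√2 and −π/4
for t in (0.0, 0.7, 2.5):
    print(t, x_polar(t), x_cartesian(t), x_complex(t))
```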

3. General Case

Solve the ODE

ẋ + kx = B cos(ωt). (5)

(We assume k, ω and B are all positive.)

Solution. This is really just a matter of replacing the numbers in our illustrative example by the letters k, B and ω. We will not write down as much as before. If something is unclear you can go to the corresponding part of the example above to understand it.


Do the complex replacement:

ż + kz = Be^{iωt}, where cos(ωt) = Re(e^{iωt}) and x = Re(z). (6)

Equation (6) has exponential input, so we try a solution of the form z_p(t) = Ae^{iωt}. Substituting this into the equation gives

Left hand side: ż_p + kz_p = iωAe^{iωt} + kAe^{iωt} = (k + iω)Ae^{iωt}.
Right hand side: Be^{iωt}.

Equating the two sides we get

(k + iω)Ae^{iωt} = Be^{iωt} ⇒ A = B/(k + iω).

Thus, z_p(t) = Be^{iωt}/(k + iω). In polar coordinates

k + iω = √(k² + ω²) e^{iφ}, where φ = tan⁻¹(ω/k) in the first quadrant.

(Because tan⁻¹ is ambiguous, e.g. tan(π/4) = tan(5π/4) = 1, we fix the value of tan⁻¹ by saying which quadrant the complex number is in. In this case, since k, ω > 0, k + iω is in the first quadrant. Another way to do this would be to write φ = Arg(k + iω).) Thus,

z_p(t) = Be^{iωt}/(k + iω) = Be^{iωt}/(√(k² + ω²) e^{iφ}) = Be^{i(ωt−φ)}/√(k² + ω²).

x_p(t) = (B/√(k² + ω²)) cos(ωt − φ).

Finally, as always, we add the homogeneous solution to this to get the general solution:

x(t) = x_p(t) + Ce^{−kt} = (B/√(k² + ω²)) cos(ωt − φ) + Ce^{−kt}.
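The general formula is easy to test numerically. The following sketch computes the amplitude and phase lag of x_p from the polar form of k + iω and then plugs x_p back into equation (5) using a finite-difference derivative. The values k = 1.5, B = 2, ω = 3 are hypothetical, as is the function name.

```python
import cmath
import math

def sinusoidal_response(k, B, omega):
    """Amplitude and phase lag of x_p for x' + k x = B cos(omega t), k, omega > 0."""
    r, phi = cmath.polar(complex(k, omega))   # k + iω = r e^{iφ}, r = √(k² + ω²)
    return B / r, phi

k, B, omega = 1.5, 2.0, 3.0
amp, phi = sinusoidal_response(k, B, omega)

def x_p(t):
    return amp * math.cos(omega * t - phi)

def residual(t, h=1e-6):
    """x_p' + k x_p − B cos(ωt), with a centered difference for the derivative."""
    dx = (x_p(t + h) - x_p(t - h)) / (2 * h)
    return dx + k * x_p(t) - B * math.cos(omega * t)

print(amp, phi)
print([residual(t) for t in (0.0, 0.4, 1.3)])   # each should be close to 0
```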


Amplitude, Phase, Gain and Bode Plots

We found that the ODE

ẋ + kx = kB cos(ωt) (1)

has a particular solution

x(t) = (kB/√(k² + ω²)) cos(ωt − φ) = gB cos(ωt − φ) (2)

where φ = tan⁻¹(ω/k). If we consider the input to be B cos(ωt) then the gain (= output amplitude/input amplitude) is g = k/√(k² + ω²).

There is a lot more to learn from the formula (2) and its various pieces. The terminology applied below to solutions of the first order equation (1) applies equally well to solutions of second and higher order equations. We will discuss this more when we study second order equations. See also the Mathlet Amplitude and Phase: First Order for a dynamic illustration.

Let’s gather all the terminology in one place.

1. B cos(ωt) is the input (or input signal).

2. B is the input amplitude and ω is the input circular frequency.

3. x(t) is the output or response.

4. g = k/√(k² + ω²) is called the gain or amplitude response. The input amplitude is scaled by the gain to give the output amplitude.

5. φ is called the phase lag.

Let’s fix the coupling constant k and think about how g and φ vary as we vary ω, the circular frequency of the signal. Thus we will regard them as functions of ω, and we may write g(ω) and φ(ω) in order to emphasize this perspective. We are supposing that we always have the same system and are watching its response to a variety of input signals. Graphs of g(ω) and −φ(ω) for values of the coupling constant k = .25, .5, .75, 1, 1.25, 1.5 are given below.

[Figure: gain g plotted against ω (circular frequency of signal) for 0 ≤ ω ≤ 3, with g between 0 and 1.]

Fig. 1. First order amplitude response curves

[Figure: phase shift −φ/π (in multiples of π) plotted against ω (circular frequency of signal) for 0 ≤ ω ≤ 3, with values between 0 and −0.5.]

Fig. 2. First order phase response curves

These graphs are essentially Bode plots. (Technically, the Bode plots display log g(ω) and −φ(ω) against log ω.)

In this course we will focus more on the amplitude response curve (graph of gain vs. ω) than the phase response curve. The phase response is important; we just won’t have time to explore it. For equation (1) the amplitude response is rather simple: for any value of k the gain starts at 1 when ω = 0 and decreases to 0 as ω goes to infinity.
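The curves in the two figures can be tabulated directly from the formulas for g and φ. This sketch prints a few values of the gain for the same values of k used in the figures; the sample frequencies are our own choice.

```python
import math

def gain(k, omega):
    """g(ω) = k/√(k² + ω²) for x' + k x = k B cos(ω t)."""
    return k / math.hypot(k, omega)

def phase_lag(k, omega):
    """φ(ω) = tan⁻¹(ω/k), between 0 and π/2 for k > 0 and ω ≥ 0."""
    return math.atan2(omega, k)

for k in (0.25, 0.5, 0.75, 1.0, 1.25, 1.5):
    row = [f"{gain(k, w):.3f}" for w in (0.0, 0.5, 1.0, 2.0, 3.0)]
    print(k, row)
```

Each row should start at 1 and decrease, and larger values of k should give larger gain at a fixed ω.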


First Order Autonomous DE’s: Introduction

In the last topic of this unit we will study autonomous first order differential equations. These are (in general) nonlinear equations of the form

ẋ = f(x).

(Compare this with the general first order ODE ẋ = f(x, t).) The word autonomous means self governing and indicates that the rate of change of x is governed by x itself and is not dependent on time.

Autonomous systems have the following properties:

1. They model conditions which are constant in time, though they may depend on the current value of x.

2. They are separable.

3. They can be hard to integrate.

4. We can say a lot about them without solving them.

5. They are time invariant: if y(t) is a solution then so is y(t − t_0) for any value of t_0.

Our goal will be to learn to get qualitative information about the solutions without actually finding them. For example, in population models we might want to know if the population stabilizes, crashes or explodes. For physical models, we may want to know if the system is self correcting or if it can, all by itself, fall apart.

To visualize the qualitative results we will use the phase line. This is a very simple one dimensional plot consisting of critical points and arrows.

Logistic Model: Qualitative Analysis

We will approach this topic through examples. As stated in the introduction, a first order autonomous equation is one of the form

ẏ = g(y).

1. Simple Examples

Example 1. Natural growth or decay with constant growth-rate k:

ẏ = ky.

Example 2. Bank account with interest rate not depending on time but possibly depending upon current balance and constant savings rate:

ẏ = I(y) y + q.

2. Logistic Population Model

Example 3. The logistic population model is a simple model that takes into account the limits the environment imposes on population growth. Suppose we have a model for a population y that has a variable growth rate k(y) which depends on the current population but not on time. That is,

ẏ = k(y) · y. (1)

Suppose that when y is small the growth rate is approximately k_0, but that there is a maximal sustainable population M. This means that as y gets near M the growth rate decreases to zero. And, if y > M, the growth rate becomes negative and the population declines back to the maximal sustainable population.

In the simplest version of this, the graph of k(y) is a straight line with k = k_0 when y = 0 and k = 0 when y = M.

[Figure: the line k = k(y) = k_0(1 − y/M) plotted with y on the horizontal axis and k on the vertical axis.]

Fig. 1. Line with vertical intercept k_0 and horizontal intercept M.

The equation of this line is

k(y) = k_0(1 − y/M).

(You can check that k(0) = k_0, k(M) = 0 and k(y) < 0 for y > M.)

In this case equation (1) is known as the Logistic Population Model:

ẏ = k_0(1 − (y/M)) y = f(y). (2)

This is more realistic than natural growth when you want to account for limits to growth. It is nonlinear but it is autonomous.

Autonomous equations are always separable and, in this case, we could compute the resulting integral using partial fractions. But we are aiming for a qualitative grasp of the solutions, which we develop in the next example.

Example 4. Give a qualitative picture of the solutions without solving equation (2).

Solution. We start by looking for constant solutions y(t) = y_0. Since a constant has derivative 0, plugging this into (2) gives

0 = f(y_0).

We see that y_0 = 0 and y_0 = M are the two points where f(y_0) = 0. Thus we have two constant solutions y(t) = 0 and y(t) = M. Because a system at equilibrium is unchanging, we will call these solutions equilibrium solutions. Since ẏ = f(y) = 0 when y = 0 and y = M we call 0 and M the critical points of the DE. To summarize, the following all say the same thing:

1. f(y_0) = 0.
2. y(t) = y_0 is an equilibrium solution.
3. y = y_0 is a critical point.

To tie this to previous work, note the equation is separable and our constant solutions are none other than the lost solutions of the separable equation.

To understand the non-constant solutions we will sketch and analyze the direction field for equation (2). Clearly, each isocline, f(y) = c, is a horizontal straight line. For a fixed slope c, the isocline will consist of a horizontal line y = y_0 where f(y_0) = c.

As usual, first we look at the nullcline f (y) = 0. We already know the zeros of f (y) are 0 and M. So the nullclines are the pair of lines y = 0 and y = M. These are exactly the constant solutions found above.

[Figure: the horizontal lines y(t) = M and y(t) = 0 in the ty-plane.]

Fig. 2. The nullclines are also solution curves.

To get a clear picture of the other isoclines we will draw a graph of f (y) as a function of y. It’s a parabola opening downward, meeting the horizontal axis at y = 0 and y = M.

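[Figure: graph of f(y) against y, with dots at y = 0 and y = M, a left arrow where f(y) < 0 (y < 0), a right arrow where f(y) > 0 (0 < y < M) and a left arrow where f(y) < 0 (y > M).]

Fig. 3. The graph of f(y) tells us where ẏ is positive and negative.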

The graph shows that
for y < 0, ẏ = f(y) is negative,
for 0 < y < M, ẏ = f(y) is positive,
for M < y, ẏ = f(y) is negative.

This is indicated on the graph by the arrows on the horizontal axis. The arrow points left (towards decreasing y) where ẏ is negative and right (towards increasing y) where ẏ is positive. To make things clear, we also label the intervals as having f(y) positive or negative.

Now we can sketch the direction field. First, we draw the nullclines and since these are horizontal lines, we don’t need to sketch the direction field elements (little line segments) along them. Then we choose a horizontal line above y = M and sketch the direction field elements along it. (We know their slopes are negative because for y > M we know ẏ < 0.) Similarly, we add an isocline between 0 and M and one below 0.

Finally we can sketch some solution curves:

[Figure: direction field and solution curves in the ty-plane, with the levels y = M and y = 0 marked.]

Fig. 4. Direction field and solution curves for the logistic equation (2).

1. Since the slope field is constant in the t direction any solution curve can be translated left or right and still be a solution.

2. Since the lines y = 0 and y = M are solutions the other curves can’t cross them.

3. The solutions that start just above the equilibrium solution y = 0 must increase. Since they can’t cross the solution y = M they must go asymptotically towards it. These bounded solutions are called logistic curves or S-curves. They represent the population drifting from just above the equilibrium y = 0 towards the one at y = M.

4. If the population exceeds M, it tends back towards it. This represents environmental pressure related to overpopulation. M is called the carrying capacity of the environment.



5. In a population model we would never see y < 0. Mathematically, the solution curves that start below y = 0 decrease without bound.

3. Stable and Unstable Equilibria

Notice that in the logistic model all the solution curves that start near the equilibrium y = M go asymptotically towards it. (See figure 4.) We call such an equilibrium a stable equilibrium. Similarly, we call the equilibrium y = 0 an unstable equilibrium because all the curves that start near it move away.
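These qualitative claims can be tested by integrating equation (2) numerically. The sketch below uses the classical Runge-Kutta method, which is our choice and not a method from the text, with hypothetical values k_0 = 1 and M = 10. Only nonnegative starting values are used, since solutions starting below 0 decrease without bound.

```python
def logistic_rhs(y, k0=1.0, M=10.0):
    """f(y) = k0 (1 − y/M) y from equation (2); k0 and M are hypothetical values."""
    return k0 * (1 - y / M) * y

def integrate(y0, t_end=20.0, steps=2000, f=logistic_rhs):
    """Classical fourth-order Runge-Kutta for the autonomous equation y' = f(y)."""
    h = t_end / steps
    y = y0
    for _ in range(steps):
        s1 = f(y)
        s2 = f(y + h * s1 / 2)
        s3 = f(y + h * s2 / 2)
        s4 = f(y + h * s3)
        y += h * (s1 + 2 * s2 + 2 * s3 + s4) / 6
    return y

# Value at t = 20 for several starting populations
for y0 in (0.0, 0.1, 5.0, 10.0, 15.0):
    print(y0, round(integrate(y0), 4))
```

A solution starting exactly at 0 stays there, while one starting at 0.1 moves away from 0 and, like the one starting at 15, ends up near M.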

4. Summary

The sketch of the solutions gives us our qualitative picture. We also defined a number of terms.

1. Autonomous equation: ẏ = f(y).

2. Equilibrium solutions: Constant solutions y(t) = y_0 where f(y_0) = 0.

3. Critical points: The values of the equilibrium solutions, i.e., values y_0 where f(y_0) = 0.

4. Stable equilibrium: An equilibrium solution where all nearby solution curves tend towards it.

5. Unstable equilibrium: An equilibrium solution where all nearby solution curves tend away from it.

6. Logistic population model, logistic curves: see above.

7. Carrying capacity: The stable equilibrium in the logistic model that all (positive) populations approach asymptotically.

In later examples we will learn how to systematically make a qualitative sketch of the solution curves.


Phase Lines

1. Definition

Since the direction field for an autonomous DE, $\dot{y} = f(y)$, is constant along horizontal lines, its essential content can be conveyed more efficiently using the following recipe:

1. Draw the y-axis as a vertical line and mark on it the equilibria, i.e. where f (y) = 0.

2. In each of the intervals delimited by the equilibria draw an upward pointing arrow if f (y) > 0 and a downward pointing arrow if f (y) < 0.

This simple diagram tells you roughly how the system behaves. It’s called the phase line. The phase line captures exactly the information we use to get the qualitative sketch of solution curves. We illustrate this with some examples.

2. Examples

Example 1. For the DE $\dot{y} = 3y$: find the critical points, draw the phase line, classify the critical points by stability, and use the phase line to give a qualitative sketch of some solution curves.

Solution. The steps to follow are:
1. Find the critical points.
2. Plot the graph of f(y) and determine where $\dot{y}$ is positive and negative.
3. Draw the phase line and find the stability of the critical points.
4. Sketch the solution curves.

1. We have f(y) = 3y. We easily see that the only critical point (root of f) is y = 0.

2. The plot of f(y) is a straight line. We see $\dot{y} > 0$ for y > 0 and $\dot{y} < 0$ for y < 0.

[Figure: graph of $\dot{y} = f(y) = 3y$ against y, with $\dot{y} < 0$ for y < 0 and $\dot{y} > 0$ for y > 0.]

3. We use the information from steps 1 and 2 to draw the phase line. We put a large dot at the critical point. Since $\dot{y} > 0$ in the interval y > 0 we add an arrow pointing upwards in that interval. Similarly the interval y < 0 gets a down arrow. Since the arrows point away from the critical point, the equilibrium is unstable. This is all shown in the figure below.

4. Once we have the phase line we can make a qualitative sketch of the solution curves. The equilibrium solution corresponds to the critical point. It is the horizontal line y(t) = 0. The solutions that start positive increase and those that start negative decrease. We present the solution curves next to the phase line so you can see that the phase line arrows represent the y-direction of the integral curve.

[Figure: 3. The phase line, with the unstable critical point at y = 0. 4. Qualitative sketch of solution curves in the (t, y)-plane.]

Example 2. Repeat example 1 for the logistic equation $\dot{y} = k_0(1 - y/M)y$.

Solution. We did all of this earlier except draw the phase line. 1. Critical points: y = 0, y = M.

[Figure: 2. Graph of f(y), with f(y) < 0 for y < 0, f(y) > 0 for 0 < y < M, and f(y) < 0 for y > M. 3. Phase line, with y = 0 unstable and y = M stable. 4. Sketch of solution curves in the (t, y)-plane.]

3. Solutions Can Be Shifted in Time

In an autonomous equation $\dot{y} = f(y)$, the direction field is constant in the horizontal direction. Said differently, the conditions represented by the ODE are constant in time. Consequently, any horizontal (time) translate of a solution is another solution. A "time translate" of a function y(t) is a function $y(t - t_0)$; the graph is shifted horizontally (to the right) by $t_0$ units.

Example 3. Solutions of $\dot{y} = k_0 y$ exhibit three different behaviors, illustrated by

$y(t) = e^{k_0 t}$, $y(t) = 0$ and $y(t) = -e^{k_0 t}$.

Any solution is a horizontal translate of one of these three:

$y(t) = e^{k_0(t - t_0)}$, $y(t) = 0$ (whose only time-translate is itself), or

$y(t) = -e^{k_0(t - t_0)}$.

See the answer to example 1 for the graphical version of this.



4. Semistable Equilibria

Some equilibria are half stable and half unstable. We call them semistable.

Example 4. Repeat examples 1 and 2 for the DE $\dot{y} = y^2$.

Solution. Once again the algebra is trivial. 1. The only critical point is y = 0. 2. The graph of f(y) shows $\dot{y}$ is always positive (except at 0). 3. The phase line shows the critical point at 0 is semistable.

[Figure: Graph of $f(y) = y^2$, with $\dot{y} > 0$ on both sides of 0. Phase line, with the semistable critical point at 0. Sketch of solution curves in the (t, y)-plane.]

5. Conclusion

The phase line shows the qualitative behavior of a system at a glance: the critical points are shown and you can tell the stability of each critical point by looking at the arrows around it; the arrows also tell you what happens to the integral curves in the long-run, as t goes to infinity.
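The phase line recipe can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `classify_equilibria`, the offset `eps` and the parameter values k0 = 1, M = 10 are hypothetical choices, and `eps` must be smaller than the gap between neighboring critical points.

```python
def classify_equilibria(f, equilibria, eps=1e-3):
    """Classify each equilibrium of y' = f(y) from the sign of f on either side."""
    result = {}
    for y0 in equilibria:
        below = f(y0 - eps)  # direction of the phase line arrow just below y0
        above = f(y0 + eps)  # direction of the phase line arrow just above y0
        if below > 0 and above < 0:
            result[y0] = "stable"      # both arrows point toward y0
        elif below < 0 and above > 0:
            result[y0] = "unstable"    # both arrows point away from y0
        else:
            result[y0] = "semistable"  # one arrow toward y0, one away
    return result


k0, M = 1.0, 10.0  # illustrative values, not taken from the text
logistic = lambda y: k0 * (1 - y / M) * y

print(classify_equilibria(lambda y: 3 * y, [0.0]))    # Example 1
print(classify_equilibria(logistic, [0.0, M]))        # Example 2
print(classify_equilibria(lambda y: y ** 2, [0.0]))   # Example 4
```

The classification uses only the signs of f, which is exactly the information the phase line records.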


Stability

Quiz: Stability. In the autonomous equation $\dot{y} = f(y)$, where f(y) has the graph shown, describe the rightmost critical point.

[Figure: graph of $\dot{y} = f(y)$ against y, with −1 and 1 marked on the y-axis.]

Choices: a) stable b) unstable c) semistable d) can’t tell, could be any of them

Answer: Unstable. This is evident from the phase line.

[Figure: phase line with critical points y = 1 (unstable), y = 0 (stable) and y = −1 (unstable).]

The phase line shows that the critical point at y = 1 is unstable.



Can Solutions have a Local Maximum?

Quiz: Can Solutions have a Local Maximum? Can solutions of autonomous equations have a strict local maximum? (For a function f(t) a strict local maximum at time t = a means f(a) is larger than any nearby values of f(t). Graphically, it is at the top of a hill.)

Choices:

a) Yes.

b) No.

Answer: No.

Suppose $y(t_0) = y_0$ and $\dot{y}(t_0) = 0$. Then $f(y_0) = 0$, so there is an equilibrium solution $y(t) = y_0$. By the existence and uniqueness theorem this is the only solution with $y(t_0) = y_0$. This shows that non-constant solutions never have derivative equal to 0, so they don’t have any local maxima or minima.

We had to be careful in phrasing the question because constant functions have local maxima, just not strict local maxima. That is, all values of the function are maximum values, but no value of the function is larger than nearby values.

Linear vs. Nonlinear

The general linear first order ODE is

$$r(t)\dot{x}(t) + p(t)x(t) = q(t).$$

The general first order ODE is $\dot{y}(t) = F(t, y(t))$.

Linn E. R. represents the linear side of the debate and Chao S. represents the nonlinear side.

1. The Debate

Linn: I’d like to begin by making the point that there is a solution procedure for linear equations, which reduces the solution of any linear equation to integration. Multiply the equation through by a factor u so that the two terms on the left become the two terms in $\frac{d(ux)}{dt}$, then integrate.

Sometimes you can just see this. For example,

$$t^2\dot{x} + 2tx = \frac{d}{dt}(t^2 x).$$

If we are in reduced standard form, i.e. when r = 1, then this can be done systematically with the following steps:

1. We seek u(t) such that

$$u(\dot{x} + px) = \frac{d(ux)}{dt},$$

i.e. $pu = \dot{u}$. This is separable, with solution

$$u = e^{\int p(t)\,dt}.$$

(Any constant of integration will do here.)

2. Then integrate both sides of

$$\frac{d(ux)}{dt} = uq$$

and solve for x. This gives

$$x(t) = u^{-1}(t)\int u(t)q(t)\,dt.$$

The constant of integration is in this integral, so the general solution has the form

$$x(t) = x_p(t) + cu^{-1}(t).$$

Another lovely feature of linear equations is that the constant of integration in the solution of a linear equation always appears right there.

The associated homogeneous equation is $\dot{x} + px = 0$.

This is separable, with solution $x_h(t) = e^{-\int p(t)\,dt}$. Look! This is the reciprocal of the integrating factor! Wonderful!

In most applications, $u^{-1}(t)$ falls off to zero as t gets large; the term $cu^{-1}(t)$ is a transient.
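To make Linn's procedure concrete, here is a small numerical check in Python on an equation chosen purely for illustration (it does not appear in the debate): $\dot{x} + x = t$, so p = 1 and q = t. The integrating factor is $u = e^t$, and $\int te^t\,dt = (t - 1)e^t + c$, which gives $x(t) = t - 1 + ce^{-t}$. The function names and the step size `h` are assumptions of this sketch.

```python
import math


def x_general(t, c):
    """General solution of x' + x = t from the integrating factor u = e^t."""
    # u^{-1}(t) * (integral of u*q dt) = e^{-t} * ((t - 1) e^t + c)
    return (t - 1.0) + c * math.exp(-t)


def residual(t, c, h=1e-5):
    """x' + x - t, with x' approximated by a central difference."""
    x_dot = (x_general(t + h, c) - x_general(t - h, c)) / (2 * h)
    return x_dot + x_general(t, c) - t


for c in (-2.0, 0.0, 3.0):
    worst = max(abs(residual(t, c)) for t in (0.0, 0.5, 1.0, 2.0))
    print(f"c = {c:+.1f}: max |residual| = {worst:.2e}")
```

The residual is at the level of the finite-difference error for every c, and the term $ce^{-t}$ is the transient: it is c times the reciprocal of the integrating factor.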

Chao: That’s a lot of integration. I’m more interested in the general behavior of solutions, rather than an incomprehensible expression of them as integrals or a boring expression of them in terms of sin, cos, and exponentials.

I prefer arguments like this: take an equation like $\dot{y} = y^2 - x$.

This doesn’t have a single solution which you, Linn, in your linear cave, have anything to say about. But I can look at the direction field and recognize that there is a funnel along the curve $y = -\sqrt{x}$. This means all solutions near there are trapped and are asymptotic to $-\sqrt{x}$. I can even argue that they are all ultimately a bit larger than $-\sqrt{x}$. No integration involved and very good information.

Or take an equation like the logistic equation

$$\dot{y} = k_0 y\left(1 - \frac{y}{p}\right).$$

This is an autonomous equation, and remains so even if I allow a harvest rate, even one depending upon y:

$$\dot{y} = k_0 y\left(1 - \frac{y}{p}\right) - a(y).$$

This equation gives genuine insight into real population dynamics. By looking at the phase line it is easy to analyze the behavior of solutions, in a way useful for policy makers.



Linn: Well, now, most of the time a system is near equilibrium. Engineers get very anxious when their systems get too far from equilibrium. Let’s look at your nice nonlinear logistic equation. As you said, there’s a critical point at y = p, and so an equilibrium solution. Just how does the system relax to this equilibrium?

Let’s write y = p + u and change variables using

$$1 - \frac{y}{p} = -\frac{u}{p}.$$

Substituting this in the logistic equation gives

$$\dot{u} = -k_0\frac{u}{p}(p + u) = -k_0 u - k_0\frac{u^2}{p}.$$

For small u the second term is very small, and can be ignored. This is called LINEARIZING!! the equation near equilibrium. Near equilibrium, solutions to the nonlinear equation behave a lot like p + u where u is a solution to the linear equation

$$\dot{u} = -k_0 u.$$

So we can say that the population relaxes to equilibrium exponentially, as $e^{-k_0 t}$ approaches 0.
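A quick way to see how good the linearization is: compare it with the exact logistic solution, $y(t) = p/(1 + \frac{p - y_0}{y_0}e^{-k_0 t})$, which follows from separating variables. The sketch below is illustrative only; the parameter values k0 = 1 and p = 1, the sample times and the function names are assumptions, not values from the debate.

```python
import math


def logistic_exact(t, y0, k0, p):
    """Exact solution of y' = k0*y*(1 - y/p) with y(0) = y0 > 0."""
    return p / (1.0 + (p - y0) / y0 * math.exp(-k0 * t))


def linearized(t, y0, k0, p):
    """p + u(t), where u solves u' = -k0*u with u(0) = y0 - p."""
    return p + (y0 - p) * math.exp(-k0 * t)


def max_gap(y0, k0=1.0, p=1.0, times=(0.0, 0.5, 1.0, 2.0, 4.0)):
    """Largest difference between the exact and linearized solutions at the sample times."""
    return max(abs(logistic_exact(t, y0, k0, p) - linearized(t, y0, k0, p))
               for t in times)


for y0 in (1.05, 1.5):  # a small and a larger initial displacement from p
    print(f"y0 = {y0}: max gap = {max_gap(y0):.2e}")
```

The gap is tiny when the population starts close to p and grows as the initial displacement grows, which is what dropping the quadratic term predicts.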

Chao: There you go again with your fancy exponentials. You think you know all about them, but your computer has to compute their values, after all, and the methods it uses are no different from the methods used to compute the values of linear equations. It has to use Euler’s method, or its fancier variants.

Linn: Speaking of fancy exponentials, I’d like to point out that smart people almost never use integrating factors to integrate linear equations with constant coefficients:

$$\dot{x} + kx = q(t). \quad \text{(LCC)}$$

Yeah, the integrating factor is $e^{kt}$, but even I don’t like the integrals that come out. But there are these great tricks, Chao! Suppose $q(t) = Be^{rt}$. Then, be optimistic! Maybe there’s an exponential solution of the form $Ae^{rt}$. When we substitute into the two sides of equation (LCC) we get:

Left side: $\dot{x} + kx = A(r + k)e^{rt}$

Right side: $Be^{rt}$.

Setting them equal to each other we get:

$$Be^{rt} = A(k + r)e^{rt}.$$

Thus,

$$A = \frac{B}{k + r} \quad \text{and} \quad x_p(t) = \frac{B}{k + r}e^{rt}. \quad \text{(ERF)}$$

We’ve found a particular solution to $\dot{x} + kx = Be^{rt}$. Because it is the response to exponential input we call this the Exponential Response Formula (ERF). It works as long as k + r is not 0.

Chao: Bravo.

Linn: Thank you. And what’s better, did I ever tell you about the complex exponential? In the ERF r can be a complex number! Euler told us that

$$e^{i\theta} = \cos(\theta) + i\sin(\theta)$$

– it’s a point on the unit circle in the complex plane. So, trig functions are incorporated into the complex exponential!! To solve

$$\dot{x} - x = 3\cos(t)$$

I replace it by the different equation

$$\dot{z} - z = 3e^{it}$$

of which my original DE is the real part. Then I can use the ERF (with k = −1, r = i, B = 3) to get

$$z_p = \frac{3e^{it}}{-1 + i}.$$

Since $x_p = \mathrm{Re}(z_p)$, all that’s left is to find the real part of $z_p$.

$$z_p = \frac{3e^{it}}{-1 + i}\cdot\frac{-1 - i}{-1 - i} = \frac{-3\cos(t) + 3\sin(t) + 3i(-\cos(t) - \sin(t))}{2}.$$

Therefore,

$$x_p = \mathrm{Re}(z_p) = -\frac{3}{2}\cos(t) + \frac{3}{2}\sin(t).$$

For the general solution we just add in the general solution to the homogeneous equation, which is $ce^t$. It’s a little funny to call this a transient and I don’t, but it does give you the general solution.
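Linn's complex ERF calculation can be replayed with Python's built-in complex numbers. This is a sketch, not part of the debate: the names `erf_coefficient`, `x_p` and `x_p_closed` are made up for the illustration.

```python
import cmath
import math


def erf_coefficient(B, k, r):
    """Coefficient A = B / (k + r) of the particular solution A*e^{rt}; needs k + r != 0."""
    if k + r == 0:
        raise ValueError("ERF does not apply when k + r = 0")
    return B / (k + r)


# Complexified equation z' - z = 3 e^{it}: k = -1, r = i, B = 3.
A = erf_coefficient(3, -1, 1j)


def x_p(t):
    """Real part of z_p(t) = A e^{it}."""
    return (A * cmath.exp(1j * t)).real


def x_p_closed(t):
    """The closed form -(3/2) cos t + (3/2) sin t."""
    return -1.5 * math.cos(t) + 1.5 * math.sin(t)


print("A =", A)
print("max difference:", max(abs(x_p(t) - x_p_closed(t)) for t in (0.0, 1.0, 2.5)))
```

Taking the real part numerically agrees with the closed form obtained by rationalizing the denominator by hand.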



Chao: You know, your method results in these sums of sines and cosines, which is very nice but I want to know what they look like. The nonlinear view of sines and cosines writes

a cos(ωt) + b sin(ωt) = A cos(ωt − φ),

where A and φ are the polar coordinates of the point (a, b).

In your example, $A = 3\sqrt{2}/2$ and $\varphi = 3\pi/4$. So, the solution is easy to draw and compare with the input signal.
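Chao's conversion to amplitude-phase form is just a rectangular-to-polar conversion, so it can be sketched with the standard `math.hypot` and `math.atan2` functions. The function name `amplitude_phase` is an illustrative choice.

```python
import math


def amplitude_phase(a, b):
    """Return (A, phi) with a*cos(wt) + b*sin(wt) = A*cos(wt - phi)."""
    # (A, phi) are the polar coordinates of the point (a, b).
    return math.hypot(a, b), math.atan2(b, a)


# Linn's particular solution: a = -3/2, b = 3/2.
A, phi = amplitude_phase(-1.5, 1.5)
print(A, phi)
```

Using `atan2` rather than `atan(b / a)` matters here: the point (−3/2, 3/2) lies in the second quadrant, and only the two-argument form returns the angle in the correct quadrant.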

I want to point out another charming feature of solutions to many nonlinear equations. Take a simple one, for example $dy/dx = y^2$. This is separable and can be solved in three short steps:

$$y^{-2}\,dy = dx$$

$$-y^{-1} = x + c$$

$$y = 1/(c - x)$$

(in the last step the arbitrary constant −c has been renamed c).

So the IVP with y(0) = 1 has c = 1 and y = 1/(1 − x). Its graph is asymptotic to the vertical line at x = 1. In other words, it is able to go off to infinity in finite time. It ends. The equation y = 1/(1 − x) actually represents two solutions: one for x < 1, and another for x > 1. If we are to say that a solution of a differential equation is determined by an initial value, we have to require that the graph be connected.

Linn: You call that a feature? That never happens to solutions to my equations. If they are going to go south on me, I know it from the coefficients or the input signal. As long as p(t) and q(t) are nice and finite (and r(t) is nonzero) so are all solutions. They live as long as I do!

Chao: Well, the world really is nonlinear. Newton’s law of gravitation is highly nonlinear. This kind of explosion actually happens in the case of Newton’s laws: Jeff Xia showed that in a certain 5-planet system two of the planets behave more or less like $\frac{\sin(1/t)}{t}$, oscillating with increasing amplitude and increasing frequency as t → 0 (from the negative side).

Solutions to linear equations are not nearly as diverse and exciting!

2. Conclusion

In both 18.03 classes in spring 2010, Linear won the debate, but Nonlinear’s supporters were more enthusiastic.


Introduction

Constant coefficient linear DE’s lie at the heart of this course. In this session we will see that they can be used to model many physical systems. Here we will focus on the damped harmonic oscillator. In particular we will discuss spring-mass-dashpot systems.

Because second order equations are algebraically tractable we will focus on them. Fortunately they are varied enough to give us a lot of insight into the behavior of higher order systems.

In this session we will learn how to solve homogeneous equations, i.e. those where the input function is 0. The key step will be finding which exponential functions $e^{rt}$ satisfy the DE. These will be our modal solutions. We will use superposition to build all the solutions out of the modal solutions.

The algebra will demand that we allow r to be a complex number, so we will need to use Euler’s formula to convert the complex exponential solution into solutions involving sines and cosines.

Second Order Physical Systems

1. Second Order Physical Systems

Second order equations are the basis of analysis of mechanical and electrical systems. We’ll build this important subject up slowly, starting with a simple mechanical system.

A spring is attached to a wall and a cart:

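[Figure: a spring attaches the mass to a wall; the external force Fext acts on the mass, and x measures its position from x = 0.]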

We set up the coordinate system so that at x = 0 the spring is relaxed, which means that it is exerting no force. This is called the equilibrium position.

In addition to the spring, suppose that there is another force acting on the cart – an external force, maybe wind blowing on a sail attached to it, maybe gravity, or some other force. Then

$$m\ddot{x} = F_{\text{spr}} + F_{\text{ext}}.$$

The spring force is characterized by the fact that it depends only on position. In fact:

if x > 0, Fspr(x) < 0
if x = 0, Fspr(x) = 0
if x < 0, Fspr(x) > 0.

The simplest way to model this behavior (and one which is valid in general for small x, by the tangent line approximation) is

Fspr(x) = −kx, where k > 0.

This is called Hooke’s Law and k is called the spring constant.

Replacing Fspr by −kx we get

$$m\ddot{x} + kx = F_{\text{ext}}.$$

Any real mechanical system also has friction. Friction takes many forms. It is characterized by the fact that it depends on the motion of the mass. We will suppose that it depends only on the velocity of the mass and not on its position. Often the damping is controlled by a device called a dashpot. This is a cylinder filled with oil, that a piston moves through. Door dampers and car shock absorbers often actually work this way. We write $F_{\text{dash}}(\dot{x})$ for the force exerted by the dashpot. It opposes the velocity:

if $\dot{x} > 0$, $F_{\text{dash}}(\dot{x}) < 0$
if $\dot{x} = 0$, $F_{\text{dash}}(\dot{x}) = 0$
if $\dot{x} < 0$, $F_{\text{dash}}(\dot{x}) > 0$.

The simplest way to model this behavior (and one which is valid in general for small $\dot{x}$, by the tangent line approximation) is

$$F_{\text{dash}}(\dot{x}) = -b\dot{x}, \quad \text{where } b > 0.$$

This is therefore called linear damping and b is called the damping constant.

Putting this together, the differential equation for the displacement x of the mass from equilibrium is

$$m\ddot{x} + b\dot{x} + kx = F_{\text{ext}}. \quad (1)$$

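Equation (1) will be a rich source of examples in the remainder of the course. Diagrammatically this looks like:

[Figure: the mass attached to a wall by a spring and a dashpot, with external force Fext acting on it; x measures the position of the mass from x = 0.]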

MIT OpenCourseWare
http://ocw.mit.edu

18.03SC Differential Equations, Fall 2011

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.

The Characteristic Polynomial

1. The General Second Order Case and the Characteristic Equation

For m, b, k constant, the homogeneous equation

$$m\ddot{x} + b\dot{x} + kx = 0 \quad (1)$$

is a lot like $\dot{x} + kx = 0$, which has as solution $x = e^{-kt}$. We’ll be optimistic and try for exponential solutions, $x(t) = e^{rt}$, for some as yet undetermined constant r.

To see which values of r might work, plug $x(t) = e^{rt}$ into (1). Organize the calculation: the k], b], m] are flags indicating that we should multiply the corresponding line by this number.

k] $x = e^{rt}$
b] $\dot{x} = re^{rt}$
m] $\ddot{x} = r^2e^{rt}$

$\Rightarrow\ m\ddot{x} + b\dot{x} + kx = (mr^2 + br + k)e^{rt} = 0.$

An exponential is never zero, so we can divide this equation by $e^{rt}$. We have found that $e^{rt}$ is a solution to (1) exactly when r satisfies the characteristic equation

$$mr^2 + br + k = 0.$$

The left hand side is a polynomial called, naturally enough, the characteristic polynomial and usually denoted p(r). (You will often also see s used as the variable instead of r. With this notation the characteristic polynomial is $p(s) = ms^2 + bs + k$.)

Example. Find all the solutions to $\ddot{x} + 8\dot{x} + 7x = 0$.

Solution. The characteristic polynomial is $r^2 + 8r + 7$. We want the roots. One reason we wrote out the polynomial was to remind you that you can find roots by factoring it. This one factors as (r + 1)(r + 7), so the roots are r = −1 and r = −7, with corresponding exponential solutions $x_1(t) = e^{-t}$ and $x_2(t) = e^{-7t}$.

By superposition, the linear combination of independent solutions gives the general solution:

$$x(t) = c_1e^{-t} + c_2e^{-7t}.$$

Suppose that we have initial conditions $x(0) = 2$ and $\dot{x}(0) = -8$. Then we can solve for $c_1$ and $c_2$. Use $\dot{x}(t) = -c_1e^{-t} - 7c_2e^{-7t}$ and substitute t = 0 to get

$$x(0) = c_1 + c_2 = 2$$
$$\dot{x}(0) = -c_1 - 7c_2 = -8$$

Adding these two equations yields $-6c_2 = -6$, so $c_2 = 1$ and $c_1 = 1$. The solution to our DE with the given initial conditions $x(0) = 2$, $\dot{x}(0) = -8$ is

$$x(t) = e^{-t} + e^{-7t}.$$
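The two steps of this example, finding the roots of the characteristic polynomial $r^2 + 8r + 7$ and then fitting $c_1$, $c_2$ to the initial conditions, can be sketched in Python. The function names are hypothetical, and `fit_constants` assumes two distinct roots.

```python
import cmath


def characteristic_roots(m, b, k):
    """Roots of m r^2 + b r + k = 0 by the quadratic formula (returned as complex numbers)."""
    disc = cmath.sqrt(b * b - 4 * m * k)
    return (-b + disc) / (2 * m), (-b - disc) / (2 * m)


def fit_constants(r1, r2, x0, v0):
    """Solve c1 + c2 = x0 and r1*c1 + r2*c2 = v0, assuming r1 != r2."""
    c2 = (v0 - r1 * x0) / (r2 - r1)
    return x0 - c2, c2


r1, r2 = characteristic_roots(1, 8, 7)
c1, c2 = fit_constants(r1, r2, 2, -8)   # x(0) = 2, x'(0) = -8
print("roots:", r1, r2)
print("constants:", c1, c2)
```

`cmath.sqrt` is used so that the same root finder also covers a negative discriminant; for this example the imaginary parts are zero.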

2. The General nth Order Case

In the same way we can take the homogeneous constant coefficient linear equation of order n

$$a_nx^{(n)} + \cdots + a_1\dot{x} + a_0x = 0$$

and get its characteristic polynomial,

$$p(r) = a_nr^n + \cdots + a_1r + a_0.$$

The exponential $x(t) = e^{rt}$ is a solution of the homogeneous DE if and only if r is a root of p(r), i.e. p(r) = 0. By superposition, any linear combination of these exponentials is also a solution.



Linear Differential Equations

1. Linear Differential Equations

A linear differential equation is of the following form:

$$a_nx^{(n)} + a_{n-1}x^{(n-1)} + \cdots + a_1\dot{x} + a_0x = q(t). \quad (1)$$

The $a_k$’s are the coefficients. They may depend upon t (but not on x). If $a_n$ is not zero then the differential equation is said to be of order n.

If this models a physical system then the left hand side represents the system and the right hand side represents the input signal. The coefficients represent parameters of the system. For example, the mass, damping and spring constants m, b and k in $m\ddot{x} + b\dot{x} + kx$ are the parameters of the system. In general, they may depend on time, e.g. maybe the force is actually a rocket, and the fuel burns so m decreases. Or maybe the spring gets softer as it ages. Maybe the honey in the dashpot gets thicker with time.

We will generally assume the coefficients are constant, in which case equation (1) is said to be a constant coefficient linear equation. It is, in fact, a good approximation of the non-constant coefficient equation as long as the coefficients vary on a time-scale that is much greater than the time-scale of the dynamical variable x.

2. Second Order Homogeneous Constant Coefficient Linear Equations

We will study the spring system $m\ddot{x} + b\dot{x} + kx = F_{\text{ext}}$ starting with the case $F_{\text{ext}} = 0$:

$$m\ddot{x} + b\dot{x} + kx = 0. \quad (2)$$

To ensure that (2) is of second order (and a realistic physical system) we always assume m > 0, but we will allow the case b = 0 and occasionally k = 0. With no external force the system evolves on its own. Think of a door that can swing back and forth or a ball on the end of a rubber band. As we did in first order equations we will call (2) a homogeneous linear differential equation.

3. The Undamped Case

The special case b = 0 is called undamped. The system is then called the simple harmonic oscillator. Its ODE is $m\ddot{x} + kx = 0$ or

$$\ddot{x} + \frac{k}{m}x = 0.$$

If we let $\omega = \sqrt{k/m}$ our equation becomes

$$\ddot{x} + \omega^2x = 0.$$

We have seen before (and you can easily check) that x1(t) = cos(ωt) and x2(t) = sin(ωt) are solutions to this equation. Since the input is 0 and the equation is linear, we can use superposition of solutions to get the general solution

x(t) = a cos(ωt) + b sin(ωt) = A cos(ωt − φ) (3)

This is another fundamental fact you should memorize! (The second equality comes from the sinusoidal identity, which gives a = A cos φ and b = A sin φ.)

We know (3) gives every solution because $x(0) = a$ and $\dot{x}(0) = \omega b$, so you can solve (uniquely) for a and b to give any desired initial condition.



Period of the Simple Harmonic Oscillator

Quiz: What is the period of a nonzero solution of $\ddot{x} + 4x = 0$?

Choices:

a) Depends upon the solution

b) 2

c) π

d) 4

e) 2π

f) π/2

g) None of these.

Answer: (c) π.

We have the natural frequency $\omega_0 = \sqrt{k/m} = 2$, so the general solution is

x(t) = c1 cos(2t) + c2 sin(2t) = A cos(2t − φ)

in both rectangular and phase-amplitude form respectively.

(As a check, think of what t has to do to take 2t from 0 to 2π; or alternatively use P = 2π/ω0, with ω0 = 2.)
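As a numerical cross-check of $P = 2\pi/\omega_0$, here is a minimal Python sketch. The function names and the sample values of $c_1$, $c_2$ and t are illustrative assumptions, not part of the quiz.

```python
import math


def period(m, k):
    """Period P = 2*pi/omega0 of m x'' + k x = 0, with omega0 = sqrt(k/m)."""
    return 2 * math.pi / math.sqrt(k / m)


def x(t, c1=1.0, c2=-2.0):
    """A sample nonzero solution c1*cos(2t) + c2*sin(2t) of x'' + 4x = 0."""
    return c1 * math.cos(2 * t) + c2 * math.sin(2 * t)


P = period(1, 4)
print("P =", P)
print("x(t + P) - x(t):", [x(t + P) - x(t) for t in (0.0, 0.3, 1.7)])
```

The period does not depend on $c_1$ and $c_2$, which is why choice (a) is wrong.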






Modes and Roots

A solution of the form $x(t) = ce^{rt}$ to the homogeneous constant coefficient linear equation

$$a_nx^{(n)} + a_{n-1}x^{(n-1)} + \cdots + a_1\dot{x} + a_0x = 0 \quad (1)$$

is called a modal solution and $ce^{rt}$ is called a mode of the system. We saw previously that $e^{rt}$ is a solution exactly when r is a root of the characteristic polynomial

$$p(s) = a_ns^n + a_{n-1}s^{n-1} + \cdots + a_1s + a_0.$$

Warning: This only works for homogeneous constant coefficient linear equations. It does not work for non-constant coefficient or inhomogeneous or nonlinear equations.

The roots of polynomials can be real or non-real complex numbers. (We need to be a little careful with our language because a real number is also a complex number with imaginary part 0.) Roots can also be repeated. Studying the second order equation will be enough to help us understand all of these possibilities. So, we study (with $a_2 = m$, $a_1 = b$, $a_0 = k$)

$$m\ddot{x} + b\dot{x} + kx = 0, \quad (2)$$

which models a spring-mass-dashpot system with no external force. The characteristic equation is

$$ms^2 + bs + k = 0.$$

1. Real Roots

We have already done this case earlier in this session. If the characteristic polynomial has real roots $r_1$ and $r_2$ then the modal solutions to (2) are $x_1(t) = e^{r_1t}$ and $x_2(t) = e^{r_2t}$. The general solution is found by superposition:

$$x(t) = c_1x_1(t) + c_2x_2(t) = c_1e^{r_1t} + c_2e^{r_2t}.$$

Example 1. (Real roots) Solve $\ddot{x} + 5\dot{x} + 4x = 0$.

Solution. The characteristic equation is $s^2 + 5s + 4 = 0$. This factors as (s + 1)(s + 4) = 0, so it has roots −1, −4. The modal solutions are $x_1(t) = e^{-t}$ and $x_2(t) = e^{-4t}$. Therefore, the general solution is

$$x(t) = c_1e^{-t} + c_2e^{-4t}.$$

2. Complex Roots

(Again, if we were being completely precise, this section would be called non-real complex roots to indicate a complex number with non-zero imaginary part.)

Example 2. Solve the equation $\ddot{x} + 4\dot{x} + 5x = 0$.

Solution. The characteristic polynomial is $s^2 + 4s + 5$. Using the quadratic formula the roots are

$$s = \frac{-4 \pm \sqrt{16 - 20}}{2} = -2 \pm \sqrt{-1} = -2 \pm i.$$

So our exponential solutions are

$$z_1(t) = e^{(-2+i)t} \quad \text{and} \quad z_2(t) = e^{(-2-i)t}.$$

We use the letter z here to indicate the functions are complex valued.

The general solution is a linear combination of these two basic solutions. But, because the DE has real coefficients, we were expecting real valued solutions. We will finish this example and get our real solutions after stating and proving the following theorem.

Theorem (Real Solution Theorem): If z(t) is a complex-valued solution to $m\ddot{z} + b\dot{z} + kz = 0$, where m, b, and k are real, then the real and imaginary parts of z are also solutions.

Proof: Let u(t) be the real part of z and v(t) the imaginary part, so z(t) = u(t) + iv(t). Now, build the table.

k] $z = u + iv$
b] $\dot{z} = \dot{u} + i\dot{v}$
m] $\ddot{z} = \ddot{u} + i\ddot{v}$

Summing with the coefficients (and remembering z is a solution to the homogeneous DE) gives

$$(m\ddot{u} + b\dot{u} + ku) + i(m\ddot{v} + b\dot{v} + kv) = 0.$$

Both expressions in parentheses are real, so the only way the sum can be zero is for both of them to be zero. That is, both u and v are solutions of (2) as claimed.

Back to the example: Using Euler’s formula

$$z_1(t) = e^{(-2+i)t} = e^{-2t}\cos t + ie^{-2t}\sin t.$$

The real part of $e^{-2t}\cos t + ie^{-2t}\sin t$ is $e^{-2t}\cos t$ and the imaginary part is $e^{-2t}\sin t$. We now have two basic solutions and can use superposition to find the general real valued solution

$$x(t) = c_1e^{-2t}\cos(t) + c_2e^{-2t}\sin(t).$$

Or we could have also written it as

$$x(t) = e^{-2t}(c_1\cos t + c_2\sin t) = Ae^{-2t}\cos(t - \varphi).$$

This is a damped sinusoid with circular pseudo-frequency 1.

If we had chosen the other exponential solution

$$z_2(t) = e^{(-2-i)t} = e^{-2t}(\cos(-t) + i\sin(-t))$$

then the basic real solutions would be

$$e^{-2t}\cos(-t) = e^{-2t}\cos(t) \quad \text{and} \quad e^{-2t}\sin(-t) = -e^{-2t}\sin(t).$$

Up to a sign these are the same basic solutions as were obtained from $z_1$, so $z_2(t)$ would have worked just as well.

Example 3. Solve $\ddot{x} + \dot{x} + x = 0$.

Solution. Characteristic equation: $s^2 + s + 1 = 0$.

Roots: $\frac{-1 \pm \sqrt{1 - 4}}{2} = -\frac{1}{2} \pm i\frac{\sqrt{3}}{2}$.

Complex exponential solutions: $z_1(t) = e^{(-1+i\sqrt{3})t/2}$, $z_2(t) = e^{(-1-i\sqrt{3})t/2}$.

Basic real solutions: $x_1(t) = \mathrm{Re}(z_1(t)) = e^{-t/2}\cos(\sqrt{3}\,t/2)$, $x_2(t) = \mathrm{Im}(z_1(t)) = e^{-t/2}\sin(\sqrt{3}\,t/2)$.

General real solution: $x(t) = e^{-t/2}(c_1\cos(\sqrt{3}\,t/2) + c_2\sin(\sqrt{3}\,t/2)) = Ae^{-t/2}\cos(\sqrt{3}\,t/2 - \varphi)$.

Example 4. Suppose that the equation $m\ddot{x} + b\dot{x} + kx = 0$ has characteristic roots a ± ib. Give the general real solution.

Solution. In the previous examples we have established a pattern: Two basic real solutions are

$$e^{at}\cos(bt) \quad \text{and} \quad e^{at}\sin(bt)$$

and the general real solution is

$$x(t) = c_1e^{at}\cos(bt) + c_2e^{at}\sin(bt) = Ae^{at}\cos(bt - \varphi).$$

In words, the real part of the root is the coefficient of t in the exponential and the imaginary part is the angular pseudo-frequency in the trig functions.

For completeness we will walk through the derivation of this. One exponential solution is

$$z_1(t) = e^{(a+ib)t} = e^{at}(\cos(bt) + i\sin(bt)).$$

The two basic solutions are the real and imaginary parts of $z_1$. That is,

$$e^{at}\cos(bt) \quad \text{and} \quad e^{at}\sin(bt),$$

as claimed.
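The pattern for roots a ± ib can be checked numerically. The Python sketch below builds the two basic real solutions and evaluates the left side of the ODE with central differences. The function names and the step size `h` are assumptions of the sketch, and the damping coefficient is called `damping` to avoid a clash with the imaginary part b of the root.

```python
import math


def real_basis(a, b):
    """Basic real solutions e^{at}cos(bt) and e^{at}sin(bt) for roots a ± ib."""
    return (lambda t: math.exp(a * t) * math.cos(b * t),
            lambda t: math.exp(a * t) * math.sin(b * t))


def ode_residual(x, t, m, damping, k, h=1e-4):
    """m x'' + damping x' + k x at time t, with derivatives from central differences."""
    x_dot = (x(t + h) - x(t - h)) / (2 * h)
    x_ddot = (x(t + h) - 2 * x(t) + x(t - h)) / (h * h)
    return m * x_ddot + damping * x_dot + k * x(t)


# Example 2: x'' + 4x' + 5x = 0 has characteristic roots -2 ± i.
for x in real_basis(-2.0, 1.0):
    print(max(abs(ode_residual(x, t, 1, 4, 5)) for t in (0.0, 0.5, 1.0)))
```

Both residuals come out at the level of the finite-difference error, which is consistent with the Real Solution Theorem.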

Example 5. Use the characteristic equation to solve $\ddot{x} + 4x = 0$.

Solution. You should have memorized the solution to this equation. We will check the characteristic equation technique against this known solution.

Characteristic equation: $s^2 + 4 = 0$.
Roots: $s^2 = -4 \Rightarrow s = \pm 2i$.
Complex exponential solutions: $z_1 = e^{2it}$, $z_2 = e^{-2it}$.
Basic real solutions: $\mathrm{Re}(z_1) = \cos(2t)$, $\mathrm{Im}(z_1) = \sin(2t)$.
General real solution:

x = c1 cos(2t) + c2 sin(2t) = A cos(2t − φ)

(as expected).

3. Repeated Roots

Example 6. Solve $\ddot{x} + 4\dot{x} + 4x = 0$.

Here $p(s) = s^2 + 4s + 4 = (s + 2)^2$ has r = −2 as a repeated root. The only exponential solution is $e^{-2t}$. Another solution, which is not a constant multiple of $e^{-2t}$, is given by $te^{-2t}$. We will not check this for now; you know how to do it: plug in and use the product rule.

So the general solution is

$$x(t) = c_1e^{-2t} + c_2te^{-2t} \quad \text{or} \quad x(t) = e^{-2t}(c_1 + c_2t).$$

Example 7. (It’s all about the roots) Suppose the roots –with multiplicity– of a certain homogeneous constant coefficient linear equation are

3, 4, 4, 4, 5 ± 2i, 5 ± 2i.

Give the general real solution to the equation. What is the order of the equation?

Solution. The basic solutions are

$e^{3t}$, $e^{4t}$, $te^{4t}$, $t^2e^{4t}$, $e^{5t}\cos(2t)$, $e^{5t}\sin(2t)$, $te^{5t}\cos(2t)$, $te^{5t}\sin(2t)$.

(Each time a root is repeated we multiply the basic solution by one more factor of t.) Using superposition, the general solution is

$$x(t) = c_1e^{3t} + c_2e^{4t} + c_3te^{4t} + c_4t^2e^{4t} + c_5e^{5t}\cos(2t) + c_6e^{5t}\sin(2t) + c_7te^{5t}\cos(2t) + c_8te^{5t}\sin(2t).$$

There are 8 roots, so the order of the differential equation is 8.
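The bookkeeping in this example, one basic solution per root counted with multiplicity, can be sketched in Python. The input format (real part, imaginary part, multiplicity) and the string output are choices made for this illustration only. A non-real pair a ± ib is entered once, with b > 0.

```python
def basic_solutions(roots):
    """List the basic real solutions for roots given as (real, imag, multiplicity)."""
    basis = []
    for a, b, mult in roots:
        for power in range(mult):
            # Each repetition of a root contributes one more factor of t.
            if power == 0:
                t_factor = ""
            elif power == 1:
                t_factor = "t*"
            else:
                t_factor = f"t^{power}*"
            if b == 0:
                basis.append(f"{t_factor}exp({a}t)")
            else:
                # A conjugate pair a ± ib gives a cosine and a sine solution.
                basis.append(f"{t_factor}exp({a}t)*cos({b}t)")
                basis.append(f"{t_factor}exp({a}t)*sin({b}t)")
    return basis


# Roots 3, 4, 4, 4, 5 ± 2i, 5 ± 2i from Example 7.
example7 = [(3, 0, 1), (4, 0, 3), (5, 2, 2)]
solutions = basic_solutions(example7)
print(len(solutions))
for s in solutions:
    print(s)
```

The length of the list equals the number of roots counted with multiplicity, which is the order of the equation.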



Damped Harmonic Oscillators: Introduction

In this session we will look carefully at the equation

$$m\ddot{x} + b\dot{x} + kx = 0$$

as a model of the damped harmonic oscillator. When the damping constant b equals zero we know the solution to this equation is

x(t) = c1 cos(ωt) + c2 sin(ωt) = A cos(ωt − φ),

where $\omega = \sqrt{k/m}$. Since this has a pure sinusoidal solution we call the system a simple harmonic oscillator. When b ≠ 0 we call the system a damped harmonic oscillator.

Our goal is to understand the effect of b on the system. We’ll see that when b is small the system is underdamped and the output is a damped sinusoid or damped oscillation. When b is large the system is overdamped and it no longer oscillates. Right between under and overdamping is a value of b called critical damping. We will learn how to find the critical damping value.

Our main tool will be the method of characteristic roots discussed in the last session. We will use the mathlet Damped Vibrations to visualize what we have learned.





Damped Harmonic Oscillators

In the last session we modeled a spring-mass-dashpot system with the constant coefficient linear DE

$$m\ddot{x} + b\dot{x} + kx = F_{\text{ext}},$$

where m is the mass, b is the damping constant, k is the spring constant and x(t) is the displacement of the mass from its equilibrium position.

[Figure: the mass attached to a wall by a spring and a dashpot, with external force Fext acting on it; x measures the position of the mass from x = 0.]

We then assumed the external force $F_{\text{ext}} = 0$ and used the characteristic equation technique to solve the homogeneous equation

$$m\ddot{x} + b\dot{x} + kx = 0. \quad (1)$$

Restrictions on the coefficients: The algebra does not require any restrictions on m, b and k (except m ≠ 0 so that the equation is genuinely second order). But, since this is a physical model, we will now require m > 0, b ≥ 0 and k > 0.

The Damped Harmonic Oscillator: The undamped (b = 0) system has equation

$$m\ddot{x} + kx = 0.$$

At this point you should have memorized the solution and also be able to solve this equation using the characteristic roots. The solution is

x(t) = c1 cos(ωt) + c2 sin(ωt) = A cos(ωt − φ).

Here $\omega = \sqrt{k/m}$ and the solution is given in both rectangular and amplitude-phase form. The solution is always a sinusoid, which we consider a simple oscillation, and we call this system a simple harmonic oscillator.

[Figure 2: graph of x against t. The output of a simple harmonic oscillator is a pure sinusoid.]

When we add damping we call the system in (1) a damped harmonic oscillator. This is a much fancier sounding name than the spring-mass-dashpot. It emphasizes an important fact about using differential equations for modeling physical systems. It doesn’t matter whether x measures position or current or some other quantity. Any system modeled by equation (1) will respond just like the spring-mass-dashpot; that is, all damped harmonic oscillators exhibit similar behavior. We will see an important example of this principle when we study the case of an RLC electrical circuit.


Under, Over and Critical Damping

1. Response to Damping

As we saw, the unforced damped harmonic oscillator has equation

$$m\ddot{x} + b\dot{x} + kx = 0, \quad (1)$$

with m > 0, b ≥ 0 and k > 0. It has characteristic equation

$$ms^2 + bs + k = 0$$

with characteristic roots

$$\frac{-b \pm \sqrt{b^2 - 4mk}}{2m}. \quad (2)$$

There are three cases depending on the sign of the expression under the square root:

i) $b^2 < 4mk$ (this will be underdamping, b is small relative to m and k).

ii) $b^2 > 4mk$ (this will be overdamping, b is large relative to m and k).

iii) $b^2 = 4mk$ (this will be critical damping, b is just between over- and underdamping).

Mathematically, the easiest case is overdamping because the roots are real. However, most people think of the oscillatory behavior of a damped oscillator. Since this is connected to underdamping we start with that case.
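The three-way split above depends only on the sign of $b^2 - 4mk$, so it can be sketched in a few lines of Python. This is a minimal illustration, not part of the original notes: the function name `classify_damping` is made up here, and testing for exact equality in the critical case is an assumption that only makes sense for coefficients that are exactly representable.

```python
def classify_damping(m: float, b: float, k: float) -> str:
    """Classify m x'' + b x' + k x = 0 by the sign of b^2 - 4mk.

    Hypothetical helper for illustration. Assumes the physical
    restrictions m > 0, b >= 0, k > 0 stated in the notes.
    """
    if m <= 0 or b < 0 or k <= 0:
        raise ValueError("expected m > 0, b >= 0, k > 0")
    disc = b * b - 4 * m * k
    if disc < 0:
        # Non-real roots; this branch also covers the undamped case b = 0.
        return "underdamped"
    if disc > 0:
        # Two distinct real roots.
        return "overdamped"
    # Repeated real root (exact comparison: an assumption for clean inputs).
    return "critically damped"


for coeffs in [(1, 1, 3), (1, 4, 3), (1, 4, 4)]:
    print(coeffs, classify_damping(*coeffs))
```

The three coefficient triples are the ones used in Examples 1, 2 and 3 of this section.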

Case (i) Underdamping (non-real complex roots)

If $b^2 < 4mk$ then the term under the square root is negative and the characteristic roots are not real. In order for $b^2 < 4mk$ the damping constant b must be relatively small.

First we use the roots (2) to solve equation (1). Let $\omega_d = \sqrt{|b^2 - 4mk|}/(2m)$. Then we have

characteristic roots: $-\frac{b}{2m} \pm i\omega_d$,

leading to

complex exponential solutions: $e^{(-b/2m + i\omega_d)t}$, $e^{(-b/2m - i\omega_d)t}$.

The basic real solutions are $e^{-bt/2m}\cos(\omega_d t)$ and $e^{-bt/2m}\sin(\omega_d t)$.

The general real solution is found by taking linear combinations of the two basic solutions, that is:

$$x(t) = c_1 e^{-bt/2m}\cos(\omega_d t) + c_2 e^{-bt/2m}\sin(\omega_d t)$$

or

$$x(t) = e^{-bt/2m}\left(c_1\cos(\omega_d t) + c_2\sin(\omega_d t)\right) = Ae^{-bt/2m}\cos(\omega_d t - \phi). \qquad (3)$$

Let’s analyze this physically. When b = 0 the response is a sinusoid. Damping is a frictional force, so it generates heat and dissipates energy. When the damping constant b is small we would expect the system to still oscillate, but with decreasing amplitude as its energy is converted to heat. Over time it should come to rest at equilibrium. This is exactly what we see in (3). The factor $\cos(\omega_d t - \phi)$ shows the oscillation. The exponential factor $e^{-bt/2m}$ has a negative exponent and therefore gives the decaying amplitude. As t → ∞, the exponential goes asymptotically to 0, so x(t) also goes asymptotically to its equilibrium position x = 0.

We call ωd the damped angular (or circular) frequency of the system. This is sometimes called a pseudo-frequency of x(t). We need to be careful to call it a pseudo-frequency because x(t) is not periodic and only periodic functions have a frequency. Nonetheless, x(t) does oscillate, crossing x = 0 twice each pseudo-period.

Example 1. Show that the system $\ddot{x} + \dot{x} + 3x = 0$ is underdamped, find its damped angular frequency and graph the solution with initial conditions $x(0) = 1$, $\dot{x}(0) = 0$.

Solution. Characteristic equation: $s^2 + s + 3 = 0$.

Characteristic roots: $-1/2 \pm i\sqrt{11}/2$.

Basic real solutions: $e^{-t/2}\cos(\sqrt{11}\,t/2)$, $e^{-t/2}\sin(\sqrt{11}\,t/2)$.

General solution:

$$x(t) = e^{-t/2}\left(c_1\cos(\sqrt{11}\,t/2) + c_2\sin(\sqrt{11}\,t/2)\right) = Ae^{-t/2}\cos(\sqrt{11}\,t/2 - \phi).$$

Since the roots have nonzero imaginary part, the system is underdamped. The damped angular frequency is $\omega_d = \sqrt{11}/2$.

The initial conditions are satisfied when $c_1 = 1$ and $c_2 = 1/\sqrt{11}$. So,

$$x(t) = e^{-t/2}\left(\cos(\sqrt{11}\,t/2) + \frac{1}{\sqrt{11}}\sin(\sqrt{11}\,t/2)\right) = \frac{\sqrt{12}}{\sqrt{11}}\,e^{-t/2}\cos(\sqrt{11}\,t/2 - \phi),$$

where $\phi = \tan^{-1}(1/\sqrt{11})$.



Figure 1: The damped oscillation for example 1.
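As a sanity check on Example 1, the following Python sketch evaluates the solution in both rectangular and amplitude-phase form and estimates the residual of the differential equation by central differences. The function names and the step size `h` are choices made for this illustration only; the finite-difference check is a rough numerical test, not a proof.

```python
import math

OMEGA_D = math.sqrt(11) / 2  # damped angular frequency from Example 1


def x(t):
    """Rectangular form with c1 = 1, c2 = 1/sqrt(11)."""
    return math.exp(-t / 2) * (
        math.cos(OMEGA_D * t) + math.sin(OMEGA_D * t) / math.sqrt(11)
    )


def x_amp_phase(t):
    """Amplitude-phase form with A = sqrt(12/11), phi = atan(1/sqrt(11))."""
    amp = math.sqrt(12 / 11)
    phi = math.atan(1 / math.sqrt(11))
    return amp * math.exp(-t / 2) * math.cos(OMEGA_D * t - phi)


def residual(t, h=1e-4):
    """Central-difference estimate of x'' + x' + 3x at time t (should be near 0)."""
    d1 = (x(t + h) - x(t - h)) / (2 * h)
    d2 = (x(t + h) - 2 * x(t) + x(t - h)) / (h * h)
    return d2 + d1 + 3 * x(t)


print("x(0) =", x(0.0))
print("pseudo-period 2*pi/omega_d =", 2 * math.pi / OMEGA_D)
print("residual at t = 1:", residual(1.0))
```

The two forms agree to rounding error, and the residual is small compared with the size of the individual terms.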

Case (ii) Overdamping (distinct real roots)

If $b^2 > 4mk$ then the term under the square root is positive and the characteristic roots are real and distinct. In order for $b^2 > 4mk$ the damping constant b must be relatively large.

One extremely important thing to notice is that in this case the roots are both negative. You can see this by looking at the formula (2). The term under the square root is positive by assumption, so the roots are real. Since $b^2 - 4mk < b^2$ the square root is less than b and therefore the root $\frac{-b + \sqrt{b^2 - 4mk}}{2m} < 0$. The other root is clearly negative.

Now we use the roots to solve equation (1) in this case.

Characteristic roots: $r_1 = \frac{-b + \sqrt{b^2 - 4mk}}{2m}$, $r_2 = \frac{-b - \sqrt{b^2 - 4mk}}{2m}$.

Exponential solutions: $e^{r_1 t}$, $e^{r_2 t}$.

General solution:

$$x(t) = c_1 e^{r_1 t} + c_2 e^{r_2 t}.$$

Let’s analyze this physically. When the damping is large the frictional force is so great that the system can’t oscillate. It might sound odd, but an unforced overdamped harmonic oscillator does not oscillate. Since both exponents are negative every solution in this case goes asymptotically to the equilibrium x = 0.

At the top of many doors is a spring to make them shut automatically. The spring is damped to control the rate at which the door closes. If the damper is strong enough, so that the spring is overdamped, then the door just settles back to the equilibrium position (i.e. the closed position) without oscillating –which is usually what is wanted in this case.

Example 2. Show that the system $\ddot{x} + 4\dot{x} + 3x = 0$ is overdamped and graph the solution with initial conditions $x(0) = 1$, $\dot{x}(0) = 0$. Which root controls how fast the solution returns to equilibrium?



Solution. Characteristic equation: $s^2 + 4s + 3 = 0$.

Characteristic roots: (this factors) −1, −3.

Exponential solutions: $e^{-t}$, $e^{-3t}$.

General solution:

$$x(t) = c_1 e^{-t} + c_2 e^{-3t}.$$

Because the roots are real and different, the system is overdamped.

The initial conditions are satisfied when $c_1 = 3/2$, $c_2 = -1/2$. So, $x(t) = \frac{3}{2}e^{-t} - \frac{1}{2}e^{-3t}$.

Figure 2: The overdamped graph for example 2 (axes: t horizontal, x vertical).

Because $e^{-t}$ goes to 0 more slowly than $e^{-3t}$, it controls the rate at which x goes to 0. (Remember, it is the term that goes to zero slowest that controls the rate.)
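To see the slow term take over numerically, the short Python sketch below compares the Example 2 solution with its slowly decaying part $\frac{3}{2}e^{-t}$. The function names are invented for this illustration.

```python
import math


def x(t):
    """Solution of Example 2: x(t) = (3/2) e^{-t} - (1/2) e^{-3t}."""
    return 1.5 * math.exp(-t) - 0.5 * math.exp(-3 * t)


def slow_term(t):
    """The term coming from the root -1, which decays most slowly."""
    return 1.5 * math.exp(-t)


for t in (0.0, 1.0, 2.0, 4.0):
    # The ratio equals 1 - (1/3) e^{-2t}, so it approaches 1 as t grows.
    print(f"t = {t}: x = {x(t):.6f}, x / slow term = {x(t) / slow_term(t):.6f}")
```

The ratio tends to 1, which is one way of saying that the root −1 sets the rate of return to equilibrium.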

Case (iii) Critical Damping (repeated real roots)

If $b^2 = 4mk$ then the term under the square root is 0 and the characteristic polynomial has repeated roots, $-b/2m$, $-b/2m$.

Now we use the roots to solve equation (1) in this case. We have only one exponential solution, so we need to multiply it by t to get the second solution. Basic solutions: $e^{-bt/2m}$, $te^{-bt/2m}$. General solution:

$$x(t) = e^{-bt/2m}(c_1 + c_2 t).$$

As in the overdamped case, this does not oscillate. It is worth noting that for a fixed m and k, choosing b to be the critical damping value gives the fastest return of the system to its equilibrium position. In engineering design this is often a desirable property. This can be seen by considering the roots, but we will not go through the algebra that shows this. (See figure (4).)



Example 3. Show that the system $\ddot{x} + 4\dot{x} + 4x = 0$ is critically damped and graph the solution with initial conditions $x(0) = 1$, $\dot{x}(0) = 0$.

Solution. Characteristic equation: $s^2 + 4s + 4 = 0$.

Characteristic roots: (this factors) −2, −2.

Exponential solutions: (only one) $e^{-2t}$.

General solution:

$$x(t) = e^{-2t}(c_1 + c_2 t).$$

Because the roots are repeated, the system is critically damped. The initial conditions are satisfied when $c_1 = 1$, $c_2 = 2$. So, $x(t) = e^{-2t}(1 + 2t)$.

Figure 3: The critically damped graph for example 3 (axes: t horizontal, x vertical).

Notice that qualitatively the graphs for the overdamped and critically damped cases are similar.

The following figure shows plots for solutions to $\ddot{x} + b\dot{x} + x = 0$ with initial conditions $x(0) = 1$, $\dot{x}(0) = 0$. The three plots are b = 1 underdamped; b = 2 critically damped (dashed line); b = 3 overdamped. Notice that the critically damped curve has the fastest decay.

Figure 4: Plots of solutions to $\ddot{x} + b\dot{x} + x = 0$ (axes: t horizontal, x vertical).
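The comparison in Figure 4 can be reproduced from the closed-form solutions. The Python sketch below writes out the three solutions for b = 1, 2, 3 with $x(0) = 1$, $\dot{x}(0) = 0$, using the formulas from the three cases above; the function names and the sample time t = 8 are choices made here for illustration.

```python
import math


def x_under(t):
    """b = 1: roots -1/2 +/- i*sqrt(3)/2."""
    w = math.sqrt(3) / 2
    return math.exp(-t / 2) * (math.cos(w * t) + math.sin(w * t) / (2 * w))


def x_critical(t):
    """b = 2: repeated root -1."""
    return math.exp(-t) * (1 + t)


def x_over(t):
    """b = 3: roots (-3 +/- sqrt(5))/2."""
    r1 = (-3 + math.sqrt(5)) / 2
    r2 = (-3 - math.sqrt(5)) / 2
    # Constants chosen so that x(0) = 1 and x'(0) = 0.
    c1 = r2 / (r2 - r1)
    c2 = -r1 / (r2 - r1)
    return c1 * math.exp(r1 * t) + c2 * math.exp(r2 * t)


def under_envelope(t):
    """Decaying amplitude A e^{-t/2} of the underdamped solution."""
    w = math.sqrt(3) / 2
    return math.sqrt(1 + 1 / (2 * w) ** 2) * math.exp(-t / 2)


t = 8.0
print("underdamped envelope:", under_envelope(t))
print("critically damped:   ", x_critical(t))
print("overdamped:          ", x_over(t))
```

At t = 8 the critically damped value is the smallest of the three, in line with the remark that the critically damped curve decays fastest.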



Introduction

We will continue our study of the “mass-spring-dashpot” system, governed by the differential equation

$$mx'' + bx' + kx = F_{\text{ext}}(t).$$

Remember that m represents the mass, k the strength of the spring, and b the damping. Fext(t) represents some external driving force.

We’ve already seen how to solve this equation if there is no driving force, i.e., if we have

$$mx'' + bx' + kx = 0.$$

We will now discuss how to handle certain kinds of external driving functions, namely exponential and sinusoidal driving. We will find a general formula to handle these cases, and touch on the phenomenon of resonance, a very important concept which we’ll discuss in more detail in a few lectures.

The method of superposition, which we saw already, will be an important tool for us again.






Superposition

1. Superposition I

We saw the principle of superposition already, for first order equations. For example, we saw that if $y_1$ is a solution to $y' + 4y = \sin(3t)$ and $y_2$ a solution to $y' + 4y = 2$, then $y_1 + y_2$ is a solution to $y' + 4y = \sin(3t) + 2$. Superposition will be useful for us again, though now we will use it in two slightly different ways. The first version we already used in a previous session, but let’s state it carefully and explicitly:

Superposition I: If y1 and y2 are solutions of a homogeneous linear equation, then so is any linear combination; that is, for any constants c1 and c2, the function y3 = c1y1 + c2y2 will also be a solution.

Example. Consider the ODE

$$t^2 y'' + t y' - 4y = 0.$$

This is homogeneous, since the constant term (the one not involving y or any of its derivatives) is zero. You can easily check by substitution that $y_1(t) = t^2$ and $y_2(t) = 1/t^2$ are both solutions. Thus

$$y(t) = c_1 t^2 + c_2/t^2$$

is a solution for any c1 and c2.

Notice that we didn’t need the differential equation to have constant coefficients: linearity and homogeneity are enough.
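The substitution check mentioned in the example can be carried out mechanically. The Python sketch below uses hand-computed derivatives of $y = c_1 t^2 + c_2/t^2$ and evaluates the left-hand side of the ODE; the function names and the sample values of t, c1 and c2 are arbitrary choices for this illustration.

```python
def y(t, c1, c2):
    return c1 * t**2 + c2 / t**2


def dy(t, c1, c2):
    # Derivative of c1 t^2 + c2 t^(-2).
    return 2 * c1 * t - 2 * c2 / t**3


def d2y(t, c1, c2):
    # Second derivative of c1 t^2 + c2 t^(-2).
    return 2 * c1 + 6 * c2 / t**4


def lhs(t, c1, c2):
    """Left-hand side t^2 y'' + t y' - 4y; zero when y solves the ODE."""
    return t**2 * d2y(t, c1, c2) + t * dy(t, c1, c2) - 4 * y(t, c1, c2)


for c1, c2 in [(1.0, 0.0), (0.0, 1.0), (2.0, -3.0)]:
    print((c1, c2), lhs(1.7, c1, c2))
```

Each printed value is zero up to rounding, for any choice of the constants, which is exactly what Superposition I asserts.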

If the equation is of second order with two solutions y1 and y2 such that neither is a multiple of the other, then

c1y1 + c2y2

will be the general solution. It has the right number of parameters. The restriction on the solutions is to make sure that they are really “different” solutions. For instance, in the above example, it would be incorrect to take $y_1 = t^2$ and $y_2 = 3t^2$, and then claim that

$$y(t) = c_1 t^2 + c_2 \cdot 3t^2 = (c_1 + 3c_2)t^2$$

is the general solution.


2. Superposition II

Now consider the linear second order equation

$$mx'' + bx' + kx = F_{\text{ext}}(t), \qquad (1)$$

and its associated homogeneous equation

$$mx'' + bx' + kx = 0. \qquad (2)$$

Superposition II: Suppose xp is any solution to (1). If xh is any solution to (2), then x = xp + xh is again a solution to (1).

This is similar to the way we used superposition for first order equations. To prove this, we just need to substitute x into (1) and check that it really is a solution:

$$\begin{aligned}
mx'' + bx' + kx &= m(x_h + x_p)'' + b(x_h + x_p)' + k(x_h + x_p) \\
&= (mx_h'' + mx_p'') + (bx_h' + bx_p') + (kx_h + kx_p) \\
&= (mx_h'' + bx_h' + kx_h) + (mx_p'' + bx_p' + kx_p) \\
&= 0 + F_{\text{ext}}.
\end{aligned}$$

So indeed, it is a solution.

An important fact: if xh is the general solution to (2) (so it should have two parameters) then xp + xh is the general solution to (1). We’ll see an example of this shortly.

This proof works for linear equations of any order. For example, we already saw it as a consequence of the method of integrating factors for first order equations.

We’ve already seen how to find the general solution to the associated homogeneous equation (2) using the characteristic equation. Thus to find the general solution to (1), we simply need to find a single solution to this particular equation. This is what we’ll discuss next.



Exponential Input

Let’s consider the case where the driving function is an exponential $Ae^{at}$, where A and a are constants. We will allow A and a to be complex, so this will also be useful for dealing with sinusoidal driving functions, e.g., Fext(t) = 3 cos(2t).

Let’s try to solve a particular example.

Example. Find the general solution to

$$x'' + 8x' + 7x = 9e^{2t}.$$

We have no method yet, but we can at least try to guess (the method of optimism). We hope that we can get a solution which is similar in form to the right hand side. So let’s guess

$$x(t) = Ae^{2t},$$

where A is an unknown constant. Substituting we get

$$x'' + 8x' + 7x = 4Ae^{2t} + 16Ae^{2t} + 7Ae^{2t} = 27Ae^{2t}.$$

Success! Matching this with the right hand side $9e^{2t}$ requires 27A = 9. Setting A = 1/3, we have a solution $x_p = \frac{1}{3}e^{2t}$.

We are not done yet, since we want the general solution. Now we only need to solve the homogeneous equation, and then we can apply Superposition II. The associated homogeneous equation is

$$x'' + 8x' + 7x = 0.$$

The characteristic polynomial is

$$p(r) = r^2 + 8r + 7 = (r + 7)(r + 1).$$

The roots are −7 and −1, so we deduce that

$$x_h = c_1 e^{-7t} + c_2 e^{-t}$$

is the general solution to the homogeneous equation. Thus the general solution to the original equation is

$$x = x_h + x_p = c_1 e^{-7t} + c_2 e^{-t} + \frac{1}{3}e^{2t}.$$

Make sure you understand why the first two terms have a parameter and why the third does not.
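One way to convince yourself of the last point is to substitute the general solution back into the equation for several values of the parameters. The Python sketch below does this with hand-computed derivatives; the function names and sample values are chosen only for illustration.

```python
import math


def x(t, c1, c2):
    return c1 * math.exp(-7 * t) + c2 * math.exp(-t) + math.exp(2 * t) / 3


def dx(t, c1, c2):
    return -7 * c1 * math.exp(-7 * t) - c2 * math.exp(-t) + 2 * math.exp(2 * t) / 3


def d2x(t, c1, c2):
    return 49 * c1 * math.exp(-7 * t) + c2 * math.exp(-t) + 4 * math.exp(2 * t) / 3


def residual(t, c1, c2):
    """x'' + 8x' + 7x - 9 e^{2t}; zero when x solves the driven equation."""
    return d2x(t, c1, c2) + 8 * dx(t, c1, c2) + 7 * x(t, c1, c2) - 9 * math.exp(2 * t)


for c1, c2 in [(0.0, 0.0), (1.0, -2.0), (5.0, 3.0)]:
    print((c1, c2), residual(0.5, c1, c2))
```

The residual vanishes (up to rounding) whatever c1 and c2 are, because those terms solve the homogeneous equation. Changing the coefficient 1/3 of $e^{2t}$, on the other hand, would leave a nonzero residual, which is why the third term carries no parameter.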

We can try this same approach to the general form

$$mx'' + bx' + kx = Be^{at},$$

where B and a are constants. Again, we use the method of optimism, and try a solution of the form $x(t) = Ae^{at}$, A being an unknown constant. Substituting, we find

$$mx'' + bx' + kx = m \cdot a^2 Ae^{at} + b \cdot aAe^{at} + k \cdot Ae^{at} = (ma^2 + ba + k)Ae^{at}.$$

Thus to be a solution, we must set

$$A = \frac{B}{ma^2 + ba + k}.$$

Notice that the denominator in this expression can be written succinctly as just p(a), where p is the characteristic polynomial we saw in the context of the homogeneous equation. We have

Exponential Response Formula (ERF). Consider the second order equation

$$mx'' + bx' + kx = Be^{at},$$

and let $p(r) = mr^2 + br + k$ be its characteristic polynomial. Then

$$x(t) = \frac{Be^{at}}{p(a)}$$

is a particular solution, as long as $p(a) \neq 0$.

(You might worry about the restriction $p(a) \neq 0$, and you should. We’ll come back to that shortly.) This formula works essentially unchanged for higher order equations too; we’ll see that in a future session.

Don’t forget that this only gives a single particular solution. For the general solution, you must still solve the associated homogeneous problem and then apply Superposition II.
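The ERF is a one-line computation once p(a) is available. The following Python sketch evaluates the characteristic polynomial by Horner's rule and returns the constant B/p(a); the function names `char_poly` and `erf_amplitude` are hypothetical, and Python's built-in complex numbers let the same code handle complex a.

```python
def char_poly(coeffs, s):
    """Evaluate p(s) by Horner's rule.

    coeffs lists the coefficients from the highest power down,
    e.g. [m, b, k] for p(s) = m s^2 + b s + k.
    """
    result = 0
    for c in coeffs:
        result = result * s + c
    return result


def erf_amplitude(coeffs, B, a):
    """Return A = B / p(a), so that x_p(t) = A e^{a t} (requires p(a) != 0)."""
    p_a = char_poly(coeffs, a)
    if p_a == 0:
        raise ZeroDivisionError("p(a) = 0: the ERF does not apply here")
    return B / p_a


# The worked example x'' + 8x' + 7x = 9 e^{2t}: p(2) = 27, so A = 1/3.
print(erf_amplitude([1, 8, 7], 9, 2))

# A complex exponent works the same way: p(2i) = 3 + 16i.
print(erf_amplitude([1, 8, 7], 9, 2j))
```

The function returns only the constant in front of $e^{at}$; the homogeneous solution still has to be added separately to get the general solution.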



Sinusoidal Input

The exponential response formula works perfectly even if the number a in the exponential is complex. Let’s use this to solve problems with a sinusoidal driving.

Example. Find the general solution to

$$x'' + 8x' + 7x = 9\cos(2t).$$

We begin by using complex replacement and considering instead the equation

$$z'' + 8z' + 7z = 9e^{2it}. \qquad (1)$$

Now we can apply the exponential response formula to obtain as a particular solution,

$$z_p(t) = \frac{9}{p(2i)}e^{2it} = \frac{9}{(2i)^2 + 16i + 7}e^{2it} = \frac{9}{3 + 16i}e^{2it}.$$

Be careful with signs when you do these calculations! Remember i2 = −1.

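To get a particular solution to the original equation, we must take the real part of $z_p$. We prefer the solution in amplitude-phase form, so we write

$$3 + 16i = \sqrt{265}\,e^{i\phi}, \quad \text{where } \phi = \tan^{-1}(16/3).$$

Thus

$$x_p(t) = \operatorname{Re}(z_p(t)) = \frac{9}{\sqrt{265}}\cos(2t - \phi).$$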

To get the general solution we must add the general solution of the homogeneous problem, which we already saw:

xh(t) = c1e−7t + c2e−t .

Thus we obtain the general solution

$$x = x_p + x_h = c_1 e^{-7t} + c_2 e^{-t} + \frac{9}{\sqrt{265}}\cos(2t - \phi).$$

Notice that from the exponential response formula, the amplitude of the particular solution is given by

$$A = \frac{B}{|p(a)|}.$$

The ratio of the resulting amplitude of the solution to the amplitude B of the driving force is called the gain. So the gain is given by the formula

gain = 1/|p(a)|.

Let’s apply the above sequence of steps to the general case of a sinusoidal driving:

$$mx'' + bx' + kx = B\cos(\omega t).$$

The complexified equation is

$$mz'' + bz' + kz = Be^{i\omega t}.$$

From the exponential response formula with a = iω, a particular solution is

$$z_p = \frac{B}{p(i\omega)}e^{i\omega t}.$$

Converting to polar form and then taking the real part, we get

$$x_p = \frac{B}{|p(i\omega)|}\cos(\omega t - \phi),$$

where φ = arg(p(iω)). Notice that since a = iω, the gain is given by

1/|p(a)| = 1/|p(iω)|.
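The amplitude and phase of the sinusoidal response can be read off numerically from the complex number p(iω). The Python sketch below does this with the standard `cmath` module; the function name `sinusoidal_response` is made up for this illustration, and the sample coefficients are those of the example $x'' + 8x' + 7x = 9\cos(2t)$.

```python
import cmath
import math


def sinusoidal_response(m, b, k, B, omega):
    """Return (amplitude, phi) with x_p(t) = amplitude * cos(omega t - phi).

    Solves m x'' + b x' + k x = B cos(omega t) via the ERF with a = i*omega.
    Requires p(i omega) != 0.
    """
    p = m * (1j * omega) ** 2 + b * (1j * omega) + k  # p(i omega)
    if p == 0:
        raise ZeroDivisionError("p(i omega) = 0: the ERF does not apply")
    gain = 1 / abs(p)
    phi = cmath.phase(p)  # phi = arg p(i omega)
    return B * gain, phi


amplitude, phi = sinusoidal_response(1, 8, 7, 9, 2)
print("amplitude:", amplitude, "expected 9/sqrt(265) =", 9 / math.sqrt(265))
print("phase lag:", phi, "expected atan(16/3) =", math.atan2(16, 3))
```

Note that `cmath.phase` returns an angle in (−π, π]; for b ≥ 0 and ω > 0 the imaginary part of p(iω) is bω ≥ 0, so the returned phase lies between 0 and π.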



Generalized Exponential Response Formula

We can also solve an LTI DE $p(D)x = q(t)$ with exponential input $q(t) = Be^{at}$ even when $p(a) = 0$. The answer is given by the following generalized Exponential Response Formula (the proof of which we postpone to the session on Linear Operators).

Generalized Exponential Response Formula. Let p(D) be a polynomial operator with constant coefficients, and $p^{(s)}$ its s-th derivative. Then

$$p(D)x = Be^{at}, \quad \text{where } a \text{ is real or complex}$$

has the particular solution

$$x_p = \begin{cases}
\dfrac{Be^{at}}{p(a)} & \text{if } p(a) \neq 0 \\[2ex]
\dfrac{Bte^{at}}{p'(a)} & \text{if } p(a) = 0 \text{ and } p'(a) \neq 0 \\[2ex]
\dfrac{Bt^2e^{at}}{p''(a)} & \text{if } p(a) = p'(a) = 0 \text{ and } p''(a) \neq 0 \\[1ex]
\quad\vdots \\[1ex]
\dfrac{Bt^se^{at}}{p^{(s)}(a)} & \text{if } a \text{ is an } s\text{-fold zero}
\end{cases}$$

Note: Later when we cover resonance the case $p(a) = 0$, $p'(a) \neq 0$ will be called the Resonant Response Formula.

Example 1. Find a particular solution to the equation

$$\ddot{x} + 8\dot{x} + 15x = e^{-5t}.$$

Solution. The characteristic polynomial is $p(r) = r^2 + 8r + 15$. Since $p(-5) = 0$ we need to use the generalized ERF. Computing $p'(r) = 2r + 8$ gives $p'(-5) = -2$. Therefore the generalized ERF gives

$$x_p = \frac{te^{-5t}}{p'(-5)} = -\frac{te^{-5t}}{2}.$$

Example 2. Find a particular solution to

$$\ddot{x} + 2\dot{x} + 2x = e^{-t}\cos t.$$

Solution. First we complexify the equation

$$\ddot{z} + 2\dot{z} + 2z = e^{(-1+i)t}, \quad \text{where } x = \operatorname{Re}(z).$$

The characteristic polynomial is $p(r) = r^2 + 2r + 2$. Computing,

$$p(-1 + i) = (-1 + i)^2 + 2(-1 + i) + 2 = 0, \quad p'(r) = 2r + 2, \quad p'(-1 + i) = 2i.$$

Since $p(-1 + i) = 0$ we use the generalized ERF

$$z_p = \frac{te^{(-1+i)t}}{p'(-1 + i)} = \frac{te^{(-1+i)t}}{2i} = \frac{te^{-t}(\cos t + i\sin t)}{2i}.$$

Finally we take the real part to get

$$x_p = \operatorname{Re}(z_p) = \frac{te^{-t}\sin t}{2}.$$
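The case analysis in the generalized ERF amounts to differentiating p until the value at a is nonzero. The Python sketch below implements that search for a polynomial given by its coefficients; the function names and the numerical tolerance `tol` (used to decide when a floating-point value counts as zero) are assumptions of this illustration, not part of the formula.

```python
def poly_eval(coeffs, s):
    """Horner evaluation; coeffs run from the highest power down."""
    result = 0
    for c in coeffs:
        result = result * s + c
    return result


def poly_derivative(coeffs):
    """Coefficients of the derivative, again from the highest power down."""
    n = len(coeffs) - 1
    return [c * (n - i) for i, c in enumerate(coeffs[:-1])]


def generalized_erf(coeffs, B, a, tol=1e-12):
    """Return (s, A) such that x_p(t) = A * t**s * e^{a t}.

    s is the smallest order with p^{(s)}(a) != 0 and A = B / p^{(s)}(a).
    """
    s = 0
    current = list(coeffs)
    while current:
        value = poly_eval(current, a)
        if abs(value) > tol:
            return s, B / value
        current = poly_derivative(current)
        s += 1
    raise ValueError("p is the zero polynomial")


# Example 1: x'' + 8x' + 15x = e^{-5t}  ->  x_p = -t e^{-5t} / 2
print(generalized_erf([1, 8, 15], 1, -5))

# Example 2 (complexified): z'' + 2z' + 2z = e^{(-1+i)t}  ->  z_p = t e^{(-1+i)t} / (2i)
print(generalized_erf([1, 2, 2], 1, -1 + 1j))
```

For Example 2 the returned constant is 1/(2i) = −i/2; taking the real part of $z_p$ then gives the answer found above.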



The Simple Harmonic Oscillator

Let’s investigate the case of no damping (b = 0) a bit more carefully:

$$mx'' + kx = F_{\text{ext}}(t).$$

Let’s define the parameter $\omega_n = \sqrt{k/m}$ and use this to rewrite the equation in terms of m and $\omega_n$:

$$m(x'' + \omega_n^2 x) = F_{\text{ext}}(t).$$

We saw this form of the equation in the previous session. Recall that the subscript “n” stands for “natural”. To remind ourselves why, consider the solution in the case of no driving force, i.e. Fext(t) = 0. The characteristic equation $p(r) = m(r^2 + \omega_n^2) = 0$ has roots $\pm i\omega_n$. Thus, the general solution is

$$x_h = c_1\cos(\omega_n t) + c_2\sin(\omega_n t).$$

So even without a driver, if we give the system a nudge it will oscillate at its natural frequency $\omega_n$.

Now let’s add some sinusoidal input: Fext(t) = B cos(ωt). Using complex replacement, we must find a particular solution to

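$$m(z'' + \omega_n^2 z) = Be^{i\omega t}.$$

Applying the exponential response formula with a = iω, we get

$$z_p = \frac{B}{p(i\omega)}e^{i\omega t} = \frac{B}{m(\omega_n^2 - \omega^2)}e^{i\omega t}.$$

Taking the real part,

$$x_p = \operatorname{Re}(z_p) = \frac{B}{m(\omega_n^2 - \omega^2)}\cos(\omega t)$$

is a particular solution. All in all, our general solution is

$$x = c_1\cos(\omega_n t) + c_2\sin(\omega_n t) + \frac{B}{m(\omega_n^2 - \omega^2)}\cos(\omega t).$$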

Again, the gain is given by 1/|p(a)| = 1/|p(iω)|.


Example. Let’s take m = 1, B = 1 and ωn = 2, and investigate the resulting particular solution as we vary ω, the frequency of the driving. The following figure shows the situation for ω = 3, 2.5 and 2.1.

Fig. 1. Solutions for different values of ω (curves for ω = 3, ω = 2.5 and ω = 2.1). What do you think happens as ω approaches the natural frequency?
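The size of the response in this example can be tabulated directly from the formula $B/(m(\omega_n^2 - \omega^2))$. The Python sketch below does so for m = 1, B = 1, $\omega_n$ = 2; the function name `sho_coefficient` and the extra sample frequency 2.01 are additions for illustration.

```python
def sho_coefficient(m, omega_n, B, omega):
    """Signed coefficient B / (m (omega_n^2 - omega^2)) of cos(omega t) in x_p."""
    denom = m * (omega_n**2 - omega**2)
    if denom == 0:
        raise ZeroDivisionError("omega equals omega_n: this formula does not apply")
    return B / denom


for omega in (3.0, 2.5, 2.1, 2.01):
    coeff = sho_coefficient(1.0, 2.0, 1.0, omega)
    print(f"omega = {omega}: coefficient = {coeff:.4f}, amplitude = {abs(coeff):.4f}")
```

The amplitude grows as ω moves toward $\omega_n$ = 2, and the negative sign for ω > $\omega_n$ says the response is a negative multiple of cos(ωt), i.e. opposite in sign to the input.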

It’s no surprise that the solution breaks down when ω = ωn. This situation is called pure resonance and we will investigate it in detail in an upcoming session. Notice that it corresponds to the case p(a) = 0 in the exponential response formula, since p(iωn) = 0. For now, we’ll note that it can be checked that the following is a solution:

$$x_p(t) = \frac{B}{2m\omega}\,t\sin(\omega t).$$

Notice the extra factor of t before the sine term. So the amplitude grows with time, as shown in the following figure.

Fig. 2. Solution at pure resonance.

More generally, the following is the counterpart to the exponential response formula in the pure resonance case, when p(a) = 0.

Resonant Response Formula (RRF). Consider the second order equation

$$mx'' + bx' + kx = Be^{at},$$

with characteristic polynomial p. If $p(a) = 0$ and $p'(a) \neq 0$, then

$$x(t) = \frac{Bte^{at}}{p'(a)}$$

is a particular solution.

Once we develop a small amount of algebraic machinery we will be able to give a simple proof of this formula.



Resonant Response Formula


Exercise. Find the general solution to

$$x'' + 8x' + 7x = 2e^{-t}.$$

Answer. The characteristic polynomial is

$$p(r) = r^2 + 8r + 7.$$

This has roots −7 and −1. Thus p(−1) = 0 and we can’t use the exponential response formula. We must use the resonant response formula instead. Since $p'(r) = 2r + 8$, we have $p'(-1) = 6$, so we get

$$x_p = \frac{2}{p'(-1)}te^{-t} = \frac{1}{3}te^{-t}$$

as a particular solution. The general solution to the associated homogeneous problem is

$$x_h = c_1 e^{-7t} + c_2 e^{-t},$$

and the final solution is

$$x = x_h + x_p = c_1 e^{-7t} + c_2 e^{-t} + \frac{1}{3}te^{-t}.$$
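The particular solution can be checked by direct substitution. The Python sketch below uses hand-computed derivatives of $x_p = \frac{1}{3}te^{-t}$ and evaluates the residual of the equation; the function names are chosen for this illustration.

```python
import math


def xp(t):
    return t * math.exp(-t) / 3


def dxp(t):
    # Product rule: (t e^{-t})' = (1 - t) e^{-t}.
    return (1 - t) * math.exp(-t) / 3


def d2xp(t):
    # ((1 - t) e^{-t})' = (t - 2) e^{-t}.
    return (t - 2) * math.exp(-t) / 3


def residual(t):
    """x_p'' + 8 x_p' + 7 x_p - 2 e^{-t}; zero when x_p solves the equation."""
    return d2xp(t) + 8 * dxp(t) + 7 * xp(t) - 2 * math.exp(-t)


for t in (0.0, 0.5, 2.0):
    print(t, residual(t))
```

The terms proportional to $te^{-t}$ cancel because p(−1) = 0, and what remains is $\frac{1}{3}\,p'(-1)\,e^{-t} = 2e^{-t}$, matching the input.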






Introduction

In this session we will continue to develop the important case of linear constant coefficient DE’s with sinusoidal input.

We will start by defining stability. In a stable system the response to a periodic input will be essentially periodic. The word essentially indicates that there will be some transient behavior depending on the initial conditions, but this will die away over time.

For constant coefficient equations the important fact is that stability is equivalent to all the characteristic roots being negative, or if they are complex having negative real part. This will turn out to be a simple consequence of the fact that if a is negative then $e^{at}$ goes to 0 as t grows to infinity.

The other main goal of this session is to introduce the operator D and the notation p(D). We will use this to rephrase constant coefficient differential equations and to write elegant formulas for the gain, phase lag and response to sinusoidal input.






Stability

1. The Notion of Stability

A system is called stable if its long-term behavior does not depend significantly on the initial conditions.

It is an important result of mechanics that any system of masses connected by springs (damped or undamped) is a stable system. In network theory, there is a similar result: any RLC-network gives a stable system. In these notes, we investigate for the simplest such systems why this is so.

In terms of differential equations, the simplest spring-mass system or RLC-circuit is represented by an ODE of the form

$$a_0 y'' + a_1 y' + a_2 y = r(t), \quad a_i \text{ constants}, \; t = \text{time}. \qquad (1)$$

For the spring-mass system, y is the displacement from equilibrium position, and r(t) is the externally applied force.

For the RLC-circuit, y represents the charge on the capacitor, and r(t) is the electromotive force E(t) applied to the circuit (or else y is the current and r(t) = E′).

By the theory of inhomogeneous equations, the general solution to (1) has the form

y = c1y1 + c2y2 + yp, c1, c2 arbitrary constants, (2)

where yp is a particular solution to (1), and c1y1 + c2y2 is the complementary function, i.e., the general solution to the associated homogeneous equation (the one having r(t) = 0).

The initial conditions determine the exact values of c1 and c2. So from (2),

$$\text{the system modeled by (1) is stable} \iff \text{for every choice of } c_1, c_2, \;\; c_1 y_1 + c_2 y_2 \to 0 \text{ as } t \to \infty. \qquad (3)$$

Often one applies the term stable to the ODE (1) itself, as well as to the system it models. We shall do this here.

If the ODE (1) is stable, the two parts of the solution (2) are named:

yp = steady-state solution c1y1 + c2y2 = transient; (4)


the whole solution y(t) and the right side r(t) of (1) are described by the terms

y(t) = response r(t) = input.

From this point of view, the driving force is viewed as the input to the spring-mass system, and the resulting motion of the mass is thought of as the response of the system to the input. So what (2) and (4) are saying is that this response is the sum of two terms: a transient term, which depends on the initial conditions, but whose effects disappear over time; and a steady-state term, which represents more and more closely the response of the system as time goes to ∞, no matter what the initial conditions are.

2. Conditions for Stability: Second Order Equations

We now ask under what circumstances the ODE (1) will be stable. In view of the definition, together with (2) and (3), we see that stability concerns just the behavior of the solutions to the associated homogeneous equation

$$a_0 y'' + a_1 y' + a_2 y = 0; \qquad (5)$$

the forcing term r(t) plays no role in deciding whether or not (1) is stable.

There are three cases to be considered in studying the stability of (5); they are summarized in the table below, and based on the roots of the characteristic equation

$$a_0 r^2 + a_1 r + a_2 = 0. \qquad (6)$$

roots              solution to ODE                              condition for stability

$r_1 \neq r_2$     $c_1 e^{r_1 t} + c_2 e^{r_2 t}$              $r_1 < 0$, $r_2 < 0$

$r_1 = r_2$        $e^{r_1 t}(c_1 + c_2 t)$                     $r_1 < 0$

$a \pm ib$         $e^{at}(c_1\cos bt + c_2\sin bt)$            $a < 0$

The first two columns of the table should be familiar, from your work in solving the linear second-order equation (5) with constant coefficients. Let us consider the third column, therefore. In each case, we want to show that if the condition given in the third column holds, then the criterion (3) for stability will be satisfied.

Consider the first case. If $r_1 < 0$ and $r_2 < 0$, then it is immediate that the solution given tends to 0 as t → ∞.

On the other hand, if say $r_1 \geq 0$, then the solution $e^{r_1 t}$ tends to ∞ (or to 1 if $r_1 = 0$). This shows the ODE (5) is not stable, since not all solutions tend to 0 as t → ∞.

In the second case, the reasoning is the same, except that here we are using the limit

$$\lim_{t\to\infty} te^{rt} = 0 \iff r < 0.$$

For the third case, the relevant limits are (assuming b ≠ 0 for the second limit):

$$\lim_{t\to\infty} e^{at}\cos bt = 0 \iff a < 0, \qquad \lim_{t\to\infty} e^{at}\sin bt = 0 \iff a < 0.$$

The three cases can be summarized conveniently by one statement:

Stability criterion for second-order ODE’s — root form

$$a_0 y'' + a_1 y' + a_2 y = r(t) \text{ is stable} \iff \text{all roots of } a_0 r^2 + a_1 r + a_2 = 0 \text{ have negative real part.} \qquad (7)$$

Alternatively, one can phrase the criterion in terms of the coefficients of the ODE; this is convenient, since it doesn’t require you to calculate the roots of the characteristic equation.

Stability criterion for second order ODE’s — coefficient form. Assume a0 > 0.

$$a_0 y'' + a_1 y' + a_2 y = r(t) \text{ is stable} \iff a_0, a_1, a_2 > 0. \qquad (8)$$

The proof is left as an exercise; it is based on the quadratic formula.
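The two forms of the criterion can be compared numerically. The Python sketch below computes the roots with the quadratic formula (using `cmath` so that complex roots are handled) and checks that the root form (7) and the coefficient form (8) agree on a few sample equations; the function names and the sample coefficients are chosen here for illustration.

```python
import cmath


def char_roots(a0, a1, a2):
    """Roots of a0 r^2 + a1 r + a2 = 0 via the quadratic formula (a0 != 0)."""
    disc = cmath.sqrt(a1 * a1 - 4 * a0 * a2)
    return (-a1 + disc) / (2 * a0), (-a1 - disc) / (2 * a0)


def stable_by_roots(a0, a1, a2):
    """Root form (7): every root has negative real part."""
    return all(r.real < 0 for r in char_roots(a0, a1, a2))


def stable_by_coefficients(a0, a1, a2):
    """Coefficient form (8), which assumes a0 > 0."""
    return a0 > 0 and a1 > 0 and a2 > 0


for coeffs in [(1, 8, 7), (1, 2, 2), (1, 0, 4), (1, -1, 2), (1, 3, -4)]:
    print(coeffs, stable_by_roots(*coeffs), stable_by_coefficients(*coeffs))
```

The undamped case (1, 0, 4) has roots ±2i with real part zero, so it fails both tests. With floating-point arithmetic, a root whose real part is extremely close to zero could be misclassified by the root test, which is one practical reason to prefer the coefficient form.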

3. Stability of Higher Order ODE’s

The stability criterion in the root form (7) also applies to higher-order ODE’s with constant coefficients:

$$(a_0 D^n + a_1 D^{n-1} + \ldots + a_{n-1}D + a_n)\, y = f(t). \qquad (9)$$

These model more complicated spring-mass systems and multi-loop RLC circuits. The characteristic equation of the associated homogeneous equation is

$$a_0 r^n + a_1 r^{n-1} + \ldots + a_{n-1}r + a_n = 0. \qquad (10)$$

The real and complex roots of the characteristic equation give rise to solutions to the associated homogeneous equation just as they do for second order equations. (For a k-fold repeated root, one gets additional solutions by multiplying by $1, t, t^2, \ldots, t^{k-1}$.)



The reasoning which led to the above stability criterion for second-order equations applies to higher-order equations just as well. The end result is the same:

Stability criterion for higher-order ODE’s — root form

$$\text{ODE (9) is stable} \iff \text{all roots of (10) have negative real parts;} \qquad (11)$$

that is, all the real roots are negative, and all the complex roots have negative real part.

There is a stability criterion for higher-order ODE’s which uses just the coefficients of the equation, but it is not so simple as the one (8) for second-order equations. We will not use this in the course, but it is worth seeing. The key point is that the stability of the system can be found without finding the roots of a higher order polynomial.

Without loss of generality, we may assume that a0 > 0. Then it is not hard to prove that

$$\text{ODE (9) is stable} \implies a_0, \ldots, a_n > 0. \qquad (12)$$

The converse is not true. For an implication ⇐, the coefficients must satisfy a more complicated set of inequalities, which we give without proof, known as the

Routh-Hurwitz conditions for stability. Assume $a_0 > 0$; ODE (9) is stable ⇔ in the determinant below, all of the n principal minors (i.e., the subdeterminants in the upper left corner having sizes respectively 1, 2, . . . , n) are > 0 when evaluated.

$$\begin{vmatrix}
a_1 & a_0 & 0 & 0 & 0 & 0 & \ldots & 0 \\
a_3 & a_2 & a_1 & a_0 & 0 & 0 & \ldots & 0 \\
a_5 & a_4 & a_3 & a_2 & a_1 & a_0 & \ldots & 0 \\
\vdots & \vdots & \vdots & \vdots & \vdots & \vdots & & \vdots \\
a_{2n-1} & a_{2n-2} & a_{2n-3} & a_{2n-4} & \ldots & \ldots & \ldots & a_n
\end{vmatrix} \qquad (13)$$

In the determinant, we define ak = 0 if k > n; thus for example, the last row always has just one non-zero entry, an.
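To make the Routh-Hurwitz conditions concrete, the Python sketch below builds the n × n array in (13) from the coefficient list and evaluates its leading principal minors exactly with `fractions.Fraction`. All function names are invented for this illustration, and the determinant routine is a plain Gaussian elimination written for clarity rather than efficiency. Entries with index below 0 or above n are taken to be 0, as in the array.

```python
from fractions import Fraction


def hurwitz_matrix(a):
    """n x n array of (13) for a = [a0, a1, ..., an]; entry (i, j) is a_{2i-j}."""
    n = len(a) - 1

    def coeff(k):
        return Fraction(a[k]) if 0 <= k <= n else Fraction(0)

    return [[coeff(2 * i - j) for j in range(1, n + 1)] for i in range(1, n + 1)]


def determinant(matrix):
    """Determinant by Gaussian elimination with exact Fraction arithmetic."""
    m = [row[:] for row in matrix]
    size = len(m)
    det = Fraction(1)
    for col in range(size):
        pivot = next((r for r in range(col, size) if m[r][col] != 0), None)
        if pivot is None:
            return Fraction(0)
        if pivot != col:
            m[col], m[pivot] = m[pivot], m[col]
            det = -det  # a row swap flips the sign
        det *= m[col][col]
        for r in range(col + 1, size):
            factor = m[r][col] / m[col][col]
            for c in range(col, size):
                m[r][c] -= factor * m[col][c]
    return det


def leading_minors(a):
    """Upper-left subdeterminants of sizes 1, 2, ..., n."""
    h = hurwitz_matrix(a)
    return [determinant([row[:k] for row in h[:k]]) for k in range(1, len(h) + 1)]


def routh_hurwitz_stable(a):
    if a[0] <= 0:
        raise ValueError("the criterion assumes a0 > 0")
    return all(minor > 0 for minor in leading_minors(a))


print(leading_minors([1, 2, 3, 1]), routh_hurwitz_stable([1, 2, 3, 1]))
print(leading_minors([1, 1, 1, 2]), routh_hurwitz_stable([1, 1, 1, 2]))
```

The second sample, $r^3 + r^2 + r + 2$, has all coefficients positive and yet fails the test, which illustrates that the converse of (12) does not hold.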



p(D) Notation

We start by recalling the basic results we have developed so far. We are studying solutions x = x(t) of the linear constant coefficient DE

$$a_n x^{(n)} + a_{n-1}x^{(n-1)} + \ldots + a_1 x' + a_0 x = q(t) \qquad (I)$$

with characteristic polynomial

$$p(s) = a_n s^n + a_{n-1}s^{n-1} + \ldots + a_1 s + a_0 \qquad (P)$$

and homogeneous case (q = 0)

$$a_n x^{(n)} + a_{n-1}x^{(n-1)} + \ldots + a_1 x' + a_0 x = 0 \qquad (H)$$

Notice that the left-hand sides of (I) and (H) have the same form. For this reason it will be useful to have a more compact notation. This is in fact provided by an important mathematical tool called operators. We will study these in more detail in the session on linear operators. For now, we just note that we can write $D = \frac{d}{dt}$ for the operation of differentiation applied to functions of t, i.e. if x = x(t), then $Dx = \frac{dx}{dt}$, the first derivative of x.

In the same way we can write $D^2 = \frac{d^2}{dt^2}$ for differentiation twice, i.e. $D^2 x = \frac{d^2x}{dt^2}$, the second derivative of x = x(t); similarly $D^3 = \frac{d^3}{dt^3}$ for differentiation three times, and so on. Then if $p(s) = a_n s^n + a_{n-1}s^{n-1} + \ldots + a_1 s + a_0$ is any polynomial, we can write

$$p(D) = a_n D^n + a_{n-1}D^{n-1} + \ldots + a_1 D + a_0.$$

The DE’s (I) and (H) then become the statements

$$p(D)x = q \quad (I) \qquad\qquad p(D)x = 0 \quad (H)$$

respectively – an efficient way to write the DE’s indeed!
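As a small worked instance of the notation (the polynomial is borrowed from the earlier example $x'' + 8x' + 7x$; it is only an illustration), here is what p(D) means for one specific p, and what it does to an exponential:

```latex
% Reading off p(D) from a specific polynomial:
p(s) = s^2 + 8s + 7
\quad\Longrightarrow\quad
p(D)x = (D^2 + 8D + 7)x = x'' + 8x' + 7x .

% Applying p(D) to an exponential pulls out the number p(a),
% since D e^{at} = a e^{at} and D^2 e^{at} = a^2 e^{at}:
p(D)e^{at} = (a^2 + 8a + 7)e^{at} = p(a)\,e^{at} .
```

The second line is the computation behind the Exponential Response Formula: dividing by p(a), when it is nonzero, turns $e^{at}$ into a solution of $p(D)x = e^{at}$.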

Now let’s recall the basics, but with our new operator notation. For the homogeneous case we have the following key theorem.

Transience theorem. All solutions x = x(t) to the linear homogeneous constant coefficient DE

p(D)x = 0 (H)

decay to zero as t → ∞ exactly when all roots r of the characteristic polynomial p(s) have negative real part.

In this case the solutions to (H) are called transients. By superposition, all solutions to (I) then converge to the same solution as t gets large, and we say that the DE is stable.

If we have a system modeled by a stable equation, but we are only interested in what it looks like after the transients have died down, we can ignore the initial condition:

[Diagram: input signal → System → output signal $x_p$ (steady state).]

So in this case we are looking for particular solutions xp. If the input signal is sinusoidal, then we know from the results we obtained in the last session that there will be a particular solution which is also sinusoidal. This is the unique steady state solution which is periodic and it is of particular importance in many applications. Let’s review how it goes and then introduce some useful definitions and terminology that apply to these solutions.

The starting point is the Exponential Response Formula (ERF), which in the operator notation reads

$$p(D)x = Be^{at}$$

and has a solution

$$x_p = \frac{Be^{at}}{p(a)}$$

provided $p(a) \neq 0$.

As we saw in the last session, the ERF and complex replacement can be used to obtain the periodic solution xp to the DE with sinusoidal input, i.e.

p(D)x = B cos(ωt).

This is done as follows. Since B cos(ωt) = B Re(eiωt) we look at the complex equation

p(D)z = Beiωt , so x = Re(z).

The exponential response formula gives

$$z_p = \frac{B}{p(i\omega)}e^{i\omega t} \quad\Longrightarrow\quad x_p = B\operatorname{Re}\left(\frac{e^{i\omega t}}{p(i\omega)}\right).$$

Writing p(iω) in polar form as $|p(i\omega)|e^{i\phi}$, we get

$$x_p = \frac{B}{|p(i\omega)|}\cos(\omega t - \phi),$$

where φ = Arg(p(iω)). The solution $x_p$ is the particular steady-state periodic solution.

Let’s examine the relation between the periodic input q(t) = B cos(ωt) and its periodic output $x_p(t) = \frac{B}{|p(i\omega)|}\cos(\omega t - \phi)$. We see that the amplitude B of the input is scaled and becomes the amplitude $\frac{B}{|p(i\omega)|}$ of the output. We also see that the output sinusoid $x_p(t)$ is shifted by an angle φ = Arg(p(iω)) relative to the input sinusoid q(t).

This motivates the following definitions for a constant coefficient (CC) linear DE

$$p(D)x = q(t)$$

with sinusoidal input q(t).

Definition:

1. The gain is defined to be the ratio of the amplitude of the output sinusoid to the amplitude of the input sinusoid.

2. The phase lag is defined to be the angle by which the output sinusoid is shifted relative to the input sinusoid.

In the special case q(t) = B cos ωt which we solved above, we have that the gain g and the phase lag φ are

$$g = \frac{1}{|p(i\omega)|}, \qquad \phi = \mathrm{Arg}(p(i\omega)).$$

When solving p(D)x = B cos ωt by complex replacement and the ERF we have xp = Re(zp), where zp(t) is the complex solution to $p(D)z_p = Be^{i\omega t}$. That is,

$$z_p = \frac{B}{p(i\omega)}\,e^{i\omega t}.$$

For this reason, we define the complex gain in this case as $1/p(i\omega)$.

Note that the gain and the phase lag depend only on the frequency ω of the input signal (as well as on the system p(D), of course).


Example

Let’s apply what we just learned to a specific example. First, recall the basics. For the real constant coefficient linear DE with sinusoidal input

p(D)x = B cos(ωt)

we have the unique real periodic solution

$$x_p = B\,\mathrm{Re}\left(\frac{e^{i\omega t}}{p(i\omega)}\right) = \frac{B}{|p(i\omega)|}\cos(\omega t - \phi)$$

where φ = Arg(p(iω)). In this case the complex gain is $1/p(i\omega)$, and the phase lag is φ = Arg(p(iω)).

Example. Find the periodic solution to

$$x'' + x' + 2x = \cos t.$$

Solution. $p(s) = s^2 + s + 2$, ω = 1, B = 1.

$$p(i\omega) = p(i) = i^2 + i + 2 = -1 + i + 2 = 1 + i = |1 + i|\,e^{i\phi} = \sqrt{2}\,e^{i\pi/4},$$

since $\phi = \mathrm{Arg}(1 + i) = \tan^{-1}(1/1) = \pi/4$.

Thus, Complex gain $= \dfrac{1}{p(i)} = \dfrac{1}{1 + i}$.

Gain $= \dfrac{1}{|p(i)|} = \dfrac{1}{\sqrt{2}}$.

Phase lag $= \phi = \mathrm{Arg}(p(i)) = \dfrac{\pi}{4}$.

Periodic solution $= x_p = \dfrac{1}{\sqrt{2}}\cos\left(t - \dfrac{\pi}{4}\right)$.

Looking at the output xp in relation to the input signal q(t) = cos t, we see that the amplitude of xp is $\frac{1}{\sqrt{2}}$ × the amplitude of q, so the gain is $\frac{1}{\sqrt{2}}$. We also see that xp lags behind q by π/4 radians, so the phase lag is φ = π/4.
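The numbers in this example can be cross-checked with ordinary complex arithmetic. The sketch below is only an illustration: the helper name `gain_and_phase_lag` is our own and does not come from the notes.

```python
import cmath
import math

def gain_and_phase_lag(p, omega):
    """Return (gain, phase lag) for p(D)x = B cos(omega t).

    `p` is a callable that evaluates the characteristic polynomial p(s).
    """
    value = p(1j * omega)                 # p(i*omega)
    return 1 / abs(value), cmath.phase(value)

# Characteristic polynomial of x'' + x' + 2x = cos t.
def p(s):
    return s**2 + s + 2

gain, phase_lag = gain_and_phase_lag(p, omega=1.0)
print(gain, 1 / math.sqrt(2))   # the two values should agree
print(phase_lag, math.pi / 4)   # the two values should agree
```

`cmath.phase` returns the principal argument, which is the Arg used in the text.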






Damping and Phase Lag

Quiz: Consider the equation

$$\ddot{x} + b\dot{x} + 2x = \cos(t).$$

If the damping constant b starts at 1 and is increased, what happens to the phase lag?

Choices:

a) It increases.

b) It decreases.

c) It stays the same.

Answer: The phase lag increases. The phase lag is the argument of p(i) = 1 + bi. As b increases, the argument increases.






Damping and Amplitude


Quiz: Consider the equation

$$\ddot{x} + b\dot{x} + 2x = \cos(t).$$

If the damping constant b starts at 1 and is increased, what happens to the amplitude of the solution?

Choices:

a) It increases.

b) It decreases.

c) It stays the same.


Answer: The amplitude decreases. The amplitude of the solution is $\left|\frac{1}{p(i)}\right|$. Since |p(i)| = |1 + bi| increases as b increases, the amplitude $\frac{1}{|p(i)|}$ decreases.
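Both quiz answers can be seen numerically by evaluating p(i) = 1 + bi for a few values of b. This is a minimal sketch; the function name and the sample values of b are our own choices.

```python
import cmath

def phase_lag_and_amplitude(b, omega=1.0):
    """Phase lag and amplitude of the periodic solution of
    x'' + b x' + 2x = cos(omega t)."""
    s = 1j * omega
    p = s**2 + b * s + 2          # equals 1 + b*i when omega = 1
    return cmath.phase(p), 1 / abs(p)

damping_values = (1, 2, 4, 8)
results = [phase_lag_and_amplitude(b) for b in damping_values]
for b, (lag, amplitude) in zip(damping_values, results):
    print(f"b = {b}: phase lag = {lag:.4f} rad, amplitude = {amplitude:.4f}")
```

As b grows the printed phase lag rises toward π/2 while the amplitude falls toward 0.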






Introduction

In this section we show how to solve the constant coefficient linear ODE with polynomial input. That is,

p(D)y = q(x), where q(x) is polynomial.

Any function can be approximated in a suitable sense by polynomial functions, and this makes polynomials an important tool. In addition, the technique we will learn, called the method of undetermined coefficients, is a good example of a general class of methods widely used in mathematics, which go as follows: make an intelligent guess as to the form of the solution, leaving as letters any unknowns; plug this “trial solution” into the equation to be solved; and use it to determine the unknown values. Hence the slightly inaccurate name “undetermined coefficients” in this case. No worries, they won’t be undetermined for long!






Polynomial Input: The Method of Undetermined Coefficients

1. The Basic Result

A polynomial is a function of the form

$$q(x) = a_n x^n + a_{n-1}x^{n-1} + \cdots + a_0.$$

The largest k for which $a_k \neq 0$ is the degree of q(x). (The zero function is a polynomial too, but it doesn’t have a degree.)

Note that $q(0) = a_0$ and $q'(0) = a_1$.

Here is the basic fact about the response of an LTI system with characteristic polynomial p(s) to polynomial signals:

Theorem. (Undetermined coefficients) If $p(0) \neq 0$, and q(x) is a polynomial of degree n, then

p(D)y = q(x)

has exactly one solution which is a polynomial, and it is of degree n.

The best way to see this, and to see how to compute this polynomial particular solution, is by examples.

2. The Method of Undetermined Coefficients

Given the linear time invariant (LTI) DE p(D)y = q(x), where q(x) is a polynomial of degree n, the Undetermined Coefficient (UC) solution method, as we discussed in the previous note, is to assume a particular solution of the form yp = h(x), where h(x) is a polynomial of degree n with unknown (“undetermined”) coefficients, and then to find the coefficients by substituting yp into the ODE. It’s important to do the work systematically; we suggest following the format given in the following example.

Example 1. Find a particular solution yp to $y'' + 3y' + 4y = 4x^2 - 2x$.

Solution. Our trial solution is $y_p = Ax^2 + Bx + C$; we format the work as follows. The lines show the successive derivatives; multiply each line by the factor given in the ODE, and add the equations, collecting like powers of x as you go. The fourth line shows the result; the sum on the left takes into account that yp is supposed to be a particular solution to the given ODE.

$$\begin{array}{rrcl} \times\,4 & y_p & = & Ax^2 + Bx + C \\ \times\,3 & y_p' & = & 2Ax + B \\ \times\,1 & y_p'' & = & 2A \\ \hline & 4x^2 - 2x & = & (4A)x^2 + (4B + 6A)x + (4C + 3B + 2A). \end{array}$$

Equating like powers of x in the last line gives the three equations

4A = 4, 4B + 6A = −2, 4C + 3B + 2A = 0;

solving them in order gives A = 1, B = −2, C = 1, so that $y_p = x^2 - 2x + 1$.

Example 2. Solve $y'' + 5y' + 4y = 2x + 3$.

Solution. Guess a trial solution of the form $y_p = Ax + B$ (same degree as input).

Substitute in DE: $y_p'' + 5y_p' + 4y_p = 0 + 5(A) + 4(Ax + B) = 2x + 3$

⇒ 4Ax + (5A + 4B) = 2x + 3.

Equate coefficients: 4A = 2, 5A + 4B = 3.

Triangular system is easy to solve: A = 1/2, B = 1/8

⇒ $y_p = \frac{1}{2}x + \frac{1}{8}$.

Find solution of homogeneous DE: $y'' + 5y' + 4y = 0$.

Char. equation: $r^2 + 5r + 4 = 0$ ⇒ r = −1, −4 ⇒ $y_h = c_1e^{-x} + c_2e^{-4x}$

⇒ general solution to DE: $y = y_p + y_h$.

Example 3. Solve $y'' + 5y' + 4y = x^2 + 3x$.

Solution. Guess a trial solution of the form $y_p = Ax^2 + Bx + C$ (same degree as input). Substitute this into the DE:

$y_p'' + 5y_p' + 4y_p = 2A + 5(2Ax + B) + 4(Ax^2 + Bx + C) = x^2 + 3x$

⇒ $4Ax^2 + (10A + 4B)x + (2A + 5B + 4C) = x^2 + 3x$.

Equate coefficients: 4A = 1, 10A + 4B = 3, 2A + 5B + 4C = 0.

Triangular system is easy to solve: A = 1/4, B = 1/8, C = −9/32

⇒ $y_p = \frac{1}{4}x^2 + \frac{1}{8}x - \frac{9}{32}$.

Use homogeneous solution from previous example to get the general solution to DE: y = yp + yh.
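The triangular systems in Examples 1 to 3 can be solved mechanically by working from the highest power of x downward. The following is a minimal sketch under the theorem’s assumption p(0) ≠ 0; the function name and the coefficient-list conventions are our own, not part of the notes.

```python
from fractions import Fraction
from math import factorial

def polynomial_particular_solution(p, q):
    """Solve p(D)y = q(x) for the polynomial solution y.

    p[j] is the coefficient of D**j (so p[0] = p(0) must be nonzero);
    q[k] is the coefficient of x**k.  Returns y with y[k] the coefficient
    of x**k, as exact fractions.
    """
    if p[0] == 0:
        raise ValueError("p(0) = 0: the trial degree must be bumped up")
    n = len(q) - 1
    y = [Fraction(0)] * (n + 1)
    # Coefficient of x**m in p(D)y is sum_j p[j] * y[m+j] * (m+j)!/m!.
    for m in range(n, -1, -1):
        higher = sum(
            p[j] * y[m + j] * (factorial(m + j) // factorial(m))
            for j in range(1, len(p))
            if m + j <= n
        )
        y[m] = (Fraction(q[m]) - higher) / p[0]
    return y

# Example 1: y'' + 3y' + 4y = 4x^2 - 2x
example1 = polynomial_particular_solution([4, 3, 1], [0, -2, 4])
# Example 3: y'' + 5y' + 4y = x^2 + 3x
example3 = polynomial_particular_solution([4, 5, 1], [0, 3, 1])
print(example1)   # coefficients of 1, x, x^2
print(example3)
```

Exact `Fraction` arithmetic is used so the output can be compared directly with the hand computation.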



What Can Go Wrong

If the homogeneous DE p(D)y = 0 has polynomial solutions, then the polynomial solution of the inhomogeneous DE p(D)y = q will be of higher degree than the degree of q(x). We illustrate with an example.

Example. Solve $y'' + y' = x + 1$.

Try $y_p = Ax + B$ ⇒ 0 + A = x + 1, which can’t be solved.

Problem: in the general form $y'' + ay' + by$ the coefficient b is 0 here, i.e. p(0) = 0.

Fix: bump all degrees up by the order of the lowest derivative: try $y_p = Ax^2 + Bx$.

Substitute: 2A + (2Ax + B) = x + 1.

Equate coefficients: 2Ax + (2A + B) = x + 1 ⇒ A = 1/2, B = 0 ⇒ $y_p = \frac{1}{2}x^2$.

Example. $y''' + 3y'' = x^2 + x$.

Lowest order derivative is 2 ⇒ bump up all degrees by 2. Try $y_p = Ax^4 + Bx^3 + Cx^2$

⇒ $(24Ax + 6B) + 3(12Ax^2 + 6Bx + 2C) = x^2 + x$.

Equate coefficients: 36A = 1, 24A + 18B = 1, 6B + 6C = 0 (we’ll skip the algebra).
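For readers who want the skipped algebra, the sketch below solves the last system in order and then checks both answers of this note by applying the operator to the candidate polynomial. The helper names are our own; polynomials are stored as coefficient lists with entry k multiplying x**k.

```python
from fractions import Fraction as F

def derivative(poly):
    """Derivative of a polynomial given as a coefficient list."""
    return [k * c for k, c in enumerate(poly)][1:] or [F(0)]

def apply_operator(p, poly):
    """Apply p(D) = sum_j p[j] D**j to the polynomial `poly`."""
    result = [F(0)] * len(poly)
    current = list(poly)
    for coeff in p:
        for k, c in enumerate(current):
            result[k] += coeff * c
        current = derivative(current)
    return result

# First example: y'' + y' = x + 1 with y_p = x^2 / 2.
check1 = apply_operator([0, 1, 1], [F(0), F(0), F(1, 2)])

# Second example: solve 36A = 1, 24A + 18B = 1, 6B + 6C = 0 in order.
A = F(1, 36)
B = (1 - 24 * A) / 18
C = -B
check2 = apply_operator([0, 0, 3, 1], [F(0), F(0), C, B, A])

print(A, B, C)
print(check1)   # coefficients of x + 1, padded with a zero
print(check2)   # coefficients of x^2 + x, padded with zeros
```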






Solutions to Polynomial Input

Quiz: Which of the following are true about the differential equation $3x^{(4)} + 2x^{(3)} + x'' - x' + 4x = 2t^2 + 1$?

Choices:

a) It has no polynomial solutions.

b) It has exactly one polynomial solution.

c) It has many polynomial solutions.

d) All its solutions are polynomials.

e) We can’t say from the information given.

Answer: The answer is b. The method of undetermined coefficients says there will be a particular solution of the form $x_p = At^2 + Bt + C$. Therefore there is at least one polynomial solution.

The general solution is of the form x = xp + xh, where xh is a homogeneous solution. Since 0 is not a root of the characteristic equation, every (nonzero) homogeneous solution is a combination of exponentials and/or sinusoidal functions. Therefore x is a polynomial only for the case xh = 0. That is, xp is the only polynomial solution.

By the way, $x_p = \frac{1}{2}t^2 + \frac{1}{4}t + \frac{1}{16}$.






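Solutions to Polynomial Input 2

Quiz: Which of the following are true about the differential equation $3x^{(4)} + 2x^{(3)} + x'' = 2t^2 + 1$?

Choices:

a) It has no polynomial solutions.

b) It has exactly one polynomial solution.

c) It has many polynomial solutions.

d) All its solutions are polynomials.

e) We can’t say from the information given.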






Answer: The answer is c. Because the lowest-order derivative in the differential operator is 2, the method of undetermined coefficients says we should look for a particular solution of the form $x_p = At^4 + Bt^3 + Ct^2$. Therefore there is at least one polynomial solution.

But, for any D, E the function Dt + E is a homogeneous solution. (You can see this directly or because 0 is a double root of the characteristic equation.) Thus, there are lots of polynomial solutions.

Since there are nonzero roots of the characteristic equation, not every solution is a polynomial.

By the way, $x_p = \frac{1}{6}t^4 - \frac{4}{3}t^3 + \frac{5}{2}t^2$.






Introduction

In this session we introduce the concept of an operator and see how operators work in general. Then we specialize to the case of differential operators and show how they can be used to simplify the notation (as we already previewed in the session on Gain & Phase Lag) and the calculations used in solving linear constant-coefficient DE’s.






Operators

Operators are to functions as functions are to numbers. An operator takes a function, does something to it, and returns this modified function. There are lots of examples of operators around:

—The shift-by-a operator (where a is a number) takes as input a function f (t) and gives as output the function f (t − a). This operator shifts graphs to the right by a units.

—The multiply-by-h(t) operator (where h(t) is a function) multiplies by h(t): it takes as input the function f (t) and gives as output the function h(t) f (t).

You can go on to invent many other operators. In this course the most important operator is the differentiation operator, which carries a function f(t) to its derivative f′(t).

The differentiation operator is usually denoted by the letter D; so Df(t) is the function f′(t). D carries f to f′. For example, $Dt^3 = 3t^2$. The left side is usually read as “D applied to $t^3$.”

The identity operator takes an input function f(t) and returns the same function, f(t); it does nothing, but it still gets a symbol, I: If = f.

Operators can be added and multiplied by numbers or more generally by functions. Thus tD + 4I is the operator sending f(t) to tf′(t) + 4f(t).

The single most important concept associated with operators is that they can be composed with each other. Composition of two operators in a given order means that the two operators are applied to a function one after the other. For example, $D^2$, the second-derivative operator, means differentiation twice, sending f(t) to f″(t). It is in fact the composition of D with itself: $D^2 = D \cdot D$, so that $D^2f = D(Df) = D(f') = f''$.
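In programming terms an operator is a higher-order function: it takes a function and returns a function. The sketch below illustrates the operators named above. All names are our own, and D is only approximated by a central difference, so its values are close to the exact derivatives but not exact.

```python
def shift_by(a):
    """Shift-by-a operator: f(t) -> f(t - a)."""
    return lambda f: (lambda t: f(t - a))

def multiply_by(h):
    """Multiply-by-h(t) operator: f(t) -> h(t) f(t)."""
    return lambda f: (lambda t: h(t) * f(t))

def D(f, step=1e-4):
    """Differentiation operator, approximated by a central difference."""
    return lambda t: (f(t + step) - f(t - step)) / (2 * step)

def identity(f):
    """Identity operator I."""
    return f

def add(op1, op2):
    return lambda f: (lambda t: op1(f)(t) + op2(f)(t))

def scale(c, op):
    return lambda f: (lambda t: c * op(f)(t))

def compose(op1, op2):
    """Composition: apply op2 first, then op1."""
    return lambda f: op1(op2(f))

def cube(t):
    return t**3

tD_plus_4I = add(compose(multiply_by(lambda t: t), D), scale(4, identity))
D2 = compose(D, D)

print(shift_by(2)(cube)(5))     # cube evaluated at 5 - 2 = 3
print(tD_plus_4I(cube)(2.0))    # close to t*3t^2 + 4t^3 = 56 at t = 2
print(D2(cube)(2.0))            # close to 6t = 12 at t = 2
```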






Linear Differential Operators With Constant Coefficients

The general linear ODE of order n for a function y = y(t) can be written as

$$y^{(n)} + p_1(t)y^{(n-1)} + \cdots + p_n(t)y = q(t). \quad (1)$$

From now on we will consider only the case where (1) has constant coefficients. This type of ODE can be written as

$$y^{(n)} + a_1y^{(n-1)} + \cdots + a_ny = q(t) \quad (2)$$

or, as we have seen, much more compactly using the differentiation operator $D = \frac{d}{dt}$:

$$p(D)\,y = q(t),$$

where

$$p(D) = D^n + a_1D^{n-1} + \cdots + a_n. \quad (3)$$

We call p(D) a polynomial differential operator with constant coefficients. We think of the formal polynomial p(D) as operating on a function y(t), converting it into another function; it is like a black box, in which the function y(t) goes in, and p(D)y (i.e., the left side of (2)) comes out.

[Figure: black box labeled p(D), with input y and output p(D)y.]

The reason for introducing the polynomial operator p(D) is that this allows us to use polynomial algebra to simplify, streamline and extend our calculations for solving CC DE’s. Throughout this session we use the notation of equation (4):

$$p(D) = D^n + a_1D^{n-1} + \cdots + a_n, \quad a_i \text{ constants.} \quad (4)$$






Operator Rules

Our work with these differential operators will be based on several rules they satisfy. In stating these rules, we will always assume that the functions involved are sufficiently differentiable, so that the operators can be applied to them.

Sum rule. If p(D) and q(D) are polynomial operators, then for any (sufficiently differentiable) function u,

[p(D) + q(D)]u = p(D)u + q(D)u . (1)

Linearity rule. If f and g are functions and c1 and c2 are constants,

p(D) (c1 f + c2 g) = c1 p(D) f + c2 p(D) g . (2)

Proof of the linearity rule: This rule follows from the linearity of differentiation. That is,

$$D(c_1f + c_2g) = (c_1f + c_2g)' = c_1f' + c_2g' = c_1Df + c_2Dg.$$

Similarly, taking the second or higher derivative also follows the linearity rule. That is,

$$D^n(c_1f + c_2g) = \frac{d^n}{dt^n}(c_1f + c_2g) = c_1f^{(n)} + c_2g^{(n)} = c_1D^nf + c_2D^ng.$$

Next, we can scale the linear operator $D^n$ by a and it stays linear. That is,

$$aD^n(c_1f + c_2g) = a\frac{d^n}{dt^n}(c_1f + c_2g) = c_1af^{(n)} + c_2ag^{(n)} = c_1aD^nf + c_2aD^ng.$$

(Notice that a does not actually have to be a constant; it can be a function of t, or of whatever independent variable we’re using.)

Finally we can combine these operators into a polynomial operator

$$D^n + a_1D^{n-1} + \cdots + a_{n-1}D + a_n$$

which clearly still obeys the linearity rule. □

Multiplication rule. If p(D) = g(D) h(D) as polynomials in D, then

$$p(D)\,u = g(D)\bigl(h(D)\,u\bigr). \quad (3)$$

The picture illustrates the meaning of the right side of (3). The property is true when h(D) is the simple operator $aD^k$, essentially because

$$D^m(aD^ku) = aD^{m+k}u.$$

It extends to general polynomial operators h(D) by linearity. Note that here a must be a constant; it’s false otherwise.

[Figure: u → h(D) → h(D)u → g(D) → p(D)u]

An important corollary of the multiplication property is that polynomial operators with constant coefficients commute; i.e., for every function u(t),

g(D) h(D) u = h(D) g(D) u . (4)

As polynomials, g(D)h(D) = h(D)g(D) = p(D) therefore by the multiplication rule, both sides of (4) are equal to p(D) u and therefore equal to each other.

The remaining two rules are of a different type and are more concrete: they tell us how polynomial operators behave when applied to exponential functions and products involving exponential functions.

Substitution rule.

$$p(D)e^{at} = p(a)e^{at} \quad (5)$$

Proof. We have, by repeated differentiation,

$$De^{at} = ae^{at}, \quad D^2e^{at} = a^2e^{at}, \quad \ldots, \quad D^ke^{at} = a^ke^{at};$$

therefore,

$$(D^n + c_1D^{n-1} + \cdots + c_n)\,e^{at} = (a^n + c_1a^{n-1} + \cdots + c_n)\,e^{at},$$

which is the substitution rule (5). □

The exponential-shift rule. This handles expressions such as $t^ke^{at}$ and $t^k\sin at$. Let u = u(t). Then

$$p(D)\,e^{at}u = e^{at}\,p(D + a)\,u. \quad (6)$$

Proof. We prove it in successive stages. First, it is true when p(D) = D, since by the product rule for differentiation,

$$De^{at}u(t) = e^{at}Du(t) + ae^{at}u(t) = e^{at}(D + a)u(t). \quad (7)$$

To show the rule is true for $D^k$, we apply (7) to D repeatedly:

$$\begin{aligned} D^2e^{at}u &= D(De^{at}u) = D\bigl(e^{at}(D + a)u\bigr), && \text{by (7);} \\ &= e^{at}(D + a)\bigl((D + a)u\bigr), && \text{by (7);} \\ &= e^{at}(D + a)^2u, && \text{by (3).} \end{aligned}$$

In the same way,

$$\begin{aligned} D^3e^{at}u &= D(D^2e^{at}u) = D\bigl(e^{at}(D + a)^2u\bigr), && \text{by the above;} \\ &= e^{at}(D + a)\bigl((D + a)^2u\bigr), && \text{by (7);} \\ &= e^{at}(D + a)^3u, && \text{by (3),} \end{aligned}$$

and so on. This shows that (6) is true for an operator of the form $D^k$. To show it is true for a general operator

$$p(D) = D^n + a_1D^{n-1} + \cdots + a_n,$$

we write (6) for each $D^k(e^{at}u)$, multiply both sides by the coefficient of $D^k$ in p(D), and add up the resulting equations for the different values of k. □
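The exponential-shift rule can be checked on a concrete case when u is a polynomial. In the sketch below (names and sample values are our own) a function $e^{at}u(t)$ is represented by the coefficient list of u alone. The left side of (6) is computed by repeatedly using (7), the right side by expanding p(s + a) with the binomial theorem and differentiating u directly.

```python
from math import comb

def derivative(u):
    """Derivative of a polynomial given as a coefficient list."""
    return [k * c for k, c in enumerate(u)][1:] or [0]

def add(u, v):
    """Sum of two coefficient lists of possibly different length."""
    n = max(len(u), len(v))
    u = u + [0] * (n - len(u))
    v = v + [0] * (n - len(v))
    return [x + y for x, y in zip(u, v)]

def apply_operator(p, u, step):
    """Return sum_j p[j] * step^j(u), where `step` plays the role of D."""
    total = [0]
    current = list(u)
    for coeff in p:
        total = add(total, [coeff * c for c in current])
        current = step(current)
    return total

def shift_polynomial(p, a):
    """Coefficients of p(s + a), from the binomial theorem."""
    shifted = [0] * len(p)
    for j, coeff in enumerate(p):
        for i in range(j + 1):
            shifted[i] += coeff * comb(j, i) * a ** (j - i)
    return shifted

a = 2
p = [4, 3, 1]    # p(D) = D^2 + 3D + 4, with p[j] multiplying D**j
u = [1, 0, 5]    # u(t) = 5t^2 + 1

# Left side of (6): D acts on e^{at}w as e^{at}(w' + a w), by (7).
lhs = apply_operator(p, u, lambda w: add(derivative(w), [a * c for c in w]))
# Right side of (6): p(D + a) acting on u alone.
rhs = apply_operator(shift_polynomial(p, a), u, derivative)
print(lhs, rhs)  # polynomial factors multiplying e^{at}; they should match
```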



Example

Remark on complex numbers. As we saw in the session on Complex Arithmetic and Exponentials in Unit I, the formula

$$D\,(c\,e^{at}) = c\,a\,e^{at} \quad (*)$$

remains true even when c and a are complex numbers. Therefore the rules and arguments above remain valid even when the exponents and coefficients are complex. We illustrate this with the following example.

Example. Find $D^3e^{-t}\sin t$.

Solution using the exponential-shift rule. Using the exponential-shift rule and the binomial theorem,

$$D^3e^{-t}\sin t = e^{-t}(D - 1)^3\sin t = e^{-t}(D^3 - 3D^2 + 3D - 1)\sin t = e^{-t}(2\cos t + 2\sin t),$$

since $D^2\sin t = -\sin t$ and $D^3\sin t = -\cos t$.

Solution using the substitution rule. Write $e^{-t}\sin t = \mathrm{Im}\,e^{(-1+i)t}$. We have

$$\begin{aligned} D^3e^{(-1+i)t} &= (-1 + i)^3e^{(-1+i)t}, && \text{by the substitution rule and (*);} \\ &= (2 + 2i)\,e^{-t}(\cos t + i\sin t), \end{aligned}$$

by the binomial theorem and Euler’s formula. To get the answer we take the imaginary part: $e^{-t}(2\cos t + 2\sin t)$.
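The complex arithmetic in the second solution is easy to reproduce. This short sketch (function names are our own) computes $(-1+i)^3$ and compares the imaginary part of the complex answer with the closed form found by the exponential-shift rule.

```python
import cmath
import math

a = complex(-1, 1)      # e^{-t} sin t = Im e^{(-1+i)t}
coefficient = a**3      # substitution rule: D^3 e^{at} = a^3 e^{at}

def third_derivative(t):
    """D^3 (e^{-t} sin t), via the substitution rule."""
    return (coefficient * cmath.exp(a * t)).imag

def closed_form(t):
    """e^{-t}(2 cos t + 2 sin t), from the exponential-shift rule."""
    return math.exp(-t) * (2 * math.cos(t) + 2 * math.sin(t))

print(coefficient)
for t in (0.0, 0.5, 2.0):
    print(t, third_derivative(t), closed_form(t))   # the two columns should agree
```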

The operator method combined with the Exponential Response formula gives an efficient way to write and solve inhomogeneous DE’s with real or complex exponential input. The following example again illustrates the usefulness of complex exponentials.






Time Invariance

In the case of constant coefficient operators p(D), there is an important and useful relationship between solutions to p(D)x = q(t) for input signals q(t) which start at different times t. The following result shows why these operators are called “Linear Time Invariant” (or LTI).

Translation invariance. If p(D) is a constant-coefficient differential operator and p(D)x = q(t), then p(D)y = q(t − c), where y(t) = x(t − c).

This is the “time invariance” of p(D). Here is an example of its use.

Example. Suppose that we know that $x_p(t) = \sqrt{2}\sin(t/2 - \pi/4)$ is a solution to the DE

$$2\ddot{x} + \dot{x} + x = \sin(t/2). \quad (1)$$

Find a solution $y_p$ to

$$2\ddot{x} + \dot{x} + x = \sin(t/2 - \pi/3). \quad (2)$$

Solution. The new input is the old one delayed by c = 2π/3, since sin(t/2 − π/3) = sin((t − 2π/3)/2). By translation-invariance, we have immediately that

$$y_p = x_p(t - 2\pi/3) = \sqrt{2}\sin(t/2 - \pi/4 - \pi/3) = \sqrt{2}\sin(t/2 - 7\pi/12).$$
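A quick numerical check of this example: shift the known solution by c = 2π/3 (the delay for which sin(t/2 − π/3) = sin((t − c)/2)) and substitute it into equation (2). The derivatives are written out by hand; the function names are our own.

```python
import math

SQRT2 = math.sqrt(2)
C = 2 * math.pi / 3    # sin(t/2 - pi/3) = sin((t - C)/2)

def x_p(t, order=0):
    """x_p(t) = sqrt(2) sin(t/2 - pi/4), or its first or second derivative."""
    angle = t / 2 - math.pi / 4
    if order == 0:
        return SQRT2 * math.sin(angle)
    if order == 1:
        return SQRT2 / 2 * math.cos(angle)
    if order == 2:
        return -SQRT2 / 4 * math.sin(angle)
    raise ValueError("order must be 0, 1 or 2")

def y_p(t, order=0):
    """Shifted solution y_p(t) = x_p(t - C); derivatives shift the same way."""
    return x_p(t - C, order)

def residual(t):
    """Left side minus right side of equation (2), evaluated on y_p."""
    lhs = 2 * y_p(t, 2) + y_p(t, 1) + y_p(t)
    return lhs - math.sin(t / 2 - math.pi / 3)

for t in (0.0, 1.0, 4.0):
    print(t, residual(t))   # should be zero up to rounding
```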






Proof of the Generalized Exponential Response Formula

Using the exponential shift rule, we can now give a proof of the general case of the ERF which we stated without proof in the session on Exponential Response. This is a slightly complicated proof and you can safely skip it if you are not interested.

Generalized Exponential Response Formula. Let p(D) be a polynomial operator with constant coefficients and $p^{(s)}$ its s-th derivative. Then

$$p(D)x = e^{at}, \quad \text{where } a \text{ is real or complex} \quad (1)$$

has the particular solution

$$x_p = \begin{cases} \dfrac{e^{at}}{p(a)} & \text{(i) if } p(a) \neq 0 \\[2ex] \dfrac{te^{at}}{p'(a)} & \text{(ii) if } p(a) = 0 \text{ and } p'(a) \neq 0 \\[2ex] \dfrac{t^2e^{at}}{p''(a)} & \text{(iii) if } p(a) = p'(a) = 0 \text{ and } p''(a) \neq 0 \\[1ex] \quad\vdots \\[1ex] \dfrac{t^se^{at}}{p^{(s)}(a)} & \text{(iv) if } a \text{ is an } s\text{-fold zero} \end{cases}$$

Proof. That (i) is a particular solution to (1) follows immediately by using the linearity and substitution rules given earlier.

$$p(D)x_p = p(D)\frac{e^{at}}{p(a)} = \frac{1}{p(a)}\,p(D)e^{at} = \frac{p(a)e^{at}}{p(a)} = e^{at}.$$

Since cases (ii) and (iii) are special cases of (iv) we skip right to that. For case (iv), we begin by noting that to say the polynomial p(D) has the number a as an s-fold zero is the same as saying p(D) has a factorization

$$p(D) = q(D)(D - a)^s, \quad q(a) \neq 0. \quad (2)$$

We will first prove that (2) implies

$$p^{(s)}(a) = q(a)\,s!. \quad (3)$$

To prove this, let k be the degree of q(D) and write it in powers of (D − a):

$$q(D) = q(a) + c_1(D - a) + \cdots + c_k(D - a)^k; \quad \text{then}$$

$$p(D) = q(a)(D - a)^s + c_1(D - a)^{s+1} + \cdots + c_k(D - a)^{s+k}; \quad (4)$$

$$p^{(s)}(D) = q(a)\,s! + \text{positive powers of } D - a.$$

Substituting a for D on both sides proves (3). □

Using (3), we can now prove (iv) easily using the exponential-shift rule. We have

$$\begin{aligned} p(D)\,\frac{e^{at}t^s}{p^{(s)}(a)} &= \frac{e^{at}}{p^{(s)}(a)}\,p(D + a)\,t^s, && \text{by linearity and the exponential-shift rule;} \\ &= \frac{e^{at}}{p^{(s)}(a)}\,q(D + a)\,D^st^s, && \text{by (2);} \\ &= \frac{e^{at}}{q(a)\,s!}\,q(D + a)\,s!, && \text{by (3);} \\ &= \frac{e^{at}}{q(a)\,s!}\,q(a)\,s! = e^{at}, \end{aligned}$$

where the last line follows from (4), since s! is a constant:

$$q(D + a)\,s! = (q(a) + c_1D + \cdots + c_kD^k)\,s! = q(a)\,s!.$$

Note: By linearity we could have stated the formula with a factor of B in the input and a corresponding factor of B in the output. That is, the DE

$$p(D)x = Be^{at}$$

has a particular solution

$$x_p = \frac{Be^{at}}{p(a)}, \quad \text{if } p(a) \neq 0, \text{ etc.}$$
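The case analysis in the formula amounts to differentiating p until its value at a is nonzero. The sketch below does exactly that. The function names are our own, and the zero test uses a tolerance `tol`, which is an assumption that suits small integer or Gaussian-integer data rather than a general-purpose root-multiplicity test.

```python
import cmath

def poly_eval(coeffs, s):
    """Evaluate a polynomial by Horner's rule; coeffs[k] multiplies s**k."""
    value = 0
    for c in reversed(coeffs):
        value = value * s + c
    return value

def poly_derivative(coeffs):
    return [k * c for k, c in enumerate(coeffs)][1:]

def generalized_erf(coeffs, a, tol=1e-12):
    """Return (s, p^(s)(a)), where s is the multiplicity of a as a zero of p.

    A particular solution of p(D)x = e^{at} is then t**s e^{at} / p^(s)(a).
    """
    s = 0
    current = list(coeffs)
    while current:
        value = poly_eval(current, a)
        if abs(value) > tol:
            return s, value
        current = poly_derivative(current)
        s += 1
    raise ValueError("p is the zero polynomial")

def particular_solution(coeffs, a, t):
    s, value = generalized_erf(coeffs, a)
    return t**s * cmath.exp(a * t) / value

print(generalized_erf([2, 1, 1], 1j))       # p(s) = s^2 + s + 2 at a = i
print(generalized_erf([4, 0, 1], 2j))       # p(s) = s^2 + 4 at a = 2i
print(generalized_erf([-1, 3, -3, 1], 1))   # p(s) = (s - 1)^3 at a = 1
```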



Introduction

In this session we examine the important topic of resonance. Pure resonance occurs when an undamped system is forced at (one of) its natural frequencies. In this case the amplitude of the response grows without bound.

An undamped system is an idealized case which can be considered as the limit of very light damping. In the lightly damped case the amplitude of the response is finite but it can be large. In this case we refer to the biggest possible amplitude as practical resonance.

One common example of practical resonance is a children’s swing. If you push it in time with its natural frequency the amplitude of the swing will increase. Another example is a pair of guitar strings tuned to the same note. If you pluck one of them then the vibrating air will push the other one at its natural frequency and it too will start to vibrate.






The Exponential Response Formula: Resonant Case

The starting point for understanding the mathematics of pure resonance is the generalized Exponential Response formula. First recall the simple case of the Exponential Response formula: A solution to

$$p(D)x = Be^{at} \quad (1)$$

is given by

$$x_p = \frac{Be^{at}}{p(a)} \quad \text{provided that } p(a) \neq 0. \quad (2)$$

In the session on Exponential Response we also saw the generalization of this formula when p(a) = 0. Here we will need to use the special case when $p'(a) \neq 0$: A solution to equation (1) is given by

$$x_p = \frac{B\,t\,e^{at}}{p'(a)} \quad \text{if } p(a) = 0 \text{ and } p'(a) \neq 0. \quad (3)$$

We will call this the Resonant Response Formula.

Let’s look at an example of the type we will be using here to study pure resonance.

Example. Find a particular solution to the DE $x'' + 4x = 2\cos 2t$. As usual, we try complex replacement and the ERF: if $z_p$ is a solution to the complex DE $z'' + 4z = 2e^{2it}$, then $x_p = \mathrm{Re}(z_p)$ will be a solution to $x'' + 4x = 2\cos 2t$. The characteristic polynomial is $p(s) = s^2 + 4$, and a = 2i, so that we have p(a) = 0. But since $p'(s) = 2s$, we have $p'(a) = p'(2i) = 4i \neq 0$. The resonant case of the ERF thus gives

$$z_p = \frac{2\,t\,e^{2it}}{4i}.$$

Then taking the real part of $z_p$ gives us our particular solution

$$x_p = \frac{1}{2}\,t\sin 2t.$$
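As a sanity check on this example, the sketch below evaluates the Resonant Response Formula with complex arithmetic, takes the real part, and substitutes the result into the DE. The second derivative of (t/2) sin 2t is entered by hand. The function names are our own.

```python
import cmath
import math

def z_p(t):
    """Resonant ERF: z_p = B t e^{at} / p'(a), with B = 2, a = 2i, p'(s) = 2s."""
    a = 2j
    return 2 * t * cmath.exp(a * t) / (2 * a)

def x_p(t):
    """Real part of the complex solution."""
    return z_p(t).real

def residual(t):
    """x_p'' + 4 x_p - 2 cos 2t, using x_p'' = 2 cos 2t - 2t sin 2t."""
    second_derivative = 2 * math.cos(2 * t) - 2 * t * math.sin(2 * t)
    return second_derivative + 4 * x_p(t) - 2 * math.cos(2 * t)

for t in (0.0, 1.0, 3.0):
    print(t, x_p(t), 0.5 * t * math.sin(2 * t), residual(t))
```

The two middle columns should agree, and the residual should vanish up to rounding.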






Undamped Forced Systems

We now look at the pure resonant case for a second-order LTI DE. We will use the language of spring-mass systems in order to interpret the results in physical terms, but in fact the mathematics is the same for any second-order LTI DE for which the coefficient of the first derivative is equal to zero.

The problem is thus to find a particular solution to the DE

$$x'' + \omega_0^2x = F_0\cos\omega t.$$

The steps, as in the example in the last note, are

Characteristic polynomial: $p(r) = r^2 + \omega_0^2$ ⇒ $p(i\omega) = \omega_0^2 - \omega^2$.

Complex replacement: $z'' + \omega_0^2z = F_0e^{i\omega t}$, $x = \mathrm{Re}(z)$.

Exponential Response formula ⇒

$$z_p = \begin{cases} \dfrac{F_0e^{i\omega t}}{p(i\omega)} = \dfrac{F_0e^{i\omega t}}{\omega_0^2 - \omega^2} & \text{if } \omega \neq \omega_0 \\[2ex] \dfrac{F_0\,t\,e^{i\omega t}}{p'(i\omega)} = \dfrac{F_0\,t\,e^{i\omega t}}{2i\omega} & \text{if } \omega = \omega_0 \end{cases}$$

$$\Rightarrow\; x_p = \begin{cases} \dfrac{F_0\cos\omega t}{\omega_0^2 - \omega^2} & \text{if } \omega \neq \omega_0 \\[2ex] \dfrac{F_0\,t\sin\omega_0t}{2\omega_0} & \text{if } \omega = \omega_0 \text{ (resonant case).} \end{cases}$$

Resonance and amplitude response of the undamped harmonic oscillator

In $x_p$ the amplitude $A = A(\omega) = \left|\dfrac{F_0}{\omega_0^2 - \omega^2}\right|$ is a function of ω.

The right plot below shows A as a function of ω. Note, it is similar to the damped amplitude response except the peak is infinitely high. As ω gets closer to ω0 the amplitude increases.

When ω = ω0 we have $x_p = \dfrac{F_0\,t\sin\omega_0t}{2\omega_0}$. This is called pure resonance (like a swing). The frequency ω0 is called the resonant or natural frequency of the system. In the left plot below notice that the response is oscillatory but not periodic. The amplitude keeps growing in time (caused by the factor of t in xp).
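The two cases of the formula for xp, and the blow-up of A(ω) near ω0, can be tabulated directly. This is a small sketch; the function names and the sample values ω0 = 2, F0 = 1 are our own choices.

```python
import math

def amplitude(omega, omega0, F0=1.0):
    """A(omega) = |F0 / (omega0^2 - omega^2)|, defined for omega != omega0."""
    if omega == omega0:
        raise ValueError("pure resonance: the response is not periodic")
    return abs(F0 / (omega0**2 - omega**2))

def x_p(t, omega, omega0, F0=1.0):
    """Particular solution of x'' + omega0^2 x = F0 cos(omega t)."""
    if omega == omega0:
        # Resonant case: the factor t makes the oscillation grow in time.
        return F0 * t * math.sin(omega0 * t) / (2 * omega0)
    return F0 * math.cos(omega * t) / (omega0**2 - omega**2)

for omega in (1.0, 1.9, 1.99, 1.999):
    print(omega, amplitude(omega, omega0=2.0))
```

The printed amplitudes grow without bound as ω approaches ω0 = 2.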

Note carefully the different units and different meanings in the plots below.

The left plot is output vs. time (for a fixed input frequency) and the right plot is output amplitude vs. input frequency. x and A are in physical units dependent on the system; t is in units of time; ω is in radians per unit time.

[Figure: left plot, x vs. t, resonant response (ω = ω0); right plot, A vs. ω with ω0 marked on the axis, undamped amplitude response.]


Introduction

In this session we will examine the response of a second order linear time invariant (LTI) system to a sinusoidal input. We will pay special attention to the way the output changes as the frequency of the input changes. This is what we mean by the frequency response of the system. In particular, we will look at the amplitude response and the phase response; that is, the amplitude and phase lag of the system’s output considered as functions of the input frequency.

We did something similar in the first unit of the course where, in the session on Exponential Input, we discussed the frequency response of a first order LTI system as the frequency of the input sinusoid varies.

In the recent sessions on Exponential Response and Gain & Phase Lag, we worked out in detail the formulas which give the response to a sinusoidal input signal for an LTI DE of any order. Here we will specialize to the second order case where we will focus on the interpretation of these mathematical results for mechanical systems. In particular, we will look at damped-spring-mass systems. We will study carefully two cases: first, when the mass is driven by pushing on the spring and second, when the mass is driven by pushing on the dashpot.

Both these systems have the same form

p(D)x = q(t),

but their amplitude responses are very different. This is because, as we will see, it can make physical sense to designate something other than q(t) as the input. For example, in the system

$$m\ddot{x} + b\dot{x} + kx = b\dot{y}$$

we will consider y to be the input. (Of course, y is related to the expression on the right-hand-side of the equation, but it is not exactly the same.)





Sinusoidally Driven Systems: Second Order LTI DE’s

We start with the second order linear constant coefficient (CC) DE, which as we’ve seen can be interpreted as modeling a damped forced harmonic oscillator. If we further specify the oscillator to be a mechanical system with mass m, damping coefficient b, spring constant k, and with a sinusoidal driving force B cos ωt (with B constant), then the DE is

$$mx'' + bx' + kx = B\cos\omega t. \quad (1)$$

For many applications it is of interest to be able to predict the periodic response of the system to various values of ω. From this point of view we can picture having a knob you can turn to set the input frequency ω, and a screen where we can see how the shape of the system response changes as we turn the ω-knob.

In the sessions on Exponential Response and Gain & Phase Lag we worked out the general case of a sinusoidally driven LTI DE. Specializing these results to the second order case we have:

Characteristic polynomial: $p(s) = ms^2 + bs + k$.

Complex replacement: $mz'' + bz' + kz = Be^{i\omega t}$, $x = \mathrm{Re}(z)$.

Exponential Response Formula:

$$z_p = \frac{Be^{i\omega t}}{p(i\omega)} = \frac{Be^{i\omega t}}{k - m\omega^2 + ib\omega}$$

$$\Rightarrow\; x_p = \mathrm{Re}(z_p) = \frac{B}{\sqrt{(k - m\omega^2)^2 + b^2\omega^2}}\cos(\omega t - \phi),$$

where $\phi = \mathrm{Arg}(p(i\omega)) = \tan^{-1}\dfrac{b\omega}{k - m\omega^2}$. (In this case φ must be between 0 and π. We say φ is in the first or second quadrants.)

Letting $A = \dfrac{B}{\sqrt{(k - m\omega^2)^2 + b^2\omega^2}}$, we can write the periodic response $x_p$ as $x_p = A\cos(\omega t - \phi)$.

The complex gain, which is defined as the ratio of the amplitude of the output to the amplitude of the input in the complexified equation, is

$$g(\omega) = \frac{1}{p(i\omega)} = \frac{1}{k - m\omega^2 + ib\omega}.$$

The gain, which is defined as the ratio of the amplitude of the output to the amplitude of the input in the real equation, is

$$g = g(\omega) = \frac{1}{|p(i\omega)|} = \frac{1}{\sqrt{(k - m\omega^2)^2 + b^2\omega^2}}. \quad (2)$$

The phase lag is

$$\phi = \phi(\omega) = \mathrm{Arg}(p(i\omega)) = \tan^{-1}\left(\frac{b\omega}{k - m\omega^2}\right) \quad (3)$$

and we also have the time lag = φ/ω.

Terminology of Frequency Response We call the gain g(ω) the amplitude response of the system. The phase lag φ(ω) is called the phase response of the system. We refer to them collectively as the frequency response of the system.
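The amplitude and phase responses are easy to tabulate from p(iω). In the sketch below (the function name and sample parameters are our own) the phase lag is computed as the argument of the complex number p(iω) and not with a bare arctangent. This automatically puts φ in the second quadrant when k − mω² is negative, matching the remark that φ lies between 0 and π.

```python
import cmath
import math

def frequency_response(m, b, k, omega):
    """Return (gain, phase lag) for m x'' + b x' + k x = B cos(omega t)."""
    p = k - m * omega**2 + 1j * b * omega     # p(i*omega)
    return 1 / abs(p), cmath.phase(p)

for omega in (0.5, 1.0, 2.0, 4.0):
    g, phi = frequency_response(m=1.0, b=1.0, k=2.0, omega=omega)
    print(f"omega = {omega}: gain = {g:.4f}, phase lag = {phi:.4f} rad")
```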

Notes: 1. Observe that the whole DE scales by the input amplitude B.

2. All that is needed about the input for these formulas to be valid is that it is of the form (constant) × (a sinusoidal function). Here we have used the notation B cos ωt but the amplitude factor in front of the cosine function can take any form, including having the constants depend on the system parameters and/or on ω. (And of course one could equally-well use sin ωt, or any other shift of cosine, for the sinusoid.) This point is very important in the physical applications of this DE and we will return to it again in a later session.

3. Along the same lines as the preceding: we always define the gain as the amplitude of the periodic output divided by the amplitude of the periodic input. Later in this session we will see examples where the gain is not just equal to $1/p(i\omega)$ (for complex gain) or $1/|p(i\omega)|$ (for real gain). Stay tuned!


Frequency Response and Practical Resonance

In the previous note in this session we found the periodic solution to the equation

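$$mx'' + bx' + kx = B\cos(\omega t). \quad (1)$$

The solution was $x_p = gB\cos(\omega t - \phi)$, where g is the gain

$$g = g(\omega) = \frac{1}{\sqrt{(k - m\omega^2)^2 + b^2\omega^2}} \quad (2)$$

and φ is the phase lag

$$\phi = \phi(\omega) = \mathrm{Arg}(p(i\omega)) = \tan^{-1}\left(\frac{b\omega}{k - m\omega^2}\right). \quad (3)$$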

The gain or amplitude response is a function of ω. It tells us the size of the system’s response to the given input frequency. If the amplitude has a peak at ωr we call this the practical resonance frequency. If the damping b gets too large then, for the system in equation (1), there is no peak and, hence, no practical resonance. The following figure shows two graphs of g(ω), one for small b and one for large b.

[Figure: two graphs of g versus ω, each with ω0 marked on the ω-axis; the first also marks ωr.]

Fig 1a. Small b (has resonance). Fig 1b. Large b (no resonance).

In figure (1a) the damping constant b is small and there is practical resonance at the frequency ωr. In figure (1b) b is large and there is no practical resonant frequency.

Finding the Practical Resonant Frequency. We now turn our attention to finding a formula for the practical resonant frequency, if it exists, of the system in (1). Practical resonance occurs at the frequency ωr where g(ω) has a maximum. For the system (1) with gain (2) it is clear that the maximum gain occurs when the expression under the radical has a minimum. Accordingly we look for the minimum of

$$f(\omega) = (k - m\omega^2)^2 + b^2\omega^2.$$

Setting $f'(\omega) = 0$ and solving gives

$$f'(\omega) = -4m\omega(k - m\omega^2) + 2b^2\omega = 0 \;\Rightarrow\; \omega = 0 \ \text{ or } \ m^2\omega^2 = mk - b^2/2.$$

We see that if $mk - b^2/2 > 0$ then there is a practical resonant frequency

$$\omega_r = \sqrt{\frac{k}{m} - \frac{b^2}{2m^2}}.$$
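The formula for ωr and the condition for its existence translate into a few lines of code. This is a hedged sketch with names and sample parameters of our own choosing; it returns `None` when mk − b²/2 is not positive, i.e. when there is no practical resonance.

```python
import math

def gain(m, b, k, omega):
    """g(omega) = 1 / sqrt((k - m omega^2)^2 + b^2 omega^2)."""
    return 1 / math.sqrt((k - m * omega**2) ** 2 + (b * omega) ** 2)

def practical_resonant_frequency(m, b, k):
    """omega_r = sqrt(k/m - b^2/(2 m^2)), or None if the gain has no peak."""
    radicand = k / m - b**2 / (2 * m**2)
    return math.sqrt(radicand) if radicand > 0 else None

omega_r = practical_resonant_frequency(m=1.0, b=1.0, k=2.0)
print(omega_r, gain(1.0, 1.0, 2.0, omega_r))
print(practical_resonant_frequency(m=1.0, b=3.0, k=2.0))   # heavy damping
```

Evaluating `gain` slightly to either side of `omega_r` gives smaller values, consistent with a maximum there.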

Phase Lag: In the picture below the dotted line is the input and the solid line is the response.

The damping causes a lag between when the input reaches its maximum and when the output does. In radians, the angle φ is called the phase lag and in units of time φ/ω is the time lag. The lag is important, but in this class we will be more interested in the amplitude response.

[Figure: input (dotted) and response (solid) plotted against t; the time lag φ/ω is marked.]


Mechanical Vibration System: Driving Through the Spring

The figure below shows a spring-mass-dashpot system that is driven through the spring.

[Figure 1. Spring-driven system: dashpot, mass and spring, with plunger displacement y and mass position x.]

Suppose that y denotes the displacement of the plunger at the top of the spring and x(t) denotes the position of the mass, arranged so that x = y when the spring is unstretched and uncompressed. There are two forces acting on the mass: the spring exerts a force given by k(y − x) (where k is the spring constant) and the dashpot exerts a force given by $-b\dot{x}$ (against the motion of the mass, with damping coefficient b). Newton’s law gives

$$m\ddot{x} = k(y - x) - b\dot{x}$$

or, putting the system on the left and the driving term on the right,

$$m\ddot{x} + b\dot{x} + kx = ky. \quad (1)$$

In this example it is natural to regard y, rather than the right-hand side q = ky, as the input signal and the mass position x as the system response. Suppose that y is sinusoidal, that is,

y = B1 cos(ωt).

Then we expect a sinusoidal solution of the form

$$x_p = A\cos(\omega t - \phi).$$

By definition the gain is the ratio of the amplitude of the system response to that of the input signal. Since B1 is the amplitude of the input we have g = A/B1.

In the previous note in this session, we worked out the formulas for g and φ, and so we can now use them with the following small change. The k on the right-hand-side of equation (1) needs to be included in the gain (since we don’t include it as part of the input). We get

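$$g(\omega) = \frac{A}{B_1} = \frac{k}{|p(i\omega)|} = \frac{k}{\sqrt{(k - m\omega^2)^2 + b^2\omega^2}}, \qquad \phi(\omega) = \tan^{-1}\left(\frac{b\omega}{k - m\omega^2}\right).$$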

Note that the gain is a function of ω, i.e. g = g(ω). Similarly, the phase lag φ = φ(ω) is a function of ω. The entire story of the steady state system response xp = A cos(ωt − φ) to sinusoidal input signals is encoded in these two functions of ω, the gain and the phase lag.

We see that choosing the input to be y instead of ky scales the gain by k and does not affect the phase lag.

The factor of k in the gain does not affect the frequency where the gain is greatest, i.e. the practical resonant frequency. From the previous note in this session we know this is

$$\omega_r = \sqrt{\frac{k}{m} - \frac{b^2}{2m^2}}.$$

Note: Another system leading to the same equation is a series RLC circuit. We will favor the mechanical system notation, but it is interesting to note the mathematics is exactly the same for both systems.



Mechanical Vibration System: Driving Through the Dashpot

Now suppose instead that we fix the top of the spring and drive the system by moving the bottom of the dashpot.

Suppose that the position of the bottom of the dashpot is given by y(t) and the position of the mass is given by x(t), arranged so that x = 0 when the spring is relaxed. Then the force on the mass is given by

$$m\ddot{x} = -kx + b\,\frac{d}{dt}(y - x)$$

since the force exerted by a dashpot is supposed to be proportional to the speed of the piston moving through it. This can be rewritten as

$$m\ddot{x} + b\dot{x} + kx = b\dot{y}. \tag{1}$$

Figure 2. Dashpot-driven system (diagram labels: Dashpot, Mass, Spring, x, y).

We will consider x as the system response, and again on physical grounds we specify as the input signal the position y of the back end of the dashpot. Note that the derivative of the input signal (multiplied by b) occurs on the right hand side of the equation.

Again we suppose that the input signal is of sinusoidal form

y = B1 cos(ωt).

We will now work out the frequency response analysis of this problem.

First, $y = B_1\cos(\omega t) \Rightarrow \dot{y} = -\omega B_1\sin(\omega t)$, so our equation is

$$m\ddot{x} + b\dot{x} + kx = -b\omega B_1\sin(\omega t). \tag{2}$$

We know that the periodic system response will be sinusoidal, and as usual we choose the amplitude-phase form with the cosine function

xp = A cos(ωt − φ) .

Since $y = B_1\cos(\omega t)$ was chosen as the input, the gain g is given by $g = A/B_1$.

As usual, we compute the gain and phase lag φ by making a complex replacement.

One natural choice would be to regard q(t) = −bωB1 sin(ωt) as the imaginary part of a complex equation. This would work, but we must keep in mind that the input signal is B1 cos(ωt) and also that we want to express the solution xp as xp = A cos(ωt − φ).

Instead we will go back to equation (1) and complexify before taking the derivative of the right-hand side. Our input $y = B_1\cos(\omega t)$ becomes $\tilde{y} = B_1 e^{i\omega t}$ and the DE becomes

$$m\ddot{z} + b\dot{z} + kz = b\dot{\tilde{y}} = i\omega b B_1 e^{i\omega t}. \tag{3}$$

Since $y = \mathrm{Re}(\tilde{y})$ we have $x = \mathrm{Re}(z)$; that is, the sinusoidal system response $x_p$ of (2) is the real part of the exponential system response $z_p$ of (3). The Exponential Response Formula gives

$$z_p = \frac{i\omega b B_1}{p(i\omega)}\, e^{i\omega t},$$

where $p(s) = ms^2 + bs + k$ is the characteristic polynomial.

The complex gain (scale factor that multiplies the input signal to get the output signal) is

$$\tilde{g}(\omega) = \frac{i\omega b}{p(i\omega)}.$$

Thus, $z_p = B_1\,\tilde{g}(\omega)\,e^{i\omega t}$.

We can write $\tilde{g} = |\tilde{g}|e^{-i\phi}$, where $\phi = -\mathrm{Arg}(\tilde{g})$. (We use the minus sign so φ will come out as the phase lag.) Substitute this expression into the formula for $z_p$ to get

$$z_p = B_1|\tilde{g}|\,e^{i(\omega t - \phi)}.$$

Taking the real part we have

$$x_p = B_1|\tilde{g}|\cos(\omega t - \phi).$$


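All that’s left is to compute the gain $g = |\tilde{g}|$ and the phase lag $\phi = -\mathrm{Arg}(\tilde{g})$. We have

$$p(i\omega) = m(i\omega)^2 + bi\omega + k = (k - m\omega^2) + bi\omega,$$

so

$$\tilde{g} = \frac{i\omega b}{p(i\omega)} = \frac{i\omega b}{(k - m\omega^2) + bi\omega}. \tag{4}$$

This gives

$$g(\omega) = |\tilde{g}| = \frac{\omega b}{|p(i\omega)|} = \frac{\omega b}{\sqrt{(k - m\omega^2)^2 + b^2\omega^2}}.$$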

In computing the phase φ we have to be careful not to forget the factor of i in the numerator of g. After a little algebra we get

$$\phi(\omega) = -\mathrm{Arg}(\tilde{g}) = \tan^{-1}\left(\frac{-(k - m\omega^2)}{b\omega}\right).$$

As with the system driven through the spring, we try to find the input frequency ω = ωr which gives the largest system response. In this case we can find ωr without any calculus by using the following shortcut: divide the numerator and denominator in (4) by biω and rearrange to get

$$\tilde{g} = \frac{1}{1 + (k - m\omega^2)/(i\omega b)} = \frac{1}{1 - i(k - m\omega^2)/(\omega b)}.$$

Now the gain $g = |\tilde{g}|$ can be written as

$$g = \frac{1}{\sqrt{1 + (k - m\omega^2)^2/(\omega b)^2}}.$$

Because squares are never negative, this is clearly largest when the term $k - m\omega^2 = 0$. At this point $g = 1$ and $\omega_r = \sqrt{k/m} = \omega_0$, i.e. the resonant frequency is the natural frequency.

Since $\tilde{g}(\omega_0) = 1$, we also see that the phase lag $\phi = -\mathrm{Arg}(\tilde{g})$ is 0 at $\omega_r$. Thus the input and output sinusoids are in phase at resonance.
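The claims about the dashpot-driven system can be spot-checked with complex arithmetic. The sketch below evaluates the complex gain iωb/p(iω) at ω0 and at a few other frequencies; the values m = 1, b = 0.5, k = 4 and the function name are assumptions chosen only for illustration.

```python
import cmath
import math

def complex_gain(omega, m, b, k):
    """Complex gain i*w*b / p(i*w) with p(s) = m s^2 + b s + k."""
    s = 1j * omega
    return s * b / (m * s**2 + b * s + k)

# Illustrative (assumed) parameter values.
m, b, k = 1.0, 0.5, 4.0
omega0 = math.sqrt(k / m)              # natural frequency

g0 = complex_gain(omega0, m, b, k)
gain_at_resonance = abs(g0)            # expected to be 1
phase_lag_at_resonance = -cmath.phase(g0)  # expected to be 0

# Gains away from the natural frequency should all be below 1.
other_gains = [abs(complex_gain(w, m, b, k)) for w in (0.5, 1.0, 1.9, 2.1, 3.0, 10.0)]

print(gain_at_resonance, phase_lag_at_resonance, other_gains)
```

Changing b in this sketch changes how sharply the gain falls off away from ω0, but not where the peak sits.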

We have found interesting and rather surprising results for this dashpot-driven mechanical system, namely, that the resonant frequency occurs at the system’s natural undamped frequency ω0; that this resonance is independent of the damping coefficient b; and that the maximum gain which can be obtained is g = 1. We can contrast this with the spring-driven system worked out in the previous note, where the resonant frequency certainly did depend on the damping coefficient. In fact, there was no resonance at all if the system was too heavily damped. In addition, the gain could, in principle, be arbitrarily large.

Comparing these two mechanical systems side-by-side, we can see the importance of the choice of the specification for the input in terms of understanding the resulting behavior of the physical system. In both cases the right-hand side of the DE is a sinusoidal function of the form B cos ωt or B sin ωt, and the resulting mathematical formulas are essentially the same. The key difference lies in the dependence of the constant B on either the system parameters m, b, k and/or the input frequency ω. It is in fact the dependence of B on ω and b in the dashpot-driven case that results in the radically different result for the resonant input frequency ωr.

Note: As with the mechanical system driven through the spring, the mechanical system driven through the dashpot has an exact mathematical analog in a series RLC circuit. We will discuss this in the next session.



Introduction

In this session we study one widely-used application of the linear time invariant DE analysis we have developed in this unit, namely, RLC circuits. Remarkably, these circuits can be modeled with the exact same differential equations as the mechanical systems studied in the previous sessions. The symbols used and their interpretation will change, but the fact that the DE’s are identical means that, in some sense, the behavior of the systems is the same.

We will also use complex techniques to define and understand impedance. Impedance generalizes the notion of resistance and like resistance it follows Ohm’s law. These techniques will also allow us to understand phasors and the phase angles between the different voltages in the circuit.





RLC Circuits

1. Simple circuit physics

The picture at right shows an inductor, capacitor and resistor in series with a driving voltage source.

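I(t) is the current in the circuit in amps.
L is the inductance in henries.
R is the resistance in ohms.
C is the capacitance in farads.
Vin is the input voltage to the circuit.
Q(t) is the charge on the capacitor, so $I(t) = \dot{Q}(t)$.

[Figure: series circuit showing the voltage source Vin, the inductor L with voltage drop VL, the capacitor C with voltage drop VC, the resistor R with voltage drop VR, and the current I.]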

From physics we get the voltage drops across each of the circuit elements:

$$V_L = L\dot{I} = L\ddot{Q}, \qquad V_R = RI = R\dot{Q}, \qquad V_C = \frac{Q}{C}.$$

The amazing thing is that this and Kirchhoff’s voltage law (KVL) is all the physics we need to understand this circuit. The rest is linear CC DE’s and complex arithmetic. (KVL says that the net voltage drop around any closed loop is 0.)

2. Summary of this Session

We start with a summary of the physics and DE’s covered in this session. Explanations will be given below and in the next note.

Compatible Units: The units given with the circuit diagram above are compatible and we will assume throughout that we are using them.

Voltage Drops: $L\dot{I}$, $RI$, $\frac{1}{C}Q$. (We will just accept these from the physicists.)

DEs: Using the KVL and the voltage drops described above we get all of the following physically equivalent DE’s.

$$L\ddot{Q} + R\dot{Q} + \frac{1}{C}Q = V_{in} \tag{1}$$

$$L\ddot{I} + R\dot{I} + \frac{1}{C}I = \dot{V}_{in} \tag{2}$$

$$L\ddot{V}_C + R\dot{V}_C + \frac{1}{C}V_C = \frac{1}{C}V_{in} \tag{3}$$

$$L\ddot{V}_R + R\dot{V}_R + \frac{1}{C}V_R = R\dot{V}_{in} \tag{4}$$

Complex Replacement: If x is a real number or function, we will use the following notational convention here: $\tilde{x}$ will be a complex replacement for x, in the same sense that we have used this term before, namely, $\tilde{x}$ is complex, and x is the real or imaginary part of $\tilde{x}$ depending on whether the input was cosine or sine.

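Complex Impedance: (valid when $\tilde{V}_{in} = e^{i\omega t}$)

$$\tilde{Z}_L = iL\omega, \qquad \tilde{Z}_R = R, \qquad \tilde{Z}_C = \frac{1}{iC\omega}.$$

Total impedance $= \tilde{Z} = \tilde{Z}_R + \tilde{Z}_L + \tilde{Z}_C = R + i(\omega L - 1/(\omega C))$.

Complex Ohm’s Law: $\tilde{V}_{in} = \tilde{Z}\tilde{I}$, $\tilde{V}_L = \tilde{Z}_L\tilde{I}$, $\tilde{V}_R = \tilde{Z}_R\tilde{I}$, $\tilde{V}_C = \tilde{Z}_C\tilde{I}$.

Phasors: All the output voltages are plotted in the complex plane as a rigid set of vectors that rotate at frequency ω. $\tilde{V}_R$ and $\tilde{I}$ point in the same direction, $\tilde{V}_L$ leads $\tilde{I}$ by π/2, $\tilde{V}_C$ lags $\tilde{I}$ by π/2. $\tilde{I}$ either leads or lags $\tilde{V}_{in}$ by $\phi = \tan^{-1}((L\omega - 1/(C\omega))/R)$.

Reactance and real impedance: Reactance $= S = \omega L - 1/(\omega C)$, so $\tilde{Z} = R + iS$ and therefore

Real impedance $= |\tilde{Z}| = \sqrt{R^2 + S^2} = \sqrt{R^2 + (\omega L - 1/(\omega C))^2}$.

If $V_{in} = E_0\sin\omega t$ then from $\tilde{V}_{in} = \tilde{Z}\tilde{I}$ we get $I = \frac{E_0}{|\tilde{Z}|}\sin(\omega t - \phi)$, with the phase angle $\phi = \tan^{-1}(S/R)$.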

Practical resonance: In equations (2) and (4) the practical resonance is always at the natural frequency $\omega_0 = 1/\sqrt{LC}$.


3. The Differential Equations

First, let’s justify the differential equations (1)-(4). KVL implies the total voltage drop around the circuit has to be 0. If we follow the current I clockwise around the circuit adding up the voltage drops, we get the basic equation

$$L\dot{I} + RI + \frac{1}{C}Q - V_{in} = 0, \tag{5}$$

where we’ve assumed that the input provides a voltage gain. We can replace I by $\dot{Q}$ in (5) to get equation (1). If we differentiate (5) and replace $\dot{Q}$ by I we get (2). Now multiply (1) by 1/C and use $V_C = Q/C$ to get (3), and multiply (2) by R and use $V_R = RI$ to get (4).

4. The Electro-Mechanical Analogy

Notice that equation (3) has the same form as the DE for the spring-mass system driven through the spring. That is, if you substitute m, b, k and x for L, R, 1/C, VC in (3) and call Vin the input then you have the equation for the spring-mass system driven through the spring. Likewise equation (4) has the same form as the spring-mass system driven through the dashpot.

Thus, ignoring their interpretations as voltage instead of position, the outputs VC and VR behave exactly like the position x of the mass driven through the spring and dashpot respectively.

5. Resistance

Ohm’s law is most often given for the voltage VR = IR across a resistor. Recall: Two resistances R1 and R2 combine to give an equivalent resistance R. For R1, R2 in series $R = R_1 + R_2$, and in parallel $\frac{1}{R} = \frac{1}{R_1} + \frac{1}{R_2}$.

We are going to use the Exponential Response formula and complex arithmetic to understand the notions of complex impedance and phasor diagrams.



Impedance

1. Simple Complex Arithmetic Fact

You should be clear that in the complex plane multiplication by i is the same as rotation by π/2. Likewise division by i is the same as rotation by −π/2.

[Figure: the complex plane (axes Re and Im) showing z, iz and z/i = −iz.]

The phase difference between two complex numbers a and b is simply the difference of their arguments, Arg(a) − Arg(b). The simple arithmetic fact implies

z and iz have a phase difference of π/2.
z and z/i have a phase difference of −π/2. (1)

We will need this when we discuss phasors.

2. Complex Impedance

We repeat for reference some of the DE’s given in the previous note.

$$L\ddot{Q} + R\dot{Q} + \frac{1}{C}Q = V_{in} \tag{2}$$

$$L\ddot{I} + R\dot{I} + \frac{1}{C}I = \dot{V}_{in} \tag{3}$$

Using complex arithmetic and the Exponential Response formula we can understand all the statements about impedance and phasors.

First, note that if we remove the inductor and capacitor then (2) is just Ohm’s law, i.e. $R\dot{Q} = RI = V_{in}$.

Now we make the crucial assumption of sinusoidal input (alternating current):

Vin(t) = V0 sin(ωt).

With this input we will solve equation (3).

First, complexify (3). (Because of the tildes, as in $\tilde{I}$, we use prime instead of dot to indicate derivatives.)

$$L\tilde{I}'' + R\tilde{I}' + \frac{1}{C}\tilde{I} = \tilde{V}_{in}' = i\omega V_0 e^{i\omega t}, \qquad I = \mathrm{Im}(\tilde{I}).$$

The Exponential Response formula gives the periodic solution:

$$\tilde{I} = \frac{i\omega V_0}{P(i\omega)}\, e^{i\omega t}. \tag{4}$$

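A little algebra shows that the coefficient of $V_0 e^{i\omega t}$ in (4) is

$$\frac{i\omega}{P(i\omega)} = \frac{i\omega}{-L\omega^2 + 1/C + Ri\omega} = \frac{1}{iL\omega + 1/(iC\omega) + R}.$$

Accordingly we define the complex impedance as

$$\tilde{Z} = iL\omega + \frac{1}{iC\omega} + R. \tag{5}$$

(Notice $\tilde{Z}$ depends on the input frequency ω.)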

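We can now write the complex version of Ohm’s law (always assuming $\tilde{V}_{in} = V_0 e^{i\omega t}$):

$$\tilde{I} = \frac{1}{\tilde{Z}}\cdot\tilde{V}_{in} \quad\text{or}\quad \tilde{V}_{in} = \tilde{Z}\tilde{I}. \tag{6}$$

We can associate a separate impedance to each circuit element:

$$\tilde{Z}_L = iL\omega, \qquad \tilde{Z}_R = R, \qquad \tilde{Z}_C = \frac{1}{iC\omega}. \tag{7}$$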

Comparing (5) and (7) we see that for a set of elements wired in series the total complex impedance is just the sum of the individual impedances. That is, impedance behaves just like resistance in series.

What’s more, using the voltage drops across each element we see they individually satisfy a complex Ohm’s Law.

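$$\tilde{V}_L = L\tilde{I}' = Li\omega\tilde{I} = \tilde{Z}_L\tilde{I}, \qquad \tilde{V}_R = R\tilde{I}, \qquad \tilde{V}_C = \frac{1}{C}\tilde{Q} = \frac{1}{C}\int\tilde{I}\,dt = \frac{1}{iC\omega}\tilde{I} = \tilde{Z}_C\tilde{I}.$$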

Note: the formulas involving ω depend crucially on the assumption that the complex input is V0eiωt .
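Python’s built-in complex numbers make it easy to try out the series impedance and the complex Ohm’s law. In the sketch below the values of L and C are borrowed from the applet exercise later in this session, while R, ω and V0 are arbitrary assumptions; the variable names are ours. It checks that the three element voltages add up to the input voltage, as KVL requires.

```python
# Illustrative (assumed) values.
L, R, C = 0.5, 250.0, 100e-6   # henries, ohms, farads
omega, V0 = 100.0, 1.0          # rad/s, volts

z_l = 1j * L * omega            # Z_L = i L w
z_r = complex(R)                # Z_R = R
z_c = 1 / (1j * C * omega)      # Z_C = 1 / (i C w)
z_total = z_l + z_r + z_c       # impedances in series add

# Complex Ohm's law for the amplitudes (the common factor e^{i w t} cancels).
i_amp = V0 / z_total
v_l, v_r, v_c = z_l * i_amp, z_r * i_amp, z_c * i_amp

# KVL check: the element voltages should sum to the input voltage.
kvl_residual = abs(v_l + v_r + v_c - V0)
print("Z =", z_total, " KVL residual =", kvl_residual)
```

Working with amplitudes only is enough here because every term carries the same factor $e^{i\omega t}$.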

3. Impedance in Parallel

It is also true and easy to show that for circuit elements in parallel the complex impedances combine like resistors in parallel. That is, if impedances $\tilde{Z}_1$ and $\tilde{Z}_2$ are in parallel then the total impedance of the pair, call it $\tilde{Z}$, satisfies

$$\frac{1}{\tilde{Z}} = \frac{1}{\tilde{Z}_1} + \frac{1}{\tilde{Z}_2}.$$

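To see this we use Ohm’s law for a single circuit, KVL and Kirchhoff’s current law (KCL). They imply

$$\tilde{I} = \tilde{I}_1 + \tilde{I}_2, \qquad \tilde{V} = \tilde{Z}_1\tilde{I}_1, \qquad \tilde{V} = \tilde{Z}_2\tilde{I}_2$$

$$\Rightarrow\; \tilde{I} = \frac{\tilde{V}}{\tilde{Z}_1} + \frac{\tilde{V}}{\tilde{Z}_2} = \tilde{V}\left(\frac{1}{\tilde{Z}_1} + \frac{1}{\tilde{Z}_2}\right) \;\Rightarrow\; \tilde{V} = \frac{1}{1/\tilde{Z}_1 + 1/\tilde{Z}_2}\,\tilde{I}. \quad \text{QED}$$

[Figure: impedances $\tilde{Z}_1$ and $\tilde{Z}_2$ wired in parallel across the voltage V; the total current I splits into the branch currents I1 and I2.]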
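The parallel rule can be tried out the same way. The sketch below defines a hypothetical helper `parallel` and checks, for an assumed resistor and capacitor in parallel, that the total current equals the sum of the branch currents. All numerical values are illustrative assumptions.

```python
def parallel(z1, z2):
    """Total impedance Z of two impedances in parallel: 1/Z = 1/z1 + 1/z2."""
    return 1 / (1 / z1 + 1 / z2)

# Illustrative (assumed) values: a resistor in parallel with a capacitor.
omega, R, C = 100.0, 100.0, 100e-6
z_r = complex(R)
z_c = 1 / (1j * C * omega)
z_par = parallel(z_r, z_c)

v = 1.0 + 0j                        # complex voltage amplitude across the pair
i_total = v / z_par                 # Ohm's law with the combined impedance
i_branches = v / z_r + v / z_c      # KCL: sum of the two branch currents

print(z_par, abs(i_total - i_branches))
```

For two equal resistors the same helper gives half the resistance, matching the familiar rule for resistors in parallel.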

4. Amplitude-Phase Form and Real Impedance

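First we put the expression (5) for complex impedance in the form we need

$$\tilde{Z} = iL\omega + \frac{1}{iC\omega} + R = i\left(L\omega - \frac{1}{C\omega}\right) + R = iS + R.$$

We call $S = L\omega - 1/(C\omega)$ the reactance; note that S = 0 when $\omega^2 = 1/(LC)$.

In amplitude-phase form $\tilde{Z} = |\tilde{Z}|e^{i\phi}$, where $|\tilde{Z}| = \sqrt{S^2 + R^2}$ and $\phi = \mathrm{Arg}(\tilde{Z}) = \tan^{-1}(S/R)$. Notice the sign of φ depends on the sign of $S = L\omega - 1/(C\omega)$ and also that φ is between −π/2 and π/2.

Thus,

$$\tilde{I} = \frac{V_0}{\sqrt{S^2 + R^2}}\,e^{i(\omega t - \phi)} = \frac{V_0}{\sqrt{(L\omega - 1/(C\omega))^2 + R^2}}\,e^{i(\omega t - \phi)}. \tag{8}$$

The term $\sqrt{S^2 + R^2} = |\tilde{Z}| = \sqrt{(L\omega - 1/(C\omega))^2 + R^2}$ is called the real impedance.

Taking imaginary parts in (8) gives

$$I\,|\tilde{Z}| = V_0\sin(\omega t - \phi),$$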

which is like Ohm’s Law, except with a phase shift.
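To see that the amplitude-phase formula agrees with the complex Ohm’s law, the sketch below computes the real current two ways: as the imaginary part of $\tilde{V}_{in}/\tilde{Z}$, and as $(V_0/|\tilde{Z}|)\sin(\omega t - \phi)$. The circuit values and function names are assumptions made for illustration.

```python
import cmath
import math

# Illustrative (assumed) values.
L, R, C = 0.5, 250.0, 100e-6
V0, omega = 1.0, 300.0

S = L * omega - 1 / (C * omega)     # reactance
Z = complex(R, S)                   # Z = R + iS
abs_z = math.hypot(R, S)            # real impedance |Z|
phi = math.atan2(S, R)              # equals atan(S/R) because R > 0

def current_from_complex(t):
    """I(t) = Im( V0 e^{i w t} / Z )."""
    return (V0 * cmath.exp(1j * omega * t) / Z).imag

def current_amplitude_phase(t):
    """I(t) = (V0 / |Z|) sin(w t - phi)."""
    return V0 / abs_z * math.sin(omega * t - phi)

times = [j * 0.001 for j in range(200)]
max_diff = max(abs(current_from_complex(t) - current_amplitude_phase(t)) for t in times)
print(abs_z, phi, max_diff)
```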

5. Phasors

(The term phasor just means $e^{i\omega t}$.)

We have seen that each element of an LRC circuit obeys a complex Ohm’s law:

$$\tilde{V}_L = \tilde{Z}_L\tilde{I} = Li\omega\tilde{I}, \qquad \tilde{V}_R = R\tilde{I}, \qquad \tilde{V}_C = \tilde{Z}_C\tilde{I} = \frac{1}{iC\omega}\tilde{I}. \tag{9}$$

Each of the complex voltages is some constant factor times $\tilde{I}$, which is, in turn, a multiple of $e^{i\omega t}$. If we plot the voltages in the complex plane then as t increases the entire picture will rotate at frequency ω. We call each of these voltages a phasor.

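We want to look at the phase difference between the various voltages. By our simple arithmetic fact (1), the factors i and 1/i in $\tilde{V}_L$ and $\tilde{V}_C$ imply

1. The phasors $\tilde{V}_L$ and $\tilde{V}_C$ are respectively π/2 ahead and π/2 behind $\tilde{V}_R$.

Equation (8) implies

2. The phasor $\tilde{V}_R$ is φ behind $\tilde{V}_{in}$ (if φ is negative then $\tilde{V}_R$ is ahead of $\tilde{V}_{in}$). Later we will look at the excellent Series LRC Circuit applet which illustrates this.

[Figure: phasor diagram in the complex plane (axes Re and Im) showing $\tilde{V}_{in}$ at angle ωt, with $\tilde{I}$ and $\tilde{V}_R$ at angle φ behind it, $\tilde{V}_L$ a quarter turn ahead of $\tilde{V}_R$ and $\tilde{V}_C$ a quarter turn behind.]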

6. Amplitude Response and Practical Resonance

The natural frequency of the circuit is $\omega_0 = 1/\sqrt{LC}$. This is the frequency of oscillation when the “damping” term R is zero.

The practical resonance of the system (3) is independent of the value of R and always at the natural frequency $\omega_0 = 1/\sqrt{LC}$. (This is easy to see in (8), since $|\tilde{I}|$ is clearly maximized when the term $(L\omega - 1/(C\omega))^2 = 0$.)

That is, practical resonance occurs when

$$\tilde{Z}_L + \tilde{Z}_C = 0 \;\Rightarrow\; iL\omega - \frac{i}{C\omega} = 0 \;\Rightarrow\; \tilde{Z} = R, \quad \tilde{I} = \frac{V_0}{R}e^{i\omega t}.$$

In the phasor picture, at practical resonance $\tilde{V}_{in}$, $\tilde{I}$ and $\tilde{V}_R$ all line up, i.e., lag is 0 and $\tilde{V}_R = \tilde{V}_{in}$.

This is one case where the corresponding sinusoidal graphs of the real voltages are neat enough to give a nice picture: the graph of VR is exactly in phase with Vin; VL and VC have the same magnitude and are 180° out of phase; increasing R doesn’t change VR, but decreases the amplitude of VL and VC.

The applet Series LRC Circuit shows all this beautifully.
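Short of opening the applet, a small grid search can illustrate that the current amplitude peaks at ω0 whatever the resistance. The sketch below is a rough check under assumed values (L = 0.5 H, C = 100 µF, V0 = 1, three sample resistances); the function name is ours.

```python
import math

def current_amplitude(omega, L, R, C, V0=1.0):
    """|I| = V0 / sqrt((L w - 1/(C w))^2 + R^2), from equation (8)."""
    S = L * omega - 1.0 / (C * omega)
    return V0 / math.sqrt(S**2 + R**2)

# Illustrative (assumed) circuit values.
L, C = 0.5, 100e-6
omega0 = 1 / math.sqrt(L * C)

# Search frequencies 0.1, 0.2, ..., 500 rad/s for the largest |I|.
grid = [0.1 * j for j in range(1, 5001)]
peaks = {}
for R in (50.0, 250.0, 1000.0):
    peaks[R] = max(grid, key=lambda w: current_amplitude(w, L, R, C))

print("omega0 =", omega0)
print("peak frequency for each R:", peaks)
```

The peak location is the same for each R up to the grid spacing; only the height of the peak, V0/R, changes.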



Series RLC Circuit Applet

Open the Series RLC Circuit applet.

1. Check the phasor diagram checkbox.

2. Check all the voltage checkboxes: VR, VL, VC, V, I. Note: if you click the checkboxes twice the graphs will be in color.

3. Animate the applet by clicking on the double arrow below the t axis.

4. Play with the applet.

Suggested Applet Exercise Set the applet to show you all four voltages and the current I. Set L = 500 mH, C = 100 µF, R = 250 ohms. Compute the resonant frequency of the system. Move ω to the resonant frequency, and watch the phasors and the sinusoidal plots as you do this. With ω set at ω0, watch the amplitudes of the three output voltages and the output current as R increases. Explain everything you see in terms of the complex Ohm’s laws and the Exponential Response formula solution for $\tilde{I}$.

In the phasor diagram, can you see that the voltages across R and L are 90° out of phase? What about R and C? Are the voltages across C and L 180° out of phase? Why does the angle between the input voltage and the output voltages vary as you vary ω?
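One way to check your work on this exercise is a short calculation like the sketch below. It uses the exercise values L = 500 mH and C = 100 µF, while the input amplitude V0 = 1 and the list of resistances are assumptions added for illustration. It computes ω0 and the voltage amplitudes at resonance from the complex Ohm’s laws.

```python
import math

L, C = 0.5, 100e-6            # 500 mH and 100 uF, from the exercise
V0 = 1.0                      # assumed input amplitude

omega0 = 1 / math.sqrt(L * C)         # resonant (natural) angular frequency, rad/s
freq0 = omega0 / (2 * math.pi)        # same frequency in cycles per second

def amplitudes_at_resonance(R):
    """Return (|V_R|, |V_L|, |V_C|) at w = w0, where Z = R and |I| = V0 / R."""
    i_amp = V0 / R
    return R * i_amp, L * omega0 * i_amp, i_amp / (C * omega0)

print("omega0 =", omega0, "rad/s; frequency =", freq0, "Hz")
for R in (100.0, 250.0, 500.0):       # assumed sample resistances
    print(R, amplitudes_at_resonance(R))
```

In this sketch the amplitude of VR stays at V0 while those of VL and VC are equal to each other and shrink as R grows.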






Fourier Series Basics: Introduction

In this session we will introduce the Fourier series of a periodic function and show how to compute it. Fourier analysis is a large subject with wide-ranging applications. The main use we will make of these powerful tools will be to solve the inhomogeneous linear time invariant DE p(D)x = f (t), where f (t) is a periodic function.

The Fourier series expansion of the periodic solution xp(t) will, for example, show clearly when resonances occur, and at what frequencies. Nature abounds with examples of these types of phenomena. To give just one example: the inner ear is equipped with a cellular array which acts as a “Fourier analyzer” and allows us to detect the different frequencies of the incoming sound by resonating in tune with just those frequencies.

Periodic Functions

Periodic functions are functions which repeat: f (t + P) = f (t) for all t. For example, if f (t) is the amount of time between sunrise and sunset at a certain latitude, as a function of time t, and P is the length of the year, then f (t + P) = f (t) for all t, since the Earth and Sun are in the same position after one full revolution of the Earth around the Sun.

We state this explicitly as the following definition: a function f (t) is periodic with period P > 0 if

f (t + P) = f (t) for all t.

Example. f (t) = sin(2t) is periodic with period P = π. This is true because, for all t,

f (t + π) = sin(2(t + π)) = sin(2t + 2π) = sin(2t) = f (t).

Notice, though, that in the example above f (t) = sin(2t) also has period P = 2π and period P = 3π. In fact, it has period P = nπ for any integer n = 1, 2, 3 . . . .
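The definition can be spot-checked numerically. The helper `has_period` below is a hypothetical name of ours; it only tests the identity f(t + P) = f(t) at finitely many sample points, so it is a sanity check under that assumption, not a proof.

```python
import math

def has_period(f, P, samples=1000, span=10.0, tol=1e-9):
    """Check f(t + P) ~ f(t) at evenly spaced sample points in [-span, span]."""
    ts = [-span + 2 * span * j / samples for j in range(samples + 1)]
    return all(abs(f(t + P) - f(t)) <= tol for t in ts)

def f(t):
    return math.sin(2 * t)

print(has_period(f, math.pi))        # pi is a period of sin(2t)
print(has_period(f, 3 * math.pi))    # so is any positive integer multiple of pi
print(has_period(f, math.pi / 2))    # pi/2 is not a period
```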

Graphically, a function with period P is one whose graph stays the same if it is shifted P to the left or right.

Base Period Most periodic functions have a minimal period, which is often called either the period or the base period. For example, sin t has minimal period 2π. It follows from this that the minimal period for sin(2t) is π.

The only exception is the constant function. Every value of P > 0 is a period and so it has no minimal period. (We don’t allow P = 0 to be a period because then every function would be periodic with period P = 0.)

Windows To fully describe a periodic function you only need to specify the period and the value of the function over one full period. We call an interval containing one full period a window. Typical choices for windows are [−P/2, P/2) and [0, P), but any interval of length P will work.

Frequency Terminology Angular frequency, also called circular frequency, has units of radians/unit time.

Frequency has units of cycles/unit time. Since one cycle is 2π radians the relationship is

angular frequency = 2π × frequency.

The above is the official terminology, but in actual practice many people say frequency when they mean angular frequency. In fact, that has been the general usage earlier in this course where we have called ω the frequency of cos(ωt). You will have to use the context to decide exactly which frequency is being used.

For a function with period P the base angular frequency ω (also called the fundamental angular frequency) means the angular frequency corresponding to the base (or minimal) period P, that is,

$$\omega = \frac{2\pi}{P}.$$

Fourier Series We will see that a periodic function with base frequency ω can be written as a sum of sines and cosines whose frequencies are integer multiples of ω. This is called the Fourier series for the function. That is, sines and cosines, the simplest periodic functions, are the “building blocks” for more general periodic functions.

Later in this session we will see exactly how to compute the Fourier series for a periodic function.

Quiz: Cosines with Common Periods

Quiz: Which collection of cosine functions all have period 2π ?

Choices:

a) cos(t), cos(t/2), cos(t/3), . . .

b) cos(πt), cos(2πt), . . .

c) cos(t), cos(2t), cos(3t), . . .

Answer: (c). For n = 1, 2, 3, . . . , cos(nt) has period 2π (and base period 2π/n).





Quiz: Cosines with Common Frequencies

Quiz: What is the base (fundamental) frequency of the function

f (t) = cos(t) + cos(2t) + cos(3t)?

Choices:

a) 1

b) 2

c) 3

d) 6

e) there is no base frequency.

Answer: (a): Base frequency ω = 1. The smallest common period of cos(t), cos(2t) and cos(3t) is 2π. Thus, f (t) = cos(t) + cos(2t) + cos(3t) has minimal period P = 2π, and therefore its base frequency is ω = 2π/P = 1.






Fourier Series: Definitions and Coefficients

We will first state Fourier’s theorem for periodic functions with period P = 2π. In words, the theorem says that a function with period 2π can be written as a sum of cosines and sines which all have period 2π.

Theorem (Fourier). Suppose f (t) has period 2π. Then we have

$$f(t) \sim \frac{a_0}{2} + a_1\cos(t) + a_2\cos(2t) + a_3\cos(3t) + \cdots + b_1\sin(t) + b_2\sin(2t) + b_3\sin(3t) + \cdots \tag{1}$$

$$= \frac{a_0}{2} + \sum_{n=1}^{\infty}\left(a_n\cos(nt) + b_n\sin(nt)\right),$$

where the coefficients a0, a1, . . . and b1, b2, . . . are computed by

$$a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\,dt, \qquad a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos(nt)\,dt, \qquad b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin(nt)\,dt. \tag{2}$$

Some comments are in order.

1. As we saw in the quiz above, the functions cos(t), cos(2t), cos(3t), . . . all have 2π as a period. The same is clearly true for sin(t), sin(2t), sin(3t), . . . .

2. The series on the right-hand side of (1) is called a Fourier series, and the coefficients a0, a1, . . . , b1, b2, . . . in (2) are called the Fourier coefficients of f (t).

3. The letter a is used in $a_0/2$ because we can think of it as the coefficient of cos(0 · t) = 1. We don’t need a b0 term because sin(0 · t) = 0. The constant term $a_0/2$ is written in this way to make the formula for a0 look just like those of the other cosine coefficients an. (We will see why we need the factor of 1/2 in a later note when we prove that these formulas really do give the coefficients.)

4. In (1) we used the symbol ∼ instead of an equal sign because the two sides of (1) might differ at those values of t where f (t) is discontinuous. For us, this is a minor point and we will allow ourselves to use an equal sign from now on.


5. There is some terminology coming from acoustics and music: the n = 1 frequency is called the fundamental, and the frequencies n ≥ 2 are called the higher harmonics (or overtones). We will explore the connection between Fourier series and sound in a later session.

Fourier series are a wonderful tool for breaking a periodic function, however complicated, into simple pieces. The superposition principle will then allow us to solve DE’s with arbitrary periodic input in Fourier series form.

In later notes we will extend Fourier’s theorem to functions of other periods. The extension is straightforward, but requires more notation, so we will wait until you have gained some experience with Fourier series.

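Examples

Example 1. Compute the Fourier series of f (t), where f (t) is the square wave with period 2π, which is defined over one period by

$$f(t) = \begin{cases} -1 & \text{for } -\pi \le t < 0 \\ \phantom{-}1 & \text{for } 0 \le t < \pi. \end{cases}$$

The graph over several periods is shown below.

[Figure: graph of the square wave for −2π ≤ t ≤ 2π, alternating between the values −1 and 1, with a filled dot at the left end and an open dot at the right end of each segment.]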

Solution. Computing a Fourier series means computing its Fourier coefficients. We do this using the integral formulas for the coefficients given with Fourier’s theorem in the previous note. For convenience we repeat the theorem here.

$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left(a_n\cos(nt) + b_n\sin(nt)\right),$$

where

$$a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\,dt, \qquad a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos nt\,dt, \qquad b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin nt\,dt.$$

In applying these formulas to the given square wave function, we have to split the integrals into two pieces corresponding to where f (t) is +1 and where it is −1. We find

$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\cos(nt)\,dt = \frac{1}{\pi}\int_{-\pi}^{0}(-1)\cdot\cos(nt)\,dt + \frac{1}{\pi}\int_{0}^{\pi}(1)\cdot\cos(nt)\,dt.$$

Thus, for n ≠ 0:

$$a_n = -\left.\frac{\sin(nt)}{n\pi}\right|_{-\pi}^{0} + \left.\frac{\sin(nt)}{n\pi}\right|_{0}^{\pi} = 0$$

and for n = 0:

$$a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\,dt = 0.$$

(Note: it’s advisable to do a0 separately.)

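Likewise

$$b_n = \frac{1}{\pi}\int_{-\pi}^{\pi} f(t)\sin(nt)\,dt = \frac{1}{\pi}\int_{-\pi}^{0} -\sin(nt)\,dt + \frac{1}{\pi}\int_{0}^{\pi}\sin(nt)\,dt$$

$$= \left.\frac{\cos(nt)}{n\pi}\right|_{-\pi}^{0} - \left.\frac{\cos(nt)}{n\pi}\right|_{0}^{\pi} = \frac{1 - \cos(-n\pi)}{n\pi} - \frac{\cos(n\pi) - 1}{n\pi}$$

$$= \frac{2}{n\pi}\left(1 - \cos(n\pi)\right) = \frac{2}{n\pi}\left(1 - (-1)^n\right) = \begin{cases} \dfrac{4}{n\pi} & \text{for } n \text{ odd} \\ 0 & \text{for } n \text{ even.} \end{cases}$$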

We have used the simplification cos nπ = (−1)n to get a nice formula for the coefficients bn. (Note: when you get cos nπ in these calculations it’s always useful to make this substitution.)

This then gives the Fourier series for f (t):

$$f(t) = \sum_{n=1}^{\infty} b_n\sin(nt) = \frac{4}{\pi}\left(\sin t + \frac{1}{3}\sin(3t) + \frac{1}{5}\sin(5t) + \cdots\right).$$

Example 2. Seeing the convergence of a Fourier series

The claim is that

$$f(t) = \frac{4}{\pi}\left(\sin t + \frac{1}{3}\sin 3t + \frac{1}{5}\sin 5t + \cdots\right).$$

However, it is not easy to see that the sum on the right-hand side is in fact converging to the square wave f (t). So let’s use a computer to plot the sums of the first N terms of the series, for N = 1, 3, 9, 33. We get the following four graphs:

[Figure: four graphs of the partial sums for N = 1, 3, 9, 33.]

Notice that since a finite sum of sine functions is continuous (in fact smooth), the partial sums cannot jump when t is an integer multiple of π, the way the square wave f (t) does. But they are certainly “trying” to become the square wave f (t)! And the more terms you add in, the better the fit, with the theoretical limit as N → ∞ being exactly equal to f (t) (except actually at the jumps t = nπ, as we’ll see).
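The partial sums can also be evaluated directly rather than plotted. In the sketch below we assume that “the first N terms” means the first N nonzero (odd-harmonic) terms; the function name is ours. It evaluates the partial sums at t = π/2, where the square wave equals 1, and at the jump t = 0.

```python
import math

def square_wave_partial_sum(t, num_terms):
    """(4/pi) * sum of sin((2j-1) t)/(2j-1) for j = 1..num_terms."""
    return (4 / math.pi) * sum(
        math.sin((2 * j - 1) * t) / (2 * j - 1) for j in range(1, num_terms + 1)
    )

for N in (1, 3, 9, 33):
    at_middle = square_wave_partial_sum(math.pi / 2, N)   # f(pi/2) = 1
    at_jump = square_wave_partial_sum(0.0, N)              # every term vanishes at t = 0
    print(N, at_middle, at_jump)
```

At t = π/2 the values approach 1 as N grows, while at the jump t = 0 every partial sum is 0, halfway between the values −1 and 1 on either side.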

Note: In this case we don’t have any cosine terms, just sine. This turns out to be not an accident: it follows from the fact that f (t) here is an odd function, i.e. f (−t) = − f (t), and such functions have only sines (which are also odd functions) in their Fourier series. Similarly for even functions and cosine series: if f (t) is even ( f (−t) = f (t)) then all the bn’s vanish and the Fourier series is simply $f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n\cos(nt)$; while if f (t) is odd then all the an’s vanish and the Fourier series is $f(t) = \sum_{n=1}^{\infty} b_n\sin(nt)$.

Fourier Series for Functions with Period 2L

Suppose that we have a periodic function f (t) with arbitrary period P = 2L, generalizing the special case P = 2π which we have already seen. Then a simple re-scaling of the interval (−π, π) to (−L, L) allows us to write down the general Fourier series and Fourier coefficient formulas:

$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left(a_n\cos\left(n\frac{\pi}{L}t\right) + b_n\sin\left(n\frac{\pi}{L}t\right)\right) \tag{1}$$

with Fourier coefficients given by the general Fourier coefficient formulas

$$a_0 = \frac{1}{L}\int_{-L}^{L} f(t)\,dt, \qquad a_n = \frac{1}{L}\int_{-L}^{L} f(t)\cos\left(n\frac{\pi}{L}t\right)dt, \qquad b_n = \frac{1}{L}\int_{-L}^{L} f(t)\sin\left(n\frac{\pi}{L}t\right)dt. \tag{2}$$

Note: The number L = P/2 is called the half-period.

Example. Let f (t) be the period 2 function, which is defined on the window [−1, 1) by f (t) = |t|. Compute the Fourier series of f (t).

The graph of f (t) below shows why this function is called either a triangle wave or a continuous sawtooth function.

[Figure 1: The period 2 triangle wave, graphed for −2 ≤ t ≤ 2.]

Solution. In this case the period is P = 2, so the half-period L = 1. This means $n\frac{\pi}{L} = n\pi$ and we compute the coefficients from the formulas in (2), using integration by parts, as follows.

For n ≠ 0:

$$a_n = \frac{1}{1}\int_{-1}^{1}|t|\cos(n\pi t)\,dt = 2\int_{0}^{1} t\cos(n\pi t)\,dt = 2\left[\frac{t\sin(n\pi t)}{n\pi} + \frac{\cos(n\pi t)}{n^2\pi^2}\right]_{0}^{1} = \frac{2}{n^2\pi^2}\left((-1)^n - 1\right) = \begin{cases} -\dfrac{4}{n^2\pi^2} & \text{for } n \text{ odd} \\ 0 & \text{for } n \text{ even} \end{cases}$$

and for n = 0:

$$a_0 = \frac{1}{1}\int_{-1}^{1}|t|\,dt = 2\int_{0}^{1} t\,dt = 1.$$

Since f (t) is an even function and sin(nπt) is odd, the sine coefficients bn = 0. (We will justify this carefully in the next session. For now you can compute the integrals for bn as an exercise and verify it in this case.)

Thus, the Fourier series for f (t) is

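$$f(t) = \frac{1}{2} - \frac{4}{\pi^2}\left(\cos \pi t + \frac{\cos 3\pi t}{3^2} + \frac{\cos 5\pi t}{5^2} + \cdots\right) = \frac{1}{2} - \frac{4}{\pi^2}\sum_{n\ \text{odd}}\frac{\cos(n\pi t)}{n^2}.$$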
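As a sanity check on this series, the sketch below sums its first 100 nonzero terms and compares the result with the triangle wave at a few points. The function names, the number of terms, and the sample points are assumptions made for illustration.

```python
import math

def triangle_wave(t):
    """Period 2 extension of |t| from the window [-1, 1)."""
    return abs((t + 1) % 2 - 1)

def triangle_partial_sum(t, num_terms=100):
    """1/2 - (4/pi^2) * sum over the first num_terms odd n of cos(n pi t)/n^2."""
    total = sum(
        math.cos(n * math.pi * t) / n**2 for n in range(1, 2 * num_terms, 2)
    )
    return 0.5 - (4 / math.pi**2) * total

sample_points = (0.0, 0.25, 0.5, -0.7, 1.0, 1.5)
worst_error = max(abs(triangle_partial_sum(t) - triangle_wave(t)) for t in sample_points)
print(worst_error)
```

Because the coefficients decay like 1/n², a modest number of terms already gives a close fit, in contrast to the square wave whose coefficients decay only like 1/n.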


Orthogonality Relations

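We now explain the basic reason why the remarkable Fourier coefficient formulas work. We begin by repeating them from the last note:

$$a_0 = \frac{1}{L}\int_{-L}^{L} f(t)\,dt, \qquad a_n = \frac{1}{L}\int_{-L}^{L} f(t)\cos\left(n\frac{\pi}{L}t\right)dt, \qquad b_n = \frac{1}{L}\int_{-L}^{L} f(t)\sin\left(n\frac{\pi}{L}t\right)dt. \tag{1}$$

The key fact is the following collection of integral formulas for sines and cosines, which go by the name of orthogonality relations:

$$\frac{1}{L}\int_{-L}^{L}\cos\left(n\frac{\pi}{L}t\right)\cos\left(m\frac{\pi}{L}t\right)dt = \begin{cases} 1 & n = m \neq 0 \\ 0 & n \neq m \\ 2 & n = m = 0 \end{cases}$$

$$\frac{1}{L}\int_{-L}^{L}\cos\left(n\frac{\pi}{L}t\right)\sin\left(m\frac{\pi}{L}t\right)dt = 0$$

$$\frac{1}{L}\int_{-L}^{L}\sin\left(n\frac{\pi}{L}t\right)\sin\left(m\frac{\pi}{L}t\right)dt = \begin{cases} 1 & n = m \neq 0 \\ 0 & n \neq m \end{cases}$$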

Proof of the orthogonality relations: This is just a straightforward calculation using the periodicity of sine and cosine and either (or both) of these two methods:

Method 1: use $\cos at = \frac{e^{iat} + e^{-iat}}{2}$ and $\sin at = \frac{e^{iat} - e^{-iat}}{2i}$.

Method 2: use the trig identity $\cos(\alpha)\cos(\beta) = \frac{1}{2}\left(\cos(\alpha + \beta) + \cos(\alpha - \beta)\right)$, and the similar trig identities for cos(α) sin(β) and sin(α) sin(β).
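Before working through the proof, it can be reassuring to check a few cases numerically. The sketch below approximates the normalized integrals with a midpoint rule; the half-period L = 1.5, the sample count, and the helper names are assumptions.

```python
import math

def normalized_integral(g, L, samples=4000):
    """Approximate (1/L) * integral of g over [-L, L] with the midpoint rule."""
    h = 2 * L / samples
    return sum(g(-L + (j + 0.5) * h) for j in range(samples)) * h / L

def cos_cos(n, m, L):
    return normalized_integral(
        lambda t: math.cos(n * math.pi * t / L) * math.cos(m * math.pi * t / L), L)

def cos_sin(n, m, L):
    return normalized_integral(
        lambda t: math.cos(n * math.pi * t / L) * math.sin(m * math.pi * t / L), L)

def sin_sin(n, m, L):
    return normalized_integral(
        lambda t: math.sin(n * math.pi * t / L) * math.sin(m * math.pi * t / L), L)

L = 1.5  # assumed half-period
print(cos_cos(2, 2, L), cos_cos(0, 0, L), cos_cos(1, 3, L))
print(cos_sin(2, 2, L), sin_sin(3, 3, L), sin_sin(1, 2, L))
```

The six printed values should be close to 1, 2, 0 and 0, 1, 0 respectively, matching the cases listed in the relations.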

Using the orthogonality relations to prove the Fourier coefficient formula

Suppose we know that a periodic function f (t) has a Fourier series expansion

$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left(a_n\cos\left(n\frac{\pi}{L}t\right) + b_n\sin\left(n\frac{\pi}{L}t\right)\right) \tag{2}$$

How can we find the values of the coefficients? Let’s choose one coefficient, say a2, and compute it; you will easily see how to generalize this to any other coefficient. The claim is that the right-hand side of the Fourier coefficient formula (1), namely the integral

$$\frac{1}{L}\int_{-L}^{L} f(t)\cos\left(2\frac{\pi}{L}t\right)dt,$$

is in fact the coefficient a2 in the series (2). We can replace f (t) in this integral by the series in (2) and multiply through by $\cos\left(2\frac{\pi}{L}t\right)$, to get

$$\frac{1}{L}\int_{-L}^{L}\left[\frac{a_0}{2}\cos\left(2\frac{\pi}{L}t\right) + \sum_{n=1}^{\infty}\left(a_n\cos\left(n\frac{\pi}{L}t\right)\cos\left(2\frac{\pi}{L}t\right) + b_n\sin\left(n\frac{\pi}{L}t\right)\cos\left(2\frac{\pi}{L}t\right)\right)\right]dt.$$

Now the orthogonality relations tell us that almost every term in this sum will integrate to 0. In fact, the only non-zero term is the n = 2 cosine term

$$\frac{1}{L}\int_{-L}^{L} a_2\cos\left(2\frac{\pi}{L}t\right)\cos\left(2\frac{\pi}{L}t\right)dt$$

and the orthogonality relations for the case n = m = 2 show this integral is equal to a2 as claimed.

Why the denominator of 2 in $a_0/2$?

Answer: it is in fact just a convention, but the one which allows us to have the same Fourier coefficient formula for an when n = 0 and n ≥ 1. (Notice that in the n = m case for cosine, there is a factor of 2 only for n = m = 0.)

Interpretation of the constant term $a_0/2$.

We can also interpret the constant term $a_0/2$ in the Fourier series of f (t) as the average of the function f (t) over one full period:

$$\frac{a_0}{2} = \frac{1}{2L}\int_{-L}^{L} f(t)\,dt.$$

Operations on Fourier Series

In this session we learn a number of useful techniques for computing and manipulating Fourier series. We will also see some ways to get a new Fourier series from a given one, including the use of shifts, differentiation and integration. These can provide calculation shortcuts: if we already have worked out a Fourier series, we can find the coefficients of the series for a related function without having to use the integral formulas again.


Compute a Fourier Series

Exercise. We warm up with a reminder of how one computes the Fourier series of a given periodic function using the integral Fourier coefficient formulas.

Compute the Fourier series for the period 2π continuous sawtooth function f (t) = |t| for −π ≤ t ≤ π.

Answer.

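Figure 1. Graph of the period 2π continuous sawtooth function (t-axis marked at −2π, −π, π, 2π).

The period is 2π, so the half-period L = π. Since f (t) = |t| for −π ≤ t ≤ π is an even function, we know the Fourier sine coefficients bn must be zero.

Computing the cosine coefficients we get: For n ≠ 0:

$$a_n = \frac{1}{\pi}\int_{-\pi}^{\pi} |t|\cos(nt)\,dt = \frac{2}{\pi}\int_0^{\pi} t\cos(nt)\,dt = \frac{2}{\pi}\left[\frac{t\sin(nt)}{n} + \frac{\cos(nt)}{n^2}\right]_0^{\pi} = \frac{2}{n^2\pi}\left((-1)^n - 1\right) = \begin{cases} -\dfrac{4}{n^2\pi} & \text{for } n \text{ odd} \\ 0 & \text{for } n \text{ even} \end{cases}$$

For n = 0:

$$a_0 = \frac{1}{\pi}\int_{-\pi}^{\pi} |t|\,dt = \frac{2}{\pi}\int_0^{\pi} t\,dt = \pi.$$

Thus, f (t) has Fourier series

$$f(t) = \frac{\pi}{2} - \frac{4}{\pi}\left(\cos t + \frac{\cos(3t)}{3^2} + \frac{\cos(5t)}{5^2} + \cdots\right) = \frac{\pi}{2} - \frac{4}{\pi}\sum_{n\ \text{odd}} \frac{\cos(nt)}{n^2}.$$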
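As a sanity check on the hand computation, the cosine coefficients of f (t) = |t| can be approximated numerically and compared with the closed form. This is only a sketch under stated assumptions: the names `integrate`, `a_numeric` and `a_formula` are hypothetical, and the midpoint rule with an even number of steps is chosen so that the corner of |t| at t = 0 falls on a subinterval boundary.

```python
import math

def integrate(g, a, b, steps=20000):
    """Midpoint-rule approximation of the integral of g over [a, b]."""
    h = (b - a) / steps
    return h * sum(g(a + (k + 0.5) * h) for k in range(steps))

def a_numeric(n):
    """a_n = (1/pi) * integral over [-pi, pi] of |t| cos(nt), computed numerically."""
    return integrate(lambda t: abs(t) * math.cos(n * t), -math.pi, math.pi) / math.pi

def a_formula(n):
    """Closed form from the text: a_0 = pi, a_n = 2((-1)^n - 1) / (n^2 pi) for n >= 1."""
    if n == 0:
        return math.pi
    return 2 * ((-1) ** n - 1) / (n ** 2 * math.pi)

# Print numeric and closed-form values side by side for n = 0..5.
for n in range(6):
    print(n, a_numeric(n), a_formula(n))
```

The two columns should agree to several decimal places, with the even-index coefficients (other than a0) close to 0.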


Even and Odd Functions

If a periodic function f (t) is an even function we have already used the fact that its Fourier series will involve only cosines. Likewise the Fourier series of an odd function will contain only sines. Here we will give short proofs of these statements.

Even and odd functions.

Definition. A function f (t) is called even if f (−t) = f (t) for all t.

The graph of an even function is symmetric about the y-axis. Here are some examples of even functions:

1. t2, t4, t6, . . . , any even power of t.

2. cos(at) (recall the power series for cos(at) has only even powers of t).

3. A constant function is even.

We will need the following fact about the integral of an even function over a ’balanced’ interval [−L, L].

If f (t) is even then $\int_{-L}^{L} f(t)\,dt = 2\int_0^L f(t)\,dt$.

This fact becomes clear if we think of the integral as an area (see Fig. 1).

Fig. 1: Even functions (total area = twice area of right half).

Fig. 2: Odd functions (total (signed) area is 0).

Definition. A function f (t) is called odd if f (−t) = − f (t) for all t.

The graph of an odd function is symmetric about the origin. Here are some examples of odd functions:

1. t, t3, t5, . . . , any odd power of t.

2. sin(at) (recall the power series for sin(at) has only odd powers of t).


We will need the following fact about the integral of an odd function over a ’balanced’ interval [−L, L].

If f (t) is odd then $\int_{-L}^{L} f(t)\,dt = 0$.

This fact becomes clear if we think of the integral as an area (see Fig. 2).

Multiplying Even and Odd Functions

When multiplying even and odd functions it is helpful to think in terms of multiplying even and odd powers of t. This gives the following rules.

1. even × even = even

2. odd × odd = even

3. odd × even = odd

All this leads to the even and odd Fourier coefficient rules: Assume f (t) is periodic with period 2L. Then:

1. If f (t) is even then we have bn = 0, and $a_n = \frac{2}{L}\int_0^L f(t)\cos\left(n\frac{\pi}{L}t\right)dt$.

2. If f (t) is odd then we have an = 0, and $b_n = \frac{2}{L}\int_0^L f(t)\sin\left(n\frac{\pi}{L}t\right)dt$.

Reason: Assume f (t) is even. The rule for multiplying even functions tells us that f (t) cos(nπt/L) is even, and the rule for integrating an even function over a symmetric interval tells us that

$$a_n = \frac{1}{L}\int_{-L}^{L} f(t)\cos\left(n\frac{\pi}{L}t\right)dt = \frac{2}{L}\int_0^L f(t)\cos\left(n\frac{\pi}{L}t\right)dt.$$

Likewise, the rule even × odd = odd tells us that f (t) sin(nπt/L) is odd, and so the integral for bn is 0.

If f (t) is odd everything works much the same. The rule for multiplying odd functions tells us that f (t) sin(nπt/L) is even and therefore

$$b_n = \frac{1}{L}\int_{-L}^{L} f(t)\sin\left(n\frac{\pi}{L}t\right)dt = \frac{2}{L}\int_0^L f(t)\sin\left(n\frac{\pi}{L}t\right)dt.$$

Likewise the rule odd × even = odd tells us that f (t) cos(nπt/L) is odd, and so the integral for an is 0.

Examples: In previous sessions we saw the odd square wave had only sine coefficients and the even triangle wave had only cosine coefficients.

Figure 0: The graph of sq(t), the odd, period 2π square wave.

Figure 1: f1(t) = sq(t) shifted up by 1 unit.

Scaling and Shifting

There is a very useful class of shortcuts which allows us to use the known Fourier series of a function f (t) to get the series for a function related to f (t) by shifts and scale changes. We illustrate this technique with a collection of examples of related functions.

We let sq(t) be the standard odd, period 2π square wave.

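$$\text{sq}(t) = \begin{cases} -1 & \text{for } -\pi \le t < 0 \\ 1 & \text{for } 0 \le t < \pi \end{cases} \tag{1}$$

We already know the Fourier series for sq(t). It is

$$\text{sq}(t) = \frac{4}{\pi}\left(\sin(t) + \frac{1}{3}\sin(3t) + \frac{1}{5}\sin(5t) + \cdots\right) = \frac{4}{\pi}\sum_{n\ \text{odd}} \frac{\sin(nt)}{n} \tag{2}$$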

1. Shifting and Scaling in the Vertical Direction

Example 1. (Shifting) Find the Fourier series of the function f1(t) whose graph is shown.

Solution. The graph in Figure 1 is simply the graph in Figure 0 shifted upwards one unit. That is, f1(t) = 1 + sq(t). Therefore

$$f_1(t) = 1 + \frac{4}{\pi}\sum_{n\ \text{odd}} \frac{\sin(nt)}{n}.$$

Example 2. (Scaling) Let f2(t) = 2 sq(t). Sketch its graph and find its Fourier series.

Figure 2: Graph of f2(t) = 2 sq(t).

Figure 4: sq(t) scaled in time.

Solution.

The Fourier series of f2(t) comes from that of sq(t) by multiplying by 2.

$$f_2(t) = \frac{8}{\pi}\sum_{n\ \text{odd}} \frac{\sin(nt)}{n}.$$

Example 3. We can combine shifting and scaling along the vertical axis. Let f3(t) be the function shown in Figure 3. Write it in terms of sq(t) and find its Fourier series.

Figure 3: f3(t) = sq(t) shifted by 1 and then scaled by 1/2.

Solution.

$$f_3(t) = \frac{1}{2}\left(1 + \text{sq}(t)\right) = \frac{1}{2} + \frac{2}{\pi}\sum_{n\ \text{odd}} \frac{\sin nt}{n}.$$

2. Scaling and Shifting in t

Example 4. (Scaling in time) Find the Fourier series of the function f4(t) whose graph is shown.

In Figure 4 the point marked 1 on the t-axis corresponds to the point marked π in Figure 0. This shows that f4(t) = sq(πt) and therefore we replace t by πt in the Fourier series of sq(t).

$$f_4(t) = \frac{4}{\pi}\sum_{n\ \text{odd}} \frac{\sin(n\pi t)}{n}.$$

Figure 5: sq(t) shifted in time.

Example 5. (Shifting in time) Let f5(t) = sq(t + π/2). Graph this function and find its Fourier series.

Solution. The function f5(t) is sq(t) shifted to the left by π/2. Therefore

$$f_5(t) = \frac{4}{\pi}\left(\sin(t + \pi/2) + \frac{\sin(3t + 3\pi/2)}{3} + \cdots\right) = \frac{4}{\pi}\left(\cos t - \frac{\cos 3t}{3} + \cdots\right)$$

(To simplify the series we used the trig identities sin(θ + π/2) = cos(θ) and sin(θ + 3π/2) = − cos(θ) etc.)

Notice that f5(t) is even, and so must have only cosine terms in its series, which is in fact confirmed by the simplified form above.
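The simplification in Example 5 can be checked numerically by comparing partial sums of the shifted sine series with partial sums of the alternating cosine series. This is a small sketch for illustration; the function names are hypothetical, and the sign pattern (−1)^((n−1)/2) for odd n is the general form of the identities quoted above.

```python
import math

def shifted_sine_sum(t, max_n):
    """Partial sum of (4/pi) * sum over odd n of sin(n(t + pi/2)) / n."""
    return 4 / math.pi * sum(
        math.sin(n * (t + math.pi / 2)) / n for n in range(1, max_n + 1, 2)
    )

def cosine_sum(t, max_n):
    """Partial sum of (4/pi) * (cos t - cos(3t)/3 + cos(5t)/5 - ...)."""
    return 4 / math.pi * sum(
        (-1) ** ((n - 1) // 2) * math.cos(n * t) / n for n in range(1, max_n + 1, 2)
    )

# The two partial sums should agree term by term, hence at every t.
for t in (-2.0, -0.3, 0.0, 1.1, 2.5):
    print(t, shifted_sine_sum(t, 51), cosine_sum(t, 51))
```

Because the cosine form contains only even functions of t, the agreement also illustrates why f5(t) has no sine terms.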


Integration and Differentiation

We can integrate a Fourier series term-by-term:

Example 1. Let

$$f(t) = 1 + \cos t + \frac{\cos 2t}{2} + \frac{\cos 3t}{3} + \cdots$$

then,

$$h(t) = \int_0^t f(u)\,du = t + \sin t + \frac{\sin 2t}{2^2} + \frac{\sin 3t}{3^2} + \cdots$$

Note: The integrated function h(t) is not periodic (because of the t term), so the result is a series, but not a Fourier series.

We can also differentiate a Fourier series term-by-term to get the Fourier series of the derivative function.

Example 2. Let f (t) be the period 2π triangle wave (continuous sawtooth) given on the interval [−π, π) by f (t) = |t|. Its Fourier series is

$$f(t) = \frac{\pi}{2} - \frac{4}{\pi}\left(\cos t + \frac{\cos 3t}{3^2} + \frac{\cos 5t}{5^2} + \cdots\right)$$

In the previous session we computed the Fourier series of a period 2 triangle wave. This series can then be obtained from that one by scaling by π in both time and the vertical dimension, using the methods we learned in the previous note. The derivative of f (t) is the square wave. (You should verify this). Differentiating the Fourier series of f (t) term-by-term gives

$$f'(t) = \frac{4}{\pi}\left(\sin t + \frac{\sin 3t}{3} + \frac{\sin 5t}{5} + \cdots\right),$$

which is, indeed, the Fourier series of the period 2π square wave we found in the previous session.
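Term-by-term differentiation can be illustrated numerically: a finite-difference derivative of a partial sum of the triangle-wave series should match the corresponding partial sum of the square-wave series. The sketch below is illustrative only; the function names and the step size `h` are assumptions, not part of the notes.

```python
import math

def triangle_partial(t, max_n):
    """Partial sum of pi/2 - (4/pi) * sum over odd n of cos(nt) / n^2."""
    return math.pi / 2 - 4 / math.pi * sum(
        math.cos(n * t) / n ** 2 for n in range(1, max_n + 1, 2)
    )

def square_partial(t, max_n):
    """Partial sum of (4/pi) * sum over odd n of sin(nt) / n."""
    return 4 / math.pi * sum(math.sin(n * t) / n for n in range(1, max_n + 1, 2))

def central_difference(g, t, h=1e-5):
    """Symmetric finite-difference approximation of g'(t)."""
    return (g(t + h) - g(t - h)) / (2 * h)

# Derivative of the triangle partial sum versus the square-wave partial sum.
for t in (0.5, 1.0, 2.0):
    slope = central_difference(lambda s: triangle_partial(s, 9), t)
    print(t, slope, square_partial(t, 9))
```

With many terms, `triangle_partial(t, max_n)` should also be close to |t| on [−π, π], which is a quick way to confirm the series itself.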

Figure 1: The period 2π triangle wave.

Example 3. What happens if you try to differentiate the square wave

$$\text{sq}(t) = \frac{4}{\pi}\left(\sin t + \frac{\sin 3t}{3} + \frac{\sin 5t}{5} + \cdots\right)?$$

Solution. Differentiation term-by-term gives

$$\text{sq}'(t) = \frac{4}{\pi}\left(\cos t + \cos 3t + \cos 5t + \cdots\right).$$

But what is meant by sq′(t)? Since sq(t) consists of horizontal segments its derivative at most places is 0. However we can’t ignore the ’vertical’ segments where the function has a jump discontinuity. For now, the best we can say is that the slope is infinite at these jumps and sq′(t) doesn’t exist. Later in this unit we will learn about delta functions and generalized derivatives, which will allow us to make better sense of sq′(t).

Convergence of Fourier Series

The period 2L function f (t) is called piecewise smooth if there are only a finite number of points 0 ≤ t1 < t2 < . . . < tn ≤ 2L where f (t) is not differentiable, and if at each of these points the left and right-hand limits $\lim_{t \to t_i^+} f'(t)$ and $\lim_{t \to t_i^-} f'(t)$ exist (although they might not be equal).

Recall that when we first introduced Fourier series we wrote

$$f(t) \sim \frac{a_0}{2} + a_1\cos(t) + a_2\cos(2t) + a_3\cos(3t) + \cdots + b_1\sin(t) + b_2\sin(2t) + b_3\sin(3t) + \cdots = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n\cos(nt) + b_n\sin(nt),$$

where we used ’∼’ instead of an equal sign. The following theorem shows that our subsequent use of an equal sign, while not technically correct, is close enough to be warranted.

Theorem: If f (t) is piecewise smooth and periodic then the Fourier series for f

1. converges to f (t) at values of t where f is continuous;

2. converges to the average of f (t−) and f (t+) where it has a jump discontinuity.

Example. Square wave. No matter what the endpoint behavior of f (t), the Fourier series converges to the function shown on the right.

[Figure: the original f (t) (left) and the function its Fourier series converges to (right); filled and open dots mark the values at the jumps.]

Example. Continuous sawtooth: Fourier series converges to f (t).

Example.

[Figure: the original f (t) (left) and its Fourier series (right).]

An Interpretation of Fourier Series

Fourier Analysis is a name sometimes used to denote the decomposition of a function f (t) into the sum of its sinusoidal harmonics, which are also called its Fourier components. Fourier Synthesis is a name used to denote the building up of a function f (t) by adding up its successive Fourier components; that is, the reconstruction of a function f (t) from its Fourier components.

There are electronic devices which can perform both types of operations, called Fourier analyzers and Fourier synthesizers. The ear and the brain also function as, respectively, a Fourier analyzer and a Fourier synthesizer! We describe this briefly (in very general terms).

The input to the ear is a time-varying pressure wave-form f (t). The inner ear has an array of about twenty thousand hair-like cells, each of which resonates at a different frequency. Each of the individual Fourier components of f (t) stimulates a different one of these “hair" cells. Thus this array of cells acts all together as a Fourier analyzer. Each hair cell which gets selected by a component of f (t) to be driven into motion then stimulates an attached nerve which sends a signal to the brain. The brain (being the smart and capable device it is) then somehow combines or synthesizes these individual received Fourier components and produces a reconstructed approximation of the function f (t) as their sum. This reconstructed f (t)-pattern in the brain is then what we experience as the sound corresponding to the incoming pressure-wave signal f (t).


Gibbs’ Phenomenon

In practice it may be impossible to use all the terms of a Fourier series. For example, suppose we have a device that manipulates a periodic signal by first finding the Fourier series of the signal, then manipulating the sinusoidal components, and, finally, reconstructing the signal by adding up the modified Fourier series. Such a device will only be able to use a finite number of terms of the series.

Gibbs’ phenomenon occurs near a jump discontinuity in the signal. It says that no matter how many terms you include in your Fourier series there will always be an error in the form of an overshoot near the discontinuity. The overshoot will always be about 9% of the size of the jump.

We illustrate with the example of the square wave sq(t). The Fourier series of sq(t) fits it well at points of continuity. But there is always an overshoot of about .18 (9% of the jump of 2) near the points of discontinuity.

[Figures: four graphs of partial sums of the Fourier series of sq(t), titled “Gibbs: max n = 1”, “Gibbs: max n = 3”, “Gibbs: max n = 9” and “Gibbs: max n = 33”; the vertical axis is marked at ±1 and ±1.18.]

In these figures, for example, ’max n=9’ means we included the terms for n = 1, 3, 5, 7 and 9 in the Fourier sum

$$\frac{4}{\pi}\left(\sin t + \frac{\sin 3t}{3} + \frac{\sin 5t}{5} + \frac{\sin 7t}{7} + \frac{\sin 9t}{9}\right).$$
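The size of the overshoot can be estimated by sampling a partial sum just to the right of the jump at t = 0 and recording its largest value. The following is a rough sketch: the names `square_partial` and `peak_near_jump` and the sampling density are assumptions made for illustration, and the peak is found by brute-force sampling rather than by calculus.

```python
import math

def square_partial(t, max_n):
    """Partial sum of (4/pi) * sum over odd n <= max_n of sin(nt) / n."""
    return 4 / math.pi * sum(math.sin(n * t) / n for n in range(1, max_n + 1, 2))

def peak_near_jump(max_n, samples=2000):
    """Largest sampled value of the partial sum on the interval (0, pi/2]."""
    step = (math.pi / 2) / samples
    return max(square_partial(k * step, max_n) for k in range(1, samples + 1))

# The peak stays near 1.18 as more terms are included; it does not shrink to 1.
for max_n in (9, 33, 129):
    print(max_n, peak_near_jump(max_n))
```

Adding more terms moves the peak closer to the jump but, consistent with the discussion above, does not reduce its height toward 1.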

Application to Infinite Series

There is a famous formula found by Euler:

$$\sum_{n=1}^{\infty} \frac{1}{n^2} = \frac{\pi^2}{6}. \tag{1}$$

We’ll show how you can use a Fourier series to get this result.

Consider the period 2π function given by $f(t) = t\left(\pi - \frac{t}{2}\right)$ on [0, 2π].

Figure 1: Graph of f (t).

First, we compute the Fourier series of f (t). Since f is even, the sine terms are all 0. For the cosine terms it is slightly easier to integrate over a full period from 0 to 2π rather than doubling the integral over the half-period. We give the results, but leave the details of the integration by parts to the reader. For n = 0 we have

$$a_0 = \frac{1}{\pi}\int_0^{2\pi} t(\pi - t/2)\,dt = \frac{2\pi^2}{3}$$

and for n ≠ 0 we have

$$a_n = \frac{1}{\pi}\int_0^{2\pi} t(\pi - t/2)\cos(nt)\,dt = \frac{1}{\pi}\left[\frac{\pi t\sin(nt)}{n} - \frac{t^2\sin(nt)}{2n} + \frac{\sin(nt)}{n^3} + \frac{\pi\cos(nt)}{n^2} - \frac{t\cos(nt)}{n^2}\right]_0^{2\pi} = -\frac{2}{n^2}.$$

Thus the Fourier series is

$$f(t) = \frac{\pi^2}{3} - \sum_{n=1}^{\infty} \frac{2\cos(nt)}{n^2}.$$

Since the function f (t) is continuous, the series converges to f (t) for all t. Plugging in t = 0, we then get

$$f(0) = 0 = \frac{\pi^2}{3} - \sum_{n=1}^{\infty} \frac{2}{n^2}.$$

A little bit of algebra then gives Euler’s result (1).
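Both the Fourier series and Euler's formula can be spot-checked numerically. This is a minimal sketch with hypothetical names (`f`, `series`, `basel_partial`); truncated sums only approximate the infinite series, so agreement is expected to a few decimal places, not exactly.

```python
import math

def f(t):
    """f(t) = t(pi - t/2), valid for t in [0, 2*pi]."""
    return t * (math.pi - t / 2)

def series(t, terms):
    """Truncated Fourier series pi^2/3 - sum of 2 cos(nt) / n^2."""
    return math.pi ** 2 / 3 - sum(2 * math.cos(n * t) / n ** 2 for n in range(1, terms + 1))

def basel_partial(terms):
    """Partial sum 1 + 1/4 + 1/9 + ... + 1/terms^2."""
    return sum(1 / n ** 2 for n in range(1, terms + 1))

print(f(1.0), series(1.0, 2000))             # function versus truncated series
print(basel_partial(100000), math.pi ** 2 / 6)  # partial sum versus pi^2/6
```

The partial sums of 1/n² approach π²/6 slowly (the error after N terms is roughly 1/N), which is why a large number of terms is used here.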

ODE’s with Periodic Input, Resonance

Fourier series can be used to solve an inhomogeneous linear time invariant (LTI) DE

p(D)x = f (t) (1)

in the case where f (t) is a periodic function. The main idea is to use Fourier series and the superposition principle to reduce the problem to solving (1) when the input f (t) is a sinusoid.

Very briefly: if f (t) is periodic then its Fourier series gives f as a sum of sines and cosines. The superposition principle allows us to find the response of (1) to each of the sine or cosine terms separately and then add up these individual responses to find the response of the system to the periodic signal f .

The particular solution found in this way will also be a sum of sines and cosines with the same period as the input. We will call it the steady-periodic solution, xsp(t). One way in which the steady-periodic solution is useful is that it reveals when resonances will occur, and at what frequencies.

In this session we will also show how Fourier series can be used to analyze sound waves.

Example: Simple Harmonic Oscillator

Example. Let f (t) be the odd square wave of period 2π with f (t) = 1 for 0 < t < π. Use Fourier series to solve the DE

$$\ddot{x} + 9.1x = f(t). \tag{1}$$

Solution. From previous examples we know the Fourier series for f (t),

$$f(t) = \frac{4}{\pi}\left(\sin t + \frac{\sin 3t}{3} + \frac{\sin 5t}{5} + \cdots\right) = \frac{4}{\pi}\sum_{n\ \text{odd}} \frac{\sin nt}{n}.$$

So the DE (1) becomes

$$\ddot{x} + 9.1x = \frac{4}{\pi}\left(\sin t + \frac{\sin 3t}{3} + \frac{\sin 5t}{5} + \cdots\right). \tag{2}$$

Step 1: Solve the DE with a single sine function as input. That is, solve

$$\ddot{x}_n + 9.1x_n = \frac{\sin nt}{n}. \tag{3}$$

Notice, we use the index n so we can tell our solutions apart. Also notice that equation (3) does not include the factor 4/π; we will bring that back in the superposition step.

We have a lot of experience solving equation (3) using complex replacement and the Exponential Response formula. We get particular solutions

$$x_{n,p}(t) = \frac{\sin nt}{n(9.1 - n^2)}.$$

Step 2: Use superposition to get a particular solution xp to (2). Here we line up the DE and the solution so you can see superposition in action:

$$\ddot{x} + 9.1x = \frac{4}{\pi}\left(\sin t + \frac{\sin 3t}{3} + \frac{\sin 5t}{5} + \cdots\right) = \frac{4}{\pi}\sum_{n\ \text{odd}} \frac{\sin nt}{n}$$

$$x_{sp}(t) = \frac{4}{\pi}\left(x_{1,p}(t) + x_{3,p}(t) + x_{5,p}(t) + \cdots\right) = \frac{4}{\pi}\sum_{n\ \text{odd}} x_{n,p}(t)$$

$$= \frac{4}{\pi}\left(\frac{\sin t}{9.1 - 1} + \frac{\sin 3t}{3(9.1 - 9)} + \frac{\sin 5t}{5(9.1 - 25)} + \cdots\right) = \frac{4}{\pi}\sum_{n\ \text{odd}} \frac{\sin nt}{n(9.1 - n^2)}. \tag{4}$$

This is called the steady periodic solution.

Near resonance: The amplitudes of each of the terms in (4) are:

$$\frac{4}{\pi}\cdot\frac{1}{9.1 - 1} \approx 0.157, \qquad \frac{4}{\pi}\cdot\frac{1}{3(9.1 - 9)} \approx 4.244, \qquad \frac{4}{\pi}\cdot\frac{1}{5(9.1 - 25)} \approx -0.016$$

for n = 1, 3, 5 respectively. Then for n > 5 the amplitudes are much smaller. We see the n = 3 term in the steady periodic response xsp(t) has by far the biggest amplitude.
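The coefficients of the terms in (4) are easy to tabulate. The sketch below is illustrative only; the function name `coefficient` and its parameter `k` (standing for the 9.1 in the DE) are hypothetical.

```python
import math

def coefficient(n, k=9.1):
    """Coefficient of sin(nt) in x_sp for x'' + k x = sq(t), with n odd."""
    return 4 / (math.pi * n * (k - n ** 2))

# n = 3 dominates because k - n^2 = 0.1 is small when n = 3.
for n in (1, 3, 5, 7, 9):
    print(n, round(coefficient(n), 3))
```

Changing `k` to a value near 25 would instead make the n = 5 term dominate, which is the same near-resonance effect at the fifth harmonic.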

We can explain this by noticing that the natural frequency of this system is $\sqrt{9.1} \approx 3$ and so the system has a resonant-type response to the “embedded third harmonic” $\frac{\sin 3t}{3}$ in the input signal.

Notice that the input signal has base (fundamental) frequency 1, so the presence of this third harmonic is not apparent to the eye, and yet the driven oscillator picked it out in its response, which has a dominant frequency three times the fundamental frequency of the input.

There is a simple way to visualize this type of phenomenon: you can push a pendulum swing into resonance even if you give it a push only every third time it comes momentarily to rest at its maximum height, instead of pushing it every time.


Harmonic Frequency Response Applet

As usual, start the applet and play with it a little bit.

In this applet the input has a fixed angular frequency of 1 and the ωn slider adjusts the resonant frequency of the system.

Choose f (t) to be the sine wave. Look at what happens as you change ωn. Why does the amplitude of the response go to infinity when ωn = 1?

Now choose f (t) to be the square wave. Notice that the amplitude of the response becomes infinite at ωn = 1, 3 or 5.

Question: As ωn gets close to 1, 3 or 5 what is the dominant frequency in the output?

Answer: You should have seen that with ωn near 1 the output resembles a frequency 1 sine wave. For ωn near 3 the dominant frequency in the output is 3, i.e. there are three peaks in the oscillation over one cycle of the square wave. Likewise for ωn near 5 the dominant frequency is 5.

We can explain this using Fourier series. The square wave has Fourier series

$$f(t) = \frac{4}{\pi}\sum_{n\ \text{odd}} \frac{\sin nt}{n}.$$

Each term in the series affects the system. If the system has natural frequency 3 then the sin 3t term causes it to resonate with a large amplitude at that frequency. Thus, the response to that term is far larger than the response to any other term.

We will explore this further later in this session.


Example: Damped Harmonic Oscillator

Example. Let f (t) be the triangle wave shown in figure 1. Solve the differential equation

$$\ddot{x} + 2\dot{x} + 9x = f(t).$$

[Figure 1: graph of the triangle wave f (t); the t-axis is marked at −π, π, 2π, 3π, 4π and the vertical axis at 1.]

Solution. Using a previous example, or computing directly, we have the Fourier series for f (t) is

$$f(t) = \frac{1}{2} - \frac{4}{\pi^2}\left(\cos t + \frac{\cos 3t}{3^2} + \frac{\cos 5t}{5^2} + \cdots\right).$$

We follow the same steps as in the example in the previous note.

Step 1: Solving for the individual components: Solve:

$$\ddot{x}_n + 2\dot{x}_n + 9x_n = \cos nt \tag{1}$$

If n = 0 we get $x_{0,p} = \frac{1}{9}$.

For n ≥ 1 we have

Complex replacement: $\ddot{z}_n + 2\dot{z}_n + 9z_n = e^{int}$, $x_n = \text{Re}(z_n)$.

Exponential Response formula: $z_{n,p} = \dfrac{e^{int}}{9 - n^2 + 2in}$.

Polar coords: $9 - n^2 + 2in = R_n e^{i\phi_n}$, where

$$R_n = \sqrt{(9 - n^2)^2 + 4n^2} \quad\text{and}\quad \phi_n = \text{Arg}(9 - n^2 + 2in) = \tan^{-1}\frac{2n}{9 - n^2}$$

(since the complex number is in the first or second quadrant we must take the arctangent between 0 and π).

Thus, $z_{n,p} = \dfrac{1}{R_n} e^{i(nt - \phi_n)}$, which implies $x_{n,p} = \dfrac{1}{R_n}\cos(nt - \phi_n)$.

Step 2: Superposition. To make things easier in step one we did not include the Fourier coefficients of the input in the DE (1). To use superposition we need to include them here.

$$x_{sp}(t) = \frac{1}{18} - \frac{4}{\pi^2}\left(\frac{\cos(t - \phi_1)}{R_1} + \frac{\cos(3t - \phi_3)}{3^2 R_3} + \frac{\cos(5t - \phi_5)}{5^2 R_5} + \cdots\right),$$

with the formulas for Rn and φn as above.
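The quantities Rn and φn are just the modulus and argument of the complex number 9 − n² + 2in, so they can be computed directly with complex arithmetic. The sketch below is an illustration under stated assumptions: the names `gain_and_phase` and `x_sp` and the truncation `max_n` are hypothetical, and `cmath.phase` is used because it returns an angle in (−π, π], which lies between 0 and π here since the imaginary part 2n is positive.

```python
import cmath
import math

def gain_and_phase(n):
    """Return (1/R_n, phi_n) for p(in) = 9 - n^2 + 2in, where p(s) = s^2 + 2s + 9."""
    p = complex(9 - n ** 2, 2 * n)
    return 1 / abs(p), cmath.phase(p)

def x_sp(t, max_n=99):
    """Truncated steady-periodic response to the triangle-wave input."""
    total = 1 / 18  # response to the constant term 1/2
    for n in range(1, max_n + 1, 2):
        gain, phi = gain_and_phase(n)
        total -= 4 / math.pi ** 2 * gain * math.cos(n * t - phi) / n ** 2
    return total

for n in (1, 3, 5):
    print(n, gain_and_phase(n))
```

For n = 3 the complex number is purely imaginary (6i), so the gain is 1/6 and the phase lag is π/2.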


General Case

It is actually just as easy to write out the formula for the Fourier series expansion of the steady-periodic solution xsp(t) to the general second-order LTI DE p(D)x = f (t) with f (t) periodic as it was to work out the previous example - the only difference is that now we use letters instead of numbers. We will choose the letters used for the spring-mass-dashpot system, but clearly the derivation and formulas will work with any three parameters. For simplicity we will take the case of f (t) even (i.e. cosine series).

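Problem: Solve $m\ddot{x} + b\dot{x} + kx = f(t)$ for the steady-periodic response xsp(t), where

$$f(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n\cos\left(n\frac{\pi}{L}t\right).$$

Solution

Characteristic polynomial: $p(s) = ms^2 + bs + k$. Solving for the component pieces:

$$m\ddot{x}_n + b\dot{x}_n + kx_n = \cos\left(n\frac{\pi}{L}t\right)$$

For n = 0 we get $x_{0,p} = \frac{1}{k}$.

For n ≥ 1:

Complex replacement: $m\ddot{z}_n + b\dot{z}_n + kz_n = e^{in\frac{\pi}{L}t}$, $x_n = \text{Re}(z_n)$.

Exponential Response formula: $z_{n,p}(t) = \dfrac{e^{in\frac{\pi}{L}t}}{p(in\frac{\pi}{L})}$.

Polar coords: $p(in\frac{\pi}{L}) = k - m(n\frac{\pi}{L})^2 + i\,b\,n\frac{\pi}{L} = |p(in\frac{\pi}{L})|\,e^{i\phi_n}$, where

$$\left|p\left(in\tfrac{\pi}{L}\right)\right| = \sqrt{\left(k - m\left(n\tfrac{\pi}{L}\right)^2\right)^2 + b^2\left(n\tfrac{\pi}{L}\right)^2} \quad\text{and}\quad \phi_n = \text{Arg}\left(p\left(in\tfrac{\pi}{L}\right)\right) = \tan^{-1}\frac{b\,n\frac{\pi}{L}}{k - m\left(n\frac{\pi}{L}\right)^2} \quad\text{(phase lag).}$$

Thus, $z_{n,p}(t) = g_n e^{i(n\frac{\pi}{L}t - \phi_n)}$, with $g_n = \dfrac{1}{|p(in\frac{\pi}{L})|}$ (gain).

Taking the real part of zn,p we get $x_{n,p}(t) = g_n\cos\left(n\frac{\pi}{L}t - \phi_n\right)$.

Now using superposition and putting back in the coefficients an we get:

$$x_{sp}(t) = \frac{a_0}{2}x_{0,p} + \sum_{n=1}^{\infty} a_n x_{n,p}(t) = \frac{a_0}{2k} + \sum_{n=1}^{\infty} g_n a_n\cos\left(n\frac{\pi}{L}t - \phi_n\right)$$

This is the general formula for the steady periodic response of a second-order LTI DE to an even periodic driver f (t).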
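The general formula translates almost line by line into code. The following is a hedged sketch, not a definitive implementation: the function name `steady_periodic`, the list convention `a[n]` for the cosine coefficients, and the explicit error on exact resonance are all assumptions introduced here for illustration.

```python
import cmath
import math

def steady_periodic(a, m, b, k, L):
    """Build x_sp(t) for m x'' + b x' + k x = a[0]/2 + sum of a[n] cos(n pi t / L).

    a is a list of cosine coefficients a[0], a[1], ..., a[N] (truncated series).
    """
    terms = []
    for n in range(1, len(a)):
        w = n * math.pi / L                 # angular frequency of the n-th term
        p = complex(k - m * w ** 2, b * w)  # p(i w) for p(s) = m s^2 + b s + k
        if p == 0:
            raise ValueError("pure resonance: p(i w) = 0, no sinusoidal response")
        gain = 1 / abs(p)                   # g_n
        phase_lag = cmath.phase(p)          # phi_n = Arg(p(i w))
        terms.append((gain * a[n], w, phase_lag))

    def x_sp(t):
        return a[0] / (2 * k) + sum(amp * math.cos(w * t - phi) for amp, w, phi in terms)

    return x_sp

# Usage: response of x'' + 2x' + 9x to the single input cos(t), with L = pi.
x = steady_periodic([0.0, 1.0], m=1.0, b=2.0, k=9.0, L=math.pi)
print(x(0.0), x(1.0))
```

Returning a function rather than a list of numbers keeps the usage close to the mathematical notation: the result can be evaluated at any t, plotted, or differentiated numerically.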

Fourier Coefficients: Complex with Sound

Caution: If you listen to the sound through headphones, start by setting the volume low and increase it slowly. Only let the volume get high enough to hear the sounds. This is especially important when playing pure sine waves because the brain is used to auditory input of many frequencies at once. To get the same apparent loudness from a single frequency requires a large amplitude. Prolonged exposure to a high amplitude sound of a given frequency can damage your hearing at that frequency.

As usual, first open the applet and play with it.

The applet makes a Fourier series of a function out of complex exponentials instead of sines and cosines. It then graphs the real part of the function and plays the periodic sound whose pressure wave corresponds to the graph.

Using the applet.

1. The fundamental (base) frequency ν is given in kilohertz (kHz). Therefore, the fundamental angular frequency is ω = 2000π ν. There is a slider for setting the value of ν.

2. The function f (t) is given by f (t) = ∑ cneinωt, where the cn are complex coefficients. Therefore f (t) is a complex-valued function.

3. There is a parameter φ with a slider. Leave it at 0 for now.

4. The graph shows Re(eiφ f (t)). The sound played corresponds to this graph.

5. To select one of the coefficients cn, click on the yellow dot for that coefficient in either of the windows on the lower right. There are two ways to adjust the value of cn. First, you can use the mouse to drag the white dot representing the value of cn around the complex plane displayed on the lower left. The second method is to use the sliders on the lower right. The top slider represents the magnitude |cn| and the bottom slider the polar angle Arg(cn). Notice that as you move the sliders the point in the complex plane also moves.

Exploring the Applet

Now click the reset button to set all the coefficients to 0. Set φ to 0.

Make sure the ’ f (t) real’ checkbox is not selected. Set the frequency to the lower A value. (This is the musical note A440 (Hz).) Select the first coefficient (c1) and set the magnitude to 2. The graph should be a sinusoid.

Now start the sound; you should hear a steady pure tone. While the sound is playing adjust Arg(c1). What happened to the graph? What happened to the sound?

The graph should slide left or right as you change the phase angle. The sound shouldn’t change: your ear does not detect phase.

Play with setting the higher harmonics, i.e. setting the other coefficients. How does the sound change? Does the pitch that you’re hearing change?

If the higher harmonics have much lower amplitudes than the fundamental frequency, then the fundamental pitch will stay the same but the quality of the sound will change. If the amplitude of a higher harmonic approaches that of the fundamental you may begin to hear it as a separate note.

Make sure the ’ f (t) real’ checkbox is not selected and φ = 0. Try to adjust the coefficients to get a square wave. Hint: Which coefficients of the square wave are nonzero? How do you get a sine function out of complex exponentials?

As we found in an earlier session, the Fourier series of the square wave with fundamental angular frequency ω = 2πν and amplitude 1 is

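$$\frac{4}{\pi}\left(\sin(\omega t) + \frac{\sin(3\omega t)}{3} + \frac{\sin(5\omega t)}{5} + \cdots\right).$$

Since sin(x) is the real part of $-ie^{ix}$ and we selected amplitude 2, we should set

$$c_1 = -\frac{8i}{\pi}, \quad c_3 = -\frac{8i}{3\pi}, \quad c_5 = -\frac{8i}{5\pi}, \quad c_7 = -\frac{8i}{7\pi}, \quad c_9 = -\frac{8i}{9\pi}.$$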

(All the even coefficients are 0.)
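The choice of coefficients can be checked outside the applet by evaluating the real part of the complex sum and comparing it with the amplitude 2 square-wave partial sum. This is only a sketch: the function names, the default ω = 1 and the cutoff at n = 9 are assumptions made here and are not taken from the applet itself.

```python
import cmath
import math

def c(n):
    """c_n = -8i / (n pi) for odd n, and 0 for even n."""
    return -8j / (n * math.pi) if n % 2 == 1 else 0

def applet_signal(t, omega=1.0, max_n=9, phi=0.0):
    """Re(e^{i phi} * sum over n = 1..max_n of c_n e^{i n omega t})."""
    total = sum(c(n) * cmath.exp(1j * n * omega * t) for n in range(1, max_n + 1))
    return (cmath.exp(1j * phi) * total).real

def square_partial(t, omega=1.0, max_n=9):
    """(8/pi) * sum over odd n <= max_n of sin(n omega t) / n."""
    return 8 / math.pi * sum(
        math.sin(n * omega * t) / n for n in range(1, max_n + 1, 2)
    )

for t in (0.3, 1.0, 2.0, 4.0):
    print(t, applet_signal(t), square_partial(t))
```

The two columns should agree to rounding error, since the real part of −i e^{ix} is sin(x).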

Could you get the square wave using the coefficients c−1, c−3 etc.? Could you get the square wave using real coefficients and adjusting the value of φ ?


Step and Delta Functions: Introduction

This session will make two additions to our mathematical modeling toolkit: step functions and delta functions. These are simple functions modeling idealized signals.

A step function represents an idealized signal that switches from off to on at a specific time. That is, its value jumps from 0 to 1. In the real world the signal would go through a transition phase, which would take a small amount of time to get from 0 to 1. We idealize it as making the transition instantaneously. The most basic step function is the unit or Heaviside step function, u(t). It is 0 for t < 0 and 1 for t > 0. Its graph looks like

[Figure: The graph of the unit step function u(t).]

A delta function represents an idealized input that acts all at once. If a finite force pushes on a mass it changes the momentum of the mass over time. We can achieve the same change in momentum with a small force acting over a long time or a large force acting over a short time. If the force acts over a very short time we call it an impulse. The unit delta function δ(t) (also called the unit impulse function) models an idealized impulse, which can be thought of as an infinite force acting over an infinitesimal amount of time and causes a unit change in the momentum of the mass.

In the above, the delta function represented an idealized impulsive force acting on a second order mechanical system. First order systems, and indeed systems of any order, also have the notion of an impulse, which can also be modeled by a delta function.

Step functions and delta functions are not differentiable in the usual sense, but they do have what we call generalized derivatives. In fact, as a generalized derivative we have u′(t) = δ(t). Since step and delta functions can also be integrated they can be used in DE’s.

Step and delta functions are of fundamental importance in our study of LTI systems. For example, if we know the response of such a system to either the unit delta or unit step function then we can compute its response to any input whatsoever.


Step and Box Functions

1. Heaviside Unit Step Function

The unit step function is defined by

$$u(t) = \begin{cases} 0 & \text{for } t < 0 \\ 1 & \text{for } t > 0 \end{cases}$$

The reason for the name unit step can be seen in the graph.

[Figures: graphs of u(t) and of u(t − a).]

The first graph shows the function u(t). The second graph shows u(t − a), which is simply u(t) shifted to the right.

$$u(t - a) = \begin{cases} 0 & \text{for } t < a \\ 1 & \text{for } t > a \end{cases}$$

A few details need to be highlighted.

1. u(t) is also called the Heaviside function.

2. u(t) is not defined when t = 0. Looking at the graph we see that u(t) has a jump discontinuity at t = 0.

3. The graph shows that u(0−) = 0 and u(0+) = 1. Here, u(0−) means the limit of u(t) as t approaches 0 from the left –called the left-hand limit. Likewise, u(0+) means the limit of u(t) as t approaches 0 from the right.

4. In the graphs we used dashed lines at the jump discontinuity. These lines are not part of the graph and we could have left them out. It is also common to use solid lines. Strictly speaking this is incorrect, but it gives nicer looking figures (for more on this see the next section).

2. Models

We can use u(t) to model an on/off process. Suppose a light turns on; first it is dark, then it is light. The basic model is the unit step function.

Of course a light doesn’t reach its steady state instantaneously; it takes a small amount of time. If we use a finer time scale, you can see what happens. It might move up smoothly; it might overshoot; it might move up in fits and starts as different elements come on line. If we zoomed in near t = 0 the graph might actually look like

[Figure: a zoomed-in graph near t = 0, with the value 1 and the time .01 marked on the axes.]

At the longer time scale, we don’t care about these details. Modeling the process by u(t) lets us ignore them.

3. Box Functions

When we modeled the light with u(t) we assumed the light went on and stayed on forever. Eventually the light will be turned off or burn out. To be general, let’s assume the light goes on at time a and off at time b. We can model this with the function

$$u_{ab}(t) = \begin{cases} 0 & \text{for } t < a \\ 1 & \text{for } a < t < b \\ 0 & \text{for } b < t. \end{cases}$$

The graph of this is

[Figure: graph of uab(t) = u(t − a) − u(t − b), equal to 1 between a and b.]

The graph shows why this is often called a box function. If you plot u(t − a) and u(t − b) on the same axes you will find that

uab(t) = u(t − a) − u(t − b).

We will usually dispense with the notation uab(t) and use the formula u(t − a) − u(t − b) for the box function.

4. Switches

By multiplying by a function f (t) we can use step and box functions as switches to turn f (t) on or off.

The graphs show step and box functions acting as switches.

• The first plot shows f (t).

• The second shows u(t − a) being used to switch f (t) on at t = a. That is, u(t − a) f (t) is 0 for t < a and agrees with f (t) for t > a.

• The third shows u(t − a) − u(t − b) being used to turn f (t) on in the window a < t < b and off outside it.

• The fourth graph shows u(t − a) f (t − a). That is, first f (t) is translated to the right a units and the result is switched on at time a.

5. u-format and Cases Format

We now have two ways to express functions that change formulas for different intervals of t.

Example. Suppose f (t) is 0 for t < 0, t for 0 < t < 1, t^2 for 1 < t < 2 and 2t for t > 2. Express f (t) in both u and cases formats.

Solution. Cases format expresses f (t) by specifying the formula for each case:

$$f(t) = \begin{cases} 0 & \text{for } t < 0 \\ t & \text{for } 0 < t < 1 \\ t^2 & \text{for } 1 < t < 2 \\ 2t & \text{for } 2 < t. \end{cases}$$

u-format uses step and box functions to turn on and off expressions:

$$f(t) = (u(t) - u(t-1)) \cdot t + (u(t-1) - u(t-2)) \cdot t^2 + u(t-2) \cdot 2t.$$



Notice how each case tells us which step or box functions to use as switches and how each u function tells us where the cases change.

Example. Write $f(t) = u(t) \cdot 4t + u(t-2) \cdot t^2 + u(t-4) \cdot t^3/4$ in cases format.

Solution.

$$f(t) = \begin{cases} 0 & \text{for } t < 0 \\ 4t & \text{for } 0 < t < 2 \\ 4t + t^2 & \text{for } 2 < t < 4 \\ 4t + t^2 + t^3/4 & \text{for } 4 < t. \end{cases}$$

Notice how there are no off switches in the expression for f (t), so in cases format the number of terms in each successive case grows as the u-switches turn on.
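A quick way to gain confidence in a conversion between the two formats is to evaluate both at a few sample points. The sketch below does this for the two examples of this section; the function names are made up for illustration, the last term of the second example is read as t^3/4, and u(0) = 1 is a convention chosen here.

```python
def u(t):
    """Unit step, with the convention u(0) = 1 chosen for this sketch."""
    return 1.0 if t >= 0 else 0.0


def f_cases(t):
    """First example in cases format."""
    if t < 0:
        return 0.0
    if t < 1:
        return t
    if t < 2:
        return t ** 2
    return 2 * t


def f_u(t):
    """First example in u-format: boxes switch each formula on and off."""
    return (u(t) - u(t - 1)) * t + (u(t - 1) - u(t - 2)) * t ** 2 + u(t - 2) * 2 * t


def g_u(t):
    """Second example in u-format: on-switches only."""
    return u(t) * 4 * t + u(t - 2) * t ** 2 + u(t - 4) * t ** 3 / 4


def g_cases(t):
    """Second example in cases format: the terms accumulate."""
    if t < 0:
        return 0.0
    if t < 2:
        return 4 * t
    if t < 4:
        return 4 * t + t ** 2
    return 4 * t + t ** 2 + t ** 3 / 4


# Sample points chosen away from the switching times.
samples = [-1.0, 0.5, 1.5, 3.0, 5.0]
f_matches = all(abs(f_u(t) - f_cases(t)) < 1e-12 for t in samples)
g_matches = all(abs(g_u(t) - g_cases(t)) < 1e-12 for t in samples)
```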


Using Step Functions as Switches

Quiz: What is the equation for the function which agrees with f (t) between a and b (assume a < b) and is zero outside this window?

Choices:

a) (u(t − b) − u(t − a)) f (t)

b) (u(t − a) − u(t − b)) f (t − a)

c) (u(t − a) − u(t − b)) f (t)

d) u(t − a) f (t − a) − u(t − b) f (t − b)

e) none of these

Answer: (c): (u(t − a) − u(t − b)) f (t). The box function u(t − a) − u(t − b) is 1 between a and b and 0 outside it.





Delta Functions: Unit Impulse

1. Introduction

In our discussion of the unit step function u(t) we saw that it was an idealized model of a quantity that goes from 0 to 1 very quickly. In the idealization we assumed it jumped directly from 0 to 1 in no time.

In this note we will have an idealized model of a large input that acts over a short time. We will call this model the delta function or Dirac delta function or unit impulse.

After constructing the delta function we will look at its properties. The first is that it is not really a function. This won’t bother us; we will simply call it a generalized function. The reason it won’t bother us is that the delta function is useful and easy to work with. Inside integrals or as input to differential equations we will see that it is much simpler than almost any other function.

2. Delta Function as Idealized Input

Suppose that radioactive material is dumped in a container. The equation governing the amount of material in the tank is

$$\dot{x} + kx = q(t),$$

where x(t) is the amount of radioactive material (in kg), k is the decay rate of the material (in 1/year), and q(t) is the rate at which material is being added to the dump (in kg/year).

The input q(t) is in units of mass/time, say kg/year. So, the total amount dumped into the container from time 0 to time t is

$$Q(t) = \int_0^t q(u)\,du.$$

Equivalently, $\dot{Q}(t) = q(t)$.

To keep things simple we will assume that q(t) is only nonzero for a short amount of time and that the total amount of radioactive material dumped over that period is 1 kg. Here are the graphs of two possibilities for q(t) and Q(t).

Figure 1: two possible graphs of q(t) and Q(t), both with total input = 1. In the first, q(t) is a box of height 2 and width 1/2; in the second, a box of height 8 and width 1/8.

It is easy to see that each of the boxes on the left side of Figure 1 has total area equal to 1. Thus, the graphs for Q(t) rise linearly to 1 and then stay equal to 1 thereafter. In other words, the total amount dumped in each case is 1.

Now let qh(t) be a box of width h and height 1/h. As h → 0, the width of the box becomes 0, the graph looks more and more like a spike, yet it still has area 1 (see Figure 2).

Figure 2: Box functions qh(t), shown for h = 1, h = 1/2 and h = 1/16, becoming the delta function δ(t) as h → 0.

We define the delta function to be the formal limit

$$\delta(t) = \lim_{h \to 0} q_h(t).$$

Graphically δ(t) is represented as a spike or harpoon at t = 0. It is an infinitely tall spike of infinitesimal width enclosing a total area of 1 (see figure 2, rightmost graph).



As an input function δ(t) represents the ideal case where 1 unit of material is dumped in all at once at time t = 0.


3. Properties of δ(t)

We list the properties of δ(t) below.

1. From the previous section we have

$$\delta(t) = \begin{cases} 0 & \text{if } t \neq 0 \\ \infty & \text{if } t = 0. \end{cases}$$

The graph is represented as a spike at t = 0. (See figure 2.)

2. Because δ(t) is the limit of graphs of area 1, the area under its graph is 1. More precisely:

$$\int_c^d \delta(t)\,dt = \begin{cases} 1 & \text{if } c < 0 < d \\ 0 & \text{otherwise.} \end{cases}$$

3. For any continuous function f (t) we have

$$f(t)\delta(t) = f(0)\delta(t) \quad\text{and}\quad \int_c^d f(t)\delta(t)\,dt = \begin{cases} f(0) & \text{if } c < 0 < d \\ 0 & \text{otherwise.} \end{cases}$$

The first statement follows because δ(t) is 0 everywhere except at t = 0. The second follows from the first and property (2).

4. We can place the delta function over any value of t: δ(t − a) is 0 everywhere but at t = a. Its total area remains 1. Its graph is now a spike shifted to be over t = a; and we have

$$f(t)\delta(t-a) = f(a)\delta(t-a), \qquad \int_c^d f(t)\delta(t-a)\,dt = \begin{cases} f(a) & \text{if } c < a < d \\ 0 & \text{otherwise.} \end{cases}$$

[Figure: graph of δ(t − a), a spike over t = a.]

5. δ(t) = u′(t), where u(t) is the unit step function. Because u(t) has a jump at 0, δ(t) is not a derivative in the usual sense, but is called a generalized derivative. This is explained below.



6. We defined δ(t) as a limit of a sequence of box functions, all with unit area and which, in the limit, become an infinite spike over t = 0. Box functions are simple, but not special. Any sequence of functions with these properties has δ(t) as its limit.

7. In practical terms, you should think of δ(t) as any function of unit area, concentrated very near t = 0.

8. δ(t) is not really a function. We call it a generalized function.

9. In arriving at these properties we have skipped over some important technical details in the analysis. Generally property (3) is taken to be the formal definition of δ(t), from which the other properties follow.

4. Examples of integration

Properties (3) and (2) show that δ(t) is very easy to integrate, as the following examples show:

Example 1. $\int_{-5}^{5} 7e^{t^2}\cos(t)\,\delta(t)\,dt = 7$. All we had to do was evaluate the integrand at t = 0.

Example 2. $\int_{-5}^{5} 7e^{t^2}\cos(t)\,\delta(t-2)\,dt = 7e^4\cos(2)$. All we had to do was evaluate the integrand at t = 2.

Example 3. $\int_{-5}^{1} 7e^{t^2}\cos(t)\,\delta(t-2)\,dt = 0$. Since t = 2 is not in the interval of integration the integrand is 0 on the entire interval.
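The three examples can be cross-checked numerically by going back to the definition of δ(t) as a limit of narrow boxes. The sketch below replaces δ(t − a) by a box of width h and height 1/h on [a, a + h] and integrates with the midpoint rule; the helper names, the box width h and the number of subintervals n are all choices made for this illustration, not part of the notes.

```python
import math


def integrand(t):
    """The smooth factor 7 e^{t^2} cos(t) from Examples 1-3."""
    return 7.0 * math.exp(t * t) * math.cos(t)


def box_integral(f, a, c, d, h=1e-5, n=1000):
    """Approximate the integral over [c, d] of f(t) q_h(t - a) dt.

    q_h is a box of width h and height 1/h on [0, h], so q_h(t - a)
    stands in for delta(t - a). Uses the midpoint rule with n subintervals.
    """
    lo = max(a, c)
    hi = min(a + h, d)
    if hi <= lo:
        # The box lies outside the interval of integration.
        return 0.0
    step = (hi - lo) / n
    total = sum(f(lo + (i + 0.5) * step) for i in range(n)) * step
    return total / h


example_1 = box_integral(integrand, 0.0, -5.0, 5.0)  # close to 7
example_2 = box_integral(integrand, 2.0, -5.0, 5.0)  # close to 7 e^4 cos(2)
example_3 = box_integral(integrand, 2.0, -5.0, 1.0)  # spike outside [-5, 1]
```

As h shrinks, the first two values approach the integrand evaluated at the location of the spike, which is what property (3) predicts.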

The value t = 0− represents the ’left-side’ of 0 and t = 0+ is the ’right-side’. So, 0 is in the interval [0−, ∞) and not in [0+, ∞). Thus

$$\int_{0^-}^{\infty} \delta(t)\,dt = 1 \quad\text{and}\quad \int_{0^+}^{\infty} \delta(t)\,dt = 0.$$

In fact, since all the area under the graph is concentrated at 0, we can even write

$$\int_{0^-}^{0^+} \delta(t)\,dt = 1.$$

5. Generalized Derivatives

Our goal in this section is to explain property (5). A look at the graph of the unit step function u(t) shows that it has slope 0 everywhere except at t = 0 and that its slope is ∞ at t = 0.

[Figure: graph of u(t), jumping from 0 to 1 at t = 0.]

That is, its derivative is

$$u'(t) = \begin{cases} 0 & \text{if } t \neq 0 \\ \infty & \text{if } t = 0. \end{cases}$$

Since u(t) has a jump of 1 at t = 0 this derivative matches properties (1) and (2) of δ(t) and we conclude that u′(t) = δ(t).

Now this derivative does not exist in the calculus sense. The function u(t) is not even defined at 0. So we call this derivative a generalized derivative.

We can also explain property (5) by looking at the anti-derivative of δ(t). Let

$$f(t) = \int_{-\infty}^{t} \delta(\tau)\,d\tau.$$

The fundamental theorem of calculus leads us to say that f′(t) = δ(t). (Again, this is only in a generalized sense since technically the fundamental theorem of calculus requires the integrand to be continuous.) Property (3) makes it easy to compute

$$f(t) = \begin{cases} 0 & \text{if } t < 0 \\ 1 & \text{if } t > 0. \end{cases}$$

That is, f (t) = u(t), so u(t) is the antiderivative of δ(t).

In general, a jump discontinuity contributes a delta function to the generalized derivative.

Example 4. Suppose f (t) has the following graph.

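[Figure: graph of f (t), with f (t) = t^2 for t < 0, f (t) = 2 for 0 < t < 2 and f (t) = 3t − 7 for t > 2. The graph jumps from 0 up to 2 at t = 0 and from 2 down to −1 at t = 2.]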


The formula for each piece of the graph is indicated. For the smooth parts of the graph the derivative is just the usual one. Each jump discontinuity adds a delta function scaled by the size of the jump to f′(t).

$$f'(t) = 2\delta(t) - 3\delta(t-2) + \begin{cases} 2t & \text{if } t < 0 \\ 0 & \text{if } 0 < t < 2 \\ 3 & \text{if } 2 < t. \end{cases}$$
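One way to check this formula, sketched here under the assumption that the pieces read off the graph are t^2 for t < 0, 2 for 0 < t < 2 and 3t − 7 for t > 2, is to write f (t) in u-format and differentiate with the product rule, using u′(t) = δ(t) and property (4) to evaluate each coefficient of a delta function at the location of its spike:

```latex
\begin{aligned}
f(t)  &= t^2\,(1 - u(t)) + 2\,(u(t) - u(t-2)) + (3t-7)\,u(t-2), \\
f'(t) &= 2t\,(1 - u(t)) - t^2\,\delta(t) + 2\,\delta(t) - 2\,\delta(t-2)
         + 3\,u(t-2) + (3t-7)\,\delta(t-2) \\
      &= 2t\,(1 - u(t)) + 3\,u(t-2) + (2 - 0)\,\delta(t) + (-1 - 2)\,\delta(t-2).
\end{aligned}
```

The coefficients 2 and −3 of the delta functions are exactly the sizes of the jumps of f at t = 0 and t = 2.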

In the graph for f′(t) we represent the delta functions as spikes with the magnitude written next to the spike. The sign is indicated by the direction of the spike. The rest of f′(t) is plotted normally.

[Figure: graph of f′(t), with an upward spike labeled 2 at t = 0 and a downward spike labeled 3 at t = 2.]

We say f′(t) is a generalized function. In 18.03 a generalized function will mean a sum of a regular function and a linear combination of delta functions. (In the wider world of mathematics there are other generalized functions.)

If we want to refer to the different parts of a generalized function we will call the delta function pieces the singular part and the remainder will be called the regular part. If the singular part contains a multiple of δ(t − a) we will say the function contains δ(t − a).

Example. Consider f (t) = u(t) + δ(t) + e^{−t} + 3δ(t − 2). The regular part of f is u(t) + e^{−t}. The singular part is δ(t) + 3δ(t − 2). The function contains δ(t) and δ(t − 2). It does not contain δ(t − 1).

Important: In this unit, whenever a discontinuous function is differentiated we will mean the generalized derivative.


Integration with Delta Functions

Quiz: Compute

$$\int_{0^-}^{10} \left( 5\delta(t+1) + 3\delta(t) + t^2\delta(t-5) + t\,\delta(t-20) \right) dt.$$

Choices:

a) 0

b) 25

c) 28

d) 33

e) 48

f) 53

g) none of these

Answer: (c) 28. The interval of integration contains 0 and 5, but not −1 or 20. So, only the δ(t) and δ(t − 5) terms contribute to the integral. Their contributions are 3 and 25 (t^2 evaluated at 5).
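The bookkeeping in this answer is mechanical enough to sketch in code. In the illustration below each term f (t)δ(t − a) is stored as a pair (a, f), and the integral over (c, d) is the sum of f (a) over the spikes that fall inside the interval. The function name and the small offset used to stand in for 0− and 0+ are assumptions of this sketch.

```python
def integrate_deltas(terms, c, d):
    """Integrate a sum of terms f(t) * delta(t - a) over the interval (c, d).

    terms is a list of (a, f) pairs; by the sifting property each pair
    contributes f(a) when c < a < d and nothing otherwise.
    """
    return sum(f(a) for a, f in terms if c < a < d)


# The quiz integrand: 5 delta(t+1) + 3 delta(t) + t^2 delta(t-5) + t delta(t-20).
quiz_terms = [
    (-1, lambda t: 5),
    (0, lambda t: 3),
    (5, lambda t: t ** 2),
    (20, lambda t: t),
]

EPS = 1e-9  # hypothetical small offset: -EPS plays the role of 0-, +EPS of 0+

from_zero_minus = integrate_deltas(quiz_terms, -EPS, 10)  # spike at 0 included
from_zero_plus = integrate_deltas(quiz_terms, EPS, 10)    # spike at 0 excluded
```

Starting the integral at 0+ instead of 0− drops the contribution of 3δ(t), which is why the lower limit has to be stated carefully.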





Generalized Derivatives.

Quiz: When you fire a gun, you exert a very large force on the bullet over a very short period of time. If we integrate $F = ma = m\ddot{x}$ we see that a large force over a short time creates a sudden change in the momentum, $m\dot{x}$. This is called an "impulse."

If the gun is fired straight up, the graph of the elevation of the bullet, plotted against t, starts at zero, then rises in an inverted parabola, and then when it hits the ground it stops again.

The velocity (derivative of the position function) is zero for t < 0; then it rises to v0 (the initial velocity of the bullet); then it falls at constant rate (the acceleration of gravity) until the instant when it hits the ground, when it returns abruptly to zero.

The graph of v(t) looks like this:

[Figure: graph of v(t), which jumps from 0 to v0 at t = 0, decreases linearly to −v0, and then jumps back to 0.]

What does the graph of the generalized derivative of v(t) look like?

Choices:

a) through d) [Four candidate graphs of the generalized derivative; the images are not reproduced here.]

e) None of these.

Answer: (a).









Unit Step and Unit Impulse Response: Introduction

In real life, we often do not know the parameters of a system (e.g. the spring constant, the mass, and the damping constant, in a spring-mass-dashpot system). We may not even know the order of the system. For example, there may be many interconnected springs or diodes. Instead, we often learn about a system by watching how it responds to various input signals.

In this session we will study the response of a linear time invariant (LTI) system from rest initial conditions to two standard and very simple signals: the unit impulse δ(t) and the unit step function u(t). Reasonably enough we will call these responses the unit impulse response and the unit step response.

The theory of the convolution integral studied in the next session will give us a method of determining the response of a system to any input once we know its unit impulse response.

Because both δ(t) and u(t) are discontinuous at t = 0 we will have to be careful with our definition of initial conditions. The most sensible mathematical and physical way to do this is to define our initial conditions at 0−. As input an impulse causes a jump when it is applied. This means that the conditions at 0+ will be different than those at 0−. To distinguish these two cases we will use the terms pre-initial conditions (at 0−) and post-initial conditions (at 0+). We will be able to state precisely the effect of a unit impulse on these conditions.


Initial Conditions

1. Introduction

Before we try to solve higher order equations with discontinuous or impulsive input we need to think carefully about what happens to the solution at the point of discontinuity.

Recall that we have the left and right limits of a function as t → 0:

$$x(0^-) = \lim_{t \uparrow 0} x(t) \quad\text{and}\quad x(0^+) = \lim_{t \downarrow 0} x(t).$$

(Note that we can define these limits as t goes to any value a.) For a continuous function these two limits are the same, and they are both equal to x(0).

For the unit step function we have

u(0−) = 0, u(0+) = 1, u(0) is undefined.

For the unit impulse function δ(t) we have

δ(0−) = 0, δ(0+) = 0, δ(0) = ∞.

In this unit our differential equations will always have initial conditions at t = 0. The above examples show that when there is a discontinuity we might need to distinguish between 0− and 0+. Assuming x is the output, we will do this by calling $x(0^-), \dot{x}(0^-), \ldots$ the pre-initial conditions and $x(0^+), \dot{x}(0^+), \ldots$ the post-initial conditions.

Important: Hereafter when we just say initial conditions we will mean the pre-initial conditions. In cases where x(t) is smooth the pre and post-initial conditions are the same and there is no need to distinguish between them.

2. Simple Examples

Example 1. Consider the initial value problem

$$\dot{x} = u(t), \quad x(0^-) = 0.$$

This is a simple calculus problem and has solution

$$x(t) = \begin{cases} 0 & \text{for } t < 0 \\ t & \text{for } t > 0. \end{cases}$$

[Figure: graph of x(t).]

It is easy to see that x(0+) = 0, so the post-initial condition is the same as the pre-initial condition. This should not surprise us. Although the rate of input jumps from 0 to 1, it is still only inputting an infinitesimal amount at a time. So, the response x(t) should be continuous. But, note that $\dot{x}(0^-) = 0 \neq \dot{x}(0^+) = 1$.

Example 2. Consider the initial value problem

$$\dot{x} = \delta(t), \quad x(0^-) = 0.$$

We know how to integrate δ(t) to get x(t) = u(t).

[Figure: graph of x = u(t), jumping from 0 to 1 at t = 0.]

Here the pre-initial condition x(0−) = 0 does not match the post-initial condition x(0+) = 1. The impulse causes a jump in the value of x.

Example 3. Consider a second order IVP

$$\ddot{x} = u(t), \quad x(0^-) = 0, \quad \dot{x}(0^-) = 0.$$

Integrating twice we get

$$x(t) = \begin{cases} 0 & \text{for } t < 0 \\ t^2/2 & \text{for } t > 0. \end{cases}$$

[Figure: graph of x(t).]

Again, it’s easy to check that x(0−) = x(0+) and $\dot{x}(0^-) = \dot{x}(0^+)$. That is, the pre and post initial conditions are the same. (But, $\ddot{x}(0^-) = 0 \neq \ddot{x}(0^+) = 1$.)

Example 4. Consider the initial value problem

$$\ddot{x} = \delta(t), \quad x(0^-) = 0, \quad \dot{x}(0^-) = 0.$$

Integrating once gives $\dot{x}(t) = u(t)$. Integrating a second time gives

$$x(t) = \begin{cases} 0 & \text{for } t < 0 \\ t & \text{for } t > 0. \end{cases}$$

[Figure: graph of x(t).]

Checking the pre and post initial conditions gives

$$x(0^-) = 0 = x(0^+), \qquad \dot{x}(0^-) = 0 \neq \dot{x}(0^+) = 1.$$

In other words, x(t) itself is continuous, but for the second order equation the input δ(t) caused a jump in the first derivative.

If we continued these examples we’d find that for an nth-order equation an input of δ(t) causes a jump in the derivative of order n − 1.
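As a sketch of how the pattern continues, take the simplest nth-order equation, $x^{(n)} = \delta(t)$ with $x(0^-) = \dot{x}(0^-) = \cdots = x^{(n-1)}(0^-) = 0$. This special case is chosen here only for illustration; it is not the general constant-coefficient equation. Integrating δ(t) repeatedly, as in Examples 2 and 4, gives

```latex
x^{(n-1)}(t) = u(t), \qquad
x^{(n-2)}(t) = t\,u(t), \qquad \ldots, \qquad
x(t) = \frac{t^{\,n-1}}{(n-1)!}\,u(t).
```

So $x^{(j)}(0^+) = 0 = x^{(j)}(0^-)$ for every j < n − 1, while $x^{(n-1)}(0^-) = 0 \neq x^{(n-1)}(0^+) = 1$. For n = 1 and n = 2 this reproduces Examples 2 and 4.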

3. Rest Initial Conditions

The case where x(t) = 0 for t < 0 is called rest initial conditions. If we have a DE of order n this translates into pre-initial conditions

$$x(0^-) = 0, \quad \dot{x}(0^-) = 0, \quad \ldots, \quad x^{(n-1)}(0^-) = 0.$$

4. Conclusion

A unit step input u(t) causes a smooth response with matching pre and post-initial conditions. For a unit impulse input δ(t) the pre and post initial conditions match except for the derivative one less than the order of the equation.


First order Unit Step Response

1. Unit Step Response

Consider the initial value problem

$$\dot{x} + kx = r\,u(t), \quad x(0^-) = 0, \quad k, r \text{ constants.}$$

This would model, for example, the amount of uranium in a nuclear reactor where we add uranium at the constant rate of r kg/year starting at time t = 0 and where k is the decay rate of the uranium.

As in the previous note, adding an infinitesimal amount (r dt) at a time leads to a continuous response. We have x(t) = 0 for t < 0; and for t > 0 we must solve

$$\dot{x} + kx = r, \quad x(0) = 0.$$

The general solution is $x(t) = (r/k) + ce^{-kt}$. To find c, we use x(0) = 0:

$$0 = x(0) = \frac{r}{k} + c \ \Rightarrow\ c = -\frac{r}{k}.$$

Thus, in both cases and u-format

$$x(t) = \begin{cases} 0 & \text{for } t < 0 \\ \frac{r}{k}(1 - e^{-kt}) & \text{for } t > 0 \end{cases} \;=\; \frac{r}{k}(1 - e^{-kt})\,u(t). \tag{1}$$

With r = 1, this is the unit step response, sometimes written v(t). To be more precise, we could write $v(t) = u(t)(1/k)(1 - e^{-kt})$.
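Formula (1) can be sanity-checked by integrating the equation numerically for t > 0 and comparing with the closed form. The sketch below uses the classical fourth-order Runge-Kutta method, which is a choice made for this illustration and not a method from the notes; the function names and the sample values of k and r are likewise hypothetical.

```python
import math


def step_response_exact(t, k, r=1.0):
    """Closed form (r/k)(1 - e^{-kt}) u(t) from formula (1)."""
    return (r / k) * (1.0 - math.exp(-k * t)) if t > 0 else 0.0


def step_response_rk4(t_end, k, r=1.0, n=1000):
    """Integrate x' = r - k x from x(0) = 0 up to t_end with n RK4 steps."""
    h = t_end / n
    x = 0.0

    def rhs(value):
        # Right-hand side of x' = r - k x for t > 0, where u(t) = 1.
        return r - k * value

    for _ in range(n):
        k1 = rhs(x)
        k2 = rhs(x + 0.5 * h * k1)
        k3 = rhs(x + 0.5 * h * k2)
        k4 = rhs(x + h * k3)
        x += h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0
    return x


# Hypothetical sample values: decay rate k = 1.5, input rate r = 3.
numeric = step_response_rk4(2.0, k=1.5, r=3.0)
exact = step_response_exact(2.0, k=1.5, r=3.0)
```

For large t both values approach r/k, the level at which the input rate r balances the decay rate kx.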

The claim that we get a continuous response is true, but may feel a bit unjustified. Let’s redo the above example very carefully without making this assumption. Naturally, we will get the same answer.

The equation is

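$$\dot{x} + kx = \begin{cases} 0 & \text{for } t < 0 \\ r & \text{for } t > 0, \end{cases} \qquad x(0^-) = 0. \tag{2}$$

Solving the two pieces we get

$$x(t) = \begin{cases} c_1 e^{-kt} & \text{for } t < 0 \\ \frac{r}{k} + c_2 e^{-kt} & \text{for } t > 0. \end{cases}$$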

This gives x(0−) = c1 and x(0+) = r/k + c2. If these two are different there is a jump at t = 0 of magnitude

x(0+) − x(0−) = r/k + c2 − c1.

The initial condition x(0−) = 0 implies c1 = 0, so our solution looks like

$$x(t) = \begin{cases} 0 & \text{for } t < 0 \\ \frac{r}{k} + c_2 e^{-kt} & \text{for } t > 0. \end{cases}$$

To find c2 we substitute this into our differential equation (2). (We must use the generalized derivative if there is a jump at t = 0.) After substitution the left side of (2) becomes

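$$\begin{aligned} \dot{x} + kx &= (r/k + c_2)\,\delta(t) + \begin{cases} 0 & \text{for } t < 0 \\ -kc_2 e^{-kt} + r + kc_2 e^{-kt} & \text{for } t > 0 \end{cases} \\ &= (r/k + c_2)\,\delta(t) + \begin{cases} 0 & \text{for } t < 0 \\ r & \text{for } t > 0. \end{cases} \end{aligned}$$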

Comparing this with the right side of (2) we see that r/k + c2 = 0, or c2 = −r/k. This gives exactly the same solution (1) we had before.

Figure 1 shows the graph of the unit step response (r = 1). Notice that it starts at 0 and goes asymptotically up to 1/k.

Figure 1. The unit step response v(t), which rises from 0 toward 1/k. The unit step response is the response of the system $\dot{x} + kx = f(t)$ when f (t) = u(t).

The Meaning of the Phrase ’Unit Step Response’

In this note we looked at the system with equation

$$\dot{x} + kx = f(t)$$

and we considered f (t) to be the input. As we have noted previously, it sometimes makes more sense to consider something else to be the input. For example, in Newton’s law of cooling

$$\dot{T} + kT = kT_e$$

it makes physical sense to call Te, the temperature of the environment, the input. In this case the unit step response of the system means the response to the input Te(t) = u(t), i.e. the solution to

$$\dot{T} + kT = k\,u(t).$$


Unit Step Response: Post-initial Conditions


Consider

$$\dot{v} + kv = u(t)$$

with rest initial conditions, v(0−) = 0.

For the solution v(t) what is $\dot{v}(0^+)$?

Choices:

a) $\dot{v}(0^+) = 0$

b) $\dot{v}(0^+) = 1/k$

c) $\dot{v}(0^+) = 1$

d) $\dot{v}(0^+) = k$

e) None of these.

Answer: (c). v(t) is continuous so v(0−) = v(0+) = v(0) = 0. Therefore the DE shows $\dot{v}(0^+) = u(0^+) = 1$.








First order Unit Impulse Response

1. Unit Impulse Response

Consider the initial value problem

$$\dot{x} + kx = \delta(t), \quad x(0^-) = 0, \quad k \text{ a constant.}$$

This would model, for example, the amount of uranium in a nuclear reactor where at time t = 0 we add 1 kilogram of uranium all at once and k is the decay rate of the uranium.

Because of the rest initial conditions we have x(t) = 0 for t < 0. The effect of the input is to cause the amount x(t) to jump from 0 to 1 at t = 0. That is, x(0+) = 1. For t > 0 the input δ(t) = 0 and, therefore, for t > 0 we should solve

$$\dot{x} + kx = 0, \quad x(0) = 1.$$

The general solution is $x(t) = ce^{-kt}$. To find c, we use x(0) = 1, which gives c = 1. Thus, in both cases and u-format

$$x(t) = \begin{cases} 0 & \text{for } t < 0 \\ e^{-kt} & \text{for } t > 0 \end{cases} \;=\; e^{-kt}\,u(t). \tag{1}$$

This is called the unit impulse response, which we denote w(t). In some sense it is the simplest nontrivial solution; you just give the system a unit kick at t = 0, stand back, and watch the result. For t > 0 it is just the homogeneous solution with initial condition x(0) = 1.

2. Graph of the Unit Impulse Response w(t)

Figure 1 shows the graph of the unit impulse response. Notice that at t = 0 it jumps to x = 1 and then decays exponentially to 0.

Figure 1. The unit impulse response w(t) of the system $\dot{x} + kx$.

3. δ(t) as a limit of box functions

Originally we found δ(t) as a limit of box functions of area 1. In this section we will compute the unit impulse response as the limit of the responses to these box functions. The main two points in doing this are: first, to gain more comfort and facility with this circle of ideas and second, to convince you that the delta function is much nicer to work with than box functions. We invite you to compare the amount of work required for solving the unit impulse with the amount of work needed in the unit step case.

A quick review: Define the box function as

$$u_h(t) = \begin{cases} 0 & \text{for } t < 0 \\ 1/h & \text{for } 0 < t < h \\ 0 & \text{for } h < t. \end{cases}$$

It has total area 1 for all h > 0 and the graph of uh(t) becomes a spike as h → 0, i.e.

$$\lim_{h \to 0} u_h(t) = \delta(t).$$

For the equation

$$\dot{x} + kx = u_h(t), \quad x(0^-) = 0$$

the three pieces of the solution are easily found to be

$$x(t) = \begin{cases} c_1 e^{-kt} & \text{for } t < 0 \\ \frac{1}{hk} + c_2 e^{-kt} & \text{for } 0 < t < h \\ c_3 e^{-kt} & \text{for } h < t. \end{cases}$$

Using the initial condition x(0−) = 0 and matching the value of x at the endpoints of each piece we find $c_1 = 0$, $c_2 = -1/(hk)$, $c_3 = (e^{kh} - 1)/(hk)$. This gives the solution

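$$x(t) = \begin{cases} 0 & \text{for } t < 0 \\ \frac{1}{hk}(1 - e^{-kt}) & \text{for } 0 < t < h \\ \frac{1}{hk}(e^{kh} - 1)\,e^{-kt} & \text{for } h < t. \end{cases}$$

Letting h → 0 this becomes (since $\lim_{h \to 0} \frac{e^{kh} - 1}{hk} = 1$)

$$x(t) = \begin{cases} 0 & \text{for } t < 0 \\ e^{-kt} & \text{for } 0 < t. \end{cases}$$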

This limit is exactly the unit impulse response w(t) we found in a previous note.
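The convergence can also be seen numerically by evaluating the three-piece box response for shrinking h and comparing it with $e^{-kt}$. The sketch below does this; the function names and the sample values of k and t are assumptions made for illustration.

```python
import math


def box_response(t, k, h):
    """Response of x' + kx = u_h(t), x(0-) = 0, using the three-piece solution."""
    if t < 0:
        return 0.0
    if t < h:
        return (1.0 - math.exp(-k * t)) / (h * k)
    # expm1(kh) = e^{kh} - 1, computed accurately for small kh.
    return math.expm1(k * h) / (h * k) * math.exp(-k * t)


def impulse_response(t, k):
    """Unit impulse response w(t) = e^{-kt} u(t)."""
    return math.exp(-k * t) if t > 0 else 0.0


# Hypothetical sample values: k = 2, observation time t = 1.
k, t = 2.0, 1.0
errors = [abs(box_response(t, k, h) - impulse_response(t, k))
          for h in (1e-1, 1e-2, 1e-3, 1e-4, 1e-5, 1e-6)]
```

The errors shrink roughly in proportion to h, consistent with the expansion $(e^{kh} - 1)/(hk) \approx 1 + kh/2$ for small h.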

Figure 2 shows this graphically by plotting the input and output for several values of h.

Figure 2. Responses for h = 1, h = .5, h = .333, and h → 0.

The input is plotted in black and the output in red. Notice how the output rises faster and gets closer to 1 as h → 0. Finally, in the limit of small h, it jumps directly to 1.

The Meaning of the Phrase ’Unit Impulse Response’

Exactly as in the case of the unit step response, the unit impulse response means the response of the system when the input is a unit impulse. In this note we looked at the system

$$\dot{x} + kx = f(t)$$

and we considered f (t) to be the input. Suppose, instead, we have the system

$$\dot{T} + kT = kT_e,$$

where we consider Te to be the input. Then the unit impulse response is the response of the system to input Te(t) = δ(t), i.e. the solution to

$$\dot{T} + kT = k\,\delta(t).$$


Unit Impulse Response: Post-initial Conditions


Consider

$$\dot{w} + kw = \delta(t)$$

with rest initial conditions, w(0−) = 0.

For the solution w(t) what is $\dot{w}(0^+)$?

Choices:

a) $\dot{w}(0^+) = 0$

b) $\dot{w}(0^+) = -1/k$

c) $\dot{w}(0^+) = -1$

d) $\dot{w}(0^+) = -k$

e) None of these.

Answer: (d). Using the DE we get $\dot{w}(0^+) + kw(0^+) = \delta(0^+)$. We know w(0+) = 1 and δ(0+) = 0. Therefore $\dot{w}(0^+) = -k$.

We could also look at the solution $w(t) = e^{-kt}$ for t > 0. Thus $\dot{w}(t) = -ke^{-kt}$ for t > 0. This implies $\dot{w}(0^+) = -k$.

Using the solution to the DE probably seems easier than the first method, but it is important to be able to draw conclusions without knowing the solution.







Second order Unit Step Response

1. Unit Step Response

We will use the example of an undamped harmonic oscillator with input f (t) modeled by

$$m\ddot{x} + kx = f(t).$$

The unit step response is the solution to this equation with input u(t) and rest initial conditions x(t) = 0 for t < 0. That is, it is the solution to the initial value problem (IVP)

$$m\ddot{x} + kx = u(t), \quad x(0^-) = 0, \quad \dot{x}(0^-) = 0.$$

This could be an undamped spring-mass system with mass m and spring constant k. The mass is at rest at equilibrium until time t = 0 when a steady force starts to act on it.

Force represents a change in momentum over time. A finite force F(t) can only cause an infinitesimal change in momentum (i.e. F(t) dt) at a time. Therefore, the mass does not change position abruptly, nor does it change velocity instantaneously. Because of this we should expect a solution which is continuous with continuous derivative. Only the acceleration experiences a discontinuity.

For t < 0 we already know that x(t) = 0. For t > 0 the DE is

$$m\ddot{x} + kx = 1.$$

This has a constant particular solution x(t) = 1/k, and a general homogeneous solution

$$x_h(t) = c_1\cos(\omega_n t) + c_2\sin(\omega_n t), \quad\text{where } \omega_n = \sqrt{k/m}.$$

Putting the two together gives the general solution

$$x(t) = 1/k + c_1\cos(\omega_n t) + c_2\sin(\omega_n t) \quad\text{for } t > 0.$$

The continuity of $x$ and $\dot{x}$ implies $x(0) = x(0^-) = 0$ and $\dot{x}(0) = \dot{x}(0^-) = 0$. This allows us to find c1 and c2.

$$0 = x(0) = 1/k + c_1 \ \Rightarrow\ c_1 = -1/k, \qquad 0 = \dot{x}(0) = c_2\omega_n \ \Rightarrow\ c_2 = 0.$$

The unit step response for this system is (in both cases and u-format)

$$x(t) = \begin{cases} 0 & \text{for } t < 0 \\ \frac{1}{k}(1 - \cos(\omega_n t)) & \text{for } t > 0 \end{cases} \;=\; \frac{1}{k}(1 - \cos(\omega_n t))\,u(t).$$

As in the first order case, we will sometimes denote this v(t).
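A numerical spot check of this formula, sketched under the assumption that a central finite difference is an adequate stand-in for the second derivative away from t = 0: the residual of the equation $m\ddot{x} + kx = 1$ should be close to zero at any t > 0. The function names, the step size eps and the sample values m = 0.5, k = 1 are choices made for this illustration.

```python
import math


def step_response(t, m, k):
    """Unit step response (1/k)(1 - cos(omega_n t)) u(t) of m x'' + k x = u(t)."""
    if t <= 0:
        return 0.0
    omega_n = math.sqrt(k / m)
    return (1.0 - math.cos(omega_n * t)) / k


def residual(t, m, k, eps=1e-4):
    """m x'' + k x - 1 at time t > eps, with x'' from a central difference."""
    x_ddot = (step_response(t + eps, m, k)
              - 2.0 * step_response(t, m, k)
              + step_response(t - eps, m, k)) / eps ** 2
    return m * x_ddot + k * step_response(t, m, k) - 1.0


# Hypothetical sample values for the mass and spring constant.
m, k = 0.5, 1.0
residuals = [residual(t, m, k) for t in (0.5, 1.0, 2.0, 3.0)]

# The response oscillates between 0 and 2/k; the peak is at t = pi / omega_n.
peak = step_response(math.pi / math.sqrt(k / m), m, k)
```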

The claim that we get a continuous response is true, but may feel a bit unjustified. Let’s redo the above example very carefully without making this assumption. It will take more work, but we will get the same answer.

In cases format the equation for the IVP is

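$$m\ddot{x} + kx = \begin{cases} 0 & \text{for } t < 0 \\ 1 & \text{for } t > 0, \end{cases} \qquad x(0^-) = 0, \quad \dot{x}(0^-) = 0. \tag{1}$$

Solving the two pieces we get

$$x(t) = \begin{cases} c_1\cos(\omega_n t) + c_2\sin(\omega_n t) & \text{for } t < 0 \\ 1/k + c_3\cos(\omega_n t) + c_4\sin(\omega_n t) & \text{for } t > 0. \end{cases}$$

The pre-initial conditions $x(0^-) = \dot{x}(0^-) = 0$ easily imply c1 = c2 = 0. So our solution looks like

$$x(t) = \begin{cases} 0 & \text{for } t < 0 \\ 1/k + c_3\cos(\omega_n t) + c_4\sin(\omega_n t) & \text{for } t > 0. \end{cases}$$

To find c3 and c4 we substitute x(t) into equation (1).

To measure the jumps we compute $x(0^+) = 1/k + c_3$ and $\dot{x}(0^+) = c_4\omega_n$. We use this as we compute derivatives of x.

$$\dot{x}(t) = (1/k + c_3)\,\delta(t) + \begin{cases} 0 & \text{for } t < 0 \\ -c_3\omega_n\sin(\omega_n t) + c_4\omega_n\cos(\omega_n t) & \text{for } t > 0. \end{cases}$$

$$\ddot{x}(t) = (1/k + c_3)\,\delta'(t) + c_4\omega_n\,\delta(t) + \begin{cases} 0 & \text{for } t < 0 \\ -c_3\omega_n^2\cos(\omega_n t) - c_4\omega_n^2\sin(\omega_n t) & \text{for } t > 0. \end{cases}$$

Since the right-hand side of equation (1) does not have any delta functions or any δ′(t) the coefficients in front of these terms in the formula for $\ddot{x}$ must be 0:

$$1/k + c_3 = 0 \ \Rightarrow\ c_3 = -1/k, \qquad c_4\omega_n = 0 \ \Rightarrow\ c_4 = 0.$$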


In the end, we have exactly the same solution as above for the unit step response.

To summarize: the continuity assumptions follow because any jumps in $x(t)$ or $\dot{x}(t)$ would result in delta functions when x is substituted into equation (1).

The generalized derivative δ′(t) is not something we’ve seen before. It is often called a doublet. There is an entire theory of these and other generalized functions, but we will only use δ(t) in this course.

Figure 1 shows the graph of the unit step response (with k = 1 and m = 0.5).

Figure 1. The unit step response for the system $m\ddot{x} + kx = u(t)$. The graph oscillates between 0 and 2/k.

If we added some damping the homogeneous part of the solution would go to 0 and the unit step response would go asymptotically to 1/k.

The Meaning of the Phrase ’Unit Step Response’

As we noted in the first order case, the unit step response is the response of the system to a unit step input. For example, if our system is

$$m\ddot{x} + b\dot{x} + kx = b\dot{y}$$

and we consider y to be the input, then the unit step response is the solution to

$$m\ddot{x} + b\dot{x} + kx = b\,\dot{u}(t), \quad\text{equivalently}\quad m\ddot{x} + b\dot{x} + kx = b\,\delta(t).$$


Second order Unit Impulse Response

1. Effect of a Unit Impulse on a Second order System

We consider a second order system

$$m\ddot{x} + b\dot{x} + kx = f(t). \tag{1}$$

Our first task is to derive the following. If the input f (t) is an impulse cδ(t − a), then the system’s response to f (t) has the following properties.

1. The momentum $m\dot{x}(t)$ jumps by c units at t = a. That is,

$$m\dot{x}(a^+) - m\dot{x}(a^-) = c.$$

2. The position x(t) is unchanged at t = a. That is,

x(a+) = x(a−).

Recall the argument that we used before: If x(t) had a jump at a then $\dot{x}(t)$ would contain a multiple of δ(t − a). So, $m\ddot{x}(t)$ would contain a multiple of the doublet δ′(t − a). This is impossible since the input cδ(t − a) does not contain a doublet. This shows point (2) above.

To show point (1), we note that if $m\dot{x}(t)$ has a jump of c units at t = a then $m\ddot{x}(t)$ contains the term cδ(t − a). This is needed to make the left-hand side of equation (1) match the right-hand side when f (t) = cδ(t − a).

Another way to show points (1) and (2) is a physical argument. A force acting on the mass over time changes its momentum. In fact, the best way to state Newton’s second law is that

$$\frac{dp}{dt} = f(t),$$

where p(t) is the momentum of a system and f (t) is an external force acting on the system. If a force f (t) acts over the time interval [t1, t2] the total change of momentum due to the force is

$$\int_{t_1}^{t_2} f(t)\,dt.$$

Physicists call this the impulse of the force f (t) over the interval [t1, t2]. If a very large force is applied over a very short time interval and has total impulse of 1 the result will be a sudden unit jump in the momentum of the system.


For a second order system the unit impulse function δ can be thought of as an idealization of this force. It is a force with total impulse 1 applied all at once.

A third argument that we will skip would be to solve equation (1) with a box function for input and take the limit as the box gets narrower and taller always with area 1.

2. Unit Impulse Response

We consider once again the damped harmonic oscillator equation

$$m\ddot{x} + b\dot{x} + kx = f(t).$$

The unit impulse response is the solution to this equation with input f (t) = δ(t) and rest initial conditions: x(t) = 0 for t < 0. That is, it is the solution to the initial value problem (IVP)

$$m\ddot{x} + b\dot{x} + kx = \delta(t), \quad x(0^-) = 0, \quad \dot{x}(0^-) = 0.$$

This could be a damped spring-mass system with mass m, damping constant b and spring constant k. The mass is at rest at equilibrium until time t = 0 when it is hit by a sudden very brief very intense force, rather like getting hit on the head by a hammer. The effect is to increase the momentum instantaneously, without changing the position of the mass.

Let w(t) denote the solution we seek. The rest initial conditions tell us that w(t) = 0 for t < 0. We know from section 1 that the effect of the input is to cause a unit jump in the momentum at t = 0 and no change in position. We also know that, for t > 0, the input δ(t) = 0. Putting this together, for t > 0 the w(t) satisfies the equation

$$m\ddot{w} + b\dot{w} + kw = 0, \quad \dot{w}(0) = 1/m, \quad w(0) = 0.$$

This is a homogeneous constant coefficient linear differential equation which we have lots of practice in solving.

Example 1. Find the unit impulse response for the system

$$2\ddot{x} + 8\dot{x} + 26x = f(t). \tag{2}$$

Solution. We will use the standard notation w(t) for the unit impulse response. We are looking for the response from rest to f (t) = δ(t). We know w(t) = 0 for t < 0. At t = 0 the input causes a unit jump in momentum, i.e., $2\dot{w}(0^+) = 1$. So, for t > 0 we have to solve

$$2\ddot{w} + 8\dot{w} + 26w = 0, \quad \dot{w}(0^+) = 1/2, \quad w(0^+) = 0.$$

The roots of the characteristic polynomial are −2 ± 3i, which implies

$$w(t) = c_1 e^{-2t}\cos(3t) + c_2 e^{-2t}\sin(3t), \quad\text{for } t > 0.$$

The initial conditions give

$$0 = w(0^+) = c_1, \qquad 1/2 = \dot{w}(0^+) = -2c_1 + 3c_2 \ \Rightarrow\ c_2 = 1/6.$$

Thus, the unit impulse response (in both cases and u-format) is

$$w(t) = \begin{cases} 0 & \text{for } t < 0 \\ \frac{1}{6} e^{-2t}\sin(3t) & \text{for } t > 0 \end{cases} \;=\; \frac{1}{6} e^{-2t}\sin(3t)\,u(t). \tag{3}$$

Figure 1 shows the graph of the unit impulse response. Notice that at t = 0 the graph has a corner. This corresponds to the slope $\dot{w}$ jumping from 0 to 1/2. For t > 0 the graph decays to 0 while oscillating.

Figure 1. The unit impulse response for the system $2\ddot{x} + 8\dot{x} + 26x$.

3. Checking Example 1 by Substitution

With any differential equation you can verify a solution by plugging it into the equation. We will do that with example 1 to gain some more insight into why we get the solution.

First, we take the derivatives of the solution in equation (3) for t > 0

$$\dot{w}(t) = \frac{1}{6}e^{-2t}\left(-2\sin(3t) + 3\cos(3t)\right) \quad \text{for } t > 0$$

$$\ddot{w}(t) = \frac{1}{6}e^{-2t}\left(-5\sin(3t) - 12\cos(3t)\right) \quad \text{for } t > 0$$

Next we look at the jumps at t = 0

$$w(0^-) = 0, \quad w(0^+) = 0; \qquad \dot{w}(0^-) = 0, \quad \dot{w}(0^+) = 1/2.$$

Now we can compute the full generalized derivatives

$$\dot{w}(t) = \frac{1}{6}e^{-2t}\left(-2\sin(3t) + 3\cos(3t)\right)u(t)$$

$$\ddot{w}(t) = \frac{1}{2}\delta(t) + \frac{1}{6}e^{-2t}\left(-5\sin(3t) - 12\cos(3t)\right)u(t)$$

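Finally we substitute w for x in equation (2). For t > 0 the three terms are

$$2\ddot{w}(t) = \delta(t) - \frac{5}{3}e^{-2t}\sin(3t) - 4e^{-2t}\cos(3t)$$

$$8\dot{w}(t) = -\frac{8}{3}e^{-2t}\sin(3t) + 4e^{-2t}\cos(3t)$$

$$26w(t) = \frac{13}{3}e^{-2t}\sin(3t)$$

Adding these, the sine and cosine terms cancel and we are left with

$$2\ddot{w} + 8\dot{w} + 26w = \delta(t).$$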
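The substitution above can also be spot-checked numerically. The following is a minimal sketch, not part of the original notes: the function names are made up for illustration, and it only tests the regular part of the equation (t > 0) together with the momentum jump at t = 0.

```python
import math

def w(t):
    """Candidate unit impulse response for t > 0: (1/6) e^{-2t} sin(3t)."""
    return math.exp(-2 * t) * math.sin(3 * t) / 6

def w_dot(t):
    """First derivative of w for t > 0."""
    return math.exp(-2 * t) * (-2 * math.sin(3 * t) + 3 * math.cos(3 * t)) / 6

def w_ddot(t):
    """Second derivative of w for t > 0 (regular part only, no delta term)."""
    return math.exp(-2 * t) * (-5 * math.sin(3 * t) - 12 * math.cos(3 * t)) / 6

def residual(t):
    """Left-hand side 2w'' + 8w' + 26w, which should vanish for t > 0."""
    return 2 * w_ddot(t) + 8 * w_dot(t) + 26 * w(t)

# Sample the residual on a grid of positive times.
max_residual = max(abs(residual(0.1 * k)) for k in range(1, 51))

# Mass (m = 2) times the velocity just after the impulse: the momentum jump.
momentum_jump = 2 * w_dot(0.0)
```

The residual stays at round-off level, and the momentum jump equals 1, which is the coefficient of δ(t) on the right-hand side.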

The Meaning of the Phrase ’Unit Impulse Response’

As we’ve noted several times already, the response to a given input depends on what in our equation we consider to be the input. For example, if our system is

$$m\ddot{x} + b\dot{x} + kx = b\dot{y}$$

and we consider y to be the input, then the unit impulse response is the solution to

$$m\ddot{x} + b\dot{x} + kx = b\dot{\delta}(t).$$

(Here, $\dot{\delta}$ is what we’ve called a doublet.)


Quiz: Let w(t) be the solution to $m\ddot{x} + kx = \delta(t)$ with rest initial conditions. What is $\dot{w}(0^+)$?

Choices:

a) $\dot{w}(0^+) = 0$

b) $\dot{w}(0^+) = \omega m$

c) $\dot{w}(0^+) = k$

d) $\dot{w}(0^+) = k/m$

e) $\dot{w}(0^+) = 1/m$

f) None of these.

Answer: (e). The unit impulse input causes a unit jump in momentum. Starting from rest this means $m\dot{w}(0^+) = 1$ or $\dot{w}(0^+) = 1/m$.





Higher Order Unit Impulse Response

We can extend our reasoning in the first and second order cases to any order. Consider an nth order system with DE

$$a_n x^{(n)} + a_{n-1} x^{(n-1)} + \cdots + a_1 x' + a_0 x = f(t), \qquad (1)$$

where we take f(t) to be the input. The equation for the unit impulse response of this system is

$$a_n x^{(n)} + a_{n-1} x^{(n-1)} + \cdots + a_1 x' + a_0 x = \delta(t), \quad \text{with rest IC.} \qquad (2)$$

The effect of the δ function input is to cause a jump in the (n − 1)st derivative at time t = 0, while the lower order derivatives do not jump. That is, the system is put in the state

$$x(0^+) = 0,\; x'(0^+) = 0,\; \ldots,\; x^{(n-2)}(0^+) = 0,\; x^{(n-1)}(0^+) = 1/a_n.$$

To show this we use the same reasoning as in the second order case. Suppose there was a jump in a lower derivative. For example, suppose

$$x^{(n-3)}(0^+) = b \neq 0.$$

Then the expression for $x^{(n-2)}(t)$ contains $b\delta(t)$, which implies that $x^{(n-1)}(t)$ contains $b\delta'(t)$ and $x^{(n)}(t)$ contains $b\delta''(t)$. This is impossible because the right-hand side of (2) does not have any derivatives of the delta function.

Since $x^{(n-1)}(t)$ has a jump of $x^{(n-1)}(0^+) = 1/a_n$ at t = 0, its derivative $x^{(n)}(t)$ contains $(1/a_n)\delta(t)$, so the term $a_n x^{(n)}(t)$ has a unit impulse, δ(t), at t = 0.

We conclude that the solution to (2) is 0 for t < 0 and for t > 0 it is exactly the same as the solution to

$$a_n x^{(n)} + a_{n-1} x^{(n-1)} + \cdots + a_1 x' + a_0 x = 0$$

with initial conditions

$$x(0) = 0,\; x'(0) = 0,\; \ldots,\; x^{(n-2)}(0) = 0,\; x^{(n-1)}(0) = 1/a_n.$$
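As an illustration of this recipe, the sketch below integrates the homogeneous equation from the post-impulse initial conditions with a classical fourth-order Runge-Kutta step and compares the result with the closed form found earlier for $2\ddot{x} + 8\dot{x} + 26x = \delta(t)$. The function `impulse_response`, its argument layout and the step count are assumptions made for this example; they are not part of the notes.

```python
import math

def impulse_response(coeffs, t_end, steps):
    """Approximate the unit impulse response at time t_end > 0.

    coeffs = [a_0, a_1, ..., a_n]. Solves a_n x^(n) + ... + a_0 x = 0 with
    x(0) = ... = x^(n-2)(0) = 0 and x^(n-1)(0) = 1/a_n using RK4.
    """
    n = len(coeffs) - 1
    a_n = coeffs[-1]

    def deriv(state):
        # state = [x, x', ..., x^(n-1)]; the top derivative comes from the DE.
        top = -sum(coeffs[i] * state[i] for i in range(n)) / a_n
        return state[1:] + [top]

    state = [0.0] * (n - 1) + [1.0 / a_n]
    h = t_end / steps
    for _ in range(steps):
        k1 = deriv(state)
        k2 = deriv([s + 0.5 * h * k for s, k in zip(state, k1)])
        k3 = deriv([s + 0.5 * h * k for s, k in zip(state, k2)])
        k4 = deriv([s + h * k for s, k in zip(state, k3)])
        state = [s + h * (p + 2 * q + 2 * r + u) / 6
                 for s, p, q, r, u in zip(state, k1, k2, k3, k4)]
    return state[0]

# Second-order example from the earlier note: 2x'' + 8x' + 26x = delta(t).
approx = impulse_response([26.0, 8.0, 2.0], 1.0, 2000)
exact = math.exp(-2.0) * math.sin(3.0) / 6
```

The same function handles any order n ≥ 1, since only the list of coefficients changes.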

Convolution: Introduction

The convolution product of two functions is a peculiar looking integral which produces another function. It is found in a wide range of applications, so it has a special name and a special symbol. The convolution of f and g is denoted f ∗ g and defined by

$$(f * g)(t) = \int_{0^-}^{t^+} f(s)\,g(t - s)\,ds.$$

We will start by stating this formula without any motivation. Its main properties are relatively easy to deduce from its definition.

The motivation will come in the form of Green’s formula. This important tool tells us how to solve a linear time invariant (LTI) system with any input (and rest IC) once we know its unit impulse response.

The rest of the session is concerned with the proof of Green’s formula and examples of convolution and Green’s formula.

Technical Detail: Because we want convolution to work with delta functions we need to be careful with the limits of integration. This explains the plus and minus on the limits. If both functions are continuous or have at most jump discontinuities then the limits can be 0 and t.

Definition and Properties

1. Definition

The convolution of two functions f and g is a third function which we denote f ∗ g. It is defined as the following integral

$$(f * g)(t) = \int_{0^-}^{t^+} f(\tau)\,g(t - \tau)\,d\tau \quad \text{for } t > 0. \qquad (1)$$

We will leave this unmotivated until the next note, and for now just learn how to work with it.

There are a few things to point out about the formula.

• The variable of integration is τ. We can’t use t because that is already used in the limits and in the integrand. We can choose any symbol we want for the variable of integration –it is just a dummy variable.

• The limits of integration are $0^-$ and $t^+$. This is important, particularly when we work with delta functions. If f and g are continuous or have at worst jump discontinuities then we can use 0 and t for the limits. You will often see convolution written like this:

$$(f * g)(t) = \int_0^t f(\tau)\,g(t - \tau)\,d\tau.$$

• We are considering one-sided convolution. There is also a two-sided convolution where the limits of integration are ±∞.

• (Important.) One-sided convolution is only concerned with functions on the interval (0−, ∞). When using convolution we never look at t < 0.

2. Examples

Example 1 below calculates two useful convolutions from the definition (1). As you can see, the form of f ∗ g is not very predictable from the form of f and g.

Example 1. Show that

$$e^{at} * e^{bt} = \frac{e^{at} - e^{bt}}{a - b}, \quad a \neq b; \qquad e^{at} * e^{at} = t\,e^{at}.$$

Solution. We show the first; the second calculation is similar. If a ≠ b,

$$e^{at} * e^{bt} = \int_0^t e^{a\tau} e^{b(t-\tau)}\,d\tau = e^{bt}\int_0^t e^{(a-b)\tau}\,d\tau = e^{bt}\left[\frac{e^{(a-b)\tau}}{a - b}\right]_0^t = e^{bt}\,\frac{e^{(a-b)t} - 1}{a - b} = \frac{e^{at} - e^{bt}}{a - b}.$$

Note that because the functions are continuous we could safely integrate just from 0 to t instead of having to specify precisely 0− to t+ .
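Both formulas of Example 1 can be checked numerically. The sketch below is an illustration only: it approximates the one-sided convolution integral from 0 to t with composite Simpson's rule, and the helper name `convolve` and the sample values of a, b and t are arbitrary choices.

```python
import math

def convolve(f, g, t, n=1000):
    """Approximate (f*g)(t) = integral_0^t f(tau) g(t - tau) dtau.

    Uses composite Simpson's rule with n subintervals (n must be even).
    Only suitable for functions without delta spikes.
    """
    h = t / n
    total = f(0.0) * g(t) + f(t) * g(0.0)
    for k in range(1, n):
        tau = k * h
        total += (4 if k % 2 else 2) * f(tau) * g(t - tau)
    return total * h / 3

a, b, t = 1.0, -2.0, 1.5

# Case a != b: compare with (e^{at} - e^{bt}) / (a - b).
numeric = convolve(lambda u: math.exp(a * u), lambda u: math.exp(b * u), t)
closed_form = (math.exp(a * t) - math.exp(b * t)) / (a - b)

# Case a == b: compare with t e^{at}.
numeric_equal = convolve(lambda u: math.exp(a * u), lambda u: math.exp(a * u), t)
closed_form_equal = t * math.exp(a * t)
```

Swapping the two arguments of `convolve` gives the same values up to round-off.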

The convolution gives us a formula for a particular solution yp to an inhomogeneous linear ODE. The next example illustrates this for a first order equation.

Example 2. Express as a convolution the solution to the first order constant-coefficient linear IVP.

$$\dot{y} + ky = q(t); \qquad y(0) = 0. \qquad (2)$$

Solution. The integrating factor is $e^{kt}$; multiplying both sides by it gives

$$(y\,e^{kt})' = q(t)\,e^{kt}.$$

Integrate both sides from 0 to t, and apply the Fundamental Theorem of Calculus to the left side; since we have y(0) = 0, the solution we seek satisfies

$$y_p\,e^{kt} = \int_0^t q(\tau)\,e^{k\tau}\,d\tau \qquad (\tau \text{ is the dummy variable of integration}).$$

Moving the $e^{kt}$ to the right side and placing it under the integral sign gives

$$y_p = \int_0^t q(\tau)\,e^{-k(t-\tau)}\,d\tau, \qquad \text{that is,} \qquad y_p = q(t) * e^{-kt}.$$

Now we observe that the solution is the convolution of the input q(t) with $e^{-kt}$, which is the solution to the corresponding homogeneous DE $\dot{y} + ky = 0$, but with IC y(0) = 1. This is the simplest case of Green’s formula, which is the analogous result for higher order linear ODE’s, as we will see shortly.


3. Properties

1. Linearity: Convolution is linear. That is, for functions f1, f2, g and constants c1, c2 we have

(c1 f1 + c2 f2) ∗ g = c1( f1 ∗ g) + c2( f2 ∗ g).

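This follows from the exact same property for integration. This might also be called the distributive law.

2. Commutativity: f ∗ g = g ∗ f. Proof: This follows from the change of variable v = t − τ.

Limits: $\tau = 0^- \Rightarrow v = t - 0^- = t^+$ and $\tau = t^+ \Rightarrow v = t - t^+ = 0^-$.

Integral: since dv = −dτ, the minus sign is absorbed by swapping the limits back into increasing order, so

$$(f * g)(t) = \int_{0^-}^{t^+} f(\tau)\,g(t - \tau)\,d\tau = \int_{0^-}^{t^+} f(t - v)\,g(v)\,dv = (g * f)(t).$$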

3. Associativity: f ∗ (g ∗ h) = ( f ∗ g) ∗ h. The proof just amounts to changing the order of integration in a double integral (left as an exercise).

4. Delta Functions

We have

(δ ∗ f )(t) = f (t) and (δ(t − a) ∗ f )(t) = f (t − a). (3)

The notation for the second equation is ugly, but its meaning is clear.

We prove these formulas by direct computation. First, remember the rules of integration with delta functions: for b > 0

$$\int_{0^-}^{b} \delta(\tau)\,f(\tau)\,d\tau = f(0).$$

The formulas follow easily for t ≥ 0:

$$(\delta * f)(t) = \int_{0^-}^{t^+} \delta(\tau)\,f(t - \tau)\,d\tau = f(t - 0) = f(t),$$

$$(\delta(t - a) * f)(t) = \int_{0^-}^{t^+} \delta(\tau - a)\,f(t - \tau)\,d\tau = f(t - a).$$

5. Convolution is a Type of Multiplication

You should think of convolution as a type of multiplication of functions. In fact, it is often referred to as the convolution product, and it has the properties we associate with multiplication:

• It is commutative.

• It is associative.

• It is distributive over addition.

• It has a multiplicative identity. For ordinary multiplication, 1 is the multiplicative identity. For convolution, formula (3) shows that δ(t) is the multiplicative identity for the convolution product.

Green’s Formula

In this note we state Green’s formula and look at some examples. We will prove it in the next note.

1. Green’s Formula

Suppose that we have a linear time invariant system with rest IC.

P(D)y = f (t), y(t) = 0 for t < 0 (1)

• As in previous sessions, we will consider f(t) to be the input to this system. Everything we say will also hold for systems like $\dot{T} + kT = kT_e$ with input $T_e$ and $m\ddot{x} + b\dot{x} + kx = b\dot{y}$ with input y.

• In this context, where we don’t consider functions for t < 0, the initial conditions mean that y(t) and all its derivatives are 0 at t = 0−.

• P(D) is a polynomial differential operator. Although it can be of any order, recall that we developed the second order case extensively in the last unit, where it was often written as

$$P(D)y = m\ddot{y} + b\dot{y} + ky.$$

Suppose further that w(t) is the unit impulse response for (1). That is, w(t) satisfies P(D)w = δ(t), with rest IC. Then, for any input f(t) the solution to equation (1) is given by Green’s formula

$$y(t) = (f * w)(t) = \int_{0^-}^{t^+} f(\tau)\,w(t - \tau)\,d\tau. \qquad (2)$$

This is a wonderful formula! It tells us the response to any input once we know the unit impulse response. Furthermore, it gives us that response as an integral which can be computed numerically if necessary. For many physical systems the impulse response can be measured directly or deduced from measurements. So, Green’s formula gives us a method for predicting the system’s response to any input.

2. Unit Impulse Response = Weight Function

The unit impulse response is also called the weight function. We will use the terms interchangeably. If we think of an integral as a ’sum’ then Green’s formula shows the solution y(t) to (1) is given as a weighted sum of the small bits of input, f(τ) dτ, from before time t. Each piece is weighted by w(t − τ).

Before proceeding, let us recall the definition of the unit impulse response. The weight w(t) is the unique solution to the IVP

P(D)y = δ(t) with rest IC (3)

In the previous session we learned how to rewrite (3) as a homogeneous equation. We will only restate this for second order equations. The weight function for the system

$$m\ddot{x} + b\dot{x} + kx = f(t)$$

is 0 for t < 0 and the solution to

$$m\ddot{x} + b\dot{x} + kx = 0, \qquad x(0) = 0, \quad \dot{x}(0) = 1/m$$

for t > 0.

3. Examples

We now work out Green’s formula (2) in a couple of cases where we can check it by finding the particular solution $y_p$ by another method.

Example 1. Find the particular solution given by (2) to

$$\ddot{y} + y = A, \qquad y(0) = 0, \quad \dot{y}(0) = 0, \quad \text{where } A \text{ is a constant.}$$

Solution. The unit impulse response is w(t) = sin t. Therefore for t ≥ 0, we have

$$y_p(t) = \int_0^t A\sin(t - \tau)\,d\tau = A\cos(t - \tau)\Big|_0^t = A(1 - \cos t).$$

We check this by another method: The exponential response formula or the method of undetermined coefficients produces the particular solution $y_p = A$. Adding in the homogeneous solution, we find that the general solution to the DE is

y = A + c1 cos t + c2 sin t.

You can easily compute that the rest initial conditions are matched by y = A − A cos t, as found by Green’s formula.


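Example 2. Find the particular solution for t ≥ 0 given by (2) to

$$y'' + y = f(t) = \begin{cases} 1 & \text{for } 0 \le t \le \pi \\ 0 & \text{elsewhere.} \end{cases}$$

Solution. Here the method of Example 1 leads to two cases: 0 ≤ t ≤ π and t ≥ π:

$$y_p = \int_0^t f(\tau)\sin(t - \tau)\,d\tau = \begin{cases} \displaystyle\int_0^t \sin(t - \tau)\,d\tau = \cos(t - \tau)\Big|_0^t = 1 - \cos t, & \text{for } 0 \le t \le \pi; \\[2ex] \displaystyle\int_0^{\pi} \sin(t - \tau)\,d\tau = \cos(t - \tau)\Big|_0^{\pi} = -2\cos t, & \text{for } t \ge \pi. \end{cases}$$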

We leave it to the reader to check this by our earlier methods.
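One quick, if informal, check is to evaluate Green's formula numerically and compare it with the two-case answer. The sketch below uses a plain midpoint rule; the names `green`, `box` and `expected` are made up for this illustration, and because the input jumps at τ = π the agreement is only to a few decimal places.

```python
import math

def green(f, w, t, n=2000):
    """Midpoint-rule approximation of y(t) = integral_0^t f(tau) w(t - tau) dtau."""
    h = t / n
    return h * sum(f((k + 0.5) * h) * w(t - (k + 0.5) * h) for k in range(n))

def box(tau):
    """Input of Example 2: 1 on [0, pi], 0 elsewhere."""
    return 1.0 if 0.0 <= tau <= math.pi else 0.0

def expected(t):
    """Closed form from Example 2 (two cases)."""
    return 1 - math.cos(t) if t <= math.pi else -2 * math.cos(t)

# Two sample times before t = pi and two after; weight function w(t) = sin t.
errors = [abs(green(box, math.sin, t) - expected(t)) for t in (1.0, 3.0, 4.0, 6.0)]
```

A finer grid, or splitting the integral at τ = π, would tighten the agreement for t > π.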


Proof of Green’s Formula

Green’s Formula: For the equation

P(D)y = f (t), y(t) = 0 for t < 0 (1)

the solution for t > 0 is given by

$$y(t) = (f * w)(t) = \int_{0^-}^{t^+} f(\tau)\,w(t - \tau)\,d\tau, \qquad (2)$$

where w(t) is the weight function (unit impulse response) for the system.

Proof: The proof of Green’s formula is surprisingly direct. We will use the linear time invariance of the system combined with superposition and the definition of the integral as a limit of Riemann sums.

To avoid worrying about 0− and t+ we will assume that f (t) is continuous. With appropriate care, the proof will work for an f (t) that has jump discontinuities or contains delta functions.

As we saw in the session on Linear Operators in the last unit, linear time invariance means that

y(t) solves P(D)y = f (t) ⇒ y(t − a) solves P(D)y = f (t − a). (3)

Or, in the language of input-response, if y(t) is the response to input f (t) then y(t − a) is the response to input f (t − a).

First we will partition time into intervals of width Δt. So, $t_0 = 0$, $t_1 = \Delta t$, $t_2 = 2\Delta t$, etc.

Figure 1: Division of the t-axis into small intervals.

Next we decompose the input signal f(t) into packets over each interval. The kth signal packet, $f_k(t)$, coincides with f(t) between $t_k$ and $t_{k+1}$ and is 0 elsewhere:

$$f_k(t) = \begin{cases} f(t) & \text{for } t_k < t < t_{k+1} \\ 0 & \text{elsewhere.} \end{cases}$$

Figure 2: The signal packet $f_k(t)$.

It is clear that for t > 0 we have f (t) is the sum of the packets

f (t) = f0(t) + f1(t) + . . . + fk(t) + . . .

A single packet fk(t) is concentrated entirely in a small neighborhood of tk so it is approximately an impulse with the same size as the area under fk(t). The area under fk(t) ≈ f (tk) Δt. Hence,

fk(t) ≈ ( f (tk) Δt) δ(t − tk).

The weight function w(t) is the response to δ(t). So, by linear time invariance the response to $f_k(t)$ is

yk(t) ≈ ( f (tk) Δt) w(t − tk).

We want to find the response at a fixed time. Since t is already in use, we will let T be our fixed time and find y(T).

Since f is the sum of the $f_k$, superposition gives that y is the sum of the $y_k$. That is, at time T

$$y(T) = y_0(T) + y_1(T) + \cdots \approx \big(f(t_0)w(T - t_0) + f(t_1)w(T - t_1) + \cdots\big)\,\Delta t. \qquad (4)$$

We can ignore all the terms where $t_k > T$. (Because then $w(T - t_k) = 0$, since $T - t_k < 0$.) If n is the last index where $t_k < T$ we have

$$y(T) \approx \big(f(t_0)w(T - t_0) + f(t_1)w(T - t_1) + \cdots + f(t_n)w(T - t_n)\big)\,\Delta t.$$

This is a Riemann sum and as Δt → 0 it goes to an integral

$$y(T) = \int_0^T f(t)\,w(T - t)\,dt.$$

Except for the change in notation this is Green’s formula (2).

Note on Causality: Causality is the principle that the future does not affect the past. Green’s formula shows that the system (1) is causal. That is, y(t) only depends on the input up to time t. Real physical systems are causal.

There are non-causal systems. For example, an audio compressor that gathers information after time t before deciding how to compress the signal at time t is non-causal. Another example is the system with input f(t) and output y(t) where y is the solution to $\dot{y} = f(t + 1)$.

Examples

We will give several examples of Green’s formula. The first we will ’build from scratch’ so you get a sense of how this formula arises naturally. The last example shows how Green’s formula works for a system driven at its resonant frequency.

Example 1. The build up of a pollutant in a lake

Every good formula deserves a particularly illuminating example, and perhaps the following will serve for the convolution integral. It is also illustrated by the Mathlet Convolution: Accumulation.

Problem: We have a lake, and a pollutant is being dumped into it, at a certain variable rate f (t). This pollutant degrades exponentially over time. If the lake begins at time zero with no pollutant, how much is in the lake at time t > 0?

Solution. Let x(t) be the amount of pollutant in the lake at time t and a be the decay constant. For exponential decay we know that if a quantity p of pollutant is dropped in the lake at time $\tau_k$ then at a later time t it will have been reduced to the amount

$$p\,e^{-a(t - \tau_k)}. \qquad (1)$$

Here $t - \tau_k$ is the time elapsed between when the pollutant is added and when we check how much of it is left.

In our system pollutant is not being added all at once. Rather, it is dripping continuously into the lake. We break the interval [0, t] into n small pieces of width Δτ, with endpoints $\tau_0 = 0, \tau_1, \tau_2, \ldots, \tau_k, \tau_{k+1}, \ldots, \tau_n = t$.

Let pk be the amount of pollutant added in the interval [τk, τk+1]. Since Δτ is small we get the approximation

pk ≈ f (τk)Δτ.

(Remember f(τ) is a rate; to get a quantity you must multiply by time.) According to equation (1) the amount of this left at time t is approximately

$$p_k\,e^{-a(t - \tau_k)} \approx f(\tau_k)\,\Delta\tau\,e^{-a(t - \tau_k)}.$$

This is approximately the contribution to x(t) from the interval $[\tau_k, \tau_{k+1}]$. To determine x(t) we simply sum up the contributions of all the intervals.

$$x(t) \approx p_0 e^{-a(t - \tau_0)} + p_1 e^{-a(t - \tau_1)} + \cdots + p_{n-1} e^{-a(t - \tau_{n-1})}$$

$$\approx \left(f(\tau_0)e^{-a(t - \tau_0)} + f(\tau_1)e^{-a(t - \tau_1)} + \cdots + f(\tau_{n-1})e^{-a(t - \tau_{n-1})}\right)\Delta\tau.$$

This is a Riemann sum. Taking the limit as Δτ → 0 we get the convolution integral

$$x(t) = \int_0^t f(\tau)\,e^{-a(t - \tau)}\,d\tau. \qquad (2)$$
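The slicing argument translates directly into a few lines of code. The sketch below is illustrative only: the constant dumping rate r, the decay constant a and the time t are made-up values, chosen because for a constant rate the integral (2) has the simple closed form $r(1 - e^{-at})/a$.

```python
import math

def pollutant_riemann(rate, a, t, n):
    """Left Riemann sum for x(t): sum over k of f(tau_k) * dtau * e^{-a (t - tau_k)}."""
    dtau = t / n
    return sum(rate(k * dtau) * dtau * math.exp(-a * (t - k * dtau)) for k in range(n))

a, r, t = 0.5, 3.0, 4.0  # hypothetical decay constant, dumping rate and time

# Closed form of the convolution integral (2) for the constant rate f = r.
exact = r * (1 - math.exp(-a * t)) / a

coarse = pollutant_riemann(lambda tau: r, a, t, 10)      # 10 slices
fine = pollutant_riemann(lambda tau: r, a, t, 10000)     # 10000 slices
```

As the number of slices grows, the sum approaches the value of the integral, which is the limit taken in the text.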

Example 2. In example 1 we constructed our formula by slicing an interval into pieces. You should know how to do this. But, we prove theorems and formulas to avoid always going back to first principles. In this example we will solve the problem in example 1 using the differential equation for exponential decay and finding its weight function. (Of course, this DE was found by slicing an interval into pieces . . . .)

The DE with rest IC is

$$\dot{x} + ax = f(t), \qquad x(0^-) = 0.$$

Its weight function w(t) is 0 for t < 0, and for t > 0 it is the solution to the IVP

$$\dot{w} + aw = 0, \qquad w(0) = 1.$$

We get $w(t) = e^{-at}$ for t > 0.

Using Green’s formula we again get the convolution integral (2).

Example 3. Resonance

Use Green’s formula to solve the DE with rest initial conditions

$$2\ddot{x} + 8x = \cos(2t), \qquad x(0^-) = 0, \quad \dot{x}(0^-) = 0.$$

For t > 0, the weight function is the solution to

$$2\ddot{w} + 8w = 0, \qquad w(0) = 0, \quad \dot{w}(0) = 1/2.$$

The solution is $w(t) = \frac{1}{4}\sin(2t)$. For t > 0 Green’s formula gives

$$x(t) = \int_0^t \frac{1}{4}\sin(2\tau)\cos(2(t - \tau))\,d\tau.$$

This is an easy integral; we sketch the algebra to compute it. It uses the trigonometric identity

$$\sin(A)\cos(B) = \frac{\sin(A + B) + \sin(A - B)}{2}.$$

$$\begin{aligned} x(t) &= \int_0^t \frac{1}{4}\sin(2\tau)\cos(2(t - \tau))\,d\tau \\ &= \frac{1}{8}\int_0^t \big(\sin(2t) + \sin(4\tau - 2t)\big)\,d\tau \\ &= \frac{1}{8}\left[\tau\sin(2t) - \frac{\cos(4\tau - 2t)}{4}\right]_0^t \\ &= \frac{t\sin(2t)}{8}. \end{aligned}$$

This is the answer we expected from our earlier work with the exponential response formula.
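The resonance answer can be confirmed numerically by evaluating the Green's formula integral directly. This is a sketch under the same setup as Example 3, with weight function sin(2t)/4 and input cos(2t); the helper names are hypothetical and the integral is approximated with composite Simpson's rule.

```python
import math

def simpson(func, lo, hi, n=2000):
    """Composite Simpson's rule with an even number n of subintervals."""
    h = (hi - lo) / n
    total = func(lo) + func(hi)
    for k in range(1, n):
        total += (4 if k % 2 else 2) * func(lo + k * h)
    return total * h / 3

def x_green(t):
    """Green's formula integral of Example 3, written as in the text."""
    return simpson(lambda tau: 0.25 * math.sin(2 * tau) * math.cos(2 * (t - tau)), 0.0, t)

def x_closed(t):
    """Closed form t sin(2t) / 8 with its linearly growing amplitude."""
    return t * math.sin(2 * t) / 8

worst = max(abs(x_green(t) - x_closed(t)) for t in (0.5, 1.0, 3.0, 10.0))
```

The growing factor t in the closed form is the signature of driving the system at its resonant frequency.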


Laplace Transform Basics: Introduction

An operator takes a function as input and outputs another function. A transform does the same thing with the added twist that the output function has a different independent variable. The Laplace transform takes a function f (t) and produces a function F(s).

We will allow the variable s to be complex. We will see that it can be thought of as complex circular frequency.

It is best to think of f (t) and F(s) as two views of the same underlying phenomenon. If we have a signal then f (t) is the familiar view of that signal in time and F(s) is the less familiar view in frequency. Everything about the signal is present in both views, but some things are easier to see in one view or the other. Using them together gives us a powerful tool for understanding systems and signals.

In practice the Laplace transform has the following benefits:

• It makes explicit the long-term behavior of f (t).

• It converts differential equations into algebraic equations.

• It allows you to find the operator P(D) if you know its weight function w(t).

• It converts Green’s formula, which is a complicated convolution integral in the time view, into a simple algebraic statement in the frequency view.

• Most importantly, the frequency view can be summarized in something called the pole diagram. This diagram can show at a glance the stability and frequency response of a system. It is an important engineering design tool.

We will see and explore each of these virtues in this and the next few sessions.

Definition of Laplace Transform

1. Definition of Laplace Transform

The Laplace transform of a function f(t) of a real variable t is another function depending on a new variable s, which is in general complex. We will denote the Laplace transform of f by $\mathcal{L}f$. It is defined by the integral

$$(\mathcal{L}f)(s) = \int_{0^-}^{\infty} f(t)\,e^{-st}\,dt, \qquad (1)$$

for all values of s for which the integral converges.

There are a few things to note.

• L f is only defined for those values of s for which the improper integral on the right-hand side of (1) converges.

• We will allow s to be complex.

• As with convolution, the use of $0^-$ in the definition (1) is necessary to accommodate generalized functions containing δ(t). Many textbooks do not do this carefully, and hence their definition of the Laplace transform is not consistent with the properties they assert. In those cases where $0^-$ isn’t needed we will use the less precise form

$$(\mathcal{L}f)(s) = \int_0^{\infty} f(t)\,e^{-st}\,dt. \qquad (1')$$

• Also, as with convolution, the limits of integration mean that the Laplace transform is only concerned with functions on (0−, ∞).

2. Notation, F(s)

We will adopt the following conventions:

1. Writing (L f )(s) can be cumbersome so we will often use an uppercase letter to indicate the Laplace transform of the corresponding lowercase function:

(L f )(s) = F(s), (Lg)(s) = G(s), etc.

For example, in the formula

$$\mathcal{L}(f') = sF(s) - f(0^-)$$

it is understood that we mean $F(s) = \mathcal{L}(f)$.

2. If our function doesn’t have a name we will use the formula instead. For example, the Laplace transform of the function $t^2$ is written $\mathcal{L}(t^2)(s)$ or more simply $\mathcal{L}(t^2)$.

3. If in some context we need to modify f(t), e.g. by applying a translation by a number a, we can write $\mathcal{L}(f(t - a))$ for the Laplace transform of this translation of f.

4. You’ve already seen several different ways to use parentheses. Sometimes we will even drop them altogether. So, if $f(t) = t^2$ then the following all mean the same thing

$$(\mathcal{L}f)(s) = F(s) = \mathcal{L}f(s) = \mathcal{L}(f(t))(s) = \mathcal{L}(t^2)(s); \qquad \mathcal{L}f = F = \mathcal{L}(t^2).$$

3. First Examples

For the first few examples we will explicitly use a limit for the improper integral. Soon we will do this implicitly without comment.

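Example 1. Let f(t) = 1, find $F(s) = \mathcal{L}f(s)$.

Solution. Using the definition (1’) we have

$$\mathcal{L}(1) = F(s) = \int_0^{\infty} e^{-st}\,dt = \lim_{T\to\infty}\left[\frac{e^{-st}}{-s}\right]_0^T = \lim_{T\to\infty}\frac{e^{-sT} - 1}{-s}.$$

The limit depends on whether s is positive or negative.

$$\lim_{T\to\infty} e^{-sT} = \begin{cases} 0 & \text{if } s > 0 \\ \infty & \text{if } s < 0. \end{cases}$$

Therefore,

$$\mathcal{L}(1) = F(s) = \begin{cases} \dfrac{1}{s} & \text{if } s > 0 \\ \text{diverges} & \text{if } s \le 0. \end{cases}$$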

(We didn’t actually compute the case s = 0, but it is easy to see it diverges.)

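Example 2. Compute $\mathcal{L}(e^{at})$.

Solution. Using the definition (1’) we have

$$\mathcal{L}(e^{at}) = \int_0^{\infty} e^{at}e^{-st}\,dt = \lim_{T\to\infty}\left[\frac{e^{(a-s)t}}{a - s}\right]_0^T = \lim_{T\to\infty}\frac{e^{(a-s)T} - 1}{a - s}.$$

The limit depends on whether s > a or s < a.

$$\lim_{T\to\infty} e^{(a-s)T} = \begin{cases} 0 & \text{if } s > a \\ \infty & \text{if } s < a. \end{cases}$$

Therefore,

$$\mathcal{L}(e^{at}) = \begin{cases} \dfrac{1}{s - a} & \text{if } s > a \\ \text{diverges} & \text{if } s \le a. \end{cases}$$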

(We didn’t actually compute the case s = a, but it is easy to see it diverges.)

We have the first two entries in our table of Laplace transforms:

$f(t) = 1 \;\Rightarrow\; F(s) = 1/s, \quad s > 0$

$f(t) = e^{at} \;\Rightarrow\; F(s) = 1/(s - a), \quad s > a.$
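These two table entries can be sanity-checked by approximating the Laplace integral numerically for a real value of s inside the region of convergence. The sketch below is an illustration, not a general-purpose transform: the helper `laplace` truncates the improper integral at a finite T (an assumption that is harmless here because the integrands decay exponentially) and applies Simpson's rule.

```python
import math

def laplace(f, s, T=60.0, n=60000):
    """Approximate integral_0^T f(t) e^{-st} dt by Simpson's rule (real s, finite T)."""
    h = T / n
    total = f(0.0) + f(T) * math.exp(-s * T)
    for k in range(1, n):
        t = k * h
        total += (4 if k % 2 else 2) * f(t) * math.exp(-s * t)
    return total * h / 3

s, a = 2.0, 0.5                                  # sample values with s > a and s > 0
L_one = laplace(lambda t: 1.0, s)                # compare with 1/s
L_exp = laplace(lambda t: math.exp(a * t), s)    # compare with 1/(s - a)
```

For s ≤ a the truncated integral keeps growing as T increases, which is the numerical face of the divergence noted above.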

4. Linearity

You will not be surprised to learn that the Laplace transform is linear. For functions f , g and constants c1, c2

L(c1 f + c2g) = c1L( f ) + c2L(g)

This is clear from the definition (1) of L and the linearity of integration.


Domain of F(s)

1. Complex s and region of convergence

We will allow s to be complex, using as needed the properties of the complex exponential we learned in unit 1.

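Example 1. In the previous note we saw that $\mathcal{L}(1) = 1/s$, valid for all s > 0. Let’s recompute $\mathcal{L}(1)(s)$ for complex s. Let s = α + iβ.

$$\begin{aligned} \mathcal{L}(1)(s) &= \int_0^{\infty} e^{-st}\,dt = \lim_{T\to\infty}\left[\frac{e^{-st}}{-s}\right]_0^T \\ &= \lim_{T\to\infty}\left[\frac{e^{-\alpha t}\big(\cos(\beta t) - i\sin(\beta t)\big)}{-s}\right]_0^T. \end{aligned}$$

This converges if α > 0 and diverges if α < 0. Since α = Re(s) we have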

L(1) = 1/s, for Re(s) > 0.

The region Re(s) > 0 is called the region of convergence of the transform. It is a right half-plane.

Figure. Region of convergence: right half-plane Re(s) > 0 (horizontal axis: real axis; vertical axis: imaginary axis).

Frequency: The Laplace transform variable s can be thought of as complex frequency. It will take us a while to understand this, but we can begin here. Euler’s formula says $e^{i\omega t} = \cos(\omega t) + i\sin(\omega t)$ and we call ω the angular frequency. By analogy, for any complex number exponent we call s the complex frequency in $e^{st}$. If s = a + iω then s is the complex frequency and its imaginary part ω is an actual frequency of a sinusoidal oscillation.

2. Piecewise continuous functions and functions of exponential order

If the integral fails to converge for every value of s then the function does not have a Laplace transform.

Example. It is easy to see that $f(t) = e^{t^2}$ has no Laplace transform.

The problem is that $e^{t^2}$ grows too fast as t gets large. Fortunately, all of the functions we are interested in do have Laplace transforms valid for Re(s) > a for some value a.

Functions of Exponential Order The class of functions that do have Laplace transforms are those of exponential order. Fortunately for us, all the functions we use in 18.03 are of this type.

A function is said to be of exponential order if there are numbers a and M such that $|f(t)| < Me^{at}$. In this case, we say that f has exponential order a.

Examples. 1, cos(ωt) and sin(ωt) all have exponential order 0, and $e^{at}$ has exponential order a. The power $t^n$ is not bounded by a constant, but it has exponential order a for every a > 0.

A function f(t) is piecewise continuous if it is continuous everywhere except at a finite number of points in any finite interval and if at these points it has a jump discontinuity (i.e. a jump of finite height).

Example. The square wave is piecewise continuous.

Theorem: If f(t) is piecewise continuous and of exponential order a then the Laplace transform $\mathcal{L}f(s)$ converges for all s with Re(s) > a.

Proof: Suppose Re(s) > a and $|f(t)| < Me^{at}$. Then we can write s = (a + α) + ib, where α > 0. Then, since $|e^{-ibt}| = 1$,

$$|f(t)e^{-st}| = |f(t)e^{-(a+\alpha)t}e^{-ibt}| = |f(t)e^{-(a+\alpha)t}| < Me^{-\alpha t}.$$

Since $\int_0^{\infty} Me^{-\alpha t}\,dt$ converges for α > 0, the Laplace transform integral also converges.

Domain of F(s): For f(t) = 1 we have F(s) = 1/s with region of convergence Re(s) > 0. But, the function 1/s is well defined for all s ≠ 0. The process of extending the domain of F(s) from the region of convergence is called analytic continuation. In this class analytic continuation will always consist of extending F(s) to the complex plane minus the zeros of the denominator.

More Entries for the Laplace Table

In this note we will add some new entries to the table of Laplace transforms.

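1. $\mathcal{L}(\cos(\omega t)) = \dfrac{s}{s^2 + \omega^2}$, with region of convergence Re(s) > 0.

2. $\mathcal{L}(\sin(\omega t)) = \dfrac{\omega}{s^2 + \omega^2}$, with region of convergence Re(s) > 0.

Proof: We already know that $\mathcal{L}(e^{at}) = 1/(s - a)$. Using this and Euler’s formula for the complex exponential, we obtain

$$\mathcal{L}(\cos(\omega t) + i\sin(\omega t)) = \mathcal{L}(e^{i\omega t}) = \frac{1}{s - i\omega} = \frac{1}{s - i\omega}\cdot\frac{s + i\omega}{s + i\omega} = \frac{s + i\omega}{s^2 + \omega^2}.$$

Taking the real and imaginary parts gives us the formulas.

$$\mathcal{L}(\cos(\omega t)) = \mathrm{Re}\left(\mathcal{L}(e^{i\omega t})\right) = s/(s^2 + \omega^2)$$

$$\mathcal{L}(\sin(\omega t)) = \mathrm{Im}\left(\mathcal{L}(e^{i\omega t})\right) = \omega/(s^2 + \omega^2)$$

The region of convergence follows from the fact that cos(ωt) and sin(ωt) both have exponential order 0.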

Another approach would have been to use integration by parts to compute the transforms directly from the Laplace integral.

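3. For a positive integer n, $\mathcal{L}(t^n) = n!/s^{n+1}$. The region of convergence is Re(s) > 0.

Proof: We start with n = 1.

$$\mathcal{L}(t) = \int_0^{\infty} t\,e^{-st}\,dt$$

Using integration by parts with $u = t$, $dv = e^{-st}\,dt$, so that $du = dt$ and $v = e^{-st}/(-s)$:

$$\mathcal{L}(t) = \left[-\frac{t\,e^{-st}}{s}\right]_0^{\infty} + \frac{1}{s}\int_0^{\infty} e^{-st}\,dt.$$

For Re(s) > 0 the first term is 0 and the second term is $\frac{1}{s}\mathcal{L}(1) = 1/s^2$. Thus,

$$\mathcal{L}(t) = 1/s^2.$$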

Next let’s do n = 2:

$$\mathcal{L}(t^2) = \int_0^{\infty} t^2 e^{-st}\,dt$$

Again using integration by parts with $u = t^2$, $dv = e^{-st}\,dt$, so that $du = 2t\,dt$ and $v = e^{-st}/(-s)$:

$$\mathcal{L}(t^2) = \left[-\frac{t^2 e^{-st}}{s}\right]_0^{\infty} + \frac{1}{s}\int_0^{\infty} 2t\,e^{-st}\,dt.$$

For Re(s) > 0 the first term is 0 and the second term is $\frac{1}{s}\mathcal{L}(2t) = 2/s^3$.

Thus, $\mathcal{L}(t^2) = 2/s^3$.

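We can see the pattern: there is a reduction formula for

$$\mathcal{L}(t^n) = \int_0^{\infty} t^n e^{-st}\,dt.$$

Integration by parts with $u = t^n$, $dv = e^{-st}\,dt$, so that $du = nt^{n-1}\,dt$ and $v = e^{-st}/(-s)$:

$$\mathcal{L}(t^n) = \left[-\frac{t^n e^{-st}}{s}\right]_0^{\infty} + \frac{1}{s}\int_0^{\infty} nt^{n-1}e^{-st}\,dt.$$

For Re(s) > 0 the first term is 0 and the second term is $\frac{1}{s}\mathcal{L}(nt^{n-1})$.

Thus, $\mathcal{L}(t^n) = \frac{n}{s}\mathcal{L}(t^{n-1})$.

Thus we have

$$\mathcal{L}(t^3) = \frac{3}{s}\mathcal{L}(t^2) = \frac{3\cdot 2}{s^4} = \frac{3!}{s^4}$$

$$\mathcal{L}(t^4) = \frac{4}{s}\mathcal{L}(t^3) = \frac{4\cdot 3!}{s^5} = \frac{4!}{s^5}$$

$$\cdots$$

$$\mathcal{L}(t^n) = \frac{n!}{s^{n+1}}.$$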
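The reduction formula is easy to turn into code and compare against a direct numerical evaluation of the Laplace integral. The sketch below is only an illustration for one real value of s: `laplace` truncates the integral at a finite T (an assumption that is safe here because $t^n e^{-st}$ decays quickly), and `power_transform` applies the reduction step n times starting from $\mathcal{L}(1) = 1/s$. Both names are hypothetical.

```python
import math

def laplace(f, s, T=80.0, steps=80000):
    """Approximate integral_0^T f(t) e^{-st} dt by Simpson's rule (real s, finite T)."""
    h = T / steps
    total = f(0.0) + f(T) * math.exp(-s * T)
    for k in range(1, steps):
        t = k * h
        total += (4 if k % 2 else 2) * f(t) * math.exp(-s * t)
    return total * h / 3

def power_transform(n, s):
    """n!/s^(n+1), built from L(t^n) = (n/s) L(t^(n-1)) with L(1) = 1/s."""
    value = 1.0 / s
    for k in range(1, n + 1):
        value *= k / s
    return value

s = 1.5
# For each n: (n, numerical integral, value from the reduction formula).
checks = [(n, laplace(lambda t, n=n: t ** n, s), power_transform(n, s)) for n in range(5)]
```

Each tuple in `checks` pairs the numerically integrated value with the one obtained from the reduction formula.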

4. (s-shift formula) If z is any complex number and f(t) is any function then

$$\mathcal{L}(e^{zt}f(t)) = F(s - z).$$

As usual we write $F(s) = \mathcal{L}(f)(s)$. If the region of convergence for $\mathcal{L}(f)$ is Re(s) > a then the region of convergence for $\mathcal{L}(e^{zt}f(t))$ is Re(s) > Re(z) + a.

Proof: We simply calculate

$$\mathcal{L}(e^{zt}f(t)) = \int_0^{\infty} e^{zt}f(t)\,e^{-st}\,dt = \int_0^{\infty} f(t)\,e^{-(s-z)t}\,dt = F(s - z).$$

Example. Find the Laplace transform of e−t cos(3t).

Solution. We could do this by using Euler’s formula to write

$$e^{-t}\cos(3t) = \frac{1}{2}\left(e^{(-1+3i)t} + e^{(-1-3i)t}\right)$$

but it’s even easier to use the s-shift formula with z = −1, which gives

$$\mathcal{L}(e^{-t}f(t)) = F(s + 1),$$

where here f(t) = cos(3t), so that $F(s) = s/(s^2 + 9)$. Applying the s-shift formula with z = −1 gives

$$\mathcal{L}(e^{-t}\cos(3t)) = F(s + 1) = \frac{s + 1}{(s + 1)^2 + 9}.$$

We record two important cases of the s-shift formula:

4a) $\mathcal{L}(e^{zt}\cos(\omega t)) = \dfrac{s - z}{(s - z)^2 + \omega^2}$

4b) $\mathcal{L}(e^{zt}\sin(\omega t)) = \dfrac{\omega}{(s - z)^2 + \omega^2}$.
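Formulas 4a) and 4b) can be spot-checked for a real shift z by approximating the Laplace integral numerically. In the sketch below the helper `laplace` (a truncated Simpson's rule, valid only for real s and decaying integrands) and the sample values z = −1, ω = 3, s = 1 are assumptions made for illustration.

```python
import math

def laplace(f, s, T=60.0, steps=60000):
    """Approximate integral_0^T f(t) e^{-st} dt by Simpson's rule (real s, finite T)."""
    h = T / steps
    total = f(0.0) + f(T) * math.exp(-s * T)
    for k in range(1, steps):
        t = k * h
        total += (4 if k % 2 else 2) * f(t) * math.exp(-s * t)
    return total * h / 3

z, omega, s = -1.0, 3.0, 1.0   # real shift, frequency and evaluation point

# 4a) e^{zt} cos(omega t)  ->  (s - z) / ((s - z)^2 + omega^2)
lhs_cos = laplace(lambda t: math.exp(z * t) * math.cos(omega * t), s)
rhs_cos = (s - z) / ((s - z) ** 2 + omega ** 2)

# 4b) e^{zt} sin(omega t)  ->  omega / ((s - z)^2 + omega^2)
lhs_sin = laplace(lambda t: math.exp(z * t) * math.sin(omega * t), s)
rhs_sin = omega / ((s - z) ** 2 + omega ** 2)
```

With these values the right-hand sides are 2/13 and 3/13, and the numerical integrals agree to many decimal places.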

Consistency. It is always useful to check for consistency among our various formulas:

1. We have $\mathcal{L}(1) = 1/s$, so the s-shift formula gives $\mathcal{L}(e^{zt}\cdot 1) = 1/(s - z)$. This matches our formula for $\mathcal{L}(e^{zt})$.

2. We have $\mathcal{L}(t^n) = n!/s^{n+1}$. If n = 0 we have $\mathcal{L}(t^0) = 0!/s = 1/s$. This matches our formula for $\mathcal{L}(1)$.

Computing the Laplace Transform

Quiz: What is $\mathcal{L}((1 + t)^2)$?

Choices:

a. $(1/s + 1/s^2)^2 = 1/s^2 + 2/s^3 + 1/s^4$.

b. $1/s + 2/s^2 + 2/s^3$.

c. $(1 + t)(1/s + 1/s^2)$.

d. None of the above.

Answer: (b): $\mathcal{L}((1 + t)^2) = \mathcal{L}(1 + 2t + t^2) = \dfrac{1}{s} + \dfrac{2}{s^2} + \dfrac{2!}{s^3}$.





The Laplace Transform of the Delta Function

Since the Laplace transform is given by an integral, it should be easy to compute it for the delta function. The answer is

1. L(δ(t)) = 1.

2. $\mathcal{L}(\delta(t - a)) = e^{-as}$ for a > 0.

As expected, proving these formulas is straightforward as long as we use the precise form of the Laplace integral. For (1) we have:

$$\mathcal{L}(\delta(t)) = \int_{0^-}^{\infty} \delta(t)\,e^{-st}\,dt = 1.$$

As we saw in a previous session, integrating $e^{-st}$ against δ(t) amounts to evaluating $e^{-st}$ at t = 0, and $e^0 = 1$. Similarly for the shifted version (2), integrating $e^{-st}$ against δ(t − a) amounts to evaluating $e^{-st}$ at t = a:

$$\mathcal{L}(\delta(t - a)) = \int_{0^-}^{\infty} \delta(t - a)\,e^{-st}\,dt = e^{-sa}.$$

Notice that the two formulas are consistent: if we set a = 0 in formula (2) then we recover formula (1).

Partial Fractions and Inverse Laplace Transform

In order to use the Laplace transform we need to be able to invert it and find f(t) when we’re given F(s). Often this can be done by using the Laplace transform table. So for example, if F(s) = 1/(s − 5) then $f(t) = e^{5t}$.

More often we have to do some algebra to get F(s) into a form suitable for the direct use of the table. Our main technique for doing this is the partial fractions decomposition. You probably saw this before in calculus as a method for computing integrals.

First we will learn how to do partial fractions in a straightforward algebraic way using the method of undetermined coefficients. Next we will learn the Heaviside coverup method which makes some of the algebra easier.

Laplace Inverse by Table Lookup

The first thing we need to be able to do is to use the Laplace table to find the inverse Laplace transform. We will illustrate this entirely by examples.

Notation: The inverse Laplace transform will be denoted L−1.

Example 1. Find L−1(1/(s − 2)).

Solution. Use the table entry L(eat) = 1/(s − a):

L−1(1/(s − 2)) = e2t .

Example 2. Find L−1(1/(s2 + 9)).

Solution. Use the table entry L(sin(ωt)) = ω/(s2 + ω2) and linearity:

$$\mathcal{L}^{-1}\left(\frac{1}{s^2 + 9}\right) = \frac{1}{3}\,\mathcal{L}^{-1}\left(\frac{3}{s^2 + 3^2}\right) = \frac{1}{3}\sin(3t).$$

Example 3. Find L−1(4/s2).

Solution. Use the table entry L(t) = 1/s2: L−1(4/s2) = 4t.

Example 4. Find L−1(4/(s − 2)2).

Solution. Use the s-shift formula $\mathcal{L}(e^{2t} f(t)) = F(s - 2)$, where, in this case,

$F(s) = 4/s^2 \;\Rightarrow\; f(t) = 4t$ by example (3).

Therefore, $\mathcal{L}^{-1}(4/(s - 2)^2) = \mathcal{L}^{-1}(F(s - 2)) = e^{2t} f(t) = 4t\,e^{2t}$.

Example 5. Find $\mathcal{L}^{-1}\left(\dfrac{1}{s^2 + 4s + 13}\right)$.

Solution. We first need to complete the square:

$$s^2 + 4s + 13 = s^2 + 4s + 4 + 9 = (s + 2)^2 + 9.$$

We have a shifted function $F(s + 2)$, where $F(s) = 1/(s^2 + 9)$. Using example (2), we know that $f(t) = \sin(3t)/3$, so using the s-shift rule we get

$$\mathcal{L}^{-1}\left(\frac{1}{s^2 + 4s + 13}\right) = \mathcal{L}^{-1}(F(s + 2)) = \frac{e^{-2t}\sin(3t)}{3}.$$
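A table lookup like the one in Example 5 can be sanity-checked numerically. The sketch below is not part of the course material: it approximates the Laplace integral with Simpson's rule and compares it with 1/(s^2 + 4s + 13) at a few sample values of s. The helper name `laplace_numeric`, the cutoff `t_max`, and the step count are assumptions made for illustration.

```python
import math

def laplace_numeric(f, s, t_max=40.0, steps=40000):
    """Approximate the integral of f(t) * exp(-s*t) over [0, t_max] by Simpson's rule.

    t_max and steps are illustrative choices; truncating at t_max is only
    reasonable when f(t) * exp(-s*t) is negligible beyond it.
    """
    if steps % 2:
        steps += 1  # Simpson's rule needs an even number of subintervals
    h = t_max / steps
    total = f(0.0) + f(t_max) * math.exp(-s * t_max)
    for k in range(1, steps):
        t = k * h
        weight = 4 if k % 2 else 2
        total += weight * f(t) * math.exp(-s * t)
    return total * h / 3

def f(t):
    # Inverse transform obtained in Example 5
    return math.exp(-2 * t) * math.sin(3 * t) / 3

def F(s):
    # The function of s we started from
    return 1 / (s * s + 4 * s + 13)

for s in (0.5, 1.0, 2.0):
    print(s, laplace_numeric(f, s), F(s))
```

The two printed columns should agree to many digits; a disagreement would point to a slip in completing the square or in applying the s-shift rule.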


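Example 6. Find $\mathcal{L}^{-1}\left(\dfrac{s}{(s^2 + \omega^2)^2}\right)$.

Solution. We haven't seen this formula yet, but there is a table entry, which gives: $\dfrac{t}{2\omega}\sin(\omega t)$.

Example 7. Find $\mathcal{L}^{-1}\left(\dfrac{1}{(s^2 + \omega^2)^2}\right)$.

Solution. This is also a table entry, answer: $\dfrac{1}{2\omega^3}\left(\sin(\omega t) - \omega t\cos(\omega t)\right)$.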


Partial Fractions: Undetermined Coefficients

1. Introduction

Not every F(s) we encounter is in the Laplace table. Partial fractions is a method for re-writing F(s) in a form suitable for the use of the table.

In this note we will run through the various cases encountered when we apply the method of partial fractions decomposition to a rational function. In the next note we will learn the Heaviside cover-up method, which simplifies some of the algebra.

Note: We use the term undetermined coefficients in the same way it was used when solving an ODE with polynomial input. It involves setting a polynomial with unknown coefficients equal to a known polynomial and solving for the unknown coefficients by equating them with the known ones.

2. Rational Functions

A rational function is one that is the ratio of two polynomials. For example

$$\frac{s + 1}{s^2 + 7s + 9} \quad\text{and}\quad \frac{s^2 + 7s + 9}{s + 1}$$

are both rational functions.

A rational function is called proper if the degree of the numerator is strictly smaller than the degree of the denominator; in the examples above, the first is proper while the second is not.

Long-division: Using long-division we can always write an improper rational function as a polynomial plus a proper rational function. The partial fraction decomposition only applies to proper functions.

Example 1. Use long-division to write $\dfrac{s^3 + 2s + 1}{s^2 + s - 2}$ as the sum of a polynomial and a proper rational function.

Solution.

                   s − 1
                 ----------------------
    s^2 + s − 2 | s^3         + 2s + 1
                  s^3 + s^2   − 2s
                  ---------------------
                       −s^2   + 4s + 1
                       −s^2   −  s + 2
                  ---------------------
                                5s − 1

Therefore,

$$\frac{s^3 + 2s + 1}{s^2 + s - 2} = s - 1 + \frac{5s - 1}{s^2 + s - 2}.$$
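The long-division step can be mechanized. The following is a minimal sketch, assuming polynomials are stored as coefficient lists with the highest power first; the function name `poly_divmod` is made up for this illustration and does not come from the notes.

```python
from fractions import Fraction

def poly_divmod(num, den):
    """Divide polynomials given as coefficient lists, highest power first.

    Returns (quotient, remainder) with exact Fraction coefficients.
    """
    num = [Fraction(c) for c in num]
    den = [Fraction(c) for c in den]
    quotient = []
    remainder = num[:]
    while len(remainder) >= len(den):
        factor = remainder[0] / den[0]
        quotient.append(factor)
        # Subtract factor * den, aligned with the leading term
        for i, c in enumerate(den):
            remainder[i] -= factor * c
        remainder.pop(0)  # the leading term is now zero
    return quotient, remainder

# (s^3 + 2s + 1) divided by (s^2 + s - 2), as in Example 1
q, r = poly_divmod([1, 0, 2, 1], [1, 1, -2])
print(q, r)
```

Here `q` holds the coefficients of the polynomial part and `r` the numerator of the proper remainder, so the result can be compared directly with the hand computation.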

3. Linear Factors

Here we assume the denominator factors into distinct linear factors. We start with a simple example. We will explain the general principle immediately afterwards.

Example 2. Decompose $R(s) = \dfrac{s - 3}{(s - 2)(s - 1)}$ using partial fractions. Use this to find $\mathcal{L}^{-1}(R(s))$.

Solution.

$$\frac{s - 3}{(s - 2)(s - 1)} = \frac{A}{s - 2} + \frac{B}{s - 1}.$$

Multiplying both sides by the denominator on the left gives

s − 3 = A(s − 1) + B(s − 2) (1)

The sure algebraic way is to expand out the right hand side and equate the coefficients with those of the polynomial on the left.

$$s - 3 = (A + B)s + (-A - 2B) \;\Rightarrow\; \begin{cases} \text{coeff. of } s: & 1 = A + B \\ \text{coeff. of } 1: & -3 = -A - 2B \end{cases}$$

We solve this system of equations to find the undetermined coefficients A and B: A = −1, B = 2. Answer: R(s) = −1/(s − 2) + 2/(s − 1). Table lookup then gives L−1(R(s)) = −e2t + 2et .

Note: In this example it would have been easier to plug the roots of each factor into equation (1). When you do this every term except one becomes 0. Plug in s = 1 ⇒ −2 = B(−1) ⇒ B = 2 Plug in s = 2 ⇒ −1 = A(1) ⇒ A = −1.

In general, if $P(s)/Q(s)$ is a proper rational function and $Q(s)$ factors into distinct linear factors $Q(s) = (s - a_1)(s - a_2)\cdots(s - a_n)$ then

$$\frac{P(s)}{Q(s)} = \frac{A_1}{s - a_1} + \frac{A_2}{s - a_2} + \cdots + \frac{A_n}{s - a_n}.$$

The proof of this is not hard, but we will not give it. Remember you must have a proper rational function and each of the factors must be distinct. Repeated factors are discussed below.


Example 3. Use partial fractions to find $\mathcal{L}^{-1}\left(\dfrac{3}{s^3 - 3s^2 - s + 3}\right)$.

Solution. The hardest part of this problem is to factor the denominator. For higher order polynomials this might be impossible. In this case you can check

s3 − 3s2 − s + 3 = (s − 1)(s + 1)(s − 3).

The partial fractions decomposition is

$$\frac{3}{(s - 1)(s + 1)(s - 3)} = \frac{A}{s - 1} + \frac{B}{s + 1} + \frac{C}{s - 3}.$$

Multiplying through by the denominator gives

3 = A(s + 1)(s − 3) + B(s − 1)(s − 3) + C(s − 1)(s + 1).

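Plugging in $s = 1$ gives $3 = -4A$, so $A = -3/4$; likewise $s = -1$ gives $3 = 8B$, so $B = 3/8$, and $s = 3$ gives $3 = 8C$, so $C = 3/8$. Our answer is

$$\mathcal{L}^{-1}\left(\frac{3}{s^3 - 3s^2 - s + 3}\right) = Ae^{t} + Be^{-t} + Ce^{3t} = -\frac{3}{4}e^{t} + \frac{3}{8}e^{-t} + \frac{3}{8}e^{3t}.$$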

4. Quadratic Factors

If the denominator has quadratic factors, then the numerator in the partial fraction decomposition will be a linear term instead of a constant.

Example 4. Find $\mathcal{L}^{-1}\left(\dfrac{s - 1}{(s + 1)(s^2 + 4)}\right)$.

Solution. This is a proper rational function so

$$\frac{s - 1}{(s + 1)(s^2 + 4)} = \frac{A}{s + 1} + \frac{Bs + C}{s^2 + 4}. \qquad (2)$$

Notice the quadratic factor gets a linear term in the numerator. Notice also that the number of unknown coefficients is the same as the degree of the denominator in the original fraction.

From (2) we can write

$$\mathcal{L}^{-1}\left(\frac{s - 1}{(s + 1)(s^2 + 4)}\right) = Ae^{-t} + B\cos(2t) + \frac{C}{2}\sin(2t).$$


All that's left is to do some algebra to find the coefficients. Multiplying (2) through by the denominator gives

s − 1 = A(s2 + 4) + (Bs + C)(s + 1) = (A + B)s2 + (B + C)s + (4A + C).

Equate the coefficients on both sides:

$$\begin{aligned} s^2 &: \quad 0 = A + B \\ s^1 &: \quad 1 = B + C \\ s^0 &: \quad -1 = 4A + C \end{aligned}$$

Solving, we get A = −2/5, B = 2/5, C = 3/5.
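Equating coefficients always produces a linear system, so the last step can be handed to a small exact solver. The sketch below is only an illustration of that idea: `solve_linear` is a hypothetical helper (plain Gauss-Jordan elimination over fractions), and the unknowns are ordered (A, B, C) as an assumption of the example.

```python
from fractions import Fraction

def solve_linear(matrix, rhs):
    """Solve a square linear system exactly by Gauss-Jordan elimination."""
    n = len(rhs)
    rows = [[Fraction(x) for x in row] + [Fraction(b)]
            for row, b in zip(matrix, rhs)]
    for col in range(n):
        # Find a row with a nonzero entry in this column and move it up
        pivot = next(r for r in range(col, n) if rows[r][col] != 0)
        rows[col], rows[pivot] = rows[pivot], rows[col]
        rows[col] = [x / rows[col][col] for x in rows[col]]
        for r in range(n):
            if r != col:
                m = rows[r][col]
                rows[r] = [x - m * y for x, y in zip(rows[r], rows[col])]
    return [row[-1] for row in rows]

# One row per power of s in  s - 1 = (A + B) s^2 + (B + C) s + (4A + C)
coeffs = [
    [1, 1, 0],  # s^2:  A + B  = 0
    [0, 1, 1],  # s^1:  B + C  = 1
    [4, 0, 1],  # s^0: 4A + C  = -1
]
A, B, C = solve_linear(coeffs, [0, 1, -1])
print(A, B, C)
```

Using `Fraction` keeps the coefficients exact, which makes it easy to compare them with the values found by hand.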

Example 5. Don’t be fooled by quadratic terms that factor into linear ones.

$$\frac{1}{(s + 1)(s^2 - 4)} = \frac{1}{(s + 1)(s + 2)(s - 2)} = \frac{A}{s + 1} + \frac{B}{s + 2} + \frac{C}{s - 2}.$$

Example 6. Don't forget that the rational function must be proper. For example, decompose $\dfrac{s^3 + 2s + 1}{s^2 + s - 2}$ using partial fractions.

Solution. First, we must use long-division to make this proper. From example (1) we have

$$\frac{s^3 + 2s + 1}{s^2 + s - 2} = s - 1 + \frac{5s - 1}{s^2 + s - 2} = s - 1 + \frac{5s - 1}{(s + 2)(s - 1)} = s - 1 + \frac{A}{s + 2} + \frac{B}{s - 1}.$$

Solving for the undetermined coefficients gives A = 11/3, B = 4/3.

5. Repeated Linear Factors

For repeated linear factors we need one partial fraction term for each power of the factor as illustrated by the following example.

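Example 7. Find $\mathcal{L}^{-1}\left(\dfrac{2s}{s^3(s + 1)^2(s + 2)}\right)$.

Solution.

$$\frac{2s}{s^3(s + 1)^2(s + 2)} = \frac{A}{s} + \frac{B}{s^2} + \frac{C}{s^3} + \frac{D}{s + 1} + \frac{E}{(s + 1)^2} + \frac{F}{s + 2}$$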

Here the denominator has a linear factor $s$ repeated three times (term $s^3$), and a linear factor $(s + 1)$ repeated twice (term $(s + 1)^2$); hence three partial fractions are associated with the first, while two are associated with the latter. The term $(s + 2)$, which is not repeated, leads to one partial fraction as previously seen. You can check that the coefficients are

A = −5/2, B = 1, C = 0, D = 2, E = 2, F = 1/2.

Using the s-shift rule we have L−1(1/(s + 1)2) = te−t. Thus,

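$$\begin{aligned} \mathcal{L}^{-1}\left(\frac{2s}{s^3(s + 1)^2(s + 2)}\right) &= A + Bt + \frac{C}{2}t^2 + De^{-t} + Ete^{-t} + Fe^{-2t} \\ &= -\frac{5}{2} + t + 2e^{-t} + 2te^{-t} + \frac{1}{2}e^{-2t}. \end{aligned}$$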

6. Repeated Quadratic Factors

Just like repeated linear factors, quadratic factors have one term for each power of the factor as illustrated in the following example.

Example 8. Find $\mathcal{L}^{-1}\left(\dfrac{2s}{s(s^2 + 1)^2(s^2 + 4s + 6)}\right)$.

Solution. The partial fractions decomposition is

$$\frac{2s}{s(s^2 + 1)^2(s^2 + 4s + 6)} = \frac{A}{s} + \frac{Bs + C}{s^2 + 1} + \frac{Ds + E}{(s^2 + 1)^2} + \frac{Fs + G}{s^2 + 4s + 6}$$

Note the repeated factor $(s^2 + 1)^2$ leads to two partial fraction terms.

We won't compute the coefficients; you can do this by going through the algebra. Instead, we'll focus on finding the Laplace inverse. Our table contains the terms with repeated quadratic factors.

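$$\mathcal{L}^{-1}\left(\frac{Ds}{(s^2 + 1)^2}\right) = D\,\frac{t}{2}\sin(t)$$

$$\mathcal{L}^{-1}\left(\frac{E}{(s^2 + 1)^2}\right) = E\,\frac{1}{2}\left(\sin(t) - t\cos(t)\right).$$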

For the term with denominator $s^2 + 4s + 6$ we need to complete the square. Notice that we make sure to also shift the s-term in the numerator.

$$\mathcal{L}^{-1}\left(\frac{Fs + G}{s^2 + 4s + 6}\right) = \mathcal{L}^{-1}\left(\frac{F(s + 2) + G - 2F}{(s + 2)^2 + 2}\right) = Fe^{-2t}\cos(\sqrt{2}\,t) + \frac{1}{\sqrt{2}}(G - 2F)e^{-2t}\sin(\sqrt{2}\,t).$$


Putting this together, the answer is

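$$\begin{aligned} \mathcal{L}^{-1}\left(\frac{2s}{s(s^2 + 1)^2(s^2 + 4s + 6)}\right) = {} & A + B\cos(t) + C\sin(t) \\ & + D\,\frac{t}{2}\sin(t) + E\,\frac{1}{2}\left(\sin(t) - t\cos(t)\right) \\ & + Fe^{-2t}\cos(\sqrt{2}\,t) + \frac{1}{\sqrt{2}}(G - 2F)e^{-2t}\sin(\sqrt{2}\,t). \end{aligned}$$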

7. Complex Factors

We can allow complex roots. In this case all quadratic terms factor into linear terms.

Example 9. Decompose s/(s2 + ω2) using complex partial fractions and use it to show L−1(s/(s2 + ω2)) = cos(ωt).

Solution.

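$$\frac{s}{s^2 + \omega^2} = \frac{s}{(s - i\omega)(s + i\omega)} = \frac{A}{s - i\omega} + \frac{B}{s + i\omega}.$$

Multiplying through by the denominator gives $s = A(s + i\omega) + B(s - i\omega)$.

Plug in $s = i\omega \Rightarrow A = 1/2$.

Plug in $s = -i\omega \Rightarrow B = 1/2$. From the table: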

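$$\mathcal{L}^{-1}\left(\frac{s}{s^2 + \omega^2}\right) = \mathcal{L}^{-1}\left(\frac{A}{s - i\omega} + \frac{B}{s + i\omega}\right) = Ae^{i\omega t} + Be^{-i\omega t} = \frac{1}{2}\left(e^{i\omega t} + e^{-i\omega t}\right) = \cos(\omega t).$$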
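The complex computation in Example 9 can be replayed with Python's built-in complex numbers. This is a small sketch, not part of the notes: the value of `omega` and the helper name `covered` are arbitrary choices for illustration.

```python
import cmath
import math

omega = 2.0  # illustrative value; any nonzero frequency works

def covered(s, root_to_skip):
    """Evaluate s / ((s - i*omega)(s + i*omega)) with one factor left out."""
    roots = [1j * omega, -1j * omega]
    value = s
    for r in roots:
        if r != root_to_skip:
            value /= (s - r)
    return value

# Plug each root into the expression with its own factor removed
A = covered(1j * omega, 1j * omega)
B = covered(-1j * omega, -1j * omega)

def x(t):
    # A e^{i omega t} + B e^{-i omega t}; the imaginary parts cancel
    return A * cmath.exp(1j * omega * t) + B * cmath.exp(-1j * omega * t)

for t in (0.0, 0.7, 2.5):
    print(t, x(t), math.cos(omega * t))
```

The printed complex values should have (numerically) zero imaginary part and a real part matching cos(ωt).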


Heaviside Cover-up Method

1. Introduction

The cover-up method was introduced by Oliver Heaviside as a fast way to do a decomposition into partial fractions. This is an essential step in using the Laplace transform to solve differential equations, and this was more or less Heaviside’s original motivation.

The cover-up method can be used to make a partial fractions decomposition of a proper rational function $\dfrac{p(s)}{q(s)}$ whenever the denominator can be factored into distinct linear factors.

2. Linear Factors

We first show how the method works on a simple example, and then show why it works.

Example 1. Decompose $\dfrac{s - 7}{(s - 1)(s + 2)}$ into partial fractions.

Solution. We know the answer will have the form

$$\frac{s - 7}{(s - 1)(s + 2)} = \frac{A}{s - 1} + \frac{B}{s + 2}. \qquad (1)$$

To determine A by the cover-up method, on the left-hand side we mentally remove (or cover up with a finger) the factor s − 1 associated with A, and substitute s = 1 into what’s left; this gives A:

$$\left.\frac{s - 7}{s + 2}\right|_{s=1} = \frac{1 - 7}{1 + 2} = -2 = A. \qquad (2)$$

Similarly, B is found by covering up the factor s + 2 on the left, and substituting s = −2 into what’s left. This gives

$$\left.\frac{s - 7}{s - 1}\right|_{s=-2} = \frac{-2 - 7}{-2 - 1} = 3 = B.$$

Thus, our answer is

$$\frac{s - 7}{(s - 1)(s + 2)} = \frac{-2}{s - 1} + \frac{3}{s + 2}. \qquad (3)$$


Why does the method work? The reason is simple. The “right” way to determine A from equation (1) would be to multiply both sides by (s − 1); this would give

$$\frac{s - 7}{s + 2} = A + \frac{B}{s + 2}(s - 1). \qquad (4)$$

Now if we substitute s = 1, what we get is exactly equation (2), since the term on the right disappears. The cover-up method therefore is just an easy and efficient way of doing the calculations.

In general, if the denominator of the proper rational function factors into the product of distinct linear factors:

$$\frac{p(s)}{(s - a_1)(s - a_2)\cdots(s - a_r)} = \frac{A_1}{s - a_1} + \cdots + \frac{A_r}{s - a_r}, \qquad a_i \neq a_j,$$

then Ai is found by covering up the factor s − ai on the left, and setting s = ai in the rest of the expression.
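The general rule translates almost word for word into code. The sketch below is an illustration under stated assumptions rather than a reference implementation: the numerator is a coefficient list (highest power first), the denominator is given by its distinct roots, and the names `eval_poly` and `cover_up` are invented for the example.

```python
from fractions import Fraction

def eval_poly(coeffs, s):
    """Evaluate a polynomial (coefficients highest power first) by Horner's rule."""
    value = Fraction(0)
    for c in coeffs:
        value = value * s + c
    return value

def cover_up(numerator, roots):
    """Return [A_1, ..., A_r] for p(s) / ((s - a_1)...(s - a_r)).

    Assumes the roots are distinct and that deg p < r (proper rational function).
    """
    if len(set(roots)) != len(roots):
        raise ValueError("this sketch handles distinct roots only")
    if len(numerator) > len(roots):
        raise ValueError("rational function must be proper")
    coefficients = []
    for i, a in enumerate(roots):
        a = Fraction(a)
        rest = Fraction(1)
        # "Cover up" the factor (s - a_i): multiply the remaining factors at s = a_i
        for j, b in enumerate(roots):
            if j != i:
                rest *= a - b
        coefficients.append(eval_poly(numerator, a) / rest)
    return coefficients

print(cover_up([1, -7], [1, -2]))  # (s - 7) / ((s - 1)(s + 2))
print(cover_up([1], [0, 1, -1]))   # 1 / (s (s - 1)(s + 1))
```

The first call reproduces Example 1 above; the second corresponds to the denominator s(s − 1)(s + 1) of the next example.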

Example 2. Decompose $\dfrac{1}{s^3 - s}$ into partial fractions.

Solution. Factoring, $s^3 - s = s(s^2 - 1) = s(s - 1)(s + 1)$. By the cover-up method,

$$\frac{1}{s(s - 1)(s + 1)} = \frac{-1}{s} + \frac{1/2}{s - 1} + \frac{1/2}{s + 1}.$$

To be honest, the real difficulty in all of the partial fractions methods (the cover-up method being no exception) is in factoring the denominator. Even the programs which do symbolic integration, like Macsyma or Maple, can only factor polynomials whose factors have integer coefficients, or "easy coefficients" like $\sqrt{2}$, and therefore they can only integrate rational functions with "easily-factored" denominators.

3. Quadratic Factors

Heaviside’s cover-up method can be used even when the denominator doesn’t factor into distinct linear factors. This only gives partial results, but these can often be a big help, as the following example illustrates.

Example 3. Decompose $\dfrac{5s + 6}{(s^2 + 4)(s - 2)}$.

Solution. We write

$$\frac{5s + 6}{(s^2 + 4)(s - 2)} = \frac{As + B}{s^2 + 4} + \frac{C}{s - 2}. \qquad (5)$$



We first determine $C$ by the cover-up method, getting $C = 2$. Then $A$ and $B$ can be found by the method of undetermined coefficients; the work is greatly reduced since we need to solve only two simultaneous equations to find $A$ and $B$, not three.

Following this plan, using C = 2, we combine terms on the right of (5) so that both sides have the same denominator. The numerators must then also be equal, which gives us

5s + 6 = (As + B)(s − 2) + 2(s2 + 4). (6)

Comparing the coefficients of s2 and of the constant terms on both sides of (6) gives the two equations

0 = A + 2 and 6 = −2B + 8,

from which $A = -2$ and $B = 1$.

In using (6), one could have instead compared the coefficients of s, getting 5 = −2A + B, leading to the same result, but providing a valuable check on the correctness of the computed values for A and B.

In Example 3, an alternative to undetermined coefficients would be to substitute two numerical values for s into the original equation (5), say s = 0 and s = 1 (any values other than s = 2 are usable). Again one gets two simultaneous equations for A and B. This method requires addition of fractions, and is usually better when only one coefficient remains to be determined (as in Example 4 below).

Still another method would be to factor the denominator completely into linear factors, using complex coefficients, and then use the cover-up method, but with complex numbers. At the end, conjugate complex terms have to be combined in pairs to produce real summands, and the calculations can sometimes be longer.

4. Repeated Linear Factors

The cover-up method can also be used if a linear factor is repeated, but there too it gives just partial results. It applies only to the highest power of the linear factor.

Example 4. Decompose $\dfrac{1}{(s - 1)^2(s + 2)}$.

Solution. We write

$$\frac{1}{(s - 1)^2(s + 2)} = \frac{A}{(s - 1)^2} + \frac{B}{s - 1} + \frac{C}{s + 2}. \qquad (7)$$



To find A cover up (s − 1)2 and set s = 1; you get A = 1/3. To find C, cover up s + 2, and set s = −2; you get C = 1/9.

This leaves B which cannot be found by the cover-up method. But since A and C are already known in (7), B can be found by substituting any numerical value (other than 1 or −2) for s in (7). For instance, if we put s = 0 and remember that A = 1/3 and C = 1/9, we get

$$\frac{1}{2} = \frac{1/3}{1} + \frac{B}{-1} + \frac{1/9}{2},$$

giving B = −1/9.
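A decomposition like (7) is an identity between rational functions, so it can be checked by comparing both sides at a handful of points with exact arithmetic. The sketch below does this for the coefficients just found; the sample points are arbitrary choices (anything other than s = 1 and s = −2).

```python
from fractions import Fraction

# Coefficients found for 1 / ((s - 1)^2 (s + 2))
A, B, C = Fraction(1, 3), Fraction(-1, 9), Fraction(1, 9)

def left(s):
    return 1 / ((s - 1) ** 2 * (s + 2))

def right(s):
    return A / (s - 1) ** 2 + B / (s - 1) + C / (s + 2)

# Avoid the poles s = 1 and s = -2
samples = [Fraction(v) for v in (0, 2, 3, -1, 5)]
print(all(left(s) == right(s) for s in samples))
```

After clearing denominators both sides are polynomials of degree at most 2, so agreement at three or more points already forces the identity.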

$B$ could also be found by applying the method of undetermined coefficients to (7); note that since $A$ and $C$ are known, it is enough to get a single linear equation in order to determine $B$; simultaneous equations are no longer needed.

The fact that the cover-up method works for just the highest power of the repeated linear factor can be seen just as before. In the above example for instance, the cover-up method for finding $A$ is just a short way of multiplying (7) through by $(s - 1)^2$ and then substituting $s = 1$ into the resulting equation.


Table Entries: Repeated Quadratic Factors


We will add three entries to our Laplace Table.

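$$\mathcal{L}\left(\frac{1}{2\omega^3}\left(\sin(\omega t) - \omega t\cos(\omega t)\right)\right) = \frac{1}{(s^2 + \omega^2)^2} \qquad (1)$$

$$\mathcal{L}\left(\frac{t}{2\omega}\sin(\omega t)\right) = \frac{s}{(s^2 + \omega^2)^2} \qquad (2)$$

$$\mathcal{L}\left(\frac{1}{2\omega}\left(\sin(\omega t) + \omega t\cos(\omega t)\right)\right) = \frac{s^2}{(s^2 + \omega^2)^2} \qquad (3)$$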

There are several ways to prove these formulas. We will give one using partial fractions by factoring the denominators on the frequency side into complex linear factors.

Proof of 1. First some algebra:

$$\frac{1}{(s - a)^2(s + a)^2} = \frac{A}{(s - a)^2} + \frac{B}{s - a} + \frac{C}{(s + a)^2} + \frac{D}{s + a}$$

Cover-up gives us A and C. Undetermined coefficients then gives B and D:

$$A = \frac{1}{4a^2} = C, \qquad D = \frac{1}{4a^3} = -B$$

This gives the inverse Laplace transform

$$\mathcal{L}^{-1}\left(\frac{1}{(s - a)^2(s + a)^2}\right) = \frac{1}{4a^2}\left(te^{at} + te^{-at}\right) - \frac{1}{4a^3}\left(e^{at} - e^{-at}\right). \qquad (4)$$

We will use this on the right hand side of (1), but first recall

eiωt + e−iωt = 2 cos(ωt) and eiωt − e−iωt = 2i sin(ωt) (5)

Let a = iω, then (4) and (5) combine to prove formula (1).

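$$\begin{aligned} \mathcal{L}^{-1}\left(\frac{1}{(s^2 + \omega^2)^2}\right) &= \mathcal{L}^{-1}\left(\frac{1}{(s - i\omega)^2(s + i\omega)^2}\right) \\ &= -\frac{1}{4\omega^2}\left(te^{i\omega t} + te^{-i\omega t}\right) + \frac{1}{4\omega^3 i}\left(e^{i\omega t} - e^{-i\omega t}\right) \\ &= -\frac{1}{2\omega^2}\,t\cos(\omega t) + \frac{1}{2\omega^3}\sin(\omega t). \end{aligned}$$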


The proofs of (2) and (3) are similar, and we will omit them.

2. Note on the Relation to Resonance:

Each of the formulas (1), (2), and (3) has a term with a factor of t. This is exactly what we saw with the response x in the resonance equation

$$\ddot{x} + \omega^2 x = \cos(\omega t),$$

which has solution x(t) = t sin(ωt)/(2ω).

Notice that L−1(1/s2) = t and the s-shift rule shows L−1(1/(s − a)2) = teat. So repeated factors on the frequency side always lead to multiplication by t on the time side. If the repeated factor has a higher power then we get multiplication by a higher power of t.


Laplace Transform: Solving IVP’s: Introduction

In this session we apply the Laplace transform techniques we have learned to solving initial value problems for LTI DE's p(D)x = f (t). For signals f (t) which are discontinuous or impulsive, using the Laplace transform is often the most efficient solution method.

We start by deriving the simple relations between the Laplace transform of a derivative of a function and the Laplace transform of the function itself. Our goal is to use these formulas to solve IVP’s of the form

p(D)x = f (t) (with initial conditions).

We do this by Laplace transforming both sides of the DE and solving for the function X(s) = L(x(t)). It turns out that the resulting equation for X(s) is a simple algebraic equation which can be solved immediately. Then one recovers the solution x(t) by computing the inverse Laplace transform x(t) = L−1(X(s)).

Table Entries: Derivative Rules

1. t-derivative rule

This is a course on differential equations. We should try to compute $\mathcal{L}(f')$. (We use the notation $f'$ instead of $\dot{f}$ simply because we think the dot does not sit nicely over the tall letter $f$.)

As usual, let $\mathcal{L}(f)(s) = F(s)$. Let $f'$ be the generalized derivative of $f$. (Recall, this means jumps in $f$ produce delta functions in $f'$.) The t-derivative rule is

$$\mathcal{L}(f') = sF(s) - f(0^-) \qquad (1)$$

$$\mathcal{L}(f'') = s^2F(s) - s f(0^-) - f'(0^-) \qquad (2)$$

$$\mathcal{L}(f^{(n)}) = s^nF(s) - s^{n-1} f(0^-) - s^{n-2} f'(0^-) - \cdots - f^{(n-1)}(0^-). \qquad (3)$$

Proof: Rule (1) is a simple consequence of the definition of Laplace transform and integration by parts, with $u = e^{-st}$, $v' = f'(t)$, so that $u' = -se^{-st}$, $v = f(t)$:

$$\begin{aligned} \mathcal{L}(f') &= \int_{0^-}^{\infty} f'(t)e^{-st}\,dt \\ &= f(t)e^{-st}\Big|_{0^-}^{\infty} + s\int_{0^-}^{\infty} f(t)e^{-st}\,dt \\ &= -f(0^-) + sF(s). \end{aligned}$$

The last equality follows from:
1. We assume $f(t)$ has exponential order, so if Re(s) is large enough $f(t)e^{-st}$ is 0 at $t = \infty$.
2. The integral in the second term is none other than the Laplace transform of $f(t)$.

Rule (2) follows by applying rule (1) twice.

$$\mathcal{L}(f'') = s\mathcal{L}(f') - f'(0^-) = s\left(sF(s) - f(0^-)\right) - f'(0^-) = s^2F(s) - s f(0^-) - f'(0^-).$$

Rule (3) follows by applying rule (1) n times.

Notes:
1. We will call the terms $f(0^-)$, $f'(0^-)$ the 'annoying terms'. We will be happiest when our signal $f(t)$ has rest initial conditions, so all of the annoying terms are 0.
2. A good way to think of the t-derivative rules is

$\mathcal{L}(f) = F(s)$
$\mathcal{L}(f') = sF(s)$ + annoying terms at $0^-$.
$\mathcal{L}(f'') = s^2F(s)$ + annoying terms at $0^-$.

Roughly speaking, Laplace transforms differentiation in t to multiplication by s.
3. The proof of rule (1) uses integration by parts. This is clearly valid if $f'(t)$ is continuous at $t = 0$. It is also true (although we won't show this) if $f'(t)$ is a generalized function. See example 2 below.

Example 1. Let $f(t) = e^{at}$. We can compute $\mathcal{L}(f')$ directly and by using rule (1).
Directly: $f'(t) = ae^{at} \Rightarrow \mathcal{L}(f') = a/(s - a)$.
Rule (1): $\mathcal{L}(f) = F(s) = 1/(s - a) \Rightarrow \mathcal{L}(f') = sF(s) - f(0^-) = s/(s - a) - 1 = a/(s - a)$.
Both methods give the same answer.

Example 2. Let $u(t)$ be the unit step function, so $\dot{u}(t) = \delta(t)$.
Directly: $\mathcal{L}(\dot{u}) = \mathcal{L}(\delta) = 1$.
Rule (1): $\mathcal{L}(\dot{u}) = s\mathcal{L}(u) - u(0^-) = s(1/s) - 0 = 1$.
Both methods give the same answer.

Example 3. Let $f(t) = t^2 + 2t + 1$. Compute $\mathcal{L}(f'')$ two ways.

Solution. Directly: $f''(t) = 2 \Rightarrow \mathcal{L}(f'') = 2/s$.
Using rule (2): $\mathcal{L}(f'') = s^2F(s) - s f(0^-) - f'(0^-) = s^2(2/s^3 + 2/s^2 + 1/s) - s \cdot 1 - 2 = 2/s$.
Both methods give the same answer.

2. s-derivative rule

There is a certain symmetry in our formulas. If derivatives in time lead to multiplication by s then multiplication by t should lead to derivatives in s. This is true, but, as usual, there are small differences in the details of the formulas.

The s-derivative rule is

$$\mathcal{L}(t f)(s) = -F'(s) \qquad (4)$$

$$\mathcal{L}(t^n f)(s) = (-1)^nF^{(n)}(s) \qquad (5)$$

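Proof: Rule (4) is a simple consequence of the definition of Laplace transform.

$$\begin{aligned} F(s) = \mathcal{L}(f) &= \int_{0^-}^{\infty} f(t)e^{-st}\,dt \\ \Rightarrow\; F'(s) &= \frac{d}{ds}\int_{0^-}^{\infty} f(t)e^{-st}\,dt \\ &= \int_{0^-}^{\infty} -t f(t)e^{-st}\,dt \\ &= -\mathcal{L}(t f(t)). \end{aligned}$$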

Rule (5) is just rule (4) applied n times.

Example 4. Use the s-derivative rule to find $\mathcal{L}(t)$.
Solution. Start with $f(t) = 1$, then $F(s) = 1/s$. The s-derivative rule now says $\mathcal{L}(t) = -F'(s) = 1/s^2$, which we know to be the answer.

Example 5. Use the s-derivative rule to find $\mathcal{L}(te^{at})$ and $\mathcal{L}(t^ne^{at})$.

Solution. Start with $f(t) = e^{at}$, then $F(s) = 1/(s - a)$. The s-derivative rule now says $\mathcal{L}(te^{at}) = -F'(s) = 1/(s - a)^2$.

Continuing:
$\mathcal{L}(t^2e^{at}) = F''(s) = 2/(s - a)^3$,
$\mathcal{L}(t^3e^{at}) = -F'''(s) = 3 \cdot 2/(s - a)^4$,
$\mathcal{L}(t^4e^{at}) = F^{(4)}(s) = 4 \cdot 3 \cdot 2/(s - a)^5$,
$\cdots$
$\mathcal{L}(t^ne^{at}) = (-1)^nF^{(n)}(s) = n!/(s - a)^{n+1}$.

With Laplace, there is often more than one way to compute. We know L(tn) = n!/sn+1. Therefore the s-shift rule also gives the above formula for L(tneat).


Recall the table entries for repeated quadratic factors

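$$\mathcal{L}\left(\frac{1}{2\omega^3}\left(\sin(\omega t) - \omega t\cos(\omega t)\right)\right) = \frac{1}{(s^2 + \omega^2)^2} \qquad (7)$$

$$\mathcal{L}\left(\frac{t}{2\omega}\sin(\omega t)\right) = \frac{s}{(s^2 + \omega^2)^2} \qquad (8)$$

$$\mathcal{L}\left(\frac{1}{2\omega}\left(\sin(\omega t) + \omega t\cos(\omega t)\right)\right) = \frac{s^2}{(s^2 + \omega^2)^2} \qquad (9)$$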

Previously we proved these formulas using partial fractions and factoring the denominators on the frequency side into complex linear factors. Let's prove them again using the s-derivative rule.



Proof of (8) using the s-derivative rule.

Let $f(t) = \sin(\omega t)$. We know $F(s) = \dfrac{\omega}{s^2 + \omega^2}$. The s-derivative rule implies

$$\mathcal{L}(t\sin\omega t) = -F'(s) = \frac{2\omega s}{(s^2 + \omega^2)^2}.$$

This formula is (8) with the factor of 2ω moved from one side to the other.

The other two formulas can be proved in a similar fashion. We won’t give the proofs here.
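For readers who want to see how the remaining two entries might be handled, here is one possible route, offered as a sketch rather than as the intended proof. It assumes the table entry $\mathcal{L}(\cos(\omega t)) = s/(s^2 + \omega^2)$ and linearity; the intermediate transform of $t\cos(\omega t)$ is derived here and is not stated elsewhere in the notes.

```latex
\begin{aligned}
\mathcal{L}(t\cos(\omega t)) &= -\frac{d}{ds}\,\frac{s}{s^2+\omega^2}
  = \frac{s^2-\omega^2}{(s^2+\omega^2)^2}, \\[4pt]
\mathcal{L}\!\Big(\tfrac{1}{2\omega^3}\big(\sin(\omega t) - \omega t\cos(\omega t)\big)\Big)
  &= \frac{1}{2\omega^2}\cdot\frac{(s^2+\omega^2) - (s^2-\omega^2)}{(s^2+\omega^2)^2}
  = \frac{1}{(s^2+\omega^2)^2}, \\[4pt]
\mathcal{L}\!\Big(\tfrac{1}{2\omega}\big(\sin(\omega t) + \omega t\cos(\omega t)\big)\Big)
  &= \frac{1}{2}\cdot\frac{(s^2+\omega^2) + (s^2-\omega^2)}{(s^2+\omega^2)^2}
  = \frac{s^2}{(s^2+\omega^2)^2}.
\end{aligned}
```

In the second and third lines the sine term has been rewritten over the common denominator $(s^2+\omega^2)^2$ before combining it with the $t\cos(\omega t)$ term.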


Precise Definition of Laplace Inverse

1. Domain of L−1(F).

We have been a bit vague on one key technical point which we aim to clear up in this note. We start with an example.

Example. Let $u(t)$ be the unit step function. Since $u(t) = 1$ for $t > 0$ we don't need the $0^-$ in the Laplace limits of integration.

$$\mathcal{L}(u) = \int_{0^-}^{\infty} u(t)e^{-st}\,dt = \int_{0}^{\infty} e^{-st}\,dt = 1/s.$$

This is exactly the same as L(1). So, when we look for f (t) = L−1(1/s) is it f (t) = 1 or f (t) = u(t)?

The answer is, it doesn’t matter. Since we are only concerned with the interval (0−, ∞), you can choose either one. To be precise, for a function F(s) we allow any function f (t) with the following properties to be called its Laplace inverse L−1(F):

1. f (t) is a (possibly) generalized function.

2. f (t) is defined on (0, ∞), except possibly at a discrete set of points where there are jump discontinuities or f (t) is singular, e.g. has delta functions.

3. f (t) may also have a singular part at t = 0.

4. L( f ) = F.

5. In particular, the Laplace inverse is not defined at $t = 0$ and has nothing to say about $f(0^-)$, $f'(0^-)$, $f''(0^-)$, .... Indeed, it has nothing to say about $f(t)$ for all $t < 0$. When finding the inverse Laplace transform these values are either irrelevant or must be determined by other means.

The functions in Figure 1 below all have Laplace transform 1/s. Notice, they are all different for t < 0 and the last two are not defined at 0. Since they agree on t > 0 (and have no delta function at t = 0) they have the same Laplace transform.

[Three graphs against t, each at height 1 for t > 0: f(t) = 1, u(t), and a third, unlabeled function.]

Figure 1. Three functions with the same Laplace transform.


2. Removable Discontinuities

We're still not done with our discussion of the possible differences between two functions with the same Laplace transform. Consider the two functions whose graphs are shown in Figure 2.

[Two graphs against t: f(t), with the point t = a marked, and f1(t) = 1.]

Figure 2. Two functions that differ at one point.

They differ only at the point t = a. It is easy to see that they have the same Laplace transform: L( f ) = L( f1) = 1/s. (This is because the integral is an area and the areas under two curves that differ like these are the same.)

According to our definition either f (t) or f1(t) can be chosen as L−1(1/s). But here the continuous function f1(t) is usually the better choice. For example, if we are finding a physical quantity that varies over time then the continuous function is usually the better model. The discontinuity in f (t) looks physically spurious.

Discontinuities like the one in f (t) are called removable discontinuities. That is, by changing the value of f (a) the function can be made continuous. (Technical definition below).

Course Convention. In this course we will follow the physically and mathematically reasonable convention that our signals do not have removable discontinuities. They can, however, have jump discontinuities and contain delta functions, which are idealizations of real physical signals.

Technical Definition of Removable Discontinuity. Suppose a function is discontinuous at $t = a$. If it can be made continuous by changing just the value of the function at $a$ then we call $t = a$ a removable discontinuity. Graphically: if the curve is a continuous curve with a gap where one point was moved then the point is a removable discontinuity. In symbols: If $f(a^-) = f(a^+) = b$ then we can make a new function, continuous at $a$, by redefining the value at $t = a$:

$$f_1(t) = \begin{cases} f(t) & \text{for } t \neq a \\ b & \text{for } t = a. \end{cases}$$

Again, we say the discontinuity at t = a is removable.


Laplace: Solving Initial Value Problems

1. Introduction

We now have everything we need to solve IVP’s using Laplace transform. We will show how to do this through a series of examples.

To be honest we should admit that some IVP’s are more easily solved by other techniques. However, we will also see some examples where the Laplace machinery we’ve developed is a big help.

2. Examples of Solving IVP’s

Example 1. Solve $\dot{x} + 3x = e^{-t}$ with rest initial conditions (rest IC).

Solution. Rest IC mean that $x(t) = 0$ for $t < 0$, so $x(0^-)$, $\dot{x}(0^-)$, ... are all 0. As usual, we let $X = \mathcal{L}(x)$.

Using the t-derivative rule we can take the Laplace transform of (both sides) of the DE.

(sX(s) − x(0−)) + 3X(s) = 1/(s + 1).

Next we substitute the known value x(0−) = 0 and solve for X(s)

$$(s + 3)X(s) = \frac{1}{s + 1} \;\Rightarrow\; X(s) = \frac{1}{(s + 1)(s + 3)}. \qquad (1)$$

Finally, we find x(t) = L−1(X) by using cover-up to do the partial fractions decomposition.

$$\frac{1}{(s + 1)(s + 3)} = \frac{1/2}{s + 1} - \frac{1/2}{s + 3} \;\Rightarrow\; x(t) = \frac{1}{2}e^{-t} - \frac{1}{2}e^{-3t} \quad \text{for } t > 0.$$

Notes:
1. The term $e^{-t}/2$ is what the exponential response formula would give us. The term $e^{-3t}/2$ is the homogeneous part of the solution, needed to match the IC.

2. This technique found $x(t)$ for $t > 0$. The rest IC tell us $x(t) = 0$ for $t < 0$.

3. $x(0^+) = 0$: since the input does not contain $\delta(t)$, there is no jump in $x(t)$ at 0.

4. The factor of (s + 3) in front of X(s) in (1) is none other than the characteristic polynomial of this system.
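The answer to Example 1 can be cross-checked against a direct numerical solution of the differential equation. The sketch below uses the classical fourth-order Runge-Kutta method, which is a swapped-in technique and not something the notes rely on; the function names, the step count, and the end time are assumptions chosen for illustration.

```python
import math

def rk4(f, t0, x0, t_end, steps):
    """Integrate x' = f(t, x) from t0 to t_end with the classical Runge-Kutta method."""
    h = (t_end - t0) / steps
    t, x = t0, x0
    for _ in range(steps):
        k1 = f(t, x)
        k2 = f(t + h / 2, x + h * k1 / 2)
        k3 = f(t + h / 2, x + h * k2 / 2)
        k4 = f(t + h, x + h * k3)
        x += h * (k1 + 2 * k2 + 2 * k3 + k4) / 6
        t += h
    return x

def rhs(t, x):
    # x' + 3x = e^{-t}, rewritten as x' = -3x + e^{-t}
    return -3 * x + math.exp(-t)

def x_laplace(t):
    # Solution found with the Laplace transform for rest initial conditions
    return 0.5 * math.exp(-t) - 0.5 * math.exp(-3 * t)

t_end = 1.5
print(rk4(rhs, 0.0, 0.0, t_end, 2000), x_laplace(t_end))
```

Starting the integration from x(0) = 0 is legitimate here because, as noted above, the input has no delta function and so the solution does not jump at t = 0.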


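Example 2. Solve $\dot{x} + 3x = e^{-t}$, $x(0^-) = 4$.

Solution. Laplace:

$$sX(s) - x(0^-) + 3X(s) = \frac{1}{s + 1} \;\Rightarrow\; (s + 3)X(s) = 4 + \frac{1}{s + 1}.$$

Solve for $X(s)$:

$$X(s) = \frac{4}{s + 3} + \frac{1}{(s + 1)(s + 3)} \qquad (2)$$

We can use the partial fractions work from example (1).

$$\begin{aligned} x(t) &= 4e^{-3t} + \frac{1}{2}e^{-t} - \frac{1}{2}e^{-3t} \quad \text{for } t > 0 \\ &= \frac{1}{2}e^{-t} + \frac{7}{2}e^{-3t} \quad \text{for } t > 0. \end{aligned}$$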

Notes: (Same remarks as in example 1.)

Example 3. Find the unit impulse response for the operator D + 3I. Give your answer in both u and cases format.

Solution. The unit impulse response is the solution to

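$$(D + 3I)w = \dot{w} + 3w = \delta(t), \quad \text{with rest IC}.$$

Taking the Laplace transform we get

$$(sW(s) - w(0^-)) + 3W(s) = 1 \;\Rightarrow\; (s + 3)W = 1 \;\Rightarrow\; W = \frac{1}{s + 3}.$$

Laplace inverse now implies $w(t) = e^{-3t}$ for $t > 0$. Thus,

$$w(t) = \begin{cases} 0 & \text{for } t < 0 \\ e^{-3t} & \text{for } t > 0 \end{cases} \;=\; u(t)e^{-3t}.$$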

Notes:
1. The post-initial condition is $w(0^+) = 1$. This came out of the calculation; we didn't have to think about the effect of the input $\delta(t)$ at $t = 0$.
2. The Laplace transform method did not help us find $w(t)$ for $t < 0$. For this we used the rest IC that are part of the definition of the unit impulse response.
3. Since $w(0^-) = 0$ the output jumps by 1 unit at $t = 0$.
4. Once again you saw the characteristic polynomial appearing.

Example 4. Find the unit impulse response for the system $p(D)x = f$, where $p(D) = D^2 + 2D + 2I$ and we consider $f$ to be the input. Give your answer in both u and cases format.

Solution. (We outline the solution.)
IVP: $\ddot{w} + 2\dot{w} + 2w = \delta(t)$, with rest IC.
Laplace: $s^2W + 2sW + 2W = 1 \Rightarrow W = 1/(s^2 + 2s + 2)$.
(Here we left out all the 'annoying terms' because they are all 0 due to the rest IC.)
Complete the square: $s^2 + 2s + 2 = (s + 1)^2 + 1$.
Inverse Laplace: (using the s-shift rule)

$$W = \frac{1}{(s + 1)^2 + 1} \;\Rightarrow\; w(t) = e^{-t}\mathcal{L}^{-1}\left(\frac{1}{s^2 + 1}\right) = e^{-t}\sin(t) \quad \text{for } t > 0.$$

Thus

$$w(t) = \begin{cases} 0 & \text{for } t < 0 \\ e^{-t}\sin(t) & \text{for } t > 0 \end{cases} \;=\; u(t)e^{-t}\sin(t).$$

Notes:
1. The post-initial conditions emerge naturally from the solution and are $w(0^+) = 0$, $\dot{w}(0^+) = 1$.
2. Since $\dot{w}(0^-) = 0$ the first derivative jumps by 1 unit at $t = 0$.
3. Once again you saw the characteristic polynomial appearing.

Example 5. Solve $\dot{x} + 2x = 4t$, with initial condition $x(0) = 1$.

Remark. Because the input contains no delta functions it is okay to specify the initial condition at $t = 0$ instead of $t = 0^-$. There will be no jump in the output, i.e., $x(0) = x(0^-) = x(0^+)$.

Solution. Laplace: $sX - x(0^-) + 2X = 4/s^2$.
Algebra and partial fractions:

$$X(s) = \frac{4}{s^2(s + 2)} + \frac{1}{s + 2} = \frac{A}{s} + \frac{B}{s^2} + \frac{C}{s + 2} + \frac{1}{s + 2}.$$

Cover-up gives B = 2, C = 1. Undetermined coefficients gives A = −1. Inverse Laplace: x(t) = −1 + 2t + 2e−2t , for t > 0.
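A quick way to gain confidence in an answer like this is to substitute it back into the problem. The check below is an added worked verification, not part of the original solution; it uses only the formula for $x(t)$ just obtained.

```latex
\begin{aligned}
x(t) &= -1 + 2t + 2e^{-2t}, \qquad \dot{x}(t) = 2 - 4e^{-2t}, \\
\dot{x} + 2x &= \left(2 - 4e^{-2t}\right) + \left(-2 + 4t + 4e^{-2t}\right) = 4t, \\
x(0) &= -1 + 0 + 2 = 1.
\end{aligned}
```

Both the differential equation and the initial condition are satisfied.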

Example 6. Solve $\ddot{x} + 4x = \cos(2t)$, with rest IC.

Solution. Laplace: $(s^2 + 4)X(s) = s/(s^2 + 4) \Rightarrow X(s) = s/(s^2 + 4)^2$. This is a repeated quadratic factor and it is in our table: $x(t) = t\sin(2t)/4$.

Notes: 1. This is a response of pure resonance. 2. We could have turned the logic around and used our previous knowledge of the solution to this equation to give yet another proof for the table entry L(t sin(ωt)/2ω) = s/(s2 + ω2)2.


IVP’s and t-translation

1. Introductory Example

Consider the system $\dot{x} + 3x = f(t)$. In the previous note we found its unit impulse response:

$$w(t) = \begin{cases} 0 & \text{for } t < 0 \\ e^{-3t} & \text{for } t > 0 \end{cases} \;=\; u(t)e^{-3t}.$$

This is the response from rest IC to the input $f(t) = \delta(t)$. What if we shifted the impulse to another time, say, $f(t) = \delta(t - 2)$? Linear time invariance tells us the response will also be shifted. That is, the solution to

$$\dot{x} + 3x = \delta(t - 2), \quad \text{with rest IC} \qquad (1)$$

is

$$x(t) = w(t - 2) = \begin{cases} 0 & \text{for } t < 2 \\ e^{-3(t-2)} & \text{for } t > 2 \end{cases} \;=\; u(t - 2)e^{-3(t-2)}.$$

In words, this is a system of exponential decay. The decay starts as soon as there is an input into the system. Graphs are shown in Figure 1 below.

[Two graphs against t: w(t), and x(t) with t = 2 marked on the axis.]

Figure 1. Graphs of w(t) and x(t) = w(t − 2).

We know that L(δ(t − a)) = e−as. So, we can find X = L(x) by taking the Laplace transform of (1).

$$(s + 3)X(s) = e^{-2s} \;\Rightarrow\; X(s) = \frac{e^{-2s}}{s + 3} = e^{-2s}W(s),$$

where $W = \mathcal{L}(w)$. This is an example of the t-translation rule.

2. t-translation Rule

We give the rule in two forms.

$$\mathcal{L}(u(t - a) f(t - a)) = e^{-as}F(s) \qquad (2)$$

$$\mathcal{L}(u(t - a) f(t)) = e^{-as}\mathcal{L}(f(t + a)). \qquad (3)$$

For completeness we include the formulas for

$$\mathcal{L}(u(t - a)) = e^{-as}/s \qquad (4)$$

$$\mathcal{L}(\delta(t - a)) = e^{-as}. \qquad (5)$$

Remarks:
1. Formula (3) is ungainly. The notation will become clearer in the examples below.
2. Formula (2) is most often used for computing the inverse Laplace transform, i.e., as

$$u(t - a) f(t - a) = \mathcal{L}^{-1}\left(e^{-as}F(s)\right).$$

3. These formulas parallel the s-shift rule. In that rule, multiplying by an exponential on the time (t) side led to a shift on the frequency (s) side. Here, a shift on the time side leads to multiplication by an exponential on the frequency side.

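Proof: The proof of (2) is a very simple change of variables on the Laplace integral.

$$\begin{aligned}
\mathcal{L}(u(t - a) f(t - a)) &= \int_0^\infty u(t - a) f(t - a) e^{-st}\, dt \\
&= \int_a^\infty f(t - a) e^{-st}\, dt && (u(t - a) = 0 \text{ for } t < a) \\
&= \int_0^\infty f(\tau) e^{-s(\tau + a)}\, d\tau && (\text{change of variables: } \tau = t - a) \\
&= e^{-as} \int_0^\infty f(\tau) e^{-s\tau}\, d\tau \\
&= e^{-as} F(s).
\end{aligned}$$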

Formula (3) follows easily from (2). The easiest way to proceed is by introducing a new function. Let g(t) = f (t + a), so

f (t) = g(t − a) and G(s) = L(g) = L( f (t + a)).

We get

L(u(t − a) f (t)) = L(u(t − a)g(t − a)) = e−asG(s) = e−asL( f (t + a)).

The second equality follows by applying (2) to g(t).
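As a rough numerical sanity check of rule (2), the sketch below approximates the Laplace integral of u(t − a) f(t − a) with a midpoint rule and compares it with e^{-as} F(s). The choice f(t) = sin(kt), the values of k, a, s, and the helper names `laplace_numeric` and `shifted` are illustrative assumptions, not part of the notes.

```python
import math

def laplace_numeric(func, s, upper=60.0, steps=200_000):
    """Approximate the Laplace integral of func over [0, upper] (midpoint rule).

    The upper limit is assumed large enough that the neglected tail is tiny.
    """
    h = upper / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * h
        total += func(t) * math.exp(-s * t)
    return total * h

def shifted(f, a):
    """Return the function t -> u(t - a) f(t - a)."""
    return lambda t: f(t - a) if t >= a else 0.0

# Illustrative values (assumptions): f(t) = sin(kt), shift a, real s > 0.
k, a, s = 3.0, 2.0, 1.5
f = lambda t: math.sin(k * t)
F = k / (s**2 + k**2)                      # table entry for L(sin kt)

lhs = laplace_numeric(shifted(f, a), s)    # L(u(t-a) f(t-a)), numerically
rhs = math.exp(-a * s) * F                 # e^{-as} F(s), from rule (2)
print(lhs, rhs)
```

The two printed numbers should agree to several decimal places; the remaining difference is discretization error from the midpoint rule.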

Example. Find $\mathcal{L}^{-1}\left(\dfrac{ke^{-as}}{s^2 + k^2}\right)$.

Solution. $f(t) = \mathcal{L}^{-1}\left(\dfrac{k}{s^2 + k^2}\right) = \sin(kt)$

$$\Rightarrow\; \mathcal{L}^{-1}\left(\frac{ke^{-as}}{s^2 + k^2}\right) = u(t - a) f(t - a) = u(t - a)\sin(k(t - a)).$$

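Example. $\mathcal{L}(u(t - 3)\,t) = e^{-3s}\mathcal{L}(t + 3) = e^{-3s}\left(\dfrac{1}{s^2} + \dfrac{3}{s}\right)$.

Example. $\mathcal{L}(u(t - 3)\cdot 1) = e^{-3s}\mathcal{L}(1) = e^{-3s}/s$.

Example. Find $\mathcal{L}(f)$ for

$$f(t) = \begin{cases} 0 & \text{for } t < 2 \\ t^2 & \text{for } t > 2. \end{cases}$$

Solution. $f(t) = u(t - 2)t^2 \;\Rightarrow\; F(s) = e^{-2s}\mathcal{L}((t + 2)^2) = e^{-2s}\left(\dfrac{2}{s^3} + \dfrac{4}{s^2} + \dfrac{4}{s}\right)$.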

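Example. Find $\mathcal{L}(f)$ for

$$f(t) = \begin{cases} \cos(t) & \text{for } 0 < t < 2\pi \\ 0 & \text{for } t > 2\pi. \end{cases}$$

Solution. $f(t) = \cos(t)(u(t) - u(t - 2\pi)) = u(t)\cos(t) - u(t - 2\pi)\cos(t)$

$$\Rightarrow\; F(s) = \frac{s}{s^2 + 1} - e^{-2\pi s}\mathcal{L}(\cos(t + 2\pi)) = (1 - e^{-2\pi s})\frac{s}{s^2 + 1}.$$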

We will look at more involved examples in the next note.


IVP’s: Longer Examples

The fish population in a lake is not reproducing fast enough and the population is decaying exponentially with decay rate k. A program is started to stock the lake with fish. Three different scenarios are discussed below.

Example 1. A program is started to stock the lake with fish at a constant rate of r units of fish/year. Unfortunately, after 1/2 year the funding is cut and the program ends. Model this situation and solve the resulting DE for the fish population as a function of time.

Solution. Let x(t) be the fish population and let A = x(0−) be the initial population. Exponential decay means the population is modeled by

$$\dot{x} + kx = f(t), \quad x(0^-) = A \qquad (1)$$

where $f(t)$ is the rate at which fish are being added to the lake. In this case

$$f(t) = \begin{cases} r & \text{for } 0 < t < 1/2 \\ 0 & \text{for } 1/2 < t. \end{cases}$$

First, write f in ’u-format’: f (t) = r(1 − u(t − 1/2)). Next, take the Laplace transform and solve for X(s).

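$$F(s) = \mathcal{L}(f)(s) = \frac{r}{s} - \frac{r}{s}e^{-s/2}.$$

$$sX - x(0^-) + kX = F(s) \;\Rightarrow\; (s + k)X - A = \frac{r}{s}(1 - e^{-s/2}) \;\Rightarrow\; X(s) = \frac{A}{s + k} + \frac{r}{s(s + k)}(1 - e^{-s/2}).$$

To find $x(t)$ we temporarily ignore the factor of $e^{-s/2}$ and take the Laplace inverse of what’s left (using partial fractions).

$$\mathcal{L}^{-1}\left(\frac{A}{s + k}\right) = Ae^{-kt}, \qquad \mathcal{L}^{-1}\left(\frac{r}{s(s + k)}\right) = \frac{r}{k}(1 - e^{-kt}).$$

The t-translation formula says

$$\mathcal{L}^{-1}\left(\frac{re^{-s/2}}{s(s + k)}\right) = u(t - 1/2)\,\frac{r}{k}\left(1 - e^{-k(t - 1/2)}\right).$$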


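Putting it all together we get (in u-format and cases-format)

$$x(t) = Ae^{-kt} + \frac{r}{k}(1 - e^{-kt}) - u(t - 1/2)\,\frac{r}{k}\left(1 - e^{-k(t - 1/2)}\right)$$

$$= \begin{cases} Ae^{-kt} + \dfrac{r}{k}(1 - e^{-kt}) & \text{for } 0 < t < 1/2 \\[4pt] Ae^{-kt} - \dfrac{r}{k}\left(e^{-kt} - e^{-k(t - 1/2)}\right) & \text{for } 1/2 < t. \end{cases}$$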
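One way to gain confidence in the u-format solution of Example 1 is to compare it with a direct numerical integration of the IVP. The sketch below does this with forward Euler; the parameter values A, k, r and the function names `x_closed` and `x_euler` are illustrative assumptions, not values from the notes.

```python
import math

def x_closed(t, A, k, r):
    """u-format solution of x' + kx = f(t), x(0) = A, stocking at rate r until t = 1/2."""
    x = A * math.exp(-k * t) + (r / k) * (1 - math.exp(-k * t))
    if t > 0.5:  # the u(t - 1/2) term switches on after the funding is cut
        x -= (r / k) * (1 - math.exp(-k * (t - 0.5)))
    return x

def x_euler(t_end, A, k, r, steps=200_000):
    """Forward-Euler integration of the same IVP, used only as a cross-check."""
    h = t_end / steps
    x = A
    for i in range(steps):
        t = i * h
        f = r if t < 0.5 else 0.0   # input: r while stocking, 0 afterwards
        x += h * (f - k * x)
    return x

A, k, r = 10.0, 0.8, 3.0            # illustrative values (assumptions)
print(x_closed(2.0, A, k, r), x_euler(2.0, A, k, r))
```

The two values should agree to about three decimal places; the closed form is also continuous at t = 1/2, as expected for a bounded input.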

Example 2. (Periodic on/off) The program is refunded and they have enough money to stock at a constant rate of r for the first half of each year. Find x(t) in this case.

Solution. All that’s changed from example 1 is the input function f(t). We write it in cases-format and translate that to u-format so we can take the Laplace transform.

$$f(t) = \begin{cases} r & \text{for } 0 < t < 1/2 \\ 0 & \text{for } 1/2 < t < 1 \\ r & \text{for } 1 < t < 3/2 \\ 0 & \text{for } 3/2 < t < 2 \\ \cdots \end{cases} \;=\; r\left(1 - u(t - \tfrac{1}{2}) + u(t - 1) - u(t - \tfrac{3}{2}) + \ldots\right)$$

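The computations from here are essentially the same as in the previous example.

$$\mathcal{L}(f) = \frac{r}{s}\left(1 - e^{-s/2} + e^{-s} - e^{-3s/2} + \ldots\right)$$

$$\Rightarrow\; X = \frac{A}{s + k} + \frac{r}{s(s + k)}\left(1 - e^{-s/2} + e^{-s} - \ldots\right)$$

$$\Rightarrow\; x(t) = Ae^{-kt} + \frac{r}{k}(1 - e^{-kt}) - u(t - 1/2)\,\frac{r}{k}\left(1 - e^{-k(t - 1/2)}\right) + \ldots$$

$$\Rightarrow\; x(t) = \begin{cases}
Ae^{-kt} + \dfrac{r}{k} - \dfrac{r}{k}e^{-kt} & \text{for } 0 < t < \tfrac{1}{2} \\[4pt]
Ae^{-kt} - \dfrac{r}{k}\left(e^{-kt} - e^{-k(t - 1/2)}\right) & \text{for } \tfrac{1}{2} < t < 1 \\[4pt]
\cdots \\
Ae^{-kt} + \dfrac{r}{k} - \dfrac{r}{k}\left(e^{-kt} - e^{-k(t - 1/2)} + \ldots + e^{-k(t - n)}\right) & \text{for } n < t < n + \tfrac{1}{2} \\[4pt]
Ae^{-kt} - \dfrac{r}{k}\left(e^{-kt} - e^{-k(t - 1/2)} + \ldots - e^{-k(t - n - 1/2)}\right) & \text{for } n + \tfrac{1}{2} < t < n + 1 \\[4pt]
\cdots
\end{cases}$$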


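Factoring out $e^{-kt}$ gives:

$$x(t) = \begin{cases}
Ae^{-kt} + \dfrac{r}{k} - \dfrac{r}{k}e^{-kt}\left(1 - e^{k/2} + e^{k} - e^{3k/2} + \ldots + e^{nk}\right) & \text{for } n < t < n + 1/2 \\[4pt]
Ae^{-kt} - \dfrac{r}{k}e^{-kt}\left(1 - e^{k/2} + e^{k} - \ldots - e^{k(n + 1/2)}\right) & \text{for } n + 1/2 < t < n + 1.
\end{cases}$$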

Note that the constant term r/k is only present during periods of stocking.

Example 3. (Impulse train) The answer to the previous example is a little hard to read. We know from experience that impulsive input usually leads to simpler output. In this scenario suppose that once a year r/2 units of fish are dumped all at once into the lake. Find x(t) in this case.

Solution. Once again, all that’s changed from example 1 is the input function f (t). The IVP is still given by equation (1).

$$f(t) = \frac{r}{2}\left(\delta(t) + \delta(t - 1) + \delta(t - 2) + \delta(t - 3) + \ldots\right).$$

This is called an impulse train. Its Laplace transform is easy to find.

$$F(s) = \frac{r}{2}\left(1 + e^{-s} + e^{-2s} + e^{-3s} + \ldots\right).$$

One nice thing about delta functions is that they don’t introduce any new terms into the partial fractions part of the problem.

$$sX(s) - x(0^-) + kX(s) = \frac{r}{2}\left(1 + e^{-s} + e^{-2s} + e^{-3s} + \ldots\right)$$

$$\Rightarrow\; X(s) = \frac{A}{s + k} + \frac{r}{2(s + k)}\left(1 + e^{-s} + e^{-2s} + e^{-3s} + \ldots\right).$$

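Laplace inverse is easy:

$$\mathcal{L}^{-1}\left(\frac{1}{s + k}\right) = e^{-kt} \;\Rightarrow\; \mathcal{L}^{-1}\left(\frac{e^{-ns}}{s + k}\right) = u(t - n)e^{-k(t - n)}.$$

Thus,

$$x(t) = Ae^{-kt} + \frac{r}{2}e^{-kt} + \frac{r}{2}u(t - 1)e^{-k(t - 1)} + \frac{r}{2}u(t - 2)e^{-k(t - 2)} + \frac{r}{2}u(t - 3)e^{-k(t - 3)} + \ldots$$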
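The impulse-train solution is a finite sum at any given time, so it is easy to evaluate directly. The sketch below is one way to do so, using the graph parameters A = 0, k = 1, r = 2 quoted in the notes; the function name `x_impulse_train` and the convention that u(t − n) equals 1 at t = n (the value just after the impulse) are assumptions made for illustration.

```python
import math

def x_impulse_train(t, A, k, r):
    """Evaluate x(t) = A e^{-kt} + (r/2) * sum over n <= t of e^{-k(t-n)} for t >= 0."""
    total = A * math.exp(-k * t)
    n = 0
    while n <= t:                       # u(t - n) is nonzero only for n <= t
        total += (r / 2) * math.exp(-k * (t - n))
        n += 1
    return total

A, k, r = 0.0, 1.0, 2.0                 # parameter values used for the graphs in the notes

# Each impulse makes the population jump by r/2.
jump = x_impulse_train(1.0, A, k, r) - x_impulse_train(1.0 - 1e-9, A, k, r)

# Summing the geometric series, the value just before a late impulse
# approaches (r/2) / (e^k - 1).
before_late_impulse = x_impulse_train(30.0 - 1e-9, A, k, r)
print(jump, before_late_impulse, (r / 2) / (math.exp(k) - 1))
```

Evaluating the function one year apart at late times (say t = 20.5 and t = 21.5) gives nearly identical values, which is the settling to periodic behavior mentioned in the text.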

Here are graphs of the solutions to examples 2 and 3 (with A = 0, k = 1, r = 2). Notice how they settle down to periodic behavior.


Fig. 1. Graphs from example 2 (left) and example 3 (right).


Transfer and Weight Functions, Green’s Formula

In this session we will introduce the transfer function (often called the system function). For a system p(D)x = f (t), the transfer function is simply 1/p(s). It is also the Laplace transform of the unit impulse response.

The operator p(D) contains complete information about the system, but it can be hard to see at a glance. Things like stability, oscillatory behavior and resonant frequencies are not always obvious just by looking at the polynomial operator.

The transfer function W(s) = 1/p(s) also contains complete information. After all, you can easily find p(D) if you know W(s). The transfer function has several big advantages.

1. Not all linear time invariant systems come from differential equations. So they don’t necessarily have a p(D). However, they all have transfer functions which are used in exactly the same way as we will learn for our systems.

2. When we studied convolution we learned Green’s formula. This allows us to solve an ODE with rest initial conditions and any input once we know the unit impulse response. The Laplace transform will make Green’s formula very simple to understand and use. Since the transfer function is the Laplace transform of the unit impulse response it will play a prominent role in this.

3. We can understand a combination of systems if we know the system functions of its constituent parts.

4. In the next session we will use the transfer function’s pole diagram to make many of the system’s properties obvious at a glance.

In this session we will start by defining the transfer function and use it to solve IVP’s with rest initial conditions. Then we will see that Laplace transform changes convolution of functions in time into multiplication of functions in frequency. This will make Green’s formula particularly simple to state. Finally, we will look at block diagrams and use them to combine systems in much the same way that simple circuits can be combined to make complicated ones.

The Transfer Function

1. Definition

We start with the definition (see equation (1)). In subsequent sections of this note we will learn other ways of describing the transfer function. (See equations (2) and (3).)

For any linear time invariant system the transfer function is

W(s) = L(w(t)), where w(t) is the unit impulse response. (1)

Example 1. Find the transfer function for the system $\dot{x} + 3x = f(t)$.

Solution. The unit impulse response is the solution to $\dot{w} + 3w = \delta(t)$, with rest IC.

The Laplace transform method finds W(s) on the way to finding w(t). Since we only want W(s) we can stop when we get there. Taking the Laplace transform of the DE we get

$$sW(s) - w(0^-) + 3W = 1 \;\Rightarrow\; W = \frac{1}{s + 3}.$$

The ’annoying’ term w(0−) = 0 because we have rest initial conditions. (Subsequent to this we will not bother writing the annoying terms when we have rest IC.)

2. Other Standard Terminology

The unit impulse response is also called the weight function and the transfer function is also called the system function. All of these terms are widely used and we will use them all to help you become familiar with them.

3. Formula for W(s)

Example 2. Find the transfer function for $m\ddot{x} + b\dot{x} + kx = f(t)$.

Solution. The unit impulse response is the solution to

$$m\ddot{w} + b\dot{w} + kw = \delta(t), \quad \text{with rest IC.}$$

By definition, the transfer function $W(s) = \mathcal{L}(w)$. So, we take the Laplace transform of the DE. There are no ’annoying terms’ because with rest initial conditions $\mathcal{L}(\ddot{w}) = s^2W(s)$ and $\mathcal{L}(\dot{w}) = sW(s)$. We get

$$(ms^2 + bs + k)W(s) = 1 \;\Rightarrow\; W(s) = \frac{1}{ms^2 + bs + k}.$$

In example 2, the differential operator is $p(D) = mD^2 + bD + kI$. That is, the characteristic polynomial is $p(s) = ms^2 + bs + k$ and the transfer function is $W(s) = 1/p(s)$.

Exactly the same reasoning holds for operators of higher order.

Formula: For any polynomial operator p(D) the transfer function for the system

$$p(D)x = f(t)$$

is given by

$$W(s) = \frac{1}{p(s)}. \qquad (2)$$

Example 3. Suppose $W(s) = 1/(s^2 + 4)$ is the transfer function for a system p(D)x = f(t). What is p(D)?

Solution. Since $W(s) = 1/p(s)$ we have $p(s) = s^2 + 4$, which implies $p(D) = D^2 + 4I$.

4. Another Characterization of the Transfer Function

The best way to think of the transfer function is as a ratio of output to input. By this we mean the following.

Suppose we have an equation

p(D)x = f (t), with rest IC.

Taking Laplace transform of both sides gives

$$p(s)X(s) = F(s) \;\Rightarrow\; X(s) = \frac{1}{p(s)}F(s) = W(s)F(s).$$

Solving for W(s) shows

$$W(s) = \frac{X(s)}{F(s)} = \frac{\text{output}}{\text{input}}. \qquad (3)$$

5. Conclusion

We have characterized the transfer functions in three different ways. Equations (1) and (3) are perfectly general and apply to any LTI system. Equation (2) is specific to constant coefficient linear differential equations.


I. Finding p(D)

Quiz: Suppose $W(s) = \dfrac{1}{(s + 2)(s + 3)}$ is the transfer function for the system p(D)x = f(t). What is p(D)?

Choices:

a) $e^{-2t} - e^{-3t}$

b) $D^2 + 5D + 6I$

c) 1/(D + 3)(D + 2)

d) It doesn’t exist

e) Can’t be found with the data given

Answer: (b) $W(s) = 1/p(s) = 1/(s^2 + 5s + 6) \;\Rightarrow\; p(D) = D^2 + 5D + 6I$.








II. Finding p(D)

Quiz: Suppose $W(s) = \dfrac{s}{s^2 + 1}$. Find p(D) so that W(s) is the transfer function for the system p(D)x = f(t).

Choices:

a) cos(t)

b) $D^2 + I$

c) D + 1/D



Answer: (d) The system p(D)x = f(t) has transfer function 1/p(s). Since W(s) is not one over a polynomial there is no such polynomial.

Note that W(s) is the transfer function for the system $\ddot{x} + x = \dot{y}$, where we consider y to be the input.





Modified Input

Way back when we introduced the language of system, input and response we decided that the right hand side of our equations wasn’t always the input. Sometimes it was a modified version of the input.

Example 1. Recall the heat diffusion equation

$$\dot{x} + kx = kT_e(t),$$

where $T_e(t)$ is the temperature of the environment. Consider $T_e(t)$ to be the input and find the system function.

Solution. Look at the equation for the unit impulse response

$$\dot{w} + kw = k\delta(t), \quad \text{rest IC.}$$

Notice that since $T_e(t)$ is the input, the unit impulse response comes by letting $T_e(t) = \delta(t)$. The Laplace transform now gives

$$(s + k)W(s) = k \;\Rightarrow\; W(s) = \frac{k}{s + k}.$$

Note well that, with modified input on the right hand side of the DE, the system function does not automatically have a 1 in the numerator.

You might have noticed that in the previous example we could have written $W(s) = \dfrac{1}{s/k + 1}$, which has our usual form. The next example shows that this is not always the case.

Example 2. Consider an LC circuit with input voltage v(t). We’ll assume L and C are set so the differential equation for the current i(t) is

$$i''(t) + 4i = v'(t).$$

We consider the input to be v(t) and the output to be i(t). (We use primes instead of dots for the derivative because the i already has a dot.)

Finding the unit impulse response is tricky, because if we set $v(t) = \delta(t)$ then we will have $\delta'(t)$ on the right hand side of the DE. Let’s avoid this by using the characterization of the transfer function as the ratio output/input. In this case, we’ll have $W(s) = I(s)/V(s)$.

Assuming rest IC, we have $\mathcal{L}(v') = sV(s)$, where, as usual, we have let the uppercase letter be the Laplace transform of the lowercase one. Applying the Laplace transform gives

$$(s^2 + 4)I(s) = sV(s) \;\Rightarrow\; W(s) = \frac{I(s)}{V(s)} = \frac{s}{s^2 + 4}.$$

The s in the numerator guarantees this cannot be written in the form 1/p(s) for any polynomial p(s).

As a concluding note, we’ll say that we were too pessimistic about our ability to handle $\delta'(t)$. We might not know what it is, but we do know how to find its Laplace transform.

$$\mathcal{L}(\delta'(t)) = s\mathcal{L}(\delta(t)) - \delta(0^-) = s.$$


Green’s Formula, Laplace Transform of Convolution

1. Green’s Formula in Time and Frequency

When we studied convolution we learned Green’s formula. This says, the IVP

p(D)x = f (t), with rest IC (1)

has solution

x(t) = (w ∗ f )(t), where w(t) is the weight function. (2)

(Remember, the weight function is the same as the unit impulse response.)

The Laplace transform changes these equations to ones in the frequency variable s.

$$p(s)X(s) = F(s) \qquad (3)$$

$$X(s) = \frac{1}{p(s)}F(s) = W(s)F(s), \qquad (4)$$

where W(s) is the transfer function.

Equation (2) is Green’s formula in time and (4) is Green’s formula in frequency. In words, viewed from the t side, the solution to (1) is the convolution of the weight function and the input. Viewed from the s side, the solution is the product of the transfer function and the input.

2. Convolution

Comparing equations (2) and (4) we see that

$$\mathcal{L}(w * f) = W(s)\cdot F(s). \qquad (5)$$

It appears that Laplace transforms convolution into multiplication. Technically, equation (5) only applies when one of the functions is the weight function, but the formula holds in general.

Theorem: For any two functions f (t) and g(t) with Laplace transforms F(s) and G(s) we have

$$\mathcal{L}(f * g) = F(s)\cdot G(s). \qquad (6)$$

Remarks: 1. This theorem gives us another way to prove convolution is commutative. It is just the commutativity of regular multiplication on the s-side.

$$\mathcal{L}(f * g) = F\cdot G = G\cdot F = \mathcal{L}(g * f).$$

2. In fact, the theorem helps solidify our claim that convolution is a type of multiplication, because viewed from the frequency side it is multiplication.

Proof: The proof is a nice exercise in switching the order of integration. We won’t use 0− and t+ in the integrals, since they would just clutter the exposition. It is an amusing exercise to put them in and see that they transform correctly as we manipulate the integrals.

We start by writing $\mathcal{L}(f * g)$ as the convolution integral followed by the Laplace integral.

$$\mathcal{L}(f * g) = \int_0^\infty (f * g)(t)e^{-st}\, dt = \int_0^\infty \int_0^t f(t - u)g(u)e^{-st}\, du\, dt.$$

Next, we change the order of integration (see the figure below).

$$= \int_0^\infty \int_u^\infty f(t - u)g(u)e^{-st}\, dt\, du.$$

Finally, change variables in the inner integral: substitute v = t − u, dv = dt (u a constant).

$$= \int_0^\infty \int_0^\infty f(v)g(u)e^{-s(v + u)}\, dv\, du = \int_0^\infty f(v)e^{-sv}\, dv \int_0^\infty g(u)e^{-su}\, du = F(s)G(s).$$


Fig. 1. Changing the order from du dt to dt du.
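As a concrete check of the convolution theorem, here is a short worked example. The particular functions $f(t) = e^{-t}$ and $g(t) = e^{-2t}$ are chosen only for illustration and do not come from the notes.

```latex
% Illustrative choice (an assumption): f(t) = e^{-t}, g(t) = e^{-2t},
% so F(s) = 1/(s+1) and G(s) = 1/(s+2).
\begin{aligned}
(f * g)(t) &= \int_0^t e^{-(t-u)} e^{-2u}\, du
            = e^{-t} \int_0^t e^{-u}\, du
            = e^{-t} - e^{-2t}, \\
\mathcal{L}(f * g) &= \frac{1}{s+1} - \frac{1}{s+2}
            = \frac{1}{(s+1)(s+2)}
            = F(s)\cdot G(s).
\end{aligned}
```

The time-side computation needs an integral, while the frequency-side computation is a single multiplication.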

3. Integration Rule

If differentiation on the time side leads to multiplication by s on the frequency side then we should expect integration in time to lead to division by s. If f(t) is a function with Laplace transform F(s) then the integration rule states:

$$\mathcal{L}\left(\int_{0^-}^{t^+} f(\tau)\, d\tau\right) = \frac{F(s)}{s}.$$

Proof: One way to prove this is using the t-derivative rule. Let’s be clever and use convolution instead. The integral is exactly $f(t) * 1$. Thus,

$$\mathcal{L}\left(\int_{0^-}^{t^+} f(\tau)\, d\tau\right) = \mathcal{L}(f * 1) = F(s)\cdot\mathcal{L}(1) = \frac{F(s)}{s}.$$

This is what we needed to show.
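A quick worked instance of the integration rule, with $f(t) = e^{-2t}$ picked purely as an illustration (it is not an example from the notes):

```latex
% Illustrative choice (an assumption): f(t) = e^{-2t}, so F(s) = 1/(s+2).
\begin{aligned}
\int_0^t e^{-2\tau}\, d\tau &= \tfrac{1}{2}\left(1 - e^{-2t}\right), \\
\mathcal{L}\left(\tfrac{1}{2}\left(1 - e^{-2t}\right)\right)
  &= \frac{1}{2}\left(\frac{1}{s} - \frac{1}{s+2}\right)
   = \frac{1}{s(s+2)}
   = \frac{F(s)}{s}.
\end{aligned}
```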


Block Diagrams

1. Introduction

We discussed some simple block diagrams when we introduced the notions of system, input, and output back in unit 1. Here, we will include the transfer function in the diagram and show how to use them to compute the transfer function of more complicated systems.

As we do this, it will be useful to keep in mind the description of the transfer function as output/input.

2. Simple Examples

Example 1. Suppose we have the system $m\ddot{x} + b\dot{x} + kx = f(t)$, with input f(t) and output x(t). The Laplace transform converts this all to functions and equations in the frequency variable s. The transfer function for this system is $W(s) = 1/(ms^2 + bs + k)$. We can write the relation between input and output as

input F(s) → output X(s) = W(s)F(s)

As a block diagram we can represent the system by

F(s) → [ W(s) ] → X(s)

Fig. 1. Block diagram for a system with transfer function W(s).

Sometimes we write the formula for the transfer function in the box representing the system. For the above example this would look like

F(s) → [ 1/(ms^2 + bs + k) ] → X(s)

Fig. 2. Block diagram giving the formula for the transfer function.

Example 2. (Cascading systems) Consider the cascaded system

p1(D)x = f , p2(D)y = x, rest IC.

The input to the cascade is f and the output is y. The first equation takes the input f and outputs x. This is the input to the second equation, which outputs y.

This is easy to solve on the frequency side. Let $W_1(s) = 1/p_1(s)$ and $W_2(s) = 1/p_2(s)$ be the transfer functions for the two differential equations. Considering the two equations separately we have

$$X(s) = W_1(s)\cdot F(s) \quad \text{and} \quad Y(s) = W_2(s)\cdot X(s).$$

It follows immediately that $Y(s) = W_2(s)\cdot W_1(s)\cdot F(s)$. Therefore the transfer function for the cascade is

$$\text{output}/\text{input} = Y(s)/F(s) = W_2(s)\cdot W_1(s).$$

In other words, for cascaded systems the transfer functions multiply.

Representing this as block diagrams we have two equivalent diagrams

F(s) → [ W1(s) ] → X(s) → [ W2(s) ] → Y(s)

F(s) → [ W1(s)W2(s) ] → Y(s)

Fig. 3. Equivalent block diagrams for a cascaded system.

Example 3. (Parallel systems) Suppose that we have a system consisting of two systems in parallel as shown in the block diagram.

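[Block diagram: the input F(s) splits and feeds both W1(s) and W2(s); the two outputs are added at a + junction to give Y(s).]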

Fig. 4. Systems in parallel. Find the transfer function for the entire system.

Solution. The plus sign in the circle indicates the two signals coming into the junction should be added. The split near the start indicates the input F(s) goes into each system.

The way to figure out the transfer function is to name the outputs of each individual system.

[Block diagram: F(s) feeds both W1(s) and W2(s), with outputs labeled X1 and X2; these are added at the + junction to give Y(s).]

Fig. 5. System with intermediate outputs labeled.



For each system we know output = transfer function × input. Thus, $X_1 = W_1\cdot F$, $X_2 = W_2\cdot F$, $Y = X_1 + X_2$. So, we easily compute

$$Y = X_1 + X_2 = W_1\cdot F + W_2\cdot F = (W_1 + W_2)\cdot F.$$

Therefore the transfer function is W1 + W2.

3. Feedback Loops

This is a bonus section. You will not see it in the problems or tests.

Many systems use feedback loops. That is, the output of the system is monitored and used to modify the input. It is very hard to control a system without a feedback loop.

Suppose we start with a system with transfer function W(s).

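F(s) → [ W(s) ] → X(s)

and modify it to have the feedback loop shown

[Block diagram: the input F enters a summing junction ∑ with a + sign; the junction output V feeds W, whose output is Y; Y is also fed back through a gain block ×g, and the result gY enters the junction with a − sign.]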

The original system is known as the open loop system and the corresponding system with feedback is known as the closed loop system.

We’ve labeled the outputs from each system element. The symbol ×g means the input is scaled by g, that is apply a gain of g to the input. The symbol ∑ means the two inputs are combined; the plus and minus signs indicate to add or subtract the corresponding input.

The method of finding the transfer function is the same as in the previous examples. A bit of algebra gives

$$V = F - gY, \quad Y = W\cdot V \;\Rightarrow\; Y = W(F - gY) \;\Rightarrow\; Y = \frac{W}{1 + gW}\cdot F.$$

As usual, the transfer function is output/input = Y/F = W/(1 + gW). This formula is one case of what is often called Black’s formula.

Example 4. Suppose we have an open loop system, say a circuit, with transfer function $W(s) = s/(as^2 + bs + c)$. If we add a feedback loop with gain g then using Black’s formula the closed loop transfer function is

$$\frac{s/(as^2 + bs + c)}{1 + gs/(as^2 + bs + c)} = \frac{s}{as^2 + (b + g)s + c}.$$
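The three combination rules (cascade, parallel, feedback) can be sketched in a few lines by treating a transfer function as an ordinary function of a complex number s. The helper names `cascade`, `parallel` and `feedback`, the coefficient values, and the test point `s0` below are all illustrative assumptions; the check at the end evaluates both sides of Example 4 at one point rather than proving the identity.

```python
def cascade(W1, W2):
    """Cascaded systems: transfer functions multiply."""
    return lambda s: W1(s) * W2(s)

def parallel(W1, W2):
    """Parallel systems: transfer functions add."""
    return lambda s: W1(s) + W2(s)

def feedback(W, g):
    """Closed loop with feedback gain g: W / (1 + gW) (Black's formula)."""
    return lambda s: W(s) / (1 + g * W(s))

# Illustrative coefficients (assumptions) for W(s) = s / (a s^2 + b s + c).
a, b, c, g = 2.0, 3.0, 5.0, 4.0
W_open = lambda s: s / (a * s**2 + b * s + c)
W_closed = feedback(W_open, g)

s0 = 1 + 2j                                   # arbitrary test point, not a pole
expected = s0 / (a * s0**2 + (b + g) * s0 + c)
print(W_closed(s0), expected)
```

Representing each block as a callable keeps the code close to the block diagrams: composing blocks is just composing functions of s.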


Poles, Amplitude Response, Connection to ERF

For our standard LTI system p(D)x = f , the transfer function is W(s) = 1/p(s). In this case the poles of W(s) are simply the zeros of the characteristic polynomial p(s) (also known as the characteristic roots). We have had lots of experience using these roots in this course and know they give important information about the system. The reason we talk about the poles of the transfer function instead of just sticking with the characteristic roots is that all LTI systems have a transfer function, while the characteristic polynomial is defined only for systems described by constant coefficient linear ODE’s.

Much of units 1,2 and the current unit have been devoted to the study of DE’s for LTI systems:

p(D)x = f (t). (1)

In units 1 and 2 we saw that the stability of the system is determined by the roots of the characteristic polynomial. We saw as well that the amplitude response of the system to a sinusoidal input of frequency ω is also determined by the characteristic polynomial, namely as the gain

$$g(\omega) = \frac{1}{|p(i\omega)|}.$$
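The gain formula is simple to evaluate numerically. The sketch below assumes the characteristic polynomial is given as a list of coefficients, highest degree first; the polynomial $p(s) = s^2 + 0.5s + 4$ and the names `poly_eval` and `gain` are illustrative assumptions.

```python
def poly_eval(coeffs, s):
    """Evaluate a polynomial at s by Horner's rule (coefficients highest degree first)."""
    result = 0
    for coeff in coeffs:
        result = result * s + coeff
    return result

def gain(coeffs, omega):
    """Amplitude response 1/|p(i*omega)|; undefined where p(i*omega) = 0."""
    return 1 / abs(poly_eval(coeffs, 1j * omega))

p = [1.0, 0.5, 4.0]                 # illustrative p(s) = s^2 + 0.5 s + 4 (an assumption)
print(gain(p, 0.0), gain(p, 2.0))   # low-frequency gain vs. gain at omega = 2
```

For this lightly damped example the gain at ω = 2 is four times the gain at ω = 0, which is the kind of practical resonance the amplitude response is meant to reveal.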

In this unit we’ve learned about the Laplace transform, which gives us another view of a signal by transforming it from a function of t, say f (t), to a function F(s) of the complex frequency s.

A key object from this point of view is the transfer function. For the system (1), if we consider f (t) to be the input and x(t) to be the output, then the transfer function is W(s) = 1/p(s), which is again determined by the characteristic polynomial.

In this session we will learn about poles and the pole diagram of an LTI system. This ties together the notions of stability, amplitude response and transfer function, all in one diagram in the complex s-plane. The pole diagram gives us a way to visualize systems which makes many of their important properties clear at a glance; in particular, and remarkably, the pole diagram

1. shows whether the system is stable;

2. shows whether the unforced system is oscillatory;

3. shows the exponential rate at which the unforced system returns to equilibrium (for stable systems); and

4. gives a rough picture of the amplitude response and practical resonances of the system.

For these reasons the pole diagram is a standard tool used by engineers in understanding and designing systems.

We conclude by reminding you that every LTI system has a transfer function. Everything we learn in this session will apply to such systems, including those not modeled by DE’s of the form (1).


Definition of Poles

1. Rational Functions

A rational function is a ratio of polynomials q(s)/p(s).

Examples. The following are all rational functions. $(s^2 + 1)/(s^3 + 3s + 1)$, $1/(ms^2 + bs + k)$, $s^2 + 1 = (s^2 + 1)/1$.

If the numerator q(s) and the denominator p(s) have no roots in common, then the rational function q(s)/p(s) is in reduced form.

Example. The three functions in the example above are all in reduced form.

Example. $(s - 2)/(s^2 - 4)$ is not in reduced form, because s = 2 is a root of both numerator and denominator. We can rewrite this in reduced form as

$$\frac{s - 2}{s^2 - 4} = \frac{s - 2}{(s - 2)(s + 2)} = \frac{1}{s + 2}.$$

2. Poles

For a rational function in reduced form the poles are the values of s where the denominator is equal to zero; or, in other words, the points where the rational function is not defined. We allow the poles to be complex numbers here.

Examples.

a) The function $1/(s^2 + 8s + 7)$ has poles at s = −1 and s = −7.

b) The function $(s - 2)/(s^2 - 4) = 1/(s + 2)$ has only one pole, s = −2.

c) The function $1/(s^2 + 4)$ has poles at s = ±2i.

d) The function $s^2 + 1$ has no poles.

e) The function $1/((s^2 + 8s + 7)(s^2 + 4))$ has poles at −1, −7, ±2i. (Notice that this function is the product of the functions in (a) and (c) and that its poles are the union of poles from (a) and (c).)

Remark. For ODE’s with system function of the form 1/p(s), the poles are just the roots of p(s). These are the familiar characteristic roots, which are important as we have seen.

3. Graphs Near Poles

We start by considering the function $F_1(s) = 1/s$. This is well defined for every complex s except s = 0. To visualize $F_1(s)$ we might try to graph it. However it will be simpler, and yet still show everything we need, if we graph $|F_1(s)|$ instead.

To start really simply, let’s just graph $|F_1(s)| = |1/s|$ for s real (rather than s complex).

Figure 1: Graph of $|1/s|$ for s real.

Now let’s do the same thing for $F_2(s) = 1/(s^2 - 4)$. The roots of the denominator are s = ±2, so the graph of $|F_2(s)| = 1/|s^2 - 4|$ has vertical asymptotes at s = ±2.

Figure 2: Graph of $1/|s^2 - 4|$ for s real.

As noted, the vertical asymptotes occur at values of s where the denominator of our function is 0. These are what we defined as the poles.

• $F_1(s) = 1/s$ has a single pole at s = 0.

• $F_2(s) = 1/(s^2 - 4)$ has two poles, one each at s = ±2.

Looking at Figures 1 and 2 you might be reminded of a tent. The poles of the tent are exactly the vertical asymptotes which sit at the poles of the function.



Let’s now try to graph $|F_1(s)|$ and $|F_2(s)|$ when we allow s to be complex. If s = a + ib then $|F_1(s)|$ depends on two variables a and b, so the graph requires three dimensions: two for a and b, and one more (the vertical axis) for the value of $|F_1(s)|$. The graphs are shown in Figure 3 below. They are 3D versions of the graphs above in Figures 1 and 2. At each pole there is a conical shape rising to infinity, and far from the poles the function falls off to 0.

Figure 3: The graphs of $|1/s|$ and $1/|s^2 - 4|$.

Roughly speaking, the poles tell you the shape of the graph of a function $|F(s)|$: it is large near the poles. In the typical pole diagrams seen in practice, $|F(s)|$ is also small far away from the poles.

4. Poles and Exponential Growth Rate

If a > 0, the exponential function $f_1(t) = e^{at}$ grows rapidly to infinity as t → ∞. Likewise the function $f_2(t) = e^{at}\sin bt$ is oscillatory with the amplitude of the oscillations growing exponentially to infinity as t → ∞. In both cases we call a the exponential growth rate of the function.

The formal definition is the following.

Definition: The exponential growth rate of a function f(t) is the smallest value a such that

$$\lim_{t\to\infty} \frac{f(t)}{e^{bt}} = 0 \quad \text{for all } b > a. \qquad (1)$$

In words, this says f(t) grows slower than any exponential with growth rate larger than a.

Examples. 1. $e^{2t}$ has exponential growth rate 2.

2. $e^{-2t}$ has exponential growth rate −2. A negative growth rate means that the function is decaying exponentially to zero as t → ∞.

3. f(t) = 1 has exponential growth rate 0.

4. cos t has exponential growth rate 0. This follows because $\lim_{t\to\infty} \dfrac{\cos t}{e^{bt}} = 0$ for all positive b.

5. f(t) = t has exponential growth rate 0. This may be surprising because f(t) grows to infinity. But it grows linearly, which is slower than any positive exponential growth rate.

6. $f(t) = e^{t^2}$ does not have an exponential growth rate since it grows faster than any exponential.

We have the following theorem connecting poles and exponential growth rate.

Theorem: The exponential growth rate of the function f (t) is the largest real part of all the poles of its Laplace transform F(s).

Examples. We’ll check the theorem in a few cases.

1. $f(t) = e^{3t}$ clearly has exponential growth rate equal to 3. Its Laplace transform is 1/(s − 3) which has a single pole at s = 3, and this agrees with the exponential growth rate of f(t).

2. Let f(t) = t, then $F(s) = 1/s^2$. F(s) has one pole at s = 0. This matches the exponential growth rate zero found in (5) from the previous set of examples.

3. Consider the function $f(t) = 3e^{2t} + 5e^{t} + 7e^{-8t}$. The Laplace transform is F(s) = 3/(s − 2) + 5/(s − 1) + 7/(s + 8), which has poles at s = 2, 1, −8. The largest of these is 2. (Don’t be fooled by the absolute value of −8; since 2 > −8, the largest pole is 2.) Thus, the exponential growth rate is 2. We can also see this directly from the formula for the function. It is clear that the $3e^{2t}$ term determines the growth rate since it is the dominant term as t → ∞.

4. Consider the function $f(t) = e^{-t}\cos 2t + 3e^{-2t}$. The Laplace transform is $F(s) = \dfrac{s + 1}{(s + 1)^2 + 4} + \dfrac{3}{s + 2}$. This has poles s = −1 ± 2i, −2. The largest real part among these is −1, so the exponential growth rate is −1.

Note that in item (4) in this set of examples the growth rate is negative because f(t) actually decays to 0 as t → ∞. We have the following

Rule: 1. If f(t) has a negative exponential growth rate then f(t) → 0 as t → ∞.

2. If f(t) has a positive exponential growth rate then f(t) → ∞ as t → ∞.



5. An Example of What the Poles Don’t Tell Us

Consider an arbitrary function f(t) with Laplace transform F(s) and a > 0. Shift f(t) to produce g(t) = u(t − a) f(t − a), which has Laplace transform $G(s) = e^{-as}F(s)$. Since $e^{-as}$ does not have any poles, G(s) and F(s) have exactly the same poles. That is, the poles can’t detect this type of shift in time.


Pole Diagrams

1. Definition of the Pole Diagram

The pole diagram of a function F(s) is simply the complex s-plane with an X marking the location of each pole of F(s).

Example 1. Draw the pole diagrams for each of the following functions.

a) $F_1(s) = \dfrac{1}{s + 2}$

b) $F_2(s) = \dfrac{1}{s - 2}$

c) $F_3(s) = \dfrac{1}{s^2 + 4}$

d) $F_4(s) = \dfrac{s}{s^2 + 6s + 10}$

e) $F_5(s) = \dfrac{1}{((s + 3)^2 + 1)(s + 2)(s + 4)}$

f) $F_6(s) = \dfrac{1}{((s + 3)^2 + 1)(s - 2)}$

Solution.

[Figure: pole diagrams (a)-(f), each showing the complex s-plane with an X marking each pole.]

For (d) we found the poles by first completing the square: $s^2 + 6s + 10 = (s + 3)^2 + 1$, so the poles are at s = −3 ± i.
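Completing the square is equivalent to applying the quadratic formula with a complex square root. As a minimal sketch (the function name `quadratic_poles` is a hypothetical helper, not something from the notes), the poles of a function with quadratic denominator can be computed with the standard library:

```python
import cmath

def quadratic_poles(a, b, c):
    """Return the two roots of a s^2 + b s + c = 0 (a != 0), allowing complex roots."""
    disc = cmath.sqrt(b * b - 4 * a * c)
    return ((-b + disc) / (2 * a), (-b - disc) / (2 * a))

# Denominators taken from the examples in these notes.
print(quadratic_poles(1, 6, 10))   # s^2 + 6s + 10: the pair -3 + i, -3 - i
print(quadratic_poles(1, 8, 7))    # s^2 + 8s + 7: the real roots -1 and -7
print(quadratic_poles(1, 0, 4))    # s^2 + 4: the purely imaginary pair 2i, -2i
```

This only locates zeros of the denominator; the function must already be in reduced form for those zeros to be poles.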

Example 2. Use the pole diagram to determine the exponential growth rate of the inverse Laplace transform of each of the functions in example 1.

Solution.

a) The largest pole is at −2, so the exponential growth rate is −2.

b) The largest pole is at 2, so the exponential growth rate is 2.

c) The poles are ±2i, so the largest real part of a pole is 0. The exponential growth rate is 0.

d) The largest real part of a pole is −3. The exponential growth rate is −3.

e) The largest real part of a pole is −2. The exponential growth rate is −2.

f) The largest real part of a pole is 2. The exponential growth rate is 2.
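The rule used in Example 2 (take the largest real part among the poles) is a one-line computation once the poles are listed. In the sketch below the pole lists are read off from the functions in Example 1; the names `growth_rate` and `example_poles` are illustrative assumptions.

```python
def growth_rate(poles):
    """Exponential growth rate: the largest real part among the given poles."""
    return max(complex(p).real for p in poles)

# Poles of F1..F6 from Example 1, written out by hand.
example_poles = {
    "a": [-2],
    "b": [2],
    "c": [2j, -2j],
    "d": [-3 + 1j, -3 - 1j],
    "e": [-3 + 1j, -3 - 1j, -2, -4],
    "f": [-3 + 1j, -3 - 1j, 2],
}

rates = {name: growth_rate(poles) for name, poles in example_poles.items()}
print(rates)
```

Note that only the real parts matter here; the imaginary parts of the poles govern oscillation, not growth.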

Example 3. Each of the pole diagrams below is for a function F(s) which is the Laplace transform of a function f(t). Say whether

(i) f(t) → 0 as t → ∞

(ii) f(t) → ∞ as t → ∞

(iii) You don’t know the behavior of f(t) as t → ∞.

[Figure: pole diagrams (a)-(g) in the complex s-plane, with an X marking each pole.]

Solution.

a) Exponential growth rate is −2, so f(t) → 0.

b) Exponential growth rate is −2, so f(t) → 0.

c) Exponential growth rate is 2, so f(t) → ∞.

d) Exponential growth rate is 0, so we can’t tell how f(t) behaves. Two examples of this: (i) if F(s) = 1/s then f(t) = 1, which stays bounded; (ii) if $F(s) = 1/s^2$ then f(t) = t, which does go to infinity, but more slowly than any positive exponential.

e) Exponential growth rate is 0, so we don’t know the behavior of f(t).

f) Exponential growth rate is 3, so f(t) → ∞.

g) Exponential growth rate is 0, so we don’t know the behavior of f(t). (For example, the transforms of both cos t and t cos t have poles at ±i.)

2. The Pole Diagram for an LTI System

Definition: The pole diagram for an LTI system is defined to be the pole diagram of its transfer function.

Example 4. Give the pole diagram for the system

$$\ddot{x} + 8\dot{x} + 7x = f(t),$$

where we take f(t) to be the input and x(t) the output.

Solution. The transfer function for this system is

$$W(s) = \frac{1}{s^2 + 8s + 7} = \frac{1}{(s + 1)(s + 7)}.$$

Therefore, the poles are s = −1, −7 and the pole diagram is

[Pole diagram: X at s = −1 and at s = −7 on the real axis.]

Example 5. Give the pole diagram for the system

$$\ddot{x} + 4\dot{x} + 6x = \dot{y},$$

where we consider y(t) to be the input and x(t) to be the output.

Solution. Assuming rest IC’s, Laplace transforming this equation gives us

$$(s^2 + 4s + 6)X = sY.$$

This implies

$$X(s) = \frac{s}{s^2 + 4s + 6}\,Y(s)$$

and the transfer function is

$$W(s) = \frac{s}{s^2 + 4s + 6}.$$

This has poles at $s = -2 \pm \sqrt{2}\,i$.

[Pole diagram: X at s = −2 + √2 i and at s = −2 − √2 i.]

Figure: Pole diagram for the system in example 5.
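As a numerical check of examples 4 and 5, the poles of a rational transfer function can be found as the roots of its denominator. The following is a minimal sketch, assuming NumPy is available and that numerator and denominator share no common roots; the helper names `poles` and `growth_rate` are illustrative and not part of the course materials.

```python
import numpy as np

def poles(denominator):
    """Roots of the denominator polynomial (coefficients, highest power first)."""
    return np.roots(denominator)

def growth_rate(denominator):
    """Largest real part among the poles."""
    return max(p.real for p in poles(denominator))

# Example 4: W(s) = 1/(s^2 + 8s + 7)
ex4 = poles([1, 8, 7])
# Example 5: W(s) = s/(s^2 + 4s + 6)
ex5 = poles([1, 4, 6])

print("example 4 poles:", ex4)
print("example 5 poles:", ex5)
print("example 5 growth rate:", growth_rate([1, 4, 6]))
```

The largest real part returned by `growth_rate` is the quantity read off the pole diagram in example 2.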


Poles and Stability

Recall that the LTI system

p(D)x = f (1)

has an associated homogeneous equation

p(D)x = 0 (2)

In unit 2 we saw the following stability criteria.

1. The system is stable if every solution to (2) goes to 0 as t → ∞. In words, the unforced system always returns to equilibrium.

2. Equivalently, the system is stable if all the roots of the characteristic equation have negative real part.

Now, since the transfer function for the system in (1) is 1/p(s), the poles of the system are just the characteristic roots. Comparing this with stability criterion 2 gives us another way of expressing the stability criteria.

3. The system is stable if all its poles have negative real part.

4. Equivalently, the system is stable if all its poles lie strictly in the left half of the complex plane Re(s) < 0.

Criterion 4 tells us how to see at a glance if the system is stable, as illustrated in the following example.

Example. Each of the following six graphs is the pole diagram of an LTI system. Say which of the systems are stable.

[Figure: six pole diagrams, panels (a) to (f); each is drawn in the complex s-plane with real-axis marks at ±1, ±3 and imaginary-axis marks at ±i, ±3i, and an X at each pole.]

Solution. (a), (c) and (e) have all their poles in the left half-plane, so they are stable. The others do not, so they are not stable.
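Criterion 3 can also be checked numerically when the characteristic polynomial p(s) is known. This is only a sketch under the assumption that NumPy is available; `is_stable` is a hypothetical helper name, and the three polynomials are illustrative choices rather than the systems pictured above. Poles lying exactly on the imaginary axis are sensitive to rounding, so borderline cases should be examined by hand.

```python
import numpy as np

def is_stable(char_poly):
    """True if every root of p(s) has negative real part.

    char_poly lists the coefficients of p(s), highest power first.
    """
    return all(root.real < 0 for root in np.roots(char_poly))

stable_case = is_stable([1, 8, 7])        # p(s) = s^2 + 8s + 7, poles -1 and -7
spiral_case = is_stable([1, -1, 1.25])    # p(s) = s^2 - s + 5/4, poles 1/2 +/- i
saddle_case = is_stable([1, -1, -2])      # p(s) = s^2 - s - 2, poles 2 and -1

print(stable_case, spiral_case, saddle_case)
```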


Poles and Amplitude Response

We started the session by considering the poles of functions F(s), and saw that, by definition, the graph of |F(s)| went off to infinity at the poles. Since it tells us where |F(s)| is infinite, the pole diagram provides a crude graph of |F(s)|: roughly speaking, |F(s)| will be large for values of s near the poles. In this note we show how this basic fact provides a useful graphical tool for spotting resonant or near-resonant frequencies for LTI systems.

Example 1. Figure 1 shows the pole diagram of a function F(s). At which of the points A, B, C on the diagram would you guess |F(s)| is largest?

Figure 1: Pole diagram for example 1. [Diagram not reproduced: four poles marked X and three labeled points A, B, C in the s-plane.]

Solution. Point A is close to a pole, while B and C are both far from poles, so we would guess that |F(s)| is largest at point A.

Example 2. The pole diagram of a function F(s) is shown in Figure 2. At what point s on the positive imaginary axis would you guess that |F(s)| is largest?

Figure 2: Pole diagram for example 2. [Diagram not reproduced: five poles marked X, one of them near 3i.]

Solution. We would guess that s should be close to 3i, which is near a pole. There is not enough information in the pole diagram to determine the exact location of the maximum, but it is most likely to be near the pole.

1. Amplitude Response and the System Function

Consider the system

$$p(D)x = f(t), \qquad (1)$$

where we take f(t) to be the input and x(t) to be the output. The transfer function of this system is

$$W(s) = \frac{1}{p(s)}. \qquad (2)$$

When f(t) = B cos(ωt) the Exponential Response Formula from unit 2 gives the following periodic solution to (1):

$$x_p(t) = \frac{B\cos(\omega t - \phi)}{|p(i\omega)|}, \quad \text{where } \phi = \operatorname{Arg}(p(i\omega)). \qquad (3)$$

If the system is stable, then all solutions are asymptotic to the periodic solution in (3). In this case, we saw in the session on Frequency Response in unit 2 that the amplitude response of the system as a function of ω is

$$g(\omega) = \frac{1}{|p(i\omega)|}. \qquad (4)$$

Comparing (2) and (4), we see that for a stable system the amplitude response is related to the transfer function by

g(ω) = |W(iω)|. (5)

Note: The relation (5) holds for all stable LTI systems.

Using equation (5) and the language of amplitude response we will now re-do example 2 to illustrate how to use the pole diagram to estimate the practical resonant frequencies of a stable system.

Example 3. Figure 3 shows the pole diagram of a stable LTI system. At approximately what frequency will the system have the biggest response?


Figure 3: Pole diagram for example 3 (same as Figure 2).

Solution. Let the transfer function be W(s). Equation (5) says the amplitude response g(ω) = |W(iω)|. Since iω is on the positive imaginary axis, the amplitude response g(ω) will be largest at the point iω on the imaginary axis where |W(iω)| is largest. This is exactly the point found in example 2. Thus, we choose iω ≈ 3i, i.e. the practical resonant frequency is approximately ω = 3.
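The reasoning in example 3 can be tried numerically by evaluating g(ω) = |W(iω)| along the imaginary axis. The sketch below assumes NumPy and uses a hypothetical stable system with p(s) = s^2 + 0.5s + 9, chosen only because its poles lie close to ±3i; it is not the system drawn in Figure 3.

```python
import numpy as np

def amplitude_response(p_coeffs, omega):
    """g(omega) = |W(i omega)| = 1/|p(i omega)| for W(s) = 1/p(s)."""
    return 1.0 / np.abs(np.polyval(p_coeffs, 1j * omega))

p = [1.0, 0.5, 9.0]                      # hypothetical p(s) = s^2 + 0.5s + 9
omega = np.linspace(0.0, 6.0, 6001)      # grid of input frequencies
g = amplitude_response(p, omega)
omega_peak = omega[np.argmax(g)]         # frequency with the largest response

print("poles:", np.roots(p))
print("practical resonant frequency (grid estimate):", omega_peak)
```

The peak frequency lands close to the imaginary part of the nearby pole, which is the estimate the pole diagram suggests.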

Note: Rephrasing this in graphical terms: we can graph the magnitude of the system function W(s) as a surface over the s-plane. The amplitude response of the system, g(ω) = |W(iω)|, is given by the part of the system function graph that lies above the imaginary axis. This is all illustrated beautifully by the applet Amplitude: Pole Diagram explored in the next note in this session.



Amplitude Response: Pole Diagram Applet

Open the applet and play with it. The main graph window is actually a 3D plot that you can rotate with your mouse.

Exploring the Applet

The applet shows the pole diagram and amplitude response of the system

$$\ddot{x} + b\dot{x} + kx = k\cos(\omega t).$$

It is designed to illustrate the connection between the pole diagram and the amplitude response of the system.

To return the graph window to its original position click on the ’top’ button. The screen should look something like the following.

[Screenshot of the applet in ’top’ view not reproduced.]

The main graph window and the pole diagram in the lower right look the same. Notice that the real axis on both is green, the imaginary axis is yellow, and the poles are red. The amplitude response graph in the upper right is the same one we’ve seen in the Amplitude and Phase Second Order applets.

The amplitude response graph is the same color as the imaginary axis because $A = k/|p(i\omega)|$, that is, because iω is on the imaginary axis. The yellow dot in all the windows indicates the current value of ω. (Because cos(ωt) is the real part of both $e^{i\omega t}$ and $e^{-i\omega t}$, there are dots at both ±iω in the pole diagram and main graph.)

Play with the b, k and ω sliders to see how the poles and amplitude change. Now set b = 0.75 and k = 2, then move ω to the position where the system has the maximum amplitude response.

Now click on the ’side’ button and rotate the main graph so you can see all its features.

[Screenshot of the applet in ’side’ view not reproduced.]

The surface shown in the plot is the graph of the magnitude of the transfer function, $|k/p(s)|$. The yellow curve on the surface above the imaginary axis is therefore the plot of $|k/p(i\omega)|$. Notice that this is the same graph as the amplitude response graph in the upper right of the applet. (The main graph also shows $|k/p(i\omega)|$ for ω < 0. This is just the mirror image of the graph for ω > 0.)

We can now explain how the main graph illustrates why choosing iω near a pole gives a big response: The yellow amplitude curve runs alongside the “mountains” that rise up near the poles. As iω gets close to a pole, the amplitude curve moves up the side of the mountain.


Questions

1. Why can’t the amplitude response become infinite by placing iω directly on a pole?

2. Why are the poles in the left half-plane (Re(s) < 0) for all choices of b and k?

3. Move b to 1.5 and k to 0.4. What happens to the poles? At what frequency is the amplitude response maximized?

4. Leave k at 0.4 and move b to 1.1. The poles are now a complex conjugate pair. What frequency gives the maximum amplitude response?

Answers

1. Because iω must stay on the imaginary axis, and the poles are not on the imaginary axis.

2. Because both b and k are positive, we know that the system is stable (since all second order systems with positive coefficients are stable). Therefore, all the poles must have negative real part, which places them in the left half-plane.

3. At these settings of b and k the poles are real. The peak amplitude response occurs at ω = 0.

4. The peak amplitude is still at ω = 0. As we saw in the session on Frequency Response in unit 2, this system needs to be sufficiently lightly damped, not just underdamped, in order to have a practical resonant frequency.
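For readers without access to the applet, the settings used above can be explored with a short script. This is a rough sketch, assuming NumPy; `poles_and_peak` is a hypothetical helper, and the peak is only located on a frequency grid rather than computed exactly.

```python
import numpy as np

def poles_and_peak(b, k):
    """Poles of k/(s^2 + b s + k) and the grid frequency maximizing k/|p(i omega)|."""
    p = [1.0, b, k]
    omega = np.linspace(0.0, 4.0, 4001)
    gain = k / np.abs(np.polyval(p, 1j * omega))
    return np.roots(p), omega[np.argmax(gain)]

# Settings taken from the text: the lightly damped case, then questions 3 and 4.
results = {(b, k): poles_and_peak(b, k) for b, k in [(0.75, 2.0), (1.5, 0.4), (1.1, 0.4)]}

for (b, k), (roots, peak) in results.items():
    print(f"b={b}, k={k}: poles={roots}, peak at omega={peak}")
```

With b = 0.75 and k = 2 the peak sits at a positive frequency, while for the two settings with k = 0.4 the largest response on the grid is at ω = 0, in line with answers 3 and 4.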


Linear Systems: Introduction

Suppose you want to model the populations of cats and mice on an island, say c(t) and m(t). Cats feed on mice, and will starve without them. Therefore, any differential equation describing the rate of change of c should also involve m. Equally, the rate of change of m depends on the current number of cats.

Up until now, we have studied differential equations with a single dependent variable. However, many real-life situations are modeled by collections of differential equations involving several dependent variables and their time-derivatives; the above scenario is one of them.

These are called systems. In this final unit, we will learn about some of the simpler ones: they will be of first order, which means that only the first time-derivatives of the dependent variables are involved. Also, there will usually only be two dependent variables. We will spend most of the unit studying linear systems, before devoting the final few sessions to nonlinear ones.

The present session introduces 2 × 2 linear systems; we will learn how to solve them by turning the problem into a second order ODE in one variable. Conversely, every second order ODE can be obtained from a system, and such a perspective can shed a new light on the ODE. Look out for the companion matrix and the applet Phase portraits: Matrix entry, which allows one to easily visualize the solutions to these sorts of systems.

Perhaps most importantly, we will learn to present both our system and its solutions using matrices and vectors, paving the way to the later sessions in this unit.

First order Linear Systems

1. Models: Two Examples

Example 1. Farmer Jones and farmer McGregor each have a field full of rabbits; farmer Jones’ field contains x(t) rabbits, and farmer McGregor’s contains y(t); time t is measured in months. These rabbits breed fast: they show a net growth rate of 0.5 rabbits per rabbit per month. The rabbits can also hop over the hedge between the fields; the grass is greener in farmer McGregor’s field, so Jones’ rabbits jump over at the rate of 0.2 per month, whereas McGregor’s jump at the rate of 0.1 per month.

We can illustrate this with the following flow diagram:

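[Flow diagram: two boxes labeled Jones and McGregor, each with a growth rate of .5; an arrow from Jones to McGregor labeled .2 and an arrow from McGregor to Jones labeled .1.]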

Putting everything together, the equations governing x and y are:

$$\dot{x} = .5x - .2x + .1y = .3x + .1y$$
$$\dot{y} = .5y - .1y + .2x = .2x + .4y.$$

This is an example of a first order 2 × 2 linear system.

Example 2. Consider the equations:

$$R' = \tfrac{1}{4}R + J$$
$$J' = -\tfrac{17}{16}R + \tfrac{3}{4}J$$

This is another first order 2 × 2 linear system. It is the result of an analysis by the MIT Humanities Department of the plot of a famous Shakespeare play: R denotes Romeo’s love for Juliet, and J Juliet’s love for Romeo. What does this model mean? Let us try to work backwards.

The change in Romeo’s feelings towards Juliet is mostly determined by how she feels about him: J is the most important term in the expression for R'. His own feelings have a small reinforcing effect corresponding to the factor of 1/4 in front of the R.

On the other hand, Juliet is more complex. She has a healthy self-awareness. If she loves Romeo, that very fact causes her to love him more: this is where the (3/4)J term comes from. On the other hand, if he seems to love her, she gets frightened and starts to love him less: hence the −(17/16)R term.

We shall revisit this example in the note on the companion matrix and in the companion matrix applet, and analyze mathematically the solutions to this tragic model.

2. Solving a Linear System by Elimination

Definition. A two-by-two first order linear system of ODE’s with constant coefficients is a collection of equations:

$$\dot{x} = ax + by$$
$$\dot{y} = cx + dy,$$

where a, b, c, and d are constants.

We’ll shorten this to first order linear system or even linear system.

Remark. Say you have dependent variables x1, . . . , xn; one can define n × n first order linear systems (with constant coefficients) in exactly the same way:

$$\dot{x}_1 = a_{11}x_1 + \dots + a_{1n}x_n$$
$$\vdots$$
$$\dot{x}_n = a_{n1}x_1 + \dots + a_{nn}x_n,$$

where the aij are constants.

In the next few sessions, we shall develop many tools for understanding 2 × 2 linear systems, both analytically and qualitatively. All these techniques generalize, in a fairly straightforward fashion, to n × n systems. However, this goes beyond the scope of this course. The interested reader could for instance consult the textbook by Edwards and Penney.

The naive way to solve a linear system of ODE’s with constant coefficients is by eliminating variables, so as to change it into a single higher order equation, in one dependent variable. One then solves this equation using the techniques for constant-coefficient ODE’s learned in unit 2. This is best illustrated with a worked example.



Example. Let us consider the system of equations from example 1.

$$\dot{x} = 0.3x + 0.1y \qquad (1)$$
$$\dot{y} = 0.2x + 0.4y. \qquad (2)$$

Step 1. Transform the equations to get a second order ODE for x. Use (1) to express y in terms of x:

$$y = 10\dot{x} - 3x. \qquad (3)$$

Plug this into (2). (From equation (3) we get $\dot{y} = 10\ddot{x} - 3\dot{x}$.) This gives

$$10\ddot{x} - 3\dot{x} = 0.2x + 4\dot{x} - 1.2x \iff 10\ddot{x} - 7\dot{x} + x = 0. \qquad (4)$$

Step 2. Solve the ODE for x. The characteristic equation for (4) is

$$10s^2 - 7s + 1 = 0 \iff s^2 - 0.7s + 0.1 = 0,$$

which has roots $r_1 = 0.5$ and $r_2 = 0.2$. Thus we get two basic solutions,

$$x_1 = e^{0.5t} \quad\text{and}\quad x_2 = e^{0.2t}.$$

Step 3. Solution for y. Each basic solution for x gives a corresponding solution for y, using equation (3):

$$y_1 = 2e^{0.5t} \quad\text{and}\quad y_2 = -e^{0.2t}.$$

Step 4. Using superposition we get the general solution

$$x(t) = c_1 e^{0.5t} + c_2 e^{0.2t}$$
$$y(t) = 2c_1 e^{0.5t} - c_2 e^{0.2t}.$$

Remarks.

1. It is important to understand that the constants c1 and c2 are the same for x and y; this follows from equation (3).

2. For certain ci, there will be negatively-valued solutions; these are clearly not biologically significant: the model only holds for x, y ≥ 0.

3. We chose to eliminate y to have a second order equation in terms of x; we could just as well have chosen to eliminate x to get an equation in y. It might sometimes be computationally easier to go one way than the other; look out for this.


4. We started by solving systems by elimination because it reduces to our previous methods. This will not be our preferred technique. In fact, in both theoretical and especially numerical work it is usually preferable to go the opposite way and convert a higher order ODE into a system of first order equations and then use matrix methods.
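A solution found by elimination is easy to check by substituting it back into the original system. The sketch below does this numerically for the rabbit example, assuming NumPy; the constants c1 = 3 and c2 = −2 are arbitrary illustrative values and `solution` is a hypothetical helper name.

```python
import numpy as np

def solution(t, c1, c2):
    """General solution from Step 4 together with its time derivative."""
    x = c1 * np.exp(0.5 * t) + c2 * np.exp(0.2 * t)
    y = 2 * c1 * np.exp(0.5 * t) - c2 * np.exp(0.2 * t)
    dx = 0.5 * c1 * np.exp(0.5 * t) + 0.2 * c2 * np.exp(0.2 * t)
    dy = 1.0 * c1 * np.exp(0.5 * t) - 0.2 * c2 * np.exp(0.2 * t)
    return x, y, dx, dy

t = np.linspace(0.0, 5.0, 51)
x, y, dx, dy = solution(t, c1=3.0, c2=-2.0)   # arbitrary constants

# Residuals of the two equations; both should vanish up to rounding.
residual_1 = np.max(np.abs(dx - (0.3 * x + 0.1 * y)))
residual_2 = np.max(np.abs(dy - (0.2 * x + 0.4 * y)))
print(residual_1, residual_2)
```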


Solving by Elimination

Exercise. Use the method of elimination to solve the following system.

$$\dot{x} = x + 3y$$
$$\dot{y} = x - y.$$

Answer. Step 1. Let us eliminate x by solving the second equation for x. We get

$$x = \dot{y} + y. \qquad (1)$$

Replacing x everywhere by $\dot{y} + y$ in the first equation gives

$$\ddot{y} - 4y = 0. \qquad (2)$$

Step 2. The characteristic equation for (2) is (r − 2)(r + 2) = 0, so the general solution for y is

$$y = c_1 e^{2t} + c_2 e^{-2t}.$$

Step 3. From the solution for y and equation (1), which was originally used to eliminate x, we get $x = 3c_1 e^{2t} - c_2 e^{-2t}$.

Step 4. The solution to the system is thus

$$x = 3c_1 e^{2t} - c_2 e^{-2t}$$
$$y = c_1 e^{2t} + c_2 e^{-2t}.$$


Review of Vectors and Matrices

1. Vectors

A vector (or n-vector) is an n-tuple of numbers; they are usually real numbers, but we will sometimes allow them to be complex numbers. All the rules and operations below apply just as well to n-tuples of complex numbers. (In the context of vectors, a single real or complex number, i.e., a constant, is called a scalar.) As we are dealing with 2 × 2 linear systems, we are primarily interested in scalars and 2-vectors: ordered pairs of numbers.

The pair can be written horizontally as a row vector or vertically as a column vector. In these notes, it will almost always be a column. To save space, we will sometimes write the column vector as shown below; the small T stands for transpose, and means: change the row to a column.

a = (a, b)   row vector

a = (a, b)^T   column vector

These notes use boldface for vectors; in handwriting, place an arrow over the letter.

Vector operations. Here are three standard operations on vectors:

• addition: (a, b) + (c, d) = (a + c, b + d).

• multiplication by a scalar: c (a, b) = (ca, cb).

• scalar product: (a, b) · (c, d) = ac + bd.

2. Matrices

An m × n matrix A is a rectangular array of numbers (real or complex) having m rows and n columns. The element in the i-th row and j-th column is called the ij-th entry and written aij. The matrix itself is sometimes written (aij), i.e., by giving its generic entry, inside the matrix parentheses. We will be interested in matrices where m and n are at most 2.

Note that a 1 × 2 matrix is a row vector; a 2 × 1 matrix is a column vector.

Matrix operations.

• addition: if A and B are both m × n matrices, they are added by adding the corresponding entries; i.e., if A = (aij) and B = (bij), then A + B = (aij + bij).

• multiplication by a scalar: to get cA, multiply every entry of A by the scalar c; i.e., if A = (aij), then cA = (caij).

• matrix multiplication: if A is an m × n matrix and B is an n × k matrix, their product AB is an m × k matrix, defined by using the scalar product operation:

ij-th entry of AB = (i-th row of A) · (j-th column of B)^T

where the scalar product of two 1-vectors is just their normal product.

The definition makes sense since both vectors on the right are vectors of the same length n. In what follows, the most important cases of matrix multiplication will be:

(i) A and B are square 2 × 2 matrices. In this case, multiplication is always possible, and the product AB is again a 2 × 2 matrix.

(ii) A is a 2 × 2 matrix and B = b, a column 2-vector. In this case, the matrix product Ab is again a column 2-vector.

Laws satisfied by the matrix operations. For any matrices for which the products and sums below are defined, we have

(A B) C = A (B C)   (associative law)

A (B + C) = A B + A C,  (A + B) C = A C + B C   (distributive laws)

A B ≠ B A   (commutative law fails in general)

The identity matrix I is the 2 × 2 matrix with 1’s on the main diagonal (upper left and bottom right), and 0’s elsewhere. If A is an arbitrary 2 × 2 matrix, it is easy to check from the definition of matrix multiplication that

AI = A and IA = A.

The exercises later in this session should help you get familiar with all these concepts.
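The laws above can be spot-checked on concrete matrices. The following is a small sketch, assuming NumPy, where `@` denotes matrix multiplication; the particular matrices A, B, C and the column vector b are arbitrary illustrative choices.

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])
C = np.array([[2, 0], [1, 1]])
I = np.eye(2, dtype=int)              # 2 x 2 identity matrix
b = np.array([[5], [6]])              # column 2-vector stored as a 2 x 1 matrix

AB, BA = A @ B, B @ A                 # products in both orders
Ab = A @ b                            # 2 x 2 times 2 x 1 gives 2 x 1

print(AB)
print(BA)
print(Ab)
```

Here AB and BA come out different, which illustrates that the commutative law fails in general.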


Vectors and Matrices

1. Complete the following vector operations.

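a) $(1, 2)^T$. Answer. $\begin{pmatrix} 1 \\ 2 \end{pmatrix}$ (transpose).

b) $\begin{pmatrix} a \\ b \end{pmatrix} + \begin{pmatrix} 3 \\ 4 \end{pmatrix}$. Answer. $\begin{pmatrix} a + 3 \\ b + 4 \end{pmatrix}$.

c) $c \begin{pmatrix} 5 \\ 6 \end{pmatrix}$. Answer. $\begin{pmatrix} 5c \\ 6c \end{pmatrix}$.

2. Compute the following matrix products:

a) $(a, b) \begin{pmatrix} 1 \\ 2 \end{pmatrix}$. Answer. a + 2b (1 × 2 times 2 × 1 = 1 × 1.)

b) $\begin{pmatrix} a \\ b \end{pmatrix} (1, 2)$. Answer. $\begin{pmatrix} a & 2a \\ b & 2b \end{pmatrix}$ (2 × 1 times 1 × 2 = 2 × 2.)

c) $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix}$. Answer. $\begin{pmatrix} a + 2b \\ c + 2d \end{pmatrix}$.

d) $(1, 2) \begin{pmatrix} a & b \\ c & d \end{pmatrix}$. Answer. (a + 2c, b + 2d).

e) $\begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$. Answer. $\begin{pmatrix} a + 3b & 2a + 4b \\ c + 3d & 2c + 4d \end{pmatrix}$.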


Describing a First order System Using Matrix Notation

1. Description of the Equation

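A general 2 × 2 linear system is given by:

$$\dot{x} = ax + by$$
$$\dot{y} = cx + dy$$

The terms have been arranged in a suggestive manner. We can express this system using matrices and vectors:

$$\begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.$$

We can present this in the following even more compact form. Let $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$ and write u for the column vector $\begin{pmatrix} x \\ y \end{pmatrix}$. We have $\dot{u}(t) = \begin{pmatrix} \dot{x}(t) \\ \dot{y}(t) \end{pmatrix}$ and the system is simply $\dot{u} = Au$.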

Example 1. Our favorite system, governing the rabbit populations in farmers Jones’ and McGregor’s fields, was

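$$\dot{x} = 0.3x + 0.1y$$
$$\dot{y} = 0.2x + 0.4y,$$

which has matrix form

$$\begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} = \begin{pmatrix} 0.3 & 0.1 \\ 0.2 & 0.4 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \quad\text{or}\quad \dot{u} = Au, \text{ where } A = \begin{pmatrix} 0.3 & 0.1 \\ 0.2 & 0.4 \end{pmatrix}.$$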

2. Description of the Solution

To describe the solution, we will use the column vector $u(t) = \begin{pmatrix} x(t) \\ y(t) \end{pmatrix}$.

Example 2. Earlier we used the method of elimination to solve the system in example 1. We found $x(t) = c_1 e^{0.5t} + c_2 e^{0.2t}$, $y(t) = 2c_1 e^{0.5t} - c_2 e^{0.2t}$. Rewriting this in vector form we have

$$u(t) = \begin{pmatrix} c_1 e^{0.5t} + c_2 e^{0.2t} \\ 2c_1 e^{0.5t} - c_2 e^{0.2t} \end{pmatrix}.$$

We can rewrite this as

$$u(t) = c_1 e^{0.5t} \begin{pmatrix} 1 \\ 2 \end{pmatrix} + c_2 e^{0.2t} \begin{pmatrix} 1 \\ -1 \end{pmatrix},$$

which is a clearer way of presenting it. Let

$$u_1(t) = e^{0.5t} \begin{pmatrix} 1 \\ 2 \end{pmatrix} \quad\text{and}\quad u_2(t) = e^{0.2t} \begin{pmatrix} 1 \\ -1 \end{pmatrix}.$$

The column vectors u1(t) and u2(t) are both solutions. Since they both involve only one form of exponential, they are sometimes known as basic independent solutions, or normal modes. The general solution is a linear combination of them. We will learn much more about normal modes in the sessions on matrix methods and the phase portrait.
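The claim that any linear combination of the two normal modes solves the system in matrix form can be tested directly. This is a minimal sketch, assuming NumPy; the function names `u` and `du` and the constants 1.5 and −0.5 are illustrative choices only.

```python
import numpy as np

A = np.array([[0.3, 0.1], [0.2, 0.4]])
v1 = np.array([1.0, 2.0])     # direction of the first normal mode
v2 = np.array([1.0, -1.0])    # direction of the second normal mode

def u(t, c1, c2):
    """General solution as a combination of the two normal modes."""
    return c1 * np.exp(0.5 * t) * v1 + c2 * np.exp(0.2 * t) * v2

def du(t, c1, c2):
    """Time derivative of u, obtained by differentiating each exponential."""
    return 0.5 * c1 * np.exp(0.5 * t) * v1 + 0.2 * c2 * np.exp(0.2 * t) * v2

# The difference du/dt - A u should vanish (up to rounding) at every time.
residuals = [du(t, 1.5, -0.5) - A @ u(t, 1.5, -0.5) for t in (0.0, 1.0, 2.5)]
print(residuals)
```

Setting c2 = 0 leaves a multiple of (1, 2)^T at every time, which is why the first normal mode moves along a straight ray.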

Remark. As with linear second order ODE’s in unit 2, the general solution to a 2 × 2 linear system should always consist of linear combinations of two truly different solutions. It is not necessary, but usually our techniques will make these two solutions the normal modes.

3. Geometry of the Solutions

Suppose you want to plot a solution u(t). As time increases, it traces a curve in the xy-plane.

Example. The solution u1(t) traces a ray that passes through (1, 2) at t = 0 and moves off towards infinity in a straight line, with exponential speed. There is another ray through (1, −1), corresponding to the solution u2(t). (This is tricky: the exponential in the formula for u1 might make you think the trajectory is curved. However, if you look carefully at the formula you will see u1(t) is always a multiple of the vector (1, 2)^T.)

The applet Linear phase portrait: matrix entries will allow us to visualise this nicely, and get a feel for other sorts of trajectories. Later in this session you will look at this applet.


Matrix Notation

Exercise. The system (which we looked at earlier)

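$$\dot{x} = x + 3y$$
$$\dot{y} = x - y$$

has general solution

$$x = 3c_1 e^{2t} - c_2 e^{-2t}$$
$$y = c_1 e^{2t} + c_2 e^{-2t}.$$

Re-express this using matrix notation. What are two independent basic solutions?

Answer. The matrix form for the system is

$$\begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} = \begin{pmatrix} 1 & 3 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}$$

and the solution can be expressed as

$$\begin{pmatrix} x \\ y \end{pmatrix} = \begin{pmatrix} 3c_1 e^{2t} - c_2 e^{-2t} \\ c_1 e^{2t} + c_2 e^{-2t} \end{pmatrix} = c_1 e^{2t} \begin{pmatrix} 3 \\ 1 \end{pmatrix} + c_2 e^{-2t} \begin{pmatrix} -1 \\ 1 \end{pmatrix}.$$

Two basic independent particular solutions are

$$e^{2t} \begin{pmatrix} 3 \\ 1 \end{pmatrix} \quad\text{and}\quad e^{-2t} \begin{pmatrix} -1 \\ 1 \end{pmatrix}.$$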

Linear Phase Portraits: Matrix Entry

Open up the applet Linear phase portraits: Matrix Entry. Unselect companion matrix. This enables you to enter values for a, b, c, and d, between −4 and 4. These numbers are the entries of the matrix A of a first order linear system, which you see on the left. You can also see some of the solution curves u(t) in the xy-plane. If you click on a point in the plane, the curve through that point will appear.

Ignore all the other features of the applet for now; they will be used, and explained, later on.

1. Set the matrix to be $\begin{pmatrix} 0.3 & 0.1 \\ 0.2 & 0.4 \end{pmatrix}$. This corresponds to the system we looked at earlier in this session in the note called Describing a First order System Using Matrix Notation (about farmers Jones and McGregor). Can you see the two rays that were described at the end of that section? What do the other curves look like? Do you notice anything about their behavior near zero? Examine the expression for the solution again, to see whether you can find an explanation for this.

2. Set the matrix to be $\begin{pmatrix} 1 & 3 \\ 1 & -1 \end{pmatrix}$. This corresponds to the system that we studied earlier in this session. What do the rays correspond to? What do the other curves look like? Examine the expression for the solution again. Why does one ray point towards zero, and one away from it?

By the end of the session on phase portraits, you should have a thorough understanding of the geometry of these examples. In the meantime, this applet will be a very important tool, notably for illustrations.

The Companion Matrix

1. Introduction

We started the session by using elimination to convert a first order linear system to an ordinary differential equation in one of the variables. In this note we’ll see how to go the other way. That is, to convert a second order ODE to a 2 × 2 system of first order equations. Looking at a second order system this way gives us an important way to visualize the second order equation.

Since elimination takes us in the other direction we might call this process anti-elimination. This term is not entirely standard, but it will serve us nicely.

We won’t cover it in this course, but every method we saw for numerical solutions to first order ODE’s goes through without change to systems of first order equations. Thus, anti-elimination allows us to use numerical techniques on second order ODE’s. Indeed it can be used to convert any order ODE to a system of first order equations.

2. Anti-elimination and the Companion Matrix

Suppose we have a second order homogeneous linear equation, say

$$\ddot{x} + b\dot{x} + kx = 0. \qquad (1)$$

We can derive a first order linear system from this, by using the following trick. Introduce a second variable defined by

$$y = \dot{x}.$$

Substituting $y = \dot{x}$ and $\dot{y} = \ddot{x}$ into equation (1) we get

$$\dot{y} + by + kx = 0 \;\Rightarrow\; \dot{y} = -kx - by.$$

We now have a first order system

$$\dot{x} = y$$
$$\dot{y} = -kx - by.$$

The corresponding coefficient matrix is

$$A = \begin{pmatrix} 0 & 1 \\ -k & -b \end{pmatrix}.$$

This is called the companion matrix of the equation (1). In this case, the solution vector is $u(t) = \begin{pmatrix} x(t) \\ \dot{x}(t) \end{pmatrix}$. It records both the solution to (1) and its derivative.

Example 1. Consider the equation $\ddot{x} - \dot{x} + \tfrac{5}{4}x = 0$. The companion matrix is

$$\begin{pmatrix} 0 & 1 \\ -5/4 & 1 \end{pmatrix}.$$

What do solutions of this system look like? The characteristic polynomial of the second order equation is

$$p(s) = s^2 - s + 5/4 = (s - 1/2)^2 + 1.$$

So, the roots are r = (1/2) ± i. From unit 2, the general solution in amplitude-phase form is given by

$$x(t) = Ce^{t/2}\cos(t - \phi),$$

where C and φ are constants. These oscillate under an exponentially growing envelope. The derivative does the same, but is out of phase. This means that the trajectory traced out by $(x, y) = (x, \dot{x})$ is an expanding spiral.
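Anti-elimination makes the second order equation accessible to numerical methods for first order systems. The sketch below applies Euler's method to the companion system of example 1, assuming NumPy; `companion` and `euler` are hypothetical helper names, and the initial condition x(0) = 1, ẋ(0) = 1/2 is chosen so that the exact solution is x(t) = e^{t/2} cos t (that is, C = 1 and φ = 0).

```python
import numpy as np

def companion(b, k):
    """Companion matrix of x'' + b x' + k x = 0."""
    return np.array([[0.0, 1.0], [-k, -b]])

def euler(A, u0, h, steps):
    """Euler's method for u' = A u; returns the array of iterates."""
    us = [np.array(u0, dtype=float)]
    for _ in range(steps):
        us.append(us[-1] + h * (A @ us[-1]))
    return np.array(us)

A = companion(b=-1.0, k=1.25)               # x'' - x' + (5/4) x = 0
path = euler(A, u0=[1.0, 0.5], h=0.001, steps=2000)
exact_x = np.exp(1.0) * np.cos(2.0)         # x(t) = e^{t/2} cos t at t = 2

print("Euler estimate of x(2):", path[-1, 0])
print("exact value of x(2):   ", exact_x)
```

Each row of `path` is a point (x, ẋ) on the trajectory, and its distance from the origin grows over time as the spiral expands.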

Example 2. (Elimination followed by anti-elimination) Earlier in this session, we learned how to solve systems by elimination. What happens when we do elimination followed by anti-elimination? Let us revisit the example about Romeo and Juliet. The first order system describing their mutual feelings was

$$R' = \tfrac{1}{4}R + J \qquad (2)$$
$$J' = -\tfrac{17}{16}R + \tfrac{3}{4}J. \qquad (3)$$

Let us first eliminate J, to get a second order equation in R. From (2), $J = R' - \tfrac{1}{4}R$. Substituting this into (3) gives

$$R'' - \tfrac{1}{4}R' = -\tfrac{17}{16}R + \tfrac{3}{4}R' - \tfrac{3}{16}R \iff R'' - R' + \tfrac{5}{4}R = 0. \qquad (4)$$

(We saw this equation in example 1.) Applying anti-elimination gives

$$R' = Y$$
$$Y' = -\tfrac{5}{4}R + Y.$$

This is a different system from the one we started with. The companion matrix of the ODE (4) is different from the original matrix associated to the system.

Companion Matrices

Open up the applet Linear phase portrait: Matrix entry again and uncheck companion matrix. Let’s see what happens to Romeo and Juliet. Enter the matrix corresponding to their system:

$$A = \begin{pmatrix} 0.25 & 1 \\ -1.125 & 0.75 \end{pmatrix}.$$

What do the trajectories look like? How can they be interpreted?

Remember that x corresponds to R, and y to J. Let us start at (.5, 0): Romeo is fond of Juliet, but she is neutral towards him. However, she does notice that he is fond of her and this makes her somewhat hostile. As she becomes more distant, his affection wanes. Eventually, he is neutral and she really doesn’t like him: the trajectory is (roughly) at (0, −1). This continues; presently he stays away from her, and this very fact makes her more interested. She warms to him; he notices, and R', while still negative, starts to increase. Eventually she is neutral and the trajectory crosses the x-axis again, around (−2.4, 0). He then starts to feel better towards her, but still stays away; now both his attitude and hers cause her to feel progressively more well disposed towards him. This causes him to continue to warm to her. Following this around, you wind up at J = 0 again, but now R has increased: it is already outside the applet’s screen. This is a cyclical evolution, but with each cycle the intensity of feelings increases. We all know the sad outcome.

Now check companion matrix again. This corresponds to doing elimination followed by anti-elimination, as in example 2 in the note on the companion matrix. What do the trajectories look like? This should confirm the answer that we obtained analytically for example 1 in the note on the companion matrix. How did the trajectories change when you checked ’companion matrix’? Note that x is still R, but y is now R', not J. The fact that the pictures did not change dramatically is no accident. We will learn why in the phase portraits session.

Play around with the applet a bit more, entering some other systems and then clicking companion matrix (thus performing elimination followed by anti-elimination). What sorts of things do you notice?

Elimination Followed by Anti-elimination

Quiz: Elimination converts the system

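$$\dot{x} = 6x + 5y$$
$$\dot{y} = x + 2y$$

to the ODE $\ddot{x} - 8\dot{x} + 7x = 0$.

What is the coefficient matrix of the system that anti-elimination converts it back to?

Choices:

a. $\begin{pmatrix} 1 & 2 \\ 6 & 5 \end{pmatrix}$

b. $\begin{pmatrix} 6 & 5 \\ 1 & 2 \end{pmatrix}$

c. $\begin{pmatrix} 0 & 1 \\ -7 & 8 \end{pmatrix}$

d. None of these.

Answer: (c) $\begin{pmatrix} 0 & 1 \\ -7 & 8 \end{pmatrix}$.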

Note that neither a. nor b. has the form of a companion matrix.
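The quiz answer can be reproduced by building the companion matrix from the ODE coefficients. The sketch below assumes NumPy and uses a hypothetical helper `companion`; it also prints the sum of the diagonal entries (the trace) and the determinant of both matrices, which agree here because the ODE obtained by elimination from this system has coefficients −(6 + 2) = −8 and 6·2 − 5·1 = 7.

```python
import numpy as np

def companion(b, k):
    """Companion matrix of x'' + b x' + k x = 0."""
    return np.array([[0, 1], [-k, -b]])

original = np.array([[6, 5], [1, 2]])        # coefficient matrix of the system
anti = companion(b=-8, k=7)                  # from x'' - 8x' + 7x = 0

print(anti)
print("traces:", np.trace(original), np.trace(anti))
print("determinants:", np.linalg.det(original), np.linalg.det(anti))
```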





Matrix Methods: Eigenvalues and Normal Modes

In the previous session, we learned how to solve first order systems by elimination; we presented our results using vectors and matrices. This leads to a new approach, pursued in this session; it turns out the system can be largely understood by examining features of the coefficient matrix, notably its eigenvalues and the corresponding eigenvectors.

These are terms belonging to the field of linear algebra; the whole session is fairly algebra-heavy, so we start out by building up some background knowledge. Once this is in place, we develop our approach, first studying a long example, then the general case. It turns out that this splits into subcases classified by features of the eigenvalues of the coefficient matrix: these can be real and distinct; complex; real and repeated. Each case will be accompanied by a worked example.

This session is quite long, probably close to two normal sessions in length. This is because we cover all the possible cases: real, complex and repeated eigenvalues. Please don't rush through it. Take your time and learn this well. Eigenvalues and eigenvectors are not only the key idea in the rest of this unit, they are also ubiquitous throughout math, science and engineering.


Vectors and Matrices: Homogeneous Systems

This is meant as a follow-up on the review of vectors and matrices in the previous session.

1. More on matrices

Associated with every square matrix A is a number, written |A| or det(A), called the determinant of A. For these notes, it will be enough if you can calculate the determinant of 2 × 2 matrices, which is as follows:

\[ \det \begin{pmatrix} a & b \\ c & d \end{pmatrix} = ad - bc \]

The trace of a square matrix A is the sum of the elements on the main diagonal; it is denoted tr(A):

\[ \operatorname{tr} \begin{pmatrix} a & b \\ c & d \end{pmatrix} = a + d \]

Remark. Theoretically, the determinant should not be confused with the matrix itself. The determinant is a number, the matrix is the square array. But everyone puts vertical lines on either side of the matrix to indicate its determinant, and then uses phrases like "the first row of the determinant," meaning the first row of the corresponding matrix.

An important formula which everyone uses and no one can prove is

\[ \det(A \cdot B) = \det A \cdot \det B. \qquad (1) \]
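As a quick sanity check of formula (1), here is a small Python sketch that multiplies two sample 2 × 2 matrices and compares determinants. The matrices and the helper names det2 and matmul2 are chosen for illustration only; one numerical instance is of course not a proof.

```python
def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def matmul2(p, q):
    """Product of two 2x2 matrices."""
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


A = [[1, 2], [3, 4]]
B = [[0, 1], [-7, 8]]
AB = matmul2(A, B)
print(AB)                             # [[-14, 17], [-28, 35]]
print(det2(AB), det2(A) * det2(B))    # -14 -14
```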

2. Homogeneous 2 × 2 systems

Matrices and determinants were originally invented to handle, in an efficient way, the solution of a system of simultaneous linear equations. This is still one of their most important uses. We give a brief account of what you need to know for now. We will restrict ourselves to square 2 × 2 homogeneous systems; they have two equations and two variables (or "unknowns", as they are frequently called). Our notation will be:

A = (aij), a square 2 × 2 matrix of constants,

x = (x1, x2)T , a column vector of unknowns;


then the square system

\[ \begin{aligned} a_{11}x_1 + a_{12}x_2 &= 0 \\ a_{21}x_1 + a_{22}x_2 &= 0 \end{aligned} \]

can be abbreviated by the matrix equation

A x = 0. (2)

This always has the solution x = 0, which we call the trivial solution. The question is: when does it have a nontrivial solution?

Theorem. Let A be a square matrix. The equation

Ax = 0 has a nontrivial solution ⇔ det A = 0 (i.e., A is singular). (3)

We will use this, but not prove it in this course.

3. Linear independence of vectors

Conceptually, linear independence of vectors means each one provides something new to the mix. For two vectors this just means they are not zero and are not multiples of each other.

Example 1. (1, 2) and (3, 4) are linearly independent.

Example 2. a = (1, 2) and b = (2, 4) are linearly dependent because b is a multiple of a. Notice that if we take linear combinations then b doesn’t add anything to the set of vectors we can get from a alone.

Example 3. a = (1, 2) and b = (0, 0) are linearly dependent because b is a multiple of a, i.e., b = 0 · a.

Determinantal criterion for linear independence Let a = (a1, a2) and b = (b1, b2) be 2-vectors, and A the square matrix having these vectors for its rows (or columns). Then

a, b are linearly independent ⇔ det A ≠ 0. (4)

Let us re-visit our previous examples.

Examples. 1. $\det \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix} = 4 - 6 = -2 \neq 0$. Therefore, (1, 2) and (3, 4) are linearly independent.

2. $\det \begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix} = 1 \times 4 - 2 \times 2 = 0$. Therefore, (1, 2) and (2, 4) are linearly dependent.

3. $\det \begin{pmatrix} 1 & 2 \\ 0 & 0 \end{pmatrix} = 1 \times 0 - 2 \times 0 = 0$. Therefore, (1, 2) and (0, 0) are linearly dependent.

Remark. The theorem on square homogeneous systems (3) follows from this criterion. We will prove neither.

Two linearly independent 2-vectors v1 and v2 form a basis for the plane: every 2-vector w can be written as a linear combination of v1 and v2. That is, there are scalars c1 and c2 such that

\[ c_1\mathbf{v}_1 + c_2\mathbf{v}_2 = \mathbf{w}. \]

Remark. All of the notions and theorems mentioned in this section generalize to higher n (and a larger collection of vectors), though we will not need them.
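The determinantal criterion (4) and the basis statement can both be tried out in a few lines of Python. This is only a sketch under stated assumptions: vectors are pairs of integers, and the coefficients c1, c2 are found by Cramer's rule, which is one standard way to solve a 2 × 2 system and is not prescribed by these notes. The names independent and coordinates are hypothetical.

```python
from fractions import Fraction


def det2(m):
    """Determinant of a 2x2 matrix given as nested lists."""
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]


def independent(a, b):
    """Criterion (4): a, b are independent iff det of the matrix with rows a, b is non-zero."""
    return det2([list(a), list(b)]) != 0


def coordinates(v1, v2, w):
    """Return (c1, c2) with c1*v1 + c2*v2 = w, using Cramer's rule (integer entries assumed)."""
    d = det2([[v1[0], v2[0]], [v1[1], v2[1]]])
    if d == 0:
        raise ValueError("v1 and v2 are linearly dependent, so they are not a basis")
    c1 = Fraction(det2([[w[0], v2[0]], [w[1], v2[1]]]), d)
    c2 = Fraction(det2([[v1[0], w[0]], [v1[1], w[1]]]), d)
    return c1, c2


print(independent((1, 2), (3, 4)))   # True
print(independent((1, 2), (2, 4)))   # False
print(independent((1, 2), (0, 0)))   # False
```

For instance, coordinates((1, 2), (3, 4), (5, 6)) returns the pair (−1, 2), since −1·(1, 2) + 2·(3, 4) = (5, 6).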


Linear Algebra

1. Compute determinants of the following matrices.

a) $\begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$  b) $\begin{pmatrix} a & b \\ c & d \end{pmatrix}$  c) $\begin{pmatrix} 1 & 2 \\ -2 & -4 \end{pmatrix}$.

Answer. a) −2  b) ad − bc  c) 0.

2. Find all solutions to Ax = 0 for

a) $A = \begin{pmatrix} 1 & 2 \\ -2 & -4 \end{pmatrix}$  b) $A = \begin{pmatrix} 1 & 2 \\ 3 & 4 \end{pmatrix}$.

Answer. a) All multiples of (−2, 1)^T. b) 0 (zero-vector) only.

3. Which of the following pairs of vectors are linearly independent?

a) (1, 0) and (1, 1)
b) (2, 5) and (1, 3)
c) (1, 3) and (−2, −6)

Answer. a) and b), but not c): in (a) and (b) neither vector is a multiple of the other. In (c), (−2, −6) = −2(1, 3).


Motivation and Derivation: Worked Example

Consider the system we studied in several examples in the opening session of unit 4:

\[ \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} 1 & 3 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \]

We are now going to show a new method of solving this system, which makes use of the matrix form for writing it. Recall that two modal solutions to the system are

\[ e^{2t} \begin{pmatrix} 3 \\ 1 \end{pmatrix} \quad \text{and} \quad e^{-2t} \begin{pmatrix} -1 \\ 1 \end{pmatrix}. \]

Based on this, our new method is to look for solutions of the form

\[ \begin{pmatrix} x \\ y \end{pmatrix} = e^{\lambda t} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \qquad (1) \]

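where a1, a2 and λ are unknown constants. We substitute this into the system to determine what these unknown constants should be. This gives

\[ \lambda e^{\lambda t} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = e^{\lambda t} \begin{pmatrix} 1 & 3 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \qquad (2) \]

We can cancel the factor $e^{\lambda t}$ from both sides, getting

\[ \lambda \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 1 & 3 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \qquad (3) \]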

This is a matrix equation for the three unknowns. It is not very clear how to solve it. When faced with equations in unfamiliar notation, a reasonable strategy is to rewrite them in more familiar notation. If we try this, we get the pair of equations

λa1 = a1 + 3a2

λa2 = a1 − a2 .

Technically speaking, these are a pair of nonlinear equations in three variables. The trick in solving them is to look at them as a pair of linear equations in the unknowns ai, with λ viewed as a parameter. If we think of them this way, it immediately suggests writing them in standard form

\[ \begin{aligned} (1 - \lambda)a_1 + 3a_2 &= 0 \\ a_1 + (-1 - \lambda)a_2 &= 0. \end{aligned} \qquad (4) \]

In this form, we recognize them as forming a square system of homogeneous linear equations. According to our theorem on square homogeneous systems they have a non-zero solution for the a’s if and only if the determinant of coefficients is zero:

\[ \begin{vmatrix} 1 - \lambda & 3 \\ 1 & -1 - \lambda \end{vmatrix} = 0. \]

After calculation of the determinant this becomes the equation

\[ \lambda^2 - 4 = 0. \]

The roots of this equation are 2 and −2. What the argument shows is that the equations (4) (and therefore also (2)) have non-trivial solutions for the a’s exactly when λ = 2 or λ = −2. To complete the work, we see that for these values of the parameter λ, the system (4) becomes respectively

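\[ \begin{aligned} -a_1 + 3a_2 &= 0 \\ a_1 - 3a_2 &= 0 \end{aligned} \quad (\text{for } \lambda = 2); \qquad \begin{aligned} 3a_1 + 3a_2 &= 0 \\ a_1 + a_2 &= 0 \end{aligned} \quad (\text{for } \lambda = -2) \qquad (5) \]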

Remark. It is of course no accident that in each case the two equations of the system become dependent, i.e., one is a constant multiple of the other. If this were not so, the two equations would have only the trivial solution (0, 0). All of our effort has been to locate the two values of λ for which this will not be so. The dependency of the two equations is thus a check on the correctness of the value of λ.

To conclude, we solve the two systems in (5). This is best done by assigning the value 1 to one of the unknowns, and solving for the other. First try a1 = 1; if that does not work (in which case, the solution to (5) will have a1 = 0), try a2 = 1. We get

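\[ \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 3 \\ 1 \end{pmatrix} \ \text{for } \lambda = 2; \qquad \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 1 \\ -1 \end{pmatrix} \ \text{for } \lambda = -2, \]

which gives us, in view of (1), the two solutions:

\[ e^{2t} \begin{pmatrix} 3 \\ 1 \end{pmatrix} \quad \text{and} \quad e^{-2t} \begin{pmatrix} 1 \\ -1 \end{pmatrix}, \]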

which are essentially the two solutions we had found previously by the method of elimination.
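The word "essentially" can be made concrete: an eigenvector is only fixed up to a non-zero multiple, so (1, −1) and (−1, 1) serve equally well for λ = −2. The Python sketch below checks the defining relation Av = λv for the vectors found above; the helper names matvec and is_eigenpair are illustrative and not from the notes.

```python
def matvec(m, v):
    """Product of a 2x2 matrix with a 2-vector."""
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]


def is_eigenpair(m, lam, v):
    """True if v is non-zero and m v equals lam v (exact arithmetic assumed)."""
    return any(v) and matvec(m, v) == [lam * v[0], lam * v[1]]


A = [[1, 3], [1, -1]]
print(is_eigenpair(A, 2, (3, 1)))     # True
print(is_eigenpair(A, -2, (1, -1)))   # True
print(is_eigenpair(A, -2, (-1, 1)))   # True: same eigenline, opposite sign
print(is_eigenpair(A, 2, (1, 1)))     # False
```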


Remarks. 1. With the elimination method, the basic normal solutions could be multiplied by an arbitrary non-zero constant without changing the validity of the general solution. Here, this corresponds to the fact that we get to select an arbitrary value of one of the a’s (the other value then being determined).

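3. Is there some way of passing from (3) (the point at which we were temporarily stuck) to (4) by using matrices, without writing out the equations separately? The temptation in (3) is to try to combine the two column vectors a by subtraction, but this is impossible as the matrix equation stands. If we rewrite it however as

\[ \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 1 & 3 \\ 1 & -1 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix}, \]

it now makes sense to subtract the left side from the right. Using the distributive law for matrix multiplication, this becomes

\[ \begin{pmatrix} 1 - \lambda & 3 \\ 1 & -1 - \lambda \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \]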

which is just the matrix form for (4). The trick therefore was in (3) to replace the scalar λ by the diagonal matrix λ I .


General Case: Eigenvalues and Eigenvectors

We are now ready to tackle the general case of a linear 2 × 2 system:

\[ \begin{aligned} \dot{x} &= ax + by \\ \dot{y} &= cx + dy, \end{aligned} \]

where the a, b, c, d are constants. We will be following exactly the strategy that we laid out in the previous note. These are key concepts for the rest of the unit, and you should take the time to absorb them.

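We want to learn to write the system efficiently in matrix form. So, throughout the derivation, we will give the expanded matrix form of our manipulations on the left, and the abridged form on the right. For example, our system is:

\[ \begin{pmatrix} \dot{x} \\ \dot{y} \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \iff \dot{\mathbf{x}} = A\mathbf{x} \qquad (1) \]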

Following the method in the previous note, we look for solutions to our system having the form

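\[ \begin{pmatrix} x \\ y \end{pmatrix} = e^{\lambda t} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} a_1 e^{\lambda t} \\ a_2 e^{\lambda t} \end{pmatrix} \iff \mathbf{x} = e^{\lambda t}\mathbf{a}, \]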

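where a1, a2, and λ are unknown constants. We substitute this into the system (1) to determine these unknown constants. Since $D(\mathbf{a}e^{\lambda t}) = \lambda \mathbf{a} e^{\lambda t}$, we arrive at

\[ \lambda e^{\lambda t} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = e^{\lambda t} \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \iff \lambda e^{\lambda t}\mathbf{a} = e^{\lambda t} A\mathbf{a}. \]

We can cancel the factor $e^{\lambda t}$ from both sides, getting

\[ \lambda \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \iff \lambda \mathbf{a} = A\mathbf{a}. \]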

As it stands, we cannot combine the two sides by subtraction, since the scalar λ cannot be subtracted from the square matrix on the right. As in the previously worked example, the trick is to replace the scalar λ by the diagonal matrix λI. This gives

\[ \begin{pmatrix} \lambda & 0 \\ 0 & \lambda \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} \iff \lambda I\mathbf{a} = A\mathbf{a}. \]

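If we now proceed as we did in the example, subtracting the left side from the right one and using the distributive law for matrix addition and multiplication, we get a 2 × 2 homogeneous linear system of equations:

\[ \begin{pmatrix} a - \lambda & b \\ c & d - \lambda \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \iff (A - \lambda I)\mathbf{a} = \mathbf{0}. \]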

Written out without using matrices, the equations are

\[ \begin{aligned} (a - \lambda)a_1 + b a_2 &= 0 \\ c a_1 + (d - \lambda)a_2 &= 0. \end{aligned} \qquad (2) \]

According to the theorem on square homogeneous systems this system has a non-zero solution for the a’s if and only if the determinant of the coefficients is zero, i.e.,

\[ \begin{vmatrix} a - \lambda & b \\ c & d - \lambda \end{vmatrix} = 0 \iff |A - \lambda I| = 0. \]

Evaluating the determinant we get a quadratic equation in λ:

\[ \lambda^2 - (a + d)\lambda + (ad - bc) = 0. \]

Definition. This is called the characteristic equation of the matrix

\[ A = \begin{pmatrix} a & b \\ c & d \end{pmatrix} \]

and its left-hand side is often denoted $p_A(\lambda)$. Its roots λ1 and λ2 are called the eigenvalues or characteristic values of the matrix A.

Remark. In calculating the characteristic equation notice that

\[ ad - bc = \det A, \qquad a + d = \operatorname{tr} A. \]

Using this, the characteristic equation for a 2 × 2 matrix A can be written as

\[ \lambda^2 - (\operatorname{tr} A)\lambda + \det A = 0. \]

In this form, the characteristic equation of A can be written down by inspection; you don’t need the intermediate step of writing down |A − λI| = 0.
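The trace-determinant form also makes the eigenvalues easy to compute. Below is a minimal Python sketch that applies the quadratic formula to λ² − (tr A)λ + det A = 0. It uses cmath.sqrt from the standard library so that a negative discriminant yields complex roots; the function name eigenvalues_2x2 is an illustrative choice, not notation from these notes.

```python
import cmath


def eigenvalues_2x2(m):
    """Roots of lambda^2 - tr(m) lambda + det(m) = 0, returned as complex numbers."""
    tr = m[0][0] + m[1][1]
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)   # complex square root of the discriminant
    return (tr - root) / 2, (tr + root) / 2


print(eigenvalues_2x2([[1, 3], [1, -1]]))   # ((-2+0j), (2+0j))
print(eigenvalues_2x2([[1, 2], [-2, 1]]))   # ((1-2j), (1+2j))
```

The first matrix is the one from the worked example in the previous note, and the output agrees with the roots 2 and −2 found there.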

Remark. Abridged vs. expanded notation. In the manipulations above, the matrix notation on the right is compact to write, which makes the derivation look simpler. On the other hand, its chief disadvantage for beginners is that it is very compressed. Practice writing the sequence of matrix equations so you get some skill in using the notation. Until you acquire some confidence, keep referring to the written-out form on the left, so you are sure you understand what the abridged form is actually saying.

There are now various cases to consider, according to whether the eigenvalues of the matrix A are: 1. two distinct real numbers, 2. a single repeated real number, 3. a pair of conjugate complex numbers.

We begin with the first case: for the rest of this note, the eigenvalues are two distinct real numbers λ1 and λ2.

1. Real distinct eigenvalues

To complete our work, we have to find the solutions to the system (2) corresponding to the eigenvalues λ1 and λ2. Formally, the systems become

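\[ \begin{aligned} (a - \lambda_1)a_1 + b a_2 &= 0 \\ c a_1 + (d - \lambda_1)a_2 &= 0 \end{aligned} \qquad\qquad \begin{aligned} (a - \lambda_2)a_1 + b a_2 &= 0 \\ c a_1 + (d - \lambda_2)a_2 &= 0 \end{aligned} \qquad (3) \]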

The solutions to these two systems are column vectors, for which we will typically use v.

Definition. The respective solutions a = v1 and a = v2 to the systems (3) are called eigenvectors (or characteristic vectors) corresponding to the eigenvalues λ1 and λ2.

Remarks. 1. If the work has been done correctly, in each of the two systems in (3), the two equations will be dependent, i.e., one will be a constant multiple of the other. Why? The two values of λ have been selected so that in each case the coefficient determinant |A − λI| will be zero, which means the equations will be dependent.

2. The solution v is determined only up to an arbitrary non-zero constant factor: if v is an eigenvector for λ, then so is cv, for any non-zero real constant c; because of this, the line through v is sometimes called an eigenline. A convenient way of finding the eigenvector v is to assign the value 1 to one of the ai, then use the equation to solve for the corresponding value of the other ai. (First try a1 = 1; if that does not work, then a2 = 1 will.)

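Once the eigenvalues and their corresponding eigenvectors have been found, we have two independent solutions to the system (1). They are

\[ \mathbf{x}_1(t) = e^{\lambda_1 t}\mathbf{v}_1, \qquad \mathbf{x}_2(t) = e^{\lambda_2 t}\mathbf{v}_2, \qquad \text{where } \mathbf{x}_i(t) = \begin{pmatrix} x_i \\ y_i \end{pmatrix}. \]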

Definition. In science and engineering applications, these are usually called the normal modes.

Using superposition, the general solution to the system (1) is

\[ \mathbf{x} = c_1\mathbf{x}_1 + c_2\mathbf{x}_2 = c_1 e^{\lambda_1 t}\mathbf{v}_1 + c_2 e^{\lambda_2 t}\mathbf{v}_2. \]

Remarks. 1. The normal modes often have physical interpretations; this means that it is sometimes possible to find them just by inspection of the physical problem.

2. In the compact notation, the definitions and derivations are valid for square systems of any size. Thus, for instance, you know how to solve a 3 × 3 system, if its eigenvalues turn out to be real and distinct. We won't consider any such systems in these notes, though.

We will apply these techniques to a worked example in the next note in this session titled Worked Example: Distinct Real Roots.
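The eigenvector recipe from Remark 2 ("first try a1 = 1; if that does not work, then a2 = 1 will") and the superposition formula can be sketched in Python as follows. This is an illustrative sketch, assuming the eigenvalues passed in are real and exact; the names eigenvector and general_solution are hypothetical.

```python
import math


def eigenvector(m, lam):
    """Eigenvector of the 2x2 matrix m for a known real eigenvalue lam.

    Tries a1 = 1 first and falls back to a2 = 1, as in the text.
    """
    (a, b), (c, d) = m
    if b != 0:
        return (1, (lam - a) / b)     # from (a - lam) a1 + b a2 = 0 with a1 = 1
    if d - lam != 0:
        return (1, -c / (d - lam))    # from c a1 + (d - lam) a2 = 0 with a1 = 1
    if a - lam == 0 and c == 0:
        return (1, 0)                 # A - lam I = 0: every non-zero vector works
    return (0, 1)                     # a1 = 1 is impossible, so take a2 = 1


def general_solution(m, lam1, lam2, c1, c2):
    """Return the function t -> c1 e^(lam1 t) v1 + c2 e^(lam2 t) v2."""
    v1, v2 = eigenvector(m, lam1), eigenvector(m, lam2)

    def x(t):
        e1, e2 = math.exp(lam1 * t), math.exp(lam2 * t)
        return (c1 * e1 * v1[0] + c2 * e2 * v2[0],
                c1 * e1 * v1[1] + c2 * e2 * v2[1])

    return x


A = [[-2, 1], [-4, 3]]           # eigenvalues -1 and 2
print(eigenvector(A, -1))        # (1, 1.0)
print(eigenvector(A, 2))         # (1, 4.0)
```

Calling general_solution(A, -1, 2, c1, c2) then returns a function of t that can be evaluated for any choice of the constants c1 and c2.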


Worked Example: Distinct Real Roots

Problem. Find the general solution to

\[ \dot{\mathbf{u}} = A\mathbf{u}, \quad \text{where } A = \begin{pmatrix} -2 & 1 \\ -4 & 3 \end{pmatrix}. \]

Find the solution with initial conditions u(0) = (1, 0)T . Throughout, comments are given in italics.

Solution. Step 0. Write down A − λI. Even if you find the characteristic equation of A using its trace and determinant, you will need this later, for finding eigenvectors. Most students find it useful to write it down clearly at the start of the question.

\[ A - \lambda I = \begin{pmatrix} -2 - \lambda & 1 \\ -4 & 3 - \lambda \end{pmatrix}. \]

Step 1. Find the characteristic equation of A. We use the method involving the trace and determinant of A.

tr(A) = −2 + 3 = 1

det(A) = −2 × 3 − 1 × (−4) = −6 + 4 = −2

Thus $p_A(\lambda) = \det(A - \lambda I) = \lambda^2 - \lambda - 2$.

Step 2. Find the eigenvalues of A. These are the roots of the characteristic equation. We complete the square. (We could also have used the quadratic formula.)

\[ p_A(\lambda) = (\lambda - 1/2)^2 - 9/4. \]

The roots are 1/2 ± 3/2, so λ1 = −1 and λ2 = 2.

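Step 3. Find associated eigenvectors.

3a. Eigenvector for λ1. This is a vector a = (a1, a2)^T that must satisfy

\[ (A + I)\mathbf{a} = \mathbf{0} \iff \begin{pmatrix} -2 + 1 & 1 \\ -4 & 3 + 1 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \iff \begin{pmatrix} -1 & 1 \\ -4 & 4 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \iff \begin{cases} -a_1 + a_2 = 0 \\ -4a_1 + 4a_2 = 0 \end{cases} \]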

Check: one equation is a multiple of the other, as should be the case. This is a good sign. Setting a1 = 1 gives a2 = 1; thus one eigenvector for λ1 is (1, 1)T .

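3b. Eigenvector for λ2. This is a vector (a1, a2)^T that must satisfy:

\[ (A - 2I)\mathbf{a} = \mathbf{0} \iff \begin{pmatrix} -2 - 2 & 1 \\ -4 & 3 - 2 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \iff \begin{cases} -4a_1 + a_2 = 0 \\ -4a_1 + a_2 = 0 \end{cases} \]

Check: one equation is a (trivial) multiple of the other. Setting a1 = 1 gives a2 = 4. Thus, one eigenvector for λ2 is (1, 4)^T.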

Step 4. Normal modes and general solution

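The normal modes are

\[ e^{-t} \begin{pmatrix} 1 \\ 1 \end{pmatrix} \quad \text{and} \quad e^{2t} \begin{pmatrix} 1 \\ 4 \end{pmatrix}, \]

and the general solution is:

\[ \mathbf{u}(t) = c_1 e^{-t} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + c_2 e^{2t} \begin{pmatrix} 1 \\ 4 \end{pmatrix}. \]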

Step 5. Solution matching IC. We solve for c1 and c2 using our initial condition. From our expression for the general solution, u(0) = c1(1, 1)^T + c2(1, 4)^T = (c1 + c2, c1 + 4c2)^T. Thus the initial condition u(0) = (1, 0)^T gives:

\[ \begin{cases} c_1 + c_2 = 1 \\ c_1 + 4c_2 = 0 \end{cases} \iff c_2 = -1/3, \quad c_1 = 4/3. \]

The solution we were asked for is:

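\[ \mathbf{u}(t) = \frac{4}{3} e^{-t} \begin{pmatrix} 1 \\ 1 \end{pmatrix} - \frac{1}{3} e^{2t} \begin{pmatrix} 1 \\ 4 \end{pmatrix}. \]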
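It is worth confirming that the final answer really satisfies both the differential equation and the initial condition. The Python sketch below does this by differentiating the solution by hand and comparing with Au at a few sample times. The function names u, du and residual are illustrative, and the sample times are arbitrary.

```python
import math

A = [[-2, 1], [-4, 3]]


def u(t):
    """u(t) = (4/3) e^(-t) (1, 1) - (1/3) e^(2t) (1, 4)."""
    e1, e2 = math.exp(-t), math.exp(2 * t)
    return (4 / 3 * e1 - 1 / 3 * e2, 4 / 3 * e1 - 4 / 3 * e2)


def du(t):
    """Derivative of u(t), computed by hand from the formula above."""
    e1, e2 = math.exp(-t), math.exp(2 * t)
    return (-4 / 3 * e1 - 2 / 3 * e2, -4 / 3 * e1 - 8 / 3 * e2)


def residual(t):
    """Components of u'(t) - A u(t); both should vanish up to rounding."""
    x, y = u(t)
    dx, dy = du(t)
    return (dx - (A[0][0] * x + A[0][1] * y),
            dy - (A[1][0] * x + A[1][1] * y))


print(u(0))            # approximately (1, 0), the initial condition
print(residual(0.5))   # both entries are zero up to floating-point rounding
```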


Matrix/Vector Applet

Let us revisit the Matrix/Vector applet.

1. Set the matrix to $\begin{pmatrix} 1 & 2 \\ 2 & 4 \end{pmatrix}$. Can you find a non-zero input vector that produces zero output? What's the determinant, and why does your finding make sense? Find other matrices where this is the case, and some where it is not.

2. In the previous session, we found some cases where the input vector lined up with the output, and noted that scaling the input did not change this. These are eigenvectors, and the line they span is called an eigenline. Note that if the eigenvalue is negative, the input and output vectors are lined up, but point in opposite directions.

Can you find:

a) a matrix with exactly two eigenlines;
b) a matrix with exactly one eigenline;
c) a matrix with no eigenlines;
d) a matrix where all lines are eigenlines?

We have already seen some examples, and will see more in the later notes: (a) corresponds to the case of a matrix with two distinct real eigenvalues; (b) to the case of a defective repeated eigenvalue; (c) to the case of complex eigenvalues, and (d) to the case of a complete repeated eigenvalue.
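The four cases can be told apart from the entries of the matrix alone. The following Python sketch classifies a real 2 × 2 matrix by the sign of the discriminant of its characteristic equation; it is an illustration only, assumes exact (for example integer) entries, and the function name eigenline_count is hypothetical.

```python
def eigenline_count(m):
    """Return 'two', 'one', 'none' or 'all' for a real 2x2 matrix with exact entries."""
    (a, b), (c, d) = m
    disc = (a + d) ** 2 - 4 * (a * d - b * c)   # discriminant of the characteristic equation
    if disc > 0:
        return "two"     # two distinct real eigenvalues
    if disc < 0:
        return "none"    # complex eigenvalues
    if b == 0 and c == 0:
        return "all"     # complete repeated eigenvalue: A is a multiple of I
    return "one"         # defective repeated eigenvalue


print(eigenline_count([[1, 3], [1, -1]]))    # two
print(eigenline_count([[-2, 1], [-1, 0]]))   # one
print(eigenline_count([[1, 2], [-2, 1]]))    # none
print(eigenline_count([[3, 0], [0, 3]]))     # all
```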

Complex Eigenvalues

1. Complex Eigenvalues

In the previous note, we obtained the solutions to a homogeneous linear system with constant coefficients

\[ \dot{\mathbf{x}} = A\mathbf{x} \]

under the assumption that the roots of its characteristic equation

|A − λI| = 0,

— i.e., the eigenvalues of A — were real and distinct.

In this section we consider what to do if there are complex eigenvalues. Since the characteristic equation has real coefficients, its complex roots must occur in conjugate pairs:

\[ \lambda = a + bi, \qquad \bar{\lambda} = a - bi. \]

Let’s start with the eigenvalue a + bi. According to the solution method described in the note Eigenvectors and Eigenvalues, (from earlier in this session) the next step would be to find the corresponding eigenvector v, by solving the equations

\[ \begin{aligned} (a - \lambda)a_1 + b a_2 &= 0 \\ c a_1 + (d - \lambda)a_2 &= 0 \end{aligned} \]

for its components a1 and a2. Since λ is complex, the ai will also be complex, and therefore the eigenvector v corresponding to λ will have complex components. Putting together the eigenvalue and eigenvector gives us formally the complex solution

\[ \mathbf{x} = e^{(a+bi)t}\mathbf{v}. \qquad (1) \]

Naturally, we want real solutions to the system, since it was real to start with. To get them, the following theorem tells us to just take the real and imaginary parts of (1). (This theorem is exactly analogous to what we did with ordinary differential equations.)

Theorem. Consider a system $\dot{\mathbf{x}} = A\mathbf{x}$, where A is a real matrix. If x = x1 + i x2 is a complex solution, then its real and imaginary parts x1, x2 are also solutions to the system.

Proof. Since x1 + i x2 is a solution, we have

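\[ (\mathbf{x}_1 + i\mathbf{x}_2)' = A(\mathbf{x}_1 + i\mathbf{x}_2) = A\mathbf{x}_1 + iA\mathbf{x}_2. \]

Equating real and imaginary parts of this equation,

\[ \mathbf{x}_1' = A\mathbf{x}_1, \qquad \mathbf{x}_2' = A\mathbf{x}_2, \]

which shows exactly that the real vectors x1 and x2 are solutions to x′ = Ax.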

Example. Find the corresponding two real solutions to x′ = Ax if a complex eigenvalue and corresponding eigenvector are

\[ \lambda = -1 + 2i, \qquad \mathbf{v} = \begin{pmatrix} i \\ 2 - 2i \end{pmatrix}. \]

Solution. First write v in terms of its real and imaginary parts:

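\[ \mathbf{v} = \begin{pmatrix} 0 \\ 2 \end{pmatrix} + i \begin{pmatrix} 1 \\ -2 \end{pmatrix}. \]

The corresponding complex solution $\mathbf{x} = e^{\lambda t}\mathbf{v}$ to the system can then be written

\[ \mathbf{x} = e^{-t}\big(\cos(2t) + i\sin(2t)\big) \left( \begin{pmatrix} 0 \\ 2 \end{pmatrix} + i \begin{pmatrix} 1 \\ -2 \end{pmatrix} \right). \]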

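Now, using the theorem, the real and imaginary parts of x are

\[ \mathbf{x}_1 = e^{-t} \left( \begin{pmatrix} 0 \\ 2 \end{pmatrix} \cos(2t) - \begin{pmatrix} 1 \\ -2 \end{pmatrix} \sin(2t) \right) = e^{-t} \begin{pmatrix} -\sin(2t) \\ 2\cos(2t) + 2\sin(2t) \end{pmatrix}, \]

\[ \mathbf{x}_2 = e^{-t} \left( \begin{pmatrix} 1 \\ -2 \end{pmatrix} \cos(2t) + \begin{pmatrix} 0 \\ 2 \end{pmatrix} \sin(2t) \right) = e^{-t} \begin{pmatrix} \cos(2t) \\ -2\cos(2t) + 2\sin(2t) \end{pmatrix}. \]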

These are two distinct real solutions to the system.

In general, if the complex eigenvalue is a + bi, to get the real solutions to the system, we write the corresponding complex eigenvector v in terms of its real and imaginary part:

v = v1 + i v2, where v1, v2 are real vectors;

(study carefully in the example above how this is done in practice). Then we substitute into (1) and calculate as in the example:

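\[ \mathbf{x} = e^{at}\big(\cos(bt) + i\sin(bt)\big)(\mathbf{v}_1 + i\mathbf{v}_2), \]

so that the real and imaginary parts of x give respectively the two real solutions

\[ \mathbf{x}_1 = e^{at}\big(\mathbf{v}_1\cos(bt) - \mathbf{v}_2\sin(bt)\big), \qquad \mathbf{x}_2 = e^{at}\big(\mathbf{v}_1\sin(bt) + \mathbf{v}_2\cos(bt)\big). \qquad (2) \]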

These solutions are linearly independent: they are two truly different solutions. The general solution is given by their linear combinations

c1x1 + c2x2 .

Remarks 1. The complex conjugate eigenvalue a − bi gives up to sign the same two solutions x1 and x2.

2. The expression (2) was not written down for you to memorize, learn, or even use; the point was just for you to get some practice in seeing how a calculation like the one in our example looks when written out in general. To actually solve ODE systems having complex eigenvalues, imitate the procedure in the following example.
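Taking real and imaginary parts is something a computer can do directly with complex arithmetic. The Python sketch below builds the two real solutions from a complex eigenvalue and eigenvector using the standard cmath module, then compares them with the closed forms obtained in the example above (λ = −1 + 2i, v = (i, 2 − 2i)^T). The function names real_solutions, x1_by_hand and x2_by_hand are illustrative only.

```python
import cmath
import math


def real_solutions(lam, v):
    """Return functions x1, x2: the real and imaginary parts of e^(lam t) v."""
    def x1(t):
        z = cmath.exp(lam * t)
        return tuple((z * comp).real for comp in v)

    def x2(t):
        z = cmath.exp(lam * t)
        return tuple((z * comp).imag for comp in v)

    return x1, x2


def x1_by_hand(t):
    """Closed form of x1 from the example."""
    e, c, s = math.exp(-t), math.cos(2 * t), math.sin(2 * t)
    return (-e * s, e * (2 * c + 2 * s))


def x2_by_hand(t):
    """Closed form of x2 from the example."""
    e, c, s = math.exp(-t), math.cos(2 * t), math.sin(2 * t)
    return (e * c, e * (-2 * c + 2 * s))


x1, x2 = real_solutions(-1 + 2j, (1j, 2 - 2j))
print(x1(0), x2(0))   # (0.0, 2.0) (1.0, -2.0)
```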

2. Worked Example

Problem. Solve $\dot{\mathbf{u}} = A\mathbf{u}$, where $A = \begin{pmatrix} 1 & 2 \\ -2 & 1 \end{pmatrix}$. Comments are given in italics; the steps initially follow those in the note Worked Example: Distinct Real Roots, then diverge.

Solution.

Step 0. Write down A − λI:

\[ A - \lambda I = \begin{pmatrix} 1 - \lambda & 2 \\ -2 & 1 - \lambda \end{pmatrix}. \]

Step 1. Find the characteristic equation of A. We use the method involving the trace and determinant of A. tr(A) = 1 + 1 = 2; det(A) = 1 × 1 − 2 × (−2) = 5. Thus

\[ p_A(\lambda) = \det(A - \lambda I) = \lambda^2 - \operatorname{tr}(A)\lambda + \det A = \lambda^2 - 2\lambda + 5. \]

Step 2. Find the eigenvalues of A. We complete the square: $p_A(\lambda) = \lambda^2 - 2\lambda + 5 = (\lambda - 1)^2 + 4$. The roots are 1 ± 2i, so λ1 = 1 + 2i and λ2 = 1 − 2i.

Step 3. Find the eigenvector associated to one eigenvalue. The eigenvalues are complex, so we'll only need one eigenvector. We look for the eigenvector (a1, a2)^T for λ1 = 1 + 2i. It must satisfy:

\[ (A - (1 + 2i)I)\mathbf{a} = \mathbf{0} \iff \begin{pmatrix} -2i & 2 \\ -2 & -2i \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \iff \begin{cases} -2i\,a_1 + 2a_2 = 0 \\ -2a_1 - 2i\,a_2 = 0. \end{cases} \]

You should check that these two equations are equivalent. This gives a1 + i a2 = 0. Pick a1 = 1; this implies a2 = i. Thus an eigenvector for λ1 is $\mathbf{v}_1 = \begin{pmatrix} 1 \\ i \end{pmatrix}$.

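Step 4. Find the real and imaginary parts of the solution associated to λ1. The solution we associated to λ1 is

\[ e^{\lambda_1 t}\mathbf{v}_1 = e^{(1+2i)t} \begin{pmatrix} 1 \\ i \end{pmatrix} = e^{t}\big(\cos(2t) + i\sin(2t)\big) \begin{pmatrix} 1 \\ i \end{pmatrix}. \]

This has real and imaginary parts:

\[ \mathbf{x}_1 = e^{t} \begin{pmatrix} \cos(2t) \\ -\sin(2t) \end{pmatrix}, \qquad \mathbf{x}_2 = e^{t} \begin{pmatrix} \sin(2t) \\ \cos(2t) \end{pmatrix}. \]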

If you are confused by steps 3 or 4, you should read over the note again.

Step 5. General solution.

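\[ \mathbf{u}(t) = c_1 e^{t} \begin{pmatrix} \cos(2t) \\ -\sin(2t) \end{pmatrix} + c_2 e^{t} \begin{pmatrix} \sin(2t) \\ \cos(2t) \end{pmatrix}. \]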
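One way to gain confidence in the two real solutions is a numerical check: approximate the derivative by a central difference and compare it with Au. The Python sketch below does this for x1 and x2; the step size h and the sample times are arbitrary choices for illustration, and the function name residual is hypothetical.

```python
import math

A = [[1, 2], [-2, 1]]


def x1(t):
    return (math.exp(t) * math.cos(2 * t), -math.exp(t) * math.sin(2 * t))


def x2(t):
    return (math.exp(t) * math.sin(2 * t), math.exp(t) * math.cos(2 * t))


def residual(x, t, h=1e-6):
    """Largest component of |x'(t) - A x(t)|, with x' from a central difference."""
    p, q = x(t + h), x(t - h)
    deriv = ((p[0] - q[0]) / (2 * h), (p[1] - q[1]) / (2 * h))
    v = x(t)
    rhs = (A[0][0] * v[0] + A[0][1] * v[1], A[1][0] * v[0] + A[1][1] * v[1])
    return max(abs(deriv[0] - rhs[0]), abs(deriv[1] - rhs[1]))


for t in (0.0, 0.5, 1.0):
    print(t, residual(x1, t), residual(x2, t))   # all residuals are tiny
```

At t = 0 the two solutions equal (1, 0) and (0, 1), so they are linearly independent and any initial condition can be matched by a suitable choice of c1 and c2.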


Repeated Eigenvalues

1. Repeated Eigenvalues

Again, we start with the real 2 × 2 system

\[ \dot{\mathbf{x}} = A\mathbf{x}. \qquad (1) \]

We say an eigenvalue λ1 of A is repeated if it is a multiple root of the characteristic equation of A; in our case, as this is a quadratic equation, the only possible case is when λ1 is a double real root.

We need to find two linearly independent solutions to the system (1). We can get one solution in the usual way. Let v1 be an eigenvector corresponding to λ1. This is found by solving the system

(A − λ1 I) a = 0. (2)

This gives the solution $\mathbf{x}_1 = e^{\lambda_1 t}\mathbf{v}_1$ to the system (1). Our problem is to find a second solution. To do this we have to distinguish two cases, called complete and defective. The first one is easier, especially in the 2 × 2 case.

A. The complete case. Still assuming λ1 is a real double root of the characteristic equation of A, we say λ1 is a complete eigenvalue if there are two linearly independent eigenvectors v1 and v2 corresponding to λ1; i.e., if these two vectors are two linearly independent solutions to the system (2).

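In the 2 × 2 case, this only occurs when A is a scalar matrix, that is, when A = λ1 I. In this case, A − λ1 I = 0, and every vector is an eigenvector. It is easy to find two independent solutions; the usual choices are

\[ e^{\lambda_1 t} \begin{pmatrix} 1 \\ 0 \end{pmatrix} \quad \text{and} \quad e^{\lambda_1 t} \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \]

So the general solution is

\[ c_1 e^{\lambda_1 t} \begin{pmatrix} 1 \\ 0 \end{pmatrix} + c_2 e^{\lambda_1 t} \begin{pmatrix} 0 \\ 1 \end{pmatrix} = e^{\lambda_1 t} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix}. \]

Of course, we could choose any other pair of independent eigenvectors to generate the solutions, e.g.,

\[ e^{\lambda_1 t} \begin{pmatrix} 5 \\ 1 \end{pmatrix} \quad \text{and} \quad e^{\lambda_1 t} \begin{pmatrix} 1 \\ -1 \end{pmatrix}. \]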

Remark. For n = 3 and above the situation is more complicated. We will not discuss it here. The interested reader can consult, for instance, the textbook by Edwards and Penney.

B. The defective case. If the eigenvalue λ1 is a double root of the characteristic equation, but the system (2) has only one non-zero solution v1 (up to constant multiples), then the eigenvalue is said to be incomplete or defective and $\mathbf{x}_1 = e^{\lambda_1 t}\mathbf{v}_1$ is the unique normal mode. However, a second order system needs two independent solutions. Our experience with repeated roots in second order ODEs suggests we try multiplying our normal solution by t. It turns out this doesn't quite work, but it can be fixed as follows: a second independent solution is given by

\[ \mathbf{x}_2 = e^{\lambda_1 t}(t\mathbf{v}_1 + \mathbf{v}_2), \qquad (3) \]

where v2 is any vector satisfying

(A − λ1 I) v2 = v1 .

(You can easily, if tediously, check by substitution that this does give a solution. You need to remember that Av1 = λ1v1.)

Fact. The equation for v2 is guaranteed to have a solution, provided that the eigenvalue λ1 really is defective. When solving for v2 = (b1, b2)T, try setting b1 = 0, and solving for b2. If that does not work, try setting b2 = 0 and solving for b1.

Remarks 1. Some people do not bother with (3). When they encounter the defective case (at least when n = 2), they give up on eigenvalues, and simply solve the original system (1) by elimination.

2. Although we will not go into it in this course, there is a well developed theory of defective matrices which gives insight into where this formula comes from. You will learn about all this when you study linear algebra.

We will now do a worked example.

2. Worked example: Defective Repeated Eigenvalue

Problem. Solve $\dot{\mathbf{u}} = A\mathbf{u}$, where

\[ A = \begin{pmatrix} -2 & 1 \\ -1 & 0 \end{pmatrix}. \]

Comments are given in italics.

Solution.

Step 0. Write down A − λI:

\[ A - \lambda I = \begin{pmatrix} -2 - \lambda & 1 \\ -1 & -\lambda \end{pmatrix}. \]

Step 1. Find the characteristic equation of A: tr(A) = −2 + 0 = −2, det(A) = −2 × 0 − 1 × (−1) = 1. Thus,

\[ p_A(\lambda) = \det(A - \lambda I) = \lambda^2 - \operatorname{tr}(A)\lambda + \det(A) = \lambda^2 + 2\lambda + 1 = 0. \]

Step 2. Find the eigenvalues of A. The characteristic polynomial factors: $p_A(\lambda) = (\lambda + 1)^2$. This has a repeated root, λ1 = −1.

As the matrix A is not a scalar multiple of the identity matrix, we must be in the defective repeated root case.

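Step 3. Find an eigenvector. This is a vector v1 = (a1, a2)^T that must satisfy:

\[ (A + I)\mathbf{v}_1 = \mathbf{0} \iff \begin{pmatrix} -2 + 1 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix} \iff \begin{pmatrix} -1 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} a_1 \\ a_2 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \]

Check: this gives two identical equations, which is good. The equation is −a1 + a2 = 0. Setting a1 = 1 gives a2 = 1. Thus, one eigenvector for λ1 is v1 = (1, 1)^T. All other eigenvectors for λ1 are multiples of this.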

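Step 4. Find v2: This vector v2 = (b1, b2)^T must satisfy

\[ (A - \lambda_1 I)\mathbf{v}_2 = \mathbf{v}_1 \iff \begin{pmatrix} -1 & 1 \\ -1 & 1 \end{pmatrix} \begin{pmatrix} b_1 \\ b_2 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \end{pmatrix} \iff -b_1 + b_2 = 1. \]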

Setting b1 = 0 gives b2 = 1; so v2 = (0, 1)T is suitable.

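Step 5. General solution. The general solution is

\[ \mathbf{u}(t) = c_1 e^{-t} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + c_2 \left( t e^{-t} \begin{pmatrix} 1 \\ 1 \end{pmatrix} + e^{-t} \begin{pmatrix} 0 \\ 1 \end{pmatrix} \right) = e^{-t} \left( c_1 \begin{pmatrix} 1 \\ 1 \end{pmatrix} + c_2 \begin{pmatrix} t \\ 1 + t \end{pmatrix} \right). \]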
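The recipe for v2 and the check by substitution mentioned earlier can both be sketched in Python. The code below assumes the eigenvalue really is defective (so the matrix is not a multiple of the identity) and that v1 is a genuine eigenvector; the function names second_vector and second_solution are illustrative, not from the notes.

```python
import math


def second_vector(m, v1):
    """Solve (m - lam I) v2 = v1 for a defective eigenvalue lam of the 2x2 matrix m.

    Tries b1 = 0 first, then b2 = 0, as suggested in the text. In the
    defective case the entries b and c of m cannot both vanish.
    """
    (a, b), (c, d) = m
    if b != 0:
        return (0, v1[0] / b)    # b1 = 0: first row reads b * b2 = v1[0]
    return (v1[1] / c, 0)        # b2 = 0: second row reads c * b1 = v1[1]


def second_solution(lam, v1, v2):
    """Return x2(t) = e^(lam t)(t v1 + v2) and its derivative, both as functions of t."""
    def x(t):
        e = math.exp(lam * t)
        return (e * (t * v1[0] + v2[0]), e * (t * v1[1] + v2[1]))

    def dx(t):
        e = math.exp(lam * t)
        return (e * (lam * (t * v1[0] + v2[0]) + v1[0]),
                e * (lam * (t * v1[1] + v2[1]) + v1[1]))

    return x, dx


A = [[-2, 1], [-1, 0]]
v1 = (1, 1)
v2 = second_vector(A, v1)
print(v2)                                  # (0, 1.0)

x, dx = second_solution(-1, v1, v2)
p = x(0.7)
Ax = (A[0][0] * p[0] + A[0][1] * p[1], A[1][0] * p[0] + A[1][1] * p[1])
print(dx(0.7), Ax)                         # the two pairs agree up to rounding
```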


Introduction

Up to now we have handled systems analytically, concentrating on a procedure for solving linear systems with constant coefficients. In this session, we consider methods for sketching graphs of the solutions.

The emphasis is on the word sketching. Computers do the work of drawing reasonably accurate graphs. Here we want to see how to get quick qualitative information about the graph, without having to actually calculate points on it. These graphs of the solutions (also called the trajectories of the system) are called phase portraits.

In this session we consider 2 × 2 linear homogeneous systems x′ = Ax. In a later session we extend this program, known as the phase-plane analysis, to more general non-linear 2 × 2 DE systems.

The analysis of the linear case will be the foundation for the more general program, so it is very important that we understand this case well. For that reason, this session is somewhat long: we wish to have the linear case worked out in detail so that we can refer back to it later as needed.

The Eigenvalues Rule. There are a lot of details in this session. The one key fact tying them all together is the eigenvalues rule:

We classify the linear phase portraits according to the eigenvalues of the matrix A.


The Phase Plane

The sort of system for which we will be trying to sketch solutions can be written in the form

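\[ \begin{aligned} x' &= ax + by \\ y' &= cx + dy \end{aligned} \iff \mathbf{x}' = A\mathbf{x}, \quad \text{where } A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}, \qquad (1) \]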

where a, b, c, d are constants.

A solution of this system has the form (we write it two ways)

\[ \mathbf{x}(t) = \begin{pmatrix} x(t) \\ y(t) \end{pmatrix}, \qquad \begin{aligned} x &= x(t) \\ y &= y(t). \end{aligned} \qquad (2) \]

It is a vector function of t whose components satisfy the system (1) when they are substituted in for x and y. In general, you learned in 18.02 and physics that such a vector function describes a motion in the xy-plane; the equations in (2) tell how the point (x, y) moves in the xy-plane as the time t varies. The moving point traces out a curve called the trajectory of the solution (2). The xy-plane itself is called the phase plane for the system (1). We show a sketch of a trajectory at right. Notice the arrow is used to indicate the direction of increasing time.

We use the term phase portrait to mean the graphs of enough trajectories to give a good sense of all the solutions to the system (1).

1. Critical Points

Definition. A critical point is a point where the derivatives are 0. Therefore a point (x0, y0) is a critical point of the system (1) if

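\[ \begin{pmatrix} x' \\ y' \end{pmatrix} = A \begin{pmatrix} x_0 \\ y_0 \end{pmatrix} = \begin{pmatrix} 0 \\ 0 \end{pmatrix}. \]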

The equations of the system (1) show this is equivalent to

x = x0, y = y0 is a (constant) solution to (1).

Critical points are the key to our qualitative view of systems. We classify the linear systems by their behavior near critical points.

For the linear constant coefficient system (1) there is always a critical point at (0, 0). If the matrix A is invertible then this is the only critical point.


2. Sketching Principle

When sketching integral curves for direction fields we saw that integral curves did not cross. For the system (1) we have a similar principle.

Sketching Principle. Two trajectories of (1) cannot intersect.


Sketching the Basic Linear Systems

In this note we will only consider linear systems of the form x' = Ax. Such a system always has a critical point at the origin.

We start by sketching a few of the simple examples, so as to get an idea of the various possibilities for their trajectories. We will also introduce the terminology used to describe the resulting geometric pictures.

Example 1. Let’s consider the linear system on the left below. Its characteristic equation is $\lambda^2 - 1 = 0$, so the eigenvalues are ±1. It is easy to see its general solution is the one on the right below:

$$x' = y, \quad y' = x; \qquad \mathbf{x} = c_1 \begin{pmatrix} 1 \\ 1 \end{pmatrix} e^{t} + c_2 \begin{pmatrix} 1 \\ -1 \end{pmatrix} e^{-t}. \tag{1}$$

The only critical point of the system is (0, 0).

Looking at the general solution in (1), we see that by giving one of the c’s the value 0 and the other one the value 1 or −1, we get four easy solutions:

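$$\begin{pmatrix} 1 \\ 1 \end{pmatrix} e^{t}, \quad -\begin{pmatrix} 1 \\ 1 \end{pmatrix} e^{t}, \quad \begin{pmatrix} 1 \\ -1 \end{pmatrix} e^{-t}, \quad -\begin{pmatrix} 1 \\ -1 \end{pmatrix} e^{-t}.$$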

These four solutions give four trajectories which are easy to plot. Consider the first, for example. When t = 0, the point is at (1, 1). As t increases, the point moves outward along the line y = x. As t decreases through negative values, the point moves inwards along the line, toward (0, 0). Since t is always understood to be increasing on the trajectory, the whole trajectory consists of the ray y = x in the first quadrant, excluding the origin (which is not reached in finite negative time), with an outward direction of motion.

A similar analysis can be made for the other three solutions; see figure 1 below.


Figure 1

As you can see, each of the four solutions has as its trajectory one of the four rays. The indicated direction of motion is outward or inward according to whether the exponential factor increases or decreases as t increases. There is even a fifth trajectory: the origin itself, which is a stationary point, i.e., a solution all by itself. So the intersecting diagonal lines represent five trajectories, no two of which intersect.

For the other trajectories we can do a little algebra: We have

$$x = c_1 e^{t} + c_2 e^{-t}, \qquad y = c_1 e^{t} - c_2 e^{-t}.$$

This easily gives $x^2 - y^2 = 4c_1c_2$, a constant, which is the equation of a hyperbola oriented with the axes; we sketch in some of the hyperbolas. We know which direction to point the arrows indicating the direction of motion as t increases, since they must be compatible with the motion along the rays: by continuity, nearby trajectories must have arrowheads pointing in similar directions. The only possibility therefore is the one shown in figure 1.
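The algebra can be spot-checked numerically. The following is a minimal Python sketch, with arbitrarily chosen constants c1 and c2 and sample times (all assumptions for illustration), that evaluates x² − y² along one solution and compares it with 4·c1·c2.

```python
import math

def saddle_solution(c1, c2, t):
    """Point (x, y) at time t on a solution of x' = y, y' = x."""
    x = c1 * math.exp(t) + c2 * math.exp(-t)
    y = c1 * math.exp(t) - c2 * math.exp(-t)
    return x, y

def invariant(c1, c2, t):
    """x^2 - y^2 along the trajectory; expected to equal 4*c1*c2 for every t."""
    x, y = saddle_solution(c1, c2, t)
    return x * x - y * y

# Illustrative constants and sample times (not from the notes).
c1, c2 = 0.5, 2.0
values = [invariant(c1, c2, t) for t in (-1.0, 0.0, 0.7, 1.5)]
```

Each entry of `values` should agree with 4·c1·c2 up to floating-point rounding, which is what places the moving point on a single hyperbola.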

A linear system whose trajectories show the general features of those in fig. 1 is said to be an unstable saddle. It is called unstable because the trajectories go off to infinity as t increases (there are three exceptions: what are they?). It is called a saddle because of its general resemblance to the level curves of a saddle-shaped surface in 3-space.

Example 2. This time we consider the linear system below. Since it is decoupled, its general solution (on the right) can be obtained easily by inspection:

$$x' = -x, \quad y' = -2y; \qquad \mathbf{x} = c_1 \begin{pmatrix} 1 \\ 0 \end{pmatrix} e^{-t} + c_2 \begin{pmatrix} 0 \\ 1 \end{pmatrix} e^{-2t}. \tag{2}$$

It is immediate that $x = c_1 e^{-t}$ and $y = c_2 e^{-2t}$ implies

$$y = cx^2.$$

That is, the trajectories are a family of parabolas.

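Following the same plan as in Example 1, we single out the four solutions

$$\begin{pmatrix} 1 \\ 0 \end{pmatrix} e^{-t}, \quad -\begin{pmatrix} 1 \\ 0 \end{pmatrix} e^{-t}, \quad \begin{pmatrix} 0 \\ 1 \end{pmatrix} e^{-2t}, \quad -\begin{pmatrix} 0 \\ 1 \end{pmatrix} e^{-2t}.$$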

Their trajectories are the four rays along the coordinate axes, the motion being always inward as t increases. Put compatible arrowheads on the parabolas and you get figure 2.

A linear system whose trajectories have the general shape of those in fig. 2 is called an asymptotically stable node or a sink node. The word node is used when the trajectories have a roughly parabolic shape (or exceptionally, they are rays); asymptotically stable or sink means that all the trajectories approach the critical point as t increases.

Figures 2 and 3. Trajectories from examples 2 and 3

Example 3. This is the same as Example 2, except that the signs are reversed:

$$x' = x, \quad y' = 2y; \qquad \mathbf{x} = c_1 \begin{pmatrix} 1 \\ 0 \end{pmatrix} e^{t} + c_2 \begin{pmatrix} 0 \\ 1 \end{pmatrix} e^{2t}. \tag{3}$$



The first order differential equation remains the same, so we get the same parabolas. The only difference in the work is that the exponentials now have positive exponents. The picture remains exactly the same except that now the trajectories are all traversed in the opposite direction — away from the origin — as t increases. The resulting picture is fig. 3, which we call an unstable node or source node.

Example 4. A different type of simple system (eigenvalues ±i) and its solution is

$$x' = y, \quad y' = -x; \qquad \mathbf{x} = c_1 \begin{pmatrix} \sin t \\ \cos t \end{pmatrix} + c_2 \begin{pmatrix} \cos t \\ -\sin t \end{pmatrix}. \tag{4}$$

For this example let’s see a different way of finding the trajectories. Dividing y'/x' converts the system to a separable first order ODE:

$$\frac{dy/dt}{dx/dt} = \frac{dy}{dx} = -\frac{x}{y}, \qquad x^2 + y^2 = c.$$

The trajectories are the family of circles centered at the origin. To determine the direction of motion, look at the solution in (4) for which c1 = 0, c2 = 1, namely (cos t, −sin t). Notice that it is the reflection in the x-axis of the usual (counterclockwise) parametrization of the circle; hence the motion is clockwise around the circle. An even simpler procedure is to determine a single vector in the velocity field; that’s enough to determine all of the directions. For example, the velocity vector at (1, 0) is ⟨0, −1⟩ = −j, again showing the motion is clockwise. (The vector is drawn in on fig. 4, which illustrates the trajectories.)
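The single-velocity-vector test can be written as a short Python sketch. It is meant for centers and spirals, and it uses the sign of the quantity x·y' − y·x' (the z-component of position × velocity) at one test point; the function names and the default test point (1, 0) are assumptions made for illustration.

```python
def velocity(A, point):
    """Velocity vector Ax at the given point; A is a 2x2 tuple of rows."""
    (a, b), (c, d) = A
    x, y = point
    return (a * x + b * y, c * x + d * y)

def rotation_sense(A, point=(1.0, 0.0)):
    """Sense of rotation read off from one velocity vector.

    Intended for systems with non-real eigenvalues (centers and spirals).
    """
    x, y = point
    vx, vy = velocity(A, point)
    cross = x * vy - y * vx   # z-component of position x velocity
    if cross > 0:
        return "counterclockwise"
    if cross < 0:
        return "clockwise"
    return "undetermined"

example4 = ((0.0, 1.0), (-1.0, 0.0))   # x' = y, y' = -x
```

For `example4` the velocity at (1, 0) is (0, −1), so the sketch reports clockwise motion, in agreement with the text.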

This type of linear system is called a stable center . The word stable signifies that any trajectory stays within a bounded region of the phase plane as t increases or decreases indefinitely. (We cannot use “asymptotically stable,” since the trajectories do not approach the critical point (0, 0) as t increases.) The word center describes the geometric configuration: it would be used also if the curves were ellipses having the origin as center.

Figures 4, 5 and 6. Trajectories from examples 4 and 5


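Example 5. As a last example, a system having a complex eigenvalue λ = −1 + i is, with its general solution,

$$x' = -x + y, \quad y' = -x - y; \qquad \mathbf{x} = c_1 e^{-t} \begin{pmatrix} \sin t \\ \cos t \end{pmatrix} + c_2 e^{-t} \begin{pmatrix} \cos t \\ -\sin t \end{pmatrix}. \tag{5}$$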

The two fundamental solutions (using c1 = 0 and c2 = 1, and vice-versa) are typical. They are like the solutions in example 4, but multiplied by $e^{-t}$. Their trajectories are therefore traced out by the tip of an origin vector that rotates clockwise at a constant rate, while its magnitude shrinks exponentially to 0. In other words, the trajectories spiral in toward the origin as t increases. We call this pattern an asymptotically stable spiral or a sink spiral; see fig. 6. (An older terminology uses focus instead of spiral.)

To determine the direction of motion, it is simplest to do what we did in the previous example: determine from the ODE system a single vector of the velocity field. For instance, the system (5) has at (1, 0) the velocity vector −i − j, which shows that the motion is clockwise.

For the system x' = x + y, y' = −x + y, an eigenvalue is λ = 1 + i, and in (5) $e^{t}$ replaces $e^{-t}$. The magnitude of the rotating vector increases as t increases, giving as pattern an unstable spiral, or source spiral, as in fig. 6.


Sketching More General Linear Systems

In the preceding section we sketched trajectories for some particular linear systems. They were chosen to illustrate the different possible geometric pictures. Based on that experience, we can now describe how to sketch the general system

x' = Ax, A = 2 × 2 constant matrix.

The geometric picture is largely determined by the eigenvalues and eigenvectors of A, so there are several cases.

For the first group of cases, we suppose the eigenvalues λ1 and λ2 are real and distinct.

Case 1. The λi have opposite signs: λ1 > 0, λ2 < 0 ; unstable saddle.

Suppose the corresponding eigenvectors are $\vec{\alpha}_1$ and $\vec{\alpha}_2$, respectively. Then four solutions to the system are

$$\mathbf{x} = \pm\vec{\alpha}_1 e^{\lambda_1 t}, \qquad \mathbf{x} = \pm\vec{\alpha}_2 e^{\lambda_2 t}. \tag{1}$$

How do the trajectories of these four solutions look?

In figure 1 below, the four vectors $\pm\vec{\alpha}_1$ and $\pm\vec{\alpha}_2$ are drawn as origin vectors. In figure 2, the corresponding four trajectories are shown as solid lines, with the direction of motion as t increases shown by arrows on the lines. The reasoning behind this is the following.

Look first at $\mathbf{x} = \vec{\alpha}_1 e^{\lambda_1 t}$. We think of $e^{\lambda_1 t}$ as a scalar factor changing the length of x; that is, as t increases from −∞ to ∞, this scalar factor increases from 0 to ∞, since λ1 > 0. The tip of this lengthening vector represents the trajectory of the solution $\mathbf{x} = \vec{\alpha}_1 e^{\lambda_1 t}$, which is therefore a ray going out from the origin in the direction of the vector $\vec{\alpha}_1$.

Similarly, the trajectory of $\mathbf{x} = -\vec{\alpha}_1 e^{\lambda_1 t}$ is a ray going out from the origin in the opposite direction: that of the vector $-\vec{\alpha}_1$.

The trajectories of the other two solutions $\mathbf{x} = \pm\vec{\alpha}_2 e^{\lambda_2 t}$ will be similar, except that since λ2 < 0, the scalar factor $e^{\lambda_2 t}$ decreases as t increases. Thus the solution vector will be shrinking as t increases. The trajectory traced out by its tip will be a ray having the direction of $\vec{\alpha}_2$ or $-\vec{\alpha}_2$, but traversed toward the origin as t increases, getting arbitrarily close but never reaching it in finite time.


To complete the picture, we sketch some nearby trajectories. These will be smooth curves generally following the directions of the four rays described above. In example 1 in the previous note they were hyperbolas. In general they are not, but they look something like hyperbolas, and they do have the rays as asymptotes. They are the trajectories of the solutions

$$\mathbf{x} = c_1\vec{\alpha}_1 e^{\lambda_1 t} + c_2\vec{\alpha}_2 e^{\lambda_2 t}, \tag{2}$$

for different values of the constants c1 and c2.

Figures 1 and 2. Trajectories for case 1: saddle

Case 2. λ1 and λ2 are distinct and negative: say λ1 < λ2 < 0; asymptotically stable (sink) node

Formally, the solutions (1) are written the same way and we draw their trajectories just as before. The only difference is that now all four trajectories are represented by rays coming in towards the origin as t increases, since both of the λi are negative. The four trajectories are represented as solid lines in figure 3, on the next page.

The trajectories of the other solutions (2) will be smooth curves which generally follow the four rays. In the corresponding example 2 from the previous note, they were parabolas; here too they will be parabola-like, but this does not tell us how to draw them, so a little more thought is needed. The parabolic curves will certainly come in to the origin as t increases, but tangent to which of the rays? Briefly, the answer is this:

Node-sketching principle. Near the origin, the trajectories follow the ray attached to the λi nearer to zero; far from the origin, they follow (i.e. are roughly parallel to) the ray attached to the λi further from zero.

You need not memorize the above. Instead learn the reasoning on which it is based, since this type of argument will be used over and over in science and engineering work having nothing to do with differential equations.

Since we are assuming λ1 < λ2 < 0, it is λ2 which is closer to 0. We want to know the behavior of the solutions near the origin and far from the origin. Since all solutions are approaching the origin, near the origin corresponds to large positive t (we write t ≫ 1), and far from the origin corresponds to large negative t (written t ≪ −1).

As before, the general solution has the form

$$\mathbf{x} = c_1\vec{\alpha}_1 e^{\lambda_1 t} + c_2\vec{\alpha}_2 e^{\lambda_2 t}, \qquad \lambda_1 < \lambda_2 < 0. \tag{3}$$

If t ≫ 1, then x is near the origin, since both terms in (3) are small. However, the first term is negligible compared with the second: since λ1 − λ2 < 0, we have

$$\frac{e^{\lambda_1 t}}{e^{\lambda_2 t}} = e^{(\lambda_1 - \lambda_2)t} \approx 0, \qquad t \gg 1. \tag{4}$$

Thus if λ1 < λ2 < 0 and t ≫ 1, we can neglect the first term of (3), getting

$$\mathbf{x} \sim c_2\vec{\alpha}_2 e^{\lambda_2 t} \quad \text{for } t \gg 1 \ (\mathbf{x} \text{ near the origin}),$$

which shows that x(t) follows the ray corresponding to the eigenvalue λ2 closer to zero.

Similarly, if t ≪ −1, then x is far from the origin since both terms in (3) are large. This time the ratio in (4) is large, so that it is the first term in (3) that dominates the expression, which tells us that

$$\mathbf{x} \sim c_1\vec{\alpha}_1 e^{\lambda_1 t} \quad \text{for } t \ll -1 \ (\mathbf{x} \text{ far from the origin}).$$

This explains the reasoning behind the node-sketching principle in this case.
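The dominance argument can be illustrated numerically. The sketch below is a minimal Python example in which the eigenvalues (−3 and −1), the eigenvectors, the constants c1 = c2 = 1, and the sample times t = ±10 are all hypothetical values chosen for illustration. It measures how closely the solution vector lines up with each eigenvector direction.

```python
import math

def solution(t, c1, c2, lam1, lam2, alpha1, alpha2):
    """x(t) = c1*alpha1*exp(lam1*t) + c2*alpha2*exp(lam2*t), as a 2-tuple."""
    e1, e2 = math.exp(lam1 * t), math.exp(lam2 * t)
    return (c1 * alpha1[0] * e1 + c2 * alpha2[0] * e2,
            c1 * alpha1[1] * e1 + c2 * alpha2[1] * e2)

def alignment(v, w):
    """|cos(angle)| between v and w; a value of 1.0 means parallel."""
    dot = v[0] * w[0] + v[1] * w[1]
    return abs(dot) / (math.hypot(*v) * math.hypot(*w))

# Hypothetical data with lam1 < lam2 < 0.
lam1, lam2 = -3.0, -1.0
alpha1, alpha2 = (1.0, 2.0), (1.0, -1.0)

near = solution(10.0, 1.0, 1.0, lam1, lam2, alpha1, alpha2)   # t >> 1
far = solution(-10.0, 1.0, 1.0, lam1, lam2, alpha1, alpha2)   # t << -1
```

With these values, `alignment(near, alpha2)` and `alignment(far, alpha1)` are both very close to 1: near the origin the solution follows the ray for the eigenvalue nearer to zero, and far away it follows the other ray.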

Some of the trajectories of the solutions (3) are sketched in dashed lines in figure 3, using the node-sketching principle, and assuming λ1 < λ2 < 0.

Figures 3, 4 and 5. Trajectories for source and sink nodes

Case 3. λ1 and λ2 are distinct and positive: say λ1 > λ2 > 0; unstable (source) node



The analysis is like the one we gave above. The direction of motion on the four rays coming from the origin is outwards, since the λi > 0. The node-sketching principle is still valid and the reasoning for it is like the reasoning in case 2. The resulting sketch looks like the one in fig. 5.

Case 4. Eigenvalues are pure imaginary: λ = ±bi, b > 0; stable center

Here the solutions to the linear system have the form

x = c1 cos bt + c2 sin bt, c1, c2 constant vectors . (5)

(There is no exponential factor since the real part of λ is zero.) Since every solution (5) is periodic, with period 2π/b, the moving point representing it retraces its path at intervals of 2π/b. The trajectories therefore are closed curves; ellipses, in fact; see fig. 7.

Sketching the ellipse is a little troublesome, since the vectors ci do not have any simple relation to the major and minor axes of the ellipse. For this course, it will be enough if you determine whether the motion is clockwise or counterclockwise. As in example 4 in the previous note, this can be done by using the system x' = Ax to calculate a single velocity vector x' of the velocity field. From this the sense of motion can be determined by inspection.

The word stable means that each trajectory stays for all time within some circle centered at the critical point. Asymptotically stable is a stronger requirement: each trajectory must approach the critical point (here, the origin) as t → ∞.

Case 5. The eigenvalues are complex, but not purely imaginary. There are two cases:

a ± bi, a < 0, b > 0; asymptotically stable (sink) spiral;

a ± bi, a > 0, b > 0; unstable (source) spiral.

Here the solutions to the linear system have the form

$$\mathbf{x} = e^{at}(\mathbf{c}_1 \cos bt + \mathbf{c}_2 \sin bt), \qquad \mathbf{c}_1, \mathbf{c}_2 \text{ constant vectors}. \tag{6}$$

They look like the solutions (5), except for a scalar factor $e^{at}$ which either

decreases towards 0 as t → ∞ (a < 0), or

increases towards ∞ as t → ∞ (a > 0).


Thus the point x travels in a trajectory which is like an ellipse, except that the distance from the origin is steadily shrinking or expanding. The result is a trajectory which does one of the following:

spirals steadily towards the origin (asymptotically stable spiral): a < 0;

spirals steadily away from the origin (unstable spiral): a > 0.

The exact shape of the spiral is not obvious and perhaps best left to computers. You should determine the direction of motion by calculating from the linear system x' = Ax a single velocity vector x' near the origin. Typical spirals are pictured (figs. 7, 8).

Figures 6, 7 and 8. Trajectories for centers and spirals

Other cases.

Repeated real eigenvalue λ ≠ 0, defective (incomplete: one independent eigenvector): defective node; unstable if λ > 0; asymptotically stable if λ < 0 (fig. 9).

Repeated real eigenvalue λ ≠ 0, complete (two independent eigenvectors): star node; unstable if λ > 0; asymptotically stable if λ < 0 (fig. 10).

One eigenvalue λ = 0. (Picture left for exercises and problem sets.)

Figure 9. Stable and unstable nodes

Figure 10. Star nodes


Summary

In summary, the procedure of sketching trajectories of the 2 × 2 linear homogeneous system x' = Ax, where A is a constant matrix, is the following.

Begin by finding the eigenvalues of A.

1. If they are real, distinct, and non-zero:

a) find the corresponding eigenvectors;

b) draw in the corresponding solutions whose trajectories are rays. Use the sign of the eigenvalue to determine the direction of motion as t increases; indicate it with an arrowhead on the ray;

c) draw in some nearby smooth curves, with arrowheads indicating the direction of motion:

(i) if the eigenvalues have opposite signs, this is easy;

(ii) if the eigenvalues have the same sign, determine which is the dominant term in the solution for t ≫ 1 and t ≪ −1, and use this to determine which rays the trajectories are tangent to, near the origin, and which rays they are parallel to, away from the origin. (Or use the node-sketching principle.)

2. If the eigenvalues are complex, a ± bi, the trajectories will be

a) ellipses if a = 0;

b) spirals if a ≠ 0; inward if a < 0, outward if a > 0.

In all cases, determine the direction of motion by using the system x' = Ax to find one velocity vector.

3. The details in the other cases (eigenvalues repeated, or zero) will be left as exercises using the reasoning in this note.
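The first step of the procedure, finding the eigenvalues and reading off the type, can be sketched in Python as follows. This is only an illustration: the function names, the tolerance `tol`, and the returned labels are assumptions, and the repeated and zero eigenvalue cases are lumped together as in item 3.

```python
import cmath

def eigenvalues(a, b, c, d):
    """Roots of lambda^2 - (a + d)*lambda + (a*d - b*c) = 0, as complex numbers."""
    trace = a + d
    det = a * d - b * c
    root = cmath.sqrt(trace * trace - 4 * det)
    return (trace + root) / 2, (trace - root) / 2

def classify(a, b, c, d, tol=1e-12):
    """Name the phase portrait of x' = ax + by, y' = cx + dy."""
    l1, l2 = eigenvalues(a, b, c, d)
    if abs(l1.imag) > tol:                      # complex pair a ± bi
        if abs(l1.real) <= tol:
            return "center"
        return "spiral sink" if l1.real < 0 else "spiral source"
    r1, r2 = l1.real, l2.real                   # real eigenvalues
    if abs(r1 - r2) <= tol or abs(r1) <= tol or abs(r2) <= tol:
        return "other case (repeated or zero eigenvalue)"
    if r1 * r2 < 0:
        return "saddle"
    return "nodal sink" if r1 < 0 else "nodal source"
```

Applied to the systems of Examples 1, 2, 4 and 5 of the earlier note, this sketch returns saddle, nodal sink, center and spiral sink respectively. The eigenvectors and the direction of motion still have to be found separately.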


Trace-Determinant Diagram

Recall that the characteristic polynomial of a square matrix A is defined to be

p(λ) = det(A − λI).

For a 2 × 2 matrix $A = \begin{pmatrix} a & b \\ c & d \end{pmatrix}$, we have

$$p(\lambda) = \begin{vmatrix} a - \lambda & b \\ c & d - \lambda \end{vmatrix} = \lambda^2 - (a + d)\lambda + (ad - bc).$$

If we now recall the definitions of trace and determinant for a 2 × 2 matrix A from the linear algebra and matrix review given at the end of the previous session on Matrix Methods, namely tr A = a + d and det A = ad − bc, we see that we can write $p(\lambda) = \lambda^2 - (\operatorname{tr} A)\lambda + \det A$.

Now if we use the abbreviations T = tr A and D = det A, we can write the characteristic polynomial as $p(\lambda) = \lambda^2 - T\lambda + D$. The eigenvalues are the roots of p(λ), so the quadratic formula immediately gives us that the eigenvalues will be real and distinct if and only if the discriminant $T^2 - 4D > 0$, and complex (non-real) if and only if $T^2 - 4D < 0$. The separating curve $D = T^2/4$ is shown on the trace-determinant graph below.

Then looking at the full quadratic formula for p(λ) = 0,

$$\lambda = \frac{T \pm \sqrt{T^2 - 4D}}{2},$$

we can determine the conditions for the signs in the case of real eigenvalues and also the signs of the real part for the complex case. We leave this as an exercise (not difficult and highly recommended) for the reader. The results are as follows:

1. If D < 0, the eigenvalues are real and of opposite sign, and the phase portrait is a saddle (which is always unstable).

2. If $0 < D < T^2/4$, the eigenvalues are real, distinct, and of the same sign, and the phase portrait is a node, stable if T < 0, unstable if T > 0.

3. If $0 < T^2/4 < D$, the eigenvalues are neither real nor purely imaginary, and the phase portrait is a spiral, stable if T < 0, unstable if T > 0.
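The three results above can be collected into a short Python sketch that works directly from T and D. The function name and the returned labels are illustrative assumptions; every point lying exactly on a boundary (D = 0, D = T²/4, or T = 0 with D > 0) is reported simply as a borderline case.

```python
def classify_trace_det(T, D):
    """Phase portrait type of x' = Ax from T = tr A and D = det A."""
    if D < 0:
        return "saddle"                 # real eigenvalues of opposite sign
    disc = T * T - 4 * D                # discriminant of the characteristic polynomial
    if D == 0 or disc == 0 or T == 0:
        return "borderline case"        # on one of the boundary curves
    kind = "node" if disc > 0 else "spiral"
    stability = "stable" if T < 0 else "unstable"
    return stability + " " + kind
```

For instance, the matrix with rows (−1, 1) and (−1, −1) has T = −2 and D = 2, so T²/4 = 1 < D and the sketch returns a stable spiral.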

Sketching this information in on the T−D graph gives the trace-determinant diagram below. The boundary cases, where either of the inequalities becomes an equality and/or where T = 0 or D = 0, are called the “borderline cases.” We will discuss these further in a later session.

The Mathlets Linear Phase Portraits: Cursor Entry and Linear Phase Portraits: Matrix Entry will allow you to explore the classification of the solution types provided by the T − D diagram interactively, and are highly recommended.

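[Figure: trace-determinant diagram, with horizontal axis T = tr A and vertical axis D = det A. Region and curve labels: saddle, nodal sink, nodal source, spiral sink, spiral source, center, defective or star node, degenerate.]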




http://ocw.mit.edu


Appendix: A Computer-Generated Portrait Gallery

There are a number of public-domain computer programs which produce phase portraits for 2 × 2 autonomous systems. One has the option of displaying the trajectories with or without the “little” arrows of the vector direction field. We first give an example where the direction field is included, and then a portrait gallery with only the trajectories themselves. In the pictures which illustrate these cases, only a few trajectories are shown, but these are sufficient to show the qualitative behavior of the system.

A much richer understanding of this gallery can be achieved using the Mathlets Linear Phase Portraits: Cursor Entry and Linear Phase Portraits: Matrix Entry.

[Figure: vector field x' = y, y' = −x, plotted for x and y from −3 to 3.]

Example (direction field included).

$$\begin{pmatrix} x \\ y \end{pmatrix}' = \begin{pmatrix} y \\ -x \end{pmatrix}.$$

General solution: x = c1 cos t + c2 sin t, y = −c1 sin t + c2 cos t.

As we saw before, the trajectories are circles. We know the direction is clockwise by looking at a single tangent vector.

The portrait gallery.

1. A has real eigenvalues with two independent eigenvectors. Let λ1, λ2 be the eigenvalues and v1 and v2 the corresponding eigenvectors. ⇒ the general solution to the system is $\mathbf{x} = c_1 e^{\lambda_1 t}\mathbf{v}_1 + c_2 e^{\lambda_2 t}\mathbf{v}_2$.

i) λ1 > λ2 > 0. Unstable nodal source. As t → ∞ the term $e^{\lambda_1 t}\mathbf{v}_1$ dominates and x → ∞. As t → −∞ the term $e^{\lambda_2 t}\mathbf{v}_2$ dominates and x → 0.

[Figure: phase portrait for case (i), unstable nodal source; axes x and y.]

ii) λ1 < λ2 < 0. Stable nodal sink (asymptotically stable). (Simply reverse the arrows on case (i).) As t → ∞ the term $e^{\lambda_2 t}\mathbf{v}_2$ dominates and x → 0. As t → −∞ the term $e^{\lambda_1 t}\mathbf{v}_1$ dominates and x → ∞.

iii) λ1 > 0 > λ2. Saddle, unstable. As t → ∞ the term $e^{\lambda_1 t}\mathbf{v}_1$ dominates and x → ∞. As t → −∞ the term $e^{\lambda_2 t}\mathbf{v}_2$ dominates and x → ∞.

iv) λ1 = 0 > λ2. Line of critical points. The critical points are not isolated: they lie on the line through 0 with direction v1. The general solution is $\mathbf{x} = c_1\mathbf{v}_1 + c_2 e^{\lambda_2 t}\mathbf{v}_2$. As t → ∞, x → c1v1 along a line parallel to v2.

v) λ1 = 0 < λ2. Line of critical points. (Simply reverse the arrows in case (iv).) The critical points are not isolated: they lie on the line through 0 with direction v1. The general solution is $\mathbf{x} = c_1\mathbf{v}_1 + c_2 e^{\lambda_2 t}\mathbf{v}_2$. As t → ∞, x → ∞ along a line parallel to v2.

[Figures: phase portraits for cases (ii) through (v); axes x and y.]


vi) λ1 = λ2 > 0. Star nodal source (unstable). Let λ = λ1 = λ2. Two independent eigenvectors ⇒ A is a scalar matrix ⇒ $\mathbf{x} = e^{\lambda t}\vec{c}$. That is, all trajectories are straight rays. As t → ∞, x → ∞ along a line from 0. As t → −∞, x → 0.

vii) λ1 = λ2 < 0. Star nodal sink (asymptotically stable). (Simply reverse the arrows in case (vi).)

[Figures: phase portraits for cases (vi) and (vii), star nodes; axes x and y.]

viii) λ1 = λ2 = 0. Non-isolated critical points. Every point is a critical point, every trajectory is a point.

2. Real defective case (repeated eigenvalue, only one eigenvector). Let λ be the eigenvalue and v1 the corresponding eigenvector. Let v2 be a generalized eigenvector associated with v1.

⇒ the general solution to the system is $\mathbf{x} = e^{\lambda t}(c_1\mathbf{v}_1 + c_2(t\mathbf{v}_1 + \mathbf{v}_2))$.

i) λ > 0. Defective source node (unstable). As t → ∞ the term tv1 dominates and x → ∞. As t → −∞ the term tv1 dominates and x → 0.

[Figures: phase portraits of defective source nodes; axes x and y.]

ii) λ < 0. Defective sink node (asymptotically stable). (Simply reverse the arrows in case (i).)

[Figures: phase portraits of defective sink nodes; axes x and y.]


iii) λ = 0. Line of critical points. x = c1v1 + c2(tv1 + v2). Trajectories are parallel to v1.

[Figures: phase portraits for case (iii), lines of critical points; axes x and y.]

3. Complex roots (pair of complex conjugate eigenvalues and vectors). Let an eigenvalue be λ = α + βi, with eigenvector v + iw.

⇒ $\mathbf{x} = e^{\alpha t}\left(c_1(\cos\beta t\,\mathbf{v} - \sin\beta t\,\mathbf{w}) + c_2(\sin\beta t\,\mathbf{v} + \cos\beta t\,\mathbf{w})\right)$.

The sines and cosines in the solution will cause the trajectory to spiral around the critical point.

i) Re(λ) > 0 (i.e. α > 0). Spiral source (unstable). Trajectories can spiral clockwise or counterclockwise. As t → ∞, x → ∞. As t → −∞, x → 0.

[Figures: phase portraits of spiral sources; axes x and y.]

ii) Re(λ) < 0 (i.e. α < 0). Spiral sink (asymptotically stable). (Reverse arrows from case (i).) Trajectories can spiral clockwise or counterclockwise. As t → ∞, x → 0. As t → −∞, x → ∞.

[Figures: phase portraits of spiral sinks; axes x and y.]

iii) Re(λ) = 0 (i.e. α = 0). Stable center. Trajectories can turn clockwise or counterclockwise. As t → ±∞, x follows an ellipse.

[Figures: phase portraits of centers; axes x and y.]

For the complex case you can find the direction of rotation by checking the tangent vector at one point.



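Example. $A = \begin{pmatrix} 2 & 3 \\ -3 & 2 \end{pmatrix}$ has eigenvalues 2 ± 3i. ⇒ the critical point is a spiral source.

The tangent vector at the point $\mathbf{x}_0 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}$ is $A\mathbf{x}_0 = \begin{pmatrix} 2 \\ -3 \end{pmatrix}$. ⇒ the spiral is clockwise.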


Linear Phase Portraits: Cursor Entry

Open up the applet Linear phase portraits: Cursor Entry. This applet is similar to Linear phase portraits: Matrix Entry.

In this exploration we want you to look at how the different regions in the trace-determinant diagram correspond to their phase portraits and eigenvalues.

As usual, first play with the applet a little to get familiar with it. Pay attention to the color coding.

Without using the applet, for each different region in the diagram answer the following questions:

1. Are the eigenvalues real or complex?

2. What is the sign on the real part of the eigenvalues?

3. What will the phase portrait look like?

Now check your answers by clicking in each region.

Answer the same questions for each of the red boundary lines separating the different regions.

A small movement in the trace-determinant plane should result in a small change in the eigenvalues and in the phase portrait. Verify this by dragging the cross-hairs around the diagram.

As you drag the cross-hairs across a boundary line the phase portrait will switch types. If you do it slowly you can see how the portrait on the boundary is ’right between’ the types on either side. For example, if you drag it from the second to the third quadrant across the horizontal (trace) axis, the eigenvalues change from both negative to one positive and one negative. Right on the boundary one of them is zero.

Likewise, the phase portraits change from turning in towards the origin to turning away from the origin. With the cursor right on the boundary they don’t turn, they are just straight lines heading towards the line of critical points corresponding to the eigenvector with eigenvalue zero.

Introduction

In this session we learn general results about the solutions of any n × n linear DE system (not necessarily constant coefficient).

First we will learn some general theory for linear systems. This will be familiar to you from our study of linear ODE’s.

After that we will discuss the fundamental matrix, which is an efficient way to package all the solutions of a linear system of differential equations. It will also allow us to use matrix algebra when working with such systems. We will use the fundamental matrix to derive the variation of parameters formula, which is used to solve an inhomogeneous system with arbitrary input.

Next we will define the matrix exponential and express the fundamental matrix in terms of it. The matrix exponential has many nice theoretical properties.

Finally we will recast what we’ve done in the language of decoupling which is commonly used by engineers.

This session is long and covers many important topics. It is something of a sidelight in this course and will not be used in subsequent sessions.


General Linear ODE Systems and Independent Solutions

We have studied the homogeneous system of ODE’s with constant coefficients,

$$\mathbf{x}' = A\mathbf{x}, \tag{1}$$

where A is an n × n matrix of constants (n = 2, 3). We described how to calculate the eigenvalues and corresponding eigenvectors for the matrix A, and how to use them to find n independent solutions to the system (1).

With this concrete experience in solving low-order systems with constant coefficients, what can be said when the coefficients are functions of the independent variable t? We can still write the linear system in the matrix form (1), but now the matrix entries will be functions of t:

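$$\begin{aligned} x' &= a(t)x + b(t)y \\ y' &= c(t)x + d(t)y \end{aligned}\,, \qquad \begin{pmatrix} x \\ y \end{pmatrix}' = \begin{pmatrix} a(t) & b(t) \\ c(t) & d(t) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}, \tag{2}$$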

or in more abridged notation, valid for n × n linear homogeneous systems,

$$\mathbf{x}' = A(t)\,\mathbf{x}. \tag{3}$$

Note how the matrix becomes a function of t — we call it a matrix-valued function of t, since to each value of t the function rule assigns a matrix:

$$t_0 \;\longrightarrow\; A(t_0) = \begin{pmatrix} a(t_0) & b(t_0) \\ c(t_0) & d(t_0) \end{pmatrix}$$

In the rest of this chapter we will often not write the variable t explicitly, but it is always understood that the matrix entries are functions of t.

We will sometimes use n = 2 or 3 in the statements and examples in order to simplify the exposition, but the definitions, results, and the arguments which prove them are essentially the same for higher values of n.

Definition 1 Solutions x1(t), . . . , xn(t) to (3) are called linearly dependent if there are constants ci, not all of which are 0, such that

c1x1(t) + . . . + cnxn(t) = 0, for all t. (4)

If there is no such relation, i.e., if

$$c_1\mathbf{x}_1(t) + \ldots + c_n\mathbf{x}_n(t) = 0 \ \text{ for all } t \quad\Rightarrow\quad \text{all } c_i = 0, \tag{5}$$

the solutions are called linearly independent, or simply independent.


The phrase for all t is often in practice omitted, as being understood. This can lead to ambiguity. To avoid it, we will use the symbol ≡ 0 for identically 0, meaning zero for all t; the symbol ≢ 0 means not identically 0, i.e., there is some t-value for which it is not zero. For example, (4) would be written

c1x1(t) + . . . + cnxn(t) ≡ 0 .

Theorem 1 If x1, . . . , xn is a linearly independent set of solutions to the n × n system x' = A(t)x, then the general solution to the system is

x = c1x1 + . . . + cnxn. (6)

Such a linearly independent set is called a fundamental set of solutions. This theorem is the reason for expending so much effort to find two independent solutions, when n = 2 and A is a constant matrix. In this chapter, the matrix A is not constant; nevertheless, (6) is still true.

Proof. There are two things to prove:

(a) All vector functions of the form (6) really are solutions to x' = A x.

This is the superposition principle for solutions of the system; it’s true because the system is linear. The matrix notation makes it really easy to prove. We have

$$\begin{aligned}
(c_1\mathbf{x}_1 + \ldots + c_n\mathbf{x}_n)' &= c_1\mathbf{x}_1' + \ldots + c_n\mathbf{x}_n' \\
&= c_1 A\mathbf{x}_1 + \ldots + c_n A\mathbf{x}_n, \quad \text{since } \mathbf{x}_i' = A\mathbf{x}_i; \\
&= A(c_1\mathbf{x}_1 + \ldots + c_n\mathbf{x}_n), \quad \text{by the distributive law.}
\end{aligned}$$

(b) All solutions to the system are of the form (6).

This is harder to prove and will be the main result of the next note.


The Existence and Uniqueness Theorem for Linear Systems

For simplicity, we stick with n = 2, but the results here are true for all n. There are two questions about the following general linear system that we need to consider.

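$$\begin{aligned} x' &= a(t)x + b(t)y \\ y' &= c(t)x + d(t)y \end{aligned}\,; \qquad \text{in matrix form,} \quad \begin{pmatrix} x \\ y \end{pmatrix}' = \begin{pmatrix} a(t) & b(t) \\ c(t) & d(t) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \tag{1}$$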

The first is from the previous section: to show that all solutions are of the form

$$\mathbf{x} = c_1\mathbf{x}_1 + c_2\mathbf{x}_2,$$

where the xi form a fundamental set (that is, neither xi is a constant multiple of the other). The fact that we can write down all solutions to a linear system in this way is one of the main reasons why such systems are so important.

An even more basic question for the system (1) is: how do we know that it has two linearly independent solutions? For systems with a constant coefficient matrix A, we showed in the previous chapters how to solve them explicitly to get two independent solutions. But the general non-constant linear system (1) does not have solutions given by explicit formulas or procedures.

The answers to these questions are based on the following theorem.

Theorem 2 Existence and uniqueness theorem for linear systems.

If the entries of the square matrix A(t) are continuous on an open interval I containing t0, then the initial value problem

$$\mathbf{x}' = A(t)\,\mathbf{x}, \qquad \mathbf{x}(t_0) = \mathbf{x}_0 \tag{2}$$

has one and only one solution x(t) on the interval I.

The proof is difficult and we shall not attempt it. More important is to see how it is used. The following three theorems answer the questions posed for the 2 × 2 system (1). They are true for n > 2 as well, and the proofs are analogous.

In the following theorems, we assume the entries of A(t) are continuous on an open interval I. The conclusions are then valid on the interval I; for example, I could be the whole t-axis.

Theorem 2A Linear independence theorem.


Let x1(t) and x2(t) be two solutions to (1) on the interval I, such that at some point t0 in I, the vectors x1(t0) and x2(t0) are linearly independent. Then

a) the solutions x1(t) and x2(t) are linearly independent on I, and

b) the vectors x1(t1) and x2(t1) are linearly independent at every point t1 of I.

Proof. a) By contradiction. If they were dependent on I, one would be a constant multiple of the other, say x2(t) = c1x1(t). Then x2(t0) = c1x1(t0), showing them dependent at t0. □

b) By contradiction. If there were a point t1 on I where they were dependent, say x2(t1) = c1x1(t1), then x2(t) and c1x1(t) would be solutions to (1) which agreed at t1. Hence, by the uniqueness statement in Theorem 2, x2(t) = c1x1(t) on all of I, showing them linearly dependent on I. □

Theorem 2B General solution theorem.

a) The system (1) has two linearly independent solutions.

b) If x1(t) and x2(t) are any two linearly independent solutions, then every solution x can be written in the form (3), for some choice of c1 and c2:

x = c1x1 + c2x2. (3)

Proof. Choose a point t = t0 in the interval I.

a) According to Theorem 2, there are two solutions x1, x2 to (1), satisfying respectively the initial conditions

x1(t0) = i, x2(t0) = j , (4)

where i and j are the usual unit vectors in the xy-plane. Since the two solutions are linearly independent when t = t0, they are linearly independent on I, by Theorem 2A.

b) Let u(t) be a solution to (1) on I. Since x1 and x2 are independent at t0 by Theorem 2A, using the parallelogram law of addition we can find constants c1′ and c2′ such that

u(t0) = c1′ x1(t0) + c2′ x2(t0). (5)

The vector equation (5) shows that the solutions u(t) and c1′ x1(t) + c2′ x2(t) agree at t0. Therefore by the uniqueness statement in Theorem 2, they are equal on all of I; that is,

u(t) = c1′ x1(t) + c2′ x2(t) on I. □


The Wronskian

We know that a standard way of testing whether a set of n n-vectors are linearly independent is to see if the n × n determinant having them as its rows or columns is non-zero. This is also an important method when the n-vectors are solutions to a system; the determinant is given a special name. (Again, we will assume n = 2, but the definitions and results generalize to any n.)

Definition 3 Let x1(t) and x2(t) be two 2-vector functions. We define their Wronskian to be the determinant

$$W(\mathbf{x}_1, \mathbf{x}_2)(t) = \begin{vmatrix} x_1(t) & x_2(t) \\ y_1(t) & y_2(t) \end{vmatrix} \qquad (1)$$

whose columns are the two vector functions.

The independence of the two vector functions should be connected with their Wronskian not being zero. For points, the relationship is clear. Using the result mentioned above, we can say

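$$W(\mathbf{x}_1, \mathbf{x}_2)(t_0) = \begin{vmatrix} x_1(t_0) & x_2(t_0) \\ y_1(t_0) & y_2(t_0) \end{vmatrix} = 0 \iff \mathbf{x}_1(t_0) \text{ and } \mathbf{x}_2(t_0) \text{ are dependent.} \qquad (2)$$

However, for vector functions the relationship is clear-cut only when x1 and x2 are solutions to a well-behaved ODE system. We are still considering the system

$$\begin{aligned} x' &= a(t)x + b(t)y \\ y' &= c(t)x + d(t)y \end{aligned}\ , \qquad \begin{pmatrix} x \\ y \end{pmatrix}' = \begin{pmatrix} a(t) & b(t) \\ c(t) & d(t) \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}. \qquad (3)$$

The theorem is: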

Theorem 3 Wronskian vanishing theorem.

On an interval I where the entries of A(t) are continuous, let x1 and x2 be two solutions to (3) and W(t) their Wronskian (1). Then either

a) W(t) ≡ 0 on I, and x1 and x2 are linearly dependent on I, or

b) W(t) is never 0 on I, and x1 and x2 are linearly independent on I.

Proof. Using (2), there are just two possibilities.

a) x1 and x2 are linearly dependent on I; say x2 = c1x1. In this case they are dependent at each point of I, and W(t) ≡ 0 on I, by (2).

b) x1 and x2 are linearly independent on I, in which case by Theorem 2A they are linearly independent at each point of I, and so W(t) is never zero on I, by (2). □
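As a quick numerical illustration of the two cases in the theorem, the sketch below evaluates a Wronskian at a few sample times. The system x′ = y, y′ = −x and its solutions (cos t, −sin t) and (sin t, cos t) are chosen here only as an assumed test case (they can be checked by substitution); the function names are hypothetical.

```python
import math

def wronskian(x1, x2, t):
    """Determinant whose columns are the vectors x1(t) and x2(t)."""
    a, c = x1(t)
    b, d = x2(t)
    return a * d - b * c

# Two solutions of x' = y, y' = -x (assumed test system).
def sol1(t):
    return (math.cos(t), -math.sin(t))

def sol2(t):
    return (math.sin(t), math.cos(t))

# A dependent pair: this solution is 3 times sol1.
def sol3(t):
    return (3 * math.cos(t), -3 * math.sin(t))

sample_times = [-2.0, 0.0, 0.7, 5.0]
independent_values = [wronskian(sol1, sol2, t) for t in sample_times]
dependent_values = [wronskian(sol1, sol3, t) for t in sample_times]
print(independent_values)  # case b): never zero
print(dependent_values)    # case a): zero up to rounding
```

A finite sample of times can only illustrate the dichotomy, not prove it; the proof is the argument given above.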


Existence and Uniqueness and Superposition in the General Case

We can extend the results above to the inhomogeneous case.

x′ = A(t)x (homogeneous) (H)

x′ = A(t)x + F(t) (inhomogeneous), (I)

where F(t) is the input to the system.

Linearity/superposition:

1. If x1 and x2 are solutions to (H), then x = c1x1 + c2x2 is also a solution to (H).

proof: x′ = c1x1′ + c2x2′ = c1 Ax1 + c2 Ax2 = A(c1x1 + c2x2) = Ax.

2. If xh is a solution to (H) and xp is a solution to (I), then x = xh + xp is a solution to (I).

proof: x′ = xh′ + xp′ = Axh + Axp + F = A(xh + xp) + F = Ax + F.

3. If x1′ = Ax1 + F1 and x2′ = Ax2 + F2, then x = x1 + x2 satisfies x′ = Ax + F1 + F2;

i.e., superposition of inputs leads to superposition of outputs.

Existence and uniqueness: We start with an initial time t0 and the initial value problem:

x′ = A(t)x + F(t), x(t0) = x0. (IVP)

Theorem: If A(t) and F(t) are continuous on an open interval I containing t0, then there exists a unique solution to (IVP) on I.


Fundamental Matrices

In the literature, solutions to linear systems often are expressed using square matrices rather than vectors. This is an elegant bookkeeping technique and a very compact, efficient way to express these formulas. As before, we state the definitions and results for a 2 × 2 system, but they generalize immediately to n × n systems.

We return to the system

x′ = A(t) x , (1)

with the general solution

x = c1x1(t) + c2x2(t) , (2)

where x1 and x2 are two independent solutions to (1), and c1 and c2 are arbitrary constants.

We form the matrix whose columns are the solutions x1 and x2:

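$$\Phi(t) = \begin{pmatrix} \mathbf{x}_1 & \mathbf{x}_2 \end{pmatrix} = \begin{pmatrix} x_1 & x_2 \\ y_1 & y_2 \end{pmatrix}. \qquad (3)$$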

Since the solutions are linearly independent, we called them a fundamental set of solutions, and therefore we call the matrix in (3) a fundamental matrix for the system (1).

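Writing the general solution using Φ(t). As a first application of Φ(t), we can use it to write the general solution (2) efficiently. For according to (2), it is

$$\mathbf{x} = c_1 \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} + c_2 \begin{pmatrix} x_2 \\ y_2 \end{pmatrix} = \begin{pmatrix} x_1 & x_2 \\ y_1 & y_2 \end{pmatrix} \begin{pmatrix} c_1 \\ c_2 \end{pmatrix},$$

which becomes, using the fundamental matrix,

$$\mathbf{x} = \Phi(t)\,\mathbf{c}, \quad \text{where } \mathbf{c} = \begin{pmatrix} c_1 \\ c_2 \end{pmatrix} \quad \text{(general solution to (1)).} \qquad (4)$$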

Note that the vector c must be written on the right, even though the c’s are usually written on the left when they are the coefficients of the solutions xi.


Solving the IVP using Φ(t). We can now write down the solution to the IVP

x′ = A(t) x , x(t0) = x0. (5)

Starting from the general solution (4), we have to choose the c so that the initial condition in (5) is satisfied. Substituting t0 into (4) gives us the matrix equation for c :

Φ(t0) c = x0 .

Since the determinant |Φ(t0)| is the value at t0 of the Wronskian of x1 and x2, it is non-zero since the two solutions are linearly independent (Theorem 3 in the note on the Wronskian). Therefore the inverse matrix exists and the matrix equation above can be solved for c:

c = Φ(t0)^{-1} x0.

Using the above value of c in (4), the solution to the IVP (5) can now be written

x = Φ(t) Φ(t0)^{-1} x0 . (6)

Note that when the solution is written in this form, it’s “obvious” that x(t0) = x0, i.e., that the initial condition in (5) is satisfied.
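Formula (6) translates directly into a short computation. The sketch below is illustrative only: it assumes the constant coefficient system x′ = y, y′ = −x, for which the columns (cos t, −sin t) and (sin t, cos t) form a fundamental matrix, and the helper names are hypothetical.

```python
import math

def phi(t):
    """A fundamental matrix for x' = y, y' = -x (columns are solutions)."""
    return [[math.cos(t), math.sin(t)],
            [-math.sin(t), math.cos(t)]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix; fails if the determinant is zero."""
    (a, b), (c, d) = m
    det = a * d - b * c
    if det == 0:
        raise ValueError("matrix is singular")
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_vec(m, v):
    return [m[0][0] * v[0] + m[0][1] * v[1],
            m[1][0] * v[0] + m[1][1] * v[1]]

def solve_ivp_with_phi(t, t0, x0):
    """x(t) = Phi(t) Phi(t0)^(-1) x0, as in formula (6)."""
    c = mat_vec(inverse_2x2(phi(t0)), x0)   # solve Phi(t0) c = x0
    return mat_vec(phi(t), c)

# At t = t0 the formula returns the initial vector itself.
print(solve_ivp_with_phi(1.0, 1.0, [2.0, -3.0]))
```

Note that the initial time t0 = 1 here is not 0, which is exactly the situation where Φ(t0)^{-1} is needed.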

An equation for fundamental matrices. We have been saying “a” rather than “the” fundamental matrix since the system (1) doesn’t have a unique fundamental matrix: there are many ways to pick two independent solutions of x′ = A x to form the columns of Φ. It is therefore useful to have a way of recognizing a fundamental matrix when you see one. The following theorem is good for this; we’ll need it shortly.

Theorem 1 Φ(t) is a fundamental matrix for the system (1) if its determinant |Φ(t)| is non-zero and it satisfies the matrix equation

Φ′ = A Φ , (7)

where Φ′ means that each entry of Φ has been differentiated.

Proof. Since |Φ| ≢ 0, its columns x1 and x2 are linearly independent, as we saw in the previous note. Let Φ = ( x1 x2 ). According to the rules for matrix multiplication, (7) becomes

$$\begin{pmatrix} \mathbf{x}_1' & \mathbf{x}_2' \end{pmatrix} = A \begin{pmatrix} \mathbf{x}_1 & \mathbf{x}_2 \end{pmatrix} = \begin{pmatrix} A\mathbf{x}_1 & A\mathbf{x}_2 \end{pmatrix},$$

which shows that

x1′ = A x1 and x2′ = A x2 ;

this last line says that x1 and x2 are solutions to the system (1). □


The Normalized Fundamental Matrix

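In the previous note we saw two main facts about a fundamental matrix: it is built from two independent solutions,

$$\Phi(t) = \begin{pmatrix} \mathbf{x}_1 & \mathbf{x}_2 \end{pmatrix} = \begin{pmatrix} x_1 & x_2 \\ y_1 & y_2 \end{pmatrix}, \qquad (1)$$

and it gives the solution to the IVP with x(t0) = x0 as

$$\mathbf{x} = \Phi(t)\,\Phi(t_0)^{-1}\,\mathbf{x}_0. \qquad (2)$$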

Is there a “best” choice for fundamental matrix?

There are two common choices, each with its advantages. If the ODE system has constant coefficients, and its eigenvalues are real and distinct, then a natural choice for the fundamental matrix would be the one whose columns are the normal modes — the solutions of the form

$$\mathbf{x}_i = \vec{\alpha}_i\, e^{\lambda_i t}, \quad i = 1, 2.$$

There is another choice however which is suggested by (2) and which is particularly useful in showing how the solution depends on the initial conditions. Suppose we pick Φ(t) so that

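$$\Phi(t_0) = I = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}. \qquad (3)$$

Referring to the definition (1), this means the solutions x1 and x2 are picked so

$$\mathbf{x}_1(t_0) = \begin{pmatrix} 1 \\ 0 \end{pmatrix}, \quad \mathbf{x}_2(t_0) = \begin{pmatrix} 0 \\ 1 \end{pmatrix}. \qquad (3')$$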

Since the xi(t) are uniquely determined by these initial conditions, the fundamental matrix Φ(t) satisfying (3) is also unique; we give it a name.

Definition 2 The unique matrix Φ̃t0(t) satisfying

$$\tilde{\Phi}_{t_0}' = A\,\tilde{\Phi}_{t_0}, \qquad \tilde{\Phi}_{t_0}(t_0) = I \qquad (4)$$

is called the normalized fundamental matrix at t0 for A. For convenience in use, the definition uses Theorem 1 to guarantee Φ̃t0 will actually be a fundamental matrix. The condition |Φ̃t0(t)| ≠ 0 in Theorem 1 is satisfied, since the definition implies |Φ̃t0(t0)| = 1, and a Wronskian of solutions that is non-zero at one point is never zero on the interval (Theorem 3 in the note on the Wronskian).


To keep the notation simple, we will assume in the rest of this section that t0 = 0, as it almost always is; then Φ̃0 is the normalized fundamental matrix. Since Φ̃0(0) = I, we get from (2) the matrix form for the solution to the IVP: the solution to x′ = A(t) x, x(0) = x0 is

x(t) = Φ̃0(t) x0. (5)

Calculating Φ̃0. One way is to find the two solutions in (3′) and use them as the columns of Φ̃0. This is fine if the two solutions can be determined by inspection.

If not, a simpler method is this: find any fundamental matrix Φ(t); then

Φ̃0(t) = Φ(t) Φ(0)^{-1}. (6)

To verify this, we have to see that the matrix on the right of (6) satisfies the two conditions in Definition 2. The second is trivial. The first is easy using the rule for matrix differentiation:

If M = M(t) and B, C are constant matrices, then (BM)′ = BM′, (MC)′ = M′C,

from which we see that since Φ is a fundamental matrix,

(Φ(t)Φ(0)^{-1})′ = Φ′(t)Φ(0)^{-1} = AΦ(t)Φ(0)^{-1} = A(Φ(t)Φ(0)^{-1}),

showing that Φ(t)Φ(0)^{-1} also satisfies the first condition in Definition 2. □

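Example 2A. Find the solution to the IVP:

$$\mathbf{x}' = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix} \mathbf{x}, \quad \mathbf{x}(0) = \mathbf{x}_0.$$

Solution. Since the system is x′ = y, y′ = −x, we can find by inspection the fundamental set of solutions satisfying (3′):

$$\begin{aligned} x &= \cos t \\ y &= -\sin t \end{aligned} \quad \text{and} \quad \begin{aligned} x &= \sin t \\ y &= \cos t \end{aligned}\ .$$

Thus by (5) the normalized fundamental matrix at 0 and solution to the IVP is

$$\mathbf{x} = \tilde{\Phi}_0\,\mathbf{x}_0 = \begin{pmatrix} \cos t & \sin t \\ -\sin t & \cos t \end{pmatrix} \begin{pmatrix} x_0 \\ y_0 \end{pmatrix} = x_0 \begin{pmatrix} \cos t \\ -\sin t \end{pmatrix} + y_0 \begin{pmatrix} \sin t \\ \cos t \end{pmatrix}.$$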

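Example 2B. Give the normalized fundamental matrix at 0 for

$$\mathbf{x}' = \begin{pmatrix} 1 & 3 \\ 1 & -1 \end{pmatrix} \mathbf{x}.$$

Solution. This time the solutions (3′) cannot be obtained by inspection, so we use the second method. You can easily find the eigenvalues and eigenvectors for this system. Doing so produces the normal modes. Using them as the columns of a fundamental matrix gives us

$$\Phi(t) = \begin{pmatrix} 3e^{2t} & -e^{-2t} \\ e^{2t} & e^{-2t} \end{pmatrix}.$$

Using (6) and the formula for calculating the inverse matrix we get

$$\Phi(0) = \begin{pmatrix} 3 & -1 \\ 1 & 1 \end{pmatrix}, \qquad \Phi(0)^{-1} = \frac{1}{4} \begin{pmatrix} 1 & 1 \\ -1 & 3 \end{pmatrix},$$

so that

$$\tilde{\Phi}_0(t) = \frac{1}{4} \begin{pmatrix} 3e^{2t} & -e^{-2t} \\ e^{2t} & e^{-2t} \end{pmatrix} \begin{pmatrix} 1 & 1 \\ -1 & 3 \end{pmatrix} = \frac{1}{4} \begin{pmatrix} 3e^{2t} + e^{-2t} & 3e^{2t} - 3e^{-2t} \\ e^{2t} - e^{-2t} & e^{2t} + 3e^{-2t} \end{pmatrix}.$$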
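The second method in Example 2B can be checked numerically. The sketch below forms Φ(t)Φ(0)^{-1} from the normal-mode fundamental matrix and compares it with the product written out entry by entry; the helper names are hypothetical and the comparison time is an arbitrary choice.

```python
import math

def phi(t):
    """Fundamental matrix from Example 2B (columns are the normal modes)."""
    return [[3 * math.exp(2 * t), -math.exp(-2 * t)],
            [math.exp(2 * t), math.exp(-2 * t)]]

def inverse_2x2(m):
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def mat_mul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def phi_normalized(t):
    """Normalized fundamental matrix at 0: Phi(t) Phi(0)^(-1)."""
    return mat_mul(phi(t), inverse_2x2(phi(0.0)))

def closed_form(t):
    """The same product multiplied out by hand, for comparison."""
    p, m = math.exp(2 * t), math.exp(-2 * t)
    return [[(3 * p + m) / 4, (3 * p - 3 * m) / 4],
            [(p - m) / 4, (p + 3 * m) / 4]]

print(phi_normalized(0.0))                    # the identity matrix
print(phi_normalized(0.4), closed_form(0.4))  # should agree
```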


The Exponential Matrix

The work in the preceding note with fundamental matrices was valid for any linear homogeneous square system of ODE’s,

x′ = A(t) x .

However, if the system has constant coefficients, i.e., the matrix A is a constant matrix, the results are usually expressed by using the exponential matrix, which we now define.

Recall that if x is any real number, then

$$e^x = 1 + x + \frac{x^2}{2!} + \dots + \frac{x^n}{n!} + \dots \qquad (1)$$

Definition 3 Given an n × n constant matrix A, the exponential matrix e^A is the n × n matrix defined by

$$e^A = I + A + \frac{A^2}{2!} + \dots + \frac{A^n}{n!} + \dots \qquad (2)$$

Each term on the right side of (2) is an n × n matrix; adding up the ij-th entry of each of these matrices gives you an infinite series whose sum is the ij-th entry of e^A. (The series always converges.) In the applications, an independent variable t is usually included:

$$e^{At} = I + A\,t + A^2\,\frac{t^2}{2!} + \dots + A^n\,\frac{t^n}{n!} + \dots \qquad (3)$$

This is not a new definition; it’s just (2) above applied to the matrix A t in which every element of A has been multiplied by t, since for example

(At)^2 = At · At = A · A · t^2 = A^2 t^2.

Try out (2) and (3) on these two examples (the second is very easy, since it is not an infinite series).

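Example 3A. Let $A = \begin{pmatrix} a & 0 \\ 0 & b \end{pmatrix}$. Show:

$$e^A = \begin{pmatrix} e^a & 0 \\ 0 & e^b \end{pmatrix} \quad \text{and} \quad e^{At} = \begin{pmatrix} e^{at} & 0 \\ 0 & e^{bt} \end{pmatrix}.$$

Example 3B. Let $A = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$. Show:

$$e^A = \begin{pmatrix} 1 & 1 \\ 0 & 1 \end{pmatrix} \quad \text{and} \quad e^{At} = \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix}.$$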
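One way to experiment with the series definition (3) is to sum a finite number of its terms. The sketch below is a minimal illustration under that assumption (a truncated series, with a hypothetical `terms` cutoff), not a recommended general-purpose algorithm; the diagonal entries 1 and −2 are arbitrary sample values for a and b in Example 3A.

```python
import math

def mat_mul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def expm_series(a, t=1.0, terms=30):
    """Partial sum I + At + (At)^2/2! + ... using `terms` terms of (3)."""
    n = len(a)
    at = [[a[i][j] * t for j in range(n)] for i in range(n)]
    total = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in total]            # current term, starts at I
    for k in range(1, terms):
        term = mat_mul(term, at)                # (At)^k / (k-1)!
        term = [[x / k for x in row] for row in term]   # (At)^k / k!
        for i in range(n):
            for j in range(n):
                total[i][j] += term[i][j]
    return total

# Example 3B: the series stops after two terms, giving [[1, t], [0, 1]].
print(expm_series([[0.0, 1.0], [0.0, 0.0]], t=2.5))

# Example 3A with sample values a = 1, b = -2: diagonal of e^(at), e^(bt).
print(expm_series([[1.0, 0.0], [0.0, -2.0]], t=0.5))
print(math.exp(0.5), math.exp(-1.0))
```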

What’s the point of the exponential matrix? The answer is given by the theorem below, which says that the exponential matrix provides a royal road to the solution of a square system with constant coefficients: no eigenvectors, no eigenvalues, you just write down the answer!

Theorem 3 Let A be a square constant matrix. Then

(1) e^{At} = Φ̃0(t), the normalized fundamental matrix at 0;

(2) the unique solution to the IVP x′ = Ax, x(0) = x0 is x = e^{At} x0.

Proof. Recall that in the previous note we saw that if Φ̃0(t) is the normalized fundamental matrix then

the solution to the IVP x′ = A(t) x, x(0) = x0 is x(t) = Φ̃0(t) x0. (4)

Statement (2) follows immediately from (1), in view of (4).

We prove (1) by using the description of the normalized fundamental matrix at 0 in Definition 2: it is the unique matrix Φ satisfying Φ′ = AΦ and Φ(0) = I. Letting Φ = e^{At}, we must therefore show Φ′ = AΦ and Φ(0) = I.

The second of these follows from substituting t = 0 into the infinite series definition (3) for e^{At}.

To show Φ′ = AΦ, we assume that we can differentiate the series (3) term-by-term; then we have for the individual terms

$$\frac{d}{dt}\left( A^n\,\frac{t^n}{n!} \right) = A^n \cdot \frac{t^{n-1}}{(n-1)!},$$

since A^n is a constant matrix. Differentiating (3) term-by-term then gives

$$\frac{d\Phi}{dt} = \frac{d}{dt}\,e^{At} = A + A^2 t + \dots + A^n\,\frac{t^{n-1}}{(n-1)!} + \dots = A\,e^{At} = A\,\Phi. \qquad (5)$$

Calculation of eAt .

The main use of the exponential matrix is in Theorem 3 — writing down explicitly the solution to an IVP. If eAt has to be calculated for a specific system, several techniques are available.


a) In simple cases, it can be calculated directly as an infinite series of matrices.

b) It can always be calculated, according to Theorem 3, as the normalized fundamental matrix Φ̃0(t), using formula (6) of the previous note: Φ̃0(t) = Φ(t)Φ(0)^{-1}.

c) A third technique uses the exponential law

e^{(B+C)t} = e^{Bt} e^{Ct} , valid if BC = CB. (6)

To use it, one looks for constant matrices B and C such that

A = B + C, BC = CB, e^{Bt} and e^{Ct} are computable; (7)

then e^{At} = e^{Bt} e^{Ct} . (8)

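Example 3C. Let $A = \begin{pmatrix} 2 & 1 \\ 0 & 2 \end{pmatrix}$. Solve x′ = A x, $\mathbf{x}(0) = \begin{pmatrix} 1 \\ 2 \end{pmatrix}$, using e^{At}.

Solution. We set $B = \begin{pmatrix} 2 & 0 \\ 0 & 2 \end{pmatrix}$ and $C = \begin{pmatrix} 0 & 1 \\ 0 & 0 \end{pmatrix}$; then (7) is satisfied, and

$$e^{At} = \begin{pmatrix} e^{2t} & 0 \\ 0 & e^{2t} \end{pmatrix} \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix} = e^{2t} \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix},$$

by (8) and Examples 3A and 3B. Therefore, by Theorem 3 (2), we get

$$\mathbf{x} = e^{At}\,\mathbf{x}_0 = e^{2t} \begin{pmatrix} 1 & t \\ 0 & 1 \end{pmatrix} \begin{pmatrix} 1 \\ 2 \end{pmatrix} = e^{2t} \begin{pmatrix} 1 + 2t \\ 2 \end{pmatrix}.$$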
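A quick sanity check of the answer in Example 3C is to substitute it back into the system. The sketch below does this numerically, approximating the derivative by a central difference; the step size `h` and the helper names are assumptions made for illustration.

```python
import math

def x_closed_form(t):
    """Solution from Example 3C: e^(2t) * (1 + 2t, 2)."""
    s = math.exp(2 * t)
    return (s * (1 + 2 * t), s * 2)

def rhs(x):
    """A x for A = [[2, 1], [0, 2]]."""
    return (2 * x[0] + x[1], 2 * x[1])

def residual(t, h=1e-6):
    """Largest component of |x'(t) - A x(t)|, with x' by central difference."""
    xp, xm = x_closed_form(t + h), x_closed_form(t - h)
    deriv = [(xp[i] - xm[i]) / (2 * h) for i in range(2)]
    target = rhs(x_closed_form(t))
    return max(abs(deriv[i] - target[i]) for i in range(2))

print(x_closed_form(0.0))   # the initial condition (1, 2)
print(residual(0.4))        # small, limited by the finite-difference error
```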


Inhomogeneous Case: Variation of Parameters Formula

The fundamental matrix Φ(t) also provides a very compact and efficient integral formula for a particular solution to the inhomogeneous equation x′ = A(t)x + F(t) (presupposing of course that one can solve the homogeneous equation x′ = A(t)x first to get Φ). In this short note we give the formula (with proof!) and one example.

Variation of parameters: (solving inhomogeneous systems)

(H) x′ = A(t)x ⇝ Φ(t) = fundamental matrix
(I) x′ = A(t)x + F(t)

Variation of parameters formula for solution to (I) (just like order 1 DE’s):

$$\mathbf{x} = \Phi \cdot \left( \int \Phi^{-1} \cdot \mathbf{F}\, dt + \mathbf{C} \right).$$

proof (remember this): General homogeneous solution: x = Φ · c for a constant vector c.
Make c variable ⇝ trial solution x = Φ · v(t).
Plug this into (I): x′ = Ax + F ⇒ Φ′ · v + Φ · v′ = AΦ · v + F.
Now substitute for Φ′ = AΦ:
⇒ AΦ · v + Φ · v′ = AΦ · v + F
⇒ Φ · v′ = F
⇒ v′ = Φ^{-1} · F
⇒ v = ∫ Φ^{-1} · F dt + C
⇒ x = Φ · v = Φ · ( ∫ Φ^{-1} · F dt + C ). QED.

Definite integral version of variation of parameters

$$\mathbf{x}(t) = \Phi(t) \left( \int_{t_0}^{t} \Phi^{-1}(u) \cdot \mathbf{F}(u)\, du + \mathbf{C} \right), \quad \text{where } \mathbf{C} = \Phi^{-1}(t_0) \cdot \mathbf{x}(t_0).$$

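Example. Solve

$$\mathbf{x}' = \begin{pmatrix} 6 & 5 \\ 1 & 2 \end{pmatrix} \mathbf{x} + \begin{pmatrix} e^{t} \\ e^{5t} \end{pmatrix}.$$

Notation: $A = \begin{pmatrix} 6 & 5 \\ 1 & 2 \end{pmatrix}$, $\mathbf{F} = \begin{pmatrix} e^{t} \\ e^{5t} \end{pmatrix}$.

Fundamental matrix (earlier example):

$$\Phi = \begin{pmatrix} e^{t} & 5e^{7t} \\ -e^{t} & e^{7t} \end{pmatrix}, \qquad \Phi^{-1} = \frac{e^{-8t}}{6} \begin{pmatrix} e^{7t} & -5e^{7t} \\ e^{t} & e^{t} \end{pmatrix}.$$

Variation of parameters:

$$\begin{aligned}
\mathbf{x} &= \Phi \int \Phi^{-1} \cdot \mathbf{F}\, dt
= \Phi \int \frac{e^{-8t}}{6} \begin{pmatrix} e^{7t} & -5e^{7t} \\ e^{t} & e^{t} \end{pmatrix} \begin{pmatrix} e^{t} \\ e^{5t} \end{pmatrix} dt
= \Phi \cdot \frac{1}{6} \int \begin{pmatrix} 1 - 5e^{4t} \\ e^{-6t} + e^{-2t} \end{pmatrix} dt \\
&= \frac{1}{6}\, \Phi \begin{pmatrix} t - \tfrac{5}{4} e^{4t} + c_1 \\ -\tfrac{1}{6} e^{-6t} - \tfrac{1}{2} e^{-2t} + c_2 \end{pmatrix}
= \frac{1}{6} \begin{pmatrix} t e^{t} - \tfrac{5}{4} e^{5t} - \tfrac{5}{6} e^{t} - \tfrac{5}{2} e^{5t} + c_1 e^{t} + 5 c_2 e^{7t} \\ -t e^{t} + \tfrac{5}{4} e^{5t} - \tfrac{1}{6} e^{t} - \tfrac{1}{2} e^{5t} - c_1 e^{t} + c_2 e^{7t} \end{pmatrix} \\
&= \frac{1}{6} \left[ t e^{t} \begin{pmatrix} 1 \\ -1 \end{pmatrix} + e^{5t} \begin{pmatrix} -15/4 \\ 3/4 \end{pmatrix} + e^{t} \begin{pmatrix} -5/6 \\ -1/6 \end{pmatrix} + c_1 e^{t} \begin{pmatrix} 1 \\ -1 \end{pmatrix} + c_2 e^{7t} \begin{pmatrix} 5 \\ 1 \end{pmatrix} \right].
\end{aligned}$$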

(Notice the homogeneous solution appearing with the constants of integration).
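The particular part of this answer (taking c1 = c2 = 0) can be checked by substituting it into x′ = Ax + F. The sketch below does so numerically for A = [[6, 5], [1, 2]] and F = (e^t, e^{5t}); the central-difference step `h`, the sample times, and the function names are assumptions made for illustration.

```python
import math

def x_particular(t):
    """Particular solution from the example, with c1 = c2 = 0."""
    e1, e5 = math.exp(t), math.exp(5 * t)
    x = (t * e1 - 15 / 4 * e5 - 5 / 6 * e1) / 6
    y = (-t * e1 + 3 / 4 * e5 - 1 / 6 * e1) / 6
    return (x, y)

def rhs(t, v):
    """A v + F(t) with A = [[6, 5], [1, 2]] and F = (e^t, e^(5t))."""
    x, y = v
    return (6 * x + 5 * y + math.exp(t), x + 2 * y + math.exp(5 * t))

def residual(t, h=1e-5):
    """Largest component of |x'(t) - (A x + F)|, x' by central difference."""
    xp, xm = x_particular(t + h), x_particular(t - h)
    deriv = [(xp[i] - xm[i]) / (2 * h) for i in range(2)]
    target = rhs(t, x_particular(t))
    return max(abs(deriv[i] - target[i]) for i in range(2))

for t in (0.0, 0.3):
    print(t, residual(t))   # small, limited by the finite-difference error
```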


Introduction

In this session we introduce and develop the basic properties of autonomous 2 × 2 systems. In the next session we will see how to get key information about the solutions to such a system directly from the DE itself, without having to actually solve it. This is an example of what is called the qualitative theory of differential equations.


The Phase Plane

1. General First Order Autonomous Systems

The sort of system for which we will be trying to sketch solutions can be written in the form

x′ = f (x, y)
y′ = g(x, y) (1)

This is called an autonomous system. The word autonomous means self-regulating. These systems are self-regulating in the sense that their rate of change (i.e. derivatives) depends only on the state of the system (values of x and y) and not on the time t. You can easily spot an autonomous system because the independent variable (which we understand to be t) does not appear explicitly on the right, though of course it lurks in the derivatives on the left.

The system (1) is a first-order autonomous system; it is in standard form — the derivatives on the left, the functions on the right.

Just as for linear constant coefficient systems, autonomous systems have trajectories in the phase plane. We will repeat the definitions of these objects in this more general setting.

A solution of this system has the form (we write it two ways)

$$\mathbf{x}(t) = \begin{pmatrix} x(t) \\ y(t) \end{pmatrix}, \qquad \begin{aligned} x &= x(t) \\ y &= y(t). \end{aligned} \qquad (2)$$

It is a vector function of t, whose components satisfy the system (1) when they are substituted in for x and y. In general, you learned in 18.02 and physics that such a vector function describes a motion in the xy-plane; the equations in (2) tell how the point (x, y) moves in the xy-plane as the time t varies. The moving point traces out a curve called the trajectory of the solution (2). The xy-plane itself is called the phase plane for the system (1). We show a sketch of a trajectory at right. Notice the arrow is used to indicate the direction of increasing time.

We use the term phase portrait to mean the graphs of enough trajectories to give a good sense of all the solutions to the system (1).

We have seen how we can picture the solutions (2) to the system. But how can we picture the system (1) itself? We can think of the derivative of a solution

$$\mathbf{x}'(t) = \begin{pmatrix} x'(t) \\ y'(t) \end{pmatrix} \qquad (3)$$

as representing the velocity vector of the point (x, y) as it moves according to (2). From this viewpoint, we can interpret geometrically the system (1) as prescribing for each point (x0, y0) in the xy-plane a velocity vector having its tail at (x0, y0):

$$\mathbf{x}' = \begin{pmatrix} f(x_0, y_0) \\ g(x_0, y_0) \end{pmatrix} = f(x_0, y_0)\,\mathbf{i} + g(x_0, y_0)\,\mathbf{j}. \qquad (4)$$

The system (1) is thus represented geometrically as a vector field, the velocity field. A solution (2) of the system is a point moving in the xy-plane so that at each point of its trajectory, it has the velocity prescribed by the field. The trajectory itself will be a curve which at each point has the direction of the velocity vector at that point.
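The velocity-field picture can be made concrete with a few lines of code. The sketch below evaluates the field at a point and follows it with small Euler steps to trace an approximate trajectory; the sample system x′ = y, y′ = −x, the step size, and the function names are all assumptions chosen for illustration, and Euler's method is used only because it is the most direct reading of "move with the prescribed velocity".

```python
import math

def velocity(f, g, point):
    """Velocity vector (f, g) that the system attaches to `point`."""
    x, y = point
    return (f(x, y), g(x, y))

def euler_path(f, g, start, dt=0.001, steps=1000):
    """Crude trajectory: repeatedly take a small step along the velocity."""
    path = [start]
    x, y = start
    for _ in range(steps):
        vx, vy = velocity(f, g, (x, y))
        x, y = x + dt * vx, y + dt * vy
        path.append((x, y))
    return path

# Illustrative system (an assumption, not from the text): x' = y, y' = -x.
def f(x, y):
    return y

def g(x, y):
    return -x

print(velocity(f, g, (1.0, 0.0)))          # velocity at the point (1, 0)
end = euler_path(f, g, (1.0, 0.0))[-1]     # position after time 1
print(end, (math.cos(1.0), -math.sin(1.0)))
```

For this sample system the exact solution through (1, 0) is (cos t, −sin t), so the last line compares the Euler endpoint with the exact position at t = 1.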

2. Critical Points

Definition. A point (x0, y0) is a critical point of the system (1) if

f (x0, y0) = 0 and g(x0, y0) = 0.

In considering how to sketch trajectories of the system (1), the first things to consider are the critical points (they are sometimes called stationary points).

If we adopt the geometric viewpoint, thinking of the system as represented by a velocity vector field, then a critical point is one where the velocity vector is zero. That is, saying that (x0, y0) is a critical point is equivalent to saying that

x = x0, y = y0 is a (constant) solution to (1).

Such a point is a trajectory all by itself, since by not moving it satisfies the equations (1) of the system (and hence the alternative designation stationary point).

The critical points represent the simplest possible solutions to (1), so you begin by finding them; this is done by solving the pair of simultaneous equations

f (x, y) = 0
g(x, y) = 0



Next, you can try the strategy indicated in the following note of passing to the associated first-order ODE and trying to solve that and sketch the solutions; or you can try to locate some sketchable solutions to (1) and draw them in, as we did for linear constant coefficient systems in the session on Phase Portraits.

3. Sketching Principle

When sketching integral curves for direction fields we saw that integral curves did not cross. For the system (1) we have a similar principle.

Sketching Principle. Assuming the functions f (x, y) and g(x, y) are smooth (i.e., have continuous partial derivatives), two distinct trajectories of (1) cannot intersect.


First Order Autonomous ODE Systems and First Order ODE’s

We are still considering the first order autonomous system

x′ = f (x, y)
y′ = g(x, y) (1)

We can eliminate t from the system by dividing one equation by the other. Since by the chain rule

$$\frac{y'}{x'} = \frac{dy/dt}{dx/dt} = \frac{dy}{dx},$$

we get after the division a single first-order ODE in x and y :

$$\begin{aligned} x' &= f(x, y) \\ y' &= g(x, y) \end{aligned} \quad \longrightarrow \quad \frac{dy}{dx} = \frac{g(x, y)}{f(x, y)}. \qquad (2)$$

If the first order equation on the right is solvable, this is an important way of getting information about the solutions to the system on the left. Indeed, in the older literature, little distinction was made between the system and the single equation — “solving” meant to solve either one.

There is however a difference between them: the system involves time, whereas the single ODE does not. Consider how their respective solutions are related:

x = x(t), y = y(t) −→ F(x, y) = 0 , (3)

where the equation on the right is the result of eliminating t from the pair of equations on the left. Geometrically, F(x, y) = 0 is the equation for the trajectory of the solution x(t) on the left. The trajectory in other words is the path traced out by the moving point (x(t), y(t)); it doesn’t contain any record of how fast the point was moving; it is only the track (or trace, as one sometimes says) of its motion.

In the same way, we have the difference between the velocity field, which represents the left side of (2), and the direction field, which represents the right side. The velocity vectors have magnitude and sense, whereas the line segments that make up the direction field only have slope. The passage from the left side of (2) to the right side is represented geometrically by changing each of the velocity vectors to a line segment of standard length.


Even the arrowhead is dropped, since it represents the direction of increasing time, and time has been eliminated; only the slope of the vector is retained.
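As a small worked illustration of passing from a system to a single first-order ODE, consider the system x′ = y, y′ = −x (chosen here as an assumed example). Dividing the equations and separating variables gives the trajectories without any reference to t:

```latex
\begin{aligned}
x' = y,\quad y' = -x
\;&\longrightarrow\; \frac{dy}{dx} = \frac{-x}{y} \\
&\Longrightarrow\; y\,dy = -x\,dx \\
&\Longrightarrow\; F(x, y) = x^2 + y^2 - C = 0 .
\end{aligned}
```

The trajectories are circles about the origin. The equation F(x, y) = 0 records the track but neither the speed nor the direction of travel along it; for that one has to go back to the system itself.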


Introduction

In this session, we continue to develop the methods for sketching graphs of the solutions to DE systems which we carried out for linear systems in the session on Phase Portraits. The goal is to see how to get quick qualitative information about the graphs of the solutions, without having to actually calculate points on them.


Sketching Non-linear Systems

In the session on Phase Portraits, we described how to sketch the trajectories of a linear system

x′ = ax + by
y′ = cx + dy        a, b, c, d constants.

We now return to the general (i.e., non-linear) 2 × 2 autonomous system discussed at the beginning of this chapter, in sections 1 and 2:

x′ = f (x, y)
y′ = g(x, y) ; (1)

it is represented geometrically as a vector field, and its trajectories — the solution curves — are the curves which at each point have the direction prescribed by the vector field. Our goal is to see how one can get information about the trajectories of (1), without determining them analytically or using a computer to plot them numerically.

Linearizing at the origin. To illustrate the general idea, let’s suppose that (0, 0) is a critical point of the system (1), i.e.,

f (0, 0) = 0, g(0, 0) = 0. (2)

Then if f and g are sufficiently differentiable, we can approximate them near (0, 0) (the approximation will have no constant term by (2)):

f (x, y) = a1x + b1y + higher order terms in x and y
g(x, y) = a2x + b2y + higher order terms in x and y.

If (x, y) is close to (0, 0), then x and y will be small and we can neglect the higher order terms. Then the non-linear system (1) is approximated near (0, 0) by a linear system, the linearization of (1) at (0, 0):

x′ = a1x + b1y
y′ = a2x + b2y , (3)

and near (0, 0), the solutions of (1), about which we know nothing, will be like the solutions to (3), about which we know a great deal from our work in the previous sessions.

Example 1. Linearize the system

x′ = y cos x
y′ = x(1 + y)^2

at the critical point (0, 0).

Solution. We have

x′ ≈ y(1 − x^2/2)
y′ = x(1 + 2y + y^2)

so the linearization is

x′ = y
y′ = x .

Linearizing at a general point. More generally, suppose now the critical point of (1) is (x0, y0), so that

f (x0, y0) = 0, g(x0, y0) = 0.

One way this can be handled is to make the change of variable

x1 = x − x0, y1 = y − y0; (4)

in the x1y1-coordinate system, the critical point is (0, 0), and we can proceed as before.

Example 2. Linearize

x′ = x − x^2 − 2xy
y′ = y − y^2 − (3/2)xy

at its critical points on the x-axis.

Solution. When y = 0, the functions on the right are zero when x = 0 and x = 1, so the critical points on the x-axis are (0, 0) and (1, 0).

The linearization at (0, 0) is x′ = x, y′ = y.

To find the linearization at (1, 0) we change variables as in (4): x1 = x − 1, y1 = y ; substituting for x and y in the system and keeping just the linear terms on the right gives us as the linearization:

x1′ = (x1 + 1) − (x1 + 1)^2 − 2(x1 + 1)y1 ≈ −x1 − 2y1
y1′ = y1 − y1^2 − (3/2)(x1 + 1)y1 ≈ −(1/2) y1 .

Linearization using the Jacobian matrix

Though the above techniques are usable if the right sides are very simple, it is generally faster to find the linearization by using the Jacobian matrix, especially if there are several critical points, or the functions on the right are not simple polynomials. We derive the procedure.

We need to approximate f and g near (x0, y0). While this can sometimes be done by changing variable, a more basic method is to use the main approximation theorem of multivariable calculus. For this we use the notation

Δx = x − x0, Δy = y − y0, Δ f = f (x, y) − f (x0, y0) (5)


and we have then the basic approximation formula

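$$\Delta f \approx \left( \frac{\partial f}{\partial x} \right)_0 \Delta x + \left( \frac{\partial f}{\partial y} \right)_0 \Delta y, \quad \text{or} \quad f(x, y) \approx \left( \frac{\partial f}{\partial x} \right)_0 \Delta x + \left( \frac{\partial f}{\partial y} \right)_0 \Delta y, \qquad (6)$$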

since by hypothesis f (x0, y0) = 0. We now make the change of variables (4)

x1 = x − x0 = Δx, y1 = y − y0 = Δy,

and use (6) to approximate f and g by their linearizations at (x0, y0). The result is that in the neighborhood of the critical point (x0, y0), the linearization of the system (1) is

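$$\begin{aligned} x_1' &= \left( \frac{\partial f}{\partial x} \right)_0 x_1 + \left( \frac{\partial f}{\partial y} \right)_0 y_1, \\ y_1' &= \left( \frac{\partial g}{\partial x} \right)_0 x_1 + \left( \frac{\partial g}{\partial y} \right)_0 y_1. \end{aligned} \qquad (7)$$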

In matrix notation, the linearization is therefore

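$$\mathbf{x}_1' = A\,\mathbf{x}_1, \quad \text{where } \mathbf{x}_1 = \begin{pmatrix} x_1 \\ y_1 \end{pmatrix} \text{ and } A = \begin{pmatrix} f_x & f_y \\ g_x & g_y \end{pmatrix}_{(x_0, y_0)}; \qquad (8)$$

the matrix A is the Jacobian matrix, evaluated at the critical point (x0, y0).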
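When the partial derivatives are tedious to compute by hand, the Jacobian matrix in (8) can be estimated numerically. The sketch below uses central differences (an assumed numerical technique, not part of the notes) and applies it to the system of Example 2 at the critical point (1, 0); the step size `h` is an arbitrary choice.

```python
def jacobian(f, g, x0, y0, h=1e-6):
    """Central-difference estimate of [[f_x, f_y], [g_x, g_y]] at (x0, y0)."""
    def partial_x(func):
        return (func(x0 + h, y0) - func(x0 - h, y0)) / (2 * h)

    def partial_y(func):
        return (func(x0, y0 + h) - func(x0, y0 - h)) / (2 * h)

    return [[partial_x(f), partial_y(f)],
            [partial_x(g), partial_y(g)]]

# Right-hand sides of the system in Example 2.
def f(x, y):
    return x - x**2 - 2 * x * y

def g(x, y):
    return y - y**2 - 1.5 * x * y

J = jacobian(f, g, 1.0, 0.0)
print(J)   # compare with the linearization x1' = -x1 - 2 y1, y1' = -(1/2) y1
```

The rows of the estimated matrix should match the coefficients found by the change of variables in Example 2, up to rounding.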

General procedure for sketching the trajectories of non-linear systems.

We can now outline how to sketch in a qualitative way the solution curves of a 2 × 2 non-linear autonomous system,

x′ = f (x, y)
y′ = g(x, y). (9)

1. Find all the critical points (i.e., the constant solutions), by solving the system of simultaneous equations

f (x, y) = 0
g(x, y) = 0 .

2. For each critical point (x0, y0), find the matrix A of the linearized system at that point, by evaluating the Jacobian matrix at (x0, y0):

$$\begin{pmatrix} f_x & f_y \\ g_x & g_y \end{pmatrix}_{(x_0, y_0)}.$$

(Alternatively, make the change of variables x1 = x − x0, y1 = y − y0, and drop all terms having order higher than one; then A is the matrix of coefficients for the linear terms.)

3. Find the geometric type and stability of the linearized system at the critical point (x0, y0), by carrying out the analysis in sections 4 and 5.

The subsequent steps require that the eigenvalues be nonzero, real, and distinct, or complex with a non-zero real part. The remaining cases (eigenvalues which are zero, repeated, or pure imaginary) are classified as borderline, and the subsequent steps don’t apply, or have limited application. See the next section.

4. According to the above, the acceptable geometric types are a saddle, node (not a star or a defective node, however), and a spiral. Assuming that this is what you have, for each critical point determine enough additional information (eigenvectors, direction of motion) to allow a sketch of the trajectories near the critical point.

5. In the xy-plane, mark the critical points. Around each, sketch the trajectories in its immediate neighborhood, as determined in the previous step, including the direction of motion.

6. Finally, sketch in some other trajectories to fill out the picture, making them compatible with the behavior of the trajectories you have already sketched near the critical points. Mark with an arrowhead the direction of motion on each trajectory.

If you have made a mistake in analyzing any of the critical points, it will often show up here — it will turn out to be impossible to draw in any plausible trajectories that complete the picture.
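Steps 3 and 4 amount to reading off the eigenvalue type of a 2 × 2 matrix, which can be done from its trace and determinant. The sketch below is one possible way to organize that test, under the assumption that the matrix entries are exact numbers (with floating-point entries the equality tests for the borderline cases would need a tolerance); the function name and the returned labels are hypothetical.

```python
def classify(a, b, c, d):
    """Geometric type of x' = ax + by, y' = cx + dy from trace/determinant."""
    tr = a + d
    det = a * d - b * c
    disc = tr * tr - 4 * det        # discriminant of the characteristic eq.
    # Borderline: zero eigenvalue, repeated eigenvalue, or pure imaginary.
    if det == 0 or disc == 0 or (disc < 0 and tr == 0):
        return "borderline"
    if disc < 0:                    # complex eigenvalues, real part tr/2
        return "spiral sink" if tr < 0 else "spiral source"
    if det < 0:                     # real eigenvalues of opposite sign
        return "saddle"
    return "sink node" if tr < 0 else "source node"

print(classify(-1, 0, 0, -2))   # eigenvalues -1, -2
print(classify(0, 2, 1, 0))     # eigenvalues +sqrt(2), -sqrt(2)
print(classify(0, 1, -1, 0))    # eigenvalues +i, -i
```

The function only names the type; the eigenvectors and the direction of motion needed for the sketch in step 4 still have to be found separately.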

Remarks about the steps.

1. In the homework problems, the simultaneous equations whose solutions are the critical points will be reasonably easy to solve. In the real world, they may not be; a simultaneous-equation solver will have to be used (the standard programs — MatLab, Maple, Mathematica, Macsyma — all have them, but they are not always effective.)

2. If there are several critical points, one almost always uses the Jacobian matrix; if there is only one, use your judgment.

3. This method of analyzing non-linear systems rests on the assumption that in the neighborhood of a critical point, the non-linear system will look like its linearization at that point. For the borderline cases this may not be so; that is why they are rejected. The next two notes explain this more fully.

If one or more of the critical points turn out to be borderline cases, one usually resorts to numerical computation on the non-linear system. Occasionally one can use the reduction to a first order equation:

$$\frac{dy}{dx} = \frac{g(x, y)}{f(x, y)}$$

to get information about the system.

Example 3. Sketch some trajectories of the system

x′ = −x + xy
y′ = −2y + xy .

Solution. We first find the critical points, by solving

−x + xy = x(−1 + y) = 0
−2y + xy = y(−2 + x) = 0 .

From the first equation, either x = 0 or y = 1. From the second equation,

x = 0 ⇒ y = 0; y = 1 ⇒ x = 2; critical points: (0, 0), (2, 1).

To linearize at the critical points, we compute the Jacobian matrices

$$J = \begin{pmatrix} -1 + y & x \\ y & -2 + x \end{pmatrix}; \qquad J_{(0,0)} = \begin{pmatrix} -1 & 0 \\ 0 & -2 \end{pmatrix}, \qquad J_{(2,1)} = \begin{pmatrix} 0 & 2 \\ 1 & 0 \end{pmatrix}.$$

Next we analyze the geometric type and stability of each critical point:

(0, 0): eigenvalues: λ1 = −1, λ2 = −2; sink node

$$\text{eigenvectors: } \vec{\alpha}_1 = \begin{pmatrix} 1 \\ 0 \end{pmatrix}; \quad \vec{\alpha}_2 = \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

By the node-sketching principle, trajectories follow α1 near the origin, and are parallel to α2 away from the origin.

(2, 1): eigenvalues: λ1 = √2, λ2 = −√2; unstable saddle

$$\text{eigenvectors: } \vec{\alpha}_1 = \begin{pmatrix} \sqrt{2} \\ 1 \end{pmatrix}; \quad \vec{\alpha}_2 = \begin{pmatrix} -\sqrt{2} \\ 1 \end{pmatrix}$$

5


� �

� �

Draw in these eigenvectors at the respective points (0, 0) and (2, 1), with arrowheads indicating the direction of motion (into the critical point if λ < 0, away from the critical point if λ > 0). Draw in some nearby trajectories.

Then guess at some other trajectories compatible with these. See the figure for one attempt at this. Further information could be gotten by considering the associated first-order ODE in x and y.
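The linearization step of Example 3 can be checked numerically. The following is a minimal Python sketch and not part of the original notes: the helper names (`field`, `jacobian`, `eigenvalues_2x2`) and the difference step `h` are illustrative assumptions. It approximates the Jacobian by central differences and obtains the eigenvalues from the characteristic equation λ² − (tr J)λ + det J = 0.

```python
import cmath

def field(x, y):
    """Right-hand side of Example 3: x' = -x + xy, y' = -2y + xy."""
    return (-x + x * y, -2 * y + x * y)

def jacobian(f, x, y, h=1e-6):
    """Approximate the 2x2 Jacobian of f at (x, y) by central differences."""
    fxp, fxm = f(x + h, y), f(x - h, y)
    fyp, fym = f(x, y + h), f(x, y - h)
    return [[(fxp[0] - fxm[0]) / (2 * h), (fyp[0] - fym[0]) / (2 * h)],
            [(fxp[1] - fxm[1]) / (2 * h), (fyp[1] - fym[1]) / (2 * h)]]

def eigenvalues_2x2(J):
    """Roots of lambda^2 - tr(J) lambda + det(J) = 0 (possibly complex)."""
    tr = J[0][0] + J[1][1]
    det = J[0][0] * J[1][1] - J[0][1] * J[1][0]
    root = cmath.sqrt(tr * tr - 4 * det)
    return ((tr + root) / 2, (tr - root) / 2)

# The two critical points found in the solution above.
for point in [(0.0, 0.0), (2.0, 1.0)]:
    lam1, lam2 = eigenvalues_2x2(jacobian(field, *point))
    print(point, lam1, lam2)
```

Up to rounding, the printed eigenvalues should agree with the values −1, −2 at (0, 0) and ±√2 at (2, 1) found by hand.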

Example. Sketch the phase portrait of the following system.

$$\begin{aligned} x' &= 14x - \tfrac{1}{2}x^2 - xy \\ y' &= 16y - \tfrac{1}{2}y^2 - xy \end{aligned}$$

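Critical points:

$$\begin{aligned} x\left(14 - \tfrac{1}{2}x - y\right) = 0 &\;\Rightarrow\; x = 0 \ \text{ or } \ 14 - \tfrac{1}{2}x - y = 0 \\ y\left(16 - \tfrac{1}{2}y - x\right) = 0 &\;\Rightarrow\; y = 0 \ \text{ or } \ 16 - \tfrac{1}{2}y - x = 0. \end{aligned}$$

x = 0 ⇒ y = 0 or y = 32.

y = 0 ⇒ x = 0 or x = 28.

x ≠ 0, y ≠ 0 ⇒ x = 12, y = 8.

⇒ all critical points: (0, 0), (0, 32), (28, 0), (12, 8).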

$$J(x, y) = \begin{pmatrix} 14 - x - y & -x \\ -y & 16 - y - x \end{pmatrix}$$

Looking at each of the critical points in turn:

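$$J(0, 0) = \begin{pmatrix} 14 & 0 \\ 0 & 16 \end{pmatrix}: \quad \text{eigenvalues } 14,\ 16; \quad \text{eigenvectors } \begin{pmatrix} 1 \\ 0 \end{pmatrix},\ \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

⇒ source node (see ’Source node’ picture below).

$$J(0, 32) = \begin{pmatrix} -18 & 0 \\ -32 & -16 \end{pmatrix}: \quad \text{eigenvalues } -18,\ -16; \quad \text{eigenvectors } \begin{pmatrix} 1 \\ 16 \end{pmatrix},\ \begin{pmatrix} 0 \\ 1 \end{pmatrix}$$

⇒ sink node (see ’Sink node 1’ picture below).

$$J(28, 0) = \begin{pmatrix} -14 & -28 \\ 0 & -12 \end{pmatrix}: \quad \text{eigenvalues } -14,\ -12; \quad \text{eigenvectors } \begin{pmatrix} 1 \\ 0 \end{pmatrix},\ \begin{pmatrix} -14 \\ 1 \end{pmatrix}$$

⇒ sink node (see ’Sink node 2’ picture below).

$$J(12, 8) = \begin{pmatrix} -6 & -12 \\ -8 & -4 \end{pmatrix}: \quad \text{eigenvalues } -5 \mp \sqrt{97} \approx -15,\ 5; \quad \text{eigenvectors } \begin{pmatrix} 1 + \sqrt{97} \\ 8 \end{pmatrix},\ \begin{pmatrix} 1 - \sqrt{97} \\ 8 \end{pmatrix} \approx \begin{pmatrix} 11 \\ 8 \end{pmatrix},\ \begin{pmatrix} -9 \\ 8 \end{pmatrix}$$

⇒ saddle (see ’Saddle’ picture below).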

Rough sketch of system: First we sketch each of the critical points.

[Figures: local sketches near each critical point, drawn in (u, v) coordinates: Source node, Sink node 1, Sink node 2, Saddle.]

[Figures: Hand sketch of the phase plane portrait. Computer plot of the phase plane portrait.]

Structural Stability

In the previous Note, we described how to get a rough picture of the trajectories of a non-linear system by linearizing at each of its critical points. The basic assumption of the method is that the linearized system will be a good approximation to the original non-linear system if you stay near the critical point.

The method only works however if the linearized system turns out to be a node, saddle, or spiral. What is it about these geometric types that allows the method to work, and why won’t it work if the linearized system turns out to be one of the other possibilities (dismissed as “borderline types” in the previous section)?

Briefly, the answer is that nodes, saddles, and spirals are structurally stable, while the other possibilities are not.

Structural Stability: We say a system

$$x' = f(x, y), \qquad y' = g(x, y) \tag{1}$$

is structurally stable if small changes in the system parameters (i.e., the constants that enter into the functions on the right-hand side) do not change the geometric type or stability of its critical points (or its limit cycles, which will be defined in a later session; don’t worry about them for now).

Theorem. The 2 × 2 autonomous linear system

$$\begin{aligned} x' &= ax + by \\ y' &= cx + dy \end{aligned} \tag{2}$$

is structurally stable if it is a spiral, saddle, or node (but not a degenerate or star node).

Proof. The characteristic equation is

λ2 − (a + d)λ + (ad − bc) = 0,

and its roots (the eigenvalues) are

$$\lambda_1, \lambda_2 = \frac{(a + d) \pm \sqrt{(a + d)^2 - 4(ad - bc)}}{2}. \tag{3}$$

Let’s look at the cases one by one; assume first that the roots λ1 and λ2 are real and distinct. The possibilities in the theorem are given by the following (note that since the roots are distinct, the node will not be degenerate or a star node):

λ1 > 0, λ2 > 0: unstable node
λ1 < 0, λ2 < 0: asymptotically stable node
λ1 > 0, λ2 < 0: unstable saddle.

The quadratic formula (3) shows that the roots depend continuously on the coefficients a, b, c, d. Thus if the coefficients are changed a little, the roots λ1 and λ2 will also be changed a little to λ1′ and λ2′ respectively; the new roots will still be real, and will have the same sign if the change is small enough. Thus the changed system will still have the same geometric type and stability.

If the roots of the characteristic equation are complex, the reasoning is similar. Let us denote the complex roots by r ± si; we use the root λ = r + si, s > 0; then the possibilities to be considered for structural stability are

r > 0, s > 0: unstable spiral
r < 0, s > 0: asymptotically stable spiral.

If a, b, c, d are changed a little, the root is changed to λ′ = r′ + s′i, where r′ and s′ are close to r and s respectively, since the quadratic formula (3) shows r and s depend continuously on the coefficients. If the change is small enough, r′ will have the same sign as r and s′ will still be positive, so the geometric type of the changed system will still be a spiral, with the same stability type.

Structural Stability of a non-linear system

Theorem: For an autonomous non-linear system, the linearized system correctly classifies the critical point if the linear system is a spiral, a nodal source or sink, or a saddle.

It may not however correctly classify a center, defective node, star node or non-isolated critical point. That is, it is correct in open regions of the trace-determinant diagram and untrustworthy on the boundary lines.
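The trace-determinant description can be turned into a small classifier. The following Python sketch is an illustration, not part of the original notes: the function name `classify` and the tolerance `tol`, used to decide when a quantity counts as zero, are assumptions made for this example. It reports the open regions by name and flags everything on a boundary line as borderline.

```python
def classify(a, b, c, d, tol=1e-12):
    """Classify the linear system x' = ax + by, y' = cx + dy
    from the trace and determinant of its coefficient matrix."""
    tr = a + d
    det = a * d - b * c
    disc = tr * tr - 4 * det          # discriminant of the characteristic equation
    if abs(det) < tol:
        return "borderline: non-isolated critical point (det = 0)"
    if det < 0:
        return "saddle"
    if abs(tr) < tol:
        return "borderline: center (tr = 0, det > 0)"
    if abs(disc) < tol:
        return "borderline: defective or star node (tr^2 = 4 det)"
    kind = "spiral" if disc < 0 else "nodal"
    return f"{kind} {'source' if tr > 0 else 'sink'}"

# Jacobians taken from the worked example in the previous note.
print(classify(14, 0, 0, 16))      # J(0, 0)
print(classify(-18, 0, -32, -16))  # J(0, 32)
print(classify(-6, -12, -8, -4))   # J(12, 8)
print(classify(0, 1, -1, 0))       # a linearized center
```

A borderline answer means only that the linearization is inconclusive for the non-linear system; it does not say which of the neighboring types actually occurs.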

[Figure: Trace-determinant diagram, with axes tr A and det A. Regions: saddle, nodal sink, nodal source, spiral sink, spiral source. Boundary lines: defective or star node, center, degenerate.]

Idea: small changes in the eigenvalues ⇒ the system doesn’t move far in the trace-determinant diagram.

The Borderline Geometric Types

All the other possibilities for the linear system x′ = ax + by, y′ = cx + dy we call borderline types. We will show now that none of them is structurally stable; we begin with the center.

Eigenvalues pure imaginary. Once again we use the eigenvalue with the positive imaginary part: λ = 0 + si, s > 0. It corresponds to a center: the trajectories are a family of concentric ellipses, centered at the origin. If the coefficients a, b, c, d are changed a little, the eigenvalue 0 + si changes a little to r′ + s′i, where r′ ≈ 0, s′ ≈ s, and there are three possibilities for the new eigenvalue:

0 + si → r′ + s′i (center → new type):
r′ > 0, s′ > 0: source spiral
r′ < 0, s′ > 0: sink spiral
r′ = 0, s′ > 0: center

Correspondingly, there are three possibilities for how the geometric picture of the trajectories can change:

Eigenvalues real; one eigenvalue zero. Here λ1 = 0, and λ2 > 0 or λ2 < 0. The general solution to the system has the form (α1, α2 are the eigenvectors)

$$\mathbf{x} = c_1 \vec{\alpha}_1 + c_2 \vec{\alpha}_2 e^{\lambda_2 t}.$$

If λ2 < 0, the geometric picture of its trajectories shows a line of critical points (constant solutions, corresponding to c2 = 0), with all other trajectories being parallel lines ending up (for t = ∞) at one of the critical points, as shown below.

We continue to assume λ2 < 0. As the coefficients of the system change a little, the two eigenvalues change a little also; there are three possibilities, since the eigenvalue λ = 0 can become positive, negative, or stay zero:

λ1 = 0, λ2 < 0 → λ1′, λ2′ (critical line → new type):
λ1′ > 0, λ2′ < 0: unstable saddle
λ1′ = 0, λ2′ < 0: critical line
λ1′ < 0, λ2′ < 0: sink node

Here are the corresponding pictures. (The pictures would look the same if we assumed λ2 > 0, but the arrows on the trajectories would be reversed.)

One repeated real eigenvalue. Finally, we consider the case where λ1 = λ2. Here there are a number of possibilities, depending on whether λ1 is positive or negative, and whether the repeated eigenvalue is complete (i.e., has two independent eigenvectors), or defective (i.e., incomplete: only one eigenvector). Let us assume that λ1 < 0. We vary the coefficients of the system a little. By the same reasoning as before, the eigenvalues change a little, and we get as the main possibilities (omitting this time the one where the changed eigenvalue is still repeated):

λ1 = λ2 < 0 (sink node) →
λ1′ < 0, λ2′ < 0, λ1′ ≠ λ2′: sink node
complex pair r ± si with r ≈ λ1, s ≈ 0: sink spiral

Typical corresponding pictures for the complete case and the defective (incomplete) case are (the last one is left for you to experiment with on the computer screen)

complete: star node incomplete: defective node


Remarks. Each of these three cases—one eigenvalue zero, pure imaginary eigenvalues, repeated real eigenvalue—has to be looked on as a borderline linear system: altering the coefficients slightly can give it an entirely different geometric type, and in the first two cases, possibly alter its stability as well.
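The sensitivity described in these remarks is easy to see numerically. The sketch below is an illustration only: the function name `eigenvalues`, the perturbation size `eps`, and the two sample systems are choices made for this example. It changes the coefficient a slightly in a center and in a saddle and prints the resulting eigenvalues.

```python
import cmath

def eigenvalues(a, b, c, d):
    """Roots of lambda^2 - (a + d) lambda + (ad - bc) = 0."""
    tr, det = a + d, a * d - b * c
    root = cmath.sqrt(tr * tr - 4 * det)
    return (tr + root) / 2, (tr - root) / 2

eps = 1e-3  # size of the coefficient change (an arbitrary illustrative value)

# Center x' = y, y' = -x (eigenvalues +-i): the real part leaves zero at once.
for a in (-eps, 0.0, eps):
    lam, _ = eigenvalues(a, 1.0, -1.0, 0.0)
    print(f"center, a = {a:+.3f}: lambda = {lam.real:+.5f} {lam.imag:+.5f}i")

# Saddle x' = y, y' = x (eigenvalues +-1): the signs survive the same change.
for a in (-eps, 0.0, eps):
    lam1, lam2 = eigenvalues(a, 1.0, 1.0, 0.0)
    print(f"saddle, a = {a:+.3f}: lambda1 = {lam1.real:+.5f}, lambda2 = {lam2.real:+.5f}")
```

For the center, the sign of the real part follows the sign of the perturbation, so the type changes to a spiral sink or spiral source; for the saddle, one eigenvalue stays positive and the other negative.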


Structural Stability for Non-linear Systems

In the preceding note we discussed the structural stability of a linear system. How does it apply to non-linear systems?

Suppose our non-linear system has a critical point at P, and we want to study its trajectories near P by linearizing the system at P.

This linearization is only an approximation to the original system, so if it turns out to be a borderline case, i.e., one sensitive to the exact value of the coefficients, the trajectories near P of the original system can look like any of the types obtainable by slightly changing the coefficients of the linearization.

It could also look like a combination of types. For instance, if the linearized system had a critical line (i.e., one eigenvalue zero), the original system could have a sink node on one half of the critical line, and an unstable saddle on the other half. (This actually occurs.)

In other words, the method of linearization to analyze a non-linear system near a critical point doesn’t fail entirely, but we don’t end up with a definite picture of the non-linear system near P; we only get a list of possibilities. In general one has to rely on computation or more powerful analytic tools to get a clearer answer. The first thing to try is a computer picture of the non-linear system, which often will give the answer.

Example. x′ = y − x², y′ = −x + y²

$$\text{Jacobian: } J(x, y) = \begin{pmatrix} -2x & 1 \\ -1 & 2y \end{pmatrix}$$

Critical points: y − x² = 0 ⇒ y = x²

−x + y² = 0 ⇒ −x + x⁴ = 0 ⇒ x = 0, 1.

⇒ (0, 0) and (1, 1) are the critical points.

$$J(1, 1) = \begin{pmatrix} -2 & 1 \\ -1 & 2 \end{pmatrix}: \quad \text{characteristic equation: } \lambda^2 - 3 = 0 \;\Rightarrow\; \lambda = \pm\sqrt{3}$$

⇒ linearized system has a saddle.

This is structurally stable ⇒ the nonlinear system has a saddle at (1, 1).

$$J(0, 0) = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}: \quad \text{eigenvalues} = \pm i$$

⇒ a linearized center.

This is not structurally stable. The nonlinear system could be any one of a center, spiral out or spiral in. Using a computer program it appears that (0,0) is in fact a center. (This can be proven using more advanced methods.)

We can show the trajectories near (0,0) are not spirals by exploiting the symmetry of the picture. First note, if (x(t), y(t)) is a solution then so is (y(−t), x(−t)). That is, the trajectory is symmetric in the line x = y. This implies it can’t be a spiral. Since the only other choice is that the critical point (0,0) is a center, the trajectories must be closed.

The following two examples show that a linearized center might also be a spiral in or a spiral out in the nonlinear system.

Example a. x′ = y, y′ = −x − y³.

Example b. x′ = y, y′ = −x + y³.

In both examples the only critical point is (0, 0).

$$J(0, 0) = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}$$

⇒ linearized center. This is not structurally stable.

In example a the critical point turns out to be a spiral sink. In example b it is a spiral source.
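One way to see this is to track the distance from the origin. From the equations, d/dt (x² + y²) = 2xy + 2y(−x ∓ y³) = ∓2y⁴, so the distance cannot increase in example a and cannot decrease in example b. The Python sketch below checks this numerically with a classical Runge-Kutta step; the function names, starting point, step size and time span are illustrative assumptions, not values from the notes.

```python
def rk4_step(f, state, h):
    """One classical Runge-Kutta step for the autonomous system state' = f(state)."""
    k1 = f(state)
    k2 = f([s + h / 2 * k for s, k in zip(state, k1)])
    k3 = f([s + h / 2 * k for s, k in zip(state, k2)])
    k4 = f([s + h * k for s, k in zip(state, k3)])
    return [s + h / 6 * (p + 2 * q + 2 * r + w)
            for s, p, q, r, w in zip(state, k1, k2, k3, k4)]

def radius_after(sign, x0=0.3, y0=0.0, t_end=10.0, h=0.01):
    """Integrate x' = y, y' = -x + sign * y**3 and return the final distance from (0, 0).
    sign = -1 is example a, sign = +1 is example b."""
    def f(s):
        return [s[1], -s[0] + sign * s[1] ** 3]
    state = [x0, y0]
    for _ in range(int(round(t_end / h))):
        state = rk4_step(f, state, h)
    return (state[0] ** 2 + state[1] ** 2) ** 0.5

print("start radius:", 0.3)
print("example a:", radius_after(-1))  # should end closer to the origin
print("example b:", radius_after(+1))  # should end farther from the origin
```

The time span is kept short on purpose: in example b the trajectory moves away from the origin faster and faster, so a long run would eventually overflow.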

Below are computer-generated pictures. Because the y3 term causes the spiral to have a lot of turns we ’improved’ the pictures by using the power 1.1 instead.

[Figures: Spiral in. Spiral out.]

Introduction

In this final section we look at two important and interesting extensions of the ideas from qualitative DE theory we have been exploring in this unit, namely limit cycles and chaos.

Limit Cycles

In analyzing non-linear systems in the xy-plane, we have so far concentrated on finding the critical points and analyzing how the trajectories of the system look in the neighborhood of each critical point. This gives some feeling for how the other trajectories can behave, at least those which pass near enough to critical points.

Another important possibility which can influence how the trajectories look is if one of the trajectories traces out a closed curve C. If this happens, the associated solution x(t) will be geometrically realized by a point which goes round and round the curve C with a certain period T. That is, the solution vector

x(t) = (x(t), y(t))

will be a pair of periodic functions with period T :

x(t + T) = x(t), y(t + T) = y(t) for all t.

If there is such a closed curve, the nearby trajectories must behave something like C. The possibilities are illustrated below. The nearby trajectories can either spiral in toward C, they can spiral away from C, or they can themselves be closed curves. If the latter case does not hold — in other words, if C is an isolated closed curve — then C is called a limit cycle: stable, unstable, or semi-stable according to whether the nearby curves spiral towards C, away from C, or both.

The most important kind of limit cycle is the stable limit cycle, where nearby curves spiral towards C on both sides. Periodic processes in nature can often be represented as stable limit cycles, so that great interest is attached to finding such trajectories if they exist. Unfortunately, surprisingly little is known about how to do this, or how to show that a system has no limit cycles. There is active research in this subject today. We will present a few of the things that are known.

Showing Limit Cycles Exist

The main tool which historically has been used to show that the system

$$\begin{aligned} x' &= f(x, y) \\ y' &= g(x, y) \end{aligned} \tag{1}$$

has a stable limit cycle is the

Poincare-Bendixson Theorem. Suppose R is the finite region of the plane lying between two simple closed curves D1 and D2, and F is the velocity vector field for the system (1). If

(i) at each point of D1 and D2, the field F points toward the interior of R, and

(ii) R contains no critical points,

then the system (1) has a closed trajectory lying inside R.

The hypotheses of the theorem are illustrated by fig. 1. We will not give the proof of the theorem, which requires a background in Mathematical Analysis. Fortunately, the theorem strongly appeals to intuition. If we start on one of the boundary curves, the solution will enter R, since the velocity vector points into the interior of R. As time goes on, the solution can never leave R, since as it approaches a boundary curve, trying to escape from R, the velocity vectors are always pointing inwards, forcing it to stay inside R. Since the solution can never leave R, the only thing it can do as t → ∞ is either approach a critical point (but there are none, by hypothesis) or spiral in towards a closed trajectory. Thus there is a closed trajectory inside R. (It cannot be an unstable limit cycle; it must be one of the other three cases shown above.)

To use the Poincare-Bendixson theorem, one has to search the vector field for closed curves D along which the velocity vectors all point towards the same side. Here is an example where they can be found.

Example 1. Consider the system

$$\begin{aligned} x' &= -y + x(1 - x^2 - y^2) \\ y' &= x + y(1 - x^2 - y^2) \end{aligned} \tag{2}$$

Figure 2 shows how the associated velocity vector field looks on two circles. On a circle of radius 2 centered at the origin, the vector field points inwards, while on a circle of radius 1/2, the vector field points outwards. To prove this, we write the vector field along a circle of radius r as

$$\mathbf{x}' = (-y\,\mathbf{i} + x\,\mathbf{j}) + (1 - r^2)(x\,\mathbf{i} + y\,\mathbf{j}). \tag{3}$$

The first vector on the right side of (3) is tangent to the circle; the second vector points radially in for the big circle (r = 2), and radially out for the small circle (r = 1/2). Thus the sum of the two vectors given in (3) points inwards along the big circle and outwards along the small one.

We would like to conclude that the Poincare-Bendixson theorem applies to the ring-shaped region between the two circles. However, for this we must verify that R contains no critical points of the system. We leave you to show as an exercise that (0, 0) is the only critical point of the system; this shows that the ring-shaped region contains no critical points.

The above argument shows that the Poincare-Bendixson theorem can be applied to R, and we conclude that R contains a closed trajectory. In fact, it is easy to verify that x = cos t, y = sin t solves the system, so the unit circle is the locus of a closed trajectory. We leave as another exercise to show that it is actually a stable limit cycle for the system, and the only closed trajectory.
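The boundary condition in the theorem can also be checked numerically. The Python sketch below is illustrative only (the function names and the number of sample points are assumptions): it samples the velocity field of system (2) on a circle and computes its component along the outward unit normal, which by (3) should equal r(1 − r²).

```python
import math

def field(x, y):
    """Velocity field of system (2)."""
    r2 = x * x + y * y
    return (-y + x * (1 - r2), x + y * (1 - r2))

def radial_components(radius, samples=360):
    """Component of the field along the outward unit normal,
    at equally spaced points on the circle of the given radius."""
    values = []
    for k in range(samples):
        theta = 2 * math.pi * k / samples
        x, y = radius * math.cos(theta), radius * math.sin(theta)
        fx, fy = field(x, y)
        values.append((fx * x + fy * y) / radius)
    return values

for radius in (2.0, 0.5, 1.0):
    comps = radial_components(radius)
    print(radius, min(comps), max(comps))
```

A negative component on the circle of radius 2 and a positive one on the circle of radius 1/2 mean the field points into the ring on both boundaries; on the unit circle the component vanishes, consistent with the closed trajectory x = cos t, y = sin t.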


Non-Existence of Limit Cycles

We turn our attention now to the negative side of the problem of showing limit cycles exist. Here are two theorems which can sometimes be used to show that a limit cycle does not exist.

1. Bendixson’s Criterion

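If f_x and g_y are continuous in a region R which is simply-connected (i.e., without holes), and

$$\frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} \neq 0 \quad \text{at any point of } R,$$

then the system

$$\begin{aligned} x' &= f(x, y) \\ y' &= g(x, y) \end{aligned} \tag{1}$$

has no closed trajectories inside R.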

Proof. Assume there is a closed trajectory C inside R. We shall derive a contradiction, by applying Green’s theorem, in its normal (or flux) form. This theorem says

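$$\oint_C (f\,\mathbf{i} + g\,\mathbf{j}) \cdot \mathbf{n}\, ds \;\equiv\; \oint_C f\, dy - g\, dx \;=\; \iint_D \left( \frac{\partial f}{\partial x} + \frac{\partial g}{\partial y} \right) dx\, dy, \tag{2}$$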

where D is the region inside the simple closed curve C.

This however is a contradiction. Namely, by hypothesis, the integrand on the right-hand side is continuous and never 0 in R; thus it is either always positive or always negative, and the right-hand side of (2) is therefore either positive or negative.

On the other hand, the left-hand side must be zero. For since C is a closed trajectory, C is always tangent to the velocity field f i + g j defined by the system. This means the normal vector n to C is always perpendicular to the velocity field f i + g j, so that the integrand (f i + g j) · n on the left is identically zero.

This contradiction means that our assumption that R contained a closed trajectory of (1) was false, and Bendixson’s Criterion is proved.

2. Critical-point Criterion

A closed trajectory has a critical point in its interior.

If we turn this statement around, we see that it is really a criterion for non-existence: it says that if a region R is simply-connected (i.e., without holes) and has no critical points, then it cannot contain any limit cycles. For if it did, the Critical-point Criterion says there would be a critical point inside the limit cycle, and this point would also lie in R since R has no holes.

(Note carefully the distinction between this theorem, which says that limit cycles enclose regions which do contain critical points, and the Poincare-Bendixson theorem, which seems to imply that limit cycles tend to lie in regions which don’t contain critical points. The difference is that these latter regions always contain a hole; the critical points are in the hole. Example 1 illustrated this.)

Example 2. For what a and d does

$$\begin{aligned} x' &= ax + by \\ y' &= cx + dy \end{aligned}$$

have closed trajectories?

Solution. By Bendixson’s criterion, a + d ≠ 0 ⇒ no closed trajectories.

What if a + d = 0? Bendixson’s criterion says nothing. We go back to our analysis of the linear system. The characteristic equation of the system is

λ2 − (a + d)λ + (ad − bc) = 0 .

Assume a + d = 0. Then the characteristic roots have opposite sign if ad − bc < 0 and the system is a saddle; the roots are pure imaginary if ad − bc > 0 and the system is a center, which has closed trajectories. Thus

the system has closed trajectories ⇔ a + d = 0, ad − bc > 0.
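The conclusion of Example 2 amounts to a two-line test on the coefficients. The sketch below is only an illustration; the function name and the tolerance `tol` (used because floating-point coefficients rarely sum to exactly zero) are assumptions.

```python
def has_closed_trajectories(a, b, c, d, tol=1e-12):
    """Linear system x' = ax + by, y' = cx + dy has closed trajectories
    exactly when a + d = 0 and ad - bc > 0 (a center)."""
    trace = a + d
    det = a * d - b * c
    return abs(trace) <= tol and det > 0

print(has_closed_trajectories(0, 1, -1, 0))   # center
print(has_closed_trajectories(1, 2, 1, -1))   # trace 0 but det < 0: saddle
print(has_closed_trajectories(1, 1, -1, 0))   # trace nonzero: ruled out by Bendixson
```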


The Van der Pol Equation

An important kind of second-order non-linear autonomous equation has the form

$$x'' + u(x)\, x' + v(x) = 0 \qquad \text{(Liénard equation)}. \tag{1}$$

One might think of this as a model for a spring-mass system where the damping force u(x) depends on position (for example, the mass might be moving through a viscous medium of varying density), and the spring constant v(x) depends on how much the spring is stretched—this last is true of all springs, to some extent. We also allow for the possibility that u(x) < 0 (i.e., that there is "negative damping").

The system equivalent to (1) is

$$\begin{aligned} x' &= y \\ y' &= -v(x) - u(x)\, y \end{aligned} \tag{2}$$

Under certain conditions, the system (2) has a unique stable limit cycle, or what is the same thing, the equation (1) has a unique periodic solution; and all nearby solutions tend towards this periodic solution as t → ∞. The conditions which guarantee this were given by Liénard, and generalized in the following theorem.

Levinson-Smith Theorem Suppose the following conditions are satisfied.

(a) u(x) is even and continuous,

(b) v(x) is odd, v(x) > 0 if x > 0, and v(x) is continuous for all x,

(c) V(x) → ∞ as x → ∞, where

$$V(x) = \int_0^x v(t)\, dt,$$

(d) for some k > 0, we have

U(x) < 0, for 0 < x < k,
U(x) > 0 and increasing, for x > k,
U(x) → ∞, as x → ∞,

where

$$U(x) = \int_0^x u(t)\, dt.$$

Then, the system (2) has

i) a unique critical point at the origin;

ii) a unique non-zero closed trajectory C, which is a stable limit cycle around the origin;

iii) all other non-zero trajectories spiralling towards C as t → ∞.

We omit the proof, as too difficult. A classic application is to the equation

$$x'' - a(1 - x^2)\, x' + x = 0 \qquad \text{(van der Pol equation)} \tag{3}$$

which describes the current x(t) in a certain type of vacuum tube. (The constant a is a positive parameter depending on the tube constants.) The equation has a unique non-zero periodic solution. Intuitively, think of it as modeling a non-linear spring-mass system. When |x| is large, the restoring and damping forces are large, so that |x| should decrease with time. But when |x| gets small, the damping becomes negative, which should make |x| tend to increase with time. Thus it is plausible that the solutions should oscillate; that it has exactly one periodic solution is a more subtle fact.
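The convergence to a single periodic solution can be observed numerically. The following Python sketch is an illustration under stated assumptions: the choice a = 1, the two starting values, the step size and the time span are all arbitrary, and the integrator is a hand-written classical Runge-Kutta step applied to the equivalent system x′ = y, y′ = −x + a(1 − x²)y.

```python
def van_der_pol_amplitude(x0, a=1.0, t_end=60.0, h=0.01):
    """Integrate the van der Pol system from (x0, 0) with RK4 and return
    the largest |x| seen during the last third of the run."""
    def f(x, y):
        return y, -x + a * (1 - x * x) * y

    x, y = x0, 0.0
    steps = int(round(t_end / h))
    peak = 0.0
    for n in range(steps):
        k1 = f(x, y)
        k2 = f(x + h / 2 * k1[0], y + h / 2 * k1[1])
        k3 = f(x + h / 2 * k2[0], y + h / 2 * k2[1])
        k4 = f(x + h * k3[0], y + h * k3[1])
        x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if n >= 2 * steps // 3:      # keep only the late, settled part
            peak = max(peak, abs(x))
    return peak

print("small start:", van_der_pol_amplitude(0.1))
print("large start:", van_der_pol_amplitude(4.0))
```

Both runs should report nearly the same amplitude, close to 2 for a = 1, whether the solution starts inside or outside the limit cycle.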

There is a lot of interest in limit cycles, because of their appearance in systems which model processes exhibiting periodicity. But not a great deal is known about them – this is still an area of active research.


Chaos

We give a very brief introduction to this subject using DE’s as the starting point. The interested reader who wishes to explore this subject further will find many good sources on the web.

1. Discrete Logistic Equation

The difference equation

$$x_{n+1} = r x_n (1 - x_n) \qquad (r \text{ a constant})$$

is the discrete logistic equation. One way it arises is as follows. The equation

$$\frac{dP}{dt} = aP - bP^2$$

is a model of logistic population growth. Euler’s numerical method with step size h makes this a discrete system:

$$P_{n+1} = P_n + (aP_n - bP_n^2)\,h.$$

Rewrite this as

$$P_{n+1} = rP_n - sP_n^2, \qquad \text{where } r = 1 + ah, \ s = bh.$$

Let Pn = (r/s) xn ⇒ xn+1 = r xn(1 − xn). Since r = 1 + ah we will only consider r > 1. Given r and x0, iteration gives the sequence

x0, x1, x2, . . . , xn, . . .

[Figure 1.]

This is easy to implement on a computer: Figure 1 shows an x vs. r diagram. To make it we used the following recipe.

1. We choose a value of r and a starting point x0 = .5.

2. We iterate out to x500 in order to eliminate any transient behavior.

3. We then plot 1000 points (r, xn) for n = 501 to 1500.

The darker the plotted point the more times that we got that value of x.
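The recipe above translates almost line for line into code. The Python sketch below is a minimal illustration, not the program used to produce Figure 1: the function name `logistic_tail` and its parameter names are assumptions, and instead of plotting it prints the distinct values (rounded to four places) that remain after the transient.

```python
def logistic_tail(r, x0=0.5, transient=500, keep=1000):
    """Iterate x_{n+1} = r x_n (1 - x_n) starting from x0.
    Discard x_1 .. x_transient, then return the next `keep` values."""
    x = x0
    for _ in range(transient):
        x = r * x * (1 - x)
    tail = []
    for _ in range(keep):
        x = r * x * (1 - x)
        tail.append(x)
    return tail

for r in (1.5, 2.5, 3.1, 3.5):
    distinct = sorted({round(x, 4) for x in logistic_tail(r)})
    print(f"r = {r}: {len(distinct)} distinct value(s), first few: {distinct[:8]}")
```

To draw the diagram itself one would loop over a fine grid of r values and plot the points (r, x) for every x in the returned tail.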

Look, for instance, at the value r = 1.5. The only x value plotted is the one at x = .333. This says that the iterated sequence x0, x1, . . . goes to a limit of .333. The values r = 2 and r = 2.5 behave similarly.

At around r = 3 the diagram bifurcates. That is, it splits into two branches. What this means is that the value of xn cycles back and forth between two values. In the case r = 3.1 we get

x1001 = .5580, x1002 = .7646, x1003 = .5580, . . .

We call this a period 2 cycle.

As r increases from 3.1 we continue to get period 2 cycles until around r = 3.45. At this point both branches of the diagram bifurcate and we see four values plotted. This means the values of xn are cycling between four values. This is called a cycle with period 4 (or a 4-cycle for short).

This continues as r increases until the next bifurcation point where we get cycles of period 8. As r increases further, this period doubling continues to cycles of period 16, 32, etc.

Then around r = 3.57 something new happens: the periodic behavior disappears and seemingly random behavior occurs. This is called chaos.

At around r = 3.83 periodic behavior returns with cycles of period 3. As r increases we again see period doubling with cycles of period 6, then 12, then 24 etc. until this leads to chaos again.

After the chaotic region there is a value of r where we see period 5 cycles. This is followed by period doubling, leading to chaos again. Then 7-cycles followed by period doubling to chaos, etc.

Figures 2-4. The pitchfork at various resolutions.

Remarks: 1. This period doubling to chaos is a phenomenon seen in many systems.

2. For any value of r there are fixed points and, often, points with other periods. The computer doesn’t find them because they are not stable. In fact, there is a theorem that says if there is a point of period 3 then there are points of all periods.

2. Feigenbaum constant

If r1 = first bifurcation point, r2 = second, etc., then

$$\lim_{k \to \infty} \frac{r_k - r_{k-1}}{r_{k+1} - r_k} = 4.6692\ldots = \text{the Feigenbaum constant}.$$

The same value occurs in many ’period doubling’ systems.

3. The forced Duffing Equation

Period doubling also happens in mechanical systems. If we apply a periodic force to the damped nonlinear spring we get the equation

$$m x'' + c x' + k x + \beta x^3 = F_0 \cos \omega t.$$

A mass atop a thin metal wire is modeled by this equation with k < 0.

We look at the forced Duffing equation

$$x'' + x' - x + x^3 = F_0 \cos t$$

and the equivalent nonlinear system

$$\begin{aligned} x' &= y \\ y' &= x - x^3 - y + F_0 \cos t. \end{aligned}$$

If F0 = 0 (unforced) then there are 3 equilibrium points: (0, 0) –unstable (saddle); (±1, 0) –stable (spiral sinks). These are shown in the pictures at right.

[Figures: phase portraits near the three equilibria, labeled stable, unstable, stable.]

In a linear spring system the single critical point at the origin is stable and the frequency of the periodic response would equal ω (which in this case is 1) and doubling the amplitude of the input would simply double the amplitude of the output. In the Duffing system, the behavior is very different.

The plots below were made by taking x(0) = 1, x′(0) = 0, running the ode solver for t = 0 to 200, and plotting for t = 100 to 200. (We throw away t = 0 to 100 as transient.)

Just like the discrete logistic equation, we see period doubling to chaos.
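The procedure just described can be sketched in a few lines of Python. This is an illustration rather than the solver used for the plots in these notes: the function name `duffing`, the step size, and the sample values of F0 in the demonstration are assumptions (the notes do not list the F0 values used), and a hand-written Runge-Kutta step stands in for a library ODE solver.

```python
import math

def duffing(F0, x0=1.0, v0=0.0, t_end=200.0, h=0.01, t_keep=100.0):
    """RK4 for x' = y, y' = x - x^3 - y + F0 cos t, started at (x0, v0).
    Returns the (t, x) samples with t >= t_keep; earlier ones are treated as transient."""
    def f(t, x, y):
        return y, x - x ** 3 - y + F0 * math.cos(t)

    x, y = x0, v0
    samples = []
    for n in range(int(round(t_end / h))):
        t = n * h
        k1 = f(t, x, y)
        k2 = f(t + h / 2, x + h / 2 * k1[0], y + h / 2 * k1[1])
        k3 = f(t + h / 2, x + h / 2 * k2[0], y + h / 2 * k2[1])
        k4 = f(t + h, x + h * k3[0], y + h * k3[1])
        x += h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0])
        y += h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1])
        if (n + 1) * h >= t_keep:
            samples.append(((n + 1) * h, x))
    return samples

if __name__ == "__main__":
    for F0 in (0.0, 0.2, 0.4):   # arbitrary sample forcing amplitudes
        xs = [x for _, x in duffing(F0)]
        print(f"F0 = {F0}: x ranges over [{min(xs):.4f}, {max(xs):.4f}] for t in [100, 200]")
```

Plotting the returned samples against t for a sequence of increasing F0 values is one way to look for the period doubling described above.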


4. Lorenz Strange Attractor

This example is the Lorenz System. It is a 3 dimensional system

$$\begin{aligned} x' &= -sx + sy \\ y' &= -xz + rx - y \\ z' &= xy - bz \end{aligned}$$

where s, r, b are constants.

The following picture shows the famous ’butterfly’. (The plot should be three dimensional, showing x, y and z. In this case we just plotted z vs. x.) Like limit cycles or the periodic points in the pitchfork example this trajectory has a limiting set. That is, a set of points that are arbitrarily close to the trajectory for arbitrarily large t. In this case, it is called a strange attractor because it is such a complicated set.
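A trajectory of the Lorenz system is easy to compute. The Python sketch below is illustrative only: the notes do not state the constants used for the picture, so the commonly quoted values s = 10, r = 28, b = 8/3 are an assumption, as are the function names, starting points, step size and time span. It integrates two trajectories whose starting points differ by 10⁻⁸ and reports how far apart they get.

```python
def lorenz_path(start, s=10.0, r=28.0, b=8.0 / 3.0, t_end=40.0, h=0.01):
    """RK4 for x' = -sx + sy, y' = -xz + rx - y, z' = xy - bz.
    Returns the list of (x, y, z) points, one per step."""
    def f(p):
        x, y, z = p
        return (-s * x + s * y, -x * z + r * x - y, x * y - b * z)

    def shifted(p, k, scale):
        return tuple(pi + scale * ki for pi, ki in zip(p, k))

    p = tuple(float(v) for v in start)
    path = [p]
    for _ in range(int(round(t_end / h))):
        k1 = f(p)
        k2 = f(shifted(p, k1, h / 2))
        k3 = f(shifted(p, k2, h / 2))
        k4 = f(shifted(p, k3, h))
        p = tuple(pi + h / 6 * (q1 + 2 * q2 + 2 * q3 + q4)
                  for pi, q1, q2, q3, q4 in zip(p, k1, k2, k3, k4))
        path.append(p)
    return path

def separations(path_a, path_b):
    """Euclidean distance between corresponding points of two paths."""
    return [sum((u - v) ** 2 for u, v in zip(p, q)) ** 0.5
            for p, q in zip(path_a, path_b)]

if __name__ == "__main__":
    base = lorenz_path((1.0, 1.0, 1.0))
    nudged = lorenz_path((1.0, 1.0, 1.0 + 1e-8))
    gap = separations(base, nudged)
    print("initial separation:", gap[0])
    print("largest separation:", max(gap))
```

Plotting the z coordinate against the x coordinate of the returned points gives a picture of the kind described above; the growth of the separation illustrates the sensitive dependence on initial conditions that is typical of chaotic systems.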
