
Mathematical Foundations of Engineering Analysis

Rico A. R. Picone
Department of Mechanical Engineering
Saint Martin's University
04 December 2020

Copyright © 2020 Rico A. R. Picone. All Rights Reserved.

Contents

01 Mathematics itself 8
  Truth 9
    Neo-classical theories of truth 9
    The picture theory 12
    The relativity of truth 15
    Other ideas about truth 16
    Where this leaves us 21
  The foundations of mathematics 23
    Algebra ex nihilo 24
    The application of mathematics to science 25
    The rigorization of mathematics 25
    The foundations of mathematics are built 26
    The foundations have cracks 29
    Mathematics is considered empirical 30
  Mathematical reasoning 32
  Mathematical topics in overview 33
  What is mathematics for engineering? 34
  Exercises for Chapter 01 35

02 Mathematical reasoning, logic, and set theory 36
  Introduction to set theory 38
  Logical connectives and quantifiers 43
    Logical connectives 43
    Quantifiers 43
  Exercises for Chapter 02 45
    Exe. 02.1 hardhat 45
    Exe. 02.2 2 45
    Exe. 02.3 3 45
    Exe. 02.4 4 45

03 Probability 46
  Probability and measurement 47
  Basic probability theory 48
    Algebra of events 49
  Independence and conditional probability 51
    Conditional probability 52
  Bayes' theorem 54
    Testing outcomes 54
    Posterior probabilities 55
  Random variables 60
  Probability density and mass functions 63
    Binomial PMF 64
    Gaussian PDF 70
  Expectation 72
  Central moments 75
  Exercises for Chapter 03 78
    Exe. 03.1 5 78

04 Statistics 79
  Populations, samples, and machine learning 81
  Estimation of sample mean and variance 83
    Estimation and sample statistics 83
    Sample mean, variance, and standard deviation 83
    Sample statistics as random variables 85
    Nonstationary signal statistics 86
  Confidence 98
    Generate some data to test the central limit theorem 99
    Sample statistics 101
    The truth about sample means 101
    Gaussian and probability 102
  Student confidence 110
  Multivariate probability and correlation 115
    Marginal probability 117
    Covariance 118
    Conditional probability and dependence 122
  Regression 125
  Exercises for Chapter 04 131

05 Vector calculus 132
  Divergence, surface integrals, and flux 135
    Flux and surface integrals 135
    Continuity 135
    Divergence 136
    Exploring divergence 137
  Curl, line integrals, and circulation 144
    Line integrals 144
    Circulation 144
    Curl 145
    Zero curl, circulation, and path independence 146
    Exploring curl 147
  Gradient 153
    Gradient 153
    Vector fields from gradients are special 154
    Exploring gradient 154
  Stokes and divergence theorems 164
    The divergence theorem 164
    The Kelvin-Stokes theorem 165
    Related theorems 165
  Exercises for Chapter 05 166
    Exe. 05.1 6 166

06 Fourier and orthogonality 168
  Fourier series 169
  Fourier transform 174
  Generalized Fourier series and orthogonality 184
  Exercises for Chapter 06 186
    Exe. 06.1 secrets 186
    Exe. 06.2 seesaw 189
    Exe. 06.3 9 189
    Exe. 06.4 10 190
    Exe. 06.5 11 190
    Exe. 06.6 12 190
    Exe. 06.7 13 190
    Exe. 06.8 14 190
    Exe. 06.9 savage 191
    Exe. 06.10 strawman 192

07 Partial differential equations 193
  Classifying PDEs 196
  Sturm-Liouville problems 199
    Types of boundary conditions 200
  PDE solution by separation of variables 205
  The 1D wave equation 212
  Exercises for Chapter 07 219
    Exe. 07.1 poltergeist 219
    Exe. 07.2 kathmandu 220

08 Optimization 222
  Gradient descent 223
    Stationary points 223
    The gradient points the way 224
    The classical method 225
    The Barzilai and Borwein method 226
  Constrained linear optimization 235
    Feasible solutions form a polytope 235
    Only the vertices matter 236
  The simplex algorithm 238
  Exercises for Chapter 08 245
    Exe. 08.1 cummerbund 245
    Exe. 08.2 melty 245

09 Nonlinear analysis 246
  Nonlinear state-space models 248
    Autonomous and nonautonomous systems 248
    Equilibrium 248
  Nonlinear system characteristics 250
    Those in common with linear systems 250
    Stability 250
    Qualities of equilibria 251
  Nonlinear system simulation 254
  Simulating nonlinear systems in Python 255
  Exercises for Chapter 09 257

A Distribution tables 258
  A01 Gaussian distribution table 259
  A02 Student's t-distribution table 261

B Fourier and Laplace tables 262
  B01 Laplace transforms 263
  B02 Fourier transforms 265

C Mathematics reference 268
  C01 Quadratic forms 269
    Completing the square 269
  C02 Trigonometry 270
    Triangle identities 270
    Reciprocal identities 270
    Pythagorean identities 270
    Co-function identities 271
    Even-odd identities 271
    Sum-difference formulas (AM or lock-in) 271
    Double angle formulas 271
    Power-reducing or half-angle formulas 271
    Sum-to-product formulas 272
    Product-to-sum formulas 272
    Two-to-one formulas 272
  C03 Matrix inverses 273
  C04 Laplace transforms 274

D Complex analysis 275
  D01 Euler's formulas 276

Bibliography 277

01 Mathematics itself

[1] For much of this lecture I rely on the thorough overview of Glanzberg (Michael Glanzberg, "Truth," in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Fall 2018, Metaphysics Research Lab, Stanford University, 2018).

Truth

1. Before we can discuss mathematical truth, we should begin with a discussion of truth itself.[1] It is important to note that this is obviously extremely incomplete; my aim is to give a sense of the subject via brutal (mis)abbreviation.

2. Of course, the study of truth cannot but be entangled with the study of the world as such (metaphysics) and of knowledge (epistemology). Some of the following theories presuppose or imply a certain metaphysical and/or epistemological theory, but which these are is controversial.

Neo-classical theories of truth

3. The neo-classical theories of truth take for granted that there is truth and attempt to explain what its precise nature is (Michael Glanzberg, "Truth," in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Fall 2018, Metaphysics Research Lab, Stanford University, 2018). What are provided here are modern understandings of theories developed primarily in the early 20th century.

The correspondence theory

4. A version of what is called the correspondence theory of truth is the following:

A proposition is true iff there is an existing entity in the world that corresponds with it.

[2] This is typically put in terms of "beliefs" or "judgments," but for brevity and parallelism I have cast it in terms of propositions. It is to this theory I have probably committed the most violence.

Such existing entities are called facts. Facts are relational in that their parts (e.g. subject, predicate, etc.) are related in a certain way.

5. Under this theory, then, if a proposition does not correspond to a fact, it is false.

6. This theory of truth is rather intuitive and consistently popular (Marian David, "The Correspondence Theory of Truth," in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Fall 2016, Metaphysics Research Lab, Stanford University, 2016, a detailed overview of the correspondence theory of truth).

The coherence theory

7. The coherence theory of truth is adamant that the truth of any given proposition is only as good as its holistic system of propositions.[2] This includes (but typically goes beyond) a requirement for consistency of a given proposition with the whole and the self-consistency of the whole itself, sometimes called coherence.

8. For parallelism, let's attempt a succinct formulation of this theory cast in terms of propositions:

A proposition is true iff it has coherence with a system of propositions.

[3] Pragmatism was an American philosophical movement of the early 20th century that valued the success of "practical" application of theories. For an introduction, see Legg and Hookway (Catherine Legg and Christopher Hookway, "Pragmatism," in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Spring 2019, Metaphysics Research Lab, Stanford University, 2019, an introductory article on the philosophical movement "pragmatism"; it includes an important clarification of the pragmatic slogan "truth is the end of inquiry").

[4] This is especially congruent with the work of William James (ibid.).

9. Note that this has no reference to facts whatsoever. However, it need not necessarily preclude them.

The pragmatic theory

10. Of the neo-classical theories of truth, this is probably the least agreed upon as having a single clear statement (Glanzberg, "Truth"). However, as with pragmatism in general,[3] the pragmatic truth is oriented practically.

11. Perhaps the most important aspect of this theory is that it is thoroughly a correspondence theory, agreeing that true propositions are those that correspond to the world. However, there is a different focus here that differentiates it from correspondence theory proper: it values as more true that which has some sort of practical use in human life.

12. We'll try to summarize pragmatism in two slogans with slightly different emphases; here's the first, again cast in propositional parallel:

A proposition is true iff it works.[4]

Now, there are two ways this can be understood: (a) the proposition "works" in that it empirically corresponds to the world, or (b) the proposition "works" in that it has an effect that some agent intends. The former is pretty standard correspondence theory. The latter is new and fairly obviously has ethical implications, especially today.

[5] This is especially congruent with the work of Charles Sanders Peirce (Legg and Hookway, "Pragmatism").

13. Let us turn to a second formulation:

A proposition is true if it corresponds with a process of inquiry.[5]

This has two interesting facets: (a) an agent's active inquiry creates truth, and (b) it is a sort of correspondence theory that requires a correspondence of a proposition with a process of inquiry, not, as in the correspondence theory, with a fact about the world. The latter has shades of both correspondence theory and coherence theory.

The picture theory

Before we delve into this theory, we must take a moment to clarify some terminology.

States of affairs and facts

14. When discussing the correspondence theory, we have used the term fact to mean an actual state of things in the world. A problem arises in the correspondence theory here: it says that a proposition is true iff there is a fact that corresponds with it. What of a negative proposition like "there are no cows in Antarctica"? We would seem to need a corresponding "negative fact" in the world to make this true. If a fact is taken to be composed of a complex of actual objects and relations, it is hard to imagine such facts.[6]

[6] But Barker and Jago (Stephen Barker and Mark Jago, "Being Positive About Negative Facts," in Philosophy and Phenomenological Research 85.1 (2012), pp. 117-138, DOI: 10.1111/j.1933-1592.2010.00479.x) have attempted just that.

[7] See Wittgenstein (Ludwig Wittgenstein, Tractatus Logico-Philosophicus, ed. C. K. Ogden, Project Gutenberg International Library of Psychology, Philosophy and Scientific Method, Kegan Paul, Trench, Trubner & Co., Ltd., 1922, a brilliant work on what is possible to express in language, and what is not; as Wittgenstein puts it, "What can be said at all can be said clearly; and whereof one cannot speak thereof one must be silent"), Biletzki and Matar (Anat Biletzki and Anat Matar, "Ludwig Wittgenstein," in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Summer 2018, Metaphysics Research Lab, Stanford University, 2018, an introduction to Wittgenstein and his thought), Glock (Hans-Johann Glock, "Truth in the Tractatus," in Synthese 148.2 (Jan. 2006), pp. 345-368), and Dolby (David Dolby, "Wittgenstein on Truth," in A Companion to Wittgenstein, John Wiley & Sons, Ltd, 2016, chap. 27, pp. 433-442, ISBN 9781118884607, DOI: 10.1002/9781118884607.ch27).

15. Furthermore, if a proposition is true, it seems that it is the corresponding fact that makes it so; what, then, makes a proposition false, since there is no fact to support the falsity? (Mark Textor, "States of Affairs," in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Winter 2016, Metaphysics Research Lab, Stanford University, 2016)

16. And what of nonsense? There are some propositions, like "there is a round cube," that are neither true nor false. However, the preceding correspondence theory cannot differentiate between false and nonsensical propositions.

17. A state of affairs is something possible that may or may not be actual (ibid.). If a state of affairs is actual, it is said to obtain. The picture theory will make central this concept instead of that of the fact.

The picture theory of meaning (and truth)

18. The picture theory of meaning uses the analogy of the model or picture to explain the meaningfulness of propositions:[7]

A proposition names possible objects and arranges these names to correspond to a state of affairs.

(Figure 01.1: a representation of the picture theory.)

See Figure 01.1. This also allows for an easy account of truth, falsity, and nonsense:

nonsense: A sentence that appears to be a proposition is actually not if the arrangement of named objects is impossible. Such a sentence is simply nonsense.

truth: A proposition is true if the state of affairs it depicts obtains.

falsity: A proposition is false if the state of affairs it depicts does not obtain.

19. Now, some (Hans-Johann Glock, "Truth in the Tractatus," in Synthese 148.2 (Jan. 2006), pp. 345-368, ISSN 1573-0964, DOI: 10.1007/s11229-004-6226-2) argue this is a correspondence theory, and others (David Dolby, "Wittgenstein on Truth," in A Companion to Wittgenstein, John Wiley & Sons, Ltd, 2016, chap. 27, pp. 433-442, ISBN 9781118884607, DOI: 10.1002/9781118884607.ch27) that it is not. In any case, it certainly solves some issues that have plagued the correspondence theory.

"What cannot be said must be shown"

20. Something the picture theory does is declare a limit on what can meaningfully be said. A proposition (as defined above) must be potentially true or false. Therefore, something that cannot be false (something necessarily true) cannot be a proposition (ibid.). And there are certain things that are necessarily true for language itself to be meaningful, paradigmatically the logical structure of the world. What a proposition does, then, is show, via its own logical structure, the necessary (for there to be meaningful propositions at all) logical structure of the world.[8]

[8] See also Žižek (Slavoj Žižek, Less Than Nothing: Hegel and the Shadow of Dialectical Materialism, Verso, 2012, ISBN 9781844678976, one of the most interesting presentations of Hegel and Lacan by one of the most exciting contemporary philosophers, pp. 25-6), from whom I stole the section title.

[9] This was one of the contributions to the "linguistic turn" (Wikipedia, "Linguistic turn," https://en.wikipedia.org/w/index.php?title=Linguistic%20turn&oldid=922305269, accessed 23 October 2019; hey, we all do it) of philosophy in the early 20th century.

21. An interesting feature of this perspective is that it opens up language itself to analysis and limitation.[9] And furthermore, it suggests that the set of what is is smaller than the set of what can be meaningfully spoken about.

The relativity of truth

22. Each subject (i.e. agent) in the world with their propositions has a perspective: a given moment, a given place, an historical-cultural-linguistic situation. At the very least, the truth of propositions must account for this. Just how a theory of truth should do so is a matter of significant debate (Maria Baghramian and J. Adam Carter, "Relativism," in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Winter 2019, Metaphysics Research Lab, Stanford University, 2019).

23. Some go so far as to be skeptical about truth (Peter Klein, "Skepticism," in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Summer 2015, Metaphysics Research Lab, Stanford University, 2015), regarding it to be entirely impossible. Others say that while a proposition may or may not be true, we could never come to know this.

24. Often underlying this conversation is the question of there being a common world in which we all participate and, if so, whether or not we can properly represent this world in language such that multiple subjects could come to justifiably agree or disagree on the truth of a proposition. If every proposition is so relative that it is relevant to only the proposer, truth would seem of little value. On the other hand, if truth is understood to be "objective," independent of subjective perspective, a number of objections can be made (Baghramian and Carter, "Relativism"), such as that there is no non-subjective perspective from which to judge truth.

[10] Especially notable here is the work of Alfred Tarski in the mid-20th century.

Other ideas about truth

25. There are too many credible ideas about truth to attempt a reasonable summary; however, I will attempt to highlight a few important ones.

Formal methods

26. A set of tools was developed for exploring theories of truth, especially correspondence theories.[10] Focus turned from beliefs to sentences, which are akin to propositions. (Recall that the above theories have already been recast in the more modern language of propositions.) Another aspect of these sentences under consideration is that they begin to be taken as interpreted sentences: they already have meaning.

27. Beyond this, several technical apparatus are introduced that formalize criteria for truth. For instance, a sentence is given a sign φ. A need arises to distinguish between the quotation of sentence φ and the unquoted sentence φ, which is then given the quasi-quotation notation ⌜φ⌝. For instance, let φ stand for snow is white; then φ → snow is white and ⌜φ⌝ → "snow is white". Tarski introduces Convention T, which states that, for a fixed language L with fully interpreted sentences (Glanzberg, "Truth"):

An adequate theory of truth for L must imply, for each sentence φ of L: ⌜φ⌝ is true if and only if φ.

Using the same example, then:

"snow is white" is true if and only if snow is white.

Convention T states a general rule for the adequacy of a theory of truth and is used in several contemporary theories.

[11] Wilfrid Hodges, "Tarski's Truth Definitions," in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Fall 2018, Metaphysics Research Lab, Stanford University, 2018; Mario Gómez-Torrente, "Alfred Tarski," in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Spring 2019, Metaphysics Research Lab, Stanford University, 2019; Peter Hylton and Gary Kemp, "Willard Van Orman Quine," in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Spring 2019, Metaphysics Research Lab, Stanford University, 2019.
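For readers typesetting the schema, quasi-quotation is conventionally rendered with corner quotes; a minimal LaTeX sketch (assuming the amssymb package, which supplies \ulcorner and \urcorner):

```latex
\documentclass{article}
\usepackage{amsmath,amssymb}
\begin{document}
% Convention T: for each sentence $\varphi$ of a fixed,
% fully interpreted language $L$, an adequate theory of
% truth must imply the corresponding T-sentence:
\[
  \ulcorner \varphi \urcorner \ \text{is true in } L
  \iff \varphi .
\]
\end{document}
```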

28. We can see that these formal methods get quite technical and fun. For more, see Hodges, Gómez-Torrente, and Hylton and Kemp.[11]

Deflationary theories

29. Deflationary theories of truth try to minimize, or eliminate altogether, the concept of, or use of the term, "truth." For instance, the redundancy theory claims that (Glanzberg, "Truth"):

To assert that ⌜φ⌝ is true is just to assert that φ.

Therefore, we can eliminate the use of "is true."

30. For more of less, see Stoljar and Damnjanovic.[12]

[12] Daniel Stoljar and Nic Damnjanovic, "The Deflationary Theory of Truth," in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Fall 2014, Metaphysics Research Lab, Stanford University, 2014.

Language

31. It is important to recognize that language mediates truth; that is, truth is embedded in language. The way language in general affects theories of truth has been studied extensively. For instance, whether the truth-bearer is a belief or a proposition or a sentence, or something else, has been much discussed. The importance of the meaning of truth-bearers like sentences has played another large role. Theories of meaning, like the picture theory presented above, are often closely intertwined with theories of truth.

32. One of the most popular theories of meaning is called the theory of use:

For a large class of cases of the employment of the word "meaning" – though not for all – this word can be explained in this way: the meaning of a word is its use in the language. (L. Wittgenstein, P. M. S. Hacker, and J. Schulte, Philosophical Investigations, Wiley, 2010, ISBN 9781444307979)

This theory is accompanied by the concept of language-games, which are loosely defined as rule-based contexts within which sentences have uses. The idea is that the meaning of a given sentence is its use in a network of meaning that is constantly evolving. This view tends to be understood as deflationary or relativistic about truth.

[13] Drew Khlentzos, "Challenges to Metaphysical Realism," in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Winter 2016, Metaphysics Research Lab, Stanford University, 2016.

Metaphysical and epistemological considerations

33. We began with the recognition that truth is intertwined with metaphysics and epistemology. Let's consider a few such topics.

34. The first is metaphysical realism, which claims that there is a world existing objectively, independently of how we think about or describe it. This "realism" tends to be closely tied to, yet distinct from, scientific realism, which goes further, claiming the world is "actually" as science describes, independently of the scientific descriptions (e.g. there are actual objects corresponding to the phenomena we call atoms, molecules, light particles, etc.).

35. There have been many challenges to the realist claim (for some recent versions, see Khlentzos[13]), put forth by what is broadly called anti-realism. These vary, but often challenge the ability of realists to properly link language to supposed objects in the world.

[14] These definitions are explicated by Guyer and Horstmann (Paul Guyer and Rolf-Peter Horstmann, "Idealism," in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Winter 2018, Metaphysics Research Lab, Stanford University, 2018).

36. Metaphysical idealism has been characterized as claiming that "mind" or "subjectivity" generates or completely composes the world, which has no being outside mind. Epistemological idealism, on the other hand, while perhaps conceding that there is a world independent of mind, claims all knowledge of the world is created through mind and for mind, and therefore can never escape a sort of mind-world gap.[14] This epistemological idealism has been highly influential since the work of Immanuel Kant (I. Kant, P. Guyer, and A. W. Wood, Critique of Pure Reason, The Cambridge Edition of the Works of Immanuel Kant, Cambridge University Press, 1999, ISBN 9781107268333) in the late 18th century, which ushered in the idea of the noumenal world in-itself and the phenomenal world, which is how the noumenal world presents to us. Many have held that phenomena can be known through inquiry, whereas noumena are inaccessible. Furthermore, what can be known is restricted by the categories pre-existent in the knower.

37. Another approach, taken by Georg Wilhelm Friedrich Hegel (Paul Redding, "Georg Wilhelm Friedrich Hegel," in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Summer 2018, Metaphysics Research Lab, Stanford University, 2018) and other German idealists following Kant, is to reframe reality as thoroughly integrating subjectivity (G. W. F. Hegel and A. V. Miller, Phenomenology of Spirit, Motilal Banarsidass, 1998, ISBN 9788120814738; Slavoj Žižek, Less Than Nothing: Hegel and the Shadow of Dialectical Materialism, Verso, 2012, ISBN 9781844678976, one of the most interesting presentations of Hegel and Lacan by one of the most exciting contemporary philosophers); that is, "everything turns on grasping and expressing the True, not only as Substance, but equally as Subject." A subject's proposition is true inasmuch as it corresponds with its Notion (approximately, the idea or meaning for the subject). Some hold that this idealism is compatible with a sort of metaphysical realism, at least as far as understanding is not independent of, but rather beholden to, reality (Žižek, Less Than Nothing, p. 906 ff.).

38. Clearly all these ideas have many implications for theories of truth, and vice versa.

Where this leaves us

39. The truth is hard. What may at first appear to be a simple concept becomes complex upon analysis. It is important to recognize that we have only sampled some highlights of the theories of truth; I recommend further study of this fascinating topic.

40. Despite the difficulties of finding definitive grounds for understanding truth, we are faced with the task of provisionally forging ahead. Much of what follows in the study of mathematics makes certain implicit and explicit assumptions about truth. However, we have found that the foundations of these assumptions may themselves be problematic. It is my contention that, despite the lack of clear foundations, it is still worth studying engineering analysis, its mathematical foundations, and the foundations of truth itself. My justification for this claim is that I find the utility and the beauty of this study highly rewarding.

The foundations of mathematics

[15] Throughout this section, for the history of mathematics, I rely heavily on Kline (M. Kline, Mathematics: The Loss of Certainty, A Galaxy Book, Oxford University Press, 1982, ISBN 9780195030853, a detailed account of the "illogical" development of mathematics and an exposition of its therefore remarkable utility in describing the world).

1. Mathematics has long been considered exemplary for establishing truth. Primarily, it uses a method that begins with axioms (unproven propositions that include undefined terms) and uses logical deduction to prove other propositions (theorems), showing that they are necessarily true if the axioms are.
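This axiomatic method can be sketched in a proof assistant; the following toy example (the propositions and names here are invented for illustration, not taken from the text) states axioms and deduces a theorem in Lean 4 syntax:

```lean
-- Unproven starting points: propositions P and Q,
-- the assertion of P, and the rule that P implies Q.
axiom P : Prop
axiom Q : Prop
axiom p_holds : P
axiom p_implies_q : P → Q

-- A theorem deduced from the axioms: Q is necessarily
-- true *if* the axioms are.
theorem q_holds : Q := p_implies_q p_holds
```

The deduction is only as trustworthy as the axioms themselves, which is exactly the relativity at issue in what follows.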

2. It may seem obvious that truth established in this way would always be relative to the truth of the axioms, but throughout history this footnote was often obscured by the "obvious" or "intuitive" universal truth of the axioms.[15] For instance, Euclid (Wikipedia, "Euclid," https://en.wikipedia.org/w/index.php?title=Euclid&oldid=923031048, accessed 26 October 2019) founded geometry, the study of mathematical objects traditionally considered to represent physical space, like points, lines, etc., on axioms thought so solid that it was not until the early 19th century that Carl Friedrich Gauss (Wikipedia, "Carl Friedrich Gauss," https://en.wikipedia.org/w/index.php?title=Carl%20Friedrich%20Gauss&oldid=922692291, accessed 26 October 2019) and others recognized this was only one among many possible geometries (Kline, Mathematics: The Loss of Certainty), resting on different axioms. Furthermore, Aristotle (Christopher Shields, "Aristotle," in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Winter 2016, Metaphysics Research Lab, Stanford University, 2016) had acknowledged that reasoning must begin with undefined terms; however, even Euclid (presumably aware of Aristotle's work) seemed to forget this and provided definitions, obscuring the foundations of his work and starting mathematics on a path that, for over 2000 years, would forget its own relativity (Kline, Mathematics: The Loss of Certainty, pp. 101-2).

3. The foundations of Euclid were even shakier than its murky starting point: several unstated axioms were used in proofs, and some proofs were otherwise erroneous. However, for two millennia mathematics was seen as the field wherein truth could be established beyond doubt.

Algebra ex nihilo

4. Although not much new geometry appeared during this period, the field of algebra (Wikipedia, "Algebra," https://en.wikipedia.org/w/index.php?title=Algebra&oldid=920573802, accessed 26 October 2019), the study of manipulations of symbols standing for numbers in general, began with no axiomatic foundation whatsoever. The Greeks had a notion of rational numbers, ratios of natural numbers (positive integers), and it was known that many solutions to algebraic equations were irrational (could not be expressed as a ratio of integers). But these irrational numbers, like virtually everything else in algebra, were gradually accepted because they were so useful in solving practical problems (they could be approximated by rational numbers, and this seemed good enough). The rules of basic arithmetic were accepted as applying to these and other forms of new numbers that arose in algebraic solutions: negative, imaginary, and complex numbers.

The application of mathematics to science

5 During this time mathematics was being

applied to optics and astronomy Sir

Isaac Newton then built calculus upon

algebra applying it to what is now known as

Newtonian mechanics which was really more

the product of Leonhard Euler (George Smith. Isaac Newton. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2008. Metaphysics Research Lab, Stanford University, 2008; Wikipedia. Leonhard Euler — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Leonhard%20Euler&oldid=921824700. [Online; accessed 26-October-2019]. 2019). Calculus

introduced its own dubious operations but

the success of mechanics in describing and

predicting physical phenomena was astounding

Mathematics was hailed as the language of

God (later Nature)

The rigorization of mathematics

6 It was not until Gauss created non-Euclidean geometry, in which Euclid's axioms were shown to be only one of many possible sets of axioms compatible with the world, and William Rowan Hamilton (Wikipedia. William Rowan Hamilton — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=William%20Rowan%20Hamilton&oldid=923163451. [Online; accessed 26-October-2019]. 2019) created quaternions (Wikipedia. Quaternion — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Quaternion&oldid=920710557. [Online; accessed 26-October-2019]. 2019), a number system in which multiplication is noncommutative, that it became apparent something was fundamentally wrong with the way truth in mathematics had been understood. This started a period of rigorization in mathematics that set about axiomatizing and proving 19th-century mathematics. This included the development of symbolic logic, which aided in the process of deductive reasoning.

16. The branch of mathematics called model theory concerns itself with general types of models that can be made from a given formal system, like an axiomatic mathematical system. For more on model theory, see Hodges (Wilfrid Hodges. Model Theory. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018). It is noteworthy that the engineering/science use of the term "mathematical model" is only loosely a "model" in the sense of model theory.

7 An aspect of this rigorization is that mathematicians came to terms with axioms that include undefined terms. For instance, a "point" might be such an undefined term in an axiom. A mathematical model is what we create when we attach these undefined terms to objects, which can be anything consistent with the axioms.16 The system that results from proving theorems would then apply to anything "properly" described by the axioms. So two masses might be assigned "points" in a Euclidean geometric space, from which we could be confident that, for instance, the "distance" between these masses is the Euclidean norm of the line drawn between the points. It could be said, then, that a "point" in Euclidean geometry is implicitly defined by its axioms and theorems and nothing else. That is, mathematical objects are not inherently tied to the physical objects to which we tend to apply them. Euclidean geometry is not the study of physical space, as it was long considered—it is the study of the objects implicitly defined by its axioms and theorems.

The foundations of mathematics are built


8 The building of the modern foundations of mathematics began with clear axioms, solid reasoning (with symbolic logic), and lofty yet seemingly attainable goals: prove theorems to support the already ubiquitous mathematical techniques in geometry, algebra, and calculus from axioms; furthermore, prove that these axioms (and things they imply) do not contradict each other, i.e., are consistent, and that the axioms are not results of each other (one that can be derived from others is a theorem, not an axiom).

9 Set theory is a type of formal axiomatic system that all modern mathematics is expressed with, so set theory is often called the foundation of mathematics (Joan Bagaria. Set Theory. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2019. Metaphysics Research Lab, Stanford University, 2019). We will study the basics in Chapter 02. The primary objects in set theory are sets, informally collections of mathematical objects. There is not a single set of axioms that is used as the foundation of all mathematics, for reasons we will review in a moment. However, the most popular set theory is Zermelo-Fraenkel set theory with the axiom of choice (ZFC). The axioms of ZF, sans C, are as follows (ibid.):

extensionality: If two sets A and B have the same elements, then they are equal.

empty set: There exists a set, denoted by ∅ and called the empty set, which has no elements.

pair: Given any sets A and B, there exists a set, denoted by {A, B}, which contains A and B as its only elements. In particular, there exists the set {A}, which has A as its only element.

power set: For every set A, there exists a set, denoted by P(A) and called the power set of A, whose elements are all the subsets of A.

union: For every set A, there exists a set, denoted by ∪A and called the union of A, whose elements are all the elements of the elements of A.

infinity: There exists an infinite set. In particular, there exists a set Z that contains ∅ and such that if A ∈ Z, then A ∪ {A} ∈ Z.

separation: For every set A and every given property, there is a set containing exactly the elements of A that have that property. A property is given by a formula φ of the first-order language of set theory. Thus, separation is not a single axiom but an axiom schema, that is, an infinite list of axioms, one for each formula φ.

replacement: For every given definable function with domain a set A, there is a set whose elements are all the values of the function.

10 ZFC also has the axiom of choice (Bagaria, Set Theory):

choice: For every set A of pairwise-disjoint, non-empty sets, there exists a set that contains exactly one element from each set in A.
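For finite sets, the power-set and union axioms can be made concrete. The following Python sketch (an illustration added here, not part of the original notes) models sets of sets with frozensets:

```python
from itertools import chain, combinations

def power_set(A):
    """P(A): the set of all subsets of A (the power set axiom)."""
    items = list(A)
    subsets = chain.from_iterable(
        combinations(items, r) for r in range(len(items) + 1))
    return {frozenset(s) for s in subsets}

def big_union(A):
    """Union of A: all elements of the elements of A (the union axiom)."""
    return frozenset(x for S in A for x in S)

P = power_set(frozenset({1, 2}))                       # {∅, {1}, {2}, {1, 2}}
U = big_union({frozenset({1, 2}), frozenset({2, 3})})  # {1, 2, 3}
```

Frozensets are used because, as in ZF, the elements of a set may themselves be sets, and Python requires set elements to be hashable.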

17. Panu Raatikainen. Gödel's Incompleteness Theorems. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018. A thorough and contemporary description of Gödel's incompleteness theorems, which have significant implications for the foundations and function of mathematics and mathematical truth.

The foundations have cracks

11 The foundationalists' goal was to prove that some set of axioms, from which all of mathematics can be derived, is both consistent (contains no contradictions) and complete (every true statement is provable). The work of

Kurt Gödel (Juliette Kennedy. Kurt Gödel. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2018. Metaphysics Research Lab, Stanford University, 2018) in

the mid 20th century shattered this dream by

proving in his first incompleteness theorem

that any consistent formal system within which

one can do some amount of basic arithmetic is

incomplete His argument is worth reviewing

(see Raatikainen17) but at its heart is an

undecidable statement like "This sentence is unprovable." Let U stand for this statement. If it is true, it is unprovable. If it is provable, it is false. Therefore, it is true iff it is unprovable.

Then he shows that if a statement A that

essentially says ldquoarithmetic is consistentrdquo is

provable then so is the undecidable statement

U But if U is to be consistent it cannot be

provable and therefore neither can A be

provable

12 This is problematic It tells us virtually

any conceivable axiomatic foundation of

mathematics is incomplete If one is complete

it is inconsistent (and therefore worthless) One

problem this introduces is that a true theorem may be impossible to prove, and it turns out we can never know, in advance of its proof, whether it is provable.

18. Kline, Mathematics: The Loss of Certainty.

13 But it gets worse. Gödel's second

incompleteness theorem shows that such

systems cannot even be shown to be consistent

This means at any moment someone could find

an inconsistency in mathematics and not only

would we lose some of the theorems we would

lose them all. This is because, by what is called the material implication (Kline, Mathematics: The Loss of Certainty, pp. 187–8, 264), if one contradiction can be found, every proposition can be proven from it. And if this is the case,

all (even proven) theorems in the system would

be suspect

14 Even though no contradiction has yet

appeared in ZFC its axiom of choice which

is required for the proof of most of what

has thus far been proven generates the

Banach-Tarski paradox that says a sphere

of diameter x can be partitioned into a finite

number of pieces and recombined to form

two spheres of diameter x Troubling to say

the least Attempts were made for a while to

eliminate the use of the axiom of choice, but our buddy Gödel later proved that if ZF is consistent, so is ZFC (ibid., p. 267).

Mathematics is considered empirical

15 Since its inception mathematics has been

applied extensively to the modeling of the

world Despite its cracked foundations it has

striking utility Many recent leading minds of

mathematics philosophy and science suggest

we treat mathematics as empirical like any

science subject to its success in describing

and predicting events in the world As Kline18

summarizes


The upshot […] is that sound

mathematics must be determined

not by any one foundation which

may some day prove to be right The

"correctness" of mathematics must be judged by its application to the

physical world Mathematics is an

empirical science much as Newtonian

mechanics It is correct only to the

extent that it works and when it

does not it must be modified It

is not a priori knowledge even

though it was so regarded for two

thousand years It is not absolute

or unchangeable


itselfreason Mathematical

reasoning


itselfoverview Mathematical

topics in

overview


itselfengmath What is

mathematics for

engineering


itselfex Exercises for Chapter

01


Mathematical reasoning, logic, and set theory

In order to communicate mathematical ideas effectively, formal languages have been developed within which logic, i.e., deductive (mathematical) reasoning, can

proceed. Propositions are statements that can be either true (⊤) or false (⊥). Axiomatic systems begin with statements (axioms) assumed true. Theorems are proven by deduction. In many forms of logic, like propositional calculus (Wikipedia. Propositional calculus — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Propositional%20calculus&oldid=914757384. [Online; accessed 29-October-2019]. 2019),

compound propositions are constructed via logical connectives, like "and" and "or," of atomic propositions (see

Lecture setslogic). In others, like first-order logic (Wikipedia. First-order logic — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=First-order%20logic&oldid=921437906. [Online; accessed 29-October-2019]. 2019), there are also

logical quantifiers, like "for every" and "there exists."

The mathematical objects and operations about

which most propositions are made are expressed

in terms of set theory which was introduced

in Lecture itselffound and will be expanded

upon in Lecture setssetintro. We can say that mathematical reasoning comprises mathematical objects and operations expressed in set theory, and logic allows us to reason therewith.

1. K. Ciesielski. Set Theory for the Working Mathematician. London Mathematical Society Student Texts. Cambridge University Press, 1997. ISBN: 9780521594653. A readable introduction to set theory.

2. H.B. Enderton. Elements of Set Theory. Elsevier Science, 1977. ISBN: 9780080570426. A gentle introduction to set theory and mathematical reasoning—a great place to start.

setssetintro Introduction to

set theory

Set theory is the language of the modern

foundation of mathematics as discussed

in Lecture itselffound It is unsurprising

then that it arises throughout the study of

mathematics. We will use set theory extensively in Chapter 03 on probability theory.

The axioms of ZFC set theory were introduced

in Lecture itselffound Instead of proceeding

in the pure mathematics way of introducing

and proving theorems we will opt for a more

applied approach in which we begin with some

simple definitions and include basic operations

A more thorough and still readable treatment

is given by Ciesielski1 and a very gentle

version by Enderton2

A set is a collection of objects Set theory

gives us a way to describe these collections

Often the objects in a set are numbers or sets

of numbers However a set could represent

collections of zebras and trees and hairballs

For instance here are some sets

A field is a set with special structure This

structure is provided by the addition (+)

and multiplication (×) operators and their inverses, subtraction (–) and division (÷).

3. When the natural numbers include zero, we write N0.

The quintessential example of a field is the set of real numbers R, which admits these operators, making it a field. The reals R, the complex numbers C, the integers Z, and the natural numbers3 N are the number sets we typically consider (of these, only R and C are fields: Z lacks multiplicative inverses, and N lacks additive inverses as well).

Set membership is the belonging of an object to a set. It is denoted with the symbol ∈, which can be read "is an element of," for element x and set X. For instance, we might say 7 ∈ {1, 7, 2} or 4 ∉ {1, 7, 2}. Or we might declare that x is a real number by stating x ∈ R.

Set operations can be used to construct new

sets from established sets We consider a few

common set operations now

The union (∪) of sets is the set containing all the elements of the original sets (no repetition allowed). The union of sets A and B is denoted A ∪ B. For instance, let A = {1, 2, 3} and B = {–1, 3}; then

A ∪ B = {1, 2, 3, –1}.

The intersection (∩) of sets is a set containing the elements common to all the original sets. The intersection of sets A and B is denoted A ∩ B. For instance, let A = {1, 2, 3} and B = {2, 3, 4}; then

A ∩ B = {2, 3}.

If two sets have no elements in common, the intersection is the empty set ∅ = {}, the unique set with no elements.

The set difference of two sets A and B is the set of elements in A that aren't also in B. It is denoted A \ B. For instance, let A = {1, 2, 3} and B = {2, 3, 4}; then

A \ B = {1}.

A subset (⊆) of a set is a set the elements of which are contained in the original set. If the two sets are equal, one is still considered a subset of the other. We call a subset that is not equal to the other set a proper subset (⊂). For instance, let A = {1, 2, 3} and B = {1, 2}; then B ⊂ A.

The complement of a subset is the set of elements of the original set that aren't in the subset. For instance, if B ⊆ A, then the complement of B, denoted B̄, is B̄ = A \ B.

The cartesian product of two sets A and B is denoted A × B and is the set of all ordered pairs (a, b), where a ∈ A and b ∈ B. It's worthwhile considering the following notation for this definition:

A × B = {(a, b) | a ∈ A and b ∈ B},

which means "the cartesian product of A and B is the set of ordered pairs (a, b) such that a ∈ A and b ∈ B" in set-builder notation (Wikipedia. Set-builder notation — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Set-builder%20notation&oldid=917328223. [Online; accessed 29-October-2019]. 2019).
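These set operations map directly onto Python's built-in set type. The following sketch (added for illustration, using the example sets from this section) exercises each operation:

```python
from itertools import product

A = {1, 2, 3}
B = {2, 3, 4}

union = A | B          # A ∪ B = {1, 2, 3, 4}
intersection = A & B   # A ∩ B = {2, 3}
difference = A - B     # A \ B = {1}

C = {1, 2}
subset = C <= A        # C ⊆ A holds
proper = C < A         # C ⊂ A holds, since C != A
complement = A - C     # complement of C relative to A: {3}

cartesian = set(product(A, B))  # A × B: all ordered pairs (a, b)
```

Note that Python's `-` plays two roles here: set difference in general, and the complement when the left operand is the enclosing set.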

Let A and B be sets. A map or function f from A to B is an assignment of an element b ∈ B to each element a ∈ A. The function is denoted f : A → B, and we say that f maps each element a ∈ A to an element f(a) ∈ B, called the value of a under f, or a ↦ f(a). We say that f has domain A and codomain B. The image of f is the subset of its codomain B that contains the values of all elements mapped by f from its domain A.


setslogic Logical connectives

and quantifiers

In order to make compound propositions, we need to define logical connectives. In order to specify quantities of variables, we need to define logical quantifiers. The following is a form of first-order logic (Wikipedia, First-order logic — Wikipedia, The Free Encyclopedia).

Logical connectives

A proposition can be either true (⊤) or false (⊥). When it does not contain a logical connective, it is called an atomistic proposition. To combine propositions into a compound proposition, we require logical connectives. They are not (¬), and (∧), and or (∨). Table 021 is a truth table for a number of connectives.

Quantifiers

Logical quantifiers allow us to indicate the quantity of a variable. The universal quantifier symbol ∀ means "for all." For instance, let A be a set; then ∀a ∈ A means "for all elements in A" and gives this quantity to variable a. The existential quantifier ∃ means "there exists at least one" or "for some." For instance, let A be a set; then ∃a ∈ A means "there exists at least one element a in A …."

Table 021: a truth table for logical connectives. The first two columns are the truth values of propositions p and q; the rest are outputs.

          not   and   or    nand  nor   xor   xnor
p   q  |  ¬p    p∧q   p∨q   p↑q   p↓q   p⊻q   p⇔q
⊥   ⊥  |  ⊤     ⊥     ⊥     ⊤     ⊤     ⊥     ⊤
⊥   ⊤  |  ⊤     ⊥     ⊤     ⊤     ⊥     ⊤     ⊥
⊤   ⊥  |  ⊥     ⊥     ⊤     ⊤     ⊥     ⊤     ⊥
⊤   ⊤  |  ⊥     ⊤     ⊤     ⊥     ⊥     ⊥     ⊤
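The connectives in Table 021 and the two quantifiers have direct Python analogs; a small sketch (added for illustration) reproduces the table's first row and applies all/any as ∀/∃ over a finite set:

```python
def nand(p, q): return not (p and q)   # p ↑ q
def nor(p, q): return not (p or q)     # p ↓ q
def xor(p, q): return p != q           # p ⊻ q
def xnor(p, q): return p == q          # p ⇔ q

# Row p = ⊥, q = ⊥ of Table 021:
p, q = False, False
row = (not p, p and q, p or q, nand(p, q), nor(p, q), xor(p, q), xnor(p, q))

# Quantifiers over a finite set A:
A = {1, 2, 3}
forall_holds = all(a > 0 for a in A)       # ∀a ∈ A, a > 0
exists_holds = any(a % 2 == 0 for a in A)  # ∃a ∈ A, a is even
```

Here `all` and `any` realize ∀ and ∃ only for finite sets, where the quantified statement can be checked by exhaustion.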


setsex Exercises for Chapter

02

Exercise 021 (hardhat)

For the following, write the set described in set-builder notation:

a. A = {2, 3, 5, 9, 17, 33, · · · }
b. B is the set of integers divisible by 11
c. C = {13, 14, 15, · · · }
d. D is the set of reals between –3 and 42

Exercise 022 (2)

Let x, y ∈ R^n. Prove the Cauchy-Schwarz Inequality:

|x · y| ⩽ ‖x‖‖y‖. (ex1)

Hint: you may find the geometric definition of the dot product helpful.

Exercise 023 (3)

Let x ∈ R^n. Prove that

x · x = ‖x‖². (ex2)

Hint: you may find the geometric definition of the dot product helpful.

Exercise 024 (4)

Let x, y ∈ R^n. Prove the Triangle Inequality:

‖x + y‖ ⩽ ‖x‖ + ‖y‖. (ex3)

Hint: you may find the Cauchy-Schwarz Inequality helpful.

prob

Probability


1. For a good introduction to probability theory, see Ash (Robert B. Ash. Basic Probability Theory. Dover Publications, Inc., 2008) or Jaynes et al. (E.T. Jaynes et al. Probability Theory: The Logic of Science. Cambridge University Press, 2003. ISBN: 9780521592710. An excellent and comprehensive introduction to probability theory).

probmeas Probability and

measurement

Probability theory is a well-defined branch of mathematics. Andrey Kolmogorov described a set of axioms in 1933 that are still in use today as the foundation of probability theory.1

We will implicitly use these axioms in our

analysis The interpretation of probability is

a contentious matter Some believe probability

quantifies the frequency of the occurrence of

some event that is repeated in a large number

of trials Others believe it quantifies the

state of our knowledge or belief that some

event will occur

In experiments our measurements are tightly

coupled to probability This is apparent in the

questions we ask Here are some examples

1 How common is a given event

2 What is the probability we will reject

a good theory based on experimental

results

3 How repeatable are the results

4 How confident are we in the results

5 What is the character of the fluctuations

and drift in the data

6 How much data do we need


probprob Basic probability

theory

The mathematical model for a class of measurements is called the probability space and is composed of a mathematical triple of a sample space Ω, σ-algebra F, and probability measure P, typically denoted (Ω, F, P), each of which we will consider in turn (Wikipedia. Probability space — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Probability%20space&oldid=914939789. [Online; accessed 31-October-2019]. 2019).

The sample space Ω of an experiment is the set representing all possible outcomes of the experiment. If a coin is flipped, the sample space is Ω = {H, T}, where H is heads and T is tails. If a coin is flipped twice, the sample space could be Ω = {HH, HT, TH, TT}. However, the same experiment can have different sample spaces. For instance, for two coin flips we could also choose the number of heads, Ω = {0, 1, 2}. We base our choice of Ω on the problem at hand.

An event is a subset of the sample space

That is an event corresponds to a yes-or-no

question about the experiment For instance

event A (remember A ⊆ Ω) in the coin-flipping experiment (two flips) might be A = {HT, TH}. A is an event that corresponds to the question, "Is the second flip different than the first?" A is the event for which the answer is "yes."

Algebra of events

Because events are sets we can perform the

usual set operations with them

Example probprob-1 re set operations with events

Problem Statement

Consider a toss of a single die. We choose the sample space to be Ω = {1, 2, 3, 4, 5, 6}. Let the following define events:

A ≡ the result is even = {2, 4, 6}
B ≡ the result is greater than 2 = {3, 4, 5, 6}.

Find the following event combinations: A ∪ B, A ∩ B, A \ B, B \ A, Ā, B̄.


The σ-algebra F is the collection of events of interest. Often F is the set of all possible events given a sample space Ω, which is just the power set of Ω (Wikipedia, Probability space — Wikipedia, The Free Encyclopedia). When referring to an event, we often state that it is an element of F. For instance, we might say an event A ∈ F.

We're finally ready to assign probabilities to events. We define the probability measure P : F → [0, 1] to be a function satisfying the following conditions:

1. For every event A ∈ F, the probability measure of A is greater than or equal to zero—i.e., P(A) ⩾ 0.
2. If an event is the entire sample space, its probability measure is unity—i.e., if A = Ω, P(A) = 1.
3. If events A1, A2, · · · are disjoint sets (no elements in common), then P(A1 ∪ A2 ∪ · · · ) = P(A1) + P(A2) + · · · .
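A probability measure on a finite sample space can be sketched directly from these three conditions. The following added illustration (a fair die, every outcome equally likely) defines such a measure and checks the axioms with exact rational arithmetic:

```python
from fractions import Fraction

Omega = frozenset({1, 2, 3, 4, 5, 6})  # sample space for one die roll

def P(event):
    """Probability measure: each outcome in Omega is equally likely."""
    assert event <= Omega  # events are subsets of the sample space
    return Fraction(len(event), len(Omega))

# Condition 1: P(A) >= 0 for every event A (len is never negative).
# Condition 2: P(Omega) == 1.
# Condition 3: additivity over disjoint events:
A, B = frozenset({2, 4, 6}), frozenset({1})  # disjoint events
additive = P(A | B) == P(A) + P(B)           # 2/3 == 1/2 + 1/6
```

Using `Fraction` keeps the axiom checks exact rather than subject to floating-point rounding.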

We conclude the basics by observing four facts

that can be proven from the definitions above

1

2

3

4

probcondition Independence and conditional probability

Two events A and B are independent if and only if

P(A ∩ B) = P(A)P(B).

If an experimenter must make a judgment

without data about the independence of

events they base it on their knowledge of the

events as discussed in the following example

Example probcondition-1 re independence

Problem Statement

Answer the following questions and

imperatives

1. Consider a single fair die rolled twice. What is the probability that both rolls are 6?
2. What changes if the die is biased by a weight such that P(6) = 1/7?
3. What changes if the die is biased by a magnet, rolled on a magnetic dice-rolling tray, such that P(6) = 1/7?
4. What changes if there are two dice, biased by weights such that for each P(6) = 1/7, rolled once, both resulting in 6?
5. What changes if there are two dice, biased by magnets, rolled together?
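For the cases in which the rolls are plausibly independent, the product rule P(A ∩ B) = P(A)P(B) applies directly; a quick added sketch with exact arithmetic:

```python
from fractions import Fraction

p6_fair = Fraction(1, 6)    # fair die
p6_biased = Fraction(1, 7)  # weight-biased die

# Independent rolls: P(both sixes) = P(6) * P(6)
both_fair = p6_fair * p6_fair        # 1/36
both_biased = p6_biased * p6_biased  # 1/49

# Magnet-biased dice rolled together on a magnetic tray may interact,
# so independence (and hence the product rule) cannot be assumed there.
```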


Conditional probability

If events A and B are somehow dependent we

need a way to compute the probability of B

occurring given that A occurs. This is called the conditional probability of B given A and is denoted P(B | A). For P(A) > 0, it is defined as

P(B | A) = P(A ∩ B) / P(A). (condition1)

We can interpret this as a restriction of the sample space Ω to A, i.e., the new sample space Ω′ = A ⊆ Ω. Note that if A and B are independent, we obtain the obvious result P(B | A) = P(B).

Example probcondition-2 re dependence

Problem Statement

Consider two unbiased dice rolled once. Let events A ≡ {sum of faces = 8} and B ≡ {faces are equal}. What is the probability the faces are equal, given that their sum is 8?
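Because the 36 outcomes of two fair dice are equally likely, this conditional probability can be checked by enumeration; the following added sketch computes P(B | A) = P(A ∩ B)/P(A):

```python
from fractions import Fraction
from itertools import product

Omega = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes

def P(event):
    """Probability of an event (list of outcomes) under equal likelihood."""
    return Fraction(len(event), len(Omega))

A = [w for w in Omega if w[0] + w[1] == 8]  # sum of faces is 8
B = [w for w in Omega if w[0] == w[1]]      # faces are equal
A_and_B = [w for w in A if w in B]          # only (4, 4)

P_B_given_A = P(A_and_B) / P(A)  # (1/36) / (5/36) = 1/5
```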

probbay Bayes' theorem

Given two events A and B, Bayes' theorem (aka Bayes' rule) states that

P(A | B) = P(B | A)P(A) / P(B). (bay1)

Sometimes this is written

P(A | B) = P(B | A)P(A) / (P(B | A)P(A) + P(B | ¬A)P(¬A)) (bay2)

= 1 / (1 + (P(B | ¬A)/P(B | A)) · (P(¬A)/P(A))). (bay3)

This is a useful theorem for determining a test's effectiveness. If a test is performed to determine whether an event has occurred, we might ask questions like, "If the test indicates that the event has occurred, what is the probability it has actually occurred?" Bayes' theorem can help compute an answer.
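A direct computation of the posterior from Equation bay2 can be sketched as follows (an added illustration; `sensitivity` stands for P(B | A) and `specificity` for P(¬B | ¬A), terminology defined just below):

```python
def posterior(prior, sensitivity, specificity):
    """P(A | B) via Bayes' theorem in the form of Eq. (bay2),
    with P(B | ¬A) = 1 - specificity."""
    p_B = sensitivity * prior + (1 - specificity) * (1 - prior)
    return sensitivity * prior / p_B

# A 99%-sensitive, 99%-specific test for a 1-in-1000 event:
p = posterior(prior=0.001, sensitivity=0.99, specificity=0.99)
# p is only about 0.09: a positive test for a rare event is weak evidence
```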

Testing outcomes

The test can be either positive or negative, meaning it can either indicate or not indicate that A has occurred. Furthermore, this result can be either true or false. There are four options, then. Consider an event A and an event that is a test result B indicating that event A has occurred. Table 031 shows these four possible test outcomes.

Table 031: test outcome B for event A.

               A      ¬A
positive (B)   true   false
negative (¬B)  false  true

The event A occurring can lead to a true positive or a


false negative whereas notA can lead to a true

negative or a false positive

Terminology is important here:

• P(true positive) = P(B | A), aka sensitivity or detection rate
• P(true negative) = P(¬B | ¬A), aka specificity
• P(false positive) = P(B | ¬A)
• P(false negative) = P(¬B | A)

Clearly the desirable result for any test

is that it is true However no test is true

100 percent of the time So sometimes it

is desirable to err on the side of the

false positive as in the case of a medical

diagnostic Other times it is more desirable to

err on the side of a false negative as in the

case of testing for defects in manufactured

balloons (when a false negative isnrsquot a big

deal)

Posterior probabilities

Returning to Bayes' theorem, we can evaluate the posterior probability P(A | B) of the event A having occurred, given that the test B is positive, from information that includes the prior probability P(A) of A. The form in Equation bay2 or bay3 is typically useful because it uses commonly known test probabilities of the true positive P(B | A) and of the false positive P(B | ¬A). We calculate P(A | B) when we want to interpret test results.


Figure 031: for different high sensitivities (P(B|A) = 0.9000, 0.9900, 0.9990, 0.9999), the probability that an event A occurred, given that a test for it B is positive, versus the probability that the event A occurs, under the assumption that the specificity equals the sensitivity.

Some interesting results can be found from this. For instance, if we let P(B | A) = P(¬B | ¬A) (sensitivity equals specificity) and realize that P(B | ¬A) + P(¬B | ¬A) = 1 (given ¬A, either B or ¬B), we can derive the expression

P(B | ¬A) = 1 – P(B | A). (bay4)

Using this and P(¬A) = 1 – P(A) in Equation bay3 gives (recall we've assumed sensitivity equals specificity)

P(A | B) = 1 / (1 + ((1 – P(B | A))/P(B | A)) · ((1 – P(A))/P(A))) (bay5)

= 1 / (1 + (1/P(B | A) – 1)(1/P(A) – 1)). (bay6)

This expression is plotted in Figure 031. See that a positive result for a rare event (small P(A)) is hard to trust unless the sensitivity P(B | A) and specificity P(¬B | ¬A) are very high indeed.

Example probbay-1 re Bayes' theorem

Problem Statement

Suppose 0.1% of springs manufactured at a given plant are defective. Suppose you need to design a test such that, when it indicates a defective part, the part is actually defective 99% of the time. What sensitivity should your test have, assuming it can be made equal to its specificity?

Problem Solution

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: bayes_theorem_example_01.ipynb
notebook kernel: python3

from sympy import *  # for symbolics
import numpy as np  # for numerics
import matplotlib.pyplot as plt  # for plots
from IPython.display import display, Markdown, Latex

Define symbolic variables

var('p_A p_nA p_B p_nB p_B_A p_B_nA p_A_B', real=True)

(p_A, p_nA, p_B, p_nB, p_B_A, p_B_nA, p_A_B)


Beginning with Bayes' theorem and assuming the sensitivity and specificity are equal, by Equation bay4 we can derive the following expression for the posterior probability P(A | B):

p_A_B_e1 = Eq(p_A_B, p_B_A*p_A/p_B).subs(
    p_B, p_B_A*p_A + p_B_nA*p_nA  # conditional probability
).subs(
    p_B_nA, 1 - p_B_A  # Eq. (bay4)
).subs(
    p_nA, 1 - p_A
)
display(p_A_B_e1)

P(A|B) = P(A)P(B|A) / (P(A)P(B|A) + (1 – P(A))(1 – P(B|A)))

Solve this for P(B | A), the quantity we seek:

p_B_A_sol = solve(p_A_B_e1, p_B_A, dict=True)
p_B_A_eq1 = Eq(p_B_A, p_B_A_sol[0][p_B_A])
display(p_B_A_eq1)

P(B|A) = P(A|B)(1 – P(A)) / (–2 P(A)P(A|B) + P(A) + P(A|B))

Now let's substitute the given probabilities:

p_B_A_spec = p_B_A_eq1.subs({p_A: 0.001, p_A_B: 0.99})
display(p_B_A_spec)

P(B|A) = 0.999989888981011

That's a tall order!


probrando Random variables

Probabilities are useful even when they do

not deal strictly with events It often occurs

that we measure something that has randomness

associated with it We use random variables to

represent these measurements

A random variable X Ω rarr R is a function

that maps an outcome ω from the sample

space Ω to a real number x isin R as shown in

Figure 032 A random variable will be denoted

with a capital letter (eg X and K) and a

specific value that it maps to (the value) will

be denoted with a lowercase letter (eg x and

k)

A discrete random variable K is one that takes on discrete values. A continuous random variable X is one that takes on continuous values.

Example probrando-1: dice again

Problem Statement: Roll two unbiased dice. Let K be a random variable representing the sum of the two, and let P(k) be the probability of the result k ∈ K. Plot and interpret P(k).

Figure 03.2: a random variable X maps an outcome ω ∈ Ω to an x ∈ ℝ
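A short sketch (my addition, not part of the original solution) that tabulates P(k) for the sum of two dice by enumerating all 36 equally likely outcomes:

```python
from fractions import Fraction
from itertools import product

# enumerate all 36 equally likely outcomes and count each sum k
P = {}
for d1, d2 in product(range(1, 7), repeat=2):
    k = d1 + d2
    P[k] = P.get(k, Fraction(0)) + Fraction(1, 36)

for k in sorted(P):
    print(k, P[k])  # peaks at k = 7 with probability 1/6
```

The PMF is triangular: outcomes near k = 7 have the most combinations, and the probabilities sum to one.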

Example probrando-2: Johnson-Nyquist noise

Problem Statement: A resistor at nonzero temperature, without any applied voltage, exhibits an interesting phenomenon: its voltage randomly fluctuates. This is called Johnson-Nyquist noise and is a result of thermal excitation of charge carriers (electrons, typically). For a given resistor and measurement system, let the probability density function f_V of the voltage V across an unrealistically hot resistor be

$$f_V(V) = \frac{1}{\sqrt{\pi}} e^{-V^2}.$$

Plot and interpret the meaning of this function.
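As a sanity check (my addition), the given f_V is a Gaussian with mean 0 and σ = 1/√2, so it should integrate to one; a crude trapezoidal integration confirms this:

```python
import math

def f_V(V):
    # the given PDF: a Gaussian with mean 0 and sigma = 1/sqrt(2)
    return math.exp(-V**2) / math.sqrt(math.pi)

# trapezoidal integration over [-6, 6]; the tails beyond are negligible
n = 10001
a, b = -6.0, 6.0
h = (b - a) / (n - 1)
xs = [a + i * h for i in range(n)]
total = h * (sum(f_V(x) for x in xs) - 0.5 * (f_V(a) + f_V(b)))
print(total)  # approximately 1.0
```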


frequency distribution

probability mass function

Figure 03.3: plot of a probability mass function

frequency density distribution

probability density function

probpxf Probability density and mass functions

Consider an experiment that measures a random variable. We can plot the relative frequency of the measurand landing in different "bins" (ranges of values). This is called a frequency distribution or a probability mass function (PMF).

Consider, for instance, a probability mass function as plotted in Figure 03.3, where a frequency a_i can be interpreted as an estimate of the probability of the measurand being in the ith interval. The sum of the frequencies must be unity:

$$\sum_{i=1}^{k} a_i = 1,$$

with k being the number of bins.

The frequency density distribution is similar to the frequency distribution, but with a_i ↦ a_i/Δx, where Δx is the bin width. If we let the bin width approach zero, we derive the probability density function (PDF)

$$f(x) = \lim_{\substack{k \to \infty \\ \Delta x \to 0}} \sum_{j=1}^{k} \frac{a_j}{\Delta x}.$$ (pxf.1)


Figure 03.4: plot of a probability density function

We typically think of a probability density function f, like the one in Figure 03.4, as a function that can be integrated over to find the probability of the random variable (measurand) being in an interval [a, b]:

$$P(x \in [a, b]) = \int_a^b f(\chi)\, d\chi.$$ (pxf.2)

Of course,

$$P(x \in (-\infty, \infty)) = \int_{-\infty}^{\infty} f(\chi)\, d\chi = 1.$$

We now consider a common PMF and a common PDF.

Binomial PMF

Consider a random binary sequence of length n, such that each element is a random 0 or 1, generated independently, like

$$(1, 0, 1, 1, 0, \cdots, 1, 1).$$ (pxf.3)

Let events 1 and 0 be mutually exclusive and exhaustive, and let P(1) = p. The probability of the sequence above occurring, with k ones, is p^k (1 − p)^{n−k}.

binomial distribution PMF

There are "n choose k,"

$$\binom{n}{k} = \frac{n!}{k!\,(n-k)!},$$ (pxf.4)

possible combinations of k ones for n bits. Therefore, the probability of any combination of k ones in a series is

$$f(k) = \binom{n}{k} p^k (1-p)^{n-k}.$$ (pxf.5)

We call Equation pxf.5 the binomial distribution PMF.
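A minimal sketch of Equation pxf.5 (my addition, using the standard-library `math.comb` rather than the document's Matlab `nchoosek`):

```python
import math

def binom_pmf(k, n, p):
    """Equation pxf.5: probability of k ones in n independent bits."""
    return math.comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 100, 0.25
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]
print(sum(pmf))                                 # sums to 1 over all k
print(max(range(n + 1), key=lambda k: pmf[k]))  # mode near n*p = 25
```

Summing over all k returns unity, as any PMF must, and the most probable count sits near np.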

Example probpxf-1: binomial PMF

Problem Statement: Consider a field sensor that fails for a given measurement with probability p. Given n measurements, plot the binomial PMF as a function of k failed measurements for a few different probabilities of failure p ∈ {0.04, 0.25, 0.5, 0.75, 0.96}.

Problem Solution: Figure 03.6 shows Matlab code for constructing the PMFs plotted in Figure 03.5. Note that the symmetry is due to the fact that events 1 and 0 are mutually exclusive and exhaustive.


Figure 03.5: binomial PMF for n = 100 measurements and different values of P(1) = p, the probability of a measurement error. The plot is generated by the Matlab code of Figure 03.6.

% parameters
n = 100;
k_a = linspace(1,n,n);
p_a = [0.04,0.25,0.5,0.75,0.96];

% binomial function
f = @(n,k,p) nchoosek(n,k)*p^k*(1-p)^(n-k);

% loop through to construct an array
f_a = NaN*ones(length(k_a),length(p_a));
for i = 1:length(k_a)
    for j = 1:length(p_a)
        f_a(i,j) = f(n,k_a(i),p_a(j));
    end
end

% plot
figure
colors = jet(length(p_a));
for j = 1:length(p_a)
    bar(k_a,f_a(:,j), ...
        'facecolor',colors(j,:), ...
        'facealpha',0.5, ...
        'displayname',['$p = ',num2str(p_a(j)),'$'] ...
    ); hold on
end
leg = legend('show','location','north');
set(leg,'interpreter','latex');
hold off
xlabel('number of ones in sequence $k$')
ylabel('probability')
xlim([0,100])

Figure 03.6: a Matlab script for generating binomial PMFs



Gaussian or normal random variable

mean

standard deviation

variance

Gaussian PDF

The Gaussian or normal random variable x has PDF

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\frac{-(x - \mu)^2}{2\sigma^2}.$$ (pxf.6)

Although we're not quite ready to understand these quantities in detail, it can be shown that the parameters µ and σ have the following meanings:

• µ is the mean of x,
• σ is the standard deviation of x, and
• σ² is the variance of x.

Consider the "bell-shaped" Gaussian PDF in Figure 03.7. It is always symmetric. The mean µ is its central value, and the standard deviation σ is directly related to its width. We will continue to explore the Gaussian distribution in the following lectures, especially in Lecture statsconfidence.


Figure 03.7: PDF for Gaussian random variable x, mean µ = 0 and standard deviation σ = 1/√2
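A direct sketch of Equation pxf.6 (my addition), using the parameters of Figure 03.7, which also demonstrates the symmetry about µ:

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Equation pxf.6: the Gaussian (normal) PDF."""
    return math.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

mu, sigma = 0.0, 1 / math.sqrt(2)  # the parameters of Figure 03.7

# symmetric about the mean, with its peak at x = mu
print(gaussian_pdf(mu + 1, mu, sigma) == gaussian_pdf(mu - 1, mu, sigma))  # True
print(gaussian_pdf(mu, mu, sigma))  # peak value 1/(sigma*sqrt(2*pi))
```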


expected value

expectation

mean

probE Expectation

Recall that a random variable is a function X : Ω → ℝ that maps from the sample space to the reals. Random variables are the arguments of probability mass functions (PMFs) and probability density functions (PDFs).

The expected value (or expectation) of a random variable is akin to its "average value" and depends on its PMF or PDF. The expected value of a random variable X is denoted ⟨X⟩ or E[X]. There are two definitions of the expectation: one for a discrete random variable, the other for a continuous random variable. Before we define them, however, it is useful to predefine the most fundamental property of a random variable: its mean.

Definition probE.1: mean

The mean of a random variable X is defined as

$$m_X = \mathrm{E}[X].$$

Let's begin with a discrete random variable.

Definition probE.2: expectation of a discrete random variable

Let K be a discrete random variable and f its PMF. The expected value of K is defined as

$$\mathrm{E}[K] = \sum_{\forall k} k\, f(k).$$
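Definition probE.2 can be applied directly to the two-dice sum from the earlier example; a sketch (my addition, with the triangular PMF f(k) = (6 − |k − 7|)/36 written out explicitly):

```python
from fractions import Fraction

# PMF of the sum of two dice: f(k) = (6 - |k - 7|)/36 for k = 2..12
f = {k: Fraction(6 - abs(k - 7), 36) for k in range(2, 13)}

# Definition probE.2: E[K] = sum over all k of k*f(k)
E_K = sum(k * f[k] for k in f)
print(E_K)  # 7
```

The expectation lands exactly on 7, the symmetric center of the PMF.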


Example probE-1: expectation of a discrete random variable

Problem Statement: Given a discrete random variable K with PMF shown below, what is its mean m_K?

Let us now turn to the expectation of a continuous random variable.

Definition probE.3: expectation of a continuous random variable

Let X be a continuous random variable and f its PDF. The expected value of X is defined as

$$\mathrm{E}[X] = \int_{-\infty}^{\infty} x\, f(x)\, dx.$$

Example probE-2: expectation of a continuous random variable

Problem Statement: Given a continuous random variable X with Gaussian PDF f, what is the expected value of X?

Due to its sum or integral form, the expected value E[·] has some familiar properties, for random variables X and Y and reals a and b:

$$\mathrm{E}[a] = a$$ (E.1a)
$$\mathrm{E}[X + a] = \mathrm{E}[X] + a$$ (E.1b)
$$\mathrm{E}[aX] = a\,\mathrm{E}[X]$$ (E.1c)
$$\mathrm{E}[\mathrm{E}[X]] = \mathrm{E}[X]$$ (E.1d)
$$\mathrm{E}[aX + bY] = a\,\mathrm{E}[X] + b\,\mathrm{E}[Y]$$ (E.1e)
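The linearity property E.1e can be checked with sample averages (my addition; a sample mean is itself a sum, so the identity holds up to floating-point rounding):

```python
import random

random.seed(0)
X = [random.random() for _ in range(100000)]
Y = [random.random() for _ in range(100000)]

E = lambda v: sum(v) / len(v)  # sample-average stand-in for E[.]
a, b = 3.0, -2.0
lhs = E([a * x + b * y for x, y in zip(X, Y)])  # E[aX + bY]
rhs = a * E(X) + b * E(Y)                       # a E[X] + b E[Y]
print(abs(lhs - rhs))  # essentially zero
```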


central moments

variance

probmoments Central moments

Given a probability mass function (PMF) or probability density function (PDF) of a random variable, several useful parameters of the random variable can be computed. These are called central moments, which quantify parameters relative to its mean.

Definition probmoments.1: central moments

The nth central moment of random variable X with PDF f is defined as

$$\mathrm{E}[(X - \mu_X)^n] = \int_{-\infty}^{\infty} (x - \mu_X)^n f(x)\, dx.$$

For discrete random variable K with PMF f,

$$\mathrm{E}[(K - \mu_K)^n] = \sum_{\forall k} (k - \mu_K)^n f(k).$$
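Definition probmoments.1 for a discrete random variable can be sketched with the two-dice PMF (my addition; exact arithmetic with `Fraction` keeps the results clean):

```python
from fractions import Fraction

# PMF of the sum of two dice: f(k) = (6 - |k - 7|)/36 for k = 2..12
f = {k: Fraction(6 - abs(k - 7), 36) for k in range(2, 13)}
mu = sum(k * f[k] for k in f)  # the mean, 7

def central_moment(n):
    # Definition probmoments.1 for a discrete random variable
    return sum((k - mu)**n * f[k] for k in f)

print(central_moment(1))  # 0: the first central moment always vanishes
print(central_moment(2))  # the variance, 35/6
```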

Example probmoments-1: first central moment

Problem Statement: Prove that the first central moment of continuous random variable X is zero.

The second central moment of random variable X is called the variance and is denoted

$$\sigma_X^2, \quad \mathrm{Var}[X], \quad \text{or} \quad \mathrm{E}[(X - \mu_X)^2].$$ (moments.1)

The variance is a measure of the width or spread of the PMF or PDF. We usually compute the variance with the formula

$$\sigma_X^2 = \mathrm{E}[X^2] - \mu_X^2.$$

Other properties of variance include, for real constant c,

$$\mathrm{Var}[c] = 0$$ (moments.2)
$$\mathrm{Var}[X + c] = \mathrm{Var}[X]$$ (moments.3)
$$\mathrm{Var}[cX] = c^2\, \mathrm{Var}[X].$$ (moments.4)

standard deviation

skewness

The standard deviation is defined as

$$\sigma_X = \sqrt{\mathrm{Var}[X]}.$$

Although the variance is mathematically more

convenient the standard deviation has the

same physical units as X so it is often the

more physically meaningful quantity Due to

its meaning as the width or spread of the

probability distribution and its sharing of

physical units it is a convenient choice for

error bars on plots of a random variable

asymmetry

kurtosis

tailedness

The skewness Skew[X] is a normalized third central moment:

$$\mathrm{Skew}[X] = \frac{\mathrm{E}[(X - \mu_X)^3]}{\sigma_X^3}.$$ (moments.5)

Skewness is a measure of asymmetry of a random variable's PDF or PMF. For a symmetric PMF or PDF, such as the Gaussian PDF, Skew[X] = 0.

The kurtosis Kurt[X] is a normalized fourth central moment:

$$\mathrm{Kurt}[X] = \frac{\mathrm{E}[(X - \mu_X)^4]}{\sigma_X^4}.$$ (moments.6)

Kurtosis is a measure of the tailedness of a random variable's PDF or PMF. "Heavier" tails yield higher kurtosis. A Gaussian random variable has PDF with kurtosis 3. Given that for Gaussians both skewness and kurtosis have nice values (0 and 3), we can think of skewness and kurtosis as measures of similarity to the Gaussian PDF.
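The sample versions of moments.5 and moments.6 can be checked against Gaussian data (my addition; with a large sample the estimates should land near the ideal values 0 and 3):

```python
import random, math

random.seed(1)
N = 100000
x = [random.gauss(0, 1) for _ in range(N)]

# sample mean and (biased) standard deviation
m = sum(x) / N
s = math.sqrt(sum((v - m)**2 for v in x) / N)

# normalized third and fourth central moments (moments.5 and moments.6)
skew = sum((v - m)**3 for v in x) / (N * s**3)
kurt = sum((v - m)**4 for v in x) / (N * s**4)
print(round(skew, 2), round(kurt, 2))  # near 0 and 3 for Gaussian data
```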


probex Exercises for Chapter 03

Exercise 03.1 (5)

Several physical processes can be modeled with a random walk: a process of iteratively changing a quantity by some random amount. Infinitely many variations are possible, but common factors of variation include probability distribution, step size, dimensionality (e.g., one-dimensional, two-dimensional, etc.), and coordinate system. Graphical representations of these walks can be beautiful. Develop a computer program that generates random walks and corresponding graphics. Do it well and call it art, because it is.
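A minimal one-dimensional starting point for this exercise (my addition; the graphics, higher dimensions, and alternative step distributions are left to the reader):

```python
import random

def random_walk(n_steps, step=1.0, seed=None):
    """One-dimensional random walk: iteratively add a random +/- step."""
    rng = random.Random(seed)
    x, path = 0.0, [0.0]
    for _ in range(n_steps):
        x += rng.choice([-step, step])
        path.append(x)
    return path

walk = random_walk(1000, seed=42)
print(len(walk), walk[-1])  # 1001 positions, ending wherever chance leads
```

Swapping `rng.choice([-step, step])` for, say, `rng.gauss(0, step)` changes the step distribution, one of the factors of variation the exercise mentions.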

estimation

stats Statistics

Whereas probability theory is primarily focused on the relations among mathematical objects, statistics is concerned with making sense of the outcomes of observation. (Steven S. Skiena. Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win. Outlooks. Cambridge University Press, 2001. DOI: 10.1017/CBO9780511547089. This includes a lucid section on probability versus statistics, also available here: https://www3.cs.stonybrook.edu/~skiena/jaialai/excerpts/node12.html)

However we frequently use statistical methods

to estimate probabilistic models For instance

we will learn how to estimate the standard

deviation of a random process we have some

reason to expect has a Gaussian probability

distribution

Statistics has applications in nearly every

applied science and engineering discipline

Any time measurements are made statistical

analysis is how one makes sense of the results

For instance, determining a reasonable level of confidence in a measured parameter requires statistics.

machine learning

1. Ash, Basic Probability Theory.
2. Jaynes et al., Probability Theory: The Logic of Science.
3. Erwin Kreyszig. Advanced Engineering Mathematics. 10th. John Wiley & Sons, Limited, 2011. ISBN: 9781119571094. The authoritative resource for engineering mathematics. It includes detailed accounts of probability, statistics, vector calculus, linear algebra, Fourier analysis, ordinary and partial differential equations, and complex analysis. It also includes several other topics with varying degrees of depth. Overall, it is the best place to start when seeking mathematical guidance.

A particularly hot topic nowadays is machine learning, which seems to be a field with applications that continue to expand. This field is fundamentally built on statistics.

A good introduction to statistics appears at the end of Ash.¹ A more involved introduction is given by Jaynes et al.² The treatment by Kreyszig³ is rather incomplete, as will be our own.


population

sample

random

machine learning

training

training set

features

predictive model

statsterms Populations, samples, and machine learning

An experiment's population is a complete collection of objects that we would like to study. These objects can be people, machines, processes, or anything else we would like to understand experimentally.

Of course, we typically can't measure all of the population. Instead, we take a subset of the population, called a sample, and infer the characteristics of the entire population from this sample.

However, this inference, that the sample is somehow representative of the population, assumes the sample size is sufficiently large and that the sampling is random. This means selection of the sample should be such that no one group within a population is systematically over- or under-represented in the sample.

Machine learning is a field that makes

extensive use of measurements and statistical

inference In it an algorithm is trained by

exposure to sample data which is called a

training set The variables measured are

called features Typically a predictive

model is developed that can be used to

extrapolate from the data to a new situation

The methods of statistical analysis we

introduce in this chapter are the foundation

of most machine learning methods


Example statsterms-1: combat boots

Problem Statement: Consider a robot, Pierre, with a particular gravitas and sense of style. He seeks just the right-looking pair of combat boots for wearing in the autumn rains. Pierre is to purchase the boots online via image recognition and decides to gather data by visiting a hipster hangout one evening to train his style. For contrast, he also watches footage of a White Nationalist rally, focusing special attention on the boots of wearers of khakis and polos. Comment on Pierre's methods.


sample mean

statssample Estimation of sample mean and variance

Estimation and sample statistics

The mean and variance definitions of

Lecture probE and Lecture probmoments apply

only to a random variable for which we have a

theoretical probability distribution Typically

it is not until after having performed many

measurements of a random variable that

we can assign a good distribution model

Until then measurements can help us estimate

aspects of the data We usually start by

estimating basic parameters such as mean

and variance before estimating a probability

distribution

There are two key aspects to randomness in the measurement of a random variable. First, of course, there is the underlying randomness, with its probability distribution, mean, standard deviation, etc., which we call the population statistics. Second, there is the statistical variability that is due to the fact that we are estimating the random variable's statistics, called its sample statistics, from some sample. Statistical variability is decreased with greater sample size and number of samples, whereas the underlying randomness of the random variable does not decrease. Instead, our estimates of its probability distribution and statistics improve.

Sample mean, variance, and standard deviation

population mean

sample variance

population variance

The arithmetic mean or sample mean of a measurand with sample size N, represented by random variable X, is defined as

$$\overline{x} = \frac{1}{N} \sum_{i=1}^{N} x_i.$$ (sample.1)

If the sample size is large, x̄ → m_X (the sample mean approaches the mean). The population mean is another name for the mean m_X, which is equal to

$$m_X = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} x_i.$$ (sample.2)

Recall that the definition of the mean is m_X = E[X].

The sample variance of a measurand represented by random variable X is defined as

$$S_X^2 = \frac{1}{N - 1} \sum_{i=1}^{N} (x_i - \overline{x})^2.$$ (sample.3)

If the sample size is large, S²_X → σ²_X (the sample variance approaches the variance). The population variance is another term for the variance σ²_X and can be expressed as

$$\sigma_X^2 = \lim_{N \to \infty} \frac{1}{N - 1} \sum_{i=1}^{N} (x_i - \overline{x})^2.$$ (sample.4)

Recall that the definition of the variance is σ²_X = E[(X − m_X)²].

The sample standard deviation of a measurand represented by random variable X is defined as

$$S_X = \sqrt{S_X^2}.$$ (sample.5)

If the sample size is large, S_X → σ_X (the sample standard deviation approaches the standard deviation). The population standard deviation is another term for the standard deviation σ_X and can be expressed as

$$\sigma_X = \lim_{N \to \infty} \sqrt{S_X^2}.$$ (sample.6)

Recall that the definition of the standard deviation is σ_X = √(σ²_X).

mean of means $\overline{\overline{X}_i}$

mean of standard deviations $\overline{S}_{X_i}$

standard deviation of means $S_{\overline{X}_i}$

standard deviation of standard deviations $S_{S_{X_i}}$
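Equations sample.1, sample.3, and sample.5 can be sketched directly (my addition; note the N − 1 divisor distinguishing the sample variance from the population formula):

```python
import math

def sample_mean(x):
    # Equation sample.1
    return sum(x) / len(x)

def sample_var(x):
    # Equation sample.3 (note the N - 1 divisor)
    m = sample_mean(x)
    return sum((xi - m)**2 for xi in x) / (len(x) - 1)

def sample_std(x):
    # Equation sample.5
    return math.sqrt(sample_var(x))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
print(sample_mean(data), sample_var(data), sample_std(data))
```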

Sample statistics as random variables

There is an ambiguity in our usage of the term "sample": it can mean just one measurement, or it can mean a collection of measurements gathered together. Hopefully it is clear from context.

In the latter sense, often we collect multiple samples, each of which has its own sample mean $\overline{X}_i$ and standard deviation $S_{X_i}$. In this situation, $\overline{X}_i$ and $S_{X_i}$ are themselves random variables (meta af, I know). This means they have their own sample means $\overline{\overline{X}_i}$ and $\overline{S}_{X_i}$ and standard deviations $S_{\overline{X}_i}$ and $S_{S_{X_i}}$.

The mean of means $\overline{\overline{X}_i}$ is equivalent to a mean with a larger sample size and is therefore our best estimate of the mean of the underlying random process. The mean of standard deviations $\overline{S}_{X_i}$ is our best estimate of the standard deviation of the underlying random process. The standard deviation of means $S_{\overline{X}_i}$ is a measure of the spread in our estimates of the mean. It is our best estimate of the standard deviation of the statistical variation, and should therefore tend to zero as sample size and number of samples increase. The standard deviation of standard deviations $S_{S_{X_i}}$ is a measure of the spread in our estimates of the standard deviation of the underlying process. It should also tend to zero as sample size and number of samples increase.

Let N be the size of each sample. It can be shown that the standard deviation of the means $S_{\overline{X}_i}$ can be estimated from a single sample standard deviation:

$$S_{\overline{X}_i} \approx \frac{S_{X_i}}{\sqrt{N}}.$$ (sample.7)

This shows that, as the sample size N increases, the statistical variability of the mean decreases (and in the limit approaches zero).

Nonstationary signal statistics

The sample mean, variance, and standard deviation definitions above assume the random process is stationary; that is, its population mean does not vary with time. However, a great many measurement signals have populations that do vary with time, i.e., they are nonstationary. Sometimes the nonstationarity arises from a "drift" in the dc value of a signal or some other slowly changing variable. But dynamic signals can also change in a recognizable and predictable manner, as when, say, the temperature of a room changes when a window is opened, or when a water level changes with the tide.

Typically, we would like to minimize the effect of nonstationarity on the signal statistics. In certain cases, such as drift, the variation is a nuisance only, but other times it is the point of the measurement.


Two common techniques are used, depending on the overall type of nonstationarity. If it is periodic with some known or estimated period, the measurement data series can be "folded" or "reshaped" such that the ith measurement of each period corresponds to the ith measurement of all other periods. In this case, somewhat counterintuitively, we can consider the ith measurements to correspond to a sample of size N, where N is the number of periods over which measurements are made. When the signal is aperiodic, we often simply divide it into "small" (relative to its overall trend) intervals, over which statistics are computed separately.
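The folding technique for periodic nonstationarity can be sketched in a few lines (my addition; the Matlab example that follows does the same thing with `reshape`):

```python
def fold(series, period_len):
    """Reshape a 1-D series so that entry i of the result collects the
    i-th measurement of every period: a sample of size n_periods."""
    n_periods = len(series) // period_len
    return [[series[j * period_len + i] for j in range(n_periods)]
            for i in range(period_len)]

samples = fold(list(range(12)), 4)  # 3 periods of length 4
print(samples)  # [[0, 4, 8], [1, 5, 9], [2, 6, 10], [3, 7, 11]]
```

Each inner list is then a sample over which a mean and standard deviation can be computed, exactly as the example below does column-by-column.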

Note that in this discussion we have assumed

that the nonstationarity of the signal is due to

a variable that is deterministic (not random)

Example statssample-1: measurement of Gaussian noise on a nonstationary signal

Problem Statement: Consider the measurement of the temperature inside a desktop computer chassis via an inexpensive thermistor, a resistor that changes resistance with temperature. The processor and power supply heat the chassis in a manner that depends on processing demand. For the test protocol, the processors are cycled sinusoidally through processing power levels at a frequency of 50 mHz for nT = 12 periods and sampled at 1 Hz. Assume a temperature fluctuation between about 20 and 50 C and Gaussian noise with standard deviation 4 C. Consider a sample to be the multiple measurements of a certain instant in the period.

1. Generate and plot simulated temperature data, as a time series and as a histogram or frequency distribution. Comment on why the frequency distribution sucks.
2. Compute the sample mean and standard deviation for each sample in the cycle.
3. Subtract the mean from each sample in the period, such that each sample distribution is centered at zero. Plot the composite frequency distribution of all samples together. This represents our best estimate of the frequency distribution of the underlying process.
4. Plot a comparison of the theoretical mean, which is 35, and the sample mean of means, with an error bar. Vary the number of samples nT and comment on its effect on the estimate.
5. Plot a comparison of the theoretical standard deviation and the sample mean of sample standard deviations, with an error bar. Vary the number of samples nT and comment on its effect on the estimate.
6. Plot the sample means over a single period with error bars of ± one sample standard deviation of the means. This represents our best estimate of the sinusoidal heating temperature. Vary the number of samples nT and comment on the estimate.

Problem Solution:

clear; close all  % clear kernel

Generate the temperature data

The temperature data can be generated by constructing an array that is passed to a sinusoid, then "randomized" by Gaussian random numbers. Note that we add 1 to np and n to avoid the sneaky fencepost error.

f = 50e-3;    % Hz ... sinusoid frequency
a = 15;       % C ... amplitude of oscillation
dc = 35;      % C ... dc offset of oscillation
fs = 1;       % Hz ... sampling frequency
nT = 12;      % number of sinusoid periods
s = 4;        % C ... standard deviation
np = fs/f+1;  % number of samples per period
n = nT*np+1;  % total number of samples

t_a = linspace(0,nT/f,n);        % time array
sin_a = dc + a*sin(2*pi*f*t_a);  % sinusoidal array
rng(43);                         % seed the random number generator
noise_a = s*randn(size(t_a));    % gaussian noise
signal_a = sin_a + noise_a;      % sinusoid + noise

Now that we have an array of "data," we're ready to plot.

h = figure;
p = plot(t_a,signal_a,'o-', ...
    'Color',[.8,.8,.8], ...
    'MarkerFaceColor','b', ...
    'MarkerEdgeColor','none', ...
    'MarkerSize',3);
xlabel('time (s)');
ylabel('temperature (C)');
hgsave(h,'figures/temp');

Figure 04.1: temperature over time

This is something like what we might see for continuous measurement data. Now, the histogram.

h = figure;
histogram(signal_a, ...
    30, ...                           % number of bins
    'normalization','probability' ... % for PMF
);
xlabel('temperature (C)');
ylabel('probability');
hgsave(h,'figures/temp');

Figure 04.2: a poor histogram due to nonstationarity of the signal

This sucks because we plot a frequency distribution to tell us about the random variation, but this data includes the sinusoid.

Sample mean, variance, and standard deviation

To compute the sample mean µ and standard deviation s for each sample in the period, we must "pick out" the nT data points that correspond to each other. Currently, they're in one long 1 × n array, signal_a. It is helpful to reshape the data so it is in an nT × np array, with each row corresponding to a new period. This leaves the correct points aligned in columns. It is important to note that we can do this "folding" operation only when we know rather precisely the period of the underlying sinusoid. It is given in the problem that it is a controlled experiment variable. If we did not know it, we would have to estimate it, too, from the data.

signal_ar = reshape(signal_a(1:end-1),[np,nT])';  % nT x np
size(signal_ar)      % check size
signal_ar(1:3,1:4)   % print three rows, four columns

ans =
    12    21

ans =
   30.2718   40.0946   40.8341   44.7662
   40.1836   37.2245   49.4076   46.1137
   40.0571   40.9718   46.1627   41.9145

Define the mean, variance, and standard deviation functions as "anonymous functions." We "roll our own." These are not as efficient or flexible as the built-in Matlab functions mean, var, and std, which should typically be used.

my_mean = @(vec) sum(vec)/length(vec);
my_var = @(vec) sum((vec-my_mean(vec)).^2)/(length(vec)-1);
my_std = @(vec) sqrt(my_var(vec));

Now the sample mean, variance, and standard deviations can be computed. We proceed by looping through each column of the reshaped signal array.

mu_a = NaN*ones(1,np);   % initialize mean array
var_a = NaN*ones(1,np);  % initialize var array
s_a = NaN*ones(1,np);    % initialize std array

for i = 1:np  % for each column
    mu_a(i) = my_mean(signal_ar(:,i));
    var_a(i) = my_var(signal_ar(:,i));
    s_a(i) = sqrt(var_a(i));  % touch of speed
end

Composite frequency distribution

The columns represent samples. We want to subtract the mean from each column. We use repmat to reproduce mu_a in nT rows, so it can be easily subtracted.

signal_arz = signal_ar - repmat(mu_a,[nT,1]);
size(signal_arz)      % check size
signal_arz(1:3,1:4)   % print three rows, four columns

ans =
    12    21

ans =
   -5.0881    0.9525   -0.2909   -1.5700
    4.8237   -1.9176    8.2826   -0.2225
    4.6972    1.8297    5.0377   -4.4216

Now that all samples have the same mean, we can lump them into one big bin for the frequency distribution. There are some nice built-in functions to do a quick reshape and fit.

% resize
signal_arzr = reshape(signal_arz,[1,nT*np]);
size(signal_arzr)  % check size
% fit
pdfit_model = fitdist(signal_arzr','normal');  % fit
x_a = linspace(-15,15,100);
pdfit_a = pdf(pdfit_model,x_a);
pdf_a = normpdf(x_a,0,s);  % theoretical pdf

ans =
     1   252

Plot.

h = figure;
histogram(signal_arzr, ...
    round(s*sqrt(nT)), ...            % number of bins
    'normalization','probability' ... % for PMF
);
hold on
plot(x_a,pdfit_a,'b-','linewidth',2); hold on
plot(x_a,pdf_a,'g--','linewidth',2);
legend('pmf','pdf est','pdf');
xlabel('zero-mean temperature (C)');
ylabel('probability mass/density');
hgsave(h,'figures/temp');

Figure 04.3: PMF and estimated and theoretical PDFs

Means comparison

The sample mean of means is simply the following.

mu_mu = my_mean(mu_a)

mu_mu =
   35.1175

The standard deviation that works as an error bar, which should reflect how well we can estimate the point plotted, is the standard deviation of the means. It is difficult to compute this directly for a nonstationary process. We use the estimate given above, and improve upon it by using the mean of standard deviations instead of a single sample's standard deviation.

s_mu = mean(s_a)/sqrt(nT)

s_mu =
    1.1580

Now for the simple plot.

h = figure;
bar(mu_mu); hold on                      % gives bar
errorbar(mu_mu,s_mu,'r','linewidth',2);  % error bar
ax = gca;                                % current axis
ax.XTickLabels = {'$\overline{\overline{X}}$'};
ax.TickLabelInterpreter = 'latex';
hgsave(h,'figures/temp');

Standard deviations comparison

The sample mean of standard deviations is simply the following.

mu_s = my_mean(s_a)

mu_s =
    4.0114

The standard deviation that works as an error bar, which should reflect how well we can estimate the point plotted, is the standard deviation of the standard deviations. We can compute this directly.

s_s = my_std(s_a)

s_s =
    0.8495

Now for the simple plot.

h = figure;
bar(mu_s); hold on                     % gives bar
errorbar(mu_s,s_s,'r','linewidth',2);  % error bars
ax = gca;                              % current axis
ax.XTickLabels = {'$\overline{S}_X$'};
ax.TickLabelInterpreter = 'latex';
hgsave(h,'figures/temp');

Figure 04.4: sample mean of sample means

Figure 04.5: sample mean of sample standard deviations

Plot a period with error bars

Plotting the data with error bars is fairly straightforward with the built-in errorbar function. The main question is, "which standard deviation?" Since we're plotting the means, it makes sense to plot the error bars as a single sample standard deviation of the means.

h = figure;
e1 = errorbar(t_a(1:np),mu_a,s_mu*ones(1,np),'b');
hold on
t_a2 = linspace(0,1/f,101);
e2 = plot(t_a2,dc + a*sin(2*pi*f*t_a2),'r-');
xlim([t_a(1),t_a(np)]);
grid on
xlabel('folded time (s)');
ylabel('temperature (C)');
legend([e1,e2],'sample mean','population mean','Location','NorthEast');
hgsave(h,'figures/temp');

Figure 04.6: sample means over a period


confidence

central limit theorem

statsconfidence Confidence

One really ought to have it to give a lecture named it, but we'll give it a try anyway. Confidence is used in the common sense, although we do endow it with a mathematical definition, to scare business majors, who aren't actually impressed but indifferent. Approximately: if, under some reasonable assumptions (probabilistic model), we estimate the probability of some event to be P, we say we have P confidence in it. I mean, business majors are all "Supply and demand? Let's call that a 'law'," so I think we're even.

So we're back to computing probability from distributions: probability density functions (PDFs) and probability mass functions (PMFs). Usually we care most about estimating the mean of our distribution. Recall from the previous lecture that when several samples are taken, each with its own mean, the mean is itself a random variable, with a mean of its own, of course. Meanception.

But the mean has a probability distribution of its own. The central limit theorem has, as one of its implications, that as the sample size N gets large, regardless of the samples' distributions, this distribution of means approaches the Gaussian distribution. But sometimes I worry I'm being lied to, so let's check.


clear; close all; % clear kernel

Generate some data to test the central

limit theorem

Data can be generated by constructing an array using a (seeded, for consistency) random number generator. Let's use a uniformly distributed PDF between 0 and 1.

N = 150; % sample size (number of measurements per sample)
M = 120; % number of samples
n = N*M; % total number of measurements
mu_pop = 0.5; % because it's a uniform PDF between 0 and 1

rng(11); % seed the random number generator
signal_a = rand(N, M); % uniform PDF
size(signal_a) % check the size

ans =
   150   120

Let's take a look at the data by plotting the first ten samples (columns), as shown in Figure 047. This is something like what we might see for continuous measurement data. Now consider the histogram shown in Figure 048.

samples_to_plot = 10;
h = figure;



Figure 047: raw data with colors corresponding to samples


Figure 048: a histogram showing the approximately uniform distribution of each sample (color)

c = jet(samples_to_plot); % color array
for j = 1:samples_to_plot
  histogram(signal_a(:,j), 30, ... % number of bins
    'facecolor', c(j,:), 'facealpha', .3, ...
    'normalization', 'probability' ... % for PMF
  );
  hold on;
end
hold off;
xlim([-.05, 1.05]);
xlabel('measurement');
ylabel('probability');
hgsave(h, 'figures/temp');

This isn't a great plot, but it shows roughly that each sample is fairly uniformly distributed.


Sample statistics

Now let's check out the sample statistics. We want the sample mean and standard deviation of each column. Let's use the built-in functions mean and std.

mu_a = mean(signal_a, 1); % mean of each column
s_a = std(signal_a, 1); % std of each column

Now we can compute the mean statistics: both the mean of the means $\overline{\overline{X}}$ and the standard deviation of the means $s_{\overline{X}}$, which we don't strictly need for this part, but we're curious. We choose to use the direct estimate instead of the $s_X/\sqrt{N}$ formula, but they should be close.
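As a cross-check of that claim, here is a standard-library Python sketch (hypothetical and separate from the Matlab workspace above): generate M samples of size N, compute the standard deviation of the sample means directly, and compare it with the mean sample standard deviation divided by the square root of N.

```python
import random
import statistics

random.seed(11)
N, M = 150, 120  # measurements per sample, number of samples

# M samples of N uniform measurements on [0, 1)
samples = [[random.random() for _ in range(N)] for _ in range(M)]

means = [statistics.mean(s) for s in samples]
stdevs = [statistics.stdev(s) for s in samples]

s_mu_direct = statistics.stdev(means)            # direct estimate
s_mu_formula = statistics.mean(stdevs) / N**0.5  # mean(s_X)/sqrt(N)

# both estimate sigma/sqrt(N) = (1/sqrt(12))/sqrt(150), roughly 0.024
print(s_mu_direct, s_mu_formula)
```

With sample sizes like these, the two estimates agree closely, which is the point of the remark above.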

mu_mu = mean(mu_a)
s_mu = std(mu_a)

mu_mu =
    0.4987

s_mu =
    0.0236

The truth about sample means

It's the moment of truth. Let's look at the distribution shown in Figure 049.



Figure 049: a histogram showing the approximately normal distribution of the means

h = figure;
histogram(mu_a, ...
  'normalization', 'probability' ... % for PMF
);
hold off;
xlabel('measurement');
ylabel('probability');
hgsave(h, 'figures/temp');

This looks like a Gaussian distribution about the mean of means, so I guess the central limit theorem is legit.

Gaussian and probability

We already know how to compute the probability P that a value of a random variable X lies in a certain interval from a PMF or PDF (the sum or the integral, respectively). This means that, for sufficiently large sample size N such that we can assume from the central limit theorem that the sample means $x_i$ are normally distributed, the probability a sample mean value $x_i$ is in a certain interval is given by integrating the Gaussian PDF. The Gaussian PDF for a random variable Y representing the sample means is

f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(\frac{-(y - \mu)^2}{2\sigma^2}\right)  (confidence1)


Gaussian cumulative distribution function

where $\mu$ is the population mean and $\sigma$ is the population standard deviation.

The integral of f over some interval is the probability a value will be in that interval. Unfortunately, that integral is uncool. It gives rise to the definition of the error function, which for the Gaussian random variable Y is

\operatorname{erf}(y_b) = \frac{1}{\sqrt{\pi}} \int_{-y_b}^{y_b} e^{-t^2}\, dt  (confidence2)

This expresses the probability of a sample mean being in the interval $[-y_b, y_b]$ if Y has mean 0 and variance 1/2.

Matlab has this built-in as erf shown in

Figure 0410

y_a = linspace(0, 3, 100);
h = figure;
p1 = plot(y_a, erf(y_a));
p1.LineWidth = 2;
grid on;
xlabel('interval bound $y_b$', 'interpreter', 'latex');
ylabel('error function $\textrm{erf}(y_b)$', 'interpreter', 'latex');
hgsave(h, 'figures/temp');

We could deal directly with the error function, but most people don't, and we're weird enough as it is. Instead, people use the Gaussian cumulative distribution function



Figure 0410 the error function

(CDF) $\Phi: \mathbb{R} \to \mathbb{R}$, which is defined as

\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}\left(\frac{z}{\sqrt{2}}\right)\right)  (confidence3)

and which expresses the probability of a Gaussian random variable Z, with mean 0 and standard deviation 1, taking on a value in the interval $(-\infty, z]$. The Gaussian CDF and PDF are shown in Figure 0411. Values can be taken directly from the graph, but it's more accurate to use the table of values in Appendix A01.
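The relationship (confidence3) between $\Phi$ and erf is easy to verify numerically. Here is a small standard-library Python sketch (an aside, not part of the lecture's Matlab session); both math.erf and statistics.NormalDist are in the Python standard library.

```python
import math
from statistics import NormalDist

def Phi(z):
    """Standard normal CDF built from the error function (confidence3)."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# check against the library's own standard normal CDF
std_normal = NormalDist()  # mean 0, standard deviation 1
for z in (-2.0, -1.0, 0.0, 0.5, 1.96):
    assert abs(Phi(z) - std_normal.cdf(z)) < 1e-12

print(Phi(0.0))             # 0.5, by symmetry
print(round(Phi(1.96), 3))  # about 0.975, the familiar two-sided 95% value
```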

That's great and all, but occasionally (always) we have Gaussian random variables with nonzero means and nonunity standard deviations. It turns out we can shift any Gaussian random variable by its mean and scale it by its standard deviation to make it have zero mean and unity standard deviation. We can then use $\Phi$ and interpret the results as being relative to the mean and standard deviation, using phrases like "the probability it is within two standard deviations of its mean." The transformed random variable Z and its values z are sometimes called the z-score.

For a particular value x of a random variable X, we can compute its z-score (or value z of

(a) the Gaussian PDF. (b) the Gaussian CDF.

Figure 0411 the Gaussian PDF and CDF for z-scores

the random variable Z) with the formula

z = \frac{x - \mu_X}{\sigma_X}  (confidence4)

and compute the probability of X taking on a value within an interval, say $x \in [x_{b-}, x_{b+}]$, from the table. (Sample statistics $\overline{X}$ and $S_X$ are appropriate when population statistics are unknown.)

For instance, compute the probability a Gaussian random variable X with $\mu_X = 5$ and $\sigma_X = 2.34$ takes on a value within the interval $x \in [3, 6]$:

1. Compute the z-score of each endpoint of the interval:


z_3 = \frac{3 - \mu_X}{\sigma_X} \approx -0.85  (confidence5)

z_6 = \frac{6 - \mu_X}{\sigma_X} \approx 0.43  (confidence6)

2. Look up the CDF values for $z_3$ and $z_6$, which are $\Phi(z_3) = 0.1977$ and $\Phi(z_6) = 0.6664$.
3. The CDF values correspond to the probabilities $x < 3$ and $x < 6$. Therefore, to find the probability x lies in that interval, we subtract the lower-bound probability:

P(x \in [3, 6]) = P(x < 6) - P(x < 3)  (confidence7)
= \Phi(z_6) - \Phi(z_3)  (confidence8)
\approx 0.6664 - 0.1977  (confidence9)
\approx 0.4689  (confidence10)

So there is a 46.89% probability, and therefore we have 46.89% confidence, that $x \in [3, 6]$.

Often we want to go the other way: estimating the symmetric interval $[x_{b-}, x_{b+}]$ for which there is a given probability. In this case, we first look up the z-score corresponding to a certain probability. For concreteness, given the same population statistics above, let's find the symmetric interval $[x_{b-}, x_{b+}]$ over which we have 90% confidence. From the table, we want two symmetric z-scores that have CDF-value difference 0.9. Or, in maths,

\Phi(z_{b+}) - \Phi(z_{b-}) = 0.9 \quad \text{and} \quad z_{b+} = -z_{b-}  (confidence11)


Due to the latter relation and the fact that the Gaussian CDF has the antisymmetry

\Phi(z_{b+}) + \Phi(z_{b-}) = 1,  (confidence12)

adding the two $\Phi$ equations gives

2\Phi(z_{b+}) = 1.9  (confidence13)
\Phi(z_{b+}) = 0.95  (confidence14)

and $\Phi(z_{b-}) = 0.05$. From the table, these correspond (with a linear interpolation) to $z_b = z_{b+} = -z_{b-} \approx 1.645$. All that remains is to solve the z-score formula for x:

x = \mu_X + z\sigma_X  (confidence15)

From this,

x_{b+} = \mu_X + z_{b+}\sigma_X \approx 8.849  (confidence16)
x_{b-} = \mu_X + z_{b-}\sigma_X \approx 1.151  (confidence17)

and X has a 90% confidence interval [1.151, 8.849].
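Both worked computations above can be replicated in standard-library Python (a sketch, not the lecture's Matlab; it assumes the same population statistics $\mu_X = 5$ and $\sigma_X = 2.34$):

```python
from statistics import NormalDist

mu_X, sigma_X = 5.0, 2.34
X = NormalDist(mu=mu_X, sigma=sigma_X)

# probability that x lies in [3, 6]
p = X.cdf(6) - X.cdf(3)
print(round(p, 3))  # about 0.469, matching the table computation above

# symmetric 90% interval: z_b such that Phi(z_b) = 0.95
z_b = NormalDist().inv_cdf(0.95)
x_lo, x_hi = mu_X - z_b * sigma_X, mu_X + z_b * sigma_X
print(round(x_lo, 3), round(x_hi, 3))  # about 1.151 and 8.849
```

The small differences from the table values come from the table's rounding and the linear interpolation used above.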


Example statsconfidence-1 re: gaussian confidence for a mean

Problem Statement: Consider the data set generated above. What is our 95% confidence interval in our estimate of the mean?

Problem Solution

Assuming we have a sufficiently large data set, the distribution of means is approximately Gaussian. Following the same logic as above, we need the z-score that gives an upper CDF value of (1 + 0.95)/2 = 0.975. From the table, we obtain the $z_b = z_{b+} = -z_{b-}$ below.

z_b = 1.96;

Now we can estimate the mean using our sample and mean statistics:

\mu_X = \overline{\overline{X}} \pm z_b s_{\overline{X}}  (confidence18)

mu_x_95 = mu_mu + [-z_b, z_b]*s_mu

mu_x_95 =
    0.4526    0.5449


This is our 95% confidence interval in our estimate of the mean.


Student's t-distribution

degrees of freedom

statsstudent Student

confidence

The central limit theorem tells us that for large sample size N, the distribution of the means is Gaussian. However, for small sample size, the Gaussian isn't as good of an estimate. Student's t-distribution is superior for lower sample size and equivalent at higher sample size. Technically, if the population standard deviation $\sigma_X$ is known, even for low sample size, we should use the Gaussian distribution. However, this rarely arises in practice, so we can usually get away with an "always t" approach.

A way that the t-distribution accounts for low N is by having an entirely different distribution for each N (seems a bit of a cheat to me). Actually, instead of N, it uses the degrees of freedom ν, which is N minus the number of parameters required to compute the statistic. Since the standard deviation requires only the mean, for most of our cases ν = N - 1.

As with the Gaussian distribution, the t-distribution's integral is difficult to calculate. Typically, we will use a t-table such as the one given in Appendix A02. There are three points of note.
1. Since we are primarily concerned with going from probability/confidence values (e.g., P probability/confidence) to intervals, typically there is a column for each probability.

stats Statistics student Student confidence p 2

2. The extra parameter ν takes over one of the dimensions of the table, because three-dimensional tables are illegal.
3. Many of these tables are "two-sided," meaning their t-scores and probabilities assume you want the symmetric probability about the mean over the interval $[-t_b, t_b]$, where $t_b$ is your t-score bound.

Consider the following example

Example statsstudent-1 re: student confidence interval

Problem Statement: Write a Matlab script to generate a data set with 200 samples and sample sizes $N \in \{10, 20, 100\}$, using any old distribution. Compare the distribution of the means for the different N. Use the sample distributions and a t-table to compute 99% confidence intervals.

Problem Solution

Generate the data set

confidence = 0.99; % requirement

M = 200; % number of samples
N_a = [10, 20, 100]; % sample sizes

mu = 27; % population mean
sigma = 9; % population std

rng(1); % seed random number generator
data_a = mu + sigma*randn(N_a(end), M); % normal


size(data_a) % check size
data_a(1:10, 1:5) % check 10 rows and five columns

ans =
   100   200

ans =
   21.1589   30.2894   27.8705   30.7835   28.3662
   37.6305   17.1264   28.2973   24.0811   34.3486
   20.1739   44.3719   43.7059   39.0699   32.2002
   17.0135   32.6064   36.9030   37.9230   36.5747
   19.3900   32.9156   23.7230   22.4749   19.7709
   21.8460   13.8295   31.2479   16.9527   34.1876
   21.9719   34.6854   19.4480   18.7014   24.1642
   28.6054   32.2244   22.2873   26.9906   37.6746
   25.2282   18.7326   14.5011   28.3814   27.7645
   32.2780   34.1538   27.0382   18.8643   14.1752

Compute the means for different sample sizes

mu_a = NaN*ones(length(N_a), M);
for i = 1:length(N_a)
  mu_a(i,:) = mean(data_a(1:N_a(i), 1:M), 1);
end

Plotting the distribution of

the means yields Figure 0412


Figure 0412: a histogram showing the distribution of the means for each sample size (intervals shown for N = 10, 20, 100)

It makes sense that the larger the sample size, the smaller the spread. A quantitative metric for the spread is, of course, the standard deviation of the means for each sample size.

S_mu = std(mu_a, 0, 2)

S_mu =
    2.8365
    2.0918
    1.0097

Look up t-table values (or use Matlab's tinv) for the different sample sizes and 99% confidence. Use these, the mean of means, and the standard deviation of means to compute the 99% confidence interval for each N.

t_a = tinv(confidence, N_a - 1)
for i = 1:length(N_a)
  interval = mean(mu_a(i,:)) + [-1, 1]*t_a(i)*S_mu(i);
  disp(sprintf('interval for N = %i:', N_a(i)))
  disp(interval)
end


t_a =
    2.8214    2.5395    2.3646

interval for N = 10:
   19.0942   35.1000
interval for N = 20:
   21.6292   32.2535
interval for N = 100:
   24.7036   29.4787

As expected, the larger the sample size, the smaller the interval over which we have 99% confidence in the estimate.


joint PDF

joint PMF

statsmultivar Multivariate

probability and

correlation

Thus far, we have considered probability density and mass functions (PDFs and PMFs) of only one random variable. But of course, often we measure multiple random variables $X_1, X_2, \ldots, X_n$ during a single event, meaning a measurement consists of determining values $x_1, x_2, \ldots, x_n$ of these random variables.

We can consider an n-tuple of continuous random variables to form a sample space $\Omega = \mathbb{R}^n$, on which a multivariate function $f: \mathbb{R}^n \to \mathbb{R}$, called the joint PDF, assigns a probability density to each outcome $x \in \mathbb{R}^n$. The joint PDF must be greater than or equal to zero for all $x \in \mathbb{R}^n$, the multiple integral over Ω must be unity, and the multiple integral over a subset of the sample space $A \subset \Omega$ is the probability of the event A.

We can consider an n-tuple of discrete random variables to form a sample space $\mathbb{N}_0^n$, on which a multivariate function $f: \mathbb{N}_0^n \to \mathbb{R}$, called the joint PMF, assigns a probability to each outcome $x \in \mathbb{N}_0^n$. The joint PMF must be greater than or equal to zero for all $x \in \mathbb{N}_0^n$, the multiple sum over Ω must be unity, and the multiple sum over a subset of the sample space $A \subset \Omega$ is the probability of the event A.


Example statsmultivar-1 re: bivariate gaussian pdf

Problem Statement: Let's visualize multivariate PDFs by plotting a bivariate Gaussian using Matlab's mvnpdf function.

Problem Solution

mu = [10, 20]; % means
Sigma = [1, 0; 0, 2]; % cov, we'll talk about this
x1_a = linspace( ...
  mu(1) - 5*sqrt(Sigma(1,1)), mu(1) + 5*sqrt(Sigma(1,1)), 30 ...
);
x2_a = linspace( ...
  mu(2) - 5*sqrt(Sigma(2,2)), mu(2) + 5*sqrt(Sigma(2,2)), 30 ...
);
[X1, X2] = meshgrid(x1_a, x2_a);
f = mvnpdf([X1(:), X2(:)], mu, Sigma);
f = reshape(f, length(x2_a), length(x1_a));

h = figure;
p = surf(x1_a, x2_a, f);
xlabel('$x_1$', 'interpreter', 'latex');
ylabel('$x_2$', 'interpreter', 'latex');
zlabel('$f(x_1,x_2)$', 'interpreter', 'latex');
shading interp;
hgsave(h, 'figures/temp');

The result is Figure 0413. Note how the means and standard deviations affect the distribution.


marginal PDF

Figure 0413: two-variable gaussian PDF

Marginal probability

The marginal PDF of a multivariate PDF is the PDF of some subspace of Ω after one or more variables have been "integrated out," such that a fewer number of random variables remain. Of course, these marginal PDFs must have the same properties of any PDF, such as integrating to unity.

Example statsmultivar-2 re: bivariate gaussian marginal probability

Problem Statement: Let's demonstrate this by numerically integrating over $x_2$ in the bivariate Gaussian above.

Problem Solution

Continuing from where we left off, let's integrate.

f1 = trapz(x2_a, f); % trapezoidal integration over x_2

Plotting this yields Figure 0414

h = figure;
p = plot(x1_a, f1);
p.LineWidth = 2;
xlabel('$x_1$', 'interpreter', 'latex');
ylabel('$g(x_1)=\int_{-\infty}^{\infty} f(x_1,x_2)\, dx_2$', 'interpreter', 'latex');
hgsave(h, 'figures/temp');

machine learning


Figure 0414 marginal Gaussian PDF g(x1)

We should probably verify that this

integrates to one

disp(['integral over x_1 = ', sprintf('%0.7f', trapz(x1_a, f1))])

integral over x_1 = 0.9999986

Not bad
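The same check can be sketched independently in standard-library Python (hypothetical, not tied to the Matlab workspace): build a bivariate Gaussian with independent components on a grid, integrate out $x_2$ with the trapezoidal rule, and confirm the resulting marginal integrates to (nearly) one.

```python
import math

def gauss(x, mu, sigma):
    """Univariate Gaussian PDF."""
    return math.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

def trapz(y, x):
    """Trapezoidal-rule integral of samples y over grid x."""
    return sum((y[i] + y[i + 1]) * (x[i + 1] - x[i]) / 2 for i in range(len(x) - 1))

# independent components, so the joint PDF factors: f(x1, x2) = g1(x1) g2(x2)
mu1, s1 = 10.0, 1.0
mu2, s2 = 20.0, math.sqrt(2.0)
x1 = [mu1 - 5 * s1 + i * 10 * s1 / 199 for i in range(200)]
x2 = [mu2 - 5 * s2 + i * 10 * s2 / 199 for i in range(200)]

# marginal g(x1): integrate the joint PDF over x2 at each x1
g1 = [trapz([gauss(a, mu1, s1) * gauss(b, mu2, s2) for b in x2], x2) for a in x1]

print(round(trapz(g1, x1), 5))  # close to 1, like the Matlab result
```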

Covariance

Very often especially in machine learning


artificial intelligence

covariance

correlation

sample covariance

covariance matrix

or artificial intelligence, applications, the question about two random variables X and Y is: how do they co-vary? That is, what is their covariance, defined as

\operatorname{Cov}[X, Y] \equiv \operatorname{E}\left((X - \mu_X)(Y - \mu_Y)\right) = \operatorname{E}(XY) - \mu_X \mu_Y

Note that when X = Y, the covariance is just the variance. When a covariance is large and positive, it is an indication that the random variables are strongly correlated. When it is large and negative, they are strongly anti-correlated. Zero covariance means the variables are uncorrelated. In fact, correlation is defined as

\operatorname{Cor}[X, Y] = \frac{\operatorname{Cov}[X, Y]}{\sqrt{\operatorname{Var}[X]\operatorname{Var}[Y]}}

This is essentially the covariance "normalized" to the interval [-1, 1].

Sample covariance

As with the other statistics we've considered, covariance can be estimated from measurement. The estimate, called the sample covariance $q_{XY}$ of random variables X and Y with sample size N, is given by

q_{XY} = \frac{1}{N - 1} \sum_{i=1}^{N} (x_i - \overline{X})(y_i - \overline{Y})
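For concreteness, the formulas above can be written directly in a few lines. This is a standard-library Python sketch (the helper names are hypothetical), with a tiny perfectly correlated data set as a sanity check:

```python
import math

def sample_cov(x, y):
    """Sample covariance q_XY = sum((x_i - xbar)(y_i - ybar)) / (N - 1)."""
    N = len(x)
    xbar, ybar = sum(x) / N, sum(y) / N
    return sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (N - 1)

def sample_cor(x, y):
    """Sample correlation: covariance normalized to [-1, 1]."""
    return sample_cov(x, y) / math.sqrt(sample_cov(x, x) * sample_cov(y, y))

x = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 6.0]  # y = 2x, perfectly correlated with x
print(sample_cov(x, y))  # 2.0
print(sample_cor(x, y))  # 1.0
```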

Multivariate covariance

With n random variables $X_i$, one can compute the covariance of each pair. It is common practice to define an $n \times n$ matrix of covariances, called the covariance matrix Σ,


sample covariance matrix

such that each pair's covariance $\operatorname{Cov}[X_i, X_j]$ (multivar1) appears in its row-column combination (making it symmetric), as shown below:

\Sigma = \begin{bmatrix}
\operatorname{Cov}[X_1, X_1] & \operatorname{Cov}[X_1, X_2] & \cdots & \operatorname{Cov}[X_1, X_n] \\
\operatorname{Cov}[X_2, X_1] & \operatorname{Cov}[X_2, X_2] & \cdots & \operatorname{Cov}[X_2, X_n] \\
\vdots & \vdots & & \vdots \\
\operatorname{Cov}[X_n, X_1] & \operatorname{Cov}[X_n, X_2] & \cdots & \operatorname{Cov}[X_n, X_n]
\end{bmatrix}

The multivariate sample covariance matrix Q is the same as above, but with sample covariances $q_{X_i X_j}$. Both covariance matrices have correlation analogs.

Example statsmultivar-3 re: car data sample covariance and correlation

Problem Statement: Let's use a built-in multivariate data set that describes different features of cars, listed below.

d = load('carsmall.mat'); % this is a struct

Let's compute the sample covariance and correlation matrices.

Problem Solution

variables = {'MPG', 'Cylinders', 'Displacement', ...
  'Horsepower', 'Weight', 'Acceleration', 'Model_Year'};
n = length(variables);
m = length(d.MPG);

data = NaN*ones(m, n); % preallocate
for i = 1:n
  data(:,i) = d.(variables{i});
end

cov_d = nancov(data) % sample covariance
cor_d = corrcov(cov_d) % sample correlation

This is highly correlated/anticorrelated data. Let's plot each variable versus each other variable to see the correlations of each. We use a red-to-blue colormap to contrast anticorrelation and correlation. Purple, then, is uncorrelated.

The following builds the red-to-blue colormap

n_colors = 10;
cmap_rb = NaN*ones(n_colors, 3);
for i = 1:n_colors
  a = i/n_colors;
  cmap_rb(i,:) = (1-a)*[1,0,0] + a*[0,0,1];
end

h = figure;
for i = 1:n
  for j = 1:n
    subplot(n, n, sub2ind([n,n], j, i));
    p = plot(d.(variables{i}), d.(variables{j}), '.'); hold on;
    this_color = cmap_rb( ...
      round((cor_d(i,j)+1)*(n_colors-1)/2) + 1, : ...
    );
    p.MarkerFaceColor = this_color;
    p.MarkerEdgeColor = this_color;
  end
end
hgsave(h, 'figures/temp');

Figure 0415 car data correlation

Conditional probability and dependence

Independent variables are uncorrelated. However, uncorrelated variables may or may not be independent. Therefore, we cannot use correlation alone as a test for independence. For instance, for random variables X and Y, where X has some even distribution and $Y = X^2$, clearly the variables are dependent. However, they are also uncorrelated (due to symmetry).


Example statsmultivar-4 re: uncorrelated but dependent variables

Problem Statement: Using a uniform distribution U(-1, 1), show that X and Y are uncorrelated (but dependent), with $Y = X^2$, with some sampling. We compute the correlation for different sample sizes.

Problem Solution

N_a = round(linspace(10, 500, 100));
qc_a = NaN*ones(size(N_a));
rng(6);
x_a = -1 + 2*rand(max(N_a), 1);
y_a = x_a.^2;
for i = 1:length(N_a)
  % should write incremental algorithm, but lazy
  q = cov(x_a(1:N_a(i)), y_a(1:N_a(i)));
  qc = corrcov(q);
  qc_a(i) = qc(2,1); % cross correlation
end

The absolute values of the correlations are shown in Figure 0416. Note that we should probably average several such curves to estimate how the correlation would drop off with N, but the single curve describes our understanding that the correlation in fact approaches zero in the large-sample limit.

h = figure;
p = plot(N_a, abs(qc_a));
p.LineWidth = 2;
xlabel('sample size $N$', 'interpreter', 'latex');
ylabel('absolute sample correlation', 'interpreter', 'latex');
hgsave(h, 'figures/temp');


correlation between X sim U(ndash1 1) and Y = X2 fordifferent sample size N In the limit the populationcorrelation should approach zero and yet X and Yare not independent


statsregression Regression

Suppose we have a sample with two measurands: (1) the force F through a spring and (2) its displacement X (not from equilibrium). We would like to determine an analytic function that relates the variables, perhaps for prediction of the force given another displacement.

There is some variation in the measurement. Let's say the following is the sample.

X_a = 1e-3*[10,21,30,41,49,50,61,71,80,92,100]; % m
F_a = [50.1,50.4,53.2,55.9,57.2,59.9, ...
  61.0,63.9,67.0,67.9,70.3]; % N

Let's take a look at the data. The result is Figure 0417.

h = figure;
p = plot(X_a*1e3, F_a, 'b.', 'MarkerSize', 15);
xlabel('$X$ (mm)', 'interpreter', 'latex');
ylabel('$F$ (N)', 'interpreter', 'latex');
xlim([0, max(X_a*1e3)]);
grid on;
hgsave(h, 'figures/temp');

How might we find an analytic function that agrees with the data? Broadly, our strategy will be to assume a general form of a function and use the data to set the parameters in the



Figure 0417: force-displacement data

function such that the difference between the

data and the function is minimal

Let y be the analytic function that we would like to fit to the data. Let $y_i$ denote the value of $y(x_i)$, where $x_i$ is the ith value of the random variable X from the sample. Then we want to minimize the differences between the force measurements $F_i$ and $y_i$. From calculus, recall that we can minimize a function by differentiating it and solving for the zero-crossings (which correspond to local maxima or minima).

First, we need such a function to minimize. Perhaps the simplest effective function D is constructed by squaring and summing the differences we want to minimize, for sample size N:

D = \sum_{i=1}^{N} (F_i - y_i)^2  (regression1)


(recall that $y_i = y(x_i)$, which makes D a function of the $x_i$).

Now the form of y must be chosen. We consider only mth-order polynomial functions y, but others can be used in a similar manner:

y(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m  (regression2)

If we treat D as a function of the polynomial coefficients $a_j$, i.e.,

D(a_0, a_1, \cdots, a_m),  (regression3)

and minimize D, we must take the partial derivatives of D with respect to each $a_j$ and set each equal to zero:

\frac{\partial D}{\partial a_0} = 0, \quad \frac{\partial D}{\partial a_1} = 0, \quad \cdots, \quad \frac{\partial D}{\partial a_m} = 0

This gives us N equations and m + 1 unknowns $a_j$. Writing the system in matrix form,

\underbrace{\begin{bmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^m \\
1 & x_2 & x_2^2 & \cdots & x_2^m \\
1 & x_3 & x_3^2 & \cdots & x_3^m \\
\vdots & & & & \vdots \\
1 & x_N & x_N^2 & \cdots & x_N^m
\end{bmatrix}}_{A,\ N \times (m+1)}
\underbrace{\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_m \end{bmatrix}}_{a,\ (m+1) \times 1}
=
\underbrace{\begin{bmatrix} F_1 \\ F_2 \\ F_3 \\ \vdots \\ F_N \end{bmatrix}}_{b,\ N \times 1}  (regression4)

Typically $N > m$ and this is an overdetermined system. Therefore, we usually can't solve by taking $A^{-1}$, because A doesn't have an inverse.


Instead, we either find the Moore-Penrose pseudo-inverse $A^\dagger$ and have $a = A^\dagger b$ as the solution, which is inefficient, or we can approximate the solution with an algorithm such as that used by Matlab's \ operator. In the latter case,

a_a = A\b_a;
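For the simplest case, a first-order polynomial $y(x) = a_0 + a_1 x$, the minimization above has a familiar closed form. Here is a standard-library Python sketch of that special case (an illustration under the same least-squares criterion, not the lecture's Matlab workflow):

```python
def linear_fit(x, F):
    """Least-squares line a0 + a1*x minimizing D = sum (F_i - y_i)^2."""
    N = len(x)
    xbar, Fbar = sum(x) / N, sum(F) / N
    # normal-equation solution for the slope and intercept
    a1 = sum((xi - xbar) * (Fi - Fbar) for xi, Fi in zip(x, F)) \
         / sum((xi - xbar)**2 for xi in x)
    a0 = Fbar - a1 * xbar
    return a0, a1

# the exact line is recovered from noise-free data
a0, a1 = linear_fit([0.0, 1.0, 2.0, 3.0], [1.0, 3.0, 5.0, 7.0])
print(a0, a1)  # 1.0 2.0
```

For higher orders, one would build the Vandermonde matrix A of (regression4) and solve the overdetermined system, which is exactly what the Matlab \ operator does below.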

Example statsregression-1 re: regression

Problem Statement: Use Matlab's \ operator to find a good polynomial fit for the sample. There's the sometimes-difficult question, "what order should we fit?" Let's try out several and see what the squared-differences function D gives.

Problem Solution

N = length(X_a); % sample size
m_a = 2:N; % all the orders up to N

A = NaN*ones(length(m_a), max(m_a), N);
for k = 1:length(m_a) % each order
  for j = 1:N % each measurement
    for i = 1:( m_a(k) + 1 ) % each coef
      A(k,j,i) = X_a(j)^(i-1);
    end
  end
end
disp(squeeze(A(2,:,1:5)))

    1.0000    0.0100    0.0001    0.0000       NaN
    1.0000    0.0210    0.0004    0.0000       NaN
    1.0000    0.0300    0.0009    0.0000       NaN
    1.0000    0.0410    0.0017    0.0001       NaN
    1.0000    0.0490    0.0024    0.0001       NaN
    1.0000    0.0500    0.0025    0.0001       NaN
    1.0000    0.0610    0.0037    0.0002       NaN
    1.0000    0.0710    0.0050    0.0004       NaN
    1.0000    0.0800    0.0064    0.0005       NaN
    1.0000    0.0920    0.0085    0.0008       NaN
    1.0000    0.1000    0.0100    0.0010       NaN

We've printed the first five columns of the third-order matrix, which only has four columns, so NaNs fill in the rest.

Now we can use the \ operator to solve for the coefficients.

a = NaN*ones(length(m_a), max(m_a));

warning('off', 'all');
for i = 1:length(m_a)
  A_now = squeeze(A(i,:,1:m_a(i)));
  a(i,1:m_a(i)) = (A_now(:,1:m_a(i))\F_a')';
end
warning('on', 'all');

n_plot = 100;
x_plot = linspace(min(X_a), max(X_a), n_plot);
y = NaN*ones(n_plot, length(m_a)); % preallocate
for i = 1:length(m_a)
  y(:,i) = polyval(fliplr(a(i,1:m_a(i))), x_plot);
end

h = figure;
for i = 1:2:length(m_a)-1
  p = plot(x_plot*1e3, y(:,i), 'linewidth', 1.5);
  hold on;
  p.DisplayName = sprintf( ...
    'order %i', (m_a(i)-1) ...
  );
end
p = plot(X_a*1e3, F_a, 'b.', 'MarkerSize', 15);
xlabel('$X$ (mm)', 'interpreter', 'latex');
ylabel('$F$ (N)', 'interpreter', 'latex');
p.DisplayName = 'sample';
legend('show', 'location', 'southeast');
grid on;
hgsave(h, 'figures/temp');


Figure 0418: force-displacement data with curve fits


statsex Exercises for Chapter

04

calculus

limit

series

derivative

integral

vector calculus

vecs

Vector calculus

A great many physical situations of interest to engineers can be described by calculus. It can describe how quantities continuously change over (say) time, and gives tools for computing other quantities. We assume familiarity with the fundamentals of calculus: limit, series, derivative, and integral. From these and a basic grasp of vectors, we will outline some of the highlights of vector calculus. Vector calculus is particularly useful for describing the physics of, for instance, the following:

mechanics of particles, wherein is studied the motion of particles and the forcing causes thereof;
rigid-body mechanics, wherein is studied the motion, rotational and translational, and its forcing causes, of bodies considered rigid (undeformable);
solid mechanics, wherein is studied the motion and deformation, and their forcing causes, of continuous solid bodies (those that retain a specific resting shape);


complex analysis

1. For an introduction to complex analysis, see Kreyszig (Kreyszig, Advanced Engineering Mathematics, Part D).
2. Ibid., Chapters 9, 10.
3. H.M. Schey. Div, Grad, Curl, and All That: An Informal Text on Vector Calculus. W.W. Norton, 2005. ISBN 9780393925166.

euclidean vector space

bases

components

basis vectors

invariant

coordinate system

Cartesian coordinates

fluid mechanics, wherein is studied the motion, and its forcing causes, of fluids (liquids, gases, plasmas);
heat transfer, wherein is studied the movement of thermal energy through and among bodies; and
electromagnetism, wherein is studied the motion, and its forcing causes, of electrically charged particles.

This last example was in fact very influential in the original development of both vector calculus and complex analysis.1 It is not an exaggeration to say that the topics above comprise the majority of physical topics of interest in engineering. A good introduction to vector calculus is given by Kreyszig.2 Perhaps the most famous and enjoyable treatment is given by Schey3 in the adorably titled Div, Grad, Curl, and All That.

It is important to note that, in much of what follows, we will describe space (typically the three-dimensional space of our lived experience) as a euclidean vector space: an n-dimensional vector space isomorphic to $\mathbb{R}^n$. As we know from linear algebra, any vector $v \in \mathbb{R}^n$ can be expressed in any number of bases. That is, the vector v is a basis-free object with multiple basis representations. The components and basis vectors of a vector change with basis changes, but the vector itself is invariant. A coordinate system is in fact just a basis. We are most familiar, of course, with Cartesian


manifolds

non-euclidean geometry

parallel postulate

differential geometry

4. John M. Lee. Introduction to Smooth Manifolds. Second edition. Vol. 218. Graduate Texts in Mathematics. Springer, 2012.
5. Francesco Bullo and Andrew D. Lewis. Geometric Control of Mechanical Systems: Modeling, Analysis, and Design for Simple Mechanical Control Systems. Ed. by J.E. Marsden, L. Sirovich, and M. Golubitsky. Springer, 2005.

coordinates, which is the specific orthonormal basis b for $\mathbb{R}^n$:

b_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad
b_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad \cdots, \quad
b_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}  (vecs1)

Manifolds are spaces that appear locally as $\mathbb{R}^n$ but can be globally rather different, and can describe non-euclidean geometry, wherein euclidean geometry's parallel postulate is invalid. Calculus on manifolds is the focus of differential geometry, a subset of which we can consider our current study. A motivation for further study of differential geometry is that it is very convenient when dealing with advanced applications of mechanics, such as rigid-body mechanics of robots and vehicles. A very nice mathematical introduction is given by Lee,4 and Bullo and Lewis5 give a compact presentation in the context of robotics.

Vector fields have several important

properties of interest wersquoll explore in this

chapter Our goal is to gain an intuition of

these properties and be able to perform basic

calculation


6. Schey. Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 21–30.
7. Kreyszig. Advanced Engineering Mathematics, § 10.6.


vecs.div Divergence, surface integrals, and flux

Flux and surface integrals

Consider a surface S. Let r(u, v) = [x(u, v), y(u, v), z(u, v)] be a parametric position vector on a Euclidean vector space R³. Furthermore, let F : R³ → R³ be a vector-valued function of r, and let n be a unit-normal vector on the surface S. The surface integral

$$\iint_S F \cdot n\, dS \tag{div1}$$

integrates the normal component of F over the surface. We call this quantity the flux of F out of the surface S. This terminology comes from fluid flow, for which the flux is the mass flow rate out of S. In general, the flux is a measure of a quantity (or field) passing through a surface. For more on computing surface integrals, see Schey6 and Kreyszig7.
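As a concrete sketch of this definition, the flux of an example field of our choosing, F(x, y, z) = [x, y, z], through the unit sphere can be computed with sympy. The parameterization and the use of r_u × r_v for n dS are standard, but this particular example is ours, not from the text.

```python
import sympy as sp

u, v = sp.symbols('u v')  # polar and azimuthal surface parameters
# parametric position vector on the unit sphere
r = sp.Matrix([sp.sin(u)*sp.cos(v), sp.sin(u)*sp.sin(v), sp.cos(u)])
# n dS = (r_u x r_v) du dv: the outward normal scaled by the area element
n_dS = r.diff(u).cross(r.diff(v))
F = r  # the field F(x, y, z) = [x, y, z], evaluated on the surface
flux = sp.integrate(F.dot(n_dS), (v, 0, 2*sp.pi), (u, 0, sp.pi))
print(sp.simplify(flux))  # 4*pi
```

The result, 4π, is the flux of F out of the sphere; it is positive because F points outward everywhere on the surface.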

Continuity

Consider the flux out of a surface S that encloses a volume ΔV, divided by that volume:

$$\frac{1}{\Delta V} \iint_S F \cdot n\, dS \tag{div2}$$

This gives a measure of flux per unit volume for a volume of space. Consider its physical meaning when we interpret this as fluid flow: all fluid that enters the volume is negative flux and all that leaves is positive. If physical conditions are such that we expect no fluid to enter or exit the volume via what is called a source or a sink, and if we assume the density of the fluid is uniform (this is called an incompressible fluid), then all the fluid that


enters the volume must exit, and we get

$$\frac{1}{\Delta V} \iint_S F \cdot n\, dS = 0 \tag{div3}$$

This is called a continuity equation, although typically this name is given to equations of the form in the next section. This equation is one of the governing equations in continuum mechanics.

Divergence

Let's take the flux-per-volume as the volume ΔV → 0; we obtain the following.

Equation div4: divergence, integral form

$$\lim_{\Delta V \to 0} \frac{1}{\Delta V} \iint_S F \cdot n\, dS$$

This is called the divergence of F and is defined at each point in R³ by taking the volume to zero about it. It is given the shorthand div F.

What interpretation can we give this quantity? It is a measure of the vector field's flux outward through a surface containing an infinitesimal volume. When we consider a fluid, a positive divergence is a local decrease in density and a negative divergence is a density increase. If the fluid is incompressible and has no sources or sinks, we can write the continuity equation

$$\operatorname{div} F = 0 \tag{div5}$$

In the Cartesian basis, it can be shown that the divergence is easily computed from the field

$$F = F_x \mathbf{i} + F_y \mathbf{j} + F_z \mathbf{k} \tag{div6}$$

as follows.

Equation div7: divergence, differential form

$$\operatorname{div} F = \partial_x F_x + \partial_y F_y + \partial_z F_z$$
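This differential form is easy to check symbolically. A minimal sympy sketch, for an example field of our own choosing:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
F = sp.Matrix([x**2, y**2, z**2])  # an example field (our choice)
# div F = dFx/dx + dFy/dy + dFz/dz
div_F = sum(F[i].diff(s) for i, s in enumerate((x, y, z)))
print(div_F)  # 2*x + 2*y + 2*z
```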

Exploring divergence

Divergence is perhaps best explored by considering it for a vector field in R². Such a field F = F_x i + F_y j can be represented as a "quiver" plot. If we overlay the quiver plot over a "color density" plot representing div F, we can increase our intuition about the divergence.

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: div_surface_integrals_flux.ipynb
notebook kernel: python3

First, load some Python packages.

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and functions.

var('x y')
F_x = Function('F_x')(x, y)
F_y = Function('F_y')(x, y)

Rather than repeat code, let's write a single function quiver_plotter_2D to make several of these plots.

def quiver_plotter_2D(
    field={F_x: x*y, F_y: x*y},
    grid_width=3,
    grid_decimate_x=8,
    grid_decimate_y=8,
    norm=Normalize(),
    density_operation='div',
    print_density=True,
):
    # define symbolics
    var('x y')
    F_x = Function('F_x')(x, y)
    F_y = Function('F_y')(x, y)
    field_sub = field
    # compute density
    if density_operation == 'div':
        den = F_x.diff(x) + F_y.diff(y)
    elif density_operation == 'curl':  # in the k direction
        den = F_y.diff(x) - F_x.diff(y)
    else:
        raise ValueError('div and curl are the only density operators')
    den_simp = den.subs(field_sub).doit().simplify()
    if den_simp.is_constant():
        print('Warning: density operator is constant (no density plot)')
    if print_density:
        print(f'The {density_operation} is:')
        display(den_simp)
    # lambdify for numerics
    F_x_sub = F_x.subs(field_sub)
    F_y_sub = F_y.subs(field_sub)
    F_x_fun = lambdify((x, y), F_x_sub, 'numpy')
    F_y_fun = lambdify((x, y), F_y_sub, 'numpy')
    if F_x_sub.is_constant():
        F_x_fun1 = F_x_fun  # dummy
        F_x_fun = lambda x, y: F_x_fun1(x, y)*np.ones(x.shape)
    if F_y_sub.is_constant():
        F_y_fun1 = F_y_fun  # dummy
        F_y_fun = lambda x, y: F_y_fun1(x, y)*np.ones(x.shape)
    if not den_simp.is_constant():
        den_fun = lambdify((x, y), den_simp, 'numpy')
    # create grid
    w = grid_width
    Y, X = np.mgrid[-w:w:100j, -w:w:100j]
    # evaluate numerically
    F_x_num = F_x_fun(X, Y)
    F_y_num = F_y_fun(X, Y)
    if not den_simp.is_constant():
        den_num = den_fun(X, Y)
    # plot
    p = plt.figure()
    # color mesh
    if not den_simp.is_constant():
        cmap = plt.get_cmap('PiYG')
        im = plt.pcolormesh(X, Y, den_num, cmap=cmap, norm=norm)
        plt.colorbar()
    # quiver
    dx = grid_decimate_y
    dy = grid_decimate_x
    plt.quiver(
        X[::dx, ::dy], Y[::dx, ::dy],
        F_x_num[::dx, ::dy], F_y_num[::dx, ::dy],
        units='xy', scale=10,
    )
    plt.title(f'F(x,y) = [{F_x.subs(field_sub)}, {F_y.subs(field_sub)}]')
    return p

Note that, while we're at it, we included a hook for density plots of the curl of F; we'll return to this in a later lecture.

Let's inspect several cases.

p = quiver_plotter_2D(field={F_x: x**2, F_y: y**2})

The div is

2x + 2y

p = quiver_plotter_2D(field={F_x: x*y, F_y: x*y})

The div is

Figure 05.1: quiver plot of F(x,y) = [x², y²] over a color density plot of div F

x + y

p = quiver_plotter_2D(field={F_x: x**2 + y**2, F_y: x**2 + y**2})

The div is

2x + 2y

p = quiver_plotter_2D(
    field={F_x: x**2/sqrt(x**2 + y**2), F_y: y**2/sqrt(x**2 + y**2)},
    norm=SymLogNorm(linthresh=3, linscale=3),
)

Figure 05.2: quiver plot of F(x,y) = [x·y, x·y] over a color density plot of div F
Figure 05.3: quiver plot of F(x,y) = [x² + y², x² + y²] over a color density plot of div F
Figure 05.4: quiver plot of F(x,y) = [x²/√(x² + y²), y²/√(x² + y²)] over a color density plot of div F

The div is

$$\frac{-x^3 - y^3 + 2(x + y)(x^2 + y^2)}{(x^2 + y^2)^{3/2}}$$


8. Schey. Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 63–74.
9. Kreyszig. Advanced Engineering Mathematics, §§ 10.1, 10.2.


vecs.curl Curl, line integrals, and circulation

Line integrals

Consider a curve C in a Euclidean vector space R³. Let r(t) = [x(t), y(t), z(t)] be a parametric representation of C. Furthermore, let F : R³ → R³ be a vector-valued function of r, and let r′(t) be the tangent vector. The line integral is

$$\int_C F(r(t)) \cdot r'(t)\, dt \tag{curl1}$$

which integrates F along the curve. For more on computing line integrals, see Schey8 and Kreyszig9.

If F is a force being applied to an object moving along the curve C, the line integral is the work done by the force. More generally, the line integral integrates F along the tangent of C.
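As a sketch of this work interpretation, sympy can evaluate the line integral for an example force field of our choosing, F = [−y, x, 0], along one turn of the unit circle.

```python
import sympy as sp

t = sp.symbols('t')
# curve C: one turn of the unit circle in the xy-plane
r = sp.Matrix([sp.cos(t), sp.sin(t), 0])
# force field F(x, y, z) = [-y, x, 0], evaluated on the curve
F = sp.Matrix([-r[1], r[0], 0])
# work = integral of F(r(t)) . r'(t) dt along C
work = sp.integrate(F.dot(r.diff(t)), (t, 0, 2*sp.pi))
print(sp.simplify(work))  # 2*pi
```

Here F is everywhere tangent to the circle, so F · r′ = 1 and the work is simply the curve's length parameter span, 2π.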

Circulation

Consider the line integral over a closed curve C, denoted by

$$\oint_C F(r(t)) \cdot r'(t)\, dt \tag{curl2}$$

We call this quantity the circulation of F around C.

For certain vector-valued functions F, the circulation is zero for every curve. In these cases (static electric fields, for instance), this is sometimes called the law of circulation.


Curl

Consider the division of the circulation around a curve in R³ by the surface area ΔS it encloses:

$$\frac{1}{\Delta S} \oint_C F(r(t)) \cdot r'(t)\, dt \tag{curl3}$$

In a manner analogous to the operation that gave us the divergence, let's consider shrinking this curve to a point and the surface area to zero:

$$\lim_{\Delta S \to 0} \frac{1}{\Delta S} \oint_C F(r(t)) \cdot r'(t)\, dt \tag{curl4}$$

We call this quantity the "scalar" curl of F, at each point in R³, in the direction normal to ΔS as it shrinks to zero. Taking three (or n for Rⁿ) "scalar" curls in independent normal directions (enough to span the vector space), we obtain the curl proper, which is a vector-valued function curl : R³ → R³.

The curl is coordinate-independent. In cartesian coordinates, it can be shown to be equivalent to the following.

Equation curl5: curl, differential form, cartesian coordinates

$$\operatorname{curl} F = \begin{bmatrix} \partial_y F_z - \partial_z F_y & \partial_z F_x - \partial_x F_z & \partial_x F_y - \partial_y F_x \end{bmatrix}^\top$$
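This differential form can be checked symbolically. A minimal sympy sketch, applied to an example rigid-rotation field of our own choosing:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
Fx, Fy, Fz = -y, x, sp.Integer(0)  # a rigid rotation about the z-axis
# curl F per the cartesian differential form
curl_F = sp.Matrix([
    Fz.diff(y) - Fy.diff(z),
    Fx.diff(z) - Fz.diff(x),
    Fy.diff(x) - Fx.diff(y),
])
print(list(curl_F))  # [0, 0, 2]
```

The constant curl [0, 0, 2] reflects that every point of this field rotates about the z-axis at the same rate.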

But what does the curl of F represent? It quantifies the local rotation of F about each point. If F represents a fluid's velocity, curl F is the local rotation of the fluid about each point, and it is called the vorticity.

10. A region is simply connected if every curve in it can shrink to a point without leaving the region. An example of a region that is not simply connected is the surface of a toroid.

Zero curl, circulation, and path independence

Circulation

It can be shown that if the circulation of F on all curves is zero, then in each direction n and at every point, curl F = 0 (i.e., n · curl F = 0). Conversely, for curl F = 0 in a simply connected region,10 F has zero circulation.

Succinctly, informally, and without the requisite qualifiers above:

zero circulation ⇒ zero curl (curl6)
zero curl + simply connected region ⇒ zero circulation (curl7)

Path independence

It can be shown that if the path integral of F on all curves between any two points is path-independent, then in each direction n and at every point, curl F = 0 (i.e., n · curl F = 0). Conversely, for curl F = 0 in a simply connected region, all line integrals are independent of path.

Succinctly, informally, and without the requisite qualifiers above:

path independence ⇒ zero curl (curl8)
zero curl + simply connected region ⇒ path independence (curl9)

Figure 05.5: an implication graph relating zero curl, zero circulation, path independence, and connectedness. Green edges represent implication (a implies b) and black edges represent logical conjunctions.

… and how they relate

It is also true that

path independence ⇔ zero circulation (curl10)

So, putting it all together, we get Figure 05.5.

Exploring curl

Curl is perhaps best explored by considering it for a vector field in R². Such a field in cartesian coordinates, F = F_x i + F_y j, has curl

$$\operatorname{curl} F = \begin{bmatrix} \partial_y 0 - \partial_z F_y & \partial_z F_x - \partial_x 0 & \partial_x F_y - \partial_y F_x \end{bmatrix}^\top = \begin{bmatrix} 0 - 0 & 0 - 0 & \partial_x F_y - \partial_y F_x \end{bmatrix}^\top = \begin{bmatrix} 0 & 0 & \partial_x F_y - \partial_y F_x \end{bmatrix}^\top \tag{curl11}$$

That is, curl F = (∂_x F_y − ∂_y F_x) k, and the only nonzero component is normal to the xy-plane. If we overlay a quiver plot of F over a "color density" plot representing the k-component of curl F, we can increase our intuition about the curl.

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: curl-and-line-integrals.ipynb
notebook kernel: python3

First, load some Python packages.

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and functions.

var('x y')
F_x = Function('F_x')(x, y)
F_y = Function('F_y')(x, y)

We use the same function defined in Lecture vecs.div, quiver_plotter_2D, to make several of these plots.

Let's inspect several cases.

p = quiver_plotter_2D(
    field={F_x: 0, F_y: cos(2*pi*x)},
    density_operation='curl',
    grid_decimate_x=2,
    grid_decimate_y=10,
    grid_width=1,
)

Figure 05.6: quiver plot of F(x,y) = [0, cos(2πx)] over a color density plot of the k-component of curl F

The curl is

−2π sin(2πx)

p = quiver_plotter_2D(
    field={F_x: 0, F_y: x**2},
    density_operation='curl',
    grid_decimate_x=2,
    grid_decimate_y=20,
)

Figure 05.7: quiver plot of F(x,y) = [0, x²] over a color density plot of the k-component of curl F

The curl is

2x

p = quiver_plotter_2D(
    field={F_x: y**2, F_y: x**2},
    density_operation='curl',
    grid_decimate_x=2,
    grid_decimate_y=20,
)

Figure 05.8: quiver plot of F(x,y) = [y², x²] over a color density plot of the k-component of curl F

The curl is

2x − 2y

p = quiver_plotter_2D(
    field={F_x: -y, F_y: x},
    density_operation='curl',
    grid_decimate_x=6,
    grid_decimate_y=6,
)

Warning: density operator is constant (no density plot)

The curl is

2

Figure 05.9: quiver plot of F(x,y) = [−y, x]


vecs.grad Gradient

Gradient

The gradient grad of a scalar-valued function f : R³ → R is a vector field F : R³ → R³; that is, grad f is a vector-valued function on R³. The gradient's local direction and magnitude are those of the local maximum rate of increase of f. This makes it useful in optimization (e.g., in the method of gradient descent).

In classical mechanics, quantum mechanics, relativity, string theory, thermodynamics, and continuum mechanics (and elsewhere), the principle of least action is taken as fundamental (Richard P. Feynman, Robert B. Leighton, and Matthew Sands. The Feynman Lectures on Physics. New Millennium. Perseus Basic Books, 2010). This principle tells us that nature's laws quite frequently seem to be derivable by assuming a certain quantity, called action, is minimized. Considering, then, that the gradient supplies us with a tool for optimizing functions, it is unsurprising that the gradient enters into the equations of motion of many physical quantities.

The gradient is coordinate-independent, but its coordinate-free definitions don't add much to our intuition. In cartesian coordinates, it can be shown to be equivalent to the following.

11. This is nonstandard terminology, but we're bold.

Equation grad1: gradient, cartesian coordinates

$$\operatorname{grad} f = \begin{bmatrix} \partial_x f & \partial_y f & \partial_z f \end{bmatrix}^\top$$
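A minimal sympy sketch of this cartesian form, for an example scalar field of our own choosing:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = x**2 + y**2 + z**2  # an example scalar field (our choice)
# grad f = [df/dx, df/dy, df/dz]^T
grad_f = sp.Matrix([f.diff(s) for s in (x, y, z)])
print(list(grad_f))  # [2*x, 2*y, 2*z]
```

At each point, this gradient points radially away from the origin, the direction in which f increases fastest.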

Vector fields from gradients are special

Although all gradients are vector fields, not all vector fields are gradients. That is, given a vector field F, it may or may not be equal to the gradient of any scalar-valued function f. Let's say of a vector field that is a gradient that it has gradience.11 Those vector fields that are gradients have special properties. Surprisingly, those properties are connected to path independence and curl. It can be shown that iff a field is a gradient, line integrals of the field are path independent. That is, for a vector field,

gradience ⇔ path independence (grad2)

Considering what we know from Lecture vecs.curl about path independence, we can expand Figure 05.5 to obtain Figure 05.10. One implication is that gradients have zero curl. Many important fields that describe physical interactions (e.g., static electric fields, Newtonian gravitational fields) are gradients of scalar fields called potentials.

Figure 05.10: an implication graph relating gradience, zero curl, zero circulation, path independence, and connectedness. Green edges represent implication (a implies b) and black edges represent logical conjunctions.

Exploring gradient

Gradient is perhaps best explored by considering it for a scalar field on R². Such a field in cartesian coordinates, f(x, y), has gradient

$$\operatorname{grad} f = \begin{bmatrix} \partial_x f & \partial_y f \end{bmatrix}^\top \tag{grad3}$$

That is, grad f = F = ∂_x f i + ∂_y f j. If we overlay a quiver plot of F over a "color density" plot representing f, we can increase our intuition about the gradient.

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: grad.ipynb
notebook kernel: python3

First, load some Python packages.

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and functions.

var('x y')

(x, y)

Rather than repeat code, let's write a single function grad_plotter_2D to make several of these plots.

Let's inspect several cases. While considering the following plots, remember that they all have zero curl.

p = grad_plotter_2D(field=x)

Figure 05.11: quiver plot of grad f over a color density plot of f(x,y) = x

The gradient is

[1 0]ᵀ

p = grad_plotter_2D(field=x+y)

The gradient is

[1 1]ᵀ

Figure 05.12: quiver plot of grad f over a color density plot of f(x,y) = x + y

p = grad_plotter_2D(field=1)

Warning: field is constant (no plot)
The gradient is

[0 0]ᵀ

Gravitational potential

Gravitational potentials have the form 1/distance. Let's check out the gradient.

Figure 05.13: quiver plot of grad f over a color density plot of f(x,y) = 1/√(x² + y²)

p = grad_plotter_2D(
    field=1/sqrt(x**2+y**2),
    norm=SymLogNorm(linthresh=3, linscale=3),
    mask=True,
)

The gradient is

$$\begin{bmatrix} -\dfrac{x}{(x^2+y^2)^{3/2}} & -\dfrac{y}{(x^2+y^2)^{3/2}} \end{bmatrix}$$

Conic section fields

Gradients of conic section fields can be explored.

Figure 05.14: quiver plot of grad f over a color density plot of f(x,y) = x²

The following is called a parabolic field.

p = grad_plotter_2D(field=x**2)

The gradient is

[2x 0]ᵀ

The following are called elliptic fields.

Figure 05.15: quiver plot of grad f over a color density plot of f(x,y) = x² + y²

p = grad_plotter_2D(field=x**2+y**2)

The gradient is

[2x 2y]ᵀ

p = grad_plotter_2D(field=-x**2-y**2)

Figure 05.16: quiver plot of grad f over a color density plot of f(x,y) = −x² − y²

The gradient is

[−2x −2y]ᵀ

The following is called a hyperbolic field.

p = grad_plotter_2D(field=x**2-y**2)

The gradient is

Figure 05.17: quiver plot of grad f over a color density plot of f(x,y) = x² − y²

[2x −2y]ᵀ

12. A surface is orientable if a consistent normal direction can be defined. Most surfaces are orientable, but some, notably the Möbius strip, cannot be. See Kreyszig (Advanced Engineering Mathematics, § 10.6) for more.
13. Ibid., § 10.7.
14. Schey. Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 45–52.
15. Ibid., pp. 49–52.

vecs.stoked Stokes and divergence theorems

Two theorems allow us to exchange certain integrals in R³ for others that are easier to evaluate.

The divergence theorem

The divergence theorem asserts the equality of the surface integral of a vector field F and the triple integral of div F over the volume V enclosed by the surface S in R³. That is,

$$\iint_S F \cdot n\, dS = \iiint_V \operatorname{div} F\, dV \tag{stoked1}$$

Caveats are that V is a closed region bounded by the orientable12 surface S and that F is continuous and continuously differentiable over a region containing V. This theorem makes some intuitive sense: we can think of the divergence inside the volume "accumulating" via the triple integration and equaling the corresponding surface integral. For more on the divergence theorem, see Kreyszig13 and Schey14.
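Both sides of the theorem can be computed with sympy as a check. The example field F = [x, y, z] and the unit sphere are our own choices, not from the text; div F = 3 everywhere, so the volume side is simply three times the ball's volume.

```python
import sympy as sp

rho, u, v = sp.symbols('rho u v', nonnegative=True)
# volume side: div F = 3 for F = [x, y, z];
# spherical volume element is rho^2 sin(u) drho du dv
vol_side = sp.integrate(3*rho**2*sp.sin(u),
                        (rho, 0, 1), (u, 0, sp.pi), (v, 0, 2*sp.pi))
# surface side: flux of F through the unit sphere, n dS = r_u x r_v du dv
r = sp.Matrix([sp.sin(u)*sp.cos(v), sp.sin(u)*sp.sin(v), sp.cos(u)])
surf_side = sp.integrate(r.dot(r.diff(u).cross(r.diff(v))),
                         (v, 0, 2*sp.pi), (u, 0, sp.pi))
print(sp.simplify(vol_side), sp.simplify(surf_side))  # both 4*pi
```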

A lovely application of the divergence theorem is that for any quantity of conserved stuff (mass, charge, spin, etc.) distributed in a spatial R³ with time-dependent density ρ : R⁴ → R and velocity field v : R⁴ → R³, the divergence theorem can be applied to find that

$$\partial_t \rho = -\operatorname{div}(\rho v) \tag{stoked2}$$

which is a more general form of a continuity equation, one of the governing equations of many physical phenomena. For a derivation of this equation, see Schey15.

16. A surface is smooth if its normal is continuous everywhere. It is piecewise smooth if it is composed of a finite number of smooth surfaces.
17. Kreyszig. Advanced Engineering Mathematics, § 10.9.
18. Schey. Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 93–102.
19. Kreyszig. Advanced Engineering Mathematics, § 10.9.
20. Lee. Introduction to Smooth Manifolds, Ch. 16.

The Kelvin-Stokes theorem

The Kelvin-Stokes theorem asserts the equality of the circulation of a vector field F over a closed curve C and the surface integral of curl F over a surface S that has boundary C. That is, for r(t) a parameterization of C and surface normal n,

$$\oint_C F(r(t)) \cdot r'(t)\, dt = \iint_S n \cdot \operatorname{curl} F\, dS \tag{stoked3}$$

Caveats are that S is piecewise smooth,16 its boundary C is a piecewise smooth simple closed curve, and F is continuous and continuously differentiable over a region containing S. This theorem is also somewhat intuitive: we can think of the curl over the surface "accumulating" via the surface integration and equaling the corresponding circulation. For more on the Kelvin-Stokes theorem, see Kreyszig17 and Schey18.
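This equality can also be checked with sympy. The example, again our own choice, is F = [−y, x, 0] with C the unit circle and S the unit disk: curl F = [0, 0, 2], so the surface side integrates the constant 2 over the disk.

```python
import sympy as sp

t, s, th = sp.symbols('t s theta')
# left side: circulation of F = [-y, x, 0] around the unit circle C
r = sp.Matrix([sp.cos(t), sp.sin(t), 0])
F_on_C = sp.Matrix([-r[1], r[0], 0])
circulation = sp.integrate(F_on_C.dot(r.diff(t)), (t, 0, 2*sp.pi))
# right side: curl F = [0, 0, 2] and n = k on the unit disk, so the
# integrand is 2; in polar coordinates the area element is s ds dtheta
surface = sp.integrate(2*s, (s, 0, 1), (th, 0, 2*sp.pi))
print(sp.simplify(circulation), sp.simplify(surface))  # both 2*pi
```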

Related theorems

Green's theorem is a two-dimensional special case of the Kelvin-Stokes theorem. It is described by Kreyszig19.

It turns out that all of the above theorems (and the fundamental theorem of calculus, which relates the derivative and integral) are special cases of the generalized Stokes theorem, defined by differential geometry. We would need a deeper understanding of differential geometry to understand this theorem. For more, see Lee20.

vecs.ex Exercises for Chapter 05

Exercise 05.1 (6)

The PDE of Example pde.separation-1 can be used to describe the conduction of heat along a long, thin rod insulated along its length, where u(t, x) represents temperature. The initial and dirichlet boundary conditions in that example would be interpreted as an initial temperature distribution along the bar and fixed temperatures at the ends. Now consider the same PDE,

$$\partial_t u(t, x) = k\, \partial^2_{xx} u(t, x) \tag{ex1}$$

with real constant k, now with neumann boundary conditions on interval x ∈ [0, L],

$$\partial_x u|_{x=0} = 0 \quad\text{and}\quad \partial_x u|_{x=L} = 0 \tag{ex2}$$

and with initial condition

$$u(0, x) = f(x) \tag{ex3}$$

where f is some piecewise continuous function on [0, L]. This represents the complete insulation of the ends of the rod, such that no heat flows from the ends (or from anywhere else).

a. Assume a product solution, separate variables into X(x) and T(t), and set the separation constant to −λ.
b. Solve the boundary value problem for its eigenfunctions Xn and eigenvalues λn.
c. Solve for the general solution of the time-variable ODE.
d. Write the product solution and apply the initial condition f(x) by constructing it from a generalized fourier series of the product solution.
e. Let L = 1 and f(x) = 100 − (200/L)|x − L/2|, as shown in Figure 05.18. Compute the solution series components. Plot the sum of the first 50 terms over x and t.

Figure 05.18: initial condition for Exercise 05.1

four Fourier and orthogonality

1. It's important to note that the symbol ωn in this context is not the natural frequency but a frequency indexed by integer n.

four.series Fourier series

Fourier series are mathematical series that can represent a periodic signal as a sum of sinusoids at different amplitudes and frequencies. They are useful for solving for the response of a system to periodic inputs. However, they are probably most important conceptually: they are our gateway to thinking of signals in the frequency domain, that is, as functions of frequency (not time). To represent a function as a Fourier series is to analyze it as a sum of sinusoids at different frequencies1 ωn and amplitudes an. Its frequency spectrum is the functional representation of amplitudes an versus frequency ωn.

Let's begin with the definition.

Definition four.series.1: Fourier series, trigonometric form

The Fourier analysis of a periodic function y(t) is, for n ∈ N₀, period T, and angular frequency ωn = 2πn/T,

$$a_n = \frac{2}{T} \int_{-T/2}^{T/2} y(t) \cos(\omega_n t)\, dt \tag{series1}$$

$$b_n = \frac{2}{T} \int_{-T/2}^{T/2} y(t) \sin(\omega_n t)\, dt \tag{series2}$$

The Fourier synthesis of a periodic function y(t) with analysis components an and bn corresponding to ωn is

$$y(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty} a_n \cos(\omega_n t) + b_n \sin(\omega_n t) \tag{series3}$$
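As a sketch of this analysis, sympy can compute the components of a unit-amplitude square wave: y = +1 on (0, T/2) and −1 on (−T/2, 0), with T = 2π (our example, not from the text). Symmetry does most of the work: y(t)cos(ωn t) is odd, so an = 0, and y(t)sin(ωn t) is even, so bn is twice the half-period integral.

```python
import sympy as sp

t = sp.symbols('t')
n = sp.symbols('n', integer=True, positive=True)
T = 2*sp.pi  # period, so omega_n = 2*pi*n/T = n
# a_n = 0 by odd symmetry of y(t)cos(n t);
# b_n is twice the integral of sin(n t) over the half period (0, T/2)
b_n = (2/T)*2*sp.integrate(sp.sin(n*t), (t, 0, sp.pi))
print(sp.simplify(b_n))  # equals 4/(pi*n) for odd n, 0 for even n
```

The result recovers the familiar square-wave spectrum: only odd harmonics, with amplitude falling off as 1/n.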

Let's consider the complex form of the Fourier series, which is analogous to Definition four.series.1. It may be helpful to review Euler's formula(s); see Appendix D.01.

Definition four.series.2: Fourier series, complex form

The Fourier analysis of a periodic function y(t) is, for n ∈ N₀, period T, and angular frequency ωn = 2πn/T,

$$c_{\pm n} = \frac{1}{T} \int_{-T/2}^{T/2} y(t) e^{-j\omega_n t}\, dt \tag{series4}$$

The Fourier synthesis of a periodic function y(t) with analysis components cn corresponding to ωn is

$$y(t) = \sum_{n=-\infty}^{\infty} c_n e^{j\omega_n t} \tag{series5}$$

We call the integer n a harmonic, and the frequency associated with it,

$$\omega_n = 2\pi n / T \tag{series6}$$

the harmonic frequency. There is a special name for the first harmonic (n = 1): the fundamental frequency. It is called this because all other frequency components are integer multiples of it.

It is also possible to convert between the two representations above.

Definition four.series.3: Fourier series, converting between forms

The complex Fourier analysis of a periodic function y(t) is, for n ∈ N₀ and an and bn as defined above,

$$c_{\pm n} = \frac{1}{2}\left(a_{|n|} \mp j b_{|n|}\right) \tag{series7}$$

The sinusoidal Fourier analysis of a periodic function y(t) is, for n ∈ N₀ and cn as defined above,

$$a_n = c_n + c_{-n} \quad\text{and} \tag{series8}$$

$$b_n = j\,(c_n - c_{-n}) \tag{series9}$$

The harmonic amplitude Cn is

$$C_n = \sqrt{a_n^2 + b_n^2} \tag{series10}$$

$$= 2\sqrt{c_n c_{-n}} \tag{series11}$$

A magnitude line spectrum is a graph of the harmonic amplitudes as a function of the harmonic frequencies. The harmonic phase is

$$\theta_n = -\operatorname{arctan2}(b_n, a_n) \quad\text{(see Appendix C.02)}$$

$$= \operatorname{arctan2}(\operatorname{Im}(c_n), \operatorname{Re}(c_n))$$
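These conversion formulas are easy to check numerically. The component values an = 3, bn = 4 below are an arbitrary choice of ours for illustration.

```python
import math

a_n, b_n = 3.0, 4.0  # arbitrary example components
c_n = (a_n - 1j*b_n)/2   # series7 with the upper signs (c_{+n})
c_mn = (a_n + 1j*b_n)/2  # c_{-n}
C_n = math.sqrt(a_n**2 + b_n**2)  # series10
C_n_alt = 2*abs(c_n*c_mn)**0.5    # series11
theta = -math.atan2(b_n, a_n)               # harmonic phase from a_n, b_n
theta_alt = math.atan2(c_n.imag, c_n.real)  # same phase from c_n
print(C_n, C_n_alt)  # 5.0 5.0
```

Both routes to the harmonic amplitude agree (√(3² + 4²) = 5), as do the two phase expressions.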

The following illustration demonstrates how sinusoidal components sum to represent a square wave. A line spectrum is also shown.

[Illustration: sinusoidal components shown in the time domain (amplitude) and frequency domain (spectral amplitude).]

Let us compute the associated spectral components in the following example.

Example four.series-1: Fourier series analysis, line spectrum

Problem statement: Compute the first five harmonic amplitudes that represent the line spectrum for the square wave in the figure above.

four.trans Fourier transform

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: fourier_series_to_transform.ipynb
notebook kernel: python3

We begin with the usual loading of modules.

import numpy as np  # for numerics
import sympy as sp  # for symbolics
import matplotlib.pyplot as plt  # for plots
from IPython.display import display, Markdown, Latex

Let's consider a periodic function f with period T (T). Each period, the function has a triangular pulse of width δ (pulse_width) and height δ/2.

period = 15  # period
pulse_width = 2  # pulse width

First, we plot the function f in the time domain. Let's begin by defining f.

def pulse_train(t, T, pulse_width):
    f = lambda x: pulse_width/2 - abs(x)  # pulse
    tm = np.mod(t, T)
    if tm <= pulse_width/2:
        return f(tm)
    elif tm >= T - pulse_width/2:
        return f(-(tm - T))
    else:
        return 0

Now we develop a numerical array in time to plot f.

N = 201  # number of points to plot
tpp = np.linspace(-period/2, 5*period/2, N)  # time values
fpp = np.array(np.zeros(tpp.shape))
for i, t_now in enumerate(tpp):
    fpp[i] = pulse_train(t_now, period, pulse_width)

axes = plt.figure(1)
plt.plot(tpp, fpp, 'b-', linewidth=2)  # plot
plt.xlabel('time (s)')
plt.xlim([-period/2, 3*period/2])
plt.xticks(
    [0, pulse_width/2, period],
    ['0', r'$\delta/2$', '$T=' + str(period) + '$ s'],
)
plt.yticks([0, pulse_width/2], ['0', r'$\delta/2$'])
plt.show()  # display here

For δ = 2 and T ∈ [5, 15, 25], the left-hand column of Figure 06.1 shows two triangle pulses for each period T.

Consider the following argument. Just as a Fourier series is a frequency domain representation of a periodic signal, a Fourier transform is a frequency domain representation of an aperiodic signal (we will rigorously define it in a moment). The Fourier series components will have an analog, then, in the Fourier transform. Recall that they can be computed by integrating over a period of the signal. If we increase that period infinitely, the function is effectively aperiodic. The result (within a scaling factor) will be the Fourier transform analog of the Fourier series components.

Let us approach this understanding by actually computing the Fourier series components for increasing period T. We'll use sympy to compute the Fourier series cosine and sine components an and bn for component n (n) and period T (T).

sp.var('x a_0 a_n b_n', real=True)
sp.var('delta T', positive=True)
sp.var('n', nonnegative=True)
a0 = 2/T*sp.integrate(
    (delta/2 - sp.Abs(x)), (x, -delta/2, delta/2)  # otherwise zero
).simplify()
an = sp.integrate(
    2/T*(delta/2 - sp.Abs(x))*sp.cos(2*sp.pi*n/T*x),
    (x, -delta/2, delta/2)  # otherwise zero
).simplify()
bn = 2/T*sp.integrate(
    (delta/2 - sp.Abs(x))*sp.sin(2*sp.pi*n/T*x),
    (x, -delta/2, delta/2)  # otherwise zero
).simplify()
display(sp.Eq(a_n, an), sp.Eq(b_n, bn))

$$a_n = \begin{cases} \dfrac{T\left(1 - \cos\left(\frac{\pi\delta n}{T}\right)\right)}{\pi^2 n^2} & \text{for } n \neq 0 \\[1ex] \dfrac{\delta^2}{2T} & \text{otherwise} \end{cases} \qquad b_n = 0$$

Furthermore, let us compute the harmonic amplitude (f_harmonic_amplitude)

$$C_n = \sqrt{a_n^2 + b_n^2} \tag{trans1}$$

which we have also scaled by a factor T/δ in order to plot it with a convenient scale.

sp.var('C_n', positive=True)
cn = sp.sqrt(an**2 + bn**2)
display(sp.Eq(C_n, cn))

Cn =

T∣∣∣cos(πδn

T

)ndash1∣∣∣

π2n2for n 6= 0

δ2

2Totherwise

Now we lambdify the symbolic expression for a

numpy function

cn_f = sp.lambdify((n, T, delta), cn)

Now we can plot



Figure 061: triangle pulse trains (left column) with longer periods descending, and their corresponding line spectra (right column), scaled for convenient comparison.

omega_max = 12  # rad/s, max frequency in line spectrum
n_max = round(omega_max*period/(2*np.pi))  # max harmonic
n_a = np.linspace(0, n_max, n_max+1)
omega = 2*np.pi*n_a/period
axes = plt.figure(2)
markerline, stemlines, baseline = plt.stem(
    omega, period/pulse_width*cn_f(n_a, period, pulse_width),
    linefmt='b-', markerfmt='bo', basefmt='r-',
    use_line_collection=True
)
plt.xlabel('frequency $\\omega$ (rad/s)')
plt.xlim([0, omega_max])
plt.ylim([0, pulse_width/2])
plt.yticks([0, pulse_width/2], ['0', '$\\delta/2$'])
plt.show()  # show here

The line spectra are shown in the right-hand column of Figure 061. Note that, with our chosen scaling, as T increases the line spectra reveal a distinct waveform.



Figure 062: F(ω), our mysterious Fourier series amplitude analog.

Let F be the continuous function of angular frequency ω

$$ F(\omega) = \frac{\delta}{2} \cdot \frac{\sin^2(\omega\delta/4)}{(\omega\delta/4)^2}. \qquad \text{(trans2)} $$

First we plot it

F = lambda w: pulse_width/2 * np.sin(w*pulse_width/(2*2))**2 / (w*pulse_width/(2*2))**2

N = 201  # number of points to plot
wpp = np.linspace(0.0001, omega_max, N)
Fpp = []
for i in range(0, N):
    Fpp.append(F(wpp[i]))  # build array of function values
axes = plt.figure(3)
plt.plot(wpp, Fpp, 'b-', linewidth=2)  # plot
plt.xlim([0, omega_max])
plt.yticks([0, pulse_width/2], ['0', '$\\delta/2$'])
plt.xlabel('frequency $\\omega$ (rad/s)')
plt.ylabel('$F(\\omega)$')
plt.show()

Let's consider the plot in Figure 062 of F. It's obviously the function emerging in Figure 061 from increasing the period of our pulse train.


Now we are ready to define the Fourier

transform and its inverse

Definition fourtrans1: Fourier transforms, trigonometric form

Fourier transform (analysis):

$$ A(\omega) = \int_{-\infty}^{\infty} y(t)\cos(\omega t)\,dt \qquad \text{(trans3)} $$

$$ B(\omega) = \int_{-\infty}^{\infty} y(t)\sin(\omega t)\,dt \qquad \text{(trans4)} $$

Inverse Fourier transform (synthesis):

$$ y(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} A(\omega)\cos(\omega t)\,d\omega + \frac{1}{2\pi}\int_{-\infty}^{\infty} B(\omega)\sin(\omega t)\,d\omega \qquad \text{(trans5)} $$

Definition fourtrans2: Fourier transforms, complex form

Fourier transform F (analysis):

$$ \mathcal{F}(y(t)) = Y(\omega) = \int_{-\infty}^{\infty} y(t)e^{-j\omega t}\,dt \qquad \text{(trans6)} $$

Inverse Fourier transform F⁻¹ (synthesis):

$$ \mathcal{F}^{-1}(Y(\omega)) = y(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} Y(\omega)e^{j\omega t}\,d\omega \qquad \text{(trans7)} $$

So now we have defined the Fourier transform

There are many applications including solving

differential equations and frequency domain

representations—called spectra—of time domain

functions

There is a striking similarity between the


Fourier transform and the Laplace transform

with which you are already acquainted

In fact the Fourier transform is a special

case of a Laplace transform with Laplace

transform variable s = jω instead of having

some real component Both transforms convert

differential equations to algebraic equations

which can be solved and inversely transformed

to find time-domain solutions The Laplace

transform is especially important to use

when an input function to a differential

equation is not absolutely integrable and

the Fourier transform is undefined (for

example our definition will yield a transform

for neither the unit step nor the unit ramp

functions) However the Laplace transform is

also preferred for initial value problems due

to its convenient way of handling them The

two transforms are equally useful for solving

steady state problems Although the Laplace

transform has many advantages for spectral

considerations the Fourier transform is the

only game in town
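This special-case relationship can be sketched symbolically. The following is an addition to the text (a sketch, assuming sympy is available) that computes the Laplace transform of the signal y(t) = u_s(t)e^{−at} treated in the following example and substitutes s = jω:

```python
import sympy as sp

# A sketch (an addition to the text): the Fourier transform as a Laplace
# transform evaluated at s = j*omega, for the signal y = u_s(t)*e^(-a*t).
t, s = sp.symbols('t, s', positive=True)
a = sp.symbols('a', positive=True)
omega = sp.symbols('omega', real=True)

Y_laplace = sp.laplace_transform(sp.exp(-a*t), t, s, noconds=True)
Y_fourier = Y_laplace.subs(s, sp.I*omega)  # substitute s = j*omega

print(Y_laplace)  # 1/(a + s)
print(sp.simplify(Y_fourier - 1/(a + sp.I*omega)))  # 0
```

The substitution reproduces the Fourier transform Y(ω) = 1/(a + jω) derived in the example below.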

A table of Fourier transforms and their

properties can be found in Appendix B02

Example fourtrans-1 re a Fourier transform

Problem Statement

Consider the aperiodic signal y(t) = u_s(t)e^{−at}, with u_s the unit step function and a > 0. The signal is plotted below. Derive the complex frequency spectrum and plot its magnitude and phase.

(plot of y(t) versus t for t ∈ [−2, 4])


Problem Solution
The signal is aperiodic, so the Fourier transform can be computed from Equation trans6:

$$
\begin{aligned}
Y(\omega) &= \int_{-\infty}^{\infty} y(t)e^{-j\omega t}\,dt \\
&= \int_{-\infty}^{\infty} u_s(t)e^{-at}e^{-j\omega t}\,dt && \text{(def. of } y) \\
&= \int_{0}^{\infty} e^{-at}e^{-j\omega t}\,dt && (u_s \text{ effect}) \\
&= \int_{0}^{\infty} e^{(-a-j\omega)t}\,dt && \text{(multiply)} \\
&= \frac{1}{-a-j\omega}\,e^{(-a-j\omega)t}\Big|_{0}^{\infty} && \text{(antiderivative)} \\
&= \frac{1}{-a-j\omega}\left(\lim_{t\to\infty} e^{(-a-j\omega)t} - e^{0}\right) && \text{(evaluate)} \\
&= \frac{1}{-a-j\omega}\left(\lim_{t\to\infty} e^{-at}e^{-j\omega t} - 1\right) && \text{(arrange)} \\
&= \frac{1}{-a-j\omega}\left((0)(\text{complex with mag} \le 1) - 1\right) && \text{(limit)} \\
&= \frac{-1}{-a-j\omega} && \text{(consequence)} \\
&= \frac{1}{a+j\omega} \\
&= \frac{a-j\omega}{a-j\omega}\cdot\frac{1}{a+j\omega} && \text{(rationalize)} \\
&= \frac{a-j\omega}{a^2+\omega^2}.
\end{aligned}
$$

The magnitude and phase of this complex function are straightforward to compute:

$$
\begin{aligned}
|Y(\omega)| &= \sqrt{\operatorname{Re}(Y(\omega))^2 + \operatorname{Im}(Y(\omega))^2} \\
&= \frac{1}{a^2+\omega^2}\sqrt{a^2+\omega^2} \\
&= \frac{1}{\sqrt{a^2+\omega^2}}, \\
\angle Y(\omega) &= -\arctan(\omega/a).
\end{aligned}
$$


Now we can plot these functions of ω Setting

a = 1 (arbitrarily) we obtain the following

plots

(plots of |Y(ω)| and ∠Y(ω) versus ω for ω ∈ [−10, 10])
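The derived spectrum can also be checked numerically. The following is an addition (a sketch, not part of the original example; the sample values a = 1, ω = 2 are arbitrary) that integrates the analysis integral with scipy and compares against Y(ω) = 1/(a + jω):

```python
import numpy as np
from scipy.integrate import quad

# A numerical check (an addition): integrate y(t)*e^(-j*omega*t)
# = e^(-a*t)*(cos(omega*t) - j*sin(omega*t)) over [0, inf) and compare
# with the analytic result Y(omega) = 1/(a + j*omega).
a, omega = 1.0, 2.0
re_part, _ = quad(lambda t: np.exp(-a*t)*np.cos(omega*t), 0, np.inf)
im_part, _ = quad(lambda t: -np.exp(-a*t)*np.sin(omega*t), 0, np.inf)
Y_numeric = re_part + 1j*im_part
Y_analytic = 1/(a + 1j*omega)
print(abs(Y_numeric - Y_analytic))  # near zero
```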


2. A function f is square-integrable if ∫_{−∞}^{∞} |f(x)|² dx < ∞.

3. This definition of the inner product can be extended to functions on R² and R³ domains using double- and triple-integration. See Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, p. 261.

fourgeneral Generalized

fourier series and

orthogonality

Let f : R → C, g : R → C, and w : R → C be complex functions. For square-integrable² f, g, and w, the inner product of f and g with weight function w over the interval [a, b] ⊆ R is³

$$ \langle f, g \rangle_w = \int_a^b f(x)\overline{g(x)}\,w(x)\,dx \qquad \text{(general1)} $$

where $\overline{g}$ denotes the complex conjugate of g. The inner product of functions can be considered analogous to the inner (or dot) product of vectors.

The fourier series components can be found by a

special property of the sin and cos functions

called orthogonality In general functions

f and g from above are orthogonal over the

interval [a b] iff

$$ \langle f, g \rangle_w = 0 \qquad \text{(general2)} $$

for weight function w Similar to how a set of

orthogonal vectors can be a basis for a vector

space a set of orthogonal functions can be

a basis for a function space a vector space

of functions from one set to another (with

certain caveats)
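A minimal symbolic sketch (an addition to the text; the interval [−π, π] and unit weight are chosen for illustration) of the inner product (general1) and orthogonality:

```python
import sympy as sp

# A sketch (an addition to the text): the inner product (general1) with unit
# weight, showing cos and sin are orthogonal on [-pi, pi].
x = sp.symbols('x', real=True)

def inner(f, g, a, b, w=1):
    # <f, g>_w = integral of f * conj(g) * w over [a, b]
    return sp.integrate(f*sp.conjugate(g)*w, (x, a, b))

print(inner(sp.cos(x), sp.sin(x), -sp.pi, sp.pi))  # 0: orthogonal
print(inner(sp.cos(x), sp.cos(x), -sp.pi, sp.pi))  # pi: squared norm, nonzero
```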

In addition to some sets of sinusoids, there are several other important sets of functions that are orthogonal. For instance, sets of legendre polynomials (Erwin Kreyszig, Advanced Engineering Mathematics, 10th, John Wiley & Sons Limited, 2011, ISBN 9781119571094. The authoritative resource for engineering mathematics; it includes detailed accounts of probability, statistics, vector calculus, linear algebra, fourier analysis, ordinary and partial differential equations, and complex analysis, along with several other topics with varying degrees of depth. Overall it is the best place to start when seeking mathematical guidance. § 5.2) and bessel functions (Kreyszig, Advanced Engineering Mathematics, § 5.4) are orthogonal.

As with sinusoids, the orthogonality of some sets of functions allows us to compute their series components. Let functions f₀, f₁, ··· be orthogonal with respect to weight function w on interval [a, b] and let α₀, α₁, ··· be real constants. A generalized fourier series is (ibid., § 11.6)

$$ f(x) = \sum_{m=0}^{\infty} \alpha_m f_m(x) \qquad \text{(general3)} $$

and represents a function f as a convergent series. It can be shown that the Fourier components αₘ can be computed from

$$ \alpha_m = \frac{\langle f, f_m \rangle_w}{\langle f_m, f_m \rangle_w}. \qquad \text{(general4)} $$

In keeping with our previous terminology for fourier series, Equation general3 and Equation general4 are called general fourier synthesis and analysis, respectively.

For the aforementioned legendre and bessel

functions the generalized fourier series are

called fourier-legendre and fourier-bessel

series (ibid sect 116) These and the standard

fourier series (Lecture fourseries) are of

particular interest for the solution of partial

differential equations (Chapter 07)
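As a sketch of how (general3) and (general4) work in practice (an addition to the text; the function f(x) = x² and the interval [−1, 1] are assumptions chosen for illustration), we can compute fourier-legendre components with sympy:

```python
import sympy as sp

# A sketch (an addition): fourier-legendre components of f(x) = x**2 on
# [-1, 1] via the analysis formula (general4), with weight w = 1.
x = sp.symbols('x', real=True)
f = x**2
alphas = []
for m in range(4):
    P_m = sp.legendre(m, x)
    alpha_m = sp.integrate(f*P_m, (x, -1, 1))/sp.integrate(P_m**2, (x, -1, 1))
    alphas.append(sp.simplify(alpha_m))

print(alphas)  # [1/3, 0, 2/3, 0]
# The synthesis (general3) recovers f exactly, since f is a degree-2 polynomial:
f_synth = sum(alphas[m]*sp.legendre(m, x) for m in range(4))
print(sp.simplify(f_synth - f))  # 0
```

Because f is a polynomial of degree two, the series terminates after the m = 2 component; for general f, the synthesis converges rather than terminates.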


3. Python code in this section was generated from a Jupyter notebook named random_signal_fft.ipynb with a python3 kernel.

fourex Exercises for Chapter

06

Exercise 061 (secrets)

This exercise encodes a "secret word" into a

sampled waveform for decoding via a discrete

fourier transform (DFT) The nominal goal of

the exercise is to decode the secret word

Along the way plotting and interpreting the DFT

will be important

First load relevant packages

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

We define two functions letter_to_number to

convert a letter into an integer index of the

alphabet (a becomes 1 b becomes 2 etc) and

string_to_number_list to convert a string to

a list of ints as shown in the example at the

end

def letter_to_number(letter):
    return ord(letter) - 96

def string_to_number_list(string):
    out = []  # list
    for i in range(0, len(string)):
        out.append(letter_to_number(string[i]))
    return out  # list

print(f"aces = {string_to_number_list('aces')}")


aces = [1, 3, 5, 19]

Now we encode a code string code into a signal by beginning with "white noise," which is broadband (appears throughout the spectrum), and adding to it sin functions with amplitudes corresponding to the letter assignments of the code and harmonics corresponding to the position of the letter in the string. For instance, the string bad would be represented by noise plus the signal

$$ 2\sin 2\pi t + 1\sin 4\pi t + 4\sin 6\pi t. \qquad \text{(ex1)} $$

Let's set this up for secret word chupcabra:

N = 2000
Tm = 30
T = float(Tm)/float(N)
fs = 1/T
x = np.linspace(0, Tm, N)
noise = 4*np.random.normal(0, 1, N)
code = 'chupcabra'  # the secret word
code_number_array = np.array(string_to_number_list(code))
y = np.array(noise)
for i in range(0, len(code)):
    y = y + code_number_array[i]*np.sin(2*np.pi*(i+1)*x)

For proper decoding later it is important

to know the fundamental frequency of the

generated data

print(f"fundamental frequency = {fs} Hz")


Figure 063 the chupacabra signal

fundamental frequency = 66.66666666666667 Hz

Now we plot

fig, ax = plt.subplots()
plt.plot(x, y)
plt.xlim([0, Tm/4])
plt.xlabel('time (s)')
plt.ylabel('$y_n$')
plt.show()

Finally, we can save our data to a numpy file secrets.npy to distribute our message.

np.save('secrets', y)


Now, I have done this (for a different secret word) and saved the data; download it here:

ricopic.one/mathematical_foundations/source/secrets.npy

In order to load the npy file into Python we

can use the following command

secret_array = np.load('secrets.npy')

Your job is to (a) perform a DFT (b) plot the

spectrum and (c) decode the message Here are

a few hints

1. Use from scipy import fft to do the DFT.
2. Use a hanning window to minimize the end-effects; see numpy.hanning, for instance. The fft call might then look like

   2*fft(np.hanning(N)*secret_array)/N

   where N = len(secret_array).
3. Use only the positive spectrum; you can lop off the negative side and double the positive side.
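The hinted recipe can be sketched on a known, non-secret signal first (an addition, not the exercise solution; the 3 Hz test tone, its amplitude of 5, and the window scaling are assumptions for illustration):

```python
import numpy as np
from scipy import fft

# A sketch (an addition): apply the hinted DFT recipe to a known signal --
# a 3 Hz sine of amplitude 5 buried in noise.
np.random.seed(0)  # for reproducibility
N, Tm = 2000, 30
t = np.linspace(0, Tm, N)
fs = N/float(Tm)  # sampling frequency (Hz)
y = 5*np.sin(2*np.pi*3*t) + np.random.normal(0, 1, N)

window = np.hanning(N)  # hanning window to minimize end-effects
# scale by 2/sum(window): corrects for window loss and one-sidedness
spectrum = 2*np.abs(fft.fft(window*y))/np.sum(window)
freqs = np.arange(N)*fs/N  # DFT bin frequencies (Hz)

half = N//2  # keep only the positive-frequency half
peak = np.argmax(spectrum[1:half]) + 1
print(freqs[peak])  # peak near 3 Hz, with height near the amplitude 5
```

Decoding the secret word amounts to reading off such peak heights at each harmonic and mapping them back to letters.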

Exercise 062 (seesaw)

Consider the periodic function f : R → R with

period T defined for one period as

$$ f(t) = at, \quad t \in (-T/2, T/2] \qquad \text{(ex2)} $$

where a, T ∈ R. Perform a fourier analysis on

f Letting a = 5 and T = 1 plot f along with the

partial sum of the fourier synthesis the first

50 nonzero components, over t ∈ [−T, T].


Exercise 063 (9)

Consider the function f : R → R defined as

$$ f(t) = \begin{cases} a - a|t|/T & \text{for } t \in [-T, T] \\ 0 & \text{otherwise} \end{cases} \qquad \text{(ex3)} $$

where a, T ∈ R. Perform a fourier analysis on f,

resulting in F(ω) Plot F for various a and T

Exercise 064 (10)

Consider the function f : R → R defined as

$$ f(t) = ae^{-b|t-T|} \qquad \text{(ex4)} $$

where a, b, T ∈ R. Perform a fourier analysis on

f resulting in F(ω) Plot F for various a b and

T

Exercise 065 (11)

Consider the function f : R → R defined as

$$ f(t) = a\cos\omega_0 t + b\sin\omega_0 t \qquad \text{(ex5)} $$

where a, b, ω₀ ∈ R are constants. Perform a fourier

analysis on f resulting in F(ω)

Exercise 066 (12)

Derive a fourier transform property for

expressions including function f : R → R for

$$ f(t)\cos(\omega_0 t + \psi), $$

where ω₀, ψ ∈ R.

Exercise 067 (13)

Consider the function f : R → R defined as

$$ f(t) = au_s(t)e^{-bt}\cos(\omega_0 t + \psi) \qquad \text{(ex6)} $$

where a, b, ω₀, ψ ∈ R and u_s(t) is the unit

step function Perform a fourier analysis on

f resulting in F(ω) Plot F for various a b ω0 ψ

and T


Exercise 068 (14)

Consider the function f : R → R defined as

$$ f(t) = g(t)\cos(\omega_0 t) \qquad \text{(ex7)} $$

where ω₀ ∈ R and g : R → R will be defined in

each part below Perform a fourier analysis on

f for each g below, for ω₁ ∈ R a constant, and consider how things change if ω₁ → ω₀.

a g(t) = cos(ω1t)

b g(t) = sin(ω1t)

Exercise 069 (savage)

An instrument called a "lock-in amplifier" can measure a sinusoidal signal A cos(ω₀t + ψ) = a cos(ω₀t) + b sin(ω₀t) at a known frequency ω₀ with exceptional accuracy, even in the presence of significant noise N(t). The workings of these devices can be described in two operations: first, the following operations on the signal with its noise f₁(t) = a cos(ω₀t) + b sin(ω₀t) + N(t):

$$ f_2(t) = f_1(t)\cos(\omega_1 t) \quad \text{and} \quad f_3(t) = f_1(t)\sin(\omega_1 t), \qquad \text{(ex8)} $$

where ω₀, ω₁, a, b ∈ R. Note the relation of this operation to the Fourier analysis of Exercise 068. The key is to know with some accuracy ω₀ such that the instrument can set ω₁ ≈ ω₀. The second operation on the signal is an aggressive low-pass filter. The filtered f₂ and f₃ are called the in-phase and quadrature components of the signal and are typically given a complex representation:

(in-phase) + j (quadrature).

Explain with fourier analyses on f2 and f3


a. what F₂ = F(f₂) looks like,
b. what F₃ = F(f₃) looks like,
c. why we want ω₁ ≈ ω₀,
d. why a low-pass filter is desirable, and
e. what the time-domain signal will look like.

Exercise 0610 (strawman)

Consider again the lock-in amplifier explored

in Exercise 069 Investigate the lock-in

amplifier numerically with the following steps

a Generate a noisy sinusoidal signal

at some frequency ω0 Include enough

broadband white noise that the signal

is invisible in a time-domain plot

b Generate f2 and f3 as described in

Exercise 069

c Apply a time-domain discrete low-pass filter to each f₂ ↦ φ₂ and f₃ ↦ φ₃, such as scipy's scipy.signal.sosfiltfilt, to complete the lock-in amplifier operation.

Plot the results in time and as a complex

(polar) plot

d Perform a discrete fourier transform on each f₂ ↦ F₂ and f₃ ↦ F₃ Plot the

spectra

e Construct a frequency-domain low-pass filter F and apply it (multiply) to each F₂ ↦ F₂′ and F₃ ↦ F₃′ Plot the filtered

spectra

f Perform an inverse discrete fourier transform to each F₂′ ↦ f₂′ and F₃′ ↦ f₃′

Plot the results in time and as a complex

(polar) plot

g Compare the two methods used, i.e. time-domain filtering versus frequency-domain filtering


pde

Partial differential equations

An ordinary differential equation is one with (ordinary) derivatives of functions of a single variable each—time in many applications. These typically describe quantities in some sort of lumped-parameter way: mass as a "point particle," a spring's force as a function of time-varying displacement across it, a resistor's current as a function of time-varying voltage across it.

across it Given the simplicity of such models

in comparison to the wildness of nature

it is quite surprising how well they work

for a great many phenomena For instance

electronics rigid body mechanics population

dynamics bulk fluid mechanics and bulk heat

transfer can be lumped-parameter modeled

However as we saw in there are many

phenomena of which we require more detailed

models These include


1. There are some analytic techniques for gaining insight into PDEs for which there are no known solutions, such as considering the phase space. This is an active area of research; for more, see Bove, Colombini, and Santo (Antonio Bove, F. (Ferruccio) Colombini, and Daniele Del Santo, Phase space analysis of partial differential equations, eng, Progress in nonlinear differential equations and their applications v. 69, Boston; Berlin: Birkhäuser, 2006, ISBN 9780817645212).

• detailed fluid mechanics,
• detailed heat transfer,
• solid mechanics,
• electromagnetism, and
• quantum mechanics.

In many cases what is required to account

for is the time-varying spatial distribution

of a quantity In fluid mechanics we treat a

fluid as having quantities such as density and

velocity that vary continuously over space and

time Deriving the governing equations for such

phenomena typically involves vector calculus

we observed in that statements about

quantities like the divergence (eg continuity)

can be made about certain scalar and vector

fields Such statements are governing equations

(eg the continuity equation) and they are

partial differential equations (PDEs) because

the quantities of interest called dependent

variables (eg density and velocity) are both

temporally and spatially varying (temporal

and spatial variables are therefore called

independent variables)

In this chapter we explore the analytic

solution of PDEs This is related to but

distinct from the numeric solution (ie

simulation) of PDEs which is another important

topic Many PDEs have no known analytic

solution so numeric solution is the best

available option1 However it is important

to note that the insight one can gain from an

analytic solution is often much greater than

that from a numeric solution This is easily

understood when one considers that a numeric

solution is an approximation for a specific set


2 Kreyszig Advanced Engineering Mathematics Ch 12

3 WA Strauss Partial Differential Equations An Introduction Wiley

2007 ISBN 9780470054567 A thorough and yet relatively compact

introduction

4 R Haberman Applied Partial Differential Equations with Fourier

Series and Boundary Value Problems (Classic Version) Pearson Modern

Classics for Advanced Mathematics Pearson Education Canada 2018

ISBN 9780134995434

of initial and boundary conditions Typically

very little can be said of what would happen

in general although this is often what we seek

to know So despite the importance of numeric

solution one should always prefer an analytic

solution

Three good texts on PDEs for further study are

Kreyszig2 Strauss3 and Haberman4


pdeclass Classifying PDEs

PDEs often have an infinite number of

solutions however when applying them to

physical systems we usually assume there is

a deterministic, or at least a probabilistic, sequence of events that will occur. Therefore we impose additional constraints on a PDE, usually

in the form of

1. initial conditions: values of the dependent variables over all space at an initial time, and
2. boundary conditions: values of the dependent variables (or their derivatives) on the spatial boundary, over all time.

Ideally imposing such conditions leaves us

with a well-posed problem which has three

aspects (Antonio Bove F (Ferruccio) Colombini

and Daniele Del Santo Phase space analysis of

partial differential equations eng Progress

in nonlinear differential equations and their

applications v 69 Boston Berlin Birkhäuser

2006 ISBN 9780817645212 sect 15)

existence There exists at least one solution

uniqueness There exists at most one solution

stability If the PDE, boundary conditions, or

initial conditions are changed slightly

the solution changes only slightly

As with ODEs PDEs can be linear or nonlinear


that is, the dependent variables and their derivatives can appear in only linear combinations (linear PDE) or in one or more nonlinear combinations (nonlinear PDE). As with

ODEs there are more known analytic solutions

to linear PDEs than nonlinear PDEs

The order of a PDE is the order of its highest partial derivative. A great many physical models can be described by second-order PDEs or systems thereof. Let u be a dependent scalar variable, a function of n temporal and spatial variables xᵢ ∈ R. A second-order linear PDE has the form, for coefficients αᵢⱼ, γₖ, δₖ real functions of the xᵢ (WA Strauss, Partial Differential Equations: An Introduction, Wiley, 2007, ISBN 9780470054567; a thorough and yet relatively compact introduction, § 1.6):

$$ \underbrace{\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_{ij}\,\partial^2_{x_i x_j} u}_{\text{second-order terms}} + \underbrace{\sum_{k=1}^{n}\left(\gamma_k\,\partial_{x_k} u + \delta_k u\right)}_{\text{first- and zeroth-order terms}} = \underbrace{f(x_1, \cdots, x_n)}_{\text{forcing}} \qquad \text{(class1)} $$

where f is called a forcing function When f

is zero Equation class1 is called homogeneous

We can consider the coefficients αij to be

components of a matrix A with rows indexed

by i and columns indexed by j There are four

prominent classes defined by the eigenvalues of

A

elliptic: the eigenvalues all have the same sign,
parabolic: the eigenvalues have the same sign except one that is zero,
hyperbolic: exactly one eigenvalue has the opposite sign of the others, and
ultrahyperbolic: at least two eigenvalues of each sign.

The first three of these have received extensive

treatment They are named after conic

sections due to the similarity the equations

have with polynomials when derivatives

are considered analogous to powers of

polynomial variables For instance here is a

case of each of the first three classes

$$ \partial^2_{xx}u + \partial^2_{yy}u = 0 \quad \text{(elliptic)} $$
$$ \partial^2_{xx}u - \partial^2_{yy}u = 0 \quad \text{(hyperbolic)} $$
$$ \partial^2_{xx}u - \partial_t u = 0 \quad \text{(parabolic)} $$
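The eigenvalue classification above can be sketched in code (an addition to the text; the classify helper name and the zero tolerance are assumptions for illustration):

```python
import numpy as np

# A sketch (an addition): classify a second-order PDE by the eigenvalues of
# its coefficient matrix A = [alpha_ij], per the four classes above.
def classify(A):
    ev = np.linalg.eigvals(A)
    n = len(ev)
    pos = np.sum(ev > 1e-12)
    neg = np.sum(ev < -1e-12)
    zero = n - pos - neg
    if zero == 0 and (pos == n or neg == n):
        return 'elliptic'
    if zero == 1 and (pos == n - 1 or neg == n - 1):
        return 'parabolic'
    if zero == 0 and (pos == 1 or neg == 1):
        return 'hyperbolic'
    return 'ultrahyperbolic'

print(classify(np.diag([1, 1])))   # elliptic, like the first case above
print(classify(np.diag([1, -1])))  # hyperbolic, like the second
print(classify(np.diag([1, 0])))   # parabolic, like the third
```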

When A depends on xi it may have multiple

classes across its domain In general this

equation and its associated initial and

boundary conditions do not comprise a well-posed problem; however, several special cases have been shown to be well-posed. Thus far, the most general statement of existence and uniqueness is the cauchy-kowalevski theorem for cauchy problems.


5. For the S-L problem to be regular, it has the additional constraints that p, q, σ are continuous and p, σ > 0 on [a, b]. This is also sometimes called the sturm-liouville eigenvalue problem. See Haberman (Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 5.3) for the more general (non-regular) S-L problem and Haberman (ibid., § 7.4) for the multi-dimensional analog.

6. These eigenvalues are closely related to, but distinct from, the "eigenvalues" that arise in systems of linear ODEs.

7. Ibid., § 5.3.

pdesturm Sturm-liouville

problems

Before we introduce an important solution

method for PDEs in Lecture pdeseparation we

consider an ordinary differential equation

that will arise in that method when dealing

with a single spatial dimension x: the sturm-liouville (S-L) differential equation. Let p, q, σ be functions of x on open interval (a, b). Let X be the dependent variable and λ a constant. The regular S-L problem is the S-L ODE⁵

$$ \frac{d}{dx}\left(pX'\right) + qX + \lambda\sigma X = 0 \qquad \text{(sturm1a)} $$

with boundary conditions

$$ \beta_1 X(a) + \beta_2 X'(a) = 0 \qquad \text{(sturm2a)} $$
$$ \beta_3 X(b) + \beta_4 X'(b) = 0 \qquad \text{(sturm2b)} $$

with coefficients βi isin R This is a type of

boundary value problem

This problem has nontrivial solutions, called eigenfunctions Xₙ(x), with n ∈ Z⁺, corresponding to specific values of λ = λₙ, called eigenvalues.⁶ There are several

important theorems proven about this (see

Haberman7) Of greatest interest to us are that

1 there exist an infinite number of

eigenfunctions Xn (unique within a

multiplicative constant)

2 there exists a unique corresponding real

eigenvalue λn for each eigenfunction Xn

3 the eigenvalues can be ordered as λ₁ < λ₂ < ···


4 eigenfunction Xn has n − 1 zeros on open

interval (a b)

5 the eigenfunctions Xn form an

orthogonal basis with respect to

weighting function σ such that

any piecewise continuous function

f : [a, b] → R can be represented by a

generalized fourier series on [a b]

This last theorem will be of particular interest

in Lecture pdeseparation

Types of boundary conditions

Boundary conditions of the sturm-liouville

kind (sturm2) have four sub-types

dirichlet: for just β₂, β₄ = 0,
neumann: for just β₁, β₃ = 0,
robin: for all βᵢ ≠ 0, and
mixed: if β₁ = 0, β₃ ≠ 0; if β₂ = 0, β₄ ≠ 0.

There are many problems that are not regular

sturm-liouville problems For instance the

right-hand sides of Equation sturm2 are

zero making them homogeneous boundary

conditions however these can also be

nonzero Another case is periodic boundary

conditions

X(a) = X(b) (sturm3a)

X′(a) = X′(b) (sturm3b)


Example pdesturm-1 re a sturm-liouville problem with dirichlet

boundary conditions

Problem Statement

Consider the differential equation

X″ + λX = 0 (sturm4)

with dirichlet boundary conditions on the

boundary of the interval [0 L]

X(0) = 0 and X(L) = 0 (sturm5)

Solve for the eigenvalues and eigenfunctions

Problem Solution
This is a sturm-liouville problem, so we know the eigenvalues are real. The well-known general solution to the ODE is

$$ X(x) = \begin{cases} k_1 + k_2 x & \lambda = 0 \\ k_1 e^{j\sqrt{\lambda}\,x} + k_2 e^{-j\sqrt{\lambda}\,x} & \text{otherwise} \end{cases} \qquad \text{(sturm6)} $$

with real constants k1 k2 The solution must

also satisfy the boundary conditions. Let's apply them to the case of λ = 0 first:

$$ X(0) = 0 \Rightarrow k_1 + k_2(0) = 0 \Rightarrow k_1 = 0 \qquad \text{(sturm7)} $$
$$ X(L) = 0 \Rightarrow k_1 + k_2 L = 0 \Rightarrow k_2 = -k_1/L \qquad \text{(sturm8)} $$

Together these imply k1 = k2 = 0 which gives

the trivial solution X(x) = 0 in which we

aren't interested. We say, then, for nontrivial solutions, λ ≠ 0. Now let's check λ < 0. The solution becomes

$$ X(x) = k_1 e^{-\sqrt{|\lambda|}\,x} + k_2 e^{\sqrt{|\lambda|}\,x} \qquad \text{(sturm9)} $$
$$ = k_3\cosh(\sqrt{|\lambda|}\,x) + k_4\sinh(\sqrt{|\lambda|}\,x) \qquad \text{(sturm10)} $$

where k3 and k4 are real constants Again

applying the boundary conditions

$$ X(0) = 0 \Rightarrow k_3\cosh(0) + k_4\sinh(0) = 0 \Rightarrow k_3 + 0 = 0 \Rightarrow k_3 = 0 $$
$$ X(L) = 0 \Rightarrow 0\cdot\cosh(\sqrt{|\lambda|}\,L) + k_4\sinh(\sqrt{|\lambda|}\,L) = 0 \Rightarrow k_4\sinh(\sqrt{|\lambda|}\,L) = 0 $$

However, sinh(√|λ|L) ≠ 0 for L > 0, so k₄ = k₃ = 0—again the trivial solution. Now


let's try λ > 0. The solution can be written

$$ X(x) = k_5\cos(\sqrt{\lambda}\,x) + k_6\sin(\sqrt{\lambda}\,x). \qquad \text{(sturm11)} $$

Applying the boundary conditions for this case:

$$ X(0) = 0 \Rightarrow k_5\cos(0) + k_6\sin(0) = 0 \Rightarrow k_5 + 0 = 0 \Rightarrow k_5 = 0 $$
$$ X(L) = 0 \Rightarrow 0\cdot\cos(\sqrt{\lambda}\,L) + k_6\sin(\sqrt{\lambda}\,L) = 0 \Rightarrow k_6\sin(\sqrt{\lambda}\,L) = 0 $$

Now, sin(√λL) = 0 for

$$ \sqrt{\lambda}\,L = n\pi \Rightarrow \lambda = \left(\frac{n\pi}{L}\right)^2, \quad (n \in \mathbb{Z}^+). $$

Therefore the only nontrivial solutions

that satisfy both the ODE and the boundary

conditions are the eigenfunctions

$$ X_n(x) = \sin\left(\sqrt{\lambda_n}\,x\right) \qquad \text{(sturm12a)} $$
$$ = \sin\left(\frac{n\pi}{L}x\right) \qquad \text{(sturm12b)} $$

with corresponding eigenvalues

$$ \lambda_n = \left(\frac{n\pi}{L}\right)^2. \qquad \text{(sturm13)} $$

Note that because λ > 0, λ₁ is the lowest eigenvalue.

Plotting the eigenfunctions

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: eigenfunctions_example_plot.ipynb
notebook kernel: python3

First load some Python packages


import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Set L = 1 and compute values for the first

four eigenvalues lambda_n and eigenfunctions

X_n

L = 1
x = np.linspace(0, L, 100)
n = np.linspace(1, 4, 4, dtype=int)
lambda_n = (n*np.pi/L)**2
X_n = np.zeros([len(n), len(x)])
for i, n_i in enumerate(n):
    X_n[i] = np.sin(np.sqrt(lambda_n[i])*x)

Plot the eigenfunctions

for i, n_i in enumerate(n):
    plt.plot(
        x, X_n[i],
        linewidth=2, label='n = ' + str(n_i)
    )
plt.legend()
plt.show()  # display the plot

(plot of the eigenfunctions Xₙ(x) for n = 1, 2, 3, 4 on [0, 1])


We see that the fourth of the S-L theorems appears true: n − 1 zeros of Xₙ exist on the open interval (0, 1).
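The fifth S-L theorem (orthogonality) can also be checked for these eigenfunctions; a brief symbolic sketch (an addition to the text, taking L = 1 as in the plot and unit weight σ):

```python
import sympy as sp

# A sketch (an addition): the eigenfunctions X_n = sin(n*pi*x/L) of this
# example are orthogonal on [0, L] with unit weight, per the fifth S-L theorem.
x = sp.symbols('x', real=True)
L = 1
X = lambda n: sp.sin(n*sp.pi*x/L)

print(sp.integrate(X(1)*X(2), (x, 0, L)))  # 0: distinct eigenfunctions are orthogonal
print(sp.integrate(X(3)*X(3), (x, 0, L)))  # 1/2: squared norm is L/2, nonzero
```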



pdeseparation PDE solution by

separation of

variables

We are now ready to learn one of the

most important techniques for solving PDEs

separation of variables It applies only to

linear PDEs since it will require the principle

of superposition Not all linear PDEs yield to

this solution technique but several that are

important do

The technique includes the following steps

assume a product solution: Assume the solution can be written as a product solution u_p, the product of functions of each independent variable.

separate PDE: Substitute u_p into the PDE and rearrange such that at least one side of the equation has functions of a single independent variable. If this is possible, the PDE is called separable.

set equal to a constant Each side of

the equation depends on different

independent variables therefore they

must each equal the same constant often

called −λ

repeat separation as needed If there are more

than two independent variables there

will be an ODE in the separated variable

and a PDE (with one fewer variables) in

the other independent variables Attempt

to separate the PDE until only ODEs

remain

solve each boundary value problem Solve

each boundary value problem ODE


ignoring the initial conditions for now

solve the time variable ODE Solve for the

general solution of the time variable ODE

sans initial conditions

construct the product solution: Multiply the solution in each variable to construct the product solution u_p. If the boundary

value problems were sturm-liouville

the product solution is a family of

eigenfunctions from which any function

can be constructed via a generalized

fourier series

apply the initial condition The product solutions individually usually do not meet the initial condition. However, a generalized fourier series of them nearly always does. Superposition tells us a linear combination of solutions to the PDE and boundary conditions is also a solution; the unique series that also satisfies the initial condition is the unique solution to the entire problem.
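The first two steps above can be checked symbolically. The following is an illustrative sketch (our own, not from the text), assuming the heat equation ∂_t u = k ∂²_xx u of the next example; sympy cancels the shared factors automatically when we divide through by kTX:

```python
import sympy as sp

# Illustrative sketch (not from the text): substitute the product solution
# u_p = T(t) X(x) into the heat equation u_t = k u_xx and divide by k T X.
t, x, k = sp.symbols("t x k", positive=True)
T = sp.Function("T")(t)
X = sp.Function("X")(x)
u = T * X  # assumed product solution

lhs = sp.diff(u, t) / (k * u)   # reduces to T'/(k T): a function of t only
rhs = sp.diff(u, x, 2) / u      # reduces to X''/X: a function of x only

# The x-dependence cancels on the left and the t-dependence on the right,
# which is exactly what makes the PDE separable.
assert x not in lhs.free_symbols
assert t not in rhs.free_symbols
```

Since the left side depends only on t and the right side only on x, each must equal the same constant, reproducing the "set equal to a constant" step.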

Example pdeseparation-1 re 1D diffusion equation

Problem Statement

Consider the one-dimensional diffusion equation PDE^a

∂_t u(t, x) = k ∂²_xx u(t, x)    (separation1)

with real constant k, with dirichlet boundary conditions on interval x ∈ [0, L]

u(t, 0) = 0    (separation2a)
u(t, L) = 0    (separation2b)

and with initial condition

u(0, x) = f(x)    (separation3)

where f is some piecewise continuous function on [0, L].


a. For more on the diffusion or heat equation, see Haberman (Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 2.3), Kreyszig (Kreyszig, Advanced Engineering Mathematics, § 12.5), and Strauss (Strauss, Partial Differential Equations: An Introduction, § 2.3).

Problem Solution

Assume a product solution

First, we assume a product solution of the form u_p(t, x) = T(t)X(x), where T and X are unknown functions on t > 0 and x ∈ [0, L].

Separate PDE

Second, we substitute the product solution into Equation separation1 and separate variables:

T′X = kTX″ ⇒    (separation4)
T′/(kT) = X″/X.    (separation5)

So it is separable. Note that we chose to group k with T, which was arbitrary but conventional.

Set equal to a constant

Since these two sides depend on different independent variables (t and x), they must each equal the same constant, which we call -λ, so we have two ODEs:

T′/(kT) = -λ ⇒ T′ + λkT = 0    (separation6)
X″/X = -λ ⇒ X″ + λX = 0    (separation7)

Solve the boundary value problem

The latter of these equations with the boundary conditions (separation2) is precisely the same sturm-liouville boundary value problem from Example pdesturm-1, which had


eigenfunctions

X_n(x) = sin(√λ_n x)    (separation8a)
       = sin(nπx/L)    (separation8b)

with corresponding (positive) eigenvalues

λ_n = (nπ/L)²    (separation9)

Solve the time variable ODE

The time variable ODE is homogeneous and has the familiar general solution

T(t) = c e^{-kλt}    (separation10)

with real constant c. However, the boundary value problem restricted values of λ to λ_n, so

T_n(t) = c e^{-k(nπ/L)²t}    (separation11)

Construct the product solution

The product solution is

u_p(t, x) = T_n(t)X_n(x)
          = c e^{-k(nπ/L)²t} sin(nπx/L)    (separation12)

This is a family of solutions that each satisfy only exotically specific initial conditions.

Apply the initial condition

The initial condition is u(0, x) = f(x). The eigenfunctions of the boundary value problem form a fourier series that satisfies the initial condition on the interval [0, L] if we extend f to be periodic and odd over x (Kreyszig, Advanced Engineering Mathematics, p. 550); we call the extension f*. The odd series synthesis can be written

f*(x) = Σ_{n=1}^∞ b_n sin(nπx/L)    (separation13)


where the fourier analysis gives

b_n = (2/L) ∫₀^L f*(χ) sin(nπχ/L) dχ    (separation14)

So the complete solution is

u(t, x) = Σ_{n=1}^∞ b_n e^{-k(nπ/L)²t} sin(nπx/L)    (separation15)

Notice this satisfies the PDE, the boundary conditions, and the initial condition.

Plotting solutions

If we want to plot solutions, we need to specify an initial condition u(0, x) = f*(x) over [0, L]. We can choose anything piecewise continuous, but for simplicity let's let

f(x) = 1, x ∈ [0, L].

The odd periodic extension is an odd square wave. The integral (separation14) gives

b_n = (2/(nπ))(1 - cos(nπ))
    = 0 for n even, 4/(nπ) for n odd    (separation16)
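As a sanity check (our own illustration, not part of the original example), the coefficients above can be verified by numerical quadrature of the analysis integral (separation14) with f(x) = 1:

```python
import numpy as np

# Numerical check (illustrative): b_n = (2/L) ∫_0^L f(x) sin(nπx/L) dx
# for f(x) = 1 should give 4/(nπ) for odd n and 0 for even n.
L = 1.0
n_pts = 200_000
dx = L / n_pts
x = (np.arange(n_pts) + 0.5) * dx  # midpoint-rule abscissae

def b(n):
    # midpoint-rule approximation of the analysis integral
    return 2.0 / L * np.sum(np.sin(n * np.pi * x / L)) * dx

assert abs(b(1) - 4 / np.pi) < 1e-6
assert abs(b(2)) < 1e-6
assert abs(b(3) - 4 / (3 * np.pi)) < 1e-6
```

The names here shadow those in the notebook code below; the check is meant to be run standalone.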

Now we can write the solution as

u(t, x) = Σ_{n=1, n odd}^∞ (4/(nπ)) e^{-k(nπ/L)²t} sin(nπx/L)    (separation17)

Plotting in Python

First, load some Python packages:

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex


Set k = L = 1 and sum values for the first N terms of the solution:

L = 1
k = 1
N = 100
x = np.linspace(0, L, 300)
t = np.linspace(0, 2*(L/np.pi)**2, 100)
u_n = np.zeros([len(t), len(x)])
for n in range(N):
    n = n + 1  # because index starts at 0
    if n % 2 == 0:  # even
        pass  # already initialized to zeros
    else:  # odd
        u_n += 4/(n*np.pi)*np.outer(
            np.exp(-k*(n*np.pi/L)**2*t),
            np.sin(n*np.pi/L*x)
        )

Let's first plot the initial condition:

p = plt.figure()
plt.plot(x, u_n[0])
plt.xlabel('space $x$')
plt.ylabel('$u(0,x)$')
plt.show()

[plot: the initial condition u(0, x) versus space x]


Now we plot the entire response:

p = plt.figure()
plt.contourf(t, x, u_n.T)
c = plt.colorbar()
c.set_label('$u(t,x)$')
plt.xlabel('time $t$')
plt.ylabel('space $x$')
plt.show()

[contour plot: u(t, x) over time t and space x]

We see the diffusive action proceeds as we expected.

Python code in this section was generated from a Jupyter notebook named pde_separation_example_01.ipynb with a python3 kernel.



8. Kreyszig, Advanced Engineering Mathematics, § 12.9, 12.10.
9. Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 4.5, 7.3.

pdewave The 1D wave equation

The one-dimensional wave equation is the linear PDE

∂²_tt u(t, x) = c² ∂²_xx u(t, x)    (wave1)

with real constant c. This equation models such phenomena as strings, fluids, sound, and light. It is subject to initial and boundary conditions and can be extended to multiple spatial dimensions. For 2D and 3D examples in rectangular and polar coordinates, see Kreyszig⁸ and Haberman⁹.

Example pdewave-1 re vibrating string PDE solution by separation of variables

Problem Statement

Consider the one-dimensional wave equation PDE

∂²_tt u(t, x) = c² ∂²_xx u(t, x)    (wave2)

with real constant c and with dirichlet boundary conditions on interval x ∈ [0, L]

u(t, 0) = 0 and u(t, L) = 0    (wave3a)

and with initial conditions (we need two because of the second time-derivative)

u(0, x) = f(x) and ∂_t u(0, x) = g(x)    (wave4)

where f and g are some piecewise continuous functions on [0, L].

Problem Solution

Assume a product solution

First, we assume a product solution of the form u_p(t, x) = T(t)X(x), where T and X are unknown functions on t > 0 and x ∈ [0, L].


Separate PDE

Second, we substitute the product solution into Equation wave2 and separate variables:

T″X = c²TX″ ⇒    (wave5)
T″/(c²T) = X″/X.    (wave6)

So it is separable. Note that we chose to group c with T, which was arbitrary but conventional.

Set equal to a constant

Since these two sides depend on different independent variables (t and x), they must each equal the same constant, which we call -λ, so we have two ODEs:

T″/(c²T) = -λ ⇒ T″ + λc²T = 0    (wave7)
X″/X = -λ ⇒ X″ + λX = 0    (wave8)

Solve the boundary value problem

The latter of these equations with the boundary conditions (wave3) is precisely the same sturm-liouville boundary value problem from Example pdesturm-1, which had eigenfunctions

X_n(x) = sin(√λ_n x)    (wave9a)
       = sin(nπx/L)    (wave9b)

with corresponding (positive) eigenvalues

λ_n = (nπ/L)²    (wave10)

Solve the time variable ODE

The time variable ODE is homogeneous and, with λ restricted to the reals by the boundary value problem, has the familiar general solution

T(t) = k₁ cos(c√λ t) + k₂ sin(c√λ t)    (wave11)


with real constants k₁ and k₂. However, the boundary value problem restricted values of λ to λ_n, so

T_n(t) = k₁ cos(cnπt/L) + k₂ sin(cnπt/L)    (wave12)

Construct the product solution

The product solution is

u_p(t, x) = T_n(t)X_n(x)
          = k₁ sin(nπx/L) cos(cnπt/L) + k₂ sin(nπx/L) sin(cnπt/L)

This is a family of solutions that each satisfy only exotically specific initial conditions.

Apply the initial conditions

Recall that superposition tells us that any linear combination of the product solution is also a solution. Therefore,

u(t, x) = Σ_{n=1}^∞ [a_n sin(nπx/L) cos(cnπt/L) + b_n sin(nπx/L) sin(cnπt/L)]    (wave13)

is a solution. If a_n and b_n are properly selected to satisfy the initial conditions, Equation wave13 will be the solution to the entire problem. Substituting t = 0 into our potential solution gives

u(0, x) = Σ_{n=1}^∞ a_n sin(nπx/L)    (wave14a)
∂_t u(t, x)|_{t=0} = Σ_{n=1}^∞ b_n (cnπ/L) sin(nπx/L)    (wave14b)

Let us extend f and g to be periodic and odd over x; we call the extensions f* and g*. From Equation wave14, the initial conditions are satisfied if

f*(x) = Σ_{n=1}^∞ a_n sin(nπx/L)    (wave15a)
g*(x) = Σ_{n=1}^∞ b_n (cnπ/L) sin(nπx/L)    (wave15b)


We identify these as two odd fourier syntheses. The corresponding fourier analyses are

a_n = (2/L) ∫₀^L f*(χ) sin(nπχ/L) dχ    (wave16a)
b_n (cnπ/L) = (2/L) ∫₀^L g*(χ) sin(nπχ/L) dχ    (wave16b)

So the complete solution is Equation wave13 with components given by Equation wave16. Notice this satisfies the PDE, the boundary conditions, and the initial conditions.

Discussion

It can be shown that this series solution is equivalent to two traveling waves that are interfering (see Haberman^a and Kreyszig^b). This is convenient because computing the series solution exactly requires an infinite summation. We show in the following section that the approximation by partial summation is still quite good.

Choosing specific initial conditions

If we want to plot solutions, we need to specify initial conditions over [0, L]. Let's model a string being suddenly struck from rest as

f(x) = 0    (wave17)
g(x) = δ(x - ∆L)    (wave18)

where δ is the dirac delta distribution and ∆ ∈ [0, 1] is a fraction of L representing the location of the string being struck. The odd periodic extension is an odd pulse train. The integrals of (wave16) give

a_n = 0    (wave19a)
b_n = (2/(cnπ)) ∫₀^L δ(χ - ∆L) sin(nπχ/L) dχ
    = (2/(cnπ)) sin(nπ∆)    (sifting property)


Now we can write the solution as

u(t, x) = Σ_{n=1}^∞ (2/(cnπ)) sin(nπ∆) sin(nπx/L) sin(cnπt/L)    (wave20)

Plotting in Python

First, load some Python packages:

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Set c = L = 1 and sum values for the first N terms of the solution for some striking location ∆ (the coefficient 2/(cnπ) matches Equation wave20):

Delta = 0.1  # 0 <= Delta <= 1
L = 1
c = 1
N = 150
t = np.linspace(0, 30*(L/np.pi)**2, 100)
x = np.linspace(0, L, 150)
t_b, x_b = np.meshgrid(t, x)
u_n = np.zeros([len(x), len(t)])
for n in range(N):
    n = n + 1  # because index starts at 0
    u_n += 2/(c*n*np.pi) * \
        np.sin(n*np.pi*Delta) * \
        np.sin(c*n*np.pi/L*t_b) * \
        np.sin(n*np.pi/L*x_b)

Let's first plot some early snapshots of the response:

import seaborn as sns
n_snaps = 7
sns.set_palette(sns.diverging_palette(
    240, 10, n=n_snaps, center='dark'
))
fig, ax = plt.subplots()
it = np.linspace(2, 77, n_snaps, dtype=int)
for i in range(len(it)):
    ax.plot(x, u_n[:, it[i]], label=f't = {t[it[i]]:.3f}')
lgd = ax.legend(
    bbox_to_anchor=(1.05, 1),
    loc='upper left'
)
plt.xlabel('space $x$')
plt.ylabel('$u(t,x)$')
plt.show()

[plot: snapshots of u(t, x) versus space x at t = 0.000, 0.031, 0.061, 0.092, 0.123, 0.154, and 0.184]

Now we plot the entire response:

fig, ax = plt.subplots()
p = ax.contourf(t, x, u_n)
c = fig.colorbar(p, ax=ax)
c.set_label('$u(t,x)$')
plt.xlabel('time $t$')
plt.ylabel('space $x$')
plt.show()


[contour plot: u(t, x) over time t and space x]

We see a wave develop and travel, reflecting and inverting off each boundary.

a. Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 4.4.
b. Kreyszig, Advanced Engineering Mathematics, § 12.2.
c. Python code in this section was generated from a Jupyter notebook named pde_separation_example_02.ipynb with a python3 kernel.


pdeex Exercises for Chapter 07

Exercise 071 (poltergeist)

The PDE of Example pdeseparation-1 can be used to describe the conduction of heat along a long, thin rod insulated along its length, where u(t, x) represents temperature. The initial and dirichlet boundary conditions in that example would be interpreted as an initial temperature distribution along the bar and fixed temperatures of the ends. Now consider the same PDE

∂_t u(t, x) = k ∂²_xx u(t, x)    (ex1)

with real constant k, now with neumann boundary conditions on interval x ∈ [0, L]

∂_x u|_{x=0} = 0 and ∂_x u|_{x=L} = 0    (ex2a)

and with initial condition

u(0, x) = f(x)    (ex3)

where f is some piecewise continuous function on [0, L]. This represents the complete insulation of the ends of the rod such that no heat flows from the ends (or from anywhere else).

a. Assume a product solution, separate variables into X(x) and T(t), and set the separation constant to -λ.

b. Solve the boundary value problem for its eigenfunctions X_n and eigenvalues λ_n.

c. Solve for the general solution of the time variable ODE.

d. Write the product solution and apply the initial condition f(x) by constructing it from a generalized fourier series of the product solution.


e. Let L = k = 1 and f(x) = 100 - (200/L)|x - L/2| as shown in Figure 071. Compute the solution series components. Plot the sum of the first 50 terms over x and t.

[Figure 071: the initial condition for Exercise 071, a triangular profile over x ∈ [0, 1] peaking at 100 at x = 1/2]

Exercise 072 (kathmandu)

Consider the free vibration of a uniform and relatively thin beam, with modulus of elasticity E, second moment of cross-sectional area I, and mass-per-length µ, pinned at each end. The PDE describing this is a version of the euler-bernoulli beam equation for vertical motion u:

∂²_tt u(t, x) = -α² ∂⁴_xxxx u(t, x)    (ex4)

with real constant α defined as

α² = EI/µ    (ex5)

Pinned supports fix vertical motion such that we have boundary conditions on interval x ∈ [0, L]

u(t, 0) = 0 and u(t, L) = 0    (ex6a)

Additionally, pinned supports cannot provide a moment, so

∂²_xx u|_{x=0} = 0 and ∂²_xx u|_{x=L} = 0    (ex6b)

Furthermore, consider the initial conditions

u(0, x) = f(x) and ∂_t u|_{t=0} = 0    (ex7a)

where f is some piecewise continuous function on [0, L].


a. Assume a product solution, separate variables into X(x) and T(t), and set the separation constant to -λ.

b. Solve the boundary value problem for its eigenfunctions X_n and eigenvalues λ_n. Assume real λ > 0 (it's true, but tedious to show).

c. Solve for the general solution of the time variable ODE.

d. Write the product solution and apply the initial conditions by constructing it from a generalized fourier series of the product solution.

e. Let L = α = 1 and f(x) = sin(10πx/L) as shown in Figure 072. Compute the solution series components. Plot the sum of the first 50 terms over x and t.

[Figure 072: the initial condition f(x) = sin(10πx/L) for Exercise 072, plotted over x ∈ [0, 1]]

opt

Optimization

opt Optimization grad Gradient descent p 1


optgrad Gradient descent

Consider a multivariate function f: Rⁿ → R that represents some cost or value. This is called an objective function, and we often want to find an X ∈ Rⁿ that yields f's extremum: minimum or maximum, depending on whichever is desirable.

It is important to note, however, that some functions have no finite extremum. Other functions have multiple. Finding a global extremum is generally difficult; however, many good methods exist for finding a local extremum, an extremum for some region R ⊂ Rⁿ.

The method explored here is called gradient descent. It will soon become apparent why it has this name.

Stationary points

Recall from basic calculus that a function f of a single variable has potential local extrema where df(x)/dx = 0. The multivariate version of this for multivariate function f is

∇f = 0    (grad1)

A value X for which Equation grad1 holds is called a stationary point. However, as in the univariate case, a stationary point may not be a local extremum; in these cases it is called a saddle point.

Consider the hessian matrix H with values, for independent variables x_i,

H_ij = ∂²_{x_i x_j} f    (grad2)



For a stationary point X, the second partial derivative test tells us if it is a local maximum, local minimum, or saddle point.

minimum If H(X) is positive definite (all its eigenvalues are positive), X is a local minimum.

maximum If H(X) is negative definite (all its eigenvalues are negative), X is a local maximum.

saddle If H(X) is indefinite (it has both positive and negative eigenvalues), X is a saddle point.

These are sometimes called tests for concavity: minima occur where f is convex and maxima where f is concave (i.e., where -f is convex).
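The test reduces to inspecting the signs of the eigenvalues of H. A minimal illustrative sketch (the helper function and example Hessians are our own, not from the text):

```python
import numpy as np

# Classify a stationary point by the eigenvalues of its (symmetric) Hessian.
def classify(H):
    eig = np.linalg.eigvalsh(H)  # eigenvalues of symmetric H
    if np.all(eig > 0):
        return "minimum"   # positive definite
    if np.all(eig < 0):
        return "maximum"   # negative definite
    return "saddle"        # indefinite (mixed signs)

# f(x, y) = x**2 + 3*y**2 has H = [[2, 0], [0, 6]]: a minimum at the origin.
assert classify(np.array([[2.0, 0.0], [0.0, 6.0]])) == "minimum"
# f(x, y) = x**2 - y**2 has H = [[2, 0], [0, -2]]: a saddle at the origin.
assert classify(np.array([[2.0, 0.0], [0.0, -2.0]])) == "saddle"
```

(A zero eigenvalue makes the test inconclusive; this sketch ignores that degenerate case.)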

It turns out, however, that solving Equation grad1 directly for stationary points is generally hard. Therefore, we will typically use an iterative technique for estimating them.

The gradient points the way

Although Equation grad1 isn't usually directly useful for computing stationary points, it suggests iterative techniques that are. Several techniques rely on the insight that the gradient points toward stationary points. Recall from Lecture vecsgrad that ∇f is a vector field that points in the direction of greatest increase in f.



Consider starting at some point x0 and wanting to move iteratively closer to a stationary point. If one is seeking a maximum of f, then choose x1 to be in the direction of ∇f. If one is seeking a minimum of f, then choose x1 to be opposite the direction of ∇f.

The question becomes: how far, α, should we go in (or opposite) the direction of the gradient? Surely too-small an α will require more iterations, and too-large an α will lead to poor convergence or missing minima altogether. This framing of the problem is called line search. There are a few common methods for choosing α, called the step size, some more computationally efficient than others.

Two methods for choosing the step size are described below. Both are framed as minimization methods, but changing the sign of the step turns them into maximization methods.

The classical method

Let

g_k = ∇f(x_k)    (grad3)

the gradient at the algorithm's current estimate x_k of the minimum. The classical method of choosing α is to attempt to solve analytically for

α_k = argmin_α f(x_k - α g_k)    (grad4)

This solution approximates the function f as one varies α. It is approximate because, as α varies, so should x. But even with α as the only variable, Equation grad4 may be


1. Kreyszig, Advanced Engineering Mathematics, § 22.1.
2. Jonathan Barzilai and Jonathan M. Borwein. "Two-Point Step Size Gradient Methods". In: IMA Journal of Numerical Analysis 8.1 (Jan. 1988), pp. 141-148. issn: 0272-4979. doi: 10.1093/imanum/8.1.141. This includes an innovative line search method.

difficult or impossible to solve. However, this is sometimes called the "optimal" choice for α. Here "optimality" refers not to practicality but to ideality. This method is rarely used to solve practical problems.

The algorithm of the classical gradient descent method can be summarized in the pseudocode of Algorithm 1. It is described further in Kreyszig¹.
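For one special case the classical line search is actually solvable: a quadratic objective f(x) = (1/2) x·Ax - b·x with symmetric positive definite A has gradient g = Ax - b, and argmin_α f(x - αg) has the closed form α = (g·g)/(g·Ag). The following is our own minimal sketch under that assumption (A, b, and x0 are illustrative choices, not from the text):

```python
import numpy as np

# Classical gradient descent on the quadratic f(x) = 0.5 x·Ax - b·x,
# where the line search argmin has a closed form (illustrative example).
A = np.array([[20.0, 0.0], [0.0, 10.0]])
b = np.array([1.0, 1.0])
x = np.array([50.0, -40.0])  # initial guess x0

for _ in range(100):
    g = A @ x - b                    # gradient at the current estimate
    if np.linalg.norm(g) < 1e-12:    # stopping threshold T
        break
    alpha = (g @ g) / (g @ (A @ g))  # exact (classical) step size
    x = x - alpha * g

# The minimizer of this quadratic satisfies Ax = b.
assert np.allclose(x, np.linalg.solve(A, b), atol=1e-8)
```

For general nonlinear f, no such closed form exists, which is why the classical method is rarely practical.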

Algorithm 1 Classical gradient descent
1: procedure classical_minimizer(f, x0, T)
2:   while δx > T do   ▷ until threshold T is met
3:     g_k ← ∇f(x_k)
4:     α_k ← argmin_α f(x_k - α g_k)
5:     x_{k+1} ← x_k - α_k g_k
6:     δx ← ‖x_{k+1} - x_k‖
7:     k ← k + 1
8:   end while
9:   return x_k   ▷ the threshold was reached
10: end procedure

The Barzilai and Borwein method

In practice, several non-classical methods are used for choosing step size α. Most of these construct criteria for step sizes that are too small and too large, and prescribe choosing some α that (at least in certain cases) must be in the sweet spot in between. Barzilai and Borwein² developed such a prescription, which we now present.

Let ∆x_k = x_k - x_{k-1} and ∆g_k = g_k - g_{k-1}. This method minimizes ‖∆x - α∆g‖² by choosing

α_k = (∆x_k · ∆g_k)/(∆g_k · ∆g_k)    (grad5)


3. Barzilai and Borwein, "Two-Point Step Size Gradient Methods".

The algorithm of this gradient descent method can be summarized in the pseudocode of Algorithm 2. It is described further in Barzilai and Borwein³.

Algorithm 2 Barzilai and Borwein gradient descent
1: procedure barzilai_minimizer(f, x0, T)
2:   while δx > T do   ▷ until threshold T is met
3:     g_k ← ∇f(x_k)
4:     ∆g_k ← g_k - g_{k-1}
5:     ∆x_k ← x_k - x_{k-1}
6:     α_k ← (∆x_k · ∆g_k)/(∆g_k · ∆g_k)
7:     x_{k+1} ← x_k - α_k g_k
8:     δx ← ‖x_{k+1} - x_k‖
9:     k ← k + 1
10:  end while
11:  return x_k   ▷ the threshold was reached
12: end procedure
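The steps of Algorithm 2 can be sketched in a few lines of NumPy. This is our own minimal illustration on a quadratic (A, b, the starting point, and the small seed step are illustrative choices, not from the text); note the method needs two previous iterates, hence the seed step:

```python
import numpy as np

# Minimal sketch of Barzilai-Borwein descent on f(x) = 0.5 x·Ax - b·x.
A = np.array([[20.0, 0.0], [0.0, 10.0]])
b = np.array([1.0, 1.0])

def grad(x):
    return A @ x - b  # gradient of the quadratic objective

x_prev = np.array([50.0, -40.0])      # x_{k-1}: initial guess
x = x_prev - 1e-4 * grad(x_prev)      # x_k: small seed step
for _ in range(1000):
    g = grad(x)
    if np.linalg.norm(g) < 1e-10:     # stopping threshold T
        break
    dg = g - grad(x_prev)
    dx = x - x_prev
    alpha = (dx @ dg) / (dg @ dg)     # Barzilai-Borwein step size (grad5)
    x_prev, x = x, x - alpha * g

# The minimizer of this quadratic satisfies Ax = b.
assert np.allclose(x, np.linalg.solve(A, b), atol=1e-8)
```

Unlike the classical method, no line search is solved at each step; the two-point history supplies the step size.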

Example optgrad-1 re Barzilai and Borwein gradient descent

Problem Statement

Consider the functions (a) f1: R² → R and (b) f2: R² → R defined as

f1(x) = (x1 - 25)² + 13(x2 + 10)²    (grad6)
f2(x) = (1/2) x · Ax - b · x    (grad7)

where

A = [10 0; 0 20] and    (grad8a)
b = [1 1]ᵀ    (grad8b)

Use the method of Barzilai and Borwein^a, starting at some x0, to find a minimum of each function.

a. Ibid.

Problem Solution

First, load some Python packages:

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex
from tabulate import tabulate

We begin by writing a class gradient_descent_min to perform the gradient descent. This is not optimized for speed.

class gradient_descent_min():
    """A Barzilai and Borwein gradient descent class

    Inputs:
      f: Python function of x variables
      x: list of symbolic variables (e.g. [x1, x2])
      x0: list, numeric initial guess of a min of f
      T: step size threshold for stopping the descent

    To execute the gradient descent, call the descend method.

    nb: This is only for gradients in cartesian coordinates.
    Further work would be to implement this in multiple or
    generalized coordinates. See the grad method below for
    implementation.
    """
    def __init__(self, f, x, x0, T):
        self.f = f
        self.x = Array(x)
        self.x0 = np.array(x0)
        self.T = T
        self.n = len(x0)  # size of x
        self.g = lambdify(x, self.grad(f, x), 'numpy')
        self.xk = np.array(x0)
        self.table = {}

    def descend(self):
        # unpack variables
        f, x, x0, T, g = self.f, self.x, self.x0, self.T, self.g
        # initialize variables
        N = 0
        x_k = x0
        dx = 2*T  # can't be zero
        x_km1 = .9*x0 - .1  # can't equal x0
        g_km1 = np.array(g(*x_km1))
        N_max = 100  # max iterations
        table_data = [[N, x0, np.array(g(*x0)), 0]]
        while (dx > T and N < N_max) or N < 1:
            N += 1  # increment index
            g_k = np.array(g(*x_k))
            dg_k = g_k - g_km1
            dx_k = x_k - x_km1
            alpha_k = dx_k.dot(dg_k)/dg_k.dot(dg_k)
            x_km1 = x_k  # store
            x_k = x_k - alpha_k*g_k
            # save
            table_data.append([N, x_k, g_k, alpha_k])
            self.xk = np.vstack((self.xk, x_k))
            # store other variables
            g_km1 = g_k
            dx = np.linalg.norm(x_k - x_km1)  # check
        self.tabulater(table_data)

    def tabulater(self, table_data):
        np.set_printoptions(precision=2)
        tabulate.LATEX_ESCAPE_RULES = {}
        self.table['python'] = tabulate(
            table_data,
            headers=['N', 'x_k', 'g_k', 'alpha']
        )
        self.table['latex'] = tabulate(
            table_data,
            headers=['$N$', '$\\bm{x}_k$', '$\\bm{g}_k$', '$\\alpha$'],
            tablefmt='latex_raw'
        )

    def grad(self, f, x):  # cartesian coords gradient
        return derive_by_array(f(x), x)

First, consider f1:

var('x1 x2')
x = Array([x1, x2])
f1 = lambda x: (x[0] - 25)**2 + 13*(x[1] + 10)**2
gd = gradient_descent_min(f=f1, x=x, x0=[-50, 40], T=1e-8)


Perform the gradient descent:

gd.descend()

Print the interesting variables:

print(gd.table['python'])

N  x_k              g_k                    alpha
0  [-50 40]         [-150 1300]            0
1  [-43.65 -15.03]  [-150 1300]            0.0423296
2  [-38.36 -10.]    [-13.73 -130.74]       0.0384979
3  [-33.11 -10.]    [-1.27e+02 1.24e-01]   0.041454
4  [24.99 -10.]     [-1.16e+02 -9.62e-03]  0.499926
5  [25. -10.05]     [-0.02 0.12]           0.499999
6  [25. -10.]       [-1.84e-08 -1.38e+00]  0.0385225
7  [25. -10.]       [-1.70e-08 2.19e-03]   0.0384615
8  [25. -10.]       [-1.57e-08 0.00e+00]   0.0384615

Now let's lambdify the function f1 so we can plot:

f1_lambda = lambdify((x1, x2), f1(x), 'numpy')

Now let's plot a contour plot with the gradient descent overlaid:

fig, ax = plt.subplots()
# contour plot
X1 = np.linspace(-100, 100, 100)
X2 = np.linspace(-50, 50, 100)
X1, X2 = np.meshgrid(X1, X2)
F1 = f1_lambda(X1, X2)
plt.contourf(X1, X2, F1)
# gradient descent plot
from matplotlib.collections import LineCollection
xX1 = gd.xk[:, 0]
xX2 = gd.xk[:, 1]
points = np.array([xX1, xX2]).T.reshape(-1, 1, 2)
segments = np.concatenate(
    [points[:-1], points[1:]], axis=1
)
lc = LineCollection(
    segments, cmap=plt.get_cmap('Reds')
)
lc.set_array(np.linspace(0, 1, len(xX1)))  # color segs
lc.set_linewidth(3)
ax.autoscale(False)  # avoid the scatter changing lims
ax.add_collection(lc)
ax.scatter(
    xX1, xX2, zorder=1, marker='o',
    color=plt.cm.Reds(np.linspace(0, 1, len(xX1))),
    edgecolor='none'
)
plt.show()

[contour plot: f1 over x1 ∈ [-100, 100] and x2 ∈ [-50, 50], with the gradient descent trajectory overlaid]


Now consider f2:

A = Matrix([[10, 0], [0, 20]])
b = Matrix([[1], [1]])
def f2(x):
    X = Array([x]).tomatrix().T
    return (1/2)*X.dot(A*X) - b.dot(X)

gd = gradient_descent_min(f=f2, x=x, x0=[50, -40], T=1e-8)

Perform the gradient descent:

gd.descend()

Print the interesting variables:

print(gd.table['python'])


N   x_k            g_k                   alpha
0   [50 -40]       [499 -801]            0
1   [17.58 12.04]  [499 -801]            0.0649741
2   [8.07 -1.01]   [174.78 239.88]       0.0544221
3   [3.62 0.17]    [79.66 -21.22]        0.0558582
4   [0.49 -0.05]   [35.16 2.49]          0.0889491
5   [0.1 0.14]     [3.89 -1.94]          0.0990201
6   [0.1 0.]       [0.04 1.9]            0.0750849
7   [0.1 0.05]     [0.01 -0.95]          0.050005
8   [0.1 0.05]     [4.74e-03 9.58e-05]   0.0500012
9   [0.1 0.05]     [2.37e-03 -2.38e-09]  0.0999186
10  [0.1 0.05]     [1.93e-06 2.37e-09]   0.1
11  [0.1 0.05]     [0.00e+00 -2.37e-09]  0.0999997

Now let's lambdify the function f2 so we can plot:

f2_lambda = lambdify((x1, x2), f2(x), 'numpy')

Now let's plot a contour plot with the gradient descent overlaid:

fig, ax = plt.subplots()
# contour plot
X1 = np.linspace(-100, 100, 100)
X2 = np.linspace(-50, 50, 100)
X1, X2 = np.meshgrid(X1, X2)
F2 = f2_lambda(X1, X2)
plt.contourf(X1, X2, F2)
# gradient descent plot
from matplotlib.collections import LineCollection
xX1 = gd.xk[:, 0]
xX2 = gd.xk[:, 1]
points = np.array([xX1, xX2]).T.reshape(-1, 1, 2)
segments = np.concatenate(
    [points[:-1], points[1:]], axis=1
)
lc = LineCollection(
    segments, cmap=plt.get_cmap('Reds')
)
lc.set_array(np.linspace(0, 1, len(xX1)))  # color segs
lc.set_linewidth(3)
ax.autoscale(False)  # avoid the scatter changing lims
ax.add_collection(lc)
ax.scatter(
    xX1, xX2, zorder=1, marker='o',
    color=plt.cm.Reds(np.linspace(0, 1, len(xX1))),
    edgecolor='none'
)
plt.show()

[contour plot: f2 over x1 ∈ [-100, 100] and x2 ∈ [-50, 50], with the gradient descent trajectory overlaid]

Python code in this section was generated from a Jupyter notebook named gradient_descent.ipynb with a python3 kernel.



optlin Constrained linear optimization

Consider a linear objective function f: Rⁿ → R with variables x_i in vector x and coefficients c_i in vector c:

f(x) = c · x    (lin1)

subject to the linear constraints (restrictions on x_i)

Ax ≤ a    (lin2a)
Bx = b and    (lin2b)
l ≤ x ≤ u    (lin2c)

where A and B are constant matrices and a, b, l, u are n-vectors. This is one formulation of what is called a linear programming problem. Usually we want to maximize f over the constraints. Such problems frequently arise throughout engineering, for instance in manufacturing, transportation, operations, etc. They are called constrained because there are constraints on x; they are called linear because the objective function and the constraints are linear.

We call a pair (x, f(x)) for which x satisfies Equation lin2 a feasible solution. Of course, not every feasible solution is optimal: a feasible solution is optimal iff there exists no other feasible solution for which f is greater (assuming we're maximizing). We call the vector subspace of feasible solutions S ⊂ Rⁿ.

Feasible solutions form a polytope

Consider the effect of the constraints. Each of the equalities and inequalities defines a linear hyperplane in Rⁿ (i.e., a linear subspace



of dimension n - 1), either as a boundary of S (inequality) or as a restriction of S to the hyperplane. When joined, these hyperplanes are the boundary of S (equalities restrict S to lower dimension). So we see that each of the boundaries of S is flat, which makes S a polytope (in R², a polygon). What makes this especially interesting is that polytopes have vertices where the hyperplanes intersect. Solutions at the vertices are called basic feasible solutions.

Only the vertices matter

Our objective function f is linear, so for some constant h, f(x) = h defines a level set that is itself a hyperplane H in Rⁿ. If this hyperplane intersects S at a point x, (x, f(x) = h) is the corresponding solution. There are three possibilities when H intersects S:

1. H ∩ S is a vertex of S,
2. H ∩ S is a boundary hyperplane of S, or
3. H ∩ S slices through the interior of S.

However, this third option implies that there exists a level set G corresponding to f(x) = g such that G intersects S and g > h, so solutions on H ∩ S are not optimal. (We have not proven this, but it may be clear from our progression.) We conclude that either the first or second case must be true for optimal solutions. And notice that in both cases a (potentially optimal) solution occurs at at least one vertex. The key insight, then, is that an optimal solution occurs at a vertex of S.


This means we don't need to search all of S, or even its boundary: we need only search the vertices. Helpful as this is, it still leaves a number of potentially optimal solutions combinatorial in the number of constraints, usually still too many to search in a naïve way. In Lecture optsimplex this is mitigated by introducing a powerful searching method.



4. Another Python package, pulp (PuLP), is probably more popular for linear programming; however, we choose scipy.optimize because it has applications beyond linear programming.

optsimplex The simplex algorithm

The simplex algorithm (or "method") is an iterative technique for finding an optimal solution of the linear programming problem of Equations lin1 and lin2. The details of the algorithm are somewhat involved, but the basic idea is to start at a vertex of the feasible solution space S and traverse an edge of the polytope that leads to another vertex with a greater value of f. Then repeat this process until there is no neighboring vertex with a greater value of f, at which point the solution is guaranteed to be optimal.

Rather than present the details of the algorithm, we choose to show an example using Python. There have been some improvements on the original algorithm that have been implemented in many standard software packages, including Python's scipy package (Pauli Virtanen et al. "SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python". In: arXiv e-prints, arXiv:1907.10121 [July 2019]), module scipy.optimize⁴.

Example optsimplex-1 re simplex method using scipy.optimize

Problem Statement

Maximize the objective function

f(x) = c · x (simplex1a)

for x ∈ R² and

c = [5 2]ᵀ. (simplex1b)


subject to constraints

0 ≤ x1 ≤ 10 (simplex2a)
−5 ≤ x2 ≤ 15 (simplex2b)
4x1 + x2 ≤ 40 (simplex2c)
x1 + 3x2 ≤ 35 (simplex2d)
−8x1 − x2 ≥ −75. (simplex2e)

Problem Solution

First load some Python packages

from scipy.optimize import linprog
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Encoding the problem

Before we can use linprog, we must first encode Equations simplex1 and simplex2 into a form linprog will recognize. We begin with f, which we can write as c · x with the coefficients of c as follows:

c = [-5, -2]  # negative to find max

We've negated each constant because linprog minimizes f and we want to maximize f. Now let's encode the inequality constraints. We will write the left-hand-side coefficients in the matrix A and the right-hand-side values in vector a such that

Ax ≤ a. (simplex3)

Notice that one of our constraint inequalities is ≥ instead of ≤. We can flip this by multiplying the inequality by −1. We use simple lists to encode A and a:

A = [[4, 1],
     [1, 3],
     [8, 1]]
a = [40, 35, 75]

Now we need to define the lower l and upper u bounds of x. The function linprog expects these to be in a single list of lower- and upper-bounds for each xi:

lu = [(0, 10),
      (-5, 15)]

We want to keep track of each step linprog takes. We can access these by defining a function callback to be passed to linprog:

x = []  # for storing the steps
def callback(res):  # called at each step
    global x
    print(f'nit = {res.nit}, x_k = {res.x}')
    x.append(res.x.copy())  # store

Now we need to call linprog. We don't have any equality constraints, so we need only use the keyword arguments A_ub=A and b_ub=a. For demonstration purposes, we tell it to use the simplex method, which is not as good as its other methods, which use improved algorithms based on the simplex.

res = linprog(
    c, A_ub=A, b_ub=a, bounds=lu,
    method='simplex', callback=callback,
)
x = np.array(x)

nit = 0, x_k = [ 0. -5.]
nit = 1, x_k = [10. -5.]
nit = 2, x_k = [8.75 5.  ]
nit = 3, x_k = [7.72727273 9.09090909]
nit = 4, x_k = [7.72727273 9.09090909]
nit = 5, x_k = [7.72727273 9.09090909]
nit = 5, x_k = [7.72727273 9.09090909]

So the optimal solution (x, f(x)) is as follows:

print(f'optimum x: {res.x}')
print(f'optimum f(x): {-res.fun}')

optimum x: [7.72727273 9.09090909]
optimum f(x): 56.81818181818182

The last point was repeated:

1. once, because there was no adjacent vertex with greater f(x); and

2. twice, because the algorithm calls callback twice on the last step.

Plotting

When the solution space is in R², it is helpful to graphically represent the solution space, constraints, and the progression of the algorithm. We begin by defining the inequality lines from A and a over the bounds of x1:

n = len(c)  # number of variables x
m = np.shape(A)[0]  # number of inequality constraints
x2 = np.empty([m, 2])
for i in range(0, m):
    x2[i] = -A[i][0]/A[i][1]*np.array(lu[0]) + a[i]/A[i][1]

Now we plot a contour plot of f over the bounds of x1 and x2 and overlay the inequality constraints and the steps of the algorithm stored in x:

lu_array = np.array(lu)
fig, ax = plt.subplots()
mpl.rcParams['lines.linewidth'] = 3
# contour plot
X1 = np.linspace(*lu_array[0], 100)
X2 = np.linspace(*lu_array[1], 100)
X1, X2 = np.meshgrid(X1, X2)
F2 = -c[0]*X1 + -c[1]*X2  # negative because max hack
con = ax.contourf(X1, X2, F2)
cbar = fig.colorbar(con, ax=ax)
cbar.ax.set_ylabel('objective function')
# bounds on x
un = np.array([1, 1])
opts = {'c': 'w', 'label': None, 'linewidth': 6}
plt.plot(lu_array[0], lu_array[1][0]*un, **opts)
plt.plot(lu_array[0], lu_array[1][1]*un, **opts)
plt.plot(lu_array[0][0]*un, lu_array[1], **opts)
plt.plot(lu_array[0][1]*un, lu_array[1], **opts)
# inequality constraints
for i in range(0, m):
    p = plt.plot(lu[0], x2[i], c='w')
p[0].set_label('constraint')
# steps
plt.plot(x[:, 0], x[:, 1], '-o', c='r', clip_on=False,
         zorder=20, label='simplex')
plt.ylim(lu_array[1])
plt.xlabel('$x_1$')
plt.ylabel('$x_2$')
plt.legend()
plt.show()

(Figure: contour plot of the objective function over x1 and x2, with the constraint lines and the simplex steps overlaid; the color bar shows the objective function value.)

Python code in this section was generated from a Jupyter notebook named simplex_linear_programming.ipynb with a python3 kernel.


5. Barzilai and Borwein, "Two-Point Step Size Gradient Methods".

optex Exercises for Chapter 08

Exercise 08.1 (cummerbund)

Consider the functions (a) f1 : R² → R and (b) f2 : R² → R defined as

f1(x) = 4(x1 − 16)² + (x2 + 64)² + x1 sin² x1 (ex1)
f2(x) = (1/2) x · Ax − b · x (ex2)

where

A = [5 0; 0 15] and (ex3a)
b = [−2 1]ᵀ. (ex3b)

Use the method of Barzilai and Borwein,5 starting at some x0, to find a minimum of each function.

Exercise 08.2 (melty)

Maximize the objective function

f(x) = c · x (ex4a)

for x ∈ R³ and

c = [3 −8 1]ᵀ (ex4b)

subject to constraints

0 ≤ x1 ≤ 20 (ex5a)
−5 ≤ x2 ≤ 0 (ex5b)
5 ≤ x3 ≤ 17 (ex5c)
x1 + 4x2 ≤ 50 (ex5d)
2x1 + x3 ≤ 43 (ex5e)
−4x1 + x2 − 5x3 ≥ −99. (ex5f)

1. As is customary, we frequently say "system" when we mean "mathematical system model". Recall that multiple models may be used for any given physical system, depending on what one wants to know.

nlin Nonlinear analysis

1. The ubiquity of near-linear systems and the tools we have for analyses thereof can sometimes give the impression that nonlinear systems are exotic or even downright flamboyant. However, a great many systems1 important for a mechanical engineer are frequently hopelessly nonlinear. Here are some examples of such systems:

• A robot arm

• Viscous fluid flow (usually modelled by the Navier–Stokes equations)

• Anything that "fills up" or "saturates"

• Nonlinear optics

• Einstein's field equations (gravitation in general relativity)

• Heat radiation and nonlinear heat conduction

• Fracture mechanics


2. S.H. Strogatz and M. Dichter. Nonlinear Dynamics and Chaos. Second. Studies in Nonlinearity. Avalon Publishing, 2016. isbn: 9780813350844.

3. W. Richard Kolk and Robert A. Lerman. Nonlinear System Dynamics. 1st ed. Springer US, 1993. isbn: 978-1-4684-6496-2.

2. Lest we think this is merely an inconvenience, we should keep in mind that it is actually the nonlinearity that makes many phenomena useful. For instance, the laser depends on the nonlinearity of its optics. Similarly, transistors and the digital circuits made thereby (including the microprocessor) wouldn't function if their physics were linear.

3. In this chapter, we will see some ways to formulate, characterize, and simulate nonlinear systems. Purely analytical solution techniques are few for nonlinear systems. Most are beyond the scope of this text, but we describe a few, mostly in Lecture nlinchar. Simulation via numerical integration of nonlinear dynamical equations is the most accessible technique, so it is introduced.

4. We skip a discussion of linearization; of course, if this is an option, it is preferable. Instead, we focus on analyzing nonlinear systems directly.

5. For a good introduction to nonlinear dynamics, see Strogatz and Dichter.2 A more engineer-oriented introduction is Kolk and Lerman.3

nlin Nonlinear analysis ss Nonlinear state-space models p 1

nonlinear stateshyspace models

autonomous system

nonautonomous system

4 Strogatz and Dichter Nonlinear Dynamics and Chaos

equilibrium state

stationary point

nlinss Nonlinear stateshyspace

models

1. A state-space model has the general form

dx/dt = f(x, u, t) (ss1a)
y = g(x, u, t) (ss1b)

where f and g are vector-valued functions that depend on the system. Nonlinear state-space models are those for which f is a nonlinear function of either x or u. For instance, a state variable x1 might appear as x1², or two state variables might combine as x1 x2, or an input u1 might enter the equations as log u1.

Autonomous and nonautonomous systems

2. An autonomous system is one for which dx/dt = f(x), with neither time nor input appearing explicitly. A nonautonomous system is one for which either t or u do appear explicitly in f. It turns out that we can always write nonautonomous systems as autonomous by substituting in u(t) and introducing an extra state variable for t.4

3. Therefore, without loss of generality, we will focus on ways of analyzing autonomous systems.
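This substitution can be sketched concretely. The scalar system below (x′ = −x + cos t) is a hypothetical example, not one from the text; the autonomous form appends a state that carries the time:

```python
import numpy as np

# hypothetical nonautonomous scalar system: x' = -x + cos(t)
def f_nonauto(t, x):
    return -x + np.cos(t)

# autonomous version: augment the state with tau = t, so tau' = 1
def f_auto(z):
    x, tau = z
    return np.array([-x + np.cos(tau), 1.0])

# the two agree when the augmented state carries the time
z = np.array([0.5, 2.0])
print(f_auto(z)[0] == f_nonauto(2.0, 0.5))  # True
```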

Equilibrium

4. An equilibrium state (also called a stationary point) x̄ is one for which dx/dt = 0. In most cases, this occurs only when the input u is a constant ū and, for time-varying systems, at a given time t. For autonomous systems, equilibrium occurs when the following holds:

f(x̄) = 0. (ss2)

This is a system of nonlinear algebraic equations, which can be challenging to solve for x̄. However, frequently several solutions (that is, equilibrium states) do exist.
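Numerically, such equilibrium equations can be attacked with a root-finder. A minimal sketch, assuming a hypothetical damped-pendulum system (not one from the text) and using scipy.optimize.fsolve from several initial guesses:

```python
import numpy as np
from scipy.optimize import fsolve

# hypothetical autonomous system: a damped pendulum,
#   x1' = x2,  x2' = -sin(x1) - 0.5 x2
def f(x):
    return [x[1], -np.sin(x[0]) - 0.5 * x[1]]

# solve f(x) = 0 from several initial guesses to find equilibria
guesses = [[0.5, 0.0], [3.0, 0.0]]
equilibria = [fsolve(f, g) for g in guesses]
print(equilibria)  # approximately [0, 0] and [pi, 0]
```

Each guess converges to a nearby equilibrium, so distinct guesses are needed to find distinct equilibria.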

nlin Nonlinear analysis char Nonlinear system characteristics p 1

system order

nlinchar Nonlinear system characteristics

1. Characterizing nonlinear systems can be challenging without the tools developed for linear system characterization. However, there are ways of characterizing nonlinear systems, and we'll here explore a few. (See Equation ss1a.)

Those in-common with linear systems

2. As with linear systems, the system order is either the number of state variables required to describe the system or, equivalently, the highest-order derivative in a single scalar differential equation describing the system.

3. Similarly, nonlinear systems can have state variables that depend on time alone or those that also depend on space (or some other independent variable). The former lead to ordinary differential equations (ODEs) and the latter to partial differential equations (PDEs).

4. Equilibrium was already considered in Lecture nlinss.

Stability

5. In terms of system performance, perhaps no other criterion is as important as stability.

5. William L. Brogan. Modern Control Theory. Third. Prentice Hall, 1991, Ch. 10.

6. A. Choukchou-Braham et al. Analysis and Control of Underactuated Mechanical Systems. SpringerLink Bücher. Springer International Publishing, 2013. isbn: 9783319026367, App. A.

Definition nlinchar1: Stability

If x is perturbed from an equilibrium state x̄, the response x(t) can

1. asymptotically return to x̄ (asymptotically stable),

2. diverge from x̄ (unstable), or

3. remain perturbed or oscillate about x̄ with a constant amplitude (marginally stable).

Notice that this definition is actually local: stability in the neighborhood of one equilibrium may not be the same as in the neighborhood of another.

6. Other than nonlinear systems' lack of linear systems' eigenvalues, poles, and roots of the characteristic equation from which to compute it, the primary difference between the stability of linear and nonlinear systems is that nonlinear system stability is often difficult to establish. Using a linear system's eigenvalues, it is straightforward to establish stable, unstable, and marginally stable subspaces of state-space (via transforming to an eigenvector basis). For nonlinear systems, no such method exists. However, we are not without tools to explore nonlinear system stability. One mathematical tool to consider is Lyapunov stability theory, which is beyond the scope of this course but has good treatments in5 and6.

Qualities of equilibria

7. Equilibria (i.e., stationary points) come in a variety of qualities. It is instructive to consider the first-order differential equation in state variable x with real constant r:

x′ = rx − x³. (char1)

If we plot x′ versus x for different values of r, we obtain the plots of Figure 09.1.

(Figure 09.1: plots of x′ versus x for Equation char1: (a) r < 0, (b) r = 0, (c) r > 0.)

8. By definition, equilibria occur when x′ = 0, so the x-axis crossings of Figure 09.1 are equilibria. The blue arrows on the x-axis show the direction of state change x′ quantified by the plots. For both (a) and (b), only one equilibrium exists: x̄ = 0. Note that the blue arrows in both plots point toward the equilibrium. In such cases, that is, when a neighborhood exists around an equilibrium for which state changes point toward the equilibrium, the equilibrium is called an attractor or sink. Note that attractors are stable.

9. Now consider (c) of Figure 09.1. When r > 0, three equilibria emerge. This change of the number of equilibria with the changing of a parameter is called a bifurcation.


A plot of bifurcations versus the parameter is called a bifurcation diagram. The x̄ = 0 equilibrium now has arrows that point away from it. Such an equilibrium is called a repeller or source and is unstable. The other two equilibria here are (stable) attractors. Consider a very small initial condition x(0) = ε. If ε > 0, the repeller pushes away x and the positive attractor pulls x to itself. Conversely, if ε < 0, the repeller again pushes away x and the negative attractor pulls x to itself.
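This classification can be checked by the sign of dx′/dx at each equilibrium: a positive slope means nearby states are pushed away (repeller), a negative slope means they are pulled in (attractor). A minimal sketch for Equation char1 with r > 0 (the value r = 1 is an assumption for illustration):

```python
import numpy as np

r = 1.0  # r > 0: three equilibria for x' = r x - x^3
# equilibria: x = 0 and x = +/- sqrt(r)
eq = np.array([0.0, np.sqrt(r), -np.sqrt(r)])
# slope of f(x) = r x - x^3 is f'(x) = r - 3 x^2;
# f'(x*) > 0 => repeller, f'(x*) < 0 => attractor
fprime = r - 3 * eq**2
labels = ["repeller" if d > 0 else "attractor" for d in fprime]
print(list(zip(eq, labels)))
```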

10. Another type of equilibrium is called the saddle, one which acts as an attractor along some lines and as a repeller along others. We will see this type in the following example.

Example nlinchar-1 re Saddle bifurcation

Problem Statement

Consider the dynamical equation

x′ = x² + r (char2)

with r a real constant. Sketch x′ vs. x for negative, zero, and positive r. Identify and classify each of the equilibria.

nlinsim Nonlinear system simulation

nlinpysim Simulating nonlinear systems in Python

Example nlinpysim-1 re a nonlinear unicycle

Problem Statementasdf

Problem Solution

First load some Python packages

import numpy as np
import sympy as sp
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

The state equation can be encoded via the following function f:

def f(t, x, u, c):
    dxdt = [
        x[3]*np.cos(x[2]),
        x[3]*np.sin(x[2]),
        x[4],
        1/c[0] * u(t)[0],
        1/c[1] * u(t)[1],
    ]
    return dxdt

The input function u must also be defined:

def u(t):
    return [
        1.5*(1 + np.cos(t)),
        2.5*np.sin(3*t),
    ]

# Define time spans, initial values, and constants
tspan = np.linspace(0, 50, 300)
xinit = [0, 0, 0, 0, 0]
mass = 10
inertia = 10
c = [mass, inertia]

# Solve differential equation
sol = solve_ivp(
    lambda t, x: f(t, x, u, c),
    [tspan[0], tspan[-1]],
    xinit,
    t_eval=tspan,
)

Let's first plot the trajectory and instantaneous velocity:

xp = sol.y[3]*np.cos(sol.y[2])
yp = sol.y[3]*np.sin(sol.y[2])
p = plt.figure()
plt.plot(sol.y[0], sol.y[1])
plt.quiver(sol.y[0], sol.y[1], xp, yp)
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.show()

(Figure 09.2: the trajectory and instantaneous velocity of the unicycle in the x–y plane.)

Python code in this section was generated from a Jupyter notebook named horcrux.ipynb with a python3 kernel.


nlinex Exercises for Chapter 09

A Distribution tables


A01 Gaussian distribution table

Below are plots of the Gaussian probability density function f and cumulative distribution function Φ. Below them is Table A1 of CDF values.

(Figure A1: (a) the Gaussian PDF f(z); (b) the Gaussian CDF Φ(zb), plotted against the interval bound zb.)

Table A1: z-score table, Φ(zb) = P(z ∈ (−∞, zb])

zb .00 .01 .02 .03 .04 .05 .06 .07 .08 .09

−3.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
−3.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
−3.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
−3.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
−3.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
−2.9 0.0019 0.0018 0.0018 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
−2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
−2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
−2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
−2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
−2.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
−2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
−2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
−2.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
−2.0 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
−1.9 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
−1.8 0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
−1.7 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
−1.6 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455


−1.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
−1.4 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
−1.3 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
−1.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
−1.1 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
−1.0 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
−0.9 0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
−0.8 0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
−0.7 0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
−0.6 0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
−0.5 0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
−0.4 0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121
−0.3 0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.3483
−0.2 0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859
−0.1 0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
−0.0 0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
3.1 0.9990 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993
3.2 0.9993 0.9993 0.9994 0.9994 0.9994 0.9994 0.9994 0.9995 0.9995 0.9995
3.3 0.9995 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
3.4 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998
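Values like those in Table A1 can be reproduced programmatically; a minimal spot-check using scipy.stats.norm (an independent check, not how the table was generated here):

```python
from scipy.stats import norm

# Phi(1.00): row 1.0, column .00 of the z-score table
print(round(norm.cdf(1.00), 4))   # 0.8413
# Phi(-1.96): row -1.9, column .06
print(round(norm.cdf(-1.96), 4))  # 0.025 (table: 0.0250)
```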


A02 Student's t-distribution table

Table A2: two-tail inverse Student's t-distribution table (percent probability)

ν 60.0 66.7 75.0 80.0 87.5 90.0 95.0 97.5 99.0 99.5 99.9

1 0.325 0.577 1.000 1.376 2.414 3.078 6.314 12.706 31.821 63.657 318.31
2 0.289 0.500 0.816 1.061 1.604 1.886 2.920 4.303 6.965 9.925 22.327
3 0.277 0.476 0.765 0.978 1.423 1.638 2.353 3.182 4.541 5.841 10.215
4 0.271 0.464 0.741 0.941 1.344 1.533 2.132 2.776 3.747 4.604 7.173
5 0.267 0.457 0.727 0.920 1.301 1.476 2.015 2.571 3.365 4.032 5.893
6 0.265 0.453 0.718 0.906 1.273 1.440 1.943 2.447 3.143 3.707 5.208
7 0.263 0.449 0.711 0.896 1.254 1.415 1.895 2.365 2.998 3.499 4.785
8 0.262 0.447 0.706 0.889 1.240 1.397 1.860 2.306 2.896 3.355 4.501
9 0.261 0.445 0.703 0.883 1.230 1.383 1.833 2.262 2.821 3.250 4.297
10 0.260 0.444 0.700 0.879 1.221 1.372 1.812 2.228 2.764 3.169 4.144
11 0.260 0.443 0.697 0.876 1.214 1.363 1.796 2.201 2.718 3.106 4.025
12 0.259 0.442 0.695 0.873 1.209 1.356 1.782 2.179 2.681 3.055 3.930
13 0.259 0.441 0.694 0.870 1.204 1.350 1.771 2.160 2.650 3.012 3.852
14 0.258 0.440 0.692 0.868 1.200 1.345 1.761 2.145 2.624 2.977 3.787
15 0.258 0.439 0.691 0.866 1.197 1.341 1.753 2.131 2.602 2.947 3.733
16 0.258 0.439 0.690 0.865 1.194 1.337 1.746 2.120 2.583 2.921 3.686
17 0.257 0.438 0.689 0.863 1.191 1.333 1.740 2.110 2.567 2.898 3.646
18 0.257 0.438 0.688 0.862 1.189 1.330 1.734 2.101 2.552 2.878 3.610
19 0.257 0.438 0.688 0.861 1.187 1.328 1.729 2.093 2.539 2.861 3.579
20 0.257 0.437 0.687 0.860 1.185 1.325 1.725 2.086 2.528 2.845 3.552
21 0.257 0.437 0.686 0.859 1.183 1.323 1.721 2.080 2.518 2.831 3.527
22 0.256 0.437 0.686 0.858 1.182 1.321 1.717 2.074 2.508 2.819 3.505
23 0.256 0.436 0.685 0.858 1.180 1.319 1.714 2.069 2.500 2.807 3.485
24 0.256 0.436 0.685 0.857 1.179 1.318 1.711 2.064 2.492 2.797 3.467
25 0.256 0.436 0.684 0.856 1.178 1.316 1.708 2.060 2.485 2.787 3.450
26 0.256 0.436 0.684 0.856 1.177 1.315 1.706 2.056 2.479 2.779 3.435
27 0.256 0.435 0.684 0.855 1.176 1.314 1.703 2.052 2.473 2.771 3.421
28 0.256 0.435 0.683 0.855 1.175 1.313 1.701 2.048 2.467 2.763 3.408
29 0.256 0.435 0.683 0.854 1.174 1.311 1.699 2.045 2.462 2.756 3.396
30 0.256 0.435 0.683 0.854 1.173 1.310 1.697 2.042 2.457 2.750 3.385
35 0.255 0.434 0.682 0.852 1.170 1.306 1.690 2.030 2.438 2.724 3.340
40 0.255 0.434 0.681 0.851 1.167 1.303 1.684 2.021 2.423 2.704 3.307
45 0.255 0.434 0.680 0.850 1.165 1.301 1.679 2.014 2.412 2.690 3.281
50 0.255 0.433 0.679 0.849 1.164 1.299 1.676 2.009 2.403 2.678 3.261
55 0.255 0.433 0.679 0.848 1.163 1.297 1.673 2.004 2.396 2.668 3.245
60 0.254 0.433 0.679 0.848 1.162 1.296 1.671 2.000 2.390 2.660 3.232
∞ 0.253 0.431 0.674 0.842 1.150 1.282 1.645 1.960 2.326 2.576 3.090
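Entries of Table A2 can likewise be spot-checked with scipy.stats.t: for a two-tail probability p, the critical value is the quantile at 1 − (1 − p)/2. (An independent check, not how the table was generated here.)

```python
from scipy.stats import t

# two-tail 95% with nu = 5: P(|T| <= t*) = 0.95 => quantile 0.975
tstar = t.ppf(0.975, 5)
print(round(tstar, 3))  # 2.571, matching the table
```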

B Fourier and Laplace tables


B01 Laplace transforms

Table B1 is a table with functions of time f(t) on the left and corresponding Laplace transforms F(s) on the right. Where applicable, s = σ + jω is the Laplace transform variable, T is the time-domain period, ω0 = 2π/T is the corresponding angular frequency, j = √−1, a ∈ R+, and b, t0 ∈ R are constants.

Table B1: Laplace transform identities

function of time t ↔ function of Laplace s

a1 f1(t) + a2 f2(t) ↔ a1 F1(s) + a2 F2(s)
f(t − t0) ↔ F(s) e^{−t0 s}
f′(t) ↔ s F(s) − f(0)
dⁿf(t)/dtⁿ ↔ sⁿ F(s) − s^{n−1} f(0) − s^{n−2} f′(0) − · · · − f^{(n−1)}(0)
∫₀ᵗ f(τ) dτ ↔ F(s)/s
t f(t) ↔ −F′(s)
f1(t) ∗ f2(t) = ∫_{−∞}^{∞} f1(τ) f2(t − τ) dτ ↔ F1(s) F2(s)
δ(t) ↔ 1
us(t) ↔ 1/s
ur(t) ↔ 1/s²
t^{n−1}/(n − 1)! ↔ 1/sⁿ
e^{−at} ↔ 1/(s + a)
t e^{−at} ↔ 1/(s + a)²
t^{n−1} e^{−at}/(n − 1)! ↔ 1/(s + a)ⁿ
(e^{at} − e^{bt})/(a − b) ↔ 1/((s − a)(s − b)), a ≠ b
(a e^{at} − b e^{bt})/(a − b) ↔ s/((s − a)(s − b)), a ≠ b
sin ωt ↔ ω/(s² + ω²)
cos ωt ↔ s/(s² + ω²)
e^{at} sin ωt ↔ ω/((s − a)² + ω²)
e^{at} cos ωt ↔ (s − a)/((s − a)² + ω²)
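Entries in Table B1 can be spot-checked symbolically; a minimal sketch with sympy (an independent check, not part of the text):

```python
import sympy as sp

t, s, a = sp.symbols('t s a', positive=True)
# verify the table entry:  t e^{-a t}  <->  1/(s + a)^2
F = sp.laplace_transform(t * sp.exp(-a * t), t, s, noconds=True)
print(sp.simplify(F))
```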


B02 Fourier transforms

Table B2 is a table with functions of time f(t) on the left and corresponding Fourier transforms F(ω) on the right. Where applicable, T is the time-domain period, ω0 = 2π/T is the corresponding angular frequency, j = √−1, a ∈ R+, and b, t0 ∈ R are constants. Furthermore, fe and fo are even and odd functions of time, respectively, and it can be shown that any function f can be written as the sum f(t) = fe(t) + fo(t) (Hsu 1967).

Table B2: Fourier transform identities

function of time t ↔ function of frequency ω

a1 f1(t) + a2 f2(t) ↔ a1 F1(ω) + a2 F2(ω)
f(at) ↔ (1/|a|) F(ω/a)
f(−t) ↔ F(−ω)
f(t − t0) ↔ F(ω) e^{−jωt0}
f(t) cos ω0t ↔ (1/2) F(ω − ω0) + (1/2) F(ω + ω0)
f(t) sin ω0t ↔ (1/(j2)) F(ω − ω0) − (1/(j2)) F(ω + ω0)
fe(t) ↔ Re F(ω)
fo(t) ↔ j Im F(ω)
F(t) ↔ 2π f(−ω)
f′(t) ↔ jω F(ω)
dⁿf(t)/dtⁿ ↔ (jω)ⁿ F(ω)
∫_{−∞}ᵗ f(τ) dτ ↔ F(ω)/(jω) + π F(0) δ(ω)
−jt f(t) ↔ F′(ω)
(−jt)ⁿ f(t) ↔ dⁿF(ω)/dωⁿ
f1(t) ∗ f2(t) = ∫_{−∞}^{∞} f1(τ) f2(t − τ) dτ ↔ F1(ω) F2(ω)
f1(t) f2(t) ↔ (1/(2π)) F1(ω) ∗ F2(ω) = (1/(2π)) ∫_{−∞}^{∞} F1(α) F2(ω − α) dα
e^{−at} us(t) ↔ 1/(jω + a)
e^{−a|t|} ↔ 2a/(a² + ω²)
e^{−at²} ↔ √(π/a) e^{−ω²/(4a)}
1 for |t| < a/2, else 0 ↔ a sin(aω/2)/(aω/2)
t e^{−at} us(t) ↔ 1/(a + jω)²
(t^{n−1}/(n − 1)!) e^{−at} us(t) ↔ 1/(a + jω)ⁿ
1/(a² + t²) ↔ (π/a) e^{−a|ω|}
δ(t) ↔ 1
δ(t − t0) ↔ e^{−jωt0}
us(t) ↔ π δ(ω) + 1/(jω)
us(t − t0) ↔ π δ(ω) + (1/(jω)) e^{−jωt0}
1 ↔ 2π δ(ω)
t ↔ 2πj δ′(ω)
tⁿ ↔ 2π jⁿ dⁿδ(ω)/dωⁿ
e^{jω0t} ↔ 2π δ(ω − ω0)
cos ω0t ↔ π δ(ω − ω0) + π δ(ω + ω0)
sin ω0t ↔ −jπ δ(ω − ω0) + jπ δ(ω + ω0)
us(t) cos ω0t ↔ jω/(ω0² − ω²) + (π/2) δ(ω − ω0) + (π/2) δ(ω + ω0)
us(t) sin ω0t ↔ ω0/(ω0² − ω²) + (π/(2j)) δ(ω − ω0) − (π/(2j)) δ(ω + ω0)
t us(t) ↔ jπ δ′(ω) − 1/ω²
1/t ↔ πj − 2πj us(ω)
1/tⁿ ↔ ((−jω)^{n−1}/(n − 1)!) (πj − 2πj us(ω))
sgn t ↔ 2/(jω)
∑_{n=−∞}^{∞} δ(t − nT) ↔ ω0 ∑_{n=−∞}^{∞} δ(ω − nω0)

C Mathematics reference


C01 Quadratic forms

The solution to equations of the form ax² + bx + c = 0 is

x = (−b ± √(b² − 4ac))/(2a). (C011)

Completing the square

This is accomplished by re-writing the quadratic formula in the form of the left-hand-side (LHS) of this equality, which describes factorization:

x² + 2xh + h² = (x + h)². (C012)
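A quick numerical check of Equation C011 (the coefficients below are arbitrary, chosen for illustration):

```python
import numpy as np

a, b, c = 2.0, -4.0, -6.0  # arbitrary example: 2x^2 - 4x - 6 = 0
disc = np.sqrt(b**2 - 4*a*c)
roots = [(-b + disc) / (2*a), (-b - disc) / (2*a)]
print(roots)  # [3.0, -1.0]
```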


C02 Trigonometry

Triangle identities

With reference to the below figure, the law of sines is

sin α / a = sin β / b = sin γ / c (C021)

and the law of cosines is

c² = a² + b² − 2ab cos γ (C022a)
b² = a² + c² − 2ac cos β (C022b)
a² = c² + b² − 2cb cos α. (C022c)

(Figure: a triangle with sides a, b, c opposite angles α, β, γ.)

Reciprocal identities

csc u = 1/sin u (C023a)
sec u = 1/cos u (C023b)
cot u = 1/tan u (C023c)

Pythagorean identities

1 = sin² u + cos² u (C024a)
sec² u = 1 + tan² u (C024b)
csc² u = 1 + cot² u (C024c)


Co-function identities

sin(π/2 − u) = cos u (C025a)
cos(π/2 − u) = sin u (C025b)
tan(π/2 − u) = cot u (C025c)
csc(π/2 − u) = sec u (C025d)
sec(π/2 − u) = csc u (C025e)
cot(π/2 − u) = tan u (C025f)

Even-odd identities

sin(−u) = −sin u (C026a)
cos(−u) = cos u (C026b)
tan(−u) = −tan u (C026c)

Sum-difference formulas (AM or lock-in)

sin(u ± v) = sin u cos v ± cos u sin v (C027a)
cos(u ± v) = cos u cos v ∓ sin u sin v (C027b)
tan(u ± v) = (tan u ± tan v)/(1 ∓ tan u tan v) (C027c)

Double angle formulas

sin(2u) = 2 sin u cos u (C028a)
cos(2u) = cos² u − sin² u (C028b)
= 2 cos² u − 1 (C028c)
= 1 − 2 sin² u (C028d)
tan(2u) = 2 tan u/(1 − tan² u) (C028e)

Power-reducing or half-angle formulas

sin² u = (1 − cos(2u))/2 (C029a)
cos² u = (1 + cos(2u))/2 (C029b)
tan² u = (1 − cos(2u))/(1 + cos(2u)) (C029c)


Sum-to-product formulas

sin u + sin v = 2 sin((u + v)/2) cos((u − v)/2) (C0210a)
sin u − sin v = 2 cos((u + v)/2) sin((u − v)/2) (C0210b)
cos u + cos v = 2 cos((u + v)/2) cos((u − v)/2) (C0210c)
cos u − cos v = −2 sin((u + v)/2) sin((u − v)/2) (C0210d)

Product-to-sum formulas

sin u sin v = (1/2)[cos(u − v) − cos(u + v)] (C0211a)
cos u cos v = (1/2)[cos(u − v) + cos(u + v)] (C0211b)
sin u cos v = (1/2)[sin(u + v) + sin(u − v)] (C0211c)
cos u sin v = (1/2)[sin(u + v) − sin(u − v)] (C0211d)

Two-to-one formulas

A sin u + B cos u = C sin(u + φ) (C0212a)
= C cos(u + ψ), where (C0212b)
C = √(A² + B²) (C0212c)
φ = arctan(B/A) (C0212d)
ψ = −arctan(A/B) (C0212e)
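The two-to-one formulas are easy to verify numerically; a minimal sketch with arbitrary A and B (np.arctan2(B, A) agrees with arctan(B/A) for A > 0):

```python
import numpy as np

A, B = 3.0, 4.0               # arbitrary amplitudes
C = np.hypot(A, B)            # sqrt(A^2 + B^2) = 5.0
phi = np.arctan2(B, A)        # arctan(B/A) for A > 0
u = np.linspace(0, 2*np.pi, 100)
# C sin(u + phi) reproduces A sin u + B cos u at every angle
print(np.allclose(A*np.sin(u) + B*np.cos(u), C*np.sin(u + phi)))  # True
```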


C03 Matrix inverses

This is a guide to inverting 1 × 1, 2 × 2, and n × n matrices.

Let A be the 1 × 1 matrix A = [a]. The inverse is simply the reciprocal:

A⁻¹ = [1/a].

Let B be the 2 × 2 matrix

B = [b11 b12; b21 b22].

It can be shown that the inverse follows a simple pattern:

B⁻¹ = (1/det B) [b22 −b12; −b21 b11]
    = (1/(b11 b22 − b12 b21)) [b22 −b12; −b21 b11].

Let C be an n × n matrix. It can be shown that its inverse is

C⁻¹ = (1/det C) adj C,

where adj is the adjoint of C.
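The 2 × 2 pattern can be checked against numpy.linalg.inv; the matrix below is an arbitrary invertible example:

```python
import numpy as np

B = np.array([[1.0, 2.0],
              [3.0, 4.0]])  # arbitrary invertible example
det = B[0, 0]*B[1, 1] - B[0, 1]*B[1, 0]
# swap the diagonal, negate the off-diagonal, divide by the determinant
Binv = np.array([[ B[1, 1], -B[0, 1]],
                 [-B[1, 0],  B[0, 0]]]) / det
print(np.allclose(Binv, np.linalg.inv(B)))  # True
```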


C04 Laplace transforms

The definitions of the one-sided Laplace and inverse Laplace transforms follow.

Definition C041: Laplace transforms (one-sided)

Laplace transform L:

L(y(t)) = Y(s) = ∫₀^∞ y(t) e^{−st} dt. (C041)

Inverse Laplace transform L⁻¹:

L⁻¹(Y(s)) = y(t) = (1/(2πj)) ∫_{σ−j∞}^{σ+j∞} Y(s) e^{st} ds. (C042)

See Table B1 for a list of properties and common transforms.

D Complex analysis

D01 Euler's formulas

Euler's formula is our bridge back and forth between trigonometric forms (cos θ and sin θ) and complex exponential form (e^{jθ}):

e^{jθ} = cos θ + j sin θ. (D011)

Here are a few useful identities implied by Euler's formula:

e^{−jθ} = cos θ − j sin θ (D012a)
cos θ = Re(e^{jθ}) (D012b)
= (1/2)(e^{jθ} + e^{−jθ}) (D012c)
sin θ = Im(e^{jθ}) (D012d)
= (1/(j2))(e^{jθ} − e^{−jθ}). (D012e)
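A one-line numerical check of Euler's formula (the angle is arbitrary):

```python
import numpy as np

theta = 0.7  # arbitrary angle
lhs = np.exp(1j * theta)
rhs = np.cos(theta) + 1j * np.sin(theta)
print(np.isclose(lhs, rhs))  # True
```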

Bibliography

[1] Robert B Ash Basic Probability Theory

Dover Publications Inc 2008

[2] Joan Bagaria Set Theory In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2019

Metaphysics Research Lab Stanford

University 2019

[3] Maria Baghramian and J Adam

Carter Relativism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Winter 2019

Metaphysics Research Lab Stanford

University 2019

[4] Stephen Barker and Mark Jago Being

Positive About Negative Facts In

Philosophy and Phenomenological

Research 851 (2012) pp 117ndash138 doi 10

1111j1933-1592201000479x

[5] Jonathan Barzilai and Jonathan M

Borwein Two-Point Step Size Gradient

Methods In IMA Journal of Numerical

Analysis 81 (Jan 1988) pp 141ndash148 issn

0272-4979 doi 101093imanum81141

This includes an innovative line search

method

D Complex analysis Bibliography p 3

1 FBADL04

[6] Anat Biletzki and Anat Matar Ludwig

Wittgenstein In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Summer 2018

Metaphysics Research Lab Stanford

University 2018 An introduction to

Wittgenstein and his thought

[7] Antonio Bove F (Ferruccio) Colombini

and Daniele Del Santo Phase space

analysis of partial differential

equations eng Progress in nonlinear

differential equations and their

applications v 69 Boston Berlin

Birkhäuser 2006 isbn 9780817645212

[8] William L Brogan Modern Control Theory

Third Prentice Hall 1991

[9] Francesco Bullo and Andrew D Lewis

Geometric control of mechanical systems

modeling analysis and design for simple

mechanical control systems Ed by JE

Marsden L Sirovich and M Golubitsky

Springer 2005

[10] Francesco Bullo and Andrew D Lewis

Supplementary Chapters for Geometric

Control of Mechanical Systems1 Jan

2005

[11] A Choukchou-Braham et al Analysis and

Control of Underactuated Mechanical

Systems SpringerLink Bücher Springer

International Publishing 2013 isbn

9783319026367

[12] K Ciesielski Set Theory for the

Working Mathematician London

Mathematical Society Student Texts

Cambridge University Press 1997 isbn

9780521594653 A readable introduction

to set theory


[13] Marian David The Correspondence

Theory of Truth In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Fall 2016 Metaphysics

Research Lab Stanford University

2016 A detailed overview of the

correspondence theory of truth

[14] David Dolby Wittgenstein on Truth

In A Companion to Wittgenstein John

Wiley & Sons Ltd 2016 Chap 27 pp 433–442 isbn 9781118884607 doi 10.1002/9781118884607.ch27

[15] Peter J Eccles An Introduction to

Mathematical Reasoning Numbers

Sets and Functions Cambridge

University Press 1997 doi 10.1017/CBO9780511801136 A gentle

introduction to mathematical reasoning

It includes introductory treatments of

set theory and number systems

[16] HB Enderton Elements of Set

Theory Elsevier Science 1977 isbn

9780080570426 A gentle introduction to

set theory and mathematical reasoning—a

great place to start

[17] Richard P Feynman Robert B Leighton

and Matthew Sands The Feynman

Lectures on Physics New Millennium

Perseus Basic Books 2010

[18] Michael Glanzberg Truth In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2018

Metaphysics Research Lab Stanford

University 2018

[19] Hans Johann Glock Truth in the

Tractatus In Synthese 1482 (Jan

2006) pp 345ndash368 issn 1573-0964 doi

101007s11229-004-6226-2


[20] Mario Gómez-Torrente Alfred Tarski

In The Stanford Encyclopedia of

Philosophy Ed by Edward N Zalta

Spring 2019 Metaphysics Research Lab

Stanford University 2019

[21] Paul Guyer and Rolf-Peter Horstmann

Idealism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Winter 2018

Metaphysics Research Lab Stanford

University 2018

[22] R Haberman Applied Partial Differential

Equations with Fourier Series and

Boundary Value Problems (Classic

Version) Pearson Modern Classics for

Advanced Mathematics Pearson Education

Canada 2018 isbn 9780134995434

[23] GWF Hegel and AV Miller

Phenomenology of Spirit Motilal

Banarsidass 1998 isbn 9788120814738

[24] Wilfrid Hodges Model Theory In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2018

Metaphysics Research Lab Stanford

University 2018

[25] Wilfrid Hodges Tarskirsquos Truth

Definitions In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Fall 2018 Metaphysics

Research Lab Stanford University 2018

[26] Peter Hylton and Gary Kemp Willard

Van Orman Quine In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Spring 2019 Metaphysics

Research Lab Stanford University 2019

[27] ET Jaynes et al Probability Theory The

Logic of Science Cambridge University

Press 2003 isbn 9780521592710


An excellent and comprehensive

introduction to probability theory

[28] I Kant P Guyer and AW Wood Critique

of Pure Reason The Cambridge Edition

of the Works of Immanuel Kant

Cambridge University Press 1999 isbn

9781107268333

[29] Juliette Kennedy Kurt Gödel In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Winter 2018

Metaphysics Research Lab Stanford

University 2018

[30] Drew Khlentzos Challenges to

Metaphysical Realism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Winter 2016

Metaphysics Research Lab Stanford

University 2016

[31] Peter Klein Skepticism In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Summer 2015

Metaphysics Research Lab Stanford

University 2015

[32] M Kline Mathematics The Loss

of Certainty A Galaxy book

Oxford University Press 1982 isbn

9780195030853 A detailed account

of the “illogical” development of

mathematics and an exposition of

its therefore remarkable utility in

describing the world

[33] W Richard Kolk and Robert A Lerman

Nonlinear System Dynamics 1st ed

Springer US 1993 isbn 978-1-4684-6496-2

[34] Erwin Kreyszig Advanced Engineering

Mathematics 10th John Wiley amp Sons

Limited 2011 isbn 9781119571094 The

authoritative resource for engineering

mathematics It includes detailed


accounts of probability statistics

vector calculus linear algebra

fourier analysis ordinary and partial

differential equations and complex

analysis It also includes several other

topics with varying degrees of depth

Overall it is the best place to start

when seeking mathematical guidance

[35] John M Lee Introduction to Smooth

Manifolds second Vol 218 Graduate

Texts in Mathematics Springer 2012

[36] Catherine Legg and Christopher

Hookway Pragmatism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Spring 2019 Metaphysics

Research Lab Stanford University

2019 An introductory article on the

philosophical movement “pragmatism” It

includes an important clarification of

the pragmatic slogan “truth is the end of inquiry”

[37] Daniel Liberzon Calculus of Variations

and Optimal Control Theory A Concise

Introduction Princeton University Press

2012 isbn 9780691151878

[38] Panu Raatikainen Goumldelrsquos Incompleteness

Theorems In The Stanford Encyclopedia

of Philosophy Ed by Edward N Zalta

Fall 2018 Metaphysics Research Lab

Stanford University 2018 A thorough

and contemporary description of

Gödel’s incompleteness theorems which

have significant implications for the

foundations and function of mathematics

and mathematical truth

[39] Paul Redding Georg Wilhelm Friedrich

Hegel In The Stanford Encyclopedia

of Philosophy Ed by Edward N Zalta

Summer 2018 Metaphysics Research Lab

Stanford University 2018


[40] HM Schey Div Grad Curl and All that An

Informal Text on Vector Calculus WW

Norton 2005 isbn 9780393925166

[41] Christopher Shields Aristotle In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Winter 2016

Metaphysics Research Lab Stanford

University 2016

[42] Steven S Skiena Calculated Bets

Computers Gambling and Mathematical

Modeling to Win Outlooks Cambridge

University Press 2001 doi 10.1017/CBO9780511547089 This includes a lucid
section on probability versus statistics
also available here: https://www3.cs.stonybrook.edu/~skiena/jaialai/excerpts/node12.html

[43] George Smith Isaac Newton In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2008

Metaphysics Research Lab Stanford

University 2008

[44] Daniel Stoljar and Nic Damnjanovic

The Deflationary Theory of Truth

In The Stanford Encyclopedia of

Philosophy Ed by Edward N Zalta Fall

2014 Metaphysics Research Lab Stanford

University 2014

[45] WA Strauss Partial Differential

Equations An Introduction Wiley 2007

isbn 9780470054567 A thorough and yet

relatively compact introduction

[46] SH Strogatz and M Dichter Nonlinear

Dynamics and Chaos Second Studies

in Nonlinearity Avalon Publishing 2016

isbn 9780813350844

[47] Mark Textor States of Affairs In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Winter 2016


Metaphysics Research Lab Stanford

University 2016

[48] Pauli Virtanen et al SciPy 1.0—Fundamental Algorithms for Scientific Computing in Python In arXiv e-prints arXiv:1907.10121 (July 2019)

[49] Wikipedia Algebra — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Algebra&oldid=920573802 [Online; accessed 26-October-2019] 2019

[50] Wikipedia Carl Friedrich Gauss — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Carl%20Friedrich%20Gauss&oldid=922692291 [Online; accessed 26-October-2019] 2019

[51] Wikipedia Euclid — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Euclid&oldid=923031048 [Online; accessed 26-October-2019] 2019

[52] Wikipedia First-order logic — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=First-order%20logic&oldid=921437906 [Online; accessed 29-October-2019] 2019

[53] Wikipedia Fundamental interaction — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Fundamental%20interaction&oldid=925884124 [Online; accessed 16-November-2019] 2019

[54] Wikipedia Leonhard Euler — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Leonhard%20Euler&oldid=921824700 [Online; accessed 26-October-2019] 2019

[55] Wikipedia Linguistic turn — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Linguistic%20turn&oldid=922305269 [Online; accessed 23-October-2019] 2019 Hey we all do it

[56] Wikipedia Probability space — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Probability%20space&oldid=914939789 [Online; accessed 31-October-2019] 2019

[57] Wikipedia Propositional calculus — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Propositional%20calculus&oldid=914757384 [Online; accessed 29-October-2019] 2019

[58] Wikipedia Quaternion — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Quaternion&oldid=920710557 [Online; accessed 26-October-2019] 2019

[59] Wikipedia Set-builder notation — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Set-builder%20notation&oldid=917328223 [Online; accessed 29-October-2019] 2019

[60] Wikipedia William Rowan Hamilton — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=William%20Rowan%20Hamilton&oldid=923163451 [Online; accessed 26-October-2019] 2019

[61] L Wittgenstein PMS Hacker and J

Schulte Philosophical Investigations

Wiley 2010 isbn 9781444307979

[62] Ludwig Wittgenstein Tractatus Logico-Philosophicus Ed by C. K. Ogden Project Gutenberg International Library of Psychology Philosophy and Scientific Method Kegan Paul Trench Trubner & Co Ltd 1922 A brilliant work on what is possible to express in language—and what is not As Wittgenstein puts it “What can be said at all can be said clearly and whereof one cannot speak thereof one must be silent”

[63] Slavoj Žižek Less Than Nothing

Hegel and the Shadow of Dialectical

Materialism Verso 2012 isbn

9781844678976 This is one of the most

interesting presentations of Hegel and

Lacan by one of the most exciting

contemporary philosophers


Contents

01 Mathematics itself 8

itselftruTruth 9

Neo-classical theories of truth 9

The picture theory 12

The relativity of truth 15

Other ideas about truth 16

Where this leaves us 21

itselffoundThe foundations of mathematics 23

Algebra ex nihilo 24

The application of mathematics to

science 25

The rigorization of mathematics 25

The foundations of mathematics

are built 26

The foundations have cracks 29

Mathematics is considered empirical 30

itselfreasonMathematical reasoning 32

itselfoverviewMathematical topics in

overview 33

itselfengmathWhat is mathematics for

engineering 34

itselfex Exercises for Chapter 01 35

02 Mathematical reasoning logic

and set theory 36

setssetintroIntroduction to set theory 38

setslogicLogical connectives and

quantifiers 43

Contents Contents p 3

Logical connectives 43

Quantifiers 43

setsex Exercises for Chapter 02 45

Exe 021 hardhat 45

Exe 022 2 45

Exe 023 3 45

Exe 024 4 45

03 Probability 46

probmeasProbability and measurement 47

probprobBasic probability theory 48

Algebra of events 49

probconditionIndependence and

conditional probability 51

Conditional probability 52

probbay Bayesrsquo theorem 54

Testing outcomes 54

Posterior probabilities 55

probrandoRandom variables 60

probpxf Probability density and

mass functions 63

Binomial PMF 64

Gaussian PDF 70

probE Expectation 72

probmomentsCentral moments 75

probex Exercises for Chapter 03 78

Exe 031 5 78

04 Statistics 79

statstermsPopulations samples and

machine learning 81

statssampleEstimation of sample mean

and variance 83

Estimation and sample statistics 83

Sample mean variance and

standard deviation 83

Sample statistics as random

variables 85

Nonstationary signal statistics 86

statsconfidenceConfidence 98

Contents Contents p 4

Generate some data to test the

central limit theorem 99

Sample statistics 101

The truth about sample means 101

Gaussian and probability 102

statsstudentStudent confidence 110

statsmultivarMultivariate probability

and correlation 115

Marginal probability 117

Covariance 118

Conditional probability and

dependence 122

statsregressionRegression 125

statsex Exercises for Chapter 04 131

05 Vector calculus 132

vecsdiv Divergence surface

integrals and flux 135

Flux and surface integrals 135

Continuity 135

Divergence 136

Exploring divergence 137

vecscurlCurl line integrals and

circulation 144

Line integrals 144

Circulation 144

Curl 145

Zero curl circulation and path

independence 146

Exploring curl 147

vecsgradGradient 153

Gradient 153

Vector fields from gradients are

special 154

Exploring gradient 154

vecsstokedStokes and divergence

theorems 164

The divergence theorem 164

The Kelvin-Stokesrsquo theorem 165

Related theorems 165

vecsex Exercises for Chapter 05 166

Contents Contents p 5

Exe 051 6 166

06 Fourier and orthogonality 168

fourseriesFourier series 169

fourtransFourier transform 174

fourgeneralGeneralized fourier series

and orthogonality 184

fourex Exercises for Chapter 06 186

Exe 061 secrets 186

Exe 062 seesaw 189

Exe 063 9 189

Exe 064 10 190

Exe 065 11 190

Exe 066 12 190

Exe 067 13 190

Exe 068 14 190

Exe 069 savage 191

Exe 0610 strawman 192

07 Partial differential equations 193

Classifying PDEs 196

Sturm-Liouville problems 199

Types of boundary conditions 200

PDE solution by separation of variables 205

The 1D wave equation 212

Exercises for Chapter 07 219

Exe 071 poltergeist 219

Exe 072 kathmandu 220

08 Optimization 222

Gradient descent 223

Stationary points 223

The gradient points the way 224

The classical method 225

The Barzilai and Borwein method 226

Constrained linear optimization 235

Feasible solutions form a polytope 235

Only the vertices matter 236

The simplex algorithm 238

Exercises for Chapter 08 245


Exe 081 cummerbund 245

Exe 082 melty 245

09 Nonlinear analysis 246

Nonlinear state-space models 248

Autonomous and nonautonomous systems 248

Equilibrium 248

Nonlinear system characteristics 250

Those in common with linear systems 250

Stability 250

Qualities of equilibria 251

Nonlinear system simulation 254

Simulating nonlinear systems in Python 255

Exercises for Chapter 09 257

A Distribution tables 258

A01 Gaussian distribution table 259

A02 Student's t-distribution table 261

B Fourier and Laplace tables 262

B01 Laplace transforms 263

B02 Fourier transforms 265

C Mathematics reference 268

C01 Quadratic forms 269

Completing the square 269

C02 Trigonometry 270

Triangle identities 270

Reciprocal identities 270

Pythagorean identities 270

Co-function identities 271

Even-odd identities 271

Sum-difference formulas (AM or lock-in) 271

Double angle formulas 271

Power-reducing or half-angle formulas 271

Sum-to-product formulas 272


Product-to-sum formulas 272

Two-to-one formulas 272

C03 Matrix inverses 273

C04 Laplace transforms 274

D Complex analysis 275

D01 Euler's formulas 276

Bibliography 277

Mathematics itself

itself Mathematics itself tru Truth p 1

1. For much of this lecture I rely on the thorough overview of Glanzberg (Michael Glanzberg. Truth. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018).

correspondence theory

Truth

1. Before we can discuss mathematical truth, we should begin with a discussion of truth itself.1 It is important to note that this is obviously extremely incomplete. My aim is to give a sense of the subject via brutal (mis)abbreviation.

2. Of course, the study of truth cannot but be entangled with the study of the world as such (metaphysics) and of knowledge (epistemology). Some of the following theories presuppose or imply a certain metaphysical and/or epistemological theory, but which these are is controversial.

Neo-classical theories of truth

3. The neo-classical theories of truth take for granted that there is truth and attempt to explain what its precise nature is (Michael Glanzberg. Truth. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018). What are provided here are modern understandings of theories developed primarily in the early 20th century.

The correspondence theory

4. A version of what is called the correspondence theory of truth is the following.

A proposition is true iff there is an existing entity in the world that corresponds with it.


facts

false

coherence theory

2. This is typically put in terms of "beliefs" or "judgments," but for brevity and parallelism I have cast it in terms of propositions. It is to this theory I have probably committed the most violence.

coherence

Such existing entities are called facts. Facts are relational in that their parts (e.g. subject, predicate, etc.) are related in a certain way.

5. Under this theory, then, if a proposition does not correspond to a fact, it is false.

6. This theory of truth is rather intuitive and consistently popular (Marian David. The Correspondence Theory of Truth. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2016. Metaphysics Research Lab, Stanford University, 2016. A detailed overview of the correspondence theory of truth).

The coherence theory

7. The coherence theory of truth is adamant that the truth of any given proposition is only as good as its holistic system of propositions.2 This includes (but typically goes beyond) a requirement for consistency of a given proposition with the whole and the self-consistency of the whole itself, sometimes called coherence.

8. For parallelism, let's attempt a succinct formulation of this theory, cast in terms of propositions.

A proposition is true iff it coheres with a system of propositions.


pragmatism

3. Pragmatism was an American philosophical movement of the early 20th century that valued the success of "practical" application of theories. For an introduction, see Legg and Hookway (Catherine Legg and Christopher Hookway. Pragmatism. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019. An introductory article on the philosophical movement "pragmatism." It includes an important clarification of the pragmatic slogan "truth is the end of inquiry.")

4. This is especially congruent with the work of William James (ibid.).

9. Note that this has no reference to facts whatsoever. However, it need not necessarily preclude them.

The pragmatic theory

10. Of the neo-classical theories of truth, this is probably the least agreed upon as having a single clear statement (Glanzberg, Truth). However, as with pragmatism in general,3 the pragmatic truth is oriented practically.

11. Perhaps the most important aspect of this theory is that it is thoroughly a correspondence theory, agreeing that true propositions are those that correspond to the world. However, there is a different focus here that differentiates it from correspondence theory proper: it values as more true that which has some sort of practical use in human life.

12. We'll try to summarize pragmatism in two slogans with slightly different emphases; here's the first, again cast in propositional parallel.

A proposition is true iff it works.4

Now, there are two ways this can be understood: (a) the proposition "works" in that it empirically corresponds to the world, or (b) the proposition "works" in that it has an effect that some agent intends. The former is pretty standard correspondence theory. The


5. This is especially congruent with the work of Charles Sanders Peirce (Legg and Hookway, Pragmatism).

inquiry

facts

latter is new and fairly obviously has ethical implications, especially today.

13. Let us turn to a second formulation.

A proposition is true if it corresponds with a process of inquiry.5

This has two interesting facets: (a) an agent's active inquiry creates truth, and (b) it is a sort of correspondence theory that requires a correspondence of a proposition with a process of inquiry, not, as in the correspondence theory, with a fact about the world. The latter has shades of both correspondence theory and coherence theory.

The picture theory

Before we delve into this theory, we must take a moment to clarify some terminology.

States of affairs and facts

14. When discussing the correspondence theory, we have used the term fact to mean an actual state of things in the world. A problem arises in the correspondence theory here. It says that a proposition is true iff there is a fact that corresponds with it. What of a negative proposition like "there are no cows in Antarctica"? We would seem to need a corresponding "negative fact" in the world to make this true. If a fact is taken to be


6. But Barker and Jago (Stephen Barker and Mark Jago. Being Positive About Negative Facts. In: Philosophy and Phenomenological Research 85.1 (2012), pp. 117-138. DOI: 10.1111/j.1933-1592.2010.00479.x) have attempted just that.

state of affairs

obtain

model

picture

7. See Wittgenstein (Ludwig Wittgenstein. Tractatus Logico-Philosophicus. Ed. by C. K. Ogden. Project Gutenberg. International Library of Psychology, Philosophy and Scientific Method. Kegan Paul, Trench, Trubner & Co., Ltd., 1922. A brilliant work on what is possible to express in language, and what is not. As Wittgenstein puts it, "What can be said at all can be said clearly; and whereof one cannot speak thereof one must be silent."), Biletzki and Matar (Anat Biletzki and Anat Matar. Ludwig Wittgenstein. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2018. Metaphysics Research Lab, Stanford University, 2018. An introduction to Wittgenstein and his thought), Glock (Truth in the Tractatus), and Dolby (David Dolby. Wittgenstein on Truth. In: A Companion to Wittgenstein. John Wiley & Sons, Ltd, 2016. Chap. 27, pp. 433-442. ISBN: 9781118884607. DOI: 10.1002/9781118884607.ch27).

composed of a complex of actual objects and relations, it is hard to imagine such facts.6

15. Furthermore, if a proposition is true, it seems that it is the corresponding fact that makes it so; what then makes a proposition false, since there is no fact to support the falsity (Mark Textor. States of Affairs. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016)?

16. And what of nonsense? There are some propositions, like "there is a round cube," that are neither true nor false. However, the preceding correspondence theory cannot differentiate between false and nonsensical propositions.

17. A state of affairs is something possible that may or may not be actual (ibid.). If a state of affairs is actual, it is said to obtain. The picture theory will make central this concept instead of that of the fact.

The picture theory of meaning (and truth)

18. The picture theory of meaning uses the analogy of the model or picture to explain the meaningfulness of propositions.7

A proposition names possible objects and arranges these names to correspond to a state of affairs.


Figure 01.1: a representation of the picture theory.

nonsense

See Figure 01.1. This also allows for an easy account of truth, falsity, and nonsense.

nonsense: A sentence that appears to be a proposition is actually not if the arrangement of named objects is impossible. Such a sentence is simply nonsense.

truth: A proposition is true if the state of affairs it depicts obtains.

falsity: A proposition is false if the state of affairs it depicts does not obtain.

19. Now, some (Hans-Johann Glock. Truth in the Tractatus. In: Synthese 148.2 (Jan. 2006), pp. 345-368. ISSN: 1573-0964. DOI: 10.1007/s11229-004-6226-2) argue this is a correspondence theory, and others (David Dolby. Wittgenstein on Truth. In: A Companion to Wittgenstein. John Wiley & Sons, Ltd, 2016. Chap. 27, pp. 433-442. ISBN: 9781118884607. DOI: 10.1002/9781118884607.ch27) that it is not. In any case, it certainly solves some issues that have plagued the correspondence theory.

"What cannot be said must be shown"

20. Something the picture theory does is declare a limit on what can meaningfully be said. A proposition (as defined above) must be potentially true or false. Therefore, something that cannot be false (something necessarily true) cannot be a proposition (ibid.). And there are certain things that are necessarily true for language itself to be meaningful, paradigmatically the logical structure of the world. What a proposition does, then, is show


8. See also Žižek (Slavoj Žižek. Less Than Nothing: Hegel and the Shadow of Dialectical Materialism. Verso, 2012. ISBN: 9781844678976. This is one of the most interesting presentations of Hegel and Lacan by one of the most exciting contemporary philosophers, pp. 25-6), from whom I stole the section title.

language itself

9. This was one of the contributions to the "linguistic turn" (Wikipedia. Linguistic turn. Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Linguistic%20turn&oldid=922305269 [Online; accessed 23-October-2019]. 2019. Hey, we all do it) of philosophy in the early 20th century.

perspective

skepticism

via its own logical structure the necessary (for there to be meaningful propositions at all) logical structure of the world.8

21. An interesting feature of this perspective is that it opens up language itself to analysis and limitation.9 And furthermore, it suggests that the set of what is is smaller than the set of what can be meaningfully spoken about.

The relativity of truth

22. Each subject (i.e. agent) in the world, with their propositions, has a perspective: a given moment, a given place, an historical-cultural-linguistic situation. At the very least, the truth of propositions must account for this. Just how a theory of truth should do so is a matter of significant debate (Maria Baghramian and J. Adam Carter. Relativism. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2019. Metaphysics Research Lab, Stanford University, 2019).

23. Some go so far as to be skeptical about truth (Peter Klein. Skepticism. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2015. Metaphysics Research Lab, Stanford University, 2015), regarding it to be entirely impossible. Others say that while a proposition may or may not be true, we could never come to know this.

24. Often underlying this conversation is the question of there being a common world in which we all participate and, if so, whether or not we can properly represent this world


10. Especially notable here is the work of Alfred Tarski in the mid-20th century.

beliefs

sentences

interpreted sentences

quasi-quotation

in language such that multiple subjects could come to justifiably agree or disagree on the truth of a proposition. If every proposition is so relative that it is relevant to only the proposer, truth would seem of little value. On the other hand, if truth is understood to be "objective," independent of subjective perspective, a number of objections can be made (Baghramian and Carter, Relativism), such as that there is no non-subjective perspective from which to judge truth.

Other ideas about truth

25. There are too many credible ideas about truth to attempt a reasonable summary; however, I will attempt to highlight a few important ones.

Formal methods

26. A set of tools was developed for exploring theories of truth, especially correspondence theories.10 Focus turned from beliefs to sentences, which are akin to propositions. (Recall that the above theories have already been recast in the more modern language of propositions.) Another aspect of these sentences under consideration is that they begin to be taken as interpreted sentences: they already have meaning.

27. Beyond this, several technical apparatus are introduced that formalize criteria for truth. For instance, a sentence is given a sign φ. A need arises to distinguish between the quotation of sentence φ and the unquoted sentence φ, which is then given the quasi-quotation notation ⌜φ⌝. For instance, let φ stand for snow is white; then φ → snow is


Convention T

11. Wilfrid Hodges. Tarski's Truth Definitions. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018; Mario Gómez-Torrente. Alfred Tarski. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019; Peter Hylton and Gary Kemp. Willard Van Orman Quine. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019.

redundancy theory

white and ⌜φ⌝ → 'snow is white'. Tarski introduces Convention T, which states that, for a fixed language L with fully interpreted sentences (Glanzberg, Truth),

An adequate theory of truth for L must imply, for each sentence φ of L,

⌜φ⌝ is true if and only if φ.

Using the same example, then,

'snow is white' is true if and only if snow is white.

Convention T states a general rule for the adequacy of a theory of truth and is used in several contemporary theories.

28. We can see that these formal methods get quite technical and fun. For more, see Hodges, Gómez-Torrente, and Hylton and Kemp.11

Deflationary theories

29. Deflationary theories of truth try to minimize, or eliminate altogether, the concept of, or use of the term, 'truth'. For instance, the redundancy theory claims that (Glanzberg, Truth)


12. Daniel Stoljar and Nic Damnjanovic. The Deflationary Theory of Truth. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2014. Metaphysics Research Lab, Stanford University, 2014.

truth-bearer

meaning

theory of use

To assert that ⌜φ⌝ is true is just to assert that φ.

Therefore, we can eliminate the use of 'is true'.

30. For more of less, see Stoljar and Damnjanovic.12

Language

31. It is important to recognize that language mediates truth; that is, truth is embedded in language. The way language in general affects theories of truth has been studied extensively. For instance, whether the truth-bearer is a belief or a proposition or a sentence, or something else, has been much discussed. The importance of the meaning of truth-bearers like sentences has played another large role. Theories of meaning, like the picture theory presented above, are often closely intertwined with theories of truth.

32. One of the most popular theories of meaning is called the theory of use:

For a large class of cases of the employment of the word "meaning" – though not for all – this word can be explained in this way: the meaning of a word is its use in the language. (L. Wittgenstein, P.M.S. Hacker, and J. Schulte. Philosophical

Hacker and J Schulte Philosophical


language-games

metaphysical realism

scientific realism

13. Drew Khlentzos. Challenges to Metaphysical Realism. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016.

anti-realism

Investigations. Wiley, 2010. ISBN: 9781444307979)

This theory is accompanied by the concept of language-games, which are loosely defined as rule-based contexts within which sentences have uses. The idea is that the meaning of a given sentence is its use in a network of meaning that is constantly evolving. This view tends to be understood as deflationary or relativistic about truth.

Metaphysical and epistemological considerations

33. We began with the recognition that truth is intertwined with metaphysics and epistemology. Let's consider a few such topics.

34. The first is metaphysical realism, which claims that there is a world existing objectively, independently of how we think about or describe it. This "realism" tends to be closely tied to, yet distinct from, scientific realism, which goes further, claiming the world is "actually" as science describes, independently of the scientific descriptions (e.g. there are actual objects corresponding to the phenomena we call atoms, molecules, light particles, etc.).

35. There have been many challenges to the realist claim (for some recent versions, see Khlentzos13) put forth by what is broadly called anti-realism. These vary, but often challenge the ability of realists to properly


Metaphysical idealism

Epistemological idealism

14. These definitions are explicated by Guyer and Horstmann (Paul Guyer and Rolf-Peter Horstmann. Idealism. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2018. Metaphysics Research Lab, Stanford University, 2018).

noumenal world

phenomenal world

link language to supposed objects in the world.

36. Metaphysical idealism has been characterized as claiming that "mind" or "subjectivity" generates or completely composes the world, which has no being outside mind. Epistemological idealism, on the other hand, while perhaps conceding that there is a world independent of mind, claims all knowledge of the world is created through mind and for mind, and therefore can never escape a sort of mind-world gap.14 This epistemological idealism has been highly influential since the work of Immanuel Kant (I. Kant, P. Guyer, and A.W. Wood. Critique of Pure Reason. The Cambridge Edition of the Works of Immanuel Kant. Cambridge University Press, 1999. ISBN: 9781107268333) in the late 18th century, which ushered in the idea of the noumenal world in-itself and the phenomenal world, which is how the noumenal world presents to us. Many have held that phenomena can be known through inquiry, whereas noumena are inaccessible. Furthermore, what can be known is restricted by the categories pre-existent in the knower.

37. Another approach, taken by Georg Wilhelm Friedrich Hegel (Paul Redding. Georg Wilhelm Friedrich Hegel. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2018. Metaphysics Research Lab, Stanford University, 2018) and other German idealists following Kant, is to reframe reality as thoroughly integrating subjectivity (G.W.F. Hegel and A.V. Miller. Phenomenology of Spirit. Motilal Banarsidass, 1998. ISBN:


Notion

9788120814738; Slavoj Žižek. Less Than Nothing: Hegel and the Shadow of Dialectical Materialism. Verso, 2012. ISBN: 9781844678976. This is one of the most interesting presentations of Hegel and Lacan by one of the most exciting contemporary philosophers); that is, "everything turns on grasping and expressing the True, not only as Substance, but equally as Subject." A subject's proposition is true inasmuch as it corresponds with its Notion (approximately, the idea or meaning for the subject). Some hold that this idealism is compatible with a sort of metaphysical realism, at least as far as understanding is not independent of, but rather beholden to, reality (Žižek, Less Than Nothing: Hegel and the Shadow of Dialectical Materialism, p. 906 ff.).

38. Clearly, all these ideas have many implications for theories of truth, and vice versa.

Where this leaves us

39. The truth is hard. What may at first appear to be a simple concept becomes complex upon analysis. It is important to recognize that we have only sampled some highlights of the theories of truth. I recommend further study of this fascinating topic.

40. Despite the difficulties of finding definitive grounds for understanding truth, we are faced with the task of provisionally forging ahead. Much of what follows in the study of mathematics makes certain implicit and explicit assumptions about truth. However, we have found that the foundations of these assumptions may themselves be problematic. It is my contention that, despite the lack of clear foundations, it is still worth studying engineering analysis, its mathematical foundations, and the foundations of truth itself. My justification for this claim is that I find the utility and the beauty of this study highly rewarding.


axioms

deduction

proof

theorems

15. Throughout this section, for the history of mathematics I rely heavily on Kline (M. Kline. Mathematics: The Loss of Certainty. A Galaxy book. Oxford University Press, 1982. ISBN: 9780195030853. A detailed account of the "illogical" development of mathematics and an exposition of its therefore remarkable utility in describing the world).

Euclid

geometry

Gauss

The foundations of mathematics

1. Mathematics has long been considered exemplary for establishing truth. Primarily, it uses a method that begins with axioms (unproven propositions that include undefined terms) and uses logical deduction to prove other propositions (theorems), to show that they are necessarily true if the axioms are.

2. It may seem obvious that truth established in this way would always be relative to the truth of the axioms, but throughout history this footnote was often obscured by the "obvious" or "intuitive" universal truth of the axioms.15 For instance, Euclid (Wikipedia. Euclid. Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Euclid&oldid=923031048 [Online; accessed 26-October-2019]. 2019) founded geometry (the study of mathematical objects traditionally considered to represent physical space, like points, lines, etc.) on axioms thought so solid that it was not until the early 19th century that Carl Friedrich Gauss (Wikipedia. Carl Friedrich Gauss. Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Carl%20Friedrich%20Gauss&oldid=922692291 [Online; accessed 26-October-2019]. 2019) and others recognized this was only one among many possible geometries (M. Kline. Mathematics: The Loss of Certainty. A Galaxy book. Oxford University Press, 1982. ISBN: 9780195030853. A detailed account of the "illogical" development of mathematics and an exposition of its therefore remarkable utility in describing the world), resting on


Aristotle

algebra

rational numbers

natural numbers

integers

irrational numbers

different axioms. Furthermore, Aristotle (Christopher Shields. Aristotle. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016) had acknowledged that reasoning must begin with undefined terms; however, even Euclid (presumably aware of Aristotle's work) seemed to forget this and provided definitions, obscuring the foundations of his work and starting mathematics on a path that for over 2000 years would forget its own relativity (Kline, Mathematics: The Loss of Certainty, pp. 101-2).

3. The foundations of Euclid were even shakier than its murky starting point: several unstated axioms were used in proofs, and some proofs were otherwise erroneous. However, for two millennia, mathematics was seen as the field wherein truth could be established beyond doubt.

Algebra ex nihilo

4. Although not much new geometry appeared during this period, the field of algebra (Wikipedia. Algebra. Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Algebra&oldid=920573802 [Online; accessed 26-October-2019]. 2019), the study of manipulations of symbols standing for numbers in general, began with no axiomatic foundation whatsoever. The Greeks had a notion of rational numbers, ratios of natural numbers (positive integers), and it was known that many solutions to algebraic equations were irrational (could not be expressed as a ratio of integers). But these irrational numbers, like virtually everything else in algebra, were


negative numbers

imaginary numbers

complex numbers

optics

astronomy

calculus

Newtonian mechanics

non-Euclidean geometry

gradually accepted because they were so useful in solving practical problems (they could be approximated by rational numbers, and this seemed good enough). The rules of basic arithmetic were accepted as applying to these and other forms of new numbers that arose in algebraic solutions: negative, imaginary, and complex numbers.

The application of mathematics to science

5. During this time, mathematics was being applied to optics and astronomy. Sir Isaac Newton then built calculus upon algebra, applying it to what is now known as Newtonian mechanics, which was really more the product of Leonhard Euler (George Smith. Isaac Newton. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2008. Metaphysics Research Lab, Stanford University, 2008; Wikipedia. Leonhard Euler. Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Leonhard%20Euler&oldid=921824700 [Online; accessed 26-October-2019]. 2019). Calculus introduced its own dubious operations, but the success of mechanics in describing and predicting physical phenomena was astounding. Mathematics was hailed as the language of God (later, Nature).

The rigorization of mathematics

6. It was not until Gauss created non-Euclidean geometry, in which Euclid's axioms were shown to be only one of many possible sets compatible with the world, and William Rowan Hamilton (Wikipedia. William Rowan Hamilton. Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=William%20Rowan%20Hamilton&oldid=923163451 [Online; accessed 26-October-


quaternions

rigorization

symbolic logic

mathematical model

16. The branch of mathematics called model theory concerns itself with general types of models that can be made from a given formal system, like an axiomatic mathematical system. For more on model theory, see Hodges (Wilfrid Hodges. Model Theory. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018). It is noteworthy that the engineering/science use of the term "mathematical model" is only loosely a "model" in the sense of model theory.

implicit definition

2019]. 2019) created quaternions (Wikipedia. Quaternion. Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Quaternion&oldid=920710557 [Online; accessed 26-October-2019]. 2019), a number system in which multiplication is noncommutative, that it became apparent something was fundamentally wrong with the way truth in mathematics had been understood. This started a period of rigorization in mathematics that set about axiomatizing and proving 19th century mathematics. This included the development of symbolic logic, which aided in the process of deductive reasoning.

7. An aspect of this rigorization is that mathematicians came to terms with the axioms that include undefined terms. For instance, a "point" might be such an undefined term in an axiom. A mathematical model is what we create when we attach these undefined terms to objects, which can be anything consistent with the axioms.16 The system that results from proving theorems would then apply to anything "properly" described by the axioms. So two masses might be assigned "points" in a Euclidean geometric space, from which we could be confident that, for instance, the "distance" between these masses is the Euclidean norm of the line drawn between the points. It could be said, then, that a "point" in Euclidean geometry is implicitly defined by its axioms and theorems and nothing else. That is, mathematical objects are not inherently tied to the physical objects to which we tend to apply them. Euclidean geometry is not the study of physical space, as it was long considered; it is the study of the objects implicitly defined by its axioms and theorems.
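To make the modeling step concrete, here is a minimal sketch (our own illustration, with invented coordinates) of attaching the undefined term "point" to two masses and computing their "distance" as the Euclidean norm:

```python
import math

# Attach the undefined term "point" to each mass by assigning it a
# position in R^3 (these coordinates are hypothetical, for illustration).
mass_a = (0.0, 0.0, 0.0)
mass_b = (3.0, 4.0, 0.0)

# The "distance" between the masses is then the Euclidean norm of the
# line segment between the two points.
distance = math.dist(mass_a, mass_b)
print(distance)  # 5.0
```

The geometry itself says nothing about masses; the model is our choice to let these particular objects play the role of "points."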

The foundations of mathematics are built


consistent

theorem

Set theory

foundation

sets

ZFC set theory

8. The building of the modern foundations of mathematics began with clear axioms, solid reasoning (with symbolic logic), and lofty yet seemingly attainable goals: prove theorems to support the already ubiquitous mathematical techniques in geometry, algebra, and calculus from axioms; furthermore, prove that these axioms (and things they imply) do not contradict each other, i.e., are consistent, and that the axioms are not results of each other (one that can be derived from others is a theorem, not an axiom).

9. Set theory is a type of formal axiomatic system that all modern mathematics is expressed with, so set theory is often called the foundation of mathematics (Joan Bagaria. "Set Theory." In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2019. Metaphysics Research Lab, Stanford University, 2019). We will study the basics in the next chapter. The primary objects in set theory are sets: informally, collections of mathematical objects. There is not a single set of axioms that is used as the foundation of all mathematics, for reasons we will review in a moment. However, the most popular set theory is Zermelo-Fraenkel set theory with the axiom of choice (ZFC). The axioms of ZF, sans C, are as follows (ibid.).

extensionality: If two sets A and B have the same elements, then they are equal.

empty set: There exists a set, denoted by ∅ and called the empty set, which has no elements.

pair: Given any sets A and B, there exists a set, denoted by {A, B}, which contains A and B as its only elements. In particular, there exists the set {A}, which has A as its only element.

power set: For every set A, there exists a set, denoted by P(A) and called the power set of A, whose elements are all the subsets of A.

union: For every set A, there exists a set, denoted by ⋃A and called the union of A, whose elements are all the elements of the elements of A.

infinity: There exists an infinite set. In particular, there exists a set Z that contains ∅ and such that if A ∈ Z, then A ∪ {A} ∈ Z.

separation: For every set A and every given property, there is a set containing exactly the elements of A that have that property. A property is given by a formula φ of the first-order language of set theory. Thus, separation is not a single axiom but an axiom schema, that is, an infinite list of axioms, one for each formula φ.

replacement: For every given definable function with domain a set A, there is a set whose elements are all the values of the function.

10. ZFC also has the axiom of choice (Bagaria, "Set Theory").

choice: For every set A of pairwise-disjoint, non-empty sets, there exists a set that contains exactly one element from each set in A.

17. Panu Raatikainen. "Gödel's Incompleteness Theorems." In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018. A thorough and contemporary description of Gödel's incompleteness theorems, which have significant implications for the foundations and function of mathematics and mathematical truth.

The foundations have cracks

11. The foundationalists' goal was to prove that some set of axioms, from which all of mathematics can be derived, is both consistent (contains no contradictions) and complete (every true statement is provable). The work of Kurt Gödel (Juliette Kennedy. "Kurt Gödel." In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2018. Metaphysics Research Lab, Stanford University, 2018) shattered this dream by proving, in his first incompleteness theorem, that any consistent formal system within which one can do some amount of basic arithmetic is incomplete. His argument is worth reviewing (see Raatikainen17), but at its heart is an undecidable statement like "This sentence is unprovable." Let U stand for this statement. If it is true, it is unprovable. If it is provable, it is false. Therefore, it is true iff it is unprovable. Then he shows that if a statement A that essentially says "arithmetic is consistent" is provable, then so is the undecidable statement U. But if the system is to be consistent, U cannot be provable, and therefore neither can A be provable.

12. This is problematic. It tells us virtually any conceivable axiomatic foundation of mathematics is incomplete. If one is complete, it is inconsistent (and therefore worthless). One problem this introduces is that a true theorem may be impossible to prove, and it turns out we can never know, in advance of its proof, whether it is provable.

13. But it gets worse. Gödel's second incompleteness theorem shows that such systems cannot even be shown to be consistent. This means at any moment someone could find an inconsistency in mathematics, and not only would we lose some of the theorems, we would lose them all. This is because, by what is called the material implication (Kline, Mathematics: The Loss of Certainty, pp. 187-8, 264), if one contradiction can be found, every proposition can be proven from it. And if this is the case, all (even proven) theorems in the system would be suspect.

18. Kline. Mathematics: The Loss of Certainty.

14. Even though no contradiction has yet appeared in ZFC, its axiom of choice, which is required for the proof of most of what has thus far been proven, generates the Banach-Tarski paradox: a sphere of diameter x can be partitioned into a finite number of pieces and recombined to form two spheres of diameter x. Troubling, to say the least. Attempts were made for a while to eliminate the use of the axiom of choice, but our buddy Gödel later proved that if ZF is consistent, so is ZFC (ibid., p. 267).

Mathematics is considered empirical

15. Since its inception, mathematics has been applied extensively to the modeling of the world. Despite its cracked foundations, it has striking utility. Many recent leading minds of mathematics, philosophy, and science suggest we treat mathematics as empirical, like any science: subject to its success in describing and predicting events in the world. As Kline18 summarizes,

   The upshot […] is that sound mathematics must be determined not by any one foundation which may some day prove to be right. The "correctness" of mathematics must be judged by its application to the physical world. Mathematics is an empirical science much as Newtonian mechanics. It is correct only to the extent that it works and, when it does not, it must be modified. It is not a priori knowledge even though it was so regarded for two thousand years. It is not absolute or unchangeable.

Mathematical reasoning

Mathematical topics in overview

What is mathematics for engineering

Exercises for Chapter 01

02 Mathematical reasoning, logic, and set theory

In order to communicate mathematical ideas effectively, formal languages have been developed, within which logic, i.e., deductive (mathematical) reasoning, can proceed. Propositions are statements that can be either true (⊤) or false (⊥). Axiomatic systems begin with statements (axioms) assumed true. Theorems are proven by deduction. In many forms of logic, like the propositional calculus (Wikipedia. "Propositional calculus." Wikipedia, The Free Encyclopedia. 2019), compound propositions are constructed via logical connectives, like "and" and "or," from atomic propositions (see the lecture on logical connectives and quantifiers). In others, like first-order logic (Wikipedia. "First-order logic." Wikipedia, The Free Encyclopedia. 2019), there are also logical quantifiers, like "for every" and "there exists."

The mathematical objects and operations about which most propositions are made are expressed in terms of set theory, which was introduced in the previous chapter and will be expanded upon in the next lecture. We can say that mathematical reasoning comprises mathematical objects and operations expressed in set theory, and logic allows us to reason therewith.

1. K. Ciesielski. Set Theory for the Working Mathematician. London Mathematical Society Student Texts. Cambridge University Press, 1997. ISBN: 9780521594653. A readable introduction to set theory.

2. H.B. Enderton. Elements of Set Theory. Elsevier Science, 1977. ISBN: 9780080570426. A gentle introduction to set theory and mathematical reasoning; a great place to start.

Introduction to set theory

Set theory is the language of the modern foundation of mathematics, as discussed in the previous chapter. It is unsurprising, then, that it arises throughout the study of mathematics. We will use set theory extensively in our study of probability theory.

The axioms of ZFC set theory were introduced in the previous chapter. Instead of proceeding in the pure-mathematics way of introducing axioms and proving theorems, we will opt for a more applied approach, in which we begin with some simple definitions and include basic operations. A more thorough and still readable treatment is given by Ciesielski,1 and a very gentle version by Enderton.2

A set is a collection of objects. Set theory gives us a way to describe these collections. Often the objects in a set are numbers or sets of numbers. However, a set could represent collections of zebras and trees and hairballs. For instance, {1, 2, 3} and {zebra, tree, hairball} are sets.

A field is a set with special structure. This structure is provided by the addition (+) and multiplication (×) operators and their inverses, subtraction (−) and division (÷).

3. When the natural numbers include zero, we write ℕ₀.

The quintessential example of a field is the set of real numbers ℝ, which admits these operators, making it a field. The reals ℝ and the complex numbers ℂ are fields; the integers ℤ and the natural numbers3 ℕ, which lack multiplicative inverses, are not fields, but they are the other number sets we typically consider.

Set membership is the belonging of an object to a set. It is denoted with the symbol ∈, which can be read "is an element of" for element x and set X: x ∈ X. For instance, we might say 7 ∈ {1, 7, 2} or 4 ∉ {1, 7, 2}. Or we might declare that x is a real number by stating x ∈ ℝ.

Set operations can be used to construct new sets from established sets. We consider a few common set operations now.

The union ∪ of sets is the set containing all the elements of the original sets (no repetition allowed). The union of sets A and B is denoted A ∪ B. For instance, let A = {1, 2, 3} and B = {−1, 3}; then

   A ∪ B = {−1, 1, 2, 3}.

The intersection ∩ of sets is the set containing the elements common to all the original sets. The intersection of sets A and B is denoted A ∩ B. For instance, let A = {1, 2, 3} and B = {2, 3, 4}; then

   A ∩ B = {2, 3}.

If two sets have no elements in common, their intersection is the empty set ∅ = {}, the unique set with no elements.

The set difference of two sets A and B is the set of elements in A that aren't also in B. It is denoted A \ B. For instance, let A = {1, 2, 3} and B = {2, 3, 4}; then

   A \ B = {1}.

A subset ⊆ of a set is a set the elements of which are contained in the original set. If the two sets are equal, one is still considered a subset of the other. We call a subset that is not equal to the other set a proper subset ⊂.

For instance, let A = {1, 2, 3} and B = {1, 2}; then B ⊆ A and, in fact, B ⊂ A.

The complement of a subset is the set of elements of the original set that aren't in the subset. For instance, if B ⊆ A, then the complement of B, denoted B̄, is

   B̄ = A \ B.

The cartesian product of two sets A and B is denoted A × B and is the set of all ordered pairs (a, b), where a ∈ A and b ∈ B. It's worthwhile considering the following notation for this definition:

   A × B = {(a, b) : a ∈ A and b ∈ B},

which means "the cartesian product of A and B is the set of ordered pairs (a, b) such that a ∈ A and b ∈ B" in set-builder notation (Wikipedia. "Set-builder notation." Wikipedia, The Free Encyclopedia. 2019).
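These operations map directly onto Python's built-in set type. As a quick sketch (the variable names and example values are mine, echoing the examples above):

```python
# Set operations from this lecture, illustrated with Python's built-in sets.
A = {1, 2, 3}
B = {-1, 3}
C = {2, 3, 4}

print(A | B)          # union: elements -1, 1, 2, 3
print(A & C)          # intersection: elements 2, 3
print(A - C)          # set difference A \ C: element 1
print({1, 2} < A)     # proper subset test: True
print(A - {1, 2})     # complement of {1, 2} relative to A: element 3

# cartesian product A x B via a set-builder-style comprehension
product = {(a, b) for a in A for b in B}
print(len(product))   # 6 ordered pairs
```

Note that Python sets, like mathematical sets, are unordered and disallow repetition.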

Let A and B be sets. A map or function f from A to B is an assignment of an element b ∈ B to each element a ∈ A. The function is denoted f : A → B, and we say that f maps each element a ∈ A to an element f(a) ∈ B, called the value of a under f, written a ↦ f(a). We say that f has domain A and codomain B. The image of f is the subset of its codomain B that contains the values of all elements mapped by f from its domain A.
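As a sketch (with small hypothetical sets of my choosing), a finite function can be modeled as a Python dict, and its image computed from its domain:

```python
# A finite function f: A -> B modeled as a dict (assumed example, not from the text).
A = {1, 2, 3}                  # domain
B = {"a", "b", "c", "d"}       # codomain
f = {1: "a", 2: "a", 3: "c"}   # assigns an element of B to each element of A

assert set(f.keys()) == A      # every element of the domain is mapped
assert set(f.values()) <= B    # every value lies in the codomain
image = {f[a] for a in A}      # image: subset of B actually hit by f
print(image)                   # elements 'a' and 'c'
```

Here the image {'a', 'c'} is a proper subset of the codomain B, illustrating that the two need not coincide.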


Logical connectives and quantifiers

In order to make compound propositions, we need to define logical connectives. In order to specify quantities of variables, we need to define logical quantifiers. The following is a form of first-order logic (Wikipedia, "First-order logic").

Logical connectives

A proposition can be either true (⊤) or false (⊥). When it does not contain a logical connective, it is called an atomistic proposition.

To combine propositions into a compound proposition, we require logical connectives. They are not (¬), and (∧), and or (∨). Table 02.1 is a truth table for a number of connectives.

Table 02.1: a truth table for logical connectives. The first two columns are the truth values of propositions p and q; the rest are outputs.

             not   and   or    nand  nor   xor   xnor
   p    q    ¬p    p∧q   p∨q   p↑q   p↓q   p⊻q   p⇔q
   ⊥    ⊥    ⊤     ⊥     ⊥     ⊤     ⊤     ⊥     ⊤
   ⊥    ⊤    ⊤     ⊥     ⊤     ⊤     ⊥     ⊤     ⊥
   ⊤    ⊥    ⊥     ⊥     ⊤     ⊤     ⊥     ⊤     ⊥
   ⊤    ⊤    ⊥     ⊤     ⊤     ⊥     ⊥     ⊥     ⊤

Quantifiers

Logical quantifiers allow us to indicate the quantity of a variable. The universal quantifier symbol ∀ means "for all." For instance, let A be a set; then ∀a ∈ A means "for all elements in A" and gives this quantity to variable a. The existential quantifier ∃ means "there exists at least one" or "for some." For instance, let A be a set; then ∃a ∈ A

means "there exists at least one element a in A …."
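The rows of Table 02.1 can be checked mechanically. A minimal sketch in Python (my construction, not from the text), where True plays ⊤ and False plays ⊥:

```python
import itertools

# Connectives from Table 02.1, defined on Python booleans.
connectives = {
    "and":  lambda p, q: p and q,
    "or":   lambda p, q: p or q,
    "nand": lambda p, q: not (p and q),
    "nor":  lambda p, q: not (p or q),
    "xor":  lambda p, q: p != q,
    "xnor": lambda p, q: p == q,
}

# Print one row of the truth table per (p, q) assignment.
for p, q in itertools.product([False, True], repeat=2):
    outputs = {name: f(p, q) for name, f in connectives.items()}
    print(p, q, not p, outputs)
```

Comparing each printed row against Table 02.1 confirms the table entry by entry.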

Exercises for Chapter 02

Exercise 02.1 (hardhat)

For the following, write the set described in set-builder notation.

a. A = {2, 3, 5, 9, 17, 33, …}
b. B is the set of integers divisible by 11.
c. C = {13, 14, 15, …}
d. D is the set of reals between −3 and 42.

Exercise 02.2 (2)

Let x, y ∈ ℝⁿ. Prove the Cauchy-Schwarz inequality:

   |x · y| ≤ ‖x‖‖y‖.  (ex.1)

Hint: you may find the geometric definition of the dot product helpful.

Exercise 02.3 (3)

Let x ∈ ℝⁿ. Prove that

   x · x = ‖x‖².  (ex.2)

Hint: you may find the geometric definition of the dot product helpful.

Exercise 02.4 (4)

Let x, y ∈ ℝⁿ. Prove the triangle inequality:

   ‖x + y‖ ≤ ‖x‖ + ‖y‖.  (ex.3)

Hint: you may find the Cauchy-Schwarz inequality helpful.

03 Probability

Probability and measurement

1. For a good introduction to probability theory, see Ash (Robert B. Ash. Basic Probability Theory. Dover Publications, Inc., 2008) or Jaynes et al. (E.T. Jaynes et al. Probability Theory: The Logic of Science. Cambridge University Press, 2003. ISBN: 9780521592710. An excellent and comprehensive introduction to probability theory).

Probability theory is a well-defined branch of mathematics. Andrey Kolmogorov described a set of axioms in 1933 that are still in use today as the foundation of probability theory.1 We will implicitly use these axioms in our analysis. The interpretation of probability is a contentious matter. Some believe probability quantifies the frequency of the occurrence of some event that is repeated in a large number of trials. Others believe it quantifies the state of our knowledge, or belief, that some event will occur.

In experiments, our measurements are tightly coupled to probability. This is apparent in the questions we ask. Here are some examples:

1. How common is a given event?
2. What is the probability we will reject a good theory based on experimental results?
3. How repeatable are the results?
4. How confident are we in the results?
5. What is the character of the fluctuations and drift in the data?
6. How much data do we need?

Basic probability theory

The mathematical model for a class of measurements is called the probability space and is composed of a mathematical triple of a sample space Ω, a σ-algebra F, and a probability measure P, typically denoted (Ω, F, P), each of which we will consider in turn (Wikipedia. "Probability space." Wikipedia, The Free Encyclopedia. 2019).

The sample space Ω of an experiment is the set representing all possible outcomes of the experiment. If a coin is flipped, the sample space is Ω = {H, T}, where H is heads and T is tails. If a coin is flipped twice, the sample space could be

   Ω = {HH, HT, TH, TT}.

However, the same experiment can have different sample spaces; for two coin flips, for instance, we could also choose a coarser description, such as the number of heads obtained. We base our choice of Ω on the problem at hand.

An event is a subset of the sample space. That is, an event corresponds to a yes-or-no question about the experiment. For instance, event A (remember, A ⊆ Ω) in the coin-flipping experiment (two flips) might be A = {HT, TH}. A is an event that corresponds to the question "Is the second flip different than the first?" A is the event for which the answer is "yes."
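A sketch of this two-flip experiment in Python (the enumeration is mine), treating events as subsets of the sample space:

```python
# Sample space for two coin flips and the event "second flip differs from the first".
omega = {"HH", "HT", "TH", "TT"}
A = {w for w in omega if w[0] != w[1]}  # the yes-or-no question, as a subset

print(A)                     # elements 'HT' and 'TH'
print(A <= omega)            # an event is a subset of the sample space: True
print(len(A) / len(omega))   # probability under equally likely outcomes: 0.5
```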

Algebra of events

Because events are sets, we can perform the usual set operations with them.

Example prob.prob-1: set operations with events

Problem statement: Consider a toss of a single die. We choose the sample space to be Ω = {1, 2, 3, 4, 5, 6}. Let the following define events:

   A ≡ (the result is even) = {2, 4, 6}
   B ≡ (the result is greater than 2) = {3, 4, 5, 6}.

Find the following event combinations: A ∪ B, A ∩ B, A \ B, B \ A, Ā, B̄.
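A quick way to check these combinations (a sketch of mine; the worked answers are not shown in the text) is with Python sets:

```python
# Events from Example prob.prob-1 as Python sets.
omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}       # the result is even
B = {3, 4, 5, 6}    # the result is greater than 2

print(A | B)        # union: elements 2, 3, 4, 5, 6
print(A & B)        # intersection: elements 4, 6
print(A - B)        # A \ B: element 2
print(B - A)        # B \ A: elements 3, 5
print(omega - A)    # complement of A: elements 1, 3, 5
print(omega - B)    # complement of B: elements 1, 2
```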

The σ-algebra F is the collection of events of interest. Often F is the set of all possible events given a sample space Ω, which is just the power set of Ω (Wikipedia, "Probability space"). When referring to an event, we often state that it is an element of F; for instance, we might say an event A ∈ F.

We're finally ready to assign probabilities to events. We define the probability measure P : F → [0, 1] to be a function satisfying the following conditions:

1. For every event A ∈ F, the probability measure of A is greater than or equal to zero, i.e., P(A) ≥ 0.
2. If an event is the entire sample space, its probability measure is unity, i.e., if A = Ω, P(A) = 1.
3. If events A1, A2, … are disjoint sets (no elements in common), then P(A1 ∪ A2 ∪ …) = P(A1) + P(A2) + ….

We conclude the basics by observing four facts that can be proven from the definitions above:

1. P(∅) = 0.
2. P(Ā) = 1 − P(A).
3. If A ⊆ B, then P(A) ≤ P(B).
4. P(A ∪ B) = P(A) + P(B) − P(A ∩ B).
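These measure conditions can be spot-checked numerically. Here is a minimal sketch (mine, assuming a fair die with equally likely outcomes):

```python
from fractions import Fraction

# Probability measure for a fair die: P(A) = |A| / |Omega|.
omega = frozenset({1, 2, 3, 4, 5, 6})

def P(A):
    assert A <= omega  # events are subsets of the sample space
    return Fraction(len(A), len(omega))

A = frozenset({2, 4, 6})
B = frozenset({1, 3})           # disjoint from A

print(P(A))                     # 1/2
assert P(A) >= 0                # condition 1: nonnegativity
assert P(omega) == 1            # condition 2: unit total probability
assert P(A | B) == P(A) + P(B)  # condition 3: additivity for disjoint events
```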

Independence and conditional probability

Two events A and B are independent if and only if

   P(A ∩ B) = P(A)P(B).

If an experimenter must make a judgment without data about the independence of events, they base it on their knowledge of the events, as discussed in the following example.

Example prob.condition-1: independence

Problem statement: Answer the following questions and imperatives.

1. Consider a single fair die rolled twice. What is the probability that both rolls are 6?
2. What changes if the die is biased by a weight such that P(6) = 1/7?
3. What changes if the die is biased by a magnet, rolled on a magnetic dice-rolling tray, such that P(6) = 1/7?
4. What changes if there are two dice, biased by weights such that for each P(6) = 1/7, rolled once, both resulting in 6?
5. What changes if there are two dice, biased by magnets, rolled together?
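For the first question, independence can be confirmed by brute-force enumeration. A sketch (mine):

```python
from fractions import Fraction
from itertools import product

# Two rolls of a fair die: enumerate the product sample space.
omega = list(product(range(1, 7), repeat=2))

both_six = [w for w in omega if w == (6, 6)]
p_both = Fraction(len(both_six), len(omega))
print(p_both)  # 1/36

# Independence: P(both are 6) = P(first is 6) * P(second is 6).
p_first = Fraction(len([w for w in omega if w[0] == 6]), len(omega))
p_second = Fraction(len([w for w in omega if w[1] == 6]), len(omega))
assert p_both == p_first * p_second
```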

Conditional probability

If events A and B are somehow dependent, we need a way to compute the probability of B occurring given that A occurs. This is called the conditional probability of B given A and is denoted P(B | A). For P(A) > 0, it is defined as

   P(B | A) = P(A ∩ B) / P(A).  (condition.1)

We can interpret this as a restriction of the sample space Ω to A, i.e., the new sample space is Ω′ = A ⊆ Ω. Note that if A and B are independent, we obtain the obvious result P(B | A) = P(B).

Example prob.condition-2: dependence

Problem statement: Consider two unbiased dice rolled once. Let events A = (sum of faces = 8) and B = (faces are equal). What is the probability the faces are equal, given that their sum is 8?
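Equation condition.1 applied by enumeration (a sketch of mine; the text leaves the computation to the reader):

```python
from fractions import Fraction
from itertools import product

# Two unbiased dice: P(B | A) = P(A and B) / P(A).
omega = list(product(range(1, 7), repeat=2))
A = [w for w in omega if sum(w) == 8]      # sum of faces is 8
A_and_B = [w for w in A if w[0] == w[1]]   # ...and faces are equal

p_B_given_A = Fraction(len(A_and_B), len(A))
print(p_B_given_A)  # 1/5: of the five outcomes summing to 8, only (4, 4) has equal faces
```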

Bayes' theorem

Given two events A and B, Bayes' theorem (a.k.a. Bayes' rule) states that

   P(A | B) = P(B | A)P(A) / P(B).  (bay.1)

Sometimes this is written

   P(A | B) = P(B | A)P(A) / (P(B | A)P(A) + P(B | ¬A)P(¬A))  (bay.2)
            = 1 / (1 + (P(B | ¬A)/P(B | A)) · (P(¬A)/P(A))).  (bay.3)

This is a useful theorem for determining a test's effectiveness. If a test is performed to determine whether an event has occurred, we might ask questions like "If the test indicates that the event has occurred, what is the probability it has actually occurred?" Bayes' theorem can help compute an answer.

Testing outcomes

The test can be either positive or negative, meaning it can either indicate or not indicate that A has occurred. Furthermore, this result can be either true or false. There are four options, then. Consider an event A and an event that is a test result B indicating that event A has occurred. Table 03.1 shows these four possible test outcomes. The event A occurring can lead to a true positive or a false negative, whereas ¬A can lead to a true negative or a false positive.

Table 03.1: test outcome B for event A.

                  A       ¬A
   positive (B)   true    false
   negative (¬B)  false   true

Terminology is important here:

• P(true positive) = P(B | A), a.k.a. sensitivity or detection rate
• P(true negative) = P(¬B | ¬A), a.k.a. specificity
• P(false positive) = P(B | ¬A)
• P(false negative) = P(¬B | A)

Clearly, the desirable result for any test is that it is true. However, no test is true 100 percent of the time. So sometimes it is desirable to err on the side of the false positive, as in the case of a medical diagnostic. Other times it is more desirable to err on the side of a false negative, as in the case of testing for defects in manufactured balloons (when a false negative isn't a big deal).

Posterior probabilities

Returning to Bayes' theorem, we can evaluate the posterior probability P(A | B) of the event A having occurred, given that the test B is positive, from information that includes the prior probability P(A) of A. The form in Equation bay.2 or (bay.3) is typically useful because it uses commonly known test probabilities: that of the true positive, P(B | A), and that of the false positive, P(B | ¬A). We calculate P(A | B) when we want to interpret test results.

Figure 03.1: for different high sensitivities P(B | A) ∈ {0.9000, 0.9900, 0.9990, 0.9999}, the probability P(A | B) that an event A occurred, given that a test for it, B, is positive, versus the probability P(A) that the event A occurs, under the assumption that the specificity equals the sensitivity.

Some interesting results can be found from this. For instance, if we let P(B | A) = P(¬B | ¬A) (sensitivity equals specificity) and realize that P(B | ¬A) + P(¬B | ¬A) = 1 (given ¬A, either B or ¬B occurs), we can derive the expression

   P(B | ¬A) = 1 − P(B | A).  (bay.4)

Using this and P(¬A) = 1 − P(A) in Equation bay.3 gives (recall we've assumed sensitivity equals specificity)

   P(A | B) = 1 / (1 + ((1 − P(B | A))/P(B | A)) · ((1 − P(A))/P(A)))  (bay.5)
            = 1 / (1 + (1/P(B | A) − 1)(1/P(A) − 1)).  (bay.6)

This expression is plotted in Figure 03.1. See that a positive result for a rare event (small P(A)) is hard to trust unless the sensitivity P(B | A) and specificity P(¬B | ¬A) are very high, indeed.
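Equation bay.6 is easy to evaluate directly. A sketch (mine) showing how quickly the posterior degrades for rare events:

```python
# Posterior P(A|B) from Equation bay.6, assuming sensitivity equals specificity.
def posterior(p_A, p_B_A):
    # p_A: prior P(A); p_B_A: sensitivity P(B|A)
    return 1.0 / (1.0 + (1.0 / p_B_A - 1.0) * (1.0 / p_A - 1.0))

# A 99%-sensitive test for an event with prior 0.1% yields a weak posterior:
print(posterior(0.001, 0.99))  # roughly 0.09
# The same test for a common event (prior 50%) is trustworthy:
print(posterior(0.5, 0.99))    # roughly 0.99
```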

Example prob.bay-1: Bayes' theorem

Problem statement: Suppose 0.1% of springs manufactured at a given plant are defective. Suppose you need to design a test such that, when it indicates a defective part, the part is actually defective 99% of the time. What sensitivity should your test have, assuming it can be made equal to its specificity?

Problem solution: The following was generated from a Jupyter notebook with the following filename and kernel.

notebook filename: bayes_theorem_example_01.ipynb
notebook kernel: python3

   from sympy import *  # for symbolics
   import numpy as np  # for numerics
   import matplotlib.pyplot as plt  # for plots
   from IPython.display import display, Markdown, Latex

Define symbolic variables.

   var('p_A p_nA p_B p_nB p_B_A p_B_nA p_A_B', real=True)

   (p_A, p_nA, p_B, p_nB, p_B_A, p_B_nA, p_A_B)

Beginning with Bayes' theorem and assuming the sensitivity and specificity are equal by Equation bay.4, we can derive the following expression for the posterior probability P(A | B).

   p_A_B_e1 = Eq(p_A_B, p_B_A*p_A/p_B).subs({
       p_B: p_B_A*p_A + p_B_nA*p_nA,  # conditional prob
       p_B_nA: 1 - p_B_A,  # Eq. (bay.4)
       p_nA: 1 - p_A,
   })
   display(p_A_B_e1)

   p_AB = p_A p_BA / (p_A p_BA + (1 − p_A)(1 − p_BA))

Solve this for P(B | A), the quantity we seek.

   p_B_A_sol = solve(p_A_B_e1, p_B_A, dict=True)
   p_B_A_eq1 = Eq(p_B_A, p_B_A_sol[0][p_B_A])
   display(p_B_A_eq1)

   p_BA = p_AB (1 − p_A) / (−2 p_A p_AB + p_A + p_AB)

Now let's substitute the given probabilities.

   p_B_A_spec = p_B_A_eq1.subs({
       p_A: 0.001,
       p_A_B: 0.99,
   })
   display(p_B_A_spec)

   p_BA = 0.999989888981011

That's a tall order!

Random variables

Probabilities are useful even when they do not deal strictly with events. It often occurs that we measure something that has randomness associated with it. We use random variables to represent these measurements.

A random variable X : Ω → ℝ is a function that maps an outcome ω from the sample space Ω to a real number x ∈ ℝ, as shown in Figure 03.2. A random variable will be denoted with a capital letter (e.g., X and K), and a specific value that it maps to will be denoted with a lowercase letter (e.g., x and k).

A discrete random variable K is one that takes on discrete values. A continuous random variable X is one that takes on continuous values.

Example prob.rando-1: dice, again

Problem statement: Roll two unbiased dice. Let K be a random variable representing the sum of the two. Let P(k) be the probability of the result k. Plot and interpret P(k).

Figure 03.2: a random variable X maps an outcome ω ∈ Ω to an x ∈ ℝ.
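A sketch of the PMF of K (the enumeration and printing are mine):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# PMF of K, the sum of two unbiased dice, over the 36 equally likely outcomes.
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {k: Fraction(c, 36) for k, c in sorted(counts.items())}

for k, p in pmf.items():
    print(k, p)  # triangular shape: peaks at k = 7 with probability 6/36

assert sum(pmf.values()) == 1  # probabilities sum to unity
```

The triangular shape reflects the number of outcome pairs producing each sum: one pair for k = 2, six pairs for k = 7.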

Example prob.rando-2: Johnson-Nyquist noise

Problem statement: A resistor at nonzero temperature, without any applied voltage, exhibits an interesting phenomenon: its voltage randomly fluctuates. This is called Johnson-Nyquist noise and is a result of thermal excitation of charge carriers (electrons, typically). For a given resistor and measurement system, let the probability density function f_V of the voltage V across an unrealistically hot resistor be

   f_V(V) = (1/√π) e^(−V²).

Plot and interpret the meaning of this function.

Probability density and mass functions

Consider an experiment that measures a random variable. We can plot the relative frequency of the measurand landing in different "bins" (ranges of values). This is called a frequency distribution or a probability mass function (PMF).

Consider, for instance, a probability mass function as plotted in Figure 03.3, where a frequency a_i can be interpreted as an estimate of the probability of the measurand being in the ith interval. The sum of the frequencies must be unity:

   ∑_{i=1}^{k} a_i = 1,

with k being the number of bins.

Figure 03.3: plot of a probability mass function.

The frequency density distribution is similar to the frequency distribution, but with a_i ↦ a_i/Δx, where Δx is the bin width. If we let the bin width approach zero, we derive the probability density function (PDF)

   f(x) = lim_{k→∞, Δx→0} ∑_{j=1}^{k} (a_j/Δx) 1_{x ∈ bin j}.  (pxf.1)

Figure 03.4: plot of a probability density function.

We typically think of a probability density function f, like the one in Figure 03.4, as a function that can be integrated over to find the probability of the random variable (measurand) being in an interval [a, b]:

   P(x ∈ [a, b]) = ∫_a^b f(χ) dχ.  (pxf.2)

Of course,

   ∫_{−∞}^{∞} f(χ) dχ = 1.

We now consider a common PMF and a common PDF.

Binomial PMF

Consider a random binary sequence of length n, such that each element is a random 0 or 1 generated independently, like

   (1, 0, 1, 1, 0, …, 1, 1).  (pxf.3)

Let events 1 and 0 be mutually exclusive and exhaustive, and let P(1) = p. The probability of the sequence above occurring, with its k ones, is p^k (1 − p)^(n−k). There are "n choose k,"

   C(n, k) = n! / (k!(n − k)!),  (pxf.4)

possible combinations of k ones for n bits. Therefore, the probability of any combination of k ones in a series is

   f(k) = C(n, k) p^k (1 − p)^(n−k).  (pxf.5)

We call Equation pxf.5 the binomial distribution PMF.
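Equation pxf.5 in Python, as a sketch (using the standard library's exact binomial coefficient):

```python
from fractions import Fraction
from math import comb

# Binomial PMF, Equation pxf.5: probability of k ones in n independent bits.
def binomial_pmf(n, k, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

n, p = 4, Fraction(1, 2)
pmf = [binomial_pmf(n, k, p) for k in range(n + 1)]
print(pmf)            # [1/16, 1/4, 3/8, 1/4, 1/16]
assert sum(pmf) == 1  # the PMF sums to unity
```

Using Fraction keeps the arithmetic exact; with floats the same function supports any p ∈ [0, 1].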

Example prob.pxf-1: binomial PMF

Problem statement: Consider a field sensor that fails for a given measurement with probability p. Given n measurements, plot the binomial PMF as a function of k failed measurements for a few different probabilities of failure, p ∈ {0.04, 0.25, 0.5, 0.75, 0.96}.

Problem solution: Figure 03.6 shows Matlab code for constructing the PMFs plotted in Figure 03.5. Note that the symmetry is due to the fact that events 1 and 0 are mutually exclusive and exhaustive.



Figure 035 binomial PMF for n = 100 measurements and different values of P(1) = p, the probability of a measurement error. The plot is generated by the Matlab code of Figure 036.

% parameters
n = 100;
k_a = linspace(1,n,n);
p_a = [.04,.25,.5,.75,.96];

% binomial function
f = @(n,k,p) nchoosek(n,k)*p^k*(1-p)^(n-k);

% loop through to construct an array
f_a = NaN*ones(length(k_a),length(p_a));
for i = 1:length(k_a)
  for j = 1:length(p_a)
    f_a(i,j) = f(n,k_a(i),p_a(j));
  end
end

% plot
figure
colors = jet(length(p_a));
for j = 1:length(p_a)
  bar(k_a,f_a(:,j),...
    'facecolor',colors(j,:),...
    'facealpha',0.5,...
    'displayname',['$p = ',num2str(p_a(j)),'$']...
  ); hold on
end
leg = legend('show','location','north');
set(leg,'interpreter','latex');
hold off
xlabel('number of ones in sequence k')
ylabel('probability')
xlim([0,100])

Figure 036 a Matlab script for generating binomial PMFs


Gaussian or normal random variable

mean

standard deviation

variance

Gaussian PDF

The Gaussian or normal random variable x has

PDF

$$f(x) = \frac{1}{\sigma\sqrt{2\pi}} \exp\frac{-(x-\mu)^2}{2\sigma^2} \quad \text{(pxf6)}$$

Although wersquore not quite ready to understand

these quantities in detail it can be shown that

the parameters micro and σ have the following

meanings

bull micro is the mean of x

bull σ is the standard deviation of x and

bull σ2 is the variance of x

Consider the ldquobell-shapedrdquo Gaussian PDF in

Figure 037 It is always symmetric The mean micro

is its central value and the standard deviation

σ is directly related to its width We will

continue to explore the Gaussian distribution

in the following lectures especially in

Lecture statsconfidence
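As a sketch (in Python; the helper name is our own), Equation pxf6 is easy to evaluate directly, and a crude Riemann sum confirms that the Gaussian PDF is symmetric about µ and integrates to unity.

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    """Gaussian PDF of Equation pxf6."""
    return math.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

# symmetry about the mean
left, right = gaussian_pdf(-1.3), gaussian_pdf(1.3)

# normalization: Riemann sum over +/- 8 standard deviations
n, lo, hi = 10_000, -8.0, 8.0
dx = (hi - lo) / n
area = sum(gaussian_pdf(lo + i * dx) for i in range(n)) * dx
```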


Figure 037 PDF for Gaussian random variable x with mean µ = 0 and standard deviation σ = 1/√2


expected value

expectation

mean

probE Expectation

Recall that a random variable is a function

X Ω rarr R that maps from the sample

space to the reals Random variables are the

arguments of probability mass functions (PMFs)

and probability density functions (PDFs)

The expected value (or expectation) of a

random variable is akin to its ldquoaverage valuerdquo

and depends on its PMF or PDF The expected

value of a random variable X is denoted

〈X〉 or E[X] There are two definitions of

the expectation one for a discrete random

variable the other for a continuous random

variable Before we define them however it

is useful to predefine the most fundamental

property of a random variable its mean

Definition probE1 mean

The mean of a random variable X is defined

as

mX = E[X]

Letrsquos begin with a discrete random variable

Definition probE2 expectation of a

discrete random

variable

Let K be a discrete random variable and f its

PMF The expected value of K is defined as

$$E[K] = \sum_{\forall k} k\, f(k)$$


Example probEshy1 re expectation of a discrete random variable

Problem Statement

Given a discrete random variable K with PMF

shown below what is its mean mK
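The PMF figure referenced above is not reproduced here, so as an illustrative Python sketch we assume a small hypothetical PMF for K and apply Definition probE2 directly.

```python
# hypothetical PMF for K (the original example's figure supplies the real one)
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

# E[K] = sum over all k of k f(k), per Definition probE2
m_K = sum(k * f for k, f in pmf.items())
```

For this assumed PMF, $m_K = 0(0.2) + 1(0.5) + 2(0.3) = 1.1$.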

Let us now turn to the expectation of a

continuous random variable

Definition probE3 expectation of a

continuous random

variable

Let X be a continuous random variable and f

its PDF The expected value of X is defined as

$$E[X] = \int_{-\infty}^{\infty} x\, f(x)\, dx$$

Example probEshy2 re expectation of a continuous random variable

Problem Statement

Given a continuous random variable X with

Gaussian PDF f what is the expected value of

X



Due to its sum or integral form the expected

value E [middot] has some familiar properties for

random variables X and Y and reals a and b

E[a] = a (E1a)
E[X + a] = E[X] + a (E1b)
E[aX] = a E[X] (E1c)
E[E[X]] = E[X] (E1d)
E[aX + bY] = a E[X] + b E[Y] (E1e)
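These properties can be spot-checked numerically (a Python sketch with simulated data); linearity in particular holds exactly for sample averages as well.

```python
import random

random.seed(0)
X = [random.gauss(0, 1) for _ in range(10_000)]
Y = [random.gauss(2, 3) for _ in range(10_000)]

E = lambda samples: sum(samples) / len(samples)  # sample estimate of E[.]

a, b = 2.0, -3.0
lhs = E([a * x + b * y for x, y in zip(X, Y)])  # E[aX + bY]
rhs = a * E(X) + b * E(Y)                       # a E[X] + b E[Y]
```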


central moments

variance

probmoments Central moments

Given a probability mass function (PMF) or

probability density function (PDF) of a random

variable several useful parameters of the

random variable can be computed These are

called central moments, which quantify parameters of the distribution relative to its mean.

Definition probmoments1 central moments

The nth central moment of random variable X

with PDF f is defined as

$$E[(X - \mu_X)^n] = \int_{-\infty}^{\infty} (x - \mu_X)^n f(x)\, dx$$

For discrete random variable K with PMF f,

$$E[(K - \mu_K)^n] = \sum_{\forall k} (k - \mu_K)^n f(k)$$

Example probmomentsshy1 re first moment

Problem Statement

Prove that the first central moment of continuous random variable X is zero.
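One route to the proof uses the linearity of expectation (properties E1b and E1c):

```latex
E[(X - \mu_X)^1] = E[X] - E[\mu_X]  % linearity of expectation
                 = \mu_X - \mu_X    % expectation of a constant is the constant
                 = 0.
```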

The second central moment of random variable

X is called the variance and is denoted


standard deviation

skewness

$$\sigma_X^2 \equiv \text{Var}[X] = E[(X - \mu_X)^2] \quad \text{(moments1)}$$

The variance is a measure of the width or

spread of the PMF or PDF. We usually compute the variance with the formula $\text{Var}[X] = E[X^2] - \mu_X^2$.
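The usual computational form of the variance follows from expanding the square and applying the linearity of expectation:

```latex
\operatorname{Var}[X] = E[(X - \mu_X)^2]
  = E[X^2 - 2\mu_X X + \mu_X^2]
  = E[X^2] - 2\mu_X E[X] + \mu_X^2
  = E[X^2] - \mu_X^2.
```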

Other properties of variance include for real

constant c

Var[c] = 0 (moments2)
Var[X + c] = Var[X] (moments3)
Var[cX] = c² Var[X] (moments4)

The standard deviation is defined as $\sigma_X = \sqrt{\text{Var}[X]}$.

Although the variance is mathematically more

convenient the standard deviation has the

same physical units as X so it is often the

more physically meaningful quantity Due to

its meaning as the width or spread of the

probability distribution and its sharing of

physical units it is a convenient choice for

error bars on plots of a random variable

The skewness Skew[X] is a normalized third


asymmetry

kurtosis

tailedness

central moment

$$\text{Skew}[X] = \frac{E[(X - \mu_X)^3]}{\sigma_X^3} \quad \text{(moments5)}$$

Skewness is a measure of asymmetry of a

random variable's PDF or PMF. For a symmetric

PMF or PDF, such as the Gaussian PDF, Skew[X] = 0.

The kurtosis Kurt[X] is a normalized fourth

central moment

$$\text{Kurt}[X] = \frac{E[(X - \mu_X)^4]}{\sigma_X^4} \quad \text{(moments6)}$$

Kurtosis is a measure of the tailedness of a

random variable's PDF or PMF. "Heavier" tails

yield higher kurtosis

A Gaussian random variable has PDF with

kurtosis 3 Given that for Gaussians both

skewness and kurtosis have nice values (0 and

3), we can think of skewness and kurtosis

as measures of similarity to the Gaussian PDF
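This similarity claim is easy to test with simulated data (a Python sketch using sample central moments; the helper names are our own):

```python
import random

def central_moment(xs, n):
    """nth sample central moment."""
    mu = sum(xs) / len(xs)
    return sum((x - mu)**n for x in xs) / len(xs)

def skew(xs):
    return central_moment(xs, 3) / central_moment(xs, 2)**1.5

def kurt(xs):
    return central_moment(xs, 4) / central_moment(xs, 2)**2

random.seed(1)
xs = [random.gauss(0, 2) for _ in range(200_000)]
# for Gaussian data, skewness should be near 0 and kurtosis near 3
```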


probex Exercises for Chapter

03

Exercise 031 (5)

Several physical processes can be modeled

with a random walk, a process of iteratively

changing a quantity by some random

amount Infinitely many variations are

possible but common factors of variation

include probability distribution step

size, dimensionality (e.g. one-dimensional, two-dimensional, etc.), and coordinate system.

Graphical representations of these walks can be

beautiful Develop a computer program that

generates random walks and corresponding

graphics Do it well and call it art because it

is
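A minimal sketch of such a generator (in Python, two-dimensional with a fixed step size; varying the distribution, step size, and coordinate system is left open):

```python
import math
import random

def random_walk_2d(steps, step_size=1.0, seed=None):
    """Iteratively change a 2D position by a fixed-size step in a
    uniformly random direction; return the full path."""
    rng = random.Random(seed)
    x, y = 0.0, 0.0
    path = [(x, y)]
    for _ in range(steps):
        angle = rng.uniform(0.0, 2.0 * math.pi)
        x += step_size * math.cos(angle)
        y += step_size * math.sin(angle)
        path.append((x, y))
    return path

path = random_walk_2d(1000, seed=42)  # plot the path to make the art
```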

estimation

stats

Statistics

Whereas probability theory is primarily

focused on the relations among mathematical

objects statistics is concerned with making

sense of the outcomes of observation (Steven

S Skiena Calculated Bets Computers Gambling

and Mathematical Modeling to Win Outlooks

Cambridge University Press, 2001. DOI: 10.1017/CBO9780511547089. This includes a lucid section on probability versus statistics, also available here: https://www3.cs.stonybrook.edu/~skiena/jaialai/excerpts/node12.html)

However we frequently use statistical methods

to estimate probabilistic models For instance

we will learn how to estimate the standard

deviation of a random process we have some

reason to expect has a Gaussian probability

distribution

Statistics has applications in nearly every

applied science and engineering discipline

Any time measurements are made statistical

analysis is how one makes sense of the results

For instance determining a reasonable level

of confidence in a measured parameter


machine learning

1 Ash Basic Probability Theory

2 Jaynes et al Probability Theory The Logic of Science

3. Erwin Kreyszig. Advanced Engineering Mathematics. 10th ed. John Wiley & Sons Limited, 2011. ISBN 9781119571094. The authoritative resource for engineering mathematics. It includes detailed accounts of probability, statistics, vector calculus, linear algebra, Fourier analysis, ordinary and partial differential equations, and complex analysis. It also includes several other topics with varying degrees of depth. Overall, it is the best place to start when seeking mathematical guidance.

requires statistics

A particularly hot topic nowadays is machine

learning which seems to be a field with

applications that continue to expand This field

is fundamentally built on statistics

A good introduction to statistics appears at

the end of Ash1 A more involved introduction

is given by Jaynes et al2 The treatment by

Kreyszig3 is rather incomplete as will be our

own


population

sample

random

machine learning

training

training set

features

predictive model

statsterms Populations samples

and machine

learning

An experimentrsquos population is a complete

collection of objects that we would like to

study These objects can be people machines

processes or anything else we would like to

understand experimentally

Of course we typically canrsquot measure all of

the population Instead we take a subset of

the populationmdashcalled a samplemdashand infer the

characteristics of the entire population from

this sample

However this inference that the sample is

somehow representative of the population

assumes the sample size is sufficiently large

and that the sampling is random This means

selection of the sample should be such that no

one group within a population is systematically

over- or under-represented in the sample

Machine learning is a field that makes

extensive use of measurements and statistical

inference In it an algorithm is trained by

exposure to sample data which is called a

training set The variables measured are

called features Typically a predictive

model is developed that can be used to

extrapolate from the data to a new situation

The methods of statistical analysis we

introduce in this chapter are the foundation

of most machine learning methods


Example statstermsshy1 re combat boots

Problem Statement

Consider a robot Pierre with a particular

gravitas and sense of style He seeks just

the right-looking pair of combat boots

for wearing in the autumn rains Pierre

is to purchase the boots online via image

recognition and decides to gather data

by visiting a hipster hangout one evening to

train his style For contrast he also watches

footage of a White Nationalist rally focusing

special attention on the boots of wearers of

khakis and polos Comment on Pierrersquos methods


sample mean

statssample Estimation of sample

mean and variance

Estimation and sample statistics

The mean and variance definitions of

Lecture probE and Lecture probmoments apply

only to a random variable for which we have a

theoretical probability distribution Typically

it is not until after having performed many

measurements of a random variable that

we can assign a good distribution model

Until then measurements can help us estimate

aspects of the data We usually start by

estimating basic parameters such as mean

and variance before estimating a probability

distribution

There are two key aspects to randomness in the

measurement of a random variable First of

course there is the underlying randomness with

its probability distribution mean standard

deviation etc which we call the population

statistics Second there is the statistical

variability that is due to the fact that we

are estimating the random variable's statistics, called its sample statistics, from some sample.

Statistical variability is decreased with

greater sample size and number of samples

whereas the underlying randomness of the

random variable does not decrease Instead

our estimates of its probability distribution

and statistics improve

Sample mean variance and standard

deviation

The arithmetic mean or sample mean of a

measurand with sample size N represented by


population mean

sample variance

population variance

random variable X is defined as

$$\bar{x} = \frac{1}{N} \sum_{i=1}^{N} x_i \quad \text{(sample1)}$$

If the sample size is large, $\bar{x} \to m_X$ (the sample mean approaches the mean). The population mean is another name for the mean $m_X$, which is equal to

$$m_X = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} x_i \quad \text{(sample2)}$$

Recall that the definition of the mean is

$m_X = E[X]$.

The sample variance of a measurand

represented by random variable X is defined

as

$$S_X^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2 \quad \text{(sample3)}$$

If the sample size is large, $S_X^2 \to \sigma_X^2$ (the sample variance approaches the variance). The population variance is another term for the variance $\sigma_X^2$ and can be expressed as

$$\sigma_X^2 = \lim_{N \to \infty} \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \bar{x})^2 \quad \text{(sample4)}$$

Recall that the definition of the variance is

$\sigma_X^2 = E[(X - m_X)^2]$.

The sample standard deviation of a measurand

represented by random variable X is defined

as

$$S_X = \sqrt{S_X^2} \quad \text{(sample5)}$$

If the sample size is large, $S_X \to \sigma_X$ (the sample

standard deviation approaches the standard


mean of means Xi

mean of standard deviations SXi

standard deviation of means SXi

standard deviation of standard deviations SSXi

deviation) The population standard deviation is

another term for the standard deviation σX

and can be expressed as

$$\sigma_X = \lim_{N \to \infty} \sqrt{S_X^2} \quad \text{(sample6)}$$

Recall that the definition of the standard

deviation is $\sigma_X = \sqrt{\sigma_X^2}$.
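Equations sample1, sample3, and sample5 translate directly to code (a Python sketch on a small made-up sample; note the N − 1 in the sample variance denominator):

```python
import math

def sample_mean(xs):
    return sum(xs) / len(xs)                                # Equation sample1

def sample_var(xs):
    xbar = sample_mean(xs)
    return sum((x - xbar)**2 for x in xs) / (len(xs) - 1)   # Equation sample3

def sample_std(xs):
    return math.sqrt(sample_var(xs))                        # Equation sample5

xs = [9.8, 10.1, 10.0, 9.9, 10.2]  # hypothetical measurements
xbar, s2, s = sample_mean(xs), sample_var(xs), sample_std(xs)
```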

Sample statistics as random variables

There is an ambiguity in our usage of the term

"sample." It can mean just one measurement

or it can mean a collection of measurements

gathered together Hopefully it is clear from

context

In the latter sense often we collect multiple

samples each of which has its own sample

mean Xi and standard deviation SXi In this

situation Xi and SXi are themselves random

variables (meta af I know) This means they

have their own sample means Xi and SXi and

standard deviations SXi

and SSXi

The mean of means Xi is equivalent to

a mean with a larger sample size and is

therefore our best estimate of the mean of

the underlying random process The mean of

standard deviations SXi is our best estimate

of the standard deviation of the underlying

random process The standard deviation

of means SXi

is a measure of the spread

in our estimates of the mean It is our best

estimate of the standard deviation of the

statistical variation and should therefore

tend to zero as sample size and number of

samples increases The standard deviation

of standard deviations SSXiis a measure of


the spread in our estimates of the standard

deviation of the underlying process It should

also tend to zero as sample size and number of

samples increases

Let N be the size of each sample It can be

shown that the standard deviation of the

means SXi

can be estimated from a single

sample standard deviation

$$S_{\bar{X}_i} \approx \frac{S_{X_i}}{\sqrt{N}} \quad \text{(sample7)}$$

This shows that as the sample size N increases

the statistical variability of the mean

decreases (and in the limit approaches zero)
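Equation sample7 can be checked empirically (a Python sketch): generate many samples of size N and watch the standard deviation of their means shrink like $1/\sqrt{N}$.

```python
import math
import random

def std_of_sample_means(N, M=2000, seed=0):
    """Empirical standard deviation of M sample means,
    each computed from a sample of size N."""
    rng = random.Random(seed)
    means = []
    for _ in range(M):
        sample = [rng.gauss(0.0, 1.0) for _ in range(N)]
        means.append(sum(sample) / N)
    mbar = sum(means) / M
    return math.sqrt(sum((m - mbar)**2 for m in means) / (M - 1))

s10, s160 = std_of_sample_means(10), std_of_sample_means(160)
# per Equation sample7, s10 / s160 should be near sqrt(160 / 10) = 4
```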

Nonstationary signal statistics

The sample mean variance and standard

deviation definitions above assume the random

process is stationary, that is, its population

mean does not vary with time However

a great many measurement signals have

populations that do vary with time ie they are

nonstationary Sometimes the nonstationarity

arises from a "drift" in the dc value of a

signal or some other slowly changing variable

But dynamic signals can also change in a

recognizable and predictable manner as when

say the temperature of a room changes when

a window is opened or when a water level

changes with the tide

Typically we would like to minimize the effect

of nonstationarity on the signal statistics In

certain cases such as drift the variation is a

nuisance only, but other times it is the point

of the measurement


Two common techniques are used depending

on the overall type of nonstationarity If it

is periodic with some known or estimated

period the measurement data series can

be "folded" or "reshaped" such that the ith

measurement of each period corresponds to

the ith measurement of all other periods In

this case somewhat counterintuitively we can

consider the ith measurements to correspond

to a sample of size N where N is the number of

periods over which measurements are made

When the signal is aperiodic we often simply

divide it into "small" (relative to its overall

trend) intervals over which statistics are

computed separately

Note that in this discussion we have assumed

that the nonstationarity of the signal is due to

a variable that is deterministic (not random)

Example statssampleshy1 re measurement of gaussian noise on

nonstationary signal

Problem Statement

Consider the measurement of the temperature

inside a desktop computer chassis via an

inexpensive thermistor a resistor that

changes resistance with temperature The

processor and power supply heat the chassis

in a manner that depends on processing

demand For the test protocol the processors

are cycled sinusoidally through processing

power levels at a frequency of 50 mHz for

nT = 12 periods and sampled at 1 Hz Assume

a temperature fluctuation between about 20

and 50 C and Gaussian noise with standard


deviation 4 C Consider a sample to be the

multiple measurements of a certain instant in

the period

1 Generate and plot simulated

temperature data as a time series

and as a histogram or frequency

distribution Comment on why the

frequency distribution sucks

2 Compute the sample mean and standard

deviation for each sample in the cycle

3 Subtract the mean from each sample

in the period such that each sample

distribution is centered at zero Plot

the composite frequency distribution

of all samples together This represents

our best estimate of the frequency

distribution of the underlying process

4 Plot a comparison of the theoretical

mean which is 35 and the sample mean

of means with an error bar Vary the

number of samples nT and comment on its

effect on the estimate

5 Plot a comparison of the theoretical

standard deviation and the sample mean

of sample standard deviations with an

error bar Vary the number of samples

nT and comment on its effect on the

estimate

6 Plot the sample means over a single

period with error bars of plusmn one sample

standard deviation of the means This

represents our best estimate of the

sinusoidal heating temperature Vary the

number of samples nT and comment on

the estimate

Problem Solution


clear; close all; % clear kernel

Generate the temperature data

The temperature data can be generated by

constructing an array that is passed to a

sinusoid, then "randomized" by Gaussian random numbers. Note that we add 1 to np and n to avoid the sneaky fencepost error.

f = 50e-3;   % Hz ... sinusoid frequency
a = 15;      % C ... amplitude of oscillation
dc = 35;     % C ... dc offset of oscillation
fs = 1;      % Hz ... sampling frequency
nT = 12;     % number of sinusoid periods
s = 4;       % C ... standard deviation
np = fs/f+1; % number of samples per period
n = nT*np+1; % total number of samples

t_a = linspace(0,nT/f,n);       % time array
sin_a = dc + a*sin(2*pi*f*t_a); % sinusoidal array
rng(43);                        % seed the random number generator
noise_a = s*randn(size(t_a));   % gaussian noise
signal_a = sin_a + noise_a;     % sinusoid + noise

Now that we have an array of "data," we're ready to plot.

h = figure;
p = plot(t_a,signal_a,'o-',...
  'Color',[.8,.8,.8],...
  'MarkerFaceColor','b',...
  'MarkerEdgeColor','none',...
  'MarkerSize',3);
xlabel('time (s)')
ylabel('temperature (C)')
hgsave(h,'figures/temp')


Figure 041 temperature over time

This is something like what we might see for

continuous measurement data Now the

histogram

h = figure;
histogram(signal_a,...
  30,...                           % number of bins
  'normalization','probability'... % for PMF
)
xlabel('temperature (C)')
ylabel('probability')
hgsave(h,'figures/temp')


Figure 042 a poor histogram due to the nonstationarity of the signal


This sucks because we plot a frequency

distribution to tell us about the random

variation but this data includes the sinusoid

Sample mean variance and standard

deviation

To compute the sample mean µ and standard

deviation s for each sample in the period

we must ldquopick outrdquo the nT data points that

correspond to each other Currently theyrsquore in

one long 1 times n array signal_a It is helpful

to reshape the data so it is in an nT times nparray which each row corresponding to a

new period This leaves the correct points

aligned in columns It is important to note

that we can do this "folding" operation only

when we know rather precisely the period

of the underlying sinusoid It is given in the

problem that it is a controlled experiment

variable If we did not know it we would

have to estimate it too from the data

signal_ar = reshape(signal_a(1:end-1),[np,nT])';
size(signal_ar)     % check size
signal_ar(1:3,1:4)  % print three rows, four columns

ans =
    12    21

ans =
   30.2718   40.0946   40.8341   44.7662
   40.1836   37.2245   49.4076   46.1137
   40.0571   40.9718   46.1627   41.9145


Define the mean variance and standard

deviation functions as "anonymous functions." We "roll our own." These are not

as efficient or flexible as the built-in Matlab

functions mean var and std which should

typically be used

my_mean = @(vec) sum(vec)/length(vec);
my_var = @(vec) sum((vec-my_mean(vec)).^2)/(length(vec)-1);
my_std = @(vec) sqrt(my_var(vec));

Now the sample mean variance and standard

deviations can be computed We proceed by

looping through each column of the reshaped

signal array

mu_a = NaN*ones(1,np);  % initialize mean array
var_a = NaN*ones(1,np); % initialize var array
s_a = NaN*ones(1,np);   % initialize std array

for i = 1:np % for each column
  mu_a(i) = my_mean(signal_ar(:,i));
  var_a(i) = my_var(signal_ar(:,i));
  s_a(i) = sqrt(var_a(i)); % touch of speed
end

Composite frequency distribution

The columns represent samples We want to

subtract the mean from each column We use

repmat to reproduce mu_a in nT rows so it can

be easily subtracted


signal_arz = signal_ar - repmat(mu_a,[nT,1]);
size(signal_arz)     % check size
signal_arz(1:3,1:4)  % print three rows, four columns

ans =
    12    21

ans =
   -5.0881    0.9525   -0.2909   -1.5700
    4.8237   -1.9176    8.2826   -0.2225
    4.6972    1.8297    5.0377   -4.4216

Now that all samples have the same mean

we can lump them into one big bin for the

frequency distribution There are some nice

built-in functions to do a quick reshape and

fit

% resize
signal_arzr = reshape(signal_arz,[1,nT*np]);
size(signal_arzr) % check size
% fit
pdfit_model = fitdist(signal_arzr','normal'); % fit
x_a = linspace(-15,15,100);
pdfit_a = pdf(pdfit_model,x_a);
pdf_a = normpdf(x_a,0,s); % theoretical pdf

ans =
     1   252

Plot


h = figure;
histogram(signal_arzr,...
  round(s*sqrt(nT)),...            % number of bins
  'normalization','probability'... % for PMF
)
hold on
plot(x_a,pdfit_a,'b-','linewidth',2); hold on
plot(x_a,pdf_a,'g--','linewidth',2);
legend('pmf','pdf est','pdf')
xlabel('zero-mean temperature (C)')
ylabel('probability mass/density')
hgsave(h,'figures/temp')


Figure 043 PMF and estimated and theoretical PDFs

Means comparison

The sample mean of means is simply the

following

mu_mu = my_mean(mu_a)

mu_mu =
   35.1175


The standard deviation that works as an

error bar which should reflect how well

we can estimate the point plotted is the

standard deviation of the means It is

difficult to compute this directly for a

nonstationary process We use the estimate

given above and improve upon it by using the

mean of standard deviations instead of a

single samplersquos standard deviation

s_mu = mean(s_a)/sqrt(nT)

s_mu =
    1.1580

Now for the simple plot

h = figure;
bar(mu_mu); hold on                       % gives bar
errorbar(mu_mu,s_mu,'r','linewidth',2);   % error bar
ax = gca;                                 % current axis
ax.XTickLabels = {'$\overline{\overline{X}}$'};
ax.TickLabelInterpreter = 'latex';
hgsave(h,'figures/temp')

Standard deviations comparison

The sample mean of standard deviations is

simply the following

mu_s = my_mean(s_a)


mu_s =
    4.0114

The standard deviation that works as

an error bar which should reflect how

well we can estimate the point plotted is

the standard deviation of the standard

deviations We can compute this directly

s_s = my_std(s_a)

s_s =
    0.8495

Now for the simple plot

h = figure;
bar(mu_s); hold on                      % gives bar
errorbar(mu_s,s_s,'r','linewidth',2);   % error bars
ax = gca;                               % current axis
ax.XTickLabels = {'$\overline{S_X}$'};
ax.TickLabelInterpreter = 'latex';
hgsave(h,'figures/temp')


Figure 044 sample mean of sample means


Figure 045 sample mean of sample standard deviations

Plot a period with error bars

Plotting the data with error bars is fairly

straightforward with the built-in errorbar


function. The main question is "which standard deviation?" Since we're plotting the means, it makes sense to plot the error bars

as a single sample standard deviation of the

means

h = figure;
e1 = errorbar(t_a(1:np),mu_a,s_mu*ones(1,np),'b');
hold on
t_a2 = linspace(0,1/f,101);
e2 = plot(t_a2,dc + a*sin(2*pi*f*t_a2),'r-');
xlim([t_a(1),t_a(np)])
grid on
xlabel('folded time (s)')
ylabel('temperature (C)')
legend([e1,e2],'sample mean','population mean','Location','NorthEast')
hgsave(h,'figures/temp')


Figure 046 sample means over a period


confidence

central limit theorem

statsconfidence Confidence

One really ought to have it to give a lecture named it, but we'll give it a try anyway. Confidence is used in the common sense, although we do endow it with a mathematical definition to scare business majors, who aren't actually impressed but indifferent. Approximately, if under some reasonable assumptions (probabilistic model) we estimate the probability of some event to be P, we say we have P confidence in it. I mean, business majors are all "Supply and demand? Let's call that a 'law,'" so I think we're even.

So wersquore back to computing probability from

distributions: probability density functions

(PDFs) and probability mass functions (PMFs)

Usually we care most about estimating the

mean of our distribution Recall from the

previous lecture that when several samples

are taken each with its own mean the mean

is itself a random variable, with a mean of course. Meanception.

But the mean has a probability distribution

of its own The central limit theorem has

as one of its implications that as the sample

size N gets large regardless of the sample

distributions this distribution of means

approaches the Gaussian distribution

But sometimes I worry I'm being lied to, so let's check.


clear; close all; % clear kernel

Generate some data to test the central

limit theorem

Data can be generated by constructing an

array using a (seeded for consistency) random

number generator. Let's use a uniformly

distributed PDF between 0 and 1

N = 150;      % sample size (number of measurements per sample)
M = 120;      % number of samples
n = N*M;      % total number of measurements
mu_pop = 0.5; % because it's a uniform PDF between 0 and 1

rng(11);              % seed the random number generator
signal_a = rand(N,M); % uniform PDF
size(signal_a)        % check the size

ans =
   150   120

Let's take a look at the data by plotting

the first ten samples (columns) as shown in

Figure 047

This is something like what we might see for

continuous measurement data Now the

histogram shown in Figure 048

samples_to_plot = 10;
h = figure;


Figure 047 raw data with colors corresponding to samples


Figure 048 a histogram showing the approximately uniform distribution of each sample (color)

c = jet(samples_to_plot); % color array
for j = 1:samples_to_plot
  histogram(signal_a(:,j),...
    30,...                           % number of bins
    'facecolor',c(j,:),...
    'facealpha',.3,...
    'normalization','probability'... % for PMF
  )
  hold on
end
hold off
xlim([-0.05,1.05])
xlabel('measurement')
ylabel('probability')
hgsave(h,'figures/temp')

This isn't a great plot, but it shows roughly

that each sample is fairly uniformly

distributed


Sample statistics

Now let's check out the sample statistics. We want the sample mean and standard deviation of each column. Let's use the built-in functions mean and std.

mu_a = mean(signal_a,1);  % mean of each column
s_a = std(signal_a,0,1);  % std of each column

Now we can compute the mean statistics: both the mean of the means and the standard deviation of the means, which we don't strictly need for this part, but we're curious. We choose to use the direct estimate instead of the $S_X/\sqrt{N}$ formula, but they should be close.

mu_mu = mean(mu_a)
s_mu = std(mu_a)

mu_mu =
    0.4987

s_mu =
    0.0236

The truth about sample means

It's the moment of truth. Let's look at the

distribution shown in Figure 049


Figure 049 a histogram showing the approximately normal distribution of the means

h = figure;
histogram(mu_a,...
  'normalization','probability') % for PMF
hold off
xlabel('measurement')
ylabel('probability')
hgsave(h,'figures/temp')

This looks like a Gaussian distribution about

the mean of means so I guess the central limit

theorem is legit
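The same check is quick to reproduce in Python (a sketch mirroring the Matlab script's N and M): the mean of means should land near the population mean 0.5, and the spread of the means should match the CLT prediction $\sigma/\sqrt{N}$ with $\sigma = 1/\sqrt{12}$ for a uniform(0, 1) variable.

```python
import math
import random

random.seed(2)
N, M = 150, 120  # sample size and number of samples
means = [sum(random.random() for _ in range(N)) / N for _ in range(M)]

mu_mu = sum(means) / M                                          # mean of means
s_mu = math.sqrt(sum((m - mu_mu)**2 for m in means) / (M - 1))  # std of means
predicted = 1.0 / math.sqrt(12 * N)                             # CLT: sigma / sqrt(N)
```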

Gaussian and probability

We already know how to compute the probability P that a value of a random variable X lies in a certain interval from a PMF or PDF (the sum or the integral, respectively). This means that, for sufficiently large sample size N (such that we can assume from the central limit theorem that the sample means x_i are normally distributed), the probability a sample mean value x_i is in a certain interval is given by integrating the Gaussian PDF. The Gaussian PDF for a random variable Y representing the sample means is

f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(\frac{-(y - \mu)^2}{2\sigma^2}\right)   (confidence.1)


Gaussian cumulative distribution function

where micro is the population mean and σ is the

population standard deviation

The integral of f over some interval is the

probability a value will be in that interval

Unfortunately, that integral is uncool. It gives rise to the definition of the error function, which for the Gaussian random variable Y is

erf(yb) =1radicπ

ˆ yb

ndashyb

endasht2dt (confidence2)

This expresses the probability of a sample mean being in the interval [-y_b, y_b] if Y has mean 0 and variance 1/2.
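As a quick numerical check, Python's standard-library math.erf evaluates this function directly (a minimal sketch; values rounded):

```python
import math

# erf(y_b) is the probability that a mean-0, variance-1/2 Gaussian
# random variable lands in the interval [-y_b, y_b]
p1 = math.erf(1.0)
p3 = math.erf(3.0)
print(p1)  # ≈ 0.8427
print(p3)  # erf saturates toward 1 as the interval grows
```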

Matlab has this built-in as erf, shown in Figure 04.10.

y_a = linspace(0,3,100);
h = figure;
p1 = plot(y_a,erf(y_a));
p1.LineWidth = 2;
grid on
xlabel('interval bound $y_b$','interpreter','latex')
ylabel('error function $\textrm{erf}(y_b)$',...
  'interpreter','latex')
hgsave(h,'figures/temp')

We could deal directly with the error function, but most people don't, and we're weird enough as it is. Instead, people use the Gaussian cumulative distribution function


Figure 04.10: the error function (interval bound y_b vs. erf(y_b)).

(CDF) \Phi : \mathbb{R} \to \mathbb{R}, which is defined as

\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}\left(\frac{z}{\sqrt{2}}\right)\right)   (confidence.3)

and which expresses the probability of a Gaussian random variable Z with mean 0 and standard deviation 1 taking on a value in the interval (-\infty, z]. The Gaussian CDF and PDF are shown in Figure 04.11. Values can be taken directly from the graph, but it's more accurate to use the table of values in Appendix A01.

That's great and all, but occasionally (always) we have Gaussian random variables with nonzero means and nonunity standard deviations. It turns out we can shift any Gaussian random variable by its mean and scale it by its standard deviation to make it have zero mean and unity standard deviation. We can then use \Phi and interpret the results as being relative to the mean and standard deviation, using phrases like "the probability it is within two standard deviations of its mean." The transformed random variable Z and its values z are sometimes called the z-score.

For a particular value x of a random variable

X we can compute its z-score (or value z of


Figure 04.11: the Gaussian PDF f(z) (a) and CDF \Phi(z_b) (b) for z-scores.

random variable Z) with the formula

z = \frac{x - \mu_X}{\sigma_X}   (confidence.4)

and compute the probability of X taking on a value within the interval, say x \in [x_{b-}, x_{b+}], from the table. (Sample statistics \overline{X} and S_X are appropriate when population statistics are unknown.)

For instance, compute the probability a Gaussian random variable X with \mu_X = 5 and \sigma_X = 2.34 takes on a value within the interval x \in [3, 6].

1 Compute the z-score of each endpoint of

the interval


z_3 = \frac{3 - \mu_X}{\sigma_X} \approx -0.85   (confidence.5)

z_6 = \frac{6 - \mu_X}{\sigma_X} \approx 0.43   (confidence.6)

2. Look up the CDF values for z_3 and z_6, which are \Phi(z_3) = 0.1977 and \Phi(z_6) = 0.6664.
3. The CDF values correspond to the probabilities x < 3 and x < 6. Therefore, to find the probability x lies in that interval, we subtract the lower bound probability:

P(x \in [3, 6]) = P(x < 6) - P(x < 3)   (confidence.7)
= \Phi(z_6) - \Phi(z_3)   (confidence.8)
\approx 0.6664 - 0.1977   (confidence.9)
\approx 0.4689   (confidence.10)

So there is a 46.89% probability, and therefore we have 46.89% confidence, that x \in [3, 6].
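This table lookup can be checked in code. The sketch below (standard library only) builds \Phi from the error function per equation confidence.3 and reproduces the interval probability up to the table's rounding:

```python
import math

def Phi(z):
    # standard Gaussian CDF via the error function (confidence.3)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu_X, sigma_X = 5, 2.34
z3 = (3 - mu_X) / sigma_X  # ≈ -0.85
z6 = (6 - mu_X) / sigma_X  # ≈ 0.43
p = Phi(z6) - Phi(z3)
print(p)  # ≈ 0.469, matching the table-based value up to rounding
```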

Often we want to go the other way, estimating the symmetric interval [x_{b-}, x_{b+}] for which there is a given probability. In this case, we first look up the z-score corresponding to a certain probability. For concreteness, given the same population statistics above, let's find the symmetric interval [x_{b-}, x_{b+}] over which we have 90% confidence. From the table, we want two symmetric z-scores that have CDF-value difference 0.9. Or, in maths,

\Phi(z_{b+}) - \Phi(z_{b-}) = 0.9 \quad\text{and}\quad z_{b+} = -z_{b-}.   (confidence.11)


Due to the latter relation, we have the additional fact that the Gaussian CDF has antisymmetry:

\Phi(z_{b+}) + \Phi(z_{b-}) = 1.   (confidence.12)

Adding the two \Phi equations,

2\Phi(z_{b+}) = 1.9   (confidence.13)
\Phi(z_{b+}) = 0.95   (confidence.14)

and \Phi(z_{b-}) = 0.05. From the table, these correspond (with a linear interpolation) to z_b = z_{b+} = -z_{b-} \approx 1.645. All that remains is to solve the z-score formula for x:

x = \mu_X + z\sigma_X   (confidence.15)

From this,

x_{b+} = \mu_X + z_{b+}\sigma_X \approx 8.849   (confidence.16)
x_{b-} = \mu_X + z_{b-}\sigma_X \approx 1.151   (confidence.17)

and X has a 90% confidence interval [1.151, 8.849].
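Going the other way requires inverting \Phi. Lacking a table, a simple bisection on \Phi works; this is a sketch with the standard library only, and Phi_inv is a hypothetical helper standing in for the table lookup:

```python
import math

def Phi(z):
    # standard Gaussian CDF via the error function (confidence.3)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def Phi_inv(p):
    # invert the Gaussian CDF by bisection (stand-in for a table lookup)
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi) / 2
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

mu_X, sigma_X = 5, 2.34
z_b = Phi_inv(0.95)          # ≈ 1.645
x_lo = mu_X - z_b * sigma_X  # ≈ 1.151
x_hi = mu_X + z_b * sigma_X  # ≈ 8.849
```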


Example statsconfidence-1: Gaussian confidence for a mean

Problem Statement: Consider the data set generated above. What is our 95% confidence interval in our estimate of the mean?

Problem Solution

Assuming we have a sufficiently large data set, the distribution of means is approximately Gaussian. Following the same logic as above, we need the z-score that gives an upper CDF value of (1 + 0.95)/2 = 0.975. From the table, we obtain the z_b = z_{b+} = -z_{b-} below.

z_b = 1.96;

Now we can estimate the mean using our sample and mean statistics:

\mu_X = \overline{X} \pm z_b S_{\overline{X}}   (confidence.18)

mu_x_95 = mu_mu + [-z_b,z_b]*s_mu

mu_x_95 =
    0.4526    0.5449


This is our 95% confidence interval in our estimate of the mean.
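The same arithmetic can be sketched in Python, using the (rounded) sample statistics reported above; small differences from the Matlab output are due to rounding:

```python
z_b = 1.96                    # two-sided 95% Gaussian critical value
mu_mu, s_mu = 0.4987, 0.0236  # rounded sample statistics from above
ci_lo = mu_mu - z_b * s_mu
ci_hi = mu_mu + z_b * s_mu
print(ci_lo, ci_hi)  # ≈ 0.4524, 0.5450
```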


Student's t-distribution

degrees of freedom

statsstudent Student confidence

The central limit theorem tells us that for large sample size N the distribution of the means is Gaussian. However, for small sample size, the Gaussian isn't as good of an estimate. Student's t-distribution is superior for lower sample size and equivalent at higher sample size. Technically, if the population standard deviation \sigma_X is known, even for low sample size we should use the Gaussian distribution. However, this rarely arises in practice, so we can usually get away with an "always t" approach.

A way that the t-distribution accounts for low N is by having an entirely different distribution for each N (seems a bit of a cheat to me). Actually, instead of N, it uses the degrees of freedom \nu, which is N minus the number of parameters required to compute the statistic. Since the standard deviation requires only the mean, for most of our cases \nu = N - 1.

As with the Gaussian distribution, the t-distribution's integral is difficult to calculate. Typically we will use a t-table such as the one given in Appendix A02. There are three points of note:

1. Since we are primarily concerned with going from probability/confidence values (e.g., a given probability/confidence P) to intervals, typically there is a column for each probability.


2. The extra parameter \nu takes over one of the dimensions of the table, because three-dimensional tables are illegal.
3. Many of these tables are "two-sided," meaning their t-scores and probabilities assume you want the symmetric probability about the mean over the interval [-t_b, t_b], where t_b is your t-score bound.
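The effect of \nu can be seen by comparing two-sided 95% critical values from a standard t-table (the numbers below are assumed table values, not computed here) with the Gaussian value 1.960:

```python
# two-sided 95% t critical values for a few degrees of freedom nu,
# taken from a standard t-table (assumed values)
t_95 = {4: 2.776, 9: 2.262, 29: 2.045, 99: 1.984}
z_95 = 1.960  # Gaussian limit as nu grows large
for nu in sorted(t_95):
    # intervals shrink toward the Gaussian value as nu increases
    print(nu, t_95[nu])
```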

Consider the following example

Example statsstudent-1: Student confidence interval

Problem Statement: Write a Matlab script to generate a data set with 200 samples and sample sizes N \in \{10, 20, 100\}, using any old distribution. Compare the distribution of the means for the different N. Use the sample distributions and a t-table to compute 99% confidence intervals.

Problem Solution

Generate the data set

confidence = 0.99; % requirement
M = 200; % number of samples
N_a = [10,20,100]; % sample sizes
mu = 27; % population mean
sigma = 9; % population std
rng(1) % seed random number generator
data_a = mu + sigma*randn(N_a(end),M); % normal


size(data_a) % check size
data_a(1:10,1:5) % check 10 rows and five columns

ans =
   100   200

ans =
   21.1589   30.2894   27.8705   30.7835   28.3662
   37.6305   17.1264   28.2973   24.0811   34.3486
   20.1739   44.3719   43.7059   39.0699   32.2002
   17.0135   32.6064   36.9030   37.9230   36.5747
   19.3900   32.9156   23.7230   22.4749   19.7709
   21.8460   13.8295   31.2479   16.9527   34.1876
   21.9719   34.6854   19.4480   18.7014   24.1642
   28.6054   32.2244   22.2873   26.9906   37.6746
   25.2282   18.7326   14.5011   28.3814   27.7645
   32.2780   34.1538   27.0382   18.8643   14.1752

Compute the means for different sample sizes

mu_a = NaN*ones(length(N_a),M);
for i = 1:length(N_a)
  mu_a(i,:) = mean(data_a(1:N_a(i),1:M),1);
end

Plotting the distribution of the means yields Figure 04.12.

Figure 04.12: a histogram showing the distribution of the means for each sample size, with intervals for N = 10, 20, and 100.

It makes sense that the larger the sample size

the smaller the spread A quantitative metric

for the spread is of course the standard

deviation of the means for each sample size

S_mu = std(mu_a,0,2)

S_mu =
    2.8365
    2.0918
    1.0097

Look up t-table values (or use Matlab's tinv) for different sample sizes and 99% confidence. Use these, the mean of means, and the standard deviation of means to compute the 99% confidence interval for each N.

t_a = tinv(confidence,N_a-1)
for i = 1:length(N_a)
  interval = mean(mu_a(i,:)) + [-1,1]*t_a(i)*S_mu(i);
  disp(sprintf('interval for N = %i:',N_a(i)))
  disp(interval)
end


t_a =
    2.8214    2.5395    2.3646

interval for N = 10:
   19.0942   35.1000

interval for N = 20:
   21.6292   32.2535

interval for N = 100:
   24.7036   29.4787

As expected, the larger the sample size, the smaller the interval over which we have 99% confidence in the estimate.


joint PDF

joint PMF

statsmultivar Multivariate probability and correlation

Thus far we have considered probability density and mass functions (PDFs and PMFs) of only one random variable. But of course, often we measure multiple random variables X_1, X_2, ..., X_n during a single event, meaning a measurement consists of determining values x_1, x_2, ..., x_n of these random variables.

We can consider an n-tuple of continuous random variables to form a sample space \Omega = \mathbb{R}^n, on which a multivariate function f : \mathbb{R}^n \to \mathbb{R}, called the joint PDF, assigns a probability density to each outcome x \in \mathbb{R}^n. The joint PDF must be greater than or equal to zero for all x \in \mathbb{R}^n, the multiple integral over \Omega must be unity, and the multiple integral over a subset of the sample space A \subset \Omega is the probability of the event A.

We can consider an n-tuple of discrete random variables to form a sample space \mathbb{N}_0^n, on which a multivariate function f : \mathbb{N}_0^n \to \mathbb{R}, called the joint PMF, assigns a probability to each outcome x \in \mathbb{N}_0^n. The joint PMF must be greater than or equal to zero for all x \in \mathbb{N}_0^n, the multiple sum over \Omega must be unity, and the multiple sum over a subset of the sample space A \subset \Omega is the probability of the event A.


Example statsmultivar-1: bivariate Gaussian PDF

Problem Statement: Let's visualize multivariate PDFs by plotting a bivariate Gaussian using Matlab's mvnpdf function.

Problem Solution

mu = [10,20]; % means
Sigma = [1,0;0,2]; % cov, we'll talk about this
x1_a = linspace(...
  mu(1)-5*sqrt(Sigma(1,1)),mu(1)+5*sqrt(Sigma(1,1)),30);
x2_a = linspace(...
  mu(2)-5*sqrt(Sigma(2,2)),mu(2)+5*sqrt(Sigma(2,2)),30);
[X1,X2] = meshgrid(x1_a,x2_a);
f = mvnpdf([X1(:) X2(:)],mu,Sigma);
f = reshape(f,length(x2_a),length(x1_a));

h = figure;
p = surf(x1_a,x2_a,f);
xlabel('$x_1$','interpreter','latex')
ylabel('$x_2$','interpreter','latex')
zlabel('$f(x_1,x_2)$','interpreter','latex')
shading interp
hgsave(h,'figures/temp')

The result is Figure 04.13. Note how the means and standard deviations affect the distribution.



marginal PDF

Figure 04.13: two-variable Gaussian PDF

Marginal probability

The marginal PDF of a multivariate PDF is the PDF of some subspace of \Omega after one or more variables have been "integrated out," such that fewer random variables remain. Of course, these marginal PDFs must have the same properties as any PDF, such as integrating to unity.

Example statsmultivar-2: bivariate Gaussian marginal probability

Problem Statement: Let's demonstrate this by numerically integrating over x_2 in the bivariate Gaussian above.

Problem Solution

Continuing from where we left off, let's integrate.

f1 = trapz(x2_a,f); % trapezoidal integration over x_2

Plotting this yields Figure 04.14.

h = figure;
p = plot(x1_a,f1);
p.LineWidth = 2;
xlabel('$x_1$','interpreter','latex')
ylabel('$g(x_1)=\int_{-\infty}^{\infty} f(x_1,x_2)\, d x_2$',...
  'interpreter','latex')
hgsave(h,'figures/temp')

machine learning

Figure 04.14: marginal Gaussian PDF g(x_1).

We should probably verify that this

integrates to one

disp(['integral over x_1 = ',sprintf('%0.7f',trapz(x1_a,f1))])

integral over x_1 = 0.9999986

Not bad

Covariance

Very often especially in machine learning


artificial intelligence

covariance

correlation

sample covariance

covariance matrix

or artificial intelligence applications, the question about two random variables X and Y is: how do they co-vary? That is, what is their covariance, defined as

\operatorname{Cov}[X, Y] \equiv \operatorname{E}\left((X - \mu_X)(Y - \mu_Y)\right) = \operatorname{E}(XY) - \mu_X \mu_Y

Note that when X = Y, the covariance is just the variance. When a covariance is large and positive, it is an indication that the random variables are strongly correlated; when it is large and negative, they are strongly anti-correlated. Zero covariance means the variables are uncorrelated. In fact, correlation is defined as

\operatorname{Cor}[X, Y] = \frac{\operatorname{Cov}[X, Y]}{\sqrt{\operatorname{Var}[X] \operatorname{Var}[Y]}}

This is essentially the covariance "normalized" to the interval [-1, 1].

Sample covariance

As with the other statistics we've considered, covariance can be estimated from measurement. The estimate, called the sample covariance q_{XY}, of random variables X and Y with sample size N is given by

q_{XY} = \frac{1}{N - 1} \sum_{i=1}^{N} (x_i - \overline{X})(y_i - \overline{Y})
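This estimator translates directly into code; the sketch below uses small, hypothetical data sets just to exercise the formula:

```python
# sample covariance q_XY computed directly from its definition
x = [1.0, 2.0, 3.0, 4.0]  # hypothetical measurements of X
y = [2.1, 3.9, 6.2, 7.8]  # hypothetical measurements of Y
N = len(x)
xbar = sum(x) / N
ybar = sum(y) / N
# sum of products of deviations, with the 1/(N-1) normalization
q_xy = sum((xi - xbar) * (yi - ybar) for xi, yi in zip(x, y)) / (N - 1)
print(q_xy)  # positive, so x and y are correlated
```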

Multivariate covariance

With n random variables X_i, one can compute the covariance of each pair. It is common practice to define an n \times n matrix of covariances, called the covariance matrix \Sigma,


sample covariance matrix

such that each pair's covariance Cov[X_i, X_j] (multivar.1) appears in its row-column combination (making \Sigma symmetric), as shown below:

\Sigma =
\begin{bmatrix}
\operatorname{Cov}[X_1, X_1] & \operatorname{Cov}[X_1, X_2] & \cdots & \operatorname{Cov}[X_1, X_n] \\
\operatorname{Cov}[X_2, X_1] & \operatorname{Cov}[X_2, X_2] & \cdots & \operatorname{Cov}[X_2, X_n] \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{Cov}[X_n, X_1] & \operatorname{Cov}[X_n, X_2] & \cdots & \operatorname{Cov}[X_n, X_n]
\end{bmatrix}

The multivariate sample covariance matrix Q is the same as above, but with sample covariances q_{X_i X_j}.

Both covariance matrices have correlation

analogs
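In numpy, np.cov builds the sample covariance matrix Q directly, and np.corrcoef gives its correlation analog (a sketch with synthetic data; the variables and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
x1 = rng.normal(size=500)
x2 = 2 * x1 + rng.normal(size=500)  # strongly correlated with x1
x3 = rng.normal(size=500)           # independent of x1 and x2
data = np.vstack([x1, x2, x3])      # one row per random variable
Q = np.cov(data)       # 3x3 sample covariance matrix, 1/(N-1) norm
R = np.corrcoef(data)  # correlation analog, entries in [-1, 1]
print(np.round(R, 2))
```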

Example statsmultivar-3: car data sample covariance and correlation

Problem Statement: Let's use a built-in multivariate data set that describes different features of cars, listed below.

d = load('carsmall.mat') % this is a struct

Let's compute the sample covariance and correlation matrices.

Problem Solution

variables = {'MPG','Cylinders','Displacement',...
  'Horsepower','Weight','Acceleration','Model_Year'};
n = length(variables);
m = length(d.MPG);

data = NaN*ones(m,n); % preallocate
for i = 1:n
  data(:,i) = d.(variables{i});
end
cov_d = nancov(data) % sample covariance
cor_d = corrcov(cov_d) % sample correlation

This is highly correlated/anticorrelated data. Let's plot each variable versus each other variable to see the correlations of each. We use a red-to-blue colormap to contrast anticorrelation and correlation; purple, then, is uncorrelated.

The following builds the red-to-blue colormap

n_colors = 10;
cmap_rb = NaN*ones(n_colors,3);
for i = 1:n_colors
  a = i/n_colors;
  cmap_rb(i,:) = (1-a)*[1,0,0]+a*[0,0,1];
end

h = figure;
for i = 1:n
  for j = 1:n
    subplot(n,n,sub2ind([n,n],j,i))
    p = plot(d.(variables{i}),d.(variables{j}),'.'); hold on
    this_color = cmap_rb(...
      round((cor_d(i,j)+1)*(n_colors-1)/2)+1,:);
    p.MarkerFaceColor = this_color;
    p.MarkerEdgeColor = this_color;
  end
end
hgsave(h,'figures/temp')

Figure 04.15: car data correlation

Conditional probability and dependence

Independent variables are uncorrelated. However, uncorrelated variables may or may not be independent. Therefore, we cannot use correlation alone as a test for independence. For instance, for random variables X and Y, where X has some even (symmetric) distribution and Y = X^2, clearly the variables are dependent. However, they are also uncorrelated (due to symmetry).


Example statsmultivar-4: uncorrelated but dependent random variables

Problem Statement: Using a uniform distribution U(-1, 1), show with some sampling that X and Y are uncorrelated (but dependent) with Y = X^2. We compute the correlation for different sample sizes.

Problem Solution

N_a = round(linspace(10,500,100));
qc_a = NaN*ones(size(N_a));
rng(6)
x_a = -1 + 2*rand(max(N_a),1);
y_a = x_a.^2;
for i = 1:length(N_a)
  % should write incremental algorithm, but lazy
  q = cov(x_a(1:N_a(i)),y_a(1:N_a(i)));
  qc = corrcov(q);
  qc_a(i) = qc(2,1); % cross correlation
end

The absolute values of the correlations are shown in Figure 04.16. Note that we should probably average several such curves to estimate how the correlation would drop off with N, but the single curve describes our understanding that the correlation in fact approaches zero in the large-sample limit.

h = figure;
p = plot(N_a,abs(qc_a));
p.LineWidth = 2;
xlabel('sample size $N$','interpreter','latex')
ylabel('absolute sample correlation',...
  'interpreter','latex')
hgsave(h,'figures/temp')

Figure 04.16: absolute value of the sample correlation between X ~ U(-1, 1) and Y = X^2 for different sample size N. In the limit, the population correlation should approach zero, and yet X and Y are not independent.


statsregression Regression

Suppose we have a sample with two measurands: (1) the force F through a spring and (2) its displacement X (not from equilibrium). We would like to determine an analytic function that relates the variables, perhaps for prediction of the force given another displacement. There is some variation in the measurement. Let's say the following is the sample:

X_a = 1e-3*[10,21,30,41,49,50,61,71,80,92,100]; % m
F_a = [50.1,50.4,53.2,55.9,57.2,59.9,...
  61.0,63.9,67.0,67.9,70.3]; % N

Let's take a look at the data. The result is Figure 04.17.

h = figure;
p = plot(X_a*1e3,F_a,'b.','MarkerSize',15);
xlabel('$X$ (mm)','interpreter','latex')
ylabel('$F$ (N)','interpreter','latex')
xlim([0,max(X_a*1e3)])
grid on
hgsave(h,'figures/temp')

How might we find an analytic function that agrees with the data? Broadly, our strategy will be to assume a general form of a function and use the data to set the parameters in the


Figure 04.17: force-displacement data (X in mm vs. F in N).

function such that the difference between the

data and the function is minimal

Let y be the analytic function that we would

like to fit to the data Let yi denote the

value of y(xi) where xi is the ith value of the

random variable X from the sample Then we

want to minimize the differences between the

force measurements Fi and yi

From calculus recall that we can minimize a

function by differentiating it and solving for

the zero-crossings (which correspond to local

maxima or minima)

First we need such a function to minimize

Perhaps the simplest effective function D is

constructed by squaring and summing the

differences we want to minimize for sample

size N

D = \sum_{i=1}^{N} (F_i - y_i)^2   (regression.1)


(recall that yi = y(xi) which makes D a function

of x)

Now the form of y must be chosen We consider

only mth-order polynomial functions y but

others can be used in a similar manner

y(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m   (regression.2)

If we treat D as a function of the polynomial coefficients a_j, i.e.

D(a_0, a_1, \cdots, a_m),   (regression.3)

and minimize D, we must take the partial derivatives of D with respect to each a_j and set each equal to zero:

\frac{\partial D}{\partial a_0} = 0, \quad \frac{\partial D}{\partial a_1} = 0, \quad \cdots, \quad \frac{\partial D}{\partial a_m} = 0

This gives us N equations and m + 1 unknowns aj

Writing the system in matrix form

\underbrace{
\begin{bmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^m \\
1 & x_2 & x_2^2 & \cdots & x_2^m \\
1 & x_3 & x_3^2 & \cdots & x_3^m \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_N & x_N^2 & \cdots & x_N^m
\end{bmatrix}
}_{A\ (N\times(m+1))}
\underbrace{
\begin{bmatrix}
a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_m
\end{bmatrix}
}_{a\ ((m+1)\times 1)}
=
\underbrace{
\begin{bmatrix}
F_1 \\ F_2 \\ F_3 \\ \vdots \\ F_N
\end{bmatrix}
}_{b\ (N\times 1)}
\quad (regression.4)

Typically N > m, and this is an overdetermined system. Therefore, we usually can't solve by taking A^{-1}, because A doesn't have an inverse.


Instead, we either find the Moore-Penrose pseudo-inverse A^\dagger and have a = A^\dagger b as the solution, which is inefficient, or we can approximate the least-squares solution with an algorithm such as that used by Matlab's \ operator. In the latter case,

a_a = A\b_a;
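The same least-squares solve can be sketched in Python via numpy, with np.linalg.lstsq playing the role of Matlab's backslash; the data are the spring sample above, and the first-order fit is an assumed choice:

```python
import numpy as np

X = 1e-3 * np.array([10, 21, 30, 41, 49, 50, 61, 71, 80, 92, 100])  # m
F = np.array([50.1, 50.4, 53.2, 55.9, 57.2, 59.9,
              61.0, 63.9, 67.0, 67.9, 70.3])                        # N
m = 1                                     # first-order (linear) fit
A = np.vander(X, m + 1, increasing=True)  # columns 1, x, ..., x^m
a, *_ = np.linalg.lstsq(A, F, rcond=None) # least-squares solution
print(a)  # a[0]: intercept (N), a[1]: slope, i.e. stiffness (N/m)
```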

Example statsregression-1: regression

Problem Statement: Use Matlab's \ operator to find a good polynomial fit for the sample. There's the sometimes-difficult question: "what order should we fit?" Let's try out several and see what the squared-differences function D gives.

Problem Solution

N = length(X_a); % sample size
m_a = 2:N; % all the orders up to N

A = NaN*ones(length(m_a),max(m_a),N);
for k = 1:length(m_a) % each order
  for j = 1:N % each measurement
    for i = 1:( m_a(k) + 1 ) % each coef
      A(k,j,i) = X_a(j)^(i-1);
    end
  end
end
disp(squeeze(A(2,:,1:5)))

1.0000    0.0100    0.0001    0.0000       NaN
1.0000    0.0210    0.0004    0.0000       NaN
1.0000    0.0300    0.0009    0.0000       NaN
1.0000    0.0410    0.0017    0.0001       NaN
1.0000    0.0490    0.0024    0.0001       NaN
1.0000    0.0500    0.0025    0.0001       NaN
1.0000    0.0610    0.0037    0.0002       NaN
1.0000    0.0710    0.0050    0.0004       NaN
1.0000    0.0800    0.0064    0.0005       NaN
1.0000    0.0920    0.0085    0.0008       NaN
1.0000    0.1000    0.0100    0.0010       NaN

We've printed the first five columns of the third-order matrix, which only has four columns, so NaNs fill in the rest.

Now we can use the \ operator to solve for the coefficients.

a = NaN*ones(length(m_a),max(m_a));

warning('off','all')
for i = 1:length(m_a)
  A_now = squeeze(A(i,:,1:m_a(i)));
  a(i,1:m_a(i)) = (A_now(:,1:m_a(i))\F_a');
end
warning('on','all')

n_plot = 100;
x_plot = linspace(min(X_a),max(X_a),n_plot);
y = NaN*ones(n_plot,length(m_a)); % preallocate
for i = 1:length(m_a)
  y(:,i) = polyval(fliplr(a(i,1:m_a(i))),x_plot);
end

h = figure;
for i = 1:2:length(m_a)-1
  p = plot(x_plot*1e3,y(:,i),'linewidth',1.5);
  hold on
  p.DisplayName = sprintf(...
    'order %i',(m_a(i)-1));
end
p = plot(X_a*1e3,F_a,'b.','MarkerSize',15);
xlabel('$X$ (mm)','interpreter','latex')
ylabel('$F$ (N)','interpreter','latex')
p.DisplayName = 'sample';
legend('show','location','southeast')
grid on
hgsave(h,'figures/temp')

Figure 04.18: force-displacement data with curve fits of orders 1, 3, 5, 7, and 9.


statsex Exercises for Chapter 04

calculus

limit

series

derivative

integral

vector calculus

vecs

Vector calculus

A great many physical situations of interest to engineers can be described by calculus. It can describe how quantities continuously change over (say) time, and it gives tools for computing other quantities. We assume familiarity with the fundamentals of calculus: limit, series, derivative, and integral. From these and a basic grasp of vectors, we will outline some of the highlights of vector calculus. Vector calculus is particularly useful for describing the physics of, for instance, the following:

mechanics of particles, wherein is studied the motion of particles and the forcing causes thereof;

rigid-body mechanics, wherein is studied the motion, rotational and translational, and its forcing causes, of bodies considered rigid (undeformable);

solid mechanics, wherein is studied the motion and deformation, and their forcing causes, of continuous solid bodies (those that retain a specific resting shape);


complex analysis

1. For an introduction to complex analysis, see Kreyszig (Kreyszig, Advanced Engineering Mathematics, Part D).
2. Ibid., Chapters 9-10.
3. H.M. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, W.W. Norton, 2005. ISBN 9780393925166.

euclidean vector space

bases

components

basis vectors

invariant

coordinate system

Cartesian coordinates

fluid mechanics, wherein is studied the motion, and its forcing causes, of fluids (liquids, gases, plasmas);

heat transfer, wherein is studied the movement of thermal energy through and among bodies;

electromagnetism, wherein is studied the motion, and its forcing causes, of electrically charged particles.

This last example was in fact very influential

in the original development of both vector

calculus and complex analysis1 It is not

an exaggeration to say that the topics above

comprise the majority of physical topics of

interest in engineering

A good introduction to vector calculus is given by Kreyszig.2 Perhaps the most famous and enjoyable treatment is given by Schey3 in the adorably titled Div, Grad, Curl, and All That.

It is important to note that in much of what follows, we will describe space (typically the three-dimensional space of our lived experience) as a euclidean vector space, an n-dimensional vector space isomorphic to \mathbb{R}^n. As we know from linear algebra, any vector v \in \mathbb{R}^n can be expressed in any number of bases. That is, the vector v is a basis-free object with multiple basis representations. The components and basis vectors of a vector change with basis changes, but the vector itself is invariant. A coordinate system is, in fact, just a basis. We are most familiar, of course, with Cartesian


manifolds

non-euclidean geometry

parallel postulate

differential geometry

4. John M. Lee, Introduction to Smooth Manifolds, second ed., Vol. 218, Graduate Texts in Mathematics, Springer, 2012.
5. Francesco Bullo and Andrew D. Lewis, Geometric Control of Mechanical Systems: Modeling, Analysis, and Design for Simple Mechanical Control Systems, ed. by J.E. Marsden, L. Sirovich, and M. Golubitsky, Springer, 2005.

coordinates, which is the specific orthonormal basis b for \mathbb{R}^n:

b_1 = \begin{bmatrix} 1 \\ 0 \\ \vdots \\ 0 \end{bmatrix}, \quad
b_2 = \begin{bmatrix} 0 \\ 1 \\ \vdots \\ 0 \end{bmatrix}, \quad
\cdots, \quad
b_n = \begin{bmatrix} 0 \\ 0 \\ \vdots \\ 1 \end{bmatrix}.
\quad (vecs.1)

Manifolds are spaces that appear locally as \mathbb{R}^n but can be globally rather different, and they can describe non-euclidean geometry, wherein euclidean geometry's parallel postulate is invalid. Calculus on manifolds is the focus of differential geometry, a subset of which we can consider our current study. A motivation for further study of differential geometry is that it is very convenient when dealing with advanced applications of mechanics, such as rigid-body mechanics of robots and vehicles. A very nice mathematical introduction is given by Lee,4 and Bullo and Lewis5 give a compact presentation in the context of robotics.

Vector fields have several important properties of interest we'll explore in this chapter. Our goal is to gain an intuition of these properties and be able to perform basic calculation.


surface integral

flux

6. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 21-30.
7. Kreyszig, Advanced Engineering Mathematics, § 10.6.

source

sink

incompressible

vecsdiv Divergence, surface integrals, and flux

Flux and surface integrals

Consider a surface S. Let r(u, v) = [x(u, v), y(u, v), z(u, v)] be a parametric position vector on a euclidean vector space \mathbb{R}^3. Furthermore, let F : \mathbb{R}^3 \to \mathbb{R}^3 be a vector-valued function of r, and let \hat{n} be a unit-normal vector on the surface S. The surface integral

\iint_S F \cdot \hat{n}\, dS   (div.1)

integrates the normal component of F over the surface. We call this quantity the flux of F out of the surface S. This terminology comes from fluid flow, for which the flux is the mass flow rate out of S. In general, the flux is a measure of a quantity (or field) passing through a surface. For more on computing surface integrals, see Schey6 and Kreyszig7.

Continuity

Consider the flux out of a surface S that encloses a volume \Delta V, divided by that volume:

\frac{1}{\Delta V} \iint_S F \cdot \hat{n}\, dS   (div.2)

This gives a measure of flux per unit volume for a volume of space. Consider its physical meaning: when we interpret this as fluid flow, all fluid that enters the volume is negative flux and all that leaves is positive. If physical conditions are such that we expect no fluid to enter or exit the volume via what is called a source or a sink, and if we assume the density of the fluid is uniform (this is called an incompressible fluid), then all the fluid that


continuity equation

divergence

enters the volume must exit, and we get

\frac{1}{\Delta V} \iint_S F \cdot \hat{n}\, dS = 0.   (div.3)

This is called a continuity equation although

typically this name is given to equations of the

form in the next section This equation is one of

the governing equations in continuum mechanics

Divergence

Let's take the flux-per-volume as the volume \Delta V \to 0; we obtain the following.

Equation div.4: divergence, integral form

\lim_{\Delta V \to 0} \frac{1}{\Delta V} \iint_S F \cdot \hat{n}\, dS

This is called the divergence of F and is

defined at each point in R3 by taking the

volume to zero about it It is given the

shorthand div F

What interpretation can we give this quantity? It is a measure of the vector field's flux outward through a surface containing an infinitesimal volume. When we consider a fluid, a positive divergence is a local decrease in density and a negative divergence is a density increase. If the fluid is incompressible and has no sources or sinks, we can write the continuity


equation

\operatorname{div} F = 0.   (div.5)

In the Cartesian basis, it can be shown that the divergence is easily computed from the field

F = F_x \hat{i} + F_y \hat{j} + F_z \hat{k}   (div.6)

as follows

Equation div7 divergence

differential form

div F = partxFx + partyFy + partzFz
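This differential form is easy to check symbolically with sympy; the field F = [x, y, z] below is an arbitrary example chosen for illustration:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
Fx, Fy, Fz = x, y, z  # example field F = x i + y j + z k
# divergence, differential form (div.7)
div_F = sp.diff(Fx, x) + sp.diff(Fy, y) + sp.diff(Fz, z)
print(div_F)  # each partial derivative contributes 1
```

For this field the divergence is the constant 3: uniform positive divergence, a "source" at every point.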

Exploring divergence

Divergence is perhaps best explored by considering it for a vector field in \mathbb{R}^2. Such a field F = F_x \hat{i} + F_y \hat{j} can be represented as a "quiver" plot. If we overlay the quiver plot over a "color density" plot representing \operatorname{div} F, we can increase our intuition about the divergence.

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: div_surface_integrals_flux.ipynb
notebook kernel: python3


First load some Python packages

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and functions.

    var('x, y')
    F_x = Function('F_x')(x, y)
    F_y = Function('F_y')(x, y)

Rather than repeat code, let's write a single function quiver_plotter_2D to make several of these plots.

    def quiver_plotter_2D(
        field={F_x: x*y, F_y: x*y},
        grid_width=3,
        grid_decimate_x=8,
        grid_decimate_y=8,
        norm=Normalize(),
        density_operation='div',
        print_density=True,
    ):
        # define symbolics
        var('x, y')
        F_x = Function('F_x')(x, y)
        F_y = Function('F_y')(x, y)
        field_sub = field
        # compute density
        if density_operation == 'div':
            den = F_x.diff(x) + F_y.diff(y)
        elif density_operation == 'curl':  # in the k direction
            den = F_y.diff(x) - F_x.diff(y)
        else:
            raise ValueError('div and curl are the only density operators')
        den_simp = den.subs(field_sub).doit().simplify()
        if den_simp.is_constant():
            print('Warning: density operator is constant (no density plot)')
        if print_density:
            print(f'The {density_operation} is:')
            display(den_simp)
        # lambdify for numerics
        F_x_sub = F_x.subs(field_sub)
        F_y_sub = F_y.subs(field_sub)
        F_x_fun = lambdify((x, y), F_x.subs(field_sub), 'numpy')
        F_y_fun = lambdify((x, y), F_y.subs(field_sub), 'numpy')
        if F_x_sub.is_constant():
            F_x_fun1 = F_x_fun  # dummy
            F_x_fun = lambda x, y: F_x_fun1(x, y)*np.ones(x.shape)
        if F_y_sub.is_constant():
            F_y_fun1 = F_y_fun  # dummy
            F_y_fun = lambda x, y: F_y_fun1(x, y)*np.ones(x.shape)
        if not den_simp.is_constant():
            den_fun = lambdify((x, y), den_simp, 'numpy')
        # create grid
        w = grid_width
        Y, X = np.mgrid[-w:w:100j, -w:w:100j]
        # evaluate numerically
        F_x_num = F_x_fun(X, Y)
        F_y_num = F_y_fun(X, Y)
        if not den_simp.is_constant():
            den_num = den_fun(X, Y)
        # plot
        p = plt.figure()
        # color mesh
        if not den_simp.is_constant():
            cmap = plt.get_cmap('PiYG')
            im = plt.pcolormesh(X, Y, den_num, cmap=cmap, norm=norm)
            plt.colorbar()
        # quiver
        dx = grid_decimate_y
        dy = grid_decimate_x
        plt.quiver(
            X[::dx, ::dy], Y[::dx, ::dy],
            F_x_num[::dx, ::dy], F_y_num[::dx, ::dy],
            units='xy', scale=10,
        )
        plt.title(f'F(x,y) = [{F_x.subs(field_sub)}, {F_y.subs(field_sub)}]')
        return p

Note that, while we're at it, we included a hook for density plots of the curl of F; we'll return to this in a later lecture.

Let's inspect several cases.

    p = quiver_plotter_2D(field={F_x: x**2, F_y: y**2})

The div is:

2x + 2y

    p = quiver_plotter_2D(field={F_x: x*y, F_y: x*y})

Figure 051: quiver plot of F(x,y) = [x**2, y**2] over a color density plot of its divergence (png)

The div is:

x + y

    p = quiver_plotter_2D(field={F_x: x**2 + y**2, F_y: x**2 + y**2})

The div is:

2x + 2y

    p = quiver_plotter_2D(
        field={F_x: x**2/sqrt(x**2 + y**2), F_y: y**2/sqrt(x**2 + y**2)},
        norm=SymLogNorm(linthresh=3, linscale=3),
    )

Figure 052: quiver plot of F(x,y) = [x*y, x*y] over a color density plot of its divergence (png)

Figure 053: quiver plot of F(x,y) = [x**2 + y**2, x**2 + y**2] over a color density plot of its divergence (png)

Figure 054: quiver plot of F(x,y) = [x**2/sqrt(x**2 + y**2), y**2/sqrt(x**2 + y**2)] over a color density plot of its divergence, with a symmetric log color scale (png)

The div is:

(-x³ - y³ + 2(x + y)(x² + y²)) / (x² + y²)^(3/2)

vecs Vector calculus curl Curl line integrals and circulation p 1

line integral

8. Schey, Div, Grad, Curl and All That: An Informal Text on Vector Calculus, pp. 63-74.

9. Kreyszig, Advanced Engineering Mathematics, §§ 10.1, 10.2.

force

work

circulation

the law of circulation

vecs.curl Curl, line integrals, and circulation

Line integrals

Consider a curve C in a Euclidean vector space R³. Let r(t) = [x(t), y(t), z(t)] be a parametric representation of C. Furthermore, let F: R³ → R³ be a vector-valued function of r, and let r′(t) be the tangent vector. The line integral is

∫_C F(r(t)) · r′(t) dt, (curl1)

which integrates F along the curve. For more on computing line integrals, see Schey⁸ and Kreyszig⁹.

If F is a force being applied to an object moving along the curve C, the line integral is the work done by the force. More generally, the line integral integrates F along the tangent of C.
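The computation above can be sketched symbolically. This is a minimal illustration (the field F = [y, x, 0] and curve r(t) = [t, t², 0] are assumptions chosen for simplicity, not from the text).

```python
import sympy as sp

t = sp.symbols("t")
# parametric curve C: r(t) = [t, t**2, 0], t in [0, 1]
r = sp.Matrix([t, t**2, 0])
# illustrative field F(x, y, z) = [y, x, 0], evaluated on the curve
F_on_C = sp.Matrix([r[1], r[0], 0])
# line integral: integrate F(r(t)) . r'(t) over t
work = sp.integrate(F_on_C.dot(r.diff(t)), (t, 0, 1))
print(work)  # 1
```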

Circulation

Consider the line integral over a closed curve C, denoted by

∮_C F(r(t)) · r′(t) dt. (curl2)

We call this quantity the circulation of F around C.

For certain vector-valued functions F, the circulation is zero for every curve. In these cases (static electric fields, for instance), this is sometimes called the law of circulation.


curl

Curl

Consider the division of the circulation around a curve in R³ by the surface area it encloses, ΔS:

(1/ΔS) ∮_C F(r(t)) · r′(t) dt. (curl3)

In a manner analogous to the operation that gave us the divergence, let's consider shrinking this curve to a point and the surface area to zero:

lim_{ΔS→0} (1/ΔS) ∮_C F(r(t)) · r′(t) dt. (curl4)

We call this quantity the "scalar" curl of F at each point in R³ in the direction normal to ΔS as it shrinks to zero. Taking three (or n, for Rⁿ) "scalar" curls in independent normal directions (enough to span the vector space), we obtain the curl proper, which is a vector-valued function curl: R³ → R³.

The curl is coordinate-independent. In Cartesian coordinates, it can be shown to be equivalent to the following.

Equation curl5: curl, differential form, Cartesian coordinates:

curl F = [∂y Fz - ∂z Fy, ∂z Fx - ∂x Fz, ∂x Fy - ∂y Fx]ᵀ
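The Cartesian formula can be checked symbolically. A minimal sketch; the rotating field F = [-y, x, 0] is an illustrative assumption (its curl should be 2k).

```python
import sympy as sp

x, y, z = sp.symbols("x y z")
Fx, Fy, Fz = -y, x, 0  # illustrative rotating field
# curl F per Equation curl5
curl_F = sp.Matrix([
    sp.diff(Fz, y) - sp.diff(Fy, z),
    sp.diff(Fx, z) - sp.diff(Fz, x),
    sp.diff(Fy, x) - sp.diff(Fx, y),
])
print(curl_F.T)  # Matrix([[0, 0, 2]])
```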

But what does the curl of F represent? It quantifies the local rotation of F about each point. If F represents a fluid's velocity, curl F is the local rotation of the fluid about each point, and it is called the vorticity.

vorticity

10. A region is simply connected if every curve in it can shrink to a point without leaving the region. An example of a region that is not simply connected is the surface of a toroid.

path independence

Zero curl, circulation, and path independence

Circulation

It can be shown that if the circulation of F on all curves is zero, then in each direction n and at every point, curl F = 0 (i.e., n · curl F = 0). Conversely, for curl F = 0 in a simply connected region,¹⁰ F has zero circulation.

Succinctly, informally, and without the requisite qualifiers above:

zero circulation ⇒ zero curl (curl6)
zero curl + simply connected region ⇒ zero circulation (curl7)

Path independence

It can be shown that if the path integral of F on all curves between any two points is path-independent, then in each direction n and at every point, curl F = 0 (i.e., n · curl F = 0). Conversely, for curl F = 0 in a simply connected region, all line integrals are independent of path.

Succinctly, informally, and without the requisite qualifiers above:

path independence ⇒ zero curl (curl8)
zero curl + simply connected region ⇒ path independence (curl9)

… and how they relate

It is also true that

path independence ⇔ zero circulation. (curl10)

So, putting it all together, we get Figure 055.

Figure 055: an implication graph relating zero curl, zero circulation, path independence, and connectedness. Green edges represent implication (a implies b) and black edges represent logical conjunctions.

Exploring curl

Curl is perhaps best explored by considering it for a vector field in R². Such a field in Cartesian coordinates, F = Fx i + Fy j, has curl

curl F = [∂y 0 - ∂z Fy, ∂z Fx - ∂x 0, ∂x Fy - ∂y Fx]ᵀ
       = [0 - 0, 0 - 0, ∂x Fy - ∂y Fx]ᵀ
       = [0, 0, ∂x Fy - ∂y Fx]ᵀ. (curl11)

That is, curl F = (∂x Fy - ∂y Fx) k, and the only nonzero component is normal to the xy-plane. If we overlay a quiver plot of F over a "color density" plot representing the k-component of curl F, we can increase our intuition about the curl.

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: curl-and-line-integrals.ipynb
notebook kernel: python3

First, load some Python packages.

    from sympy import *
    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.ticker import LogLocator
    from matplotlib.colors import *
    from sympy.utilities.lambdify import lambdify
    from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and functions.

    var('x, y')
    F_x = Function('F_x')(x, y)
    F_y = Function('F_y')(x, y)

We use the same function defined in Lecture vecs.div, quiver_plotter_2D, to make several of these plots.

Let's inspect several cases.

    p = quiver_plotter_2D(
        field={F_x: 0, F_y: cos(2*pi*x)},
        density_operation='curl',
        grid_decimate_x=2,
        grid_decimate_y=10,
        grid_width=1,
    )

Figure 056: quiver plot of F(x,y) = [0, cos(2*pi*x)] over a color density plot of its curl's k-component (png)

The curl is:

-2π sin(2πx)

    p = quiver_plotter_2D(
        field={F_x: 0, F_y: x**2},
        density_operation='curl',
        grid_decimate_x=2,
        grid_decimate_y=20,
    )

Figure 057: quiver plot of F(x,y) = [0, x**2] over a color density plot of its curl's k-component (png)

The curl is:

2x

    p = quiver_plotter_2D(
        field={F_x: y**2, F_y: x**2},
        density_operation='curl',
        grid_decimate_x=2,
        grid_decimate_y=20,
    )

Figure 058: quiver plot of F(x,y) = [y**2, x**2] over a color density plot of its curl's k-component (png)

The curl is:

2x - 2y

    p = quiver_plotter_2D(
        field={F_x: -y, F_y: x},
        density_operation='curl',
        grid_decimate_x=6,
        grid_decimate_y=6,
    )

Warning: density operator is constant (no density plot)

Figure 059: quiver plot of F(x,y) = [-y, x] (png)

The curl is:

2

vecs Vector calculus grad Gradient p 1

gradient

direction

magnitude

principle of least action

vecs.grad Gradient

Gradient

The gradient grad of a scalar-valued function f: R³ → R is a vector field F: R³ → R³; that is, grad f is a vector-valued function on R³. The gradient's local direction and magnitude are those of the local maximum rate of increase of f. This makes it useful in optimization (e.g., in the method of gradient descent).

In classical mechanics, quantum mechanics, relativity, string theory, thermodynamics, and continuum mechanics (and elsewhere), the principle of least action is taken as fundamental (Richard P. Feynman, Robert B. Leighton, and Matthew Sands, The Feynman Lectures on Physics, New Millennium, Perseus Basic Books, 2010). This principle tells us that nature's laws quite frequently seem to be derivable by assuming a certain quantity, called action, is minimized. Considering, then, that the gradient supplies us with a tool for optimizing functions, it is unsurprising that the gradient enters into the equations of motion of many physical quantities.

The gradient is coordinate-independent, but its coordinate-free definitions don't add much to our intuition. In Cartesian coordinates, it can be shown to be equivalent to the following.

gradience

11. This is nonstandard terminology, but we're bold.

potentials

Equation grad1: gradient, Cartesian coordinates:

grad f = [∂x f, ∂y f, ∂z f]ᵀ
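The Cartesian formula is easy to check symbolically. A minimal sketch; f = x² + y² + z² is an illustrative assumption.

```python
import sympy as sp

x, y, z = sp.symbols("x y z")
f = x**2 + y**2 + z**2  # illustrative scalar field
# grad f = [df/dx, df/dy, df/dz]
grad_f = sp.Matrix([sp.diff(f, v) for v in (x, y, z)])
print(grad_f.T)  # Matrix([[2*x, 2*y, 2*z]])
```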

Vector fields from gradients are special

Although all gradients are vector fields, not all vector fields are gradients. That is, given a vector field F, it may or may not be equal to the gradient of any scalar-valued function f. Let's say of a vector field that is a gradient that it has gradience.¹¹ Those vector fields that are gradients have special properties. Surprisingly, those properties are connected to path independence and curl. It can be shown that iff a field is a gradient, line integrals of the field are path independent. That is, for a vector field,

gradience ⇔ path independence. (grad2)

Considering what we know from Lecture vecs.curl about path independence, we can expand Figure 055 to obtain Figure 0510. One implication is that gradients have zero curl. Many important fields that describe physical interactions (e.g., static electric fields, Newtonian gravitational fields) are gradients of scalar fields called potentials.

Exploring gradient

Figure 0510: an implication graph relating gradience, zero curl, zero circulation, path independence, and connectedness. Green edges represent implication (a implies b) and black edges represent logical conjunctions.

Gradient is perhaps best explored by considering it for a scalar field on R². Such a field in Cartesian coordinates, f(x, y), has gradient

grad f = [∂x f, ∂y f]ᵀ. (grad3)

That is, grad f = F = ∂x f i + ∂y f j. If we overlay a quiver plot of F over a "color density" plot representing f, we can increase our intuition about the gradient.

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: grad.ipynb
notebook kernel: python3

First, load some Python packages.

    from sympy import *
    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.ticker import LogLocator
    from matplotlib.colors import *
    from sympy.utilities.lambdify import lambdify
    from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and functions.

    var('x, y')

(x, y)

Rather than repeat code, let's write a single function grad_plotter_2D to make several of these plots.

Let's inspect several cases. While considering the following plots, remember that they all have zero curl.

    p = grad_plotter_2D(field=x)

Figure 0511: quiver plot of grad f over a color density plot of f(x,y) = x (png)

The gradient is:

[1, 0]

    p = grad_plotter_2D(field=x + y)

The gradient is:

[1, 1]

Figure 0512: quiver plot of grad f over a color density plot of f(x,y) = x + y (png)

    p = grad_plotter_2D(field=1)

Warning: field is constant (no plot)

The gradient is:

[0, 0]

Gravitational potential

Gravitational potentials have the form of 1/distance. Let's check out the gradient.

Figure 0513: quiver plot of grad f over a color density plot of f(x,y) = 1/sqrt(x**2 + y**2) (png)

    p = grad_plotter_2D(
        field=1/sqrt(x**2 + y**2),
        norm=SymLogNorm(linthresh=3, linscale=3),
        mask=True,
    )

The gradient is:

[-x/(x² + y²)^(3/2), -y/(x² + y²)^(3/2)]

conic section

Conic section fields

Gradients of conic section fields can be explored.

Figure 0514: quiver plot of grad f over a color density plot of f(x,y) = x**2 (png)

parabolic fields
elliptic fields

The following is called a parabolic field.

    p = grad_plotter_2D(field=x**2)

The gradient is:

[2x, 0]

The following are called elliptic fields.

Figure 0515: quiver plot of grad f over a color density plot of f(x,y) = x**2 + y**2 (png)

    p = grad_plotter_2D(field=x**2 + y**2)

The gradient is:

[2x, 2y]

    p = grad_plotter_2D(field=-x**2 - y**2)

Figure 0516: quiver plot of grad f over a color density plot of f(x,y) = -x**2 - y**2 (png)

The gradient is:

[-2x, -2y]

hyperbolic fields

The following is called a hyperbolic field.

    p = grad_plotter_2D(field=x**2 - y**2)

Figure 0517: quiver plot of grad f over a color density plot of f(x,y) = x**2 - y**2 (png)

The gradient is:

[2x, -2y]

vecs Vector calculus stoked Stokes and divergence theorems p 1

divergence theorem

triple integral

orientable

12. A surface is orientable if a consistent normal direction can be defined. Most surfaces are orientable, but some, notably the Möbius strip, cannot be. See Kreyszig (Advanced Engineering Mathematics, § 10.6) for more.

13. Ibid., § 10.7.

14. Schey, Div, Grad, Curl and All That: An Informal Text on Vector Calculus, pp. 45-52.

continuity equation

15. Ibid., pp. 49-52.

vecs.stoked Stokes and divergence theorems

Two theorems allow us to exchange certain integrals in R³ for others that are easier to evaluate.

The divergence theorem

The divergence theorem asserts the equality of the surface integral of a vector field F and the triple integral of div F over the volume V enclosed by the surface S in R³. That is,

∯_S F · n dS = ∭_V div F dV. (stoked1)

Caveats are that V is a closed region bounded by the orientable¹² surface S, and that F is continuous and continuously differentiable over a region containing V. This theorem makes some intuitive sense: we can think of the divergence inside the volume "accumulating" via the triple integration and equaling the corresponding surface integral. For more on the divergence theorem, see Kreyszig¹³ and Schey.¹⁴
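The theorem can be verified symbolically for a simple case. This sketch assumes F = [x, y, z] over the unit ball (div F = 3, and F · n = 1 on the unit sphere); both sides should equal 4π.

```python
import sympy as sp

r, ph, th = sp.symbols("r phi theta", nonnegative=True)
# volume side: div F = 3, integrated over the unit ball in spherical coordinates
lhs = sp.integrate(3 * r**2 * sp.sin(ph),
                   (r, 0, 1), (ph, 0, sp.pi), (th, 0, 2*sp.pi))
# surface side: F . n = 1 on the unit sphere, so the flux is the sphere's area
rhs = sp.integrate(sp.sin(ph), (ph, 0, sp.pi), (th, 0, 2*sp.pi))
print(lhs, rhs)  # 4*pi 4*pi
```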

A lovely application of the divergence theorem is that, for any quantity of conserved stuff (mass, charge, spin, etc.) distributed in a spatial R³ with time-dependent density ρ: R⁴ → R and velocity field v: R⁴ → R³, the divergence theorem can be applied to find that

∂t ρ = -div(ρv), (stoked2)

which is a more general form of a continuity equation, one of the governing equations of many physical phenomena. For a derivation of this equation, see Schey.¹⁵


Kelvin-Stokes theorem
piecewise smooth

16. A surface is smooth if its normal is continuous everywhere. It is piecewise smooth if it is composed of a finite number of smooth surfaces.

17. Kreyszig, Advanced Engineering Mathematics, § 10.9.

18. Schey, Div, Grad, Curl and All That: An Informal Text on Vector Calculus, pp. 93-102.

Green's theorem

19. Kreyszig, Advanced Engineering Mathematics, § 10.9.

generalized Stokes' theorem

20. Lee, Introduction to Smooth Manifolds, Ch. 16.

The Kelvin-Stokes theorem

The Kelvin-Stokes theorem asserts the equality of the circulation of a vector field F over a closed curve C and the surface integral of curl F over a surface S that has boundary C. That is, for r(t) a parameterization of C and surface normal n,

∮_C F(r(t)) · r′(t) dt = ∯_S n · curl F dS. (stoked3)

Caveats are that S is piecewise smooth,¹⁶ its boundary C is a piecewise smooth simple closed curve, and F is continuous and continuously differentiable over a region containing S.

This theorem is also somewhat intuitive: we can think of the curl over the surface "accumulating" via the surface integration and equaling the corresponding circulation. For more on the Kelvin-Stokes theorem, see Kreyszig¹⁷ and Schey.¹⁸
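Here, too, a simple symbolic verification is possible. This sketch assumes F = [-y, x, 0], the unit circle as C, and the unit disk as S (curl F = 2k); both sides should equal 2π.

```python
import sympy as sp

t = sp.symbols("t", real=True)
r, th = sp.symbols("r theta", nonnegative=True)
# C: unit circle r(t) = [cos t, sin t, 0]; F = [-y, x, 0]
rt = sp.Matrix([sp.cos(t), sp.sin(t), 0])
F_on_C = sp.Matrix([-rt[1], rt[0], 0])
circulation = sp.integrate(F_on_C.dot(rt.diff(t)), (t, 0, 2*sp.pi))
# curl F = [0, 0, 2], so n . curl F = 2 over the unit disk
flux_of_curl = sp.integrate(2 * r, (r, 0, 1), (th, 0, 2*sp.pi))
print(circulation, flux_of_curl)  # 2*pi 2*pi
```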

Related theorems

Green's theorem is a two-dimensional special case of the Kelvin-Stokes theorem. It is described by Kreyszig.¹⁹

It turns out that all of the above theorems (and the fundamental theorem of calculus, which relates the derivative and integral) are special cases of the generalized Stokes' theorem defined by differential geometry. We would need a deeper understanding of differential geometry to understand this theorem. For more, see Lee.²⁰

vecs Vector calculus ex Exercises for Chapter 05 p 1

vecs.ex Exercises for Chapter 05

Exercise 051 (6)

The PDE of Example pde.separation-1 can be used to describe the conduction of heat along a long, thin rod insulated along its length, where u(t, x) represents temperature. The initial and Dirichlet boundary conditions in that example would be interpreted as an initial temperature distribution along the bar and fixed temperatures of the ends. Now consider the same PDE,

∂t u(t, x) = k ∂²xx u(t, x), (ex1)

with real constant k, now with Neumann boundary conditions on interval x ∈ [0, L],

∂x u|x=0 = 0 and ∂x u|x=L = 0, (ex2a)

and with initial condition

u(0, x) = f(x), (ex3)

where f is some piecewise continuous function on [0, L]. This represents the complete insulation of the ends of the rod, such that no heat flows from the ends (or from anywhere else).

a. Assume a product solution, separate variables into X(x) and T(t), and set the separation constant to -λ.
b. Solve the boundary value problem for its eigenfunctions Xn and eigenvalues λn.
c. Solve for the general solution of the time-variable ODE.
d. Write the product solution and apply the initial condition f(x) by constructing it from a generalized Fourier series of the product solution.
e. Let L = 1 and f(x) = 100 - (200/L)|x - L/2|, as shown in Figure 0518. Compute the solution series components. Plot the sum of the first 50 terms over x and t.

Figure 0518: initial condition for Exercise 051
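The initial condition of part (e) can be checked numerically before attempting the series. A minimal sketch; the sample points are arbitrary.

```python
import numpy as np

L = 1.0
# the triangle of part (e): f(x) = 100 - (200/L)|x - L/2|
f = lambda x: 100 - (200 / L) * np.abs(x - L / 2)
x = np.linspace(0, L, 5)  # arbitrary sample points
print(f(x))  # values: 0, 50, 100, 50, 0
```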

four

Fourier and orthogonality

four Fourier and orthogonality series Fourier series p 1

frequency domain
Fourier analysis

1. It's important to note that the symbol ωn in this context is not the natural frequency but a frequency indexed by integer n.

frequency spectrum

four.series Fourier series

Fourier series are mathematical series that can represent a periodic signal as a sum of sinusoids at different amplitudes and frequencies. They are useful for solving for the response of a system to periodic inputs. However, they are probably most important conceptually: they are our gateway to thinking of signals in the frequency domain, that is, as functions of frequency (not time). To represent a function as a Fourier series is to analyze it as a sum of sinusoids at different frequencies¹ ωn and amplitudes an. Its frequency spectrum is the functional representation of amplitudes an versus frequency ωn.

Let's begin with the definition.

Definition fourseries1: Fourier series, trigonometric form

The Fourier analysis of a periodic function y(t) is, for n ∈ N₀, period T, and angular frequency ωn = 2πn/T,

an = (2/T) ∫_{-T/2}^{T/2} y(t) cos(ωn t) dt (series1)
bn = (2/T) ∫_{-T/2}^{T/2} y(t) sin(ωn t) dt. (series2)

The Fourier synthesis of a periodic function y(t) with analysis components an and bn corresponding to ωn is

y(t) = a0/2 + Σ_{n=1}^{∞} (an cos(ωn t) + bn sin(ωn t)). (series3)
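The analysis equations can be sanity-checked symbolically: a signal that is itself a single cosine should analyze to a1 = 1 with all other an zero. A minimal sketch; the period T = 2π is an illustrative assumption.

```python
import sympy as sp

t = sp.symbols("t", real=True)
T = 2 * sp.pi                 # illustrative period
wn = lambda n: 2 * sp.pi * n / T
y = sp.cos(wn(1) * t)         # a signal with a single cosine component
# a_n = (2/T) * integral of y(t) cos(wn t) over one period
a = [sp.integrate(sp.Rational(2)/T * y * sp.cos(wn(n)*t), (t, -T/2, T/2))
     for n in range(4)]
print(a)  # [0, 1, 0, 0]
```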


harmonic
harmonic frequency
fundamental frequency

Let's consider the complex form of the Fourier series, which is analogous to Definition fourseries1. It may be helpful to review Euler's formula(s); see Appendix D01.

Definition fourseries2: Fourier series, complex form

The Fourier analysis of a periodic function y(t) is, for n ∈ N₀, period T, and angular frequency ωn = 2πn/T,

c±n = (1/T) ∫_{-T/2}^{T/2} y(t) e^{-jωn t} dt. (series4)

The Fourier synthesis of a periodic function y(t) with analysis components cn corresponding to ωn is

y(t) = Σ_{n=-∞}^{∞} cn e^{jωn t}. (series5)

We call the integer n a harmonic and the frequency associated with it,

ωn = 2πn/T, (series6)

the harmonic frequency. There is a special name for the first harmonic (n = 1): the fundamental frequency. It is called this because all other frequency components are integer multiples of it.

It is also possible to convert between the two representations above.

harmonic amplitude
magnitude line spectrum
harmonic phase

Definition fourseries3: Fourier series, converting between forms

The complex Fourier analysis of a periodic function y(t) is, for n ∈ N₀ and an and bn as defined above,

c±n = (1/2)(a|n| ∓ j b|n|). (series7)

The sinusoidal Fourier analysis of a periodic function y(t) is, for n ∈ N₀ and cn as defined above,

an = cn + c−n and (series8)
bn = j(cn − c−n). (series9)

The harmonic amplitude Cn is

Cn = √(an² + bn²) (series10)
   = 2√(cn c−n). (series11)

A magnitude line spectrum is a graph of the harmonic amplitudes as a function of the harmonic frequencies. The harmonic phase is

θn = −arctan2(bn, an) (see Appendix C02)
   = arctan2(Im(cn), Re(cn)).
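The conversions are easy to exercise numerically. A minimal sketch; the component values a1 = 3, b1 = 4 are illustrative (so C1 should come out to 5 either way).

```python
import numpy as np

a1, b1 = 3.0, 4.0                        # illustrative sinusoidal components
c_plus = (a1 - 1j * b1) / 2              # c_{+1} = (a1 - j b1)/2, per series7
c_minus = (a1 + 1j * b1) / 2             # c_{-1} = (a1 + j b1)/2
C1 = np.sqrt(a1**2 + b1**2)              # harmonic amplitude, series10
C1_alt = 2 * np.sqrt(c_plus * c_minus)   # same amplitude, series11
theta1 = -np.arctan2(b1, a1)             # harmonic phase
print(C1, C1_alt.real)  # 5.0 5.0
```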

The following illustration demonstrates how sinusoidal components sum to represent a square wave. A line spectrum is also shown.

(illustration: time-domain square wave and its sinusoidal components, with the corresponding spectral amplitudes over frequency)

Let us compute the associated spectral components in the following example.

Example fourseries-1: Fourier series analysis, line spectrum

Problem Statement

Compute the first five harmonic amplitudes that represent the line spectrum for a square wave in the figure above.
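One way to approach this is symbolic integration of the analysis equations. The figure is not reproduced here, so this sketch assumes a unit-amplitude odd square wave of period 2π (an = 0 by odd symmetry, and Cn = |bn|); the specific numbers may differ from the figure's wave.

```python
import sympy as sp

t = sp.symbols("t", real=True)
n = sp.symbols("n", positive=True)
T = 2 * sp.pi                 # illustrative period
wn = 2 * sp.pi * n / T
# assumed square wave: y = -1 on (-T/2, 0), +1 on (0, T/2)
bn = sp.Rational(2)/T * (sp.integrate(-sp.sin(wn*t), (t, -T/2, 0))
                         + sp.integrate(sp.sin(wn*t), (t, 0, T/2)))
# first five harmonic amplitudes Cn = |bn|; odd harmonics survive
Cn = [sp.simplify(bn.subs(n, k)) for k in range(1, 6)]
print(Cn)  # C1..C5 = 4/pi, 0, 4/(3*pi), 0, 4/(5*pi)
```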


four Fourier and orthogonality trans Fourier transform p 1

four.trans Fourier transform

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: fourier_series_to_transform.ipynb
notebook kernel: python3

We begin with the usual loading of modules.

    import numpy as np  # for numerics
    import sympy as sp  # for symbolics
    import matplotlib.pyplot as plt  # for plots
    from IPython.display import display, Markdown, Latex

Let's consider a periodic function f with period T (T). Each period, the function has a triangular pulse of width δ (pulse_width) and height δ/2.

    period = 15  # period
    pulse_width = 2  # pulse width

First, we plot the function f in the time domain. Let's begin by defining f.

    def pulse_train(t, T, pulse_width):
        f = lambda x: pulse_width/2 - abs(x)  # pulse
        tm = np.mod(t, T)
        if tm <= pulse_width/2:
            return f(tm)
        elif tm >= T - pulse_width/2:
            return f(-(tm - T))
        else:
            return 0

Now we develop a numerical array in time to plot f.

    N = 201  # number of points to plot
    tpp = np.linspace(-period/2, 5*period/2, N)  # time values
    fpp = np.array(np.zeros(tpp.shape))
    for i, t_now in enumerate(tpp):
        fpp[i] = pulse_train(t_now, period, pulse_width)

    axes = plt.figure(1)
    plt.plot(tpp, fpp, 'b-', linewidth=2)  # plot
    plt.xlabel('time (s)')
    plt.xlim([-period/2, 3*period/2])
    plt.xticks(
        [0, pulse_width/2, period],
        ['0', '$\\delta/2$', '$T=' + str(period) + '$ s'],
    )
    plt.yticks([0, pulse_width/2], ['0', '$\\delta/2$'])
    plt.show()  # display here

For δ = 2 and T ∈ [5, 15, 25], the left-hand column of Figure 061 shows two triangle pulses for each period T.


Consider the following argument. Just as a Fourier series is a frequency domain representation of a periodic signal, a Fourier transform is a frequency domain representation of an aperiodic signal (we will rigorously define it in a moment). The Fourier series components will have an analog, then, in the Fourier transform. Recall that they can be computed by integrating over a period of the signal. If we increase that period infinitely, the function is effectively aperiodic. The result (within a scaling factor) will be the Fourier transform analog of the Fourier series components.

Let us approach this understanding by actually computing the Fourier series components for increasing period T. We'll use sympy to compute the Fourier series cosine and sine components an and bn for component n (n) and period T (T).

    sp.var('x, a_0, a_n, b_n', real=True)
    sp.var('delta, T', positive=True)
    sp.var('n', nonnegative=True)
    a0 = (2/T*sp.integrate(
        (delta/2 - sp.Abs(x)), (x, -delta/2, delta/2)  # otherwise zero
    )).simplify()
    an = sp.integrate(
        2/T*(delta/2 - sp.Abs(x))*sp.cos(2*sp.pi*n/T*x),
        (x, -delta/2, delta/2)  # otherwise zero
    ).simplify()
    bn = (2/T*sp.integrate(
        (delta/2 - sp.Abs(x))*sp.sin(2*sp.pi*n/T*x),
        (x, -delta/2, delta/2)  # otherwise zero
    )).simplify()
    display(sp.Eq(a_n, an), sp.Eq(b_n, bn))

an = { T(1 − cos(πδn/T))/(π²n²)  for n ≠ 0
     { δ²/(2T)                   otherwise

bn = 0

Furthermore, let us compute the harmonic amplitude (f_harmonic_amplitude)

Cn = √(an² + bn²), (trans1)

which we have also scaled by a factor T/δ in order to plot it with a convenient scale.

    sp.var('C_n', positive=True)
    cn = sp.sqrt(an**2 + bn**2)
    display(sp.Eq(C_n, cn))

Cn = { T|cos(πδn/T) − 1|/(π²n²)  for n ≠ 0
     { δ²/(2T)                   otherwise

Now we lambdify the symbolic expression for a numpy function.

    cn_f = sp.lambdify((n, T, delta), cn)

Now we can plot.

Figure 061: triangle pulse trains (left column, with longer periods descending) and their corresponding line spectra (right column), scaled for convenient comparison

    omega_max = 12  # rad/s, max frequency in line spectrum
    n_max = round(omega_max*period/(2*np.pi))  # max harmonic
    n_a = np.linspace(0, n_max, n_max + 1)
    omega = 2*np.pi*n_a/period
    axes = plt.figure(2)
    markerline, stemlines, baseline = plt.stem(
        omega, period/pulse_width*cn_f(n_a, period, pulse_width),
        linefmt='b-', markerfmt='bo', basefmt='r-',
        use_line_collection=True,
    )
    plt.xlabel('frequency $\\omega$ (rad/s)')
    plt.xlim([0, omega_max])
    plt.ylim([0, pulse_width/2])
    plt.yticks([0, pulse_width/2], ['0', '$\\delta/2$'])
    plt.show()  # show here

The line spectra are shown in the right-hand column of Figure 061. Note that, with our chosen scaling, as T increases the line spectra reveal a distinct waveform.

Figure 062: F(ω), our mysterious Fourier series amplitude analog

Let F be the continuous function of angular frequency ω

F(ω) = (δ/2) · sin²(ωδ/4)/(ωδ/4)². (trans2)

First, we plot it.

    F = lambda w: pulse_width/2 * \
        np.sin(w*pulse_width/(2*2))**2 / (w*pulse_width/(2*2))**2

    N = 201  # number of points to plot
    wpp = np.linspace(0.0001, omega_max, N)
    Fpp = []
    for i in range(0, N):
        Fpp.append(F(wpp[i]))  # build array of function values
    axes = plt.figure(3)
    plt.plot(wpp, Fpp, 'b-', linewidth=2)  # plot
    plt.xlim([0, omega_max])
    plt.yticks([0, pulse_width/2], ['0', '$\\delta/2$'])
    plt.xlabel('frequency $\\omega$ (rad/s)')
    plt.ylabel('$F(\\omega)$')
    plt.show()

Let's consider the plot in Figure 062 of F. It's obviously the function emerging in Figure 061 from increasing the period of our pulse train.


Now we are ready to define the Fourier transform and its inverse.

Definition fourtrans1: Fourier transforms, trigonometric form

Fourier transform (analysis):

A(ω) = ∫_{-∞}^{∞} y(t) cos(ωt) dt (trans3)
B(ω) = ∫_{-∞}^{∞} y(t) sin(ωt) dt (trans4)

Inverse Fourier transform (synthesis):

y(t) = (1/2π) ∫_{-∞}^{∞} A(ω) cos(ωt) dω + (1/2π) ∫_{-∞}^{∞} B(ω) sin(ωt) dω (trans5)

Definition fourtrans2: Fourier transforms, complex form

Fourier transform F (analysis):

F(y(t)) = Y(ω) = ∫_{-∞}^{∞} y(t) e^{-jωt} dt (trans6)

Inverse Fourier transform F⁻¹ (synthesis):

F⁻¹(Y(ω)) = y(t) = (1/2π) ∫_{-∞}^{∞} Y(ω) e^{jωt} dω (trans7)

So now we have defined the Fourier transform. There are many applications, including solving differential equations and frequency domain representations, called spectra, of time domain functions.

There is a striking similarity between the Fourier transform and the Laplace transform, with which you are already acquainted. In fact, the Fourier transform is a special case of a Laplace transform, with Laplace transform variable s = jω instead of having some real component. Both transforms convert differential equations to algebraic equations, which can be solved and inversely transformed to find time-domain solutions. The Laplace transform is especially important to use when an input function to a differential equation is not absolutely integrable and the Fourier transform is undefined (for example, our definition will yield a transform for neither the unit step nor the unit ramp functions). However, the Laplace transform is also preferred for initial value problems due to its convenient way of handling them. The two transforms are equally useful for solving steady-state problems. Although the Laplace transform has many advantages, for spectral considerations the Fourier transform is the only game in town.

A table of Fourier transforms and their properties can be found in Appendix B02.

Example fourtrans-1: a Fourier transform

Problem Statement

Consider the aperiodic signal $y(t) = u_s(t)e^{-at}$, with $u_s$ the unit step function and $a > 0$. The signal is plotted below. Derive the complex frequency spectrum and plot its magnitude and phase.

[Figure: plot of the signal $y(t)$ versus $t$.]


Problem Solution

The signal is aperiodic, so the Fourier transform can be computed from Equation trans6:

$$\begin{aligned}
Y(\omega) &= \int_{-\infty}^{\infty} y(t)e^{-j\omega t}\,dt \\
&= \int_{-\infty}^{\infty} u_s(t)e^{-at}e^{-j\omega t}\,dt && \text{(def. of $y$)} \\
&= \int_0^{\infty} e^{-at}e^{-j\omega t}\,dt && \text{($u_s$ effect)} \\
&= \int_0^{\infty} e^{(-a-j\omega)t}\,dt && \text{(multiply)} \\
&= \frac{1}{-a-j\omega}\, e^{(-a-j\omega)t}\,\Big|_0^{\infty} && \text{(antiderivative)} \\
&= \frac{1}{-a-j\omega}\left(\lim_{t\to\infty} e^{(-a-j\omega)t} - e^0\right) && \text{(evaluate)} \\
&= \frac{1}{-a-j\omega}\left(\lim_{t\to\infty} e^{-at}e^{-j\omega t} - 1\right) && \text{(rearrange)} \\
&= \frac{1}{-a-j\omega}\big((0)(\text{complex with mag} \leq 1) - 1\big) && \text{(limit)} \\
&= \frac{-1}{-a-j\omega} && \text{(consequence)} \\
&= \frac{1}{a+j\omega} \\
&= \frac{a-j\omega}{a-j\omega}\cdot\frac{1}{a+j\omega} && \text{(rationalize)} \\
&= \frac{a-j\omega}{a^2+\omega^2}.
\end{aligned}$$

The magnitude and phase of this complex function are straightforward to compute:

$$|Y(\omega)| = \sqrt{\operatorname{Re}(Y(\omega))^2 + \operatorname{Im}(Y(\omega))^2} = \frac{1}{a^2+\omega^2}\sqrt{a^2+\omega^2} = \frac{1}{\sqrt{a^2+\omega^2}}$$

$$\angle Y(\omega) = -\arctan(\omega/a).$$


Now we can plot these functions of $\omega$. Setting $a = 1$ (arbitrarily), we obtain the following plots.

[Figure: magnitude $|Y(\omega)|$ and phase $\angle Y(\omega)$ versus $\omega$ for $a = 1$.]
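The plots above can be reproduced with a short script. This is a sketch (not from the text) using numpy and matplotlib, with $a = 1$ as in the example:

```python
import numpy as np
import matplotlib.pyplot as plt

a = 1  # arbitrary, matching the text
w = np.linspace(-10, 10, 401)    # omega grid (includes 0)
Y_mag = 1/np.sqrt(a**2 + w**2)   # |Y(omega)| from the derivation above
Y_phase = -np.arctan(w/a)        # angle of Y(omega)

fig, (ax1, ax2) = plt.subplots(2, 1, sharex=True)
ax1.plot(w, Y_mag)
ax1.set_ylabel(r'$|Y(\omega)|$')
ax2.plot(w, Y_phase)
ax2.set_ylabel(r'$\angle Y(\omega)$')
ax2.set_xlabel(r'$\omega$')
plt.show()
```

The magnitude peaks at $1/a$ at $\omega = 0$, and the phase sweeps from $\pi/2$ to $-\pi/2$.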


2. A function $f$ is square-integrable if $\int_{-\infty}^{\infty} |f(x)|^2\,dx < \infty$.


3. This definition of the inner product can be extended to functions on $\mathbb{R}^2$ and $\mathbb{R}^3$ domains using double- and triple-integration. See Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, p. 261.


fourgeneral Generalized fourier series and orthogonality

Let $f: \mathbb{R} \rightarrow \mathbb{C}$, $g: \mathbb{R} \rightarrow \mathbb{C}$, and $w: \mathbb{R} \rightarrow \mathbb{C}$ be complex functions. For square-integrable² $f$, $g$, and $w$, the inner product of $f$ and $g$ with weight function $w$ over the interval $[a, b] \subseteq \mathbb{R}$ is³

$$\langle f, g\rangle_w = \int_a^b f(x)\overline{g(x)}\,w(x)\,dx \quad \text{(general1)}$$

where $\overline{g}$ denotes the complex conjugate of $g$. The inner product of functions can be considered analogous to the inner (or dot) product of vectors.

The fourier series components can be found by a special property of the sin and cos functions called orthogonality. In general, functions $f$ and $g$ from above are orthogonal over the interval $[a, b]$ iff

$$\langle f, g\rangle_w = 0 \quad \text{(general2)}$$

for weight function $w$. Similar to how a set of orthogonal vectors can be a basis for a vector space, a set of orthogonal functions can be a basis for a function space: a vector space of functions from one set to another (with certain caveats).
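As a numeric illustration of (general1) and (general2), here is a sketch (not from the text) checking that $\sin x$ and $\sin 2x$ are orthogonal on $[0, 2\pi]$ with weight $w(x) = 1$:

```python
import numpy as np
from scipy.integrate import quad

def inner(f, g, w, a, b):
    # <f, g>_w on [a, b], real-valued case of equation (general1)
    val, _ = quad(lambda x: f(x)*g(x)*w(x), a, b)
    return val

w = lambda x: 1.0
f = lambda x: np.sin(x)
g = lambda x: np.sin(2*x)

print(inner(f, g, w, 0, 2*np.pi))  # ~ 0: orthogonal
print(inner(f, f, w, 0, 2*np.pi))  # pi: not orthogonal to itself
```

The first inner product vanishes (orthogonality), while the self inner product gives the normalization $\pi$ used in fourier analysis formulas.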

In addition to some sets of sinusoids, there are several other important sets of functions that are orthogonal. For instance, sets of legendre polynomials (Erwin Kreyszig. Advanced Engineering Mathematics. 10th. John Wiley & Sons Limited, 2011. ISBN 9781119571094. The authoritative resource for engineering mathematics. It includes detailed accounts of probability, statistics, vector calculus, linear algebra, fourier analysis, ordinary and partial differential equations, and complex analysis. It also includes several other topics with varying degrees of depth. Overall, it is the best place to start when seeking mathematical guidance. § 5.2) and bessel functions (Kreyszig, Advanced Engineering Mathematics, § 5.4) are orthogonal.

As with sinusoids, the orthogonality of some sets of functions allows us to compute their series components. Let functions $f_0, f_1, \cdots$ be orthogonal with respect to weight function $w$ on interval $[a, b]$, and let $\alpha_0, \alpha_1, \cdots$ be real constants. A generalized fourier series is (ibid., § 11.6)

$$f(x) = \sum_{m=0}^{\infty} \alpha_m f_m(x) \quad \text{(general3)}$$

and represents a function $f$ as a convergent series. It can be shown that the fourier components $\alpha_m$ can be computed from

$$\alpha_m = \frac{\langle f, f_m\rangle_w}{\langle f_m, f_m\rangle_w}. \quad \text{(general4)}$$

In keeping with our previous terminology for fourier series, Equation general3 and Equation general4 are called general fourier synthesis and analysis, respectively.

For the aforementioned legendre and bessel functions, the generalized fourier series are called fourier-legendre and fourier-bessel series (ibid., § 11.6). These and the standard fourier series (Lecture fourseries) are of particular interest for the solution of partial differential equations (Chapter 07).
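As a sketch (not from the text) of the analysis formula (general4), the fourier-legendre components of $f(x) = |x|$ on $[-1, 1]$ with $w(x) = 1$ can be computed numerically; numpy's Legendre polynomial basis and scipy's quad are assumed available:

```python
import numpy as np
from numpy.polynomial.legendre import Legendre
from scipy.integrate import quad

f = lambda x: np.abs(x)  # function to expand on [-1, 1]

def alpha(m):
    # generalized fourier component (general4) with w(x) = 1
    P_m = Legendre.basis(m)
    num, _ = quad(lambda x: f(x)*P_m(x), -1, 1)   # <f, P_m>_w
    den, _ = quad(lambda x: P_m(x)**2, -1, 1)     # <P_m, P_m>_w = 2/(2m + 1)
    return num/den

# partial fourier-legendre synthesis (general3)
x = np.linspace(-1, 1, 201)
f_approx = sum(alpha(m)*Legendre.basis(m)(x) for m in range(8))
# truncation error is largest near the kink at x = 0
print(np.max(np.abs(f_approx - f(x))))
```

The first few components come out $\alpha_0 = 1/2$, $\alpha_1 = 0$ (odd polynomials drop out), $\alpha_2 = 5/8$, and the partial sum visibly approximates $|x|$ away from the kink.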


3. Python code in this section was generated from a Jupyter notebook named random_signal_fft.ipynb with a python3 kernel.

fourex Exercises for Chapter 06

Exercise 061 (secrets)

This exercise encodes a "secret word" into a sampled waveform for decoding via a discrete fourier transform (DFT). The nominal goal of the exercise is to decode the secret word. Along the way, plotting and interpreting the DFT will be important.

First, load relevant packages:

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

We define two functions: letter_to_number, to convert a letter into an integer index of the alphabet (a becomes 1, b becomes 2, etc.), and string_to_number_list, to convert a string to a list of ints, as shown in the example at the end.

def letter_to_number(letter):
    return ord(letter) - 96

def string_to_number_list(string):
    out = []  # list
    for i in range(0, len(string)):
        out.append(letter_to_number(string[i]))
    return out  # list

print(f"aces = {string_to_number_list('aces')}")

four Fourier and orthogonality ex Exercises for Chapter 06 p 2

aces = [1, 3, 5, 19]

Now we encode a code string code into a signal by beginning with "white noise," which is broadband (appears throughout the spectrum), and adding to it sin functions with amplitudes corresponding to the letter assignments of the code and harmonics corresponding to the position of the letter in the string. For instance, the string bad would be represented by noise plus the signal

$$2\sin 2\pi t + 1\sin 4\pi t + 4\sin 6\pi t. \quad \text{(ex1)}$$

Let's set this up for secret word chupcabra:

N = 2000
Tm = 30
T = float(Tm)/float(N)
fs = 1/T
x = np.linspace(0, Tm, N)
noise = 4*np.random.normal(0, 1, N)
code = 'chupcabra'  # the secret word
code_number_array = np.array(string_to_number_list(code))
y = np.array(noise)
for i in range(0, len(code)):
    y = y + code_number_array[i]*np.sin(2*np.pi*(i+1)*x)

For proper decoding later, it is important to know the fundamental frequency of the generated data:

print(f"fundamental frequency = {fs} Hz")


fundamental frequency = 66.66666666666667 Hz

Figure 063: the chupacabra signal.

Now we plot:

fig, ax = plt.subplots()
plt.plot(x, y)
plt.xlim([0, Tm/4])
plt.xlabel('time (s)')
plt.ylabel('$y_n$')
plt.show()

Finally, we can save our data to a numpy file secrets.npy to distribute our message:

np.save('secrets', y)


Now, I have done this (for a different secret word) and saved the data; download it here:

ricopic.one/mathematical_foundations/source/secrets.npy

In order to load the npy file into Python, we can use the following command:

secret_array = np.load('secrets.npy')

Your job is to (a) perform a DFT, (b) plot the spectrum, and (c) decode the message. Here are a few hints:

1. Use from scipy import fft to do the DFT.
2. Use a hanning window to minimize the end-effects. See numpy.hanning, for instance. The fft call might then look like

   2*fft(np.hanning(N)*secret_array)/N

   where N = len(secret_array).
3. Use only the positive spectrum; you can lop off the negative side and double the positive side.
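As a sanity check of these hints on a made-up signal (a sketch, not the secret data): a 3 Hz sinusoid of amplitude 5 should appear as a spectral peak of height about 5 at 3 Hz. Note the import here is from the scipy.fft submodule, which works in current scipy.

```python
import numpy as np
from scipy.fft import fft  # hint 1 (the scipy.fft submodule's fft function)

fs = 100   # sample rate, Hz (made up for this check)
N = 1000
t = np.arange(N)/fs
signal = 5*np.sin(2*np.pi*3*t)  # amplitude 5 at 3 Hz

Y = 2*fft(np.hanning(N)*signal)/N  # hint 2: hanning window
Y_pos = 2*np.abs(Y[:N//2])         # hint 3: keep and double the positive half
freqs = np.arange(N//2)*fs/N

print(freqs[np.argmax(Y_pos)])  # 3.0 (Hz)
print(Y_pos.max())              # ~ 5 (the encoded amplitude)
```

The same recipe applied to secret_array should reveal peaks whose heights are the letter indices and whose harmonic numbers are the letter positions.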

Exercise 062 (seesaw)

Consider the periodic function $f: \mathbb{R} \rightarrow \mathbb{R}$ with period $T$, defined for one period as

$$f(t) = at \text{ for } t \in (-T/2, T/2] \quad \text{(ex2)}$$

where $a, T \in \mathbb{R}$. Perform a fourier analysis on $f$. Letting $a = 5$ and $T = 1$, plot $f$ along with the partial sum of the fourier synthesis, the first 50 nonzero components, over $t \in [-T, T]$.


Exercise 063 (9)

Consider the function $f: \mathbb{R} \rightarrow \mathbb{R}$ defined as

$$f(t) = \begin{cases} a - a|t|/T & \text{for } t \in [-T, T] \\ 0 & \text{otherwise} \end{cases} \quad \text{(ex3)}$$

where $a, T \in \mathbb{R}$. Perform a fourier analysis on $f$, resulting in $F(\omega)$. Plot $F$ for various $a$ and $T$.

Exercise 064 (10)

Consider the function $f: \mathbb{R} \rightarrow \mathbb{R}$ defined as

$$f(t) = a e^{-b|t-T|} \quad \text{(ex4)}$$

where $a, b, T \in \mathbb{R}$. Perform a fourier analysis on $f$, resulting in $F(\omega)$. Plot $F$ for various $a$, $b$, and $T$.

Exercise 065 (11)

Consider the function $f: \mathbb{R} \rightarrow \mathbb{R}$ defined as

$$f(t) = a\cos\omega_0 t + b\sin\omega_0 t \quad \text{(ex5)}$$

where $a, b, \omega_0 \in \mathbb{R}$ are constants. Perform a fourier analysis on $f$, resulting in $F(\omega)$.

Exercise 066 (12)

Derive a fourier transform property for expressions including function $f: \mathbb{R} \rightarrow \mathbb{R}$, for

$$f(t)\cos(\omega_0 t + \psi)$$

where $\omega_0, \psi \in \mathbb{R}$.

Exercise 067 (13)

Consider the function $f: \mathbb{R} \rightarrow \mathbb{R}$ defined as

$$f(t) = a u_s(t) e^{-bt}\cos(\omega_0 t + \psi) \quad \text{(ex6)}$$

where $a, b, \omega_0, \psi \in \mathbb{R}$ and $u_s(t)$ is the unit step function. Perform a fourier analysis on $f$, resulting in $F(\omega)$. Plot $F$ for various $a$, $b$, $\omega_0$, and $\psi$.


Exercise 068 (14)

Consider the function $f: \mathbb{R} \rightarrow \mathbb{R}$ defined as

$$f(t) = g(t)\cos(\omega_0 t) \quad \text{(ex7)}$$

where $\omega_0 \in \mathbb{R}$ and $g: \mathbb{R} \rightarrow \mathbb{R}$ will be defined in each part below. Perform a fourier analysis on $f$ for each $g$ below, for $\omega_1 \in \mathbb{R}$ a constant, and consider how things change if $\omega_1 \rightarrow \omega_0$.

a. $g(t) = \cos(\omega_1 t)$
b. $g(t) = \sin(\omega_1 t)$

Exercise 069 (savage)

An instrument called a "lock-in amplifier" can measure a sinusoidal signal $A\cos(\omega_0 t + \psi) = a\cos(\omega_0 t) + b\sin(\omega_0 t)$ at a known frequency $\omega_0$ with exceptional accuracy, even in the presence of significant noise $N(t)$. The workings of these devices can be described in two operations: first, the following operations on the signal with its noise $f_1(t) = a\cos(\omega_0 t) + b\sin(\omega_0 t) + N(t)$:

$$f_2(t) = f_1(t)\cos(\omega_1 t) \text{ and } f_3(t) = f_1(t)\sin(\omega_1 t) \quad \text{(ex8)}$$

where $\omega_0, \omega_1, a, b \in \mathbb{R}$. Note the relation of this operation to the Fourier analysis of Exercise 068. The key is to know with some accuracy $\omega_0$, such that the instrument can set $\omega_1 \approx \omega_0$. The second operation on the signal is an aggressive low-pass filter. The filtered $f_2$ and $f_3$ are called the in-phase and quadrature components of the signal and are typically given a complex representation:

(in-phase) + $j$ (quadrature).

Explain, with fourier analyses on $f_2$ and $f_3$,


a. what $F_2 = \mathcal{F}(f_2)$ looks like,
b. what $F_3 = \mathcal{F}(f_3)$ looks like,
c. why we want $\omega_1 \approx \omega_0$,
d. why a low-pass filter is desirable, and
e. what the time-domain signal will look like.

Exercise 0610 (strawman)

Consider again the lock-in amplifier explored in Exercise 069. Investigate the lock-in amplifier numerically with the following steps:

a. Generate a noisy sinusoidal signal at some frequency $\omega_0$. Include enough broadband white noise that the signal is invisible in a time-domain plot.
b. Generate $f_2$ and $f_3$ as described in Exercise 069.
c. Apply a time-domain discrete low-pass filter to each, $f_2 \mapsto \phi_2$ and $f_3 \mapsto \phi_3$, such as scipy's scipy.signal.sosfiltfilt, to complete the lock-in amplifier operation. Plot the results in time and as a complex (polar) plot.
d. Perform a discrete fourier transform on each, $f_2 \mapsto F_2$ and $f_3 \mapsto F_3$. Plot the spectra.
e. Construct a frequency-domain low-pass filter $F$ and apply it (multiply) to each, $F_2 \mapsto F_2'$ and $F_3 \mapsto F_3'$. Plot the filtered spectra.
f. Perform an inverse discrete fourier transform on each, $F_2' \mapsto f_2'$ and $F_3' \mapsto f_3'$. Plot the results in time and as a complex (polar) plot.
g. Compare the two methods used, i.e. time-domain filtering versus frequency-domain filtering.


pde Partial differential equations

An ordinary differential equation is one with (ordinary) derivatives of functions of a single variable each (time, in many applications). These typically describe quantities in some sort of lumped-parameter way: mass as a "point particle," a spring's force as a function of time-varying displacement across it, a resistor's current as a function of time-varying voltage across it. Given the simplicity of such models in comparison to the wildness of nature, it is quite surprising how well they work for a great many phenomena. For instance, electronics, rigid body mechanics, population dynamics, bulk fluid mechanics, and bulk heat transfer can be lumped-parameter modeled. However, as we have seen, there are many phenomena for which we require more detailed models. These include


1. There are some analytic techniques for gaining insight into PDEs for which there are no known solutions, such as considering the phase space. This is an active area of research; for more, see Bove, Colombini, and Santo (Antonio Bove, F (Ferruccio) Colombini, and Daniele Del Santo. Phase space analysis of partial differential equations. Progress in nonlinear differential equations and their applications v. 69. Boston / Berlin: Birkhäuser, 2006. ISBN 9780817645212).

• detailed fluid mechanics,
• detailed heat transfer,
• solid mechanics,
• electromagnetism, and
• quantum mechanics.

In many cases, what is required to account for is the time-varying spatial distribution of a quantity. In fluid mechanics, we treat a fluid as having quantities such as density and velocity that vary continuously over space and time. Deriving the governing equations for such phenomena typically involves vector calculus: we observed previously that statements about quantities like the divergence (e.g. continuity) can be made about certain scalar and vector fields. Such statements are governing equations (e.g. the continuity equation), and they are partial differential equations (PDEs) because the quantities of interest, called dependent variables (e.g. density and velocity), are both temporally and spatially varying (temporal and spatial variables are therefore called independent variables).

In this chapter, we explore the analytic solution of PDEs. This is related to, but distinct from, the numeric solution (i.e. simulation) of PDEs, which is another important topic. Many PDEs have no known analytic solution, so numeric solution is the best available option.¹ However, it is important to note that the insight one can gain from an analytic solution is often much greater than that from a numeric solution. This is easily understood when one considers that a numeric solution is an approximation for a specific set


2. Kreyszig, Advanced Engineering Mathematics, Ch. 12.

3. W.A. Strauss. Partial Differential Equations: An Introduction. Wiley, 2007. ISBN 9780470054567. A thorough and yet relatively compact introduction.

4. R. Haberman. Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version). Pearson Modern Classics for Advanced Mathematics. Pearson Education Canada, 2018. ISBN 9780134995434.

of initial and boundary conditions. Typically, very little can be said of what would happen in general, although this is often what we seek to know. So despite the importance of numeric solution, one should always prefer an analytic solution.

Three good texts on PDEs for further study are Kreyszig,² Strauss,³ and Haberman.⁴


pdeclass Classifying PDEs

PDEs often have an infinite number of solutions; however, when applying them to physical systems, we usually assume that a deterministic, or at least a probabilistic, sequence of events will occur. Therefore we impose additional constraints on a PDE, usually in the form of

1. initial conditions: values of the dependent variables over all space at an initial time, and
2. boundary conditions: values of the dependent variables (or their derivatives) on the spatial boundary, over all time.

Ideally, imposing such conditions leaves us with a well-posed problem, which has three aspects (Antonio Bove, F (Ferruccio) Colombini, and Daniele Del Santo. Phase space analysis of partial differential equations. Progress in nonlinear differential equations and their applications v. 69. Boston / Berlin: Birkhäuser, 2006. ISBN 9780817645212, § 1.5):

existence: there exists at least one solution,
uniqueness: there exists at most one solution, and
stability: if the PDE, boundary conditions, or initial conditions are changed slightly, the solution changes only slightly.

As with ODEs, PDEs can be linear or nonlinear; that is, the dependent variables and their derivatives can appear in only linear combinations (linear PDE) or in one or more nonlinear combinations (nonlinear PDE). As with ODEs, there are more known analytic solutions to linear PDEs than nonlinear PDEs.

The order of a PDE is the order of its highest partial derivative. A great many physical models can be described by second-order PDEs or systems thereof. Let $u$ be a dependent scalar variable, a function of $n$ temporal and spatial variables $x_i \in \mathbb{R}$. A second-order, linear PDE has the form, for coefficients $\alpha, \gamma, \delta$ real functions of the $x_i$ (W.A. Strauss. Partial Differential Equations: An Introduction. Wiley, 2007. ISBN 9780470054567, § 1.6),

$$\underbrace{\sum_{i=1}^n \sum_{j=1}^n \alpha_{ij}\,\partial^2_{x_i x_j} u}_{\text{second-order terms}} + \underbrace{\sum_{k=1}^{n}\left(\gamma_k\,\partial_{x_k} u + \delta_k u\right)}_{\text{first- and zeroth-order terms}} = \underbrace{f(x_1, \cdots, x_n)}_{\text{forcing}} \quad \text{(class1)}$$

where $f$ is called a forcing function. When $f$ is zero, Equation class1 is called homogeneous.

We can consider the coefficients $\alpha_{ij}$ to be components of a matrix $A$, with rows indexed by $i$ and columns indexed by $j$. There are four prominent classes, defined by the eigenvalues of $A$:

elliptic: the eigenvalues all have the same sign,
parabolic: the eigenvalues have the same sign except one that is zero,
hyperbolic: exactly one eigenvalue has the opposite sign of the others, and

ultrahyperbolic: at least two eigenvalues of each sign.

The first three of these have received extensive treatment. They are named after conic sections due to the similarity the equations have with polynomials when derivatives are considered analogous to powers of polynomial variables. For instance, here is a case of each of the first three classes:

$$\partial^2_{xx} u + \partial^2_{yy} u = 0 \quad \text{(elliptic)}$$
$$\partial^2_{xx} u - \partial^2_{yy} u = 0 \quad \text{(hyperbolic)}$$
$$\partial^2_{xx} u - \partial_t u = 0 \quad \text{(parabolic)}$$

When $A$ depends on $x_i$, it may have multiple classes across its domain. In general, this equation and its associated initial and boundary conditions do not comprise a well-posed problem; however, several special cases have been shown to be well-posed. Thus far, the most general statement of existence and uniqueness is the cauchy-kowalevski theorem for cauchy problems.
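The classification is mechanical once A is known. Here is a sketch (not from the text) that classifies a constant-coefficient, symmetric A by its eigenvalue signs:

```python
import numpy as np

def classify(A):
    # classify by eigenvalue signs of the (symmetric) coefficient matrix A
    lam = np.linalg.eigvalsh(np.asarray(A, dtype=float))
    n = len(lam)
    pos = np.sum(lam > 1e-12)
    neg = np.sum(lam < -1e-12)
    zero = n - pos - neg
    if zero == 0 and (pos == n or neg == n):
        return 'elliptic'
    if zero == 1 and (pos == n - 1 or neg == n - 1):
        return 'parabolic'
    if zero == 0 and (pos == 1 or neg == 1):
        return 'hyperbolic'
    return 'ultrahyperbolic'  # at least two eigenvalues of each sign

print(classify([[1, 0], [0, 1]]))    # Laplace equation: elliptic
print(classify([[1, 0], [0, -1]]))   # wave equation: hyperbolic
print(classify([[1, 0], [0, 0]]))    # diffusion equation: parabolic
```

The three test matrices correspond to the elliptic, hyperbolic, and parabolic examples above.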


5. For the S-L problem to be regular, it has the additional constraints that $p, q, \sigma$ are continuous and $p, \sigma > 0$ on $[a, b]$. This is also sometimes called the sturm-liouville eigenvalue problem. See Haberman (Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 5.3) for the more general (non-regular) S-L problem and Haberman (ibid., § 7.4) for the multi-dimensional analog.

6. These eigenvalues are closely related to, but distinct from, the "eigenvalues" that arise in systems of linear ODEs.

7. Ibid., § 5.3.

pdesturm Sturm-liouville problems

Before we introduce an important solution method for PDEs in Lecture pdeseparation, we consider an ordinary differential equation that will arise in that method when dealing with a single spatial dimension $x$: the sturm-liouville (S-L) differential equation. Let $p, q, \sigma$ be functions of $x$ on open interval $(a, b)$. Let $X$ be the dependent variable and $\lambda$ constant. The regular S-L problem is the S-L ODE⁵

$$\frac{d}{dx}\left(pX'\right) + qX + \lambda\sigma X = 0 \quad \text{(sturm1a)}$$

with boundary conditions

$$\beta_1 X(a) + \beta_2 X'(a) = 0 \quad \text{(sturm2a)}$$
$$\beta_3 X(b) + \beta_4 X'(b) = 0 \quad \text{(sturm2b)}$$

with coefficients $\beta_i \in \mathbb{R}$. This is a type of boundary value problem.

This problem has nontrivial solutions called eigenfunctions $X_n(x)$, with $n \in \mathbb{Z}^+$, corresponding to specific values of $\lambda = \lambda_n$ called eigenvalues.⁶ There are several important theorems proven about this (see Haberman⁷). Of greatest interest to us are that

1. there exist an infinite number of eigenfunctions $X_n$ (unique within a multiplicative constant),
2. there exists a unique corresponding real eigenvalue $\lambda_n$ for each eigenfunction $X_n$,
3. the eigenvalues can be ordered as $\lambda_1 < \lambda_2 < \cdots$,


4. eigenfunction $X_n$ has $n - 1$ zeros on open interval $(a, b)$, and
5. the eigenfunctions $X_n$ form an orthogonal basis with respect to weighting function $\sigma$, such that any piecewise continuous function $f: [a, b] \rightarrow \mathbb{R}$ can be represented by a generalized fourier series on $[a, b]$.

This last theorem will be of particular interest in Lecture pdeseparation.

Types of boundary conditions

Boundary conditions of the sturm-liouville kind (sturm2) have four sub-types:

dirichlet: for just $\beta_2, \beta_4 = 0$,
neumann: for just $\beta_1, \beta_3 = 0$,
robin: for all $\beta_i \neq 0$, and
mixed: if $\beta_1 = 0$, $\beta_3 \neq 0$; if $\beta_2 = 0$, $\beta_4 \neq 0$.

There are many problems that are not regular sturm-liouville problems. For instance, the right-hand sides of Equation sturm2 are zero, making them homogeneous boundary conditions; however, these can also be nonzero. Another case is periodic boundary conditions:

$$X(a) = X(b) \quad \text{(sturm3a)}$$
$$X'(a) = X'(b) \quad \text{(sturm3b)}$$


Example pdesturm-1: a sturm-liouville problem with dirichlet boundary conditions

Problem Statement

Consider the differential equation

$$X'' + \lambda X = 0 \quad \text{(sturm4)}$$

with dirichlet boundary conditions on the boundary of the interval $[0, L]$:

$$X(0) = 0 \text{ and } X(L) = 0. \quad \text{(sturm5)}$$

Solve for the eigenvalues and eigenfunctions.

Problem Solution

This is a sturm-liouville problem, so we know the eigenvalues are real. The well-known general solution to the ODE is

$$X(x) = \begin{cases} k_1 + k_2 x & \lambda = 0 \\ k_1 e^{j\sqrt{\lambda}x} + k_2 e^{-j\sqrt{\lambda}x} & \text{otherwise} \end{cases} \quad \text{(sturm6)}$$

with real constants $k_1, k_2$. The solution must also satisfy the boundary conditions. Let's apply them to the case of $\lambda = 0$ first:

$$X(0) = 0 \Rightarrow k_1 + k_2(0) = 0 \Rightarrow k_1 = 0 \quad \text{(sturm7)}$$
$$X(L) = 0 \Rightarrow k_1 + k_2(L) = 0 \Rightarrow k_2 = -k_1/L \quad \text{(sturm8)}$$

Together these imply $k_1 = k_2 = 0$, which gives the trivial solution $X(x) = 0$, in which we aren't interested. We say, then, for nontrivial solutions, $\lambda \neq 0$. Now let's check $\lambda < 0$. The solution becomes

$$X(x) = k_1 e^{-\sqrt{|\lambda|}x} + k_2 e^{\sqrt{|\lambda|}x} \quad \text{(sturm9)}$$
$$= k_3\cosh(\sqrt{|\lambda|}\,x) + k_4\sinh(\sqrt{|\lambda|}\,x) \quad \text{(sturm10)}$$

where $k_3$ and $k_4$ are real constants. Again applying the boundary conditions,

$$X(0) = 0 \Rightarrow k_3\cosh(0) + k_4\sinh(0) = 0 \Rightarrow k_3 + 0 = 0 \Rightarrow k_3 = 0$$
$$X(L) = 0 \Rightarrow 0\cosh(\sqrt{|\lambda|}L) + k_4\sinh(\sqrt{|\lambda|}L) = 0 \Rightarrow k_4\sinh(\sqrt{|\lambda|}L) = 0.$$

However, $\sinh(\sqrt{|\lambda|}L) \neq 0$ for $L > 0$, so $k_4 = k_3 = 0$: again, the trivial solution. Now let's try $\lambda > 0$. The solution can be written

$$X(x) = k_5\cos(\sqrt{\lambda}\,x) + k_6\sin(\sqrt{\lambda}\,x). \quad \text{(sturm11)}$$

Applying the boundary conditions for this case,

$$X(0) = 0 \Rightarrow k_5\cos(0) + k_6\sin(0) = 0 \Rightarrow k_5 + 0 = 0 \Rightarrow k_5 = 0$$
$$X(L) = 0 \Rightarrow 0\cos(\sqrt{\lambda}L) + k_6\sin(\sqrt{\lambda}L) = 0 \Rightarrow k_6\sin(\sqrt{\lambda}L) = 0.$$

Now, $\sin(\sqrt{\lambda}L) = 0$ for

$$\sqrt{\lambda}L = n\pi \Rightarrow \lambda = \left(\frac{n\pi}{L}\right)^2, \quad (n \in \mathbb{Z}^+).$$

Therefore, the only nontrivial solutions that satisfy both the ODE and the boundary conditions are the eigenfunctions

$$X_n(x) = \sin\left(\sqrt{\lambda_n}\,x\right) \quad \text{(sturm12a)}$$
$$= \sin\left(\frac{n\pi}{L}x\right) \quad \text{(sturm12b)}$$

with corresponding eigenvalues

$$\lambda_n = \left(\frac{n\pi}{L}\right)^2. \quad \text{(sturm13)}$$

Note that because $\lambda > 0$, $\lambda_1$ is the lowest eigenvalue.

Plotting the eigenfunctions

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: eigenfunctions_example_plot.ipynb
notebook kernel: python3

First, load some Python packages:


import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Set L = 1 and compute values for the first four eigenvalues lambda_n and eigenfunctions X_n:

L = 1
x = np.linspace(0, L, 100)
n = np.linspace(1, 4, 4, dtype=int)
lambda_n = (n*np.pi/L)**2
X_n = np.zeros([len(n), len(x)])
for i, n_i in enumerate(n):
    X_n[i] = np.sin(np.sqrt(lambda_n[i])*x)

Plot the eigenfunctions:

for i, n_i in enumerate(n):
    plt.plot(
        x, X_n[i],
        linewidth=2,
        label='n = ' + str(n_i)
    )
plt.legend()
plt.show()  # display the plot

[Figure: the first four eigenfunctions $X_n$, for n = 1, 2, 3, 4, on [0, 1].]


We see that the fourth of the S-L theorems appears true: $n - 1$ zeros of $X_n$ exist on the open interval $(0, 1)$.
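The fifth theorem can be spot-checked numerically as well (a sketch, not from the text): distinct eigenfunctions $\sin(n\pi x/L)$ should have zero inner product with weight $\sigma(x) = 1$.

```python
import numpy as np
from scipy.integrate import quad

L = 1  # as in the plots above

def ip(n, m):
    # <X_n, X_m> with weight sigma(x) = 1 on [0, L]
    val, _ = quad(lambda x: np.sin(n*np.pi*x/L)*np.sin(m*np.pi*x/L), 0, L)
    return val

print(ip(1, 2))  # ~ 0: distinct eigenfunctions are orthogonal
print(ip(3, 3))  # L/2 = 0.5: an eigenfunction is not orthogonal to itself
```

This orthogonality is what makes the generalized fourier analysis of the next lecture possible.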


pdeseparation PDE solution by separation of variables

We are now ready to learn one of the most important techniques for solving PDEs: separation of variables. It applies only to linear PDEs, since it will require the principle of superposition. Not all linear PDEs yield to this solution technique, but several important ones do.

The technique includes the following steps.

assume a product solution: Assume the solution can be written as a product solution $u_p$: the product of functions of each independent variable.

separate PDE: Substitute $u_p$ into the PDE and rearrange such that at least one side of the equation has functions of a single independent variable. If this is possible, the PDE is called separable.

set equal to a constant: Each side of the equation depends on different independent variables; therefore, they must each equal the same constant, often called $-\lambda$.

repeat separation as needed: If there are more than two independent variables, there will be an ODE in the separated variable and a PDE (with one fewer variable) in the other independent variables. Attempt to separate the PDE until only ODEs remain.

solve each boundary value problem: Solve each boundary value problem ODE, ignoring the initial conditions for now.

solve the time variable ODE: Solve for the general solution of the time variable ODE, sans initial conditions.

construct the product solution: Multiply the solution in each variable to construct the product solution $u_p$. If the boundary value problems were sturm-liouville, the product solution is a family of eigenfunctions, from which any function can be constructed via a generalized fourier series.

apply the initial condition: The product solutions individually usually do not meet the initial condition. However, a generalized fourier series of them nearly always does. Superposition tells us a linear combination of solutions to the PDE and boundary conditions is also a solution; the unique series that also satisfies the initial condition is the unique solution to the entire problem.
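The "separate PDE" step can be checked symbolically. This sketch (not from the text) uses sympy to confirm that substituting a product solution into the diffusion equation and dividing by $kTX$ leaves one side a function of $t$ only and the other a function of $x$ only:

```python
import sympy as sp

t, x = sp.symbols('t x', real=True)
k = sp.symbols('k', positive=True)
T = sp.Function('T')(t)
X = sp.Function('X')(x)

u_p = T*X  # assumed product solution
# diffusion equation with u_p substituted: T'X = k T X''
lhs = sp.diff(u_p, t)
rhs = k*sp.diff(u_p, x, 2)
# divide both sides by k*T*X to separate variables
separated_lhs = sp.simplify(lhs/(k*u_p))   # T'/(k T): a function of t only
separated_rhs = sp.simplify(rhs/(k*u_p))   # X''/X: a function of x only
print(separated_lhs)
print(separated_rhs)
```

Since the two separated sides share no independent variable, each must equal the same constant $-\lambda$, exactly as in the steps above.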

Example pdeseparation-1: 1D diffusion equation

Problem Statement

Consider the one-dimensional diffusion equation PDEᵃ

$$\partial_t u(t, x) = k\,\partial^2_{xx} u(t, x) \quad \text{(separation1)}$$

with real constant $k$, with dirichlet boundary conditions on interval $x \in [0, L]$,

$$u(t, 0) = 0 \quad \text{(separation2a)}$$
$$u(t, L) = 0, \quad \text{(separation2b)}$$

and with initial condition

$$u(0, x) = f(x) \quad \text{(separation3)}$$

where $f$ is some piecewise continuous function on $[0, L]$.

a. For more on the diffusion or heat equation, see Haberman (Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 2.3), Kreyszig (Kreyszig, Advanced Engineering Mathematics, § 12.5), and Strauss (Strauss, Partial Differential Equations: An Introduction, § 2.3).

Problem Solution

Assume a product solution

First, we assume a product solution of the form $u_p(t, x) = T(t)X(x)$, where $T$ and $X$ are unknown functions on $t > 0$ and $x \in [0, L]$.

Separate PDE

Second, we substitute the product solution into Equation separation1 and separate variables:

$$T'X = kTX'' \Rightarrow \quad \text{(separation4)}$$
$$\frac{T'}{kT} = \frac{X''}{X}. \quad \text{(separation5)}$$

So it is separable. Note that we chose to group $k$ with $T$, which was arbitrary but conventional.

Set equal to a constant

Since these two sides depend on different independent variables ($t$ and $x$), they must equal the same constant, which we call $-\lambda$, so we have two ODEs:

$$\frac{T'}{kT} = -\lambda \Rightarrow T' + \lambda kT = 0 \quad \text{(separation6)}$$
$$\frac{X''}{X} = -\lambda \Rightarrow X'' + \lambda X = 0. \quad \text{(separation7)}$$

Solve the boundary value problem

The latter of these equations, with the boundary conditions (separation2), is precisely the same sturm-liouville boundary value problem from Example pdesturm-1, which had eigenfunctions

$$X_n(x) = \sin\left(\sqrt{\lambda_n}\,x\right) \quad \text{(separation8a)}$$
$$= \sin\left(\frac{n\pi}{L}x\right) \quad \text{(separation8b)}$$

with corresponding (positive) eigenvalues

$$\lambda_n = \left(\frac{n\pi}{L}\right)^2. \quad \text{(separation9)}$$

Solve the time variable ODE

The time variable ODE is homogeneous and has the familiar general solution

$$T(t) = ce^{-k\lambda t} \quad \text{(separation10)}$$

with real constant $c$. However, the boundary value problem restricted values of $\lambda$ to $\lambda_n$, so

$$T_n(t) = ce^{-k(n\pi/L)^2 t}. \quad \text{(separation11)}$$

Construct the product solution

The product solution is

$$u_p(t, x) = T_n(t)X_n(x) = ce^{-k(n\pi/L)^2 t}\sin\left(\frac{n\pi}{L}x\right). \quad \text{(separation12)}$$

This is a family of solutions that each satisfy only exotically specific initial conditions.

Apply the initial condition

The initial condition is $u(0, x) = f(x)$. The eigenfunctions of the boundary value problem form a fourier series that satisfies the initial condition on the interval $[0, L]$ if we extend $f$ to be periodic and odd over $x$ (Kreyszig, Advanced Engineering Mathematics, p. 550); we call the extension $f^*$. The odd series synthesis can be written

$$f^*(x) = \sum_{n=1}^{\infty} b_n\sin\left(\frac{n\pi}{L}x\right) \quad \text{(separation13)}$$


where the fourier analysis gives

$$b_n = \frac{2}{L}\int_0^L f^*(\chi)\sin\left(\frac{n\pi}{L}\chi\right) d\chi. \quad \text{(separation14)}$$

So the complete solution is

$$u(t, x) = \sum_{n=1}^{\infty} b_n e^{-k(n\pi/L)^2 t}\sin\left(\frac{n\pi}{L}x\right). \quad \text{(separation15)}$$

Notice this satisfies the PDE, the boundary conditions, and the initial condition.

Plotting solutions

If we want to plot solutions, we need to specify an initial condition $u(0, x) = f^*(x)$ over $[0, L]$. We can choose anything piecewise continuous, but for simplicity let's let

$$f(x) = 1, \quad (x \in [0, L]).$$

The odd periodic extension is an odd square wave. The integral (separation14) gives

$$b_n = \frac{2}{n\pi}(1 - \cos(n\pi)) = \begin{cases} 0 & n \text{ even} \\ \frac{4}{n\pi} & n \text{ odd.} \end{cases} \quad \text{(separation16)}$$

Now we can write the solution as

$$u(t, x) = \sum_{n=1,\ n \text{ odd}}^{\infty} \frac{4}{n\pi} e^{-k(n\pi/L)^2 t}\sin\left(\frac{n\pi}{L}x\right). \quad \text{(separation17)}$$

Plotting in Python

First, load some Python packages:

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex


Set k = L = 1 and sum values for the first N terms of the solution:

    L = 1
    k = 1
    N = 100
    x = np.linspace(0, L, 300)
    t = np.linspace(0, 2*(L/np.pi)**2, 100)
    u_n = np.zeros([len(t), len(x)])
    for n in range(N):
        n = n + 1  # because index starts at 0
        if n % 2 == 0:  # even
            pass  # already initialized to zeros
        else:  # odd
            u_n += 4/(n*np.pi)*np.outer(
                np.exp(-k*(n*np.pi/L)**2*t),
                np.sin(n*np.pi/L*x)
            )

Let's first plot the initial condition:

    p = plt.figure()
    plt.plot(x, u_n[0])
    plt.xlabel('space $x$')
    plt.ylabel('$u(0,x)$')
    plt.show()

[Figure: the partial-sum initial condition u(0, x) plotted over x ∈ [0, 1].]


Now we plot the entire response:

    p = plt.figure()
    plt.contourf(t, x, u_n.T)
    c = plt.colorbar()
    c.set_label('$u(t,x)$')
    plt.xlabel('time $t$')
    plt.ylabel('space $x$')
    plt.show()

[Figure: contour plot of u(t, x) over time t ∈ [0, 0.2] and space x ∈ [0, 1].]

We see the diffusive action proceeds as we expected.

Python code in this section was generated from a Jupyter notebook named pde_separation_example_01.ipynb with a python3 kernel.


Key term: wave equation.

8. Kreyszig, Advanced Engineering Mathematics, §§ 12.9, 12.10.
9. Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), §§ 4.5, 7.3.

pdewave The 1D wave equation

The one-dimensional wave equation is the linear PDE

    ∂²_tt u(t, x) = c² ∂²_xx u(t, x)  (wave1)

with real constant c. This equation models such phenomena as strings, fluids, sound, and light. It is subject to initial and boundary conditions and can be extended to multiple spatial dimensions. For 2D and 3D examples in rectangular and polar coordinates, see Kreyszig⁸ and Haberman⁹.

Example pdewave-1: vibrating string, PDE solution by separation of variables

Problem Statement

Consider the one-dimensional wave equation PDE

    ∂²_tt u(t, x) = c² ∂²_xx u(t, x)  (wave2)

with real constant c, with Dirichlet boundary conditions on interval x ∈ [0, L]

    u(t, 0) = 0 and u(t, L) = 0  (wave3a)

and with initial conditions (we need two because of the second time-derivative)

    u(0, x) = f(x) and ∂_t u(0, x) = g(x)  (wave4)

where f and g are some piecewise continuous functions on [0, L].

Problem Solution

Assume a product solution

First, we assume a product solution of the form u_p(t, x) = T(t)X(x), where T and X are unknown functions on t > 0 and x ∈ [0, L].


Separate PDE

Second, we substitute the product solution into Equation wave2 and separate variables:

    T″X = c²TX″  ⇒  (wave5)
    T″/(c²T) = X″/X  (wave6)

So it is separable. Note that we chose to group c with T, which was arbitrary but conventional.

Set equal to a constant

Since these two sides depend on different independent variables (t and x), they must equal the same constant, which we call −λ, so we have two ODEs:

    T″/(c²T) = −λ  ⇒  T″ + λc²T = 0  (wave7)
    X″/X = −λ  ⇒  X″ + λX = 0  (wave8)

Solve the boundary value problem

The latter of these equations, with the boundary conditions (wave3), is precisely the same Sturm–Liouville boundary value problem from Example pdesturm-1, which had eigenfunctions

    X_n(x) = sin(√λ_n x)  (wave9a)
           = sin(nπx/L)  (wave9b)

with corresponding (positive) eigenvalues

    λ_n = (nπ/L)²  (wave10)

Solve the time variable ODE

The time variable ODE is homogeneous and, with λ restricted to positive values by the boundary value problem, has the familiar general solution

    T(t) = k_1 cos(c√λ t) + k_2 sin(c√λ t)  (wave11)


with real constants k_1 and k_2. However, the boundary value problem restricted the values of λ to λ_n, so

    T_n(t) = k_1 cos(cnπt/L) + k_2 sin(cnπt/L)  (wave12)

Construct the product solution

The product solution is

    u_p(t, x) = T_n(t) X_n(x)
              = k_1 sin(nπx/L) cos(cnπt/L) + k_2 sin(nπx/L) sin(cnπt/L)

This is a family of solutions that each satisfy only exotically specific initial conditions.

Apply the initial conditions

Recall that superposition tells us that any linear combination of the product solution is also a solution. Therefore,

    u(t, x) = Σ_{n=1}^∞ [a_n sin(nπx/L) cos(cnπt/L) + b_n sin(nπx/L) sin(cnπt/L)]  (wave13)

is a solution. If a_n and b_n are properly selected to satisfy the initial conditions, Equation wave13 will be the solution to the entire problem. Substituting t = 0 into our potential solution gives

    u(0, x) = Σ_{n=1}^∞ a_n sin(nπx/L)  (wave14a)
    ∂_t u(t, x)|_{t=0} = Σ_{n=1}^∞ b_n (cnπ/L) sin(nπx/L)  (wave14b)

Let us extend f and g to be periodic and odd over x; we call the extensions f* and g*. From Equation wave14, the initial conditions are satisfied if

    f*(x) = Σ_{n=1}^∞ a_n sin(nπx/L)  (wave15a)
    g*(x) = Σ_{n=1}^∞ b_n (cnπ/L) sin(nπx/L)  (wave15b)


We identify these as two odd Fourier syntheses. The corresponding Fourier analyses are

    a_n = (2/L) ∫_0^L f*(χ) sin(nπχ/L) dχ  (wave16a)
    b_n (cnπ/L) = (2/L) ∫_0^L g*(χ) sin(nπχ/L) dχ  (wave16b)

So the complete solution is Equation wave13 with components given by Equation wave16. Notice this satisfies the PDE, the boundary conditions, and the initial conditions.

Discussion

It can be shown that this series solution is equivalent to two traveling waves that are interfering (see Habermanᵃ and Kreyszigᵇ). This is convenient because computing the series solution exactly requires an infinite summation. We show in the following section that the approximation by partial summation is still quite good.
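The traveling-wave claim is easy to spot-check; a sympy sketch (ours, not the author's) verifying that any u = F(x − ct) + G(x + ct) satisfies the wave equation:

```python
import sympy as sp

t, x, c = sp.symbols("t x c")
F = sp.Function("F")
G = sp.Function("G")

# d'Alembert form: a superposition of right- and left-traveling waves
u = F(x - c * t) + G(x + c * t)

# residual of the wave equation u_tt = c^2 u_xx
residual = sp.simplify(sp.diff(u, t, 2) - c**2 * sp.diff(u, x, 2))
print(residual)  # 0
```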

Choosing specific initial conditions

If we want to plot solutions, we need to specify initial conditions over [0, L]. Let's model a string being suddenly struck from rest as

    f(x) = 0  (wave17)
    g(x) = δ(x − ∆L)  (wave18)

where δ is the Dirac delta distribution and ∆ ∈ [0, 1] is a fraction of L representing the location of the string being struck. The odd periodic extension is an odd pulse train. The integrals of (wave16) give

    a_n = 0  (wave19a)
    b_n = (2/(cnπ)) ∫_0^L δ(χ − ∆L) sin(nπχ/L) dχ
        = (2/(cnπ)) sin(nπ∆)  (sifting property)


Now we can write the solution as

    u(t, x) = Σ_{n=1}^∞ (2/(cnπ)) sin(nπ∆) sin(nπx/L) sin(cnπt/L)  (wave20)

Plotting in Python

First, load some Python packages:

    import numpy as np
    import matplotlib.pyplot as plt
    from IPython.display import display, Markdown, Latex

Set c = L = 1 and sum values for the first N terms of the solution for some striking location ∆:

    Delta = 0.1  # 0 <= Delta <= L
    L = 1
    c = 1
    N = 150
    t = np.linspace(0, 30*(L/np.pi)**2, 100)
    x = np.linspace(0, L, 150)
    t_b, x_b = np.meshgrid(t, x)
    u_n = np.zeros([len(x), len(t)])
    for n in range(N):
        n = n + 1  # because index starts at 0
        u_n += 4/(c*n*np.pi) * \
            np.sin(n*np.pi*Delta) * \
            np.sin(c*n*np.pi/L*t_b) * \
            np.sin(n*np.pi/L*x_b)

Let's first plot some early snapshots of the response:

    import seaborn as sns
    n_snaps = 7
    sns.set_palette(sns.diverging_palette(
        240, 10, n=n_snaps, center='dark')
    )
    fig, ax = plt.subplots()
    it = np.linspace(2, 77, n_snaps, dtype=int)
    for i in range(len(it)):
        ax.plot(x, u_n[:, it[i]], label=f't = {t[it[i]]:.3f}')
    lgd = ax.legend(
        bbox_to_anchor=(1.05, 1),
        loc='upper left'
    )
    plt.xlabel('space $x$')
    plt.ylabel('$u(t,x)$')
    plt.show()

[Figure: seven early snapshots of u(t, x) over x ∈ [0, 1], showing a pulse developing and propagating; u ranges over roughly [−1, 1].]

Now we plot the entire response:

    fig, ax = plt.subplots()
    p = ax.contourf(t, x, u_n)
    c = fig.colorbar(p, ax=ax)
    c.set_label('$u(t,x)$')
    plt.xlabel('time $t$')
    plt.ylabel('space $x$')
    plt.show()


[Figure: contour plot of u(t, x) over time t ∈ [0, 3] and space x ∈ [0, 1].]

We see a wave develop and travel, reflecting and inverting off each boundary.

a. Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 4.4.
b. Kreyszig, Advanced Engineering Mathematics, § 12.2.

Python code in this section was generated from a Jupyter notebook named pde_separation_example_02.ipynb with a python3 kernel.


pdeex Exercises for Chapter 07

Exercise 07.1 (poltergeist)

The PDE of Example pdeseparation-1 can be used to describe the conduction of heat along a long, thin rod insulated along its length, where u(t, x) represents temperature. The initial and Dirichlet boundary conditions in that example would be interpreted as an initial temperature distribution along the bar and fixed temperatures of the ends. Now consider the same PDE

    ∂_t u(t, x) = k ∂²_xx u(t, x)  (ex1)

with real constant k, now with Neumann boundary conditions on interval x ∈ [0, L]

    ∂_x u|_{x=0} = 0 and ∂_x u|_{x=L} = 0  (ex2a)

and with initial condition

    u(0, x) = f(x)  (ex3)

where f is some piecewise continuous function on [0, L]. This represents the complete insulation of the ends of the rod, such that no heat flows from the ends (or from anywhere else).

a. Assume a product solution, separate variables into X(x) and T(t), and set the separation constant to −λ.
b. Solve the boundary value problem for its eigenfunctions X_n and eigenvalues λ_n.
c. Solve for the general solution of the time variable ODE.
d. Write the product solution and apply the initial condition f(x) by constructing it from a generalized Fourier series of the product solution.


e. Let L = k = 1 and f(x) = 100 − (200/L)|x − L/2|, as shown in Figure 07.1. Compute the solution series components. Plot the sum of the first 50 terms over x and t.

[Figure 07.1: the initial condition for Exercise 07.1, a triangular profile rising from 0 at x = 0 to 100 at x = L/2 and back to 0 at x = L.]

Exercise 07.2 (kathmandu)

Consider the free vibration of a uniform and relatively thin beam (with modulus of elasticity E, second moment of cross-sectional area I, and mass-per-length µ) pinned at each end. The PDE describing this is a version of the Euler–Bernoulli beam equation for vertical motion u:

    ∂²_tt u(t, x) = −α² ∂⁴_xxxx u(t, x)  (ex4)

with real constant α defined as

    α² = EI/µ  (ex5)

Pinned supports fix vertical motion such that we have boundary conditions on interval x ∈ [0, L]

    u(t, 0) = 0 and u(t, L) = 0  (ex6a)

Additionally, pinned supports cannot provide a moment, so

    ∂²_xx u|_{x=0} = 0 and ∂²_xx u|_{x=L} = 0  (ex6b)

Furthermore, consider the initial conditions

    u(0, x) = f(x) and ∂_t u|_{t=0} = 0  (ex7a)

where f is some piecewise continuous function on [0, L].


a. Assume a product solution, separate variables into X(x) and T(t), and set the separation constant to −λ.
b. Solve the boundary value problem for its eigenfunctions X_n and eigenvalues λ_n. Assume real λ > 0 (it's true but tedious to show).
c. Solve for the general solution of the time variable ODE.
d. Write the product solution and apply the initial conditions by constructing it from a generalized Fourier series of the product solution.
e. Let L = α = 1 and f(x) = sin(10πx/L), as shown in Figure 07.2. Compute the solution series components. Plot the sum of the first 50 terms over x and t.

[Figure 07.2: the initial condition for Exercise 07.2, f(x) = sin(10πx/L) oscillating between −1 and 1 over x ∈ [0, 1].]

opt

Optimization


Key terms: objective function, extremum, global extremum, local extremum, gradient descent, stationary point, saddle point, Hessian matrix.

optgrad Gradient descent

Consider a multivariate function f: Rⁿ → R that represents some cost or value. This is called an objective function, and we often want to find an X ∈ Rⁿ that yields f's extremum: minimum or maximum, depending on whichever is desirable.

It is important to note, however, that some functions have no finite extremum. Other functions have multiple. Finding a global extremum is generally difficult; however, many good methods exist for finding a local extremum, an extremum for some region R ⊂ Rⁿ. The method explored here is called gradient descent. It will soon become apparent why it has this name.

Stationary points

Recall from basic calculus that a function f of a single variable has potential local extrema where df(x)/dx = 0. The multivariate version of this for multivariate function f is

    ∇f = 0  (grad1)

A value X for which Equation grad1 holds is called a stationary point. However, as in the univariate case, a stationary point may not be a local extremum; in these cases it is called a saddle point.

Consider the Hessian matrix H with values, for independent variables x_i,

    H_ij = ∂²_{x_i x_j} f  (grad2)


Key terms: second partial derivative test, positive definite, negative definite, indefinite, convex, concave.

For a stationary point X, the second partial derivative test tells us if it is a local maximum, local minimum, or saddle point:

minimum: If H(X) is positive definite (all its eigenvalues are positive), X is a local minimum.
maximum: If H(X) is negative definite (all its eigenvalues are negative), X is a local maximum.
saddle: If H(X) is indefinite (it has both positive and negative eigenvalues), X is a saddle point.

These are sometimes called tests for concavity: minima occur where f is convex and maxima where f is concave (i.e., where −f is convex).

It turns out, however, that solving Equation grad1 directly for stationary points is generally hard. Therefore, we will typically use an iterative technique for estimating them.

The gradient points the way

Although Equation grad1 isn't usually directly useful for computing stationary points, it suggests iterative techniques that are. Several techniques rely on the insight that the gradient points toward stationary points. Recall from Lecture vecsgrad that ∇f is a vector field that points in the direction of greatest increase in f.


Key terms: line search, step size α.

Consider starting at some point x_0 and wanting to move iteratively closer to a stationary point. So if one is seeking a maximum of f, then choose x_1 to be in the direction of ∇f. If one is seeking a minimum of f, then choose x_1 to be opposite the direction of ∇f.

The question becomes: how far α should we go in (or opposite) the direction of the gradient? Surely too-small α will require more iteration, and too-large α will lead to poor convergence or missing minima altogether. This framing of the problem is called line search. There are a few common methods for choosing α, called the step size, some more computationally efficient than others.

Two methods for choosing the step size are described below. Both are framed as minimization methods, but changing the sign of the step turns them into maximization methods.

The classical method

Let

    g_k = ∇f(x_k)  (grad3)

the gradient at the algorithm's current estimate x_k of the minimum. The classical method of choosing α is to attempt to solve analytically for

    α_k = argmin_α f(x_k − α g_k)  (grad4)

This solution approximates the function f as one varies α. It is approximate because, as α varies, so should x. But even with α as the only variable, Equation grad4 may be difficult or impossible to solve. However, this is sometimes called the "optimal" choice for α. Here "optimality" refers not to practicality but to ideality. This method is rarely used to solve practical problems.

1. Kreyszig, Advanced Engineering Mathematics, § 22.1.
2. Jonathan Barzilai and Jonathan M. Borwein. "Two-Point Step Size Gradient Methods". In: IMA Journal of Numerical Analysis 8.1 (Jan. 1988), pp. 141–148. issn: 0272-4979. doi: 10.1093/imanum/8.1.141. This includes an innovative line search method.

method can be summarized in the pseudocode

of Algorithm 1 It is described further in

Kreyszig1

Algorithm 1: Classical gradient descent

1: procedure classical_minimizer(f, x0, T)
2:   while δx > T do  ▷ until threshold T is met
3:     g_k ← ∇f(x_k)
4:     α_k ← argmin_α f(x_k − α g_k)
5:     x_{k+1} ← x_k − α_k g_k
6:     δx ← ‖x_{k+1} − x_k‖
7:     k ← k + 1
8:   end while
9:   return x_k  ▷ the threshold was reached
10: end procedure

The Barzilai and Borwein method

In practice, several non-classical methods are used for choosing step size α. Most of these construct criteria for step sizes that are too small and too large and prescribe choosing some α that (at least in certain cases) must be in the sweet-spot in between. Barzilai and Borwein² developed such a prescription, which we now present.

Let ∆x_k = x_k − x_{k−1} and ∆g_k = g_k − g_{k−1}. This method minimizes ‖∆x − α∆g‖² by choosing

    α_k = (∆x_k · ∆g_k)/(∆g_k · ∆g_k)  (grad5)


3. Barzilai and Borwein, "Two-Point Step Size Gradient Methods".

The algorithm of this gradient descent method can be summarized in the pseudocode of Algorithm 2. It is described further in Barzilai and Borwein³.

Algorithm 2: Barzilai and Borwein gradient descent

1: procedure barzilai_minimizer(f, x0, T)
2:   while δx > T do  ▷ until threshold T is met
3:     g_k ← ∇f(x_k)
4:     ∆g_k ← g_k − g_{k−1}
5:     ∆x_k ← x_k − x_{k−1}
6:     α_k ← (∆x_k · ∆g_k)/(∆g_k · ∆g_k)
7:     x_{k+1} ← x_k − α_k g_k
8:     δx ← ‖x_{k+1} − x_k‖
9:     k ← k + 1
10:  end while
11:  return x_k  ▷ the threshold was reached
12: end procedure
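As a complement to the class-based implementation in Example optgrad-1, here is a minimal numpy sketch of Algorithm 2 (our illustration), applied to f1(x) = (x1 − 25)² + 13(x2 + 10)²:

```python
import numpy as np

def g(x):  # gradient of f1(x) = (x1 - 25)^2 + 13 (x2 + 10)^2
    return np.array([2 * (x[0] - 25), 26 * (x[1] + 10)])

x_km1 = np.array([-50.0, 40.0])    # x_0
x_k = x_km1 - 1e-4 * g(x_km1)      # small seed step so dx, dg are nonzero
for k in range(100):
    dg = g(x_k) - g(x_km1)
    dx = x_k - x_km1
    alpha = (dx @ dg) / (dg @ dg)  # step size (grad5)
    x_km1, x_k = x_k, x_k - alpha * g(x_k)
    if np.linalg.norm(x_k - x_km1) < 1e-10:
        break

print(np.round(x_k, 4))  # near the minimum at (25, -10)
```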

Example optgrad-1: Barzilai and Borwein gradient descent

Problem Statement

Consider the functions (a) f1: R² → R and (b) f2: R² → R defined as

    f1(x) = (x_1 − 25)² + 13(x_2 + 10)²  (grad6)
    f2(x) = (1/2) x · Ax − b · x  (grad7)

where

    A = [20  0
          0 10]  and  (grad8a)
    b = [1 1]ᵀ  (grad8b)

Use the method of Barzilai and Borweinᵃ, starting at some x_0, to find a minimum of each function.

a. Ibid.

Problem Solution

First, load some Python packages:

    from sympy import *
    import numpy as np
    import matplotlib.pyplot as plt
    from IPython.display import display, Markdown, Latex
    from tabulate import tabulate

We begin by writing a class gradient_descent_min to perform the gradient descent. This is not optimized for speed.

    class gradient_descent_min():
        """A Barzilai and Borwein gradient descent class

        Inputs:
          f:  Python function of x variables
          x:  list of symbolic variables (e.g. [x1, x2])
          x0: list, numeric initial guess of a min of f
          T:  step size threshold for stopping the descent

        To execute the gradient descent, call the descend method.

        nb: This is only for gradients in cartesian coordinates.
        Further work would be to implement this in multiple or
        generalized coordinates. See the grad method below for
        implementation.
        """
        def __init__(self, f, x, x0, T):
            self.f = f
            self.x = Array(x)
            self.x0 = np.array(x0)
            self.T = T
            self.n = len(x0)  # size of x
            self.g = lambdify(x, self.grad(f, x), 'numpy')
            self.xk = np.array(x0)
            self.table = {}

        def descend(self):
            # unpack variables
            f = self.f
            x = self.x
            x0 = self.x0
            T = self.T
            g = self.g
            # initialize variables
            N = 0
            x_k = x0
            dx = 2*T  # can't be zero
            x_km1 = 0.9*x0 - 0.1  # can't equal x0
            g_km1 = np.array(g(*x_km1))
            N_max = 100  # max iterations
            table_data = [[N, x0, np.array(g(*x0)), 0]]
            while (dx > T and N < N_max) or N < 1:
                N += 1  # increment index
                g_k = np.array(g(*x_k))
                dg_k = g_k - g_km1
                dx_k = x_k - x_km1
                alpha_k = dx_k.dot(dg_k)/dg_k.dot(dg_k)
                x_km1 = x_k  # store
                x_k = x_k - alpha_k*g_k  # save
                table_data.append([N, x_k, g_k, alpha_k])
                self.xk = np.vstack((self.xk, x_k))
                # store other variables
                g_km1 = g_k
                dx = np.linalg.norm(x_k - x_km1)  # check
            self.tabulater(table_data)

        def tabulater(self, table_data):
            np.set_printoptions(precision=2)
            tabulate.LATEX_ESCAPE_RULES = {}
            self.table['python'] = tabulate(
                table_data, headers=['N', 'x_k', 'g_k', 'alpha']
            )
            self.table['latex'] = tabulate(
                table_data,
                headers=['$N$', '$\\bm{x}_k$', '$\\bm{g}_k$', '$\\alpha$'],
                tablefmt='latex_raw'
            )

        def grad(self, f, x):  # cartesian coords gradient
            return derive_by_array(f(x), x)

First, consider f1:

    var('x1 x2')
    x = Array([x1, x2])
    f1 = lambda x: (x[0]-25)**2 + 13*(x[1]+10)**2
    gd = gradient_descent_min(f=f1, x=x, x0=[-50, 40], T=1e-8)


Perform the gradient descent:

    gd.descend()

Print the interesting variables:

    print(gd.table['python'])

      N  x_k              g_k                    alpha
    ---  ---------------  ---------------------  ---------
      0  [-50  40]        [-150 1300]            0
      1  [-43.65 -15.03]  [-150 1300]            0.0423296
      2  [-38.36 -10.  ]  [-137.3  -130.74]      0.0384979
      3  [-33.11 -10.  ]  [-1.27e+02  1.24e-01]  0.041454
      4  [ 24.99 -10.  ]  [-1.16e+02 -9.62e-03]  0.499926
      5  [ 25.   -10.05]  [-0.02  0.12]          0.499999
      6  [ 25. -10.]      [-1.84e-08 -1.38e+00]  0.0385225
      7  [ 25. -10.]      [-1.70e-08  2.19e-03]  0.0384615
      8  [ 25. -10.]      [-1.57e-08  0.00e+00]  0.0384615

Now let's lambdify the function f1 so we can plot:

    f1_lambda = lambdify((x1, x2), f1(x), 'numpy')

Now let's plot a contour plot with the gradient descent overlaid:

    fig, ax = plt.subplots()
    # contour plot
    X1 = np.linspace(-100, 100, 100)
    X2 = np.linspace(-50, 50, 100)
    X1, X2 = np.meshgrid(X1, X2)
    F1 = f1_lambda(X1, X2)
    plt.contourf(X1, X2, F1)
    # gradient descent plot
    from mpl_toolkits.mplot3d import Axes3D
    from matplotlib.collections import LineCollection
    xX1 = gd.xk[:, 0]
    xX2 = gd.xk[:, 1]
    points = np.array([xX1, xX2]).T.reshape(-1, 1, 2)
    segments = np.concatenate(
        [points[:-1], points[1:]], axis=1
    )
    lc = LineCollection(
        segments, cmap=plt.get_cmap('Reds')
    )
    lc.set_array(np.linspace(0, 1, len(xX1)))  # color segs
    lc.set_linewidth(3)
    ax.autoscale(False)  # avoid the scatter changing lims
    ax.add_collection(lc)
    ax.scatter(
        xX1, xX2, zorder=1, marker='o',
        color=plt.cm.Reds(np.linspace(0, 1, len(xX1))),
        edgecolor='none'
    )
    plt.show()

[Figure: contour plot of f1 over x1 ∈ [−100, 100], x2 ∈ [−50, 50], with the gradient descent steps overlaid, converging to (25, −10).]


Now consider f2:

    A = Matrix([[10, 0], [0, 20]])
    b = Matrix([[1, 1]])
    def f2(x):
        X = Array([x]).tomatrix().T
        return (1/2)*X.dot(A*X) - b.dot(X)

    gd = gradient_descent_min(f=f2, x=x, x0=[50, -40], T=1e-8)

Perform the gradient descent:

    gd.descend()

Print the interesting variables:

    print(gd.table['python'])


      N  x_k             g_k                    alpha
    ---  --------------  ---------------------  ---------
      0  [ 50 -40]       [ 499 -801]            0
      1  [17.58 12.04]   [ 499 -801]            0.0649741
      2  [ 8.07 -1.01]   [174.78 239.88]        0.0544221
      3  [3.62 0.17]     [ 79.66 -21.22]        0.0558582
      4  [ 0.49 -0.05]   [35.16  2.49]          0.0889491
      5  [0.1  0.14]     [ 3.89 -1.94]          0.0990201
      6  [0.1 0.  ]      [0.04 1.9 ]            0.0750849
      7  [0.1  0.05]     [ 0.01 -0.95]          0.050005
      8  [0.1  0.05]     [4.74e-03 9.58e-05]    0.0500012
      9  [0.1  0.05]     [ 2.37e-03 -2.38e-09]  0.0999186
     10  [0.1  0.05]     [1.93e-06 2.37e-09]    0.1
     11  [0.1  0.05]     [ 0.00e+00 -2.37e-09]  0.0999997

Now let's lambdify the function f2 so we can plot:

    f2_lambda = lambdify((x1, x2), f2(x), 'numpy')

Now let's plot a contour plot with the gradient descent overlaid:

    fig, ax = plt.subplots()
    # contour plot
    X1 = np.linspace(-100, 100, 100)
    X2 = np.linspace(-50, 50, 100)
    X1, X2 = np.meshgrid(X1, X2)
    F2 = f2_lambda(X1, X2)
    plt.contourf(X1, X2, F2)
    # gradient descent plot
    from mpl_toolkits.mplot3d import Axes3D
    from matplotlib.collections import LineCollection
    xX1 = gd.xk[:, 0]
    xX2 = gd.xk[:, 1]
    points = np.array([xX1, xX2]).T.reshape(-1, 1, 2)
    segments = np.concatenate(
        [points[:-1], points[1:]], axis=1
    )
    lc = LineCollection(
        segments, cmap=plt.get_cmap('Reds')
    )
    lc.set_array(np.linspace(0, 1, len(xX1)))  # color segs
    lc.set_linewidth(3)
    ax.autoscale(False)  # avoid the scatter changing lims
    ax.add_collection(lc)
    ax.scatter(
        xX1, xX2, zorder=1, marker='o',
        color=plt.cm.Reds(np.linspace(0, 1, len(xX1))),
        edgecolor='none'
    )
    plt.show()

[Figure: contour plot of f2 over x1 ∈ [−100, 100], x2 ∈ [−50, 50], with the gradient descent steps overlaid, converging to (0.1, 0.05).]

Python code in this section was generated from a Jupyter notebook named gradient_descent.ipynb with a python3 kernel.


Key terms: constraints, linear programming problem, maximize, constrained, linear, feasible solution, optimal, hyperplane.

optlin Constrained linear optimization

Consider a linear objective function f: Rⁿ → R with variables x_i in vector x and coefficients c_i in vector c:

    f(x) = c · x  (lin1)

subject to the linear constraints (restrictions on the x_i)

    Ax ≤ a,  (lin2a)
    Bx = b, and  (lin2b)
    l ≤ x ≤ u  (lin2c)

where A and B are constant matrices and a, b, l, u are n-vectors. This is one formulation of what is called a linear programming problem. Usually we want to maximize f over the constraints. Such problems frequently arise throughout engineering, for instance in manufacturing, transportation, operations, etc. They are called constrained because there are constraints on x; they are called linear because the objective function and the constraints are linear.

We call a pair (x, f(x)) for which x satisfies Equation lin2 a feasible solution. Of course, not every feasible solution is optimal: a feasible solution is optimal iff there exists no other feasible solution for which f is greater (assuming we're maximizing). We call the vector subspace of feasible solutions S ⊂ Rⁿ.

Feasible solutions form a polytope

Key terms: flat, polytope, vertices, basic feasible solutions, level set.

Consider the effect of the constraints. Each of the equalities and inequalities defines a linear hyperplane in Rⁿ (i.e., a linear subspace of dimension n − 1), either as a boundary of S (inequality) or as a restriction of S to the hyperplane. When joined, these hyperplanes are the boundary of S (equalities restrict S to lower dimension). So we see that each of the boundaries of S is flat, which makes S a polytope (in R², a polygon). What makes this especially interesting is that polytopes have vertices where the hyperplanes intersect. Solutions at the vertices are called basic feasible solutions.

Only the vertices matter

Our objective function f is linear, so for some constant h, f(x) = h defines a level set that is itself a hyperplane H in Rⁿ. If this hyperplane intersects S at a point x, (x, f(x) = h) is the corresponding solution. There are three possibilities when H intersects S:

1. H ∩ S is a vertex of S,
2. H ∩ S is a boundary hyperplane of S, or
3. H ∩ S slices through the interior of S.

However, this third option implies that there exists a level set G corresponding to f(x) = g such that G intersects S and g > h, so solutions on H ∩ S are not optimal. (We have not proven this, but it may be clear from our progression.) We conclude that either the first or second case must be true for optimal solutions. And notice that in both cases a (potentially optimal) solution occurs at at least one vertex. The key insight, then, is that an optimal solution occurs at a vertex of S.


This means we don't need to search all of S, or even its boundary: we need only search the vertices. Helpful as this is, it restricts us down to (number of constraints choose n) potentially optimal solutions, usually still too many to search in a naïve way. In Lecture optsimplex, this is mitigated by introducing a powerful searching method.
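The vertex insight can still be demonstrated by brute force on a small problem; a sketch (ours) that enumerates the pairwise constraint intersections of the 2D program of Example optsimplex-1 and keeps the best feasible vertex:

```python
import numpy as np
from itertools import combinations

# all constraints of Example optsimplex-1 written as A x <= a
# (bounds and the flipped >= constraint included)
c = np.array([5.0, 2.0])
A = np.array([[4.0, 1.0], [1.0, 3.0], [8.0, 1.0],
              [1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]])
a = np.array([40.0, 35.0, 75.0, 10.0, 0.0, 15.0, 5.0])

best_x, best_f = None, -np.inf
for i, j in combinations(range(len(a)), 2):
    M = A[[i, j]]
    if abs(np.linalg.det(M)) < 1e-12:
        continue                       # parallel constraints: no vertex
    v = np.linalg.solve(M, a[[i, j]])  # candidate vertex
    if np.all(A @ v <= a + 1e-9):      # keep only feasible vertices
        if c @ v > best_f:
            best_x, best_f = v, c @ v

print(best_x, best_f)  # approximately [7.7273 9.0909] and 56.82
```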


Key term: simplex algorithm.

4. Another Python package, pulp (PuLP), is probably more popular for linear programming; however, we choose scipy.optimize because it has applications beyond linear programming.

optsimplex The simplex algorithm

The simplex algorithm (or "method") is an iterative technique for finding an optimal solution of the linear programming problem of Equations lin1 and lin2. The details of the algorithm are somewhat involved, but the basic idea is to start at a vertex of the feasible solution space S and traverse an edge of the polytope that leads to another vertex with a greater value of f. Then repeat this process until there is no neighboring vertex with a greater value of f, at which point the solution is guaranteed to be optimal.

Rather than present the details of the algorithm, we choose to show an example using Python. There have been some improvements on the original algorithm that have been implemented into many standard software packages, including Python's scipy package (Pauli Virtanen et al. "SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python". In: arXiv e-prints, arXiv:1907.10121 [July 2019]), module scipy.optimize⁴.

Example optsimplex-1: simplex method using scipy.optimize

Problem Statement

Maximize the objective function

    f(x) = c · x  (simplex1a)

for x ∈ R² and

    c = [5 2]ᵀ  (simplex1b)

subject to constraints

    0 ≤ x1 ≤ 10  (simplex2a)
    −5 ≤ x2 ≤ 15  (simplex2b)
    4x1 + x2 ≤ 40  (simplex2c)
    x1 + 3x2 ≤ 35  (simplex2d)
    −8x1 − x2 ≥ −75  (simplex2e)

Problem Solution

First, load some Python packages:

    from scipy.optimize import linprog
    import numpy as np
    import matplotlib.pyplot as plt
    from IPython.display import display, Markdown, Latex

Encoding the problem

Before we can use linprog, we must first encode Equations simplex1 and simplex2 into a form linprog will recognize. We begin with f, which we can write as c · x with the coefficients of c as follows:

    c = [-5, -2]  # negative to find max

We've negated each constant because linprog minimizes f and we want to maximize f. Now let's encode the inequality constraints. We will write the left-hand-side coefficients in the matrix A and the right-hand-side values in vector a such that

    Ax ≤ a  (simplex3)

Notice that one of our constraint inequalities is ≥ instead of ≤. We can flip this by multiplying the inequality by −1. We use simple lists to encode A and a:

    A = [
        [4, 1],
        [1, 3],
        [8, 1],
    ]
    a = [40, 35, 75]

Now we need to define the lower l and upper u bounds of x. The function linprog expects these to be in a single list of lower- and upper-bounds of each x_i:

    lu = [
        (0, 10),
        (-5, 15),
    ]

We want to keep track of each step linprog takes. We can access these by defining a function callback to be passed to linprog:

    x = []  # for storing the steps
    def callback(res):  # called at each step
        global x
        print(f'nit = {res.nit} x_k = {res.x}')
        x.append(res.x.copy())  # store

Now we need to call linprog. We don't have any equality constraints, so we need only use the keyword arguments A_ub=A and b_ub=a. For demonstration purposes, we tell it to use the simplex method, which is not as good as its other methods, which use better algorithms based on the simplex.

    res = linprog(
        c, A_ub=A, b_ub=a, bounds=lu,
        method='simplex', callback=callback
    )
    x = np.array(x)

    nit = 0 x_k = [ 0. -5.]
    nit = 1 x_k = [10. -5.]
    nit = 2 x_k = [8.75 5.  ]
    nit = 3 x_k = [7.72727273 9.09090909]
    nit = 4 x_k = [7.72727273 9.09090909]
    nit = 5 x_k = [7.72727273 9.09090909]
    nit = 5 x_k = [7.72727273 9.09090909]

So the optimal solution (x, f(x)) is as follows:

    print(f'optimum x: {res.x}')
    print(f'optimum f(x): {-res.fun}')

    optimum x: [7.72727273 9.09090909]
    optimum f(x): 56.81818181818182

The last point was repeated:

1. once because there was no adjacent vertex with greater f(x), and
2. twice because the algorithm calls callback twice on the last step.

Plotting

When the solution space is in R², it is helpful to graphically represent the solution space, constraints, and the progression of the algorithm. We begin by defining the inequality lines from A and a over the bounds of x1:

    n = len(c)  # number of variables x
    m = np.shape(A)[0]  # number of inequality constraints
    x2 = np.empty([m, 2])
    for i in range(0, m):
        x2[i] = -A[i][0]/A[i][1]*np.array(lu[0]) + a[i]/A[i][1]

Now we plot a contour plot of f over the bounds of x1 and x2 and overlay the inequality constraints and the steps of the algorithm stored in x:

    lu_array = np.array(lu)
    fig, ax = plt.subplots()
    plt.rcParams['lines.linewidth'] = 3
    # contour plot
    X1 = np.linspace(*lu_array[0], 100)
    X2 = np.linspace(*lu_array[1], 100)
    X1, X2 = np.meshgrid(X1, X2)
    F2 = -c[0]*X1 + -c[1]*X2  # negative because max hack
    con = ax.contourf(X1, X2, F2)
    cbar = fig.colorbar(con, ax=ax)
    cbar.ax.set_ylabel('objective function')
    # bounds on x
    un = np.array([1, 1])
    opts = {'c': 'w', 'label': None, 'linewidth': 6}
    plt.plot(lu_array[0], lu_array[1, 0]*un, **opts)
    plt.plot(lu_array[0], lu_array[1, 1]*un, **opts)
    plt.plot(lu_array[0, 0]*un, lu_array[1], **opts)
    plt.plot(lu_array[0, 1]*un, lu_array[1], **opts)
    # inequality constraints
    for i in range(0, m):
        p = plt.plot(lu[0], x2[i], c='w')
    p[0].set_label('constraint')
    # steps
    plt.plot(
        x[:, 0], x[:, 1], '-o', c='r', clip_on=False,
        zorder=20, label='simplex'
    )
    plt.ylim(lu_array[1])
    plt.xlabel('$x_1$')
    plt.ylabel('$x_2$')
    plt.legend()
    plt.show()


(Figure: contour plot of the objective function over $x_1$ and $x_2$, with the constraint lines and the simplex algorithm's steps overlaid.)

Python code in this section was generated from a Jupyter notebook named `simplex_linear_programming.ipynb` with a `python3` kernel.


5 Barzilai and Borwein Two-Point Step Size Gradient Methods

optex Exercises for Chapter 08

Exercise 081 (cummerbund)

Consider the functions (a) $f_1: \mathbb{R}^2 \rightarrow \mathbb{R}$ and (b) $f_2: \mathbb{R}^2 \rightarrow \mathbb{R}$ defined as

$$f_1(x) = 4(x_1 - 16)^2 + (x_2 + 64)^2 + x_1 \sin^2 x_1 \quad \text{(ex.1)}$$

$$f_2(x) = \frac{1}{2} x \cdot A x - b \cdot x \quad \text{(ex.2)}$$

where

$$A = \begin{bmatrix} 5 & 0 \\ 0 & 15 \end{bmatrix} \quad \text{and} \quad \text{(ex.3a)}$$

$$b = \begin{bmatrix} -2 & 1 \end{bmatrix}^\top. \quad \text{(ex.3b)}$$

Use the method of Barzilai and Borwein5

starting at some x0 to find a minimum of each

function
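As a hedged sketch (one of many reasonable implementations, not a prescribed solution), the Barzilai–Borwein two-point step size can be coded in a few lines for f2, whose gradient is Ax − b; the starting point x0, the small first step, and the iteration count are arbitrary choices:

```python
import numpy as np

# f2(x) = (1/2) x·Ax − b·x with the A and b of the exercise.
A = np.array([[5.0, 0.0], [0.0, 15.0]])
b = np.array([-2.0, 1.0])

def grad(x):
    return A @ x - b  # gradient of f2

x = np.array([1.0, 1.0])  # arbitrary starting point x0
g = grad(x)
alpha = 1e-3              # small first step before BB steps take over
for _ in range(50):
    x_new = x - alpha * g
    g_new = grad(x_new)
    if np.linalg.norm(g_new) < 1e-12:
        x, g = x_new, g_new
        break
    s, y = x_new - x, g_new - g
    alpha = (s @ s) / (s @ y)  # Barzilai–Borwein two-point step size
    x, g = x_new, g_new

print(x)  # approaches the minimizer A^{-1} b = [-0.4, 0.0666...]
```

For this strictly convex quadratic, the iterates converge to the unique minimizer, the solution of Ax = b.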

Exercise 082 (melty)

Maximize the objective function

$$f(x) = c \cdot x \quad \text{(ex.4a)}$$

for $x \in \mathbb{R}^3$ and

$$c = \begin{bmatrix} 3 & -8 & 1 \end{bmatrix}^\top \quad \text{(ex.4b)}$$

subject to constraints

$$0 \le x_1 \le 20 \quad \text{(ex.5a)}$$
$$-5 \le x_2 \le 0 \quad \text{(ex.5b)}$$
$$5 \le x_3 \le 17 \quad \text{(ex.5c)}$$
$$x_1 + 4 x_2 \le 50 \quad \text{(ex.5d)}$$
$$2 x_1 + x_3 \le 43 \quad \text{(ex.5e)}$$
$$-4 x_1 + x_2 - 5 x_3 \ge -99 \quad \text{(ex.5f)}$$
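One way to pose this exercise for a numerical solver is a sketch with scipy.optimize.linprog, which minimizes, so we pass −c; the ≥ constraint is negated into ≤ form:

```python
import numpy as np
from scipy.optimize import linprog

c = np.array([3, -8, 1])
A_ub = np.array([
    [1, 4, 0],   # x1 + 4 x2 <= 50
    [2, 0, 1],   # 2 x1 + x3 <= 43
    [4, -1, 5],  # negation of -4 x1 + x2 - 5 x3 >= -99
])
b_ub = np.array([50, 43, 99])
bounds = [(0, 20), (-5, 0), (5, 17)]
res = linprog(-c, A_ub=A_ub, b_ub=b_ub, bounds=bounds)
print(res.x, -res.fun)  # maximum near f = 96.75 at x = (17.25, -5, 5)
```

At the optimum, x2 sits at its lower bound (its objective coefficient is negative) and the third inequality is active.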

1. As is customary, we frequently say "system" when we mean "mathematical system model." Recall that multiple models may be used for any given physical system, depending on what one wants to know.

nlin

Nonlinear analysis

1 The ubiquity of near-linear systems and

the tools we have for analyses thereof can

sometimes give the impression that nonlinear

systems are exotic or even downright

flamboyant However a great many systems1

important for a mechanical engineer are

frequently hopelessly nonlinear. Here are some examples of such systems:

bull A robot arm

bull Viscous fluid flow (usually modelled by

the navier-stokes equations)


bull Anything that ldquofills uprdquo or ldquosaturatesrdquo

bull Nonlinear optics

bull Einsteinrsquos field equations (gravitation in

general relativity)

bull Heat radiation and nonlinear heat

conduction

bull Fracture mechanics



2. S.H. Strogatz and M. Dichter. Nonlinear Dynamics and Chaos. Second. Studies in Nonlinearity. Avalon Publishing, 2016. isbn: 9780813350844.

3. W. Richard Kolk and Robert A. Lerman. Nonlinear System Dynamics. 1st ed. Springer US, 1993. isbn: 978-1-4684-6496-2.

2. Lest we think this is merely an inconvenience, we should keep in mind that it is actually the nonlinearity that makes many phenomena useful. For instance, the laser depends on the nonlinearity of its optics. Similarly, transistors and the digital circuits made thereby (including the microprocessor) wouldn't function if their physics were linear.

3. In this chapter, we will see some ways to formulate, characterize, and simulate nonlinear systems. Purely analytical results are few for nonlinear systems. Most are beyond the scope of this text, but we describe a few, mostly in Lecture nlin.char. Simulation via numerical integration of nonlinear dynamical equations is the most accessible technique, so it is introduced.

4. We skip a discussion of linearization; of course, if this is an option, it is preferable. Instead, we focus on the analysis of systems as properly nonlinear.

5. For a good introduction to nonlinear dynamics, see Strogatz and Dichter.² A more engineer-oriented introduction is Kolk and Lerman.³


nonlinear state-space models

autonomous system

nonautonomous system

4 Strogatz and Dichter Nonlinear Dynamics and Chaos

equilibrium state

stationary point

nlinss Nonlinear state-space models

1. A state-space model has the general form

$$\frac{dx}{dt} = f(x, u, t) \quad \text{(ss.1a)}$$
$$y = g(x, u, t) \quad \text{(ss.1b)}$$

where f and g are vector-valued functions that depend on the system. Nonlinear state-space models are those for which f is a nonlinear function of either x or u. For instance, a state variable x1 might appear as x1², or two state variables might combine as x1x2, or an input u1 might enter the equations as log u1.

Autonomous and nonautonomous systems

2. An autonomous system is one for which dx/dt = f(x), with neither time nor input appearing explicitly. A nonautonomous system is one for which either t or u does appear explicitly in f. It turns out that we can always write nonautonomous systems as autonomous by substituting in u(t) and introducing an extra state variable for t.⁴

3. Therefore, without loss of generality, we will focus on ways of analyzing autonomous systems.
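Concretely, this augmentation can be sketched as follows, where the extra state $x_{n+1} = t$ has the trivial dynamics $x_{n+1}' = 1$:

```latex
\frac{\mathrm{d}}{\mathrm{d}t}
\begin{bmatrix} x \\ x_{n+1} \end{bmatrix}
=
\begin{bmatrix} f\big(x, u(x_{n+1}), x_{n+1}\big) \\ 1 \end{bmatrix}
```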

Equilibrium

4. An equilibrium state (also called a stationary point) x̄ is one for which dx/dt = 0. In most cases, this occurs only when the input u is a constant ū and, for time-varying systems, at a given time t. For autonomous systems, equilibrium occurs when the following holds:

$$f(\bar{x}) = 0. \quad \text{(ss.2)}$$

This is a system of nonlinear algebraic equations, which can be challenging to solve for x̄. However, frequently several solutions (that is, equilibrium states) do exist.
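Since several equilibria can coexist, symbolic tools can help solve f(x̄) = 0. A minimal sketch using sympy (used elsewhere in these notes), with a hypothetical undamped-pendulum model x1′ = x2, x2′ = −sin x1, which is not an example from the text:

```python
import sympy as sp

# Equilibria satisfy f(xbar) = 0 for the pendulum-like system
# x1' = x2, x2' = -sin(x1).
x1, x2 = sp.symbols('x1 x2', real=True)
f = [x2, -sp.sin(x1)]
equilibria = sp.solve(f, [x1, x2])
print(equilibria)  # typically [(0, 0), (pi, 0)]: the down and up positions
```

Note that sympy returns only principal solutions; the full set here is (nπ, 0) for integer n.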


system order

nlinchar Nonlinear system

characteristics

1. Characterizing nonlinear systems can be challenging without the tools developed for linear system characterization. However, there are ways of characterizing nonlinear systems, and we'll here explore a few. See Equation ss.1a.

Those in-common with linear systems

2. As with linear systems, the system order is either the number of state variables required to describe the system or, equivalently, the highest-order derivative in a single scalar differential equation describing the system.

3. Similarly, nonlinear systems can have state variables that depend on time alone or those that also depend on space (or some other independent variable). The former lead to ordinary differential equations (ODEs) and the latter to partial differential equations (PDEs).

4 Equilibrium was already considered in

Lecture nlinss

Stability

5. In terms of system performance, perhaps no other criterion is as important as stability.

Lyapunov stability theory

5. William L. Brogan. Modern Control Theory. Third. Prentice Hall, 1991, Ch. 10.

6. A. Choukchou-Braham et al. Analysis and Control of Underactuated Mechanical Systems. SpringerLink Bücher. Springer International Publishing, 2013. isbn: 9783319026367, App. A.

Definition nlinchar.1: Stability

If x is perturbed from an equilibrium state x̄, the response x(t) can

1. asymptotically return to x̄ (asymptotically stable),

2. diverge from x̄ (unstable), or

3. remain perturbed or oscillate about x̄ with a constant amplitude (marginally stable).

Notice that this definition is actually local: stability in the neighborhood of one equilibrium may not be the same as in the neighborhood of another.

6. Other than nonlinear systems' lack of linear systems' eigenvalues, poles, and roots of the characteristic equation from which to compute it, the primary difference between the stability of linear and nonlinear systems is that nonlinear system stability is often difficult to establish. Using a linear system's eigenvalues, it is straightforward to establish stable, unstable, and marginally stable subspaces of state-space (via transforming to an eigenvector basis). For nonlinear systems, no such method exists.

However, we are not without tools to explore nonlinear system stability. One mathematical tool to consider is Lyapunov stability theory, which is beyond the scope of this course but has good treatments in⁵ and⁶.

Qualities of equilibria

7. Equilibria (i.e., stationary points) come in a variety of qualities. It is instructive to


Figure 09.1: plots of x′ versus x for Equation char.1, for (a) r < 0, (b) r = 0, and (c) r > 0.

attractor

sink

stable

consider the first-order differential equation in state variable x with r a real constant:

$$x' = r x - x^3. \quad \text{(char.1)}$$

If we plot x′ versus x for different values of r, we obtain the plots of Figure 09.1.

8. By definition, equilibria occur when x′ = 0, so the x-axis crossings of Figure 09.1 are equilibria. The blue arrows on the x-axis show the direction of state change x′ quantified by the plots. For both (a) and (b), only one equilibrium exists: x̄ = 0. Note that the blue arrows in both plots point toward the equilibrium. In such cases, that is, when a neighborhood exists around an equilibrium for which state changes point toward the equilibrium, the equilibrium is called an attractor or sink. Note that attractors are stable.

9. Now consider (c) of Figure 09.1. When r > 0, three equilibria emerge. This change of the number of equilibria with the changing of a parameter is called a bifurcation.


bifurcation diagram

repeller

source

unstable

saddle

A plot of bifurcations versus the parameter is called a bifurcation diagram. The x̄ = 0 equilibrium now has arrows that point away from it. Such an equilibrium is called a repeller or source and is unstable. The other two equilibria here are (stable) attractors. Consider a very small initial condition x(0) = ε. If ε > 0, the repeller pushes away x and the positive attractor pulls x to itself. Conversely, if ε < 0, the repeller again pushes away x and the negative attractor pulls x to itself.
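The classification above can be checked symbolically: a sketch (not from the text) that classifies the equilibria of x′ = rx − x³ at r = 1 by the sign of d(x′)/dx at each equilibrium:

```python
import sympy as sp

# x' = r x - x^3; an equilibrium is an attractor if d(x')/dx < 0 there,
# and a repeller if d(x')/dx > 0.
x, r = sp.symbols('x r', real=True)
xp = r*x - x**3
classification = {}
for xbar in sp.solve(xp.subs(r, 1), x):
    slope = sp.diff(xp, x).subs(r, 1).subs(x, xbar)
    classification[xbar] = 'attractor' if slope < 0 else 'repeller'
print(classification)  # 0 is a repeller; -1 and +1 are attractors
```

This matches the arrows of Figure 09.1(c): states flow away from x̄ = 0 and toward x̄ = ±√r.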

10. Another type of equilibrium is called the saddle, one which acts as an attractor along some lines and as a repeller along others. We will see this type in the following example.

Example nlinchar-1: Saddle bifurcation

Problem Statement

Consider the dynamical equation

$$x' = x^2 + r \quad \text{(char.2)}$$

with r a real constant. Sketch x′ vs. x for negative, zero, and positive r. Identify and classify each of the equilibria.


nlinsim Nonlinear system

simulation


nlinpysim Simulating nonlinear

systems in Python

Example nlinpysim-1: a nonlinear unicycle

Problem Statement

Problem Solution

First load some Python packages

import numpy as np
import sympy as sp
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

The state equation can be encoded via the

following function f

def f(t, x, u, c):
    dxdt = [
        x[3]*np.cos(x[2]),
        x[3]*np.sin(x[2]),
        x[4],
        1/c[0] * u(t)[0],
        1/c[1] * u(t)[1],
    ]
    return dxdt

The input function u must also be defined


def u(t):
    return [
        1.5*(1 + np.cos(t)),
        2.5*np.sin(3*t),
    ]

# Define time spans, initial values, and constants
tspan = np.linspace(0, 50, 300)
xinit = [0, 0, 0, 0, 0]
mass = 1.0
inertia = 1.0
c = [mass, inertia]

# Solve differential equation
sol = solve_ivp(
    lambda t, x: f(t, x, u, c),
    [tspan[0], tspan[-1]],
    xinit,
    t_eval=tspan,
)

Letrsquos first plot the trajectory and

instantaneous velocity

(Figure 09.2: the unicycle's trajectory in the x-y plane with instantaneous velocity arrows.)

xp = sol.y[3]*np.cos(sol.y[2])
yp = sol.y[3]*np.sin(sol.y[2])
p = plt.figure()
plt.plot(sol.y[0], sol.y[1])
plt.quiver(sol.y[0], sol.y[1], xp, yp)
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.show()

Python code in this section was generated from a Jupyter notebook named `horcrux.ipynb` with a `python3` kernel.


nlinex Exercises for Chapter 09

A

Distribution tables


A01 Gaussian distribution table

Below are plots of the Gaussian probability

density function f and cumulative distribution

function Φ Below them is Table A1 of CDF

values

(Figure A.1: (a) the Gaussian PDF f(z); (b) the Gaussian CDF Φ(z_b), plotted for z-scores.)

Table A.1: z-score table, Φ(z_b) = P(z ∈ (−∞, z_b])

zb .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
−3.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
−3.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
−3.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
−3.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
−3.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
−2.9 0.0019 0.0018 0.0018 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
−2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
−2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
−2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
−2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
−2.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
−2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
−2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
−2.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
−2.0 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
−1.9 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
−1.8 0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
−1.7 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
−1.6 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455

Table A.1 (continued): z-score table, Φ(z_b) = P(z ∈ (−∞, z_b])
zb .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
−1.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
−1.4 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
−1.3 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
−1.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
−1.1 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
−1.0 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
−0.9 0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
−0.8 0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
−0.7 0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
−0.6 0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
−0.5 0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
−0.4 0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121
−0.3 0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.3483
−0.2 0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859
−0.1 0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
−0.0 0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
3.1 0.9990 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993
3.2 0.9993 0.9993 0.9994 0.9994 0.9994 0.9994 0.9994 0.9995 0.9995 0.9995
3.3 0.9995 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
3.4 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998
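The tabulated values can be reproduced from the error function; a minimal sketch:

```python
from math import erf, sqrt

# Gaussian CDF: Phi(z) = (1 + erf(z / sqrt(2))) / 2, which reproduces
# the table's entries, e.g. Phi(1.96) = 0.9750 to four decimals.
def Phi(z):
    return (1 + erf(z / sqrt(2))) / 2

print(round(Phi(1.96), 4))  # 0.975
print(round(Phi(-1.0), 4))  # 0.1587
```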


A02 Student's t-distribution table

Table A.2: two-tail inverse Student's t-distribution table
percent probability
ν 60.0 66.7 75.0 80.0 87.5 90.0 95.0 97.5 99.0 99.5 99.9

1 0.325 0.577 1.000 1.376 2.414 3.078 6.314 12.706 31.821 63.657 318.31
2 0.289 0.500 0.816 1.061 1.604 1.886 2.920 4.303 6.965 9.925 22.327
3 0.277 0.476 0.765 0.978 1.423 1.638 2.353 3.182 4.541 5.841 10.215
4 0.271 0.464 0.741 0.941 1.344 1.533 2.132 2.776 3.747 4.604 7.173
5 0.267 0.457 0.727 0.920 1.301 1.476 2.015 2.571 3.365 4.032 5.893
6 0.265 0.453 0.718 0.906 1.273 1.440 1.943 2.447 3.143 3.707 5.208
7 0.263 0.449 0.711 0.896 1.254 1.415 1.895 2.365 2.998 3.499 4.785
8 0.262 0.447 0.706 0.889 1.240 1.397 1.860 2.306 2.896 3.355 4.501
9 0.261 0.445 0.703 0.883 1.230 1.383 1.833 2.262 2.821 3.250 4.297
10 0.260 0.444 0.700 0.879 1.221 1.372 1.812 2.228 2.764 3.169 4.144
11 0.260 0.443 0.697 0.876 1.214 1.363 1.796 2.201 2.718 3.106 4.025
12 0.259 0.442 0.695 0.873 1.209 1.356 1.782 2.179 2.681 3.055 3.930
13 0.259 0.441 0.694 0.870 1.204 1.350 1.771 2.160 2.650 3.012 3.852
14 0.258 0.440 0.692 0.868 1.200 1.345 1.761 2.145 2.624 2.977 3.787
15 0.258 0.439 0.691 0.866 1.197 1.341 1.753 2.131 2.602 2.947 3.733
16 0.258 0.439 0.690 0.865 1.194 1.337 1.746 2.120 2.583 2.921 3.686
17 0.257 0.438 0.689 0.863 1.191 1.333 1.740 2.110 2.567 2.898 3.646
18 0.257 0.438 0.688 0.862 1.189 1.330 1.734 2.101 2.552 2.878 3.610
19 0.257 0.438 0.688 0.861 1.187 1.328 1.729 2.093 2.539 2.861 3.579
20 0.257 0.437 0.687 0.860 1.185 1.325 1.725 2.086 2.528 2.845 3.552
21 0.257 0.437 0.686 0.859 1.183 1.323 1.721 2.080 2.518 2.831 3.527
22 0.256 0.437 0.686 0.858 1.182 1.321 1.717 2.074 2.508 2.819 3.505
23 0.256 0.436 0.685 0.858 1.180 1.319 1.714 2.069 2.500 2.807 3.485
24 0.256 0.436 0.685 0.857 1.179 1.318 1.711 2.064 2.492 2.797 3.467
25 0.256 0.436 0.684 0.856 1.178 1.316 1.708 2.060 2.485 2.787 3.450
26 0.256 0.436 0.684 0.856 1.177 1.315 1.706 2.056 2.479 2.779 3.435
27 0.256 0.435 0.684 0.855 1.176 1.314 1.703 2.052 2.473 2.771 3.421
28 0.256 0.435 0.683 0.855 1.175 1.313 1.701 2.048 2.467 2.763 3.408
29 0.256 0.435 0.683 0.854 1.174 1.311 1.699 2.045 2.462 2.756 3.396
30 0.256 0.435 0.683 0.854 1.173 1.310 1.697 2.042 2.457 2.750 3.385
35 0.255 0.434 0.682 0.852 1.170 1.306 1.690 2.030 2.438 2.724 3.340
40 0.255 0.434 0.681 0.851 1.167 1.303 1.684 2.021 2.423 2.704 3.307
45 0.255 0.434 0.680 0.850 1.165 1.301 1.679 2.014 2.412 2.690 3.281
50 0.255 0.433 0.679 0.849 1.164 1.299 1.676 2.009 2.403 2.678 3.261
55 0.255 0.433 0.679 0.848 1.163 1.297 1.673 2.004 2.396 2.668 3.245
60 0.254 0.433 0.679 0.848 1.162 1.296 1.671 2.000 2.390 2.660 3.232
∞ 0.253 0.431 0.674 0.842 1.150 1.282 1.645 1.960 2.326 2.576 3.090
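The entries can be reproduced with scipy's Student-t quantile function; as printed, each entry appears to be the quantile at the column's probability for ν degrees of freedom:

```python
from scipy import stats

# Reproduce entries of Table A.2: the quantile t with P(T <= t) = p
# for T Student-t distributed with nu degrees of freedom.
print(round(stats.t.ppf(0.975, 10), 3))  # 2.228, the nu = 10, 97.5 entry
print(round(stats.t.ppf(0.95, 10), 3))   # 1.812, the nu = 10, 95.0 entry
```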

B

Fourier and Laplace tables


B01 Laplace transforms

Table B.1 is a table with functions of time f(t) on the left and corresponding Laplace transforms F(s) on the right. Where applicable, s = σ + jω is the Laplace transform variable, T is the time-domain period, ω₀ = 2π/T is the corresponding angular frequency, j = √−1, a ∈ ℝ⁺, and b, t₀ ∈ ℝ are constants.

Table B.1: Laplace transform identities

function of time t ↔ function of Laplace s

a₁ f₁(t) + a₂ f₂(t) ↔ a₁ F₁(s) + a₂ F₂(s)
f(t − t₀) ↔ F(s) e^{−t₀ s}
f′(t) ↔ s F(s) − f(0)
dⁿf(t)/dtⁿ ↔ sⁿ F(s) − s^{n−1} f(0) − s^{n−2} f′(0) − ⋯ − f^{(n−1)}(0)
∫₀ᵗ f(τ) dτ ↔ (1/s) F(s)
t f(t) ↔ −F′(s)
f₁(t) ∗ f₂(t) = ∫_{−∞}^{∞} f₁(τ) f₂(t − τ) dτ ↔ F₁(s) F₂(s)
δ(t) ↔ 1
u_s(t) ↔ 1/s
u_r(t) ↔ 1/s²
t^{n−1}/(n − 1)! ↔ 1/sⁿ
e^{−at} ↔ 1/(s + a)
t e^{−at} ↔ 1/(s + a)²
t^{n−1} e^{−at}/(n − 1)! ↔ 1/(s + a)ⁿ
(e^{at} − e^{bt})/(a − b) ↔ 1/((s − a)(s − b))  (a ≠ b)
(a e^{at} − b e^{bt})/(a − b) ↔ s/((s − a)(s − b))  (a ≠ b)
sin ωt ↔ ω/(s² + ω²)
cos ωt ↔ s/(s² + ω²)
e^{at} sin ωt ↔ ω/((s − a)² + ω²)
e^{at} cos ωt ↔ (s − a)/((s − a)² + ω²)
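Rows of the table can be spot-checked symbolically; a sketch with sympy:

```python
import sympy as sp

# Verify one row of Table B.1: L{e^{-a t}} = 1/(s + a) for a > 0.
t, s, a = sp.symbols('t s a', positive=True)
F = sp.laplace_transform(sp.exp(-a*t), t, s, noconds=True)
print(F)  # 1/(a + s)
```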


B02 Fourier transforms

Table B.2 is a table with functions of time f(t) on the left and corresponding Fourier transforms F(ω) on the right. Where applicable, T is the time-domain period, ω₀ = 2π/T is the corresponding angular frequency, j = √−1, a ∈ ℝ⁺, and b, t₀ ∈ ℝ are constants. Furthermore, f_e and f_o are even and odd functions of time, respectively, and it can be shown that any function f can be written as the sum f(t) = f_e(t) + f_o(t) (Hsu 1967).

Table B.2: Fourier transform identities

function of time t ↔ function of frequency ω

a₁ f₁(t) + a₂ f₂(t) ↔ a₁ F₁(ω) + a₂ F₂(ω)
f(at) ↔ (1/|a|) F(ω/a)
f(−t) ↔ F(−ω)
f(t − t₀) ↔ F(ω) e^{−jωt₀}
f(t) cos ω₀t ↔ (1/2) F(ω − ω₀) + (1/2) F(ω + ω₀)
f(t) sin ω₀t ↔ (1/(j2)) F(ω − ω₀) − (1/(j2)) F(ω + ω₀)
f_e(t) ↔ Re F(ω)
f_o(t) ↔ j Im F(ω)
F(t) ↔ 2π f(−ω)
f′(t) ↔ jω F(ω)
dⁿf(t)/dtⁿ ↔ (jω)ⁿ F(ω)
∫_{−∞}^{t} f(τ) dτ ↔ (1/(jω)) F(ω) + π F(0) δ(ω)
−jt f(t) ↔ F′(ω)
(−jt)ⁿ f(t) ↔ dⁿF(ω)/dωⁿ
f₁(t) ∗ f₂(t) = ∫_{−∞}^{∞} f₁(τ) f₂(t − τ) dτ ↔ F₁(ω) F₂(ω)
f₁(t) f₂(t) ↔ (1/(2π)) F₁(ω) ∗ F₂(ω) = (1/(2π)) ∫_{−∞}^{∞} F₁(α) F₂(ω − α) dα
e^{−at} u_s(t) ↔ 1/(jω + a)
e^{−a|t|} ↔ 2a/(a² + ω²)
e^{−at²} ↔ √(π/a) e^{−ω²/(4a)}
1 for |t| < a/2, else 0 ↔ a sin(aω/2)/(aω/2)
t e^{−at} u_s(t) ↔ 1/(a + jω)²
t^{n−1} e^{−at} u_s(t)/(n − 1)! ↔ 1/(a + jω)ⁿ
1/(a² + t²) ↔ (π/a) e^{−a|ω|}
δ(t) ↔ 1
δ(t − t₀) ↔ e^{−jωt₀}
u_s(t) ↔ π δ(ω) + 1/(jω)
u_s(t − t₀) ↔ π δ(ω) + (1/(jω)) e^{−jωt₀}
1 ↔ 2π δ(ω)
t ↔ 2πj δ′(ω)
tⁿ ↔ 2π jⁿ dⁿδ(ω)/dωⁿ
e^{jω₀t} ↔ 2π δ(ω − ω₀)
cos ω₀t ↔ π δ(ω − ω₀) + π δ(ω + ω₀)
sin ω₀t ↔ −jπ δ(ω − ω₀) + jπ δ(ω + ω₀)
u_s(t) cos ω₀t ↔ jω/(ω₀² − ω²) + (π/2) δ(ω − ω₀) + (π/2) δ(ω + ω₀)
u_s(t) sin ω₀t ↔ ω₀/(ω₀² − ω²) + (π/(2j)) δ(ω − ω₀) − (π/(2j)) δ(ω + ω₀)
t u_s(t) ↔ jπ δ′(ω) − 1/ω²
1/t ↔ πj − 2πj u_s(ω)
1/tⁿ ↔ ((−jω)^{n−1}/(n − 1)!) (πj − 2πj u_s(ω))
sgn t ↔ 2/(jω)
Σ_{n=−∞}^{∞} δ(t − nT) ↔ ω₀ Σ_{n=−∞}^{∞} δ(ω − nω₀)

C

Mathematics reference


C01 Quadratic forms

The solution to equations of the form $ax^2 + bx + c = 0$ is

$$x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}. \quad \text{(C01.1)}$$

Completing the square

This is accomplished by re-writing the quadratic formula in the form of the left-hand side (LHS) of this equality, which describes factorization:

$$x^2 + 2xh + h^2 = (x + h)^2. \quad \text{(C01.2)}$$
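A quick symbolic check of Equation C01.1 with sympy:

```python
import sympy as sp

# Solve a x^2 + b x + c = 0 symbolically (sympy assumes a != 0 here)
# and compare with the quadratic formula.
x, a, b, c = sp.symbols('x a b c')
sols = sp.solve(a*x**2 + b*x + c, x)
print(sols)  # the two roots (-b ± sqrt(b^2 - 4 a c))/(2 a), in some order
```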


C02 Trigonometry

Triangle identities

With reference to the below figure, the law of sines is

$$\frac{\sin\alpha}{a} = \frac{\sin\beta}{b} = \frac{\sin\gamma}{c} \quad \text{(C02.1)}$$

and the law of cosines is

$$c^2 = a^2 + b^2 - 2ab\cos\gamma \quad \text{(C02.2a)}$$
$$b^2 = a^2 + c^2 - 2ac\cos\beta \quad \text{(C02.2b)}$$
$$a^2 = c^2 + b^2 - 2cb\cos\alpha \quad \text{(C02.2c)}$$

(Figure: a triangle with sides a, b, c opposite angles α, β, γ.)

Reciprocal identities

$$\csc u = \frac{1}{\sin u} \quad \text{(C02.3a)}$$
$$\sec u = \frac{1}{\cos u} \quad \text{(C02.3b)}$$
$$\cot u = \frac{1}{\tan u} \quad \text{(C02.3c)}$$

Pythagorean identities

$$1 = \sin^2 u + \cos^2 u \quad \text{(C02.4a)}$$
$$\sec^2 u = 1 + \tan^2 u \quad \text{(C02.4b)}$$
$$\csc^2 u = 1 + \cot^2 u \quad \text{(C02.4c)}$$


Co-function identities

$$\sin\left(\frac{\pi}{2} - u\right) = \cos u \quad \text{(C02.5a)}$$
$$\cos\left(\frac{\pi}{2} - u\right) = \sin u \quad \text{(C02.5b)}$$
$$\tan\left(\frac{\pi}{2} - u\right) = \cot u \quad \text{(C02.5c)}$$
$$\csc\left(\frac{\pi}{2} - u\right) = \sec u \quad \text{(C02.5d)}$$
$$\sec\left(\frac{\pi}{2} - u\right) = \csc u \quad \text{(C02.5e)}$$
$$\cot\left(\frac{\pi}{2} - u\right) = \tan u \quad \text{(C02.5f)}$$

Even-odd identities

$$\sin(-u) = -\sin u \quad \text{(C02.6a)}$$
$$\cos(-u) = \cos u \quad \text{(C02.6b)}$$
$$\tan(-u) = -\tan u \quad \text{(C02.6c)}$$

Sum-difference formulas (AM or lock-in)

$$\sin(u \pm v) = \sin u \cos v \pm \cos u \sin v \quad \text{(C02.7a)}$$
$$\cos(u \pm v) = \cos u \cos v \mp \sin u \sin v \quad \text{(C02.7b)}$$
$$\tan(u \pm v) = \frac{\tan u \pm \tan v}{1 \mp \tan u \tan v} \quad \text{(C02.7c)}$$

Double angle formulas

$$\sin(2u) = 2\sin u \cos u \quad \text{(C02.8a)}$$
$$\cos(2u) = \cos^2 u - \sin^2 u \quad \text{(C02.8b)}$$
$$= 2\cos^2 u - 1 \quad \text{(C02.8c)}$$
$$= 1 - 2\sin^2 u \quad \text{(C02.8d)}$$
$$\tan(2u) = \frac{2\tan u}{1 - \tan^2 u} \quad \text{(C02.8e)}$$

Power-reducing or half-angle formulas

$$\sin^2 u = \frac{1 - \cos(2u)}{2} \quad \text{(C02.9a)}$$
$$\cos^2 u = \frac{1 + \cos(2u)}{2} \quad \text{(C02.9b)}$$
$$\tan^2 u = \frac{1 - \cos(2u)}{1 + \cos(2u)} \quad \text{(C02.9c)}$$


Sum-to-product formulas

$$\sin u + \sin v = 2\sin\frac{u+v}{2}\cos\frac{u-v}{2} \quad \text{(C02.10a)}$$
$$\sin u - \sin v = 2\cos\frac{u+v}{2}\sin\frac{u-v}{2} \quad \text{(C02.10b)}$$
$$\cos u + \cos v = 2\cos\frac{u+v}{2}\cos\frac{u-v}{2} \quad \text{(C02.10c)}$$
$$\cos u - \cos v = -2\sin\frac{u+v}{2}\sin\frac{u-v}{2} \quad \text{(C02.10d)}$$

Product-to-sum formulas

$$\sin u \sin v = \frac{1}{2}\left[\cos(u-v) - \cos(u+v)\right] \quad \text{(C02.11a)}$$
$$\cos u \cos v = \frac{1}{2}\left[\cos(u-v) + \cos(u+v)\right] \quad \text{(C02.11b)}$$
$$\sin u \cos v = \frac{1}{2}\left[\sin(u+v) + \sin(u-v)\right] \quad \text{(C02.11c)}$$
$$\cos u \sin v = \frac{1}{2}\left[\sin(u+v) - \sin(u-v)\right] \quad \text{(C02.11d)}$$

Two-to-one formulas

$$A\sin u + B\cos u = C\sin(u + \varphi) \quad \text{(C02.12a)}$$
$$= C\cos(u + \psi), \text{ where} \quad \text{(C02.12b)}$$
$$C = \sqrt{A^2 + B^2} \quad \text{(C02.12c)}$$
$$\varphi = \arctan\frac{B}{A} \quad \text{(C02.12d)}$$
$$\psi = -\arctan\frac{A}{B} \quad \text{(C02.12e)}$$


adjoint

C03 Matrix inverses

This is a guide to inverting 1×1, 2×2, and n×n matrices.

Let A be the 1×1 matrix $A = \begin{bmatrix} a \end{bmatrix}$. The inverse is simply the reciprocal:

$$A^{-1} = \begin{bmatrix} 1/a \end{bmatrix}.$$

Let B be the 2×2 matrix

$$B = \begin{bmatrix} b_{11} & b_{12} \\ b_{21} & b_{22} \end{bmatrix}.$$

It can be shown that the inverse follows a simple pattern:

$$B^{-1} = \frac{1}{\det B}\begin{bmatrix} b_{22} & -b_{12} \\ -b_{21} & b_{11} \end{bmatrix} = \frac{1}{b_{11}b_{22} - b_{12}b_{21}}\begin{bmatrix} b_{22} & -b_{12} \\ -b_{21} & b_{11} \end{bmatrix}.$$

Let C be an n×n matrix. It can be shown that its inverse is

$$C^{-1} = \frac{1}{\det C}\,\operatorname{adj} C,$$

where adj C is the adjoint of C.
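A numeric spot-check of the 2×2 pattern against numpy, with an arbitrary invertible B:

```python
import numpy as np

# Check B^{-1} = (1/det B) [[b22, -b12], [-b21, b11]] numerically.
B = np.array([[1.0, 2.0], [3.0, 4.0]])
det = B[0, 0]*B[1, 1] - B[0, 1]*B[1, 0]
Binv = (1/det) * np.array([[B[1, 1], -B[0, 1]],
                           [-B[1, 0], B[0, 0]]])
print(np.allclose(Binv, np.linalg.inv(B)))  # True
```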


C04 Laplace transforms

The definitions of the one-sided Laplace and inverse Laplace transforms follow.

Definition C04.1: Laplace transforms (one-sided)

Laplace transform $\mathcal{L}$:

$$\mathcal{L}(y(t)) = Y(s) = \int_0^\infty y(t)\, e^{-st}\, dt \quad \text{(C04.1)}$$

Inverse Laplace transform $\mathcal{L}^{-1}$:

$$\mathcal{L}^{-1}(Y(s)) = y(t) = \frac{1}{2\pi j}\int_{\sigma - j\infty}^{\sigma + j\infty} Y(s)\, e^{st}\, ds \quad \text{(C04.2)}$$

See Table B.1 for a list of properties and common transforms.

D

Complex analysis


D01 Euler's formulas

Euler's formula is our bridge back and forth between trigonometric forms (cos θ and sin θ) and complex exponential form (e^{jθ}):

$$e^{j\theta} = \cos\theta + j\sin\theta. \quad \text{(D01.1)}$$

Here are a few useful identities implied by Euler's formula:

$$e^{-j\theta} = \cos\theta - j\sin\theta \quad \text{(D01.2a)}$$
$$\cos\theta = \operatorname{Re}(e^{j\theta}) \quad \text{(D01.2b)}$$
$$= \frac{1}{2}\left(e^{j\theta} + e^{-j\theta}\right) \quad \text{(D01.2c)}$$
$$\sin\theta = \operatorname{Im}(e^{j\theta}) \quad \text{(D01.2d)}$$
$$= \frac{1}{j2}\left(e^{j\theta} - e^{-j\theta}\right) \quad \text{(D01.2e)}$$
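A quick numeric spot-check of Euler's formula:

```python
import cmath
import math

# Compare e^{j theta} with cos(theta) + j sin(theta) at an arbitrary angle.
theta = 0.7
lhs = cmath.exp(1j*theta)
rhs = complex(math.cos(theta), math.sin(theta))
print(abs(lhs - rhs) < 1e-12)  # True
```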

Bibliography

[1] Robert B Ash Basic Probability Theory

Dover Publications Inc 2008

[2] Joan Bagaria Set Theory In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2019

Metaphysics Research Lab Stanford

University 2019

[3] Maria Baghramian and J Adam

Carter Relativism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Winter 2019

Metaphysics Research Lab Stanford

University 2019

[4] Stephen Barker and Mark Jago Being

Positive About Negative Facts In

Philosophy and Phenomenological

Research 85.1 (2012), pp. 117–138. doi: 10.1111/j.1933-1592.2010.00479.x.

[5] Jonathan Barzilai and Jonathan M

Borwein Two-Point Step Size Gradient

Methods In IMA Journal of Numerical

Analysis 8.1 (Jan. 1988), pp. 141–148. issn: 0272-4979. doi: 10.1093/imanum/8.1.141.

This includes an innovative line search

method



[6] Anat Biletzki and Anat Matar Ludwig

Wittgenstein In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Summer 2018

Metaphysics Research Lab Stanford

University 2018 An introduction to

Wittgenstein and his thought

[7] Antonio Bove F (Ferruccio) Colombini

and Daniele Del Santo Phase space

analysis of partial differential

equations eng Progress in nonlinear

differential equations and their

applications, v. 69. Boston; Berlin: Birkhäuser, 2006. isbn: 9780817645212.

[8] William L Brogan Modern Control Theory

Third Prentice Hall 1991

[9] Francesco Bullo and Andrew D Lewis

Geometric control of mechanical systems

modeling analysis and design for simple

mechanical control systems Ed by JE

Marsden L Sirovich and M Golubitsky

Springer 2005

[10] Francesco Bullo and Andrew D Lewis

Supplementary Chapters for Geometric

Control of Mechanical Systems1 Jan

2005

[11] A Choukchou-Braham et al Analysis and

Control of Underactuated Mechanical

Systems. SpringerLink Bücher. Springer International Publishing, 2013. isbn: 9783319026367.

[12] K Ciesielski Set Theory for the

Working Mathematician London

Mathematical Society Student Texts

Cambridge University Press 1997 isbn

9780521594653 A readable introduction

to set theory


[13] Marian David The Correspondence

Theory of Truth In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Fall 2016 Metaphysics

Research Lab Stanford University

2016 A detailed overview of the

correspondence theory of truth

[14] David Dolby Wittgenstein on Truth

In A Companion to Wittgenstein John

Wiley amp Sons Ltd 2016 Chap 27 pp 433ndash

442. isbn: 9781118884607. doi: 10.1002/9781118884607.ch27.

[15] Peter J Eccles An Introduction to

Mathematical Reasoning Numbers

Sets and Functions Cambridge

University Press, 1997. doi: 10.1017/CBO9780511801136. A gentle

introduction to mathematical reasoning

It includes introductory treatments of

set theory and number systems

[16] HB Enderton Elements of Set

Theory Elsevier Science 1977 isbn

9780080570426 A gentle introduction to

set theory and mathematical reasoningmdasha

great place to start

[17] Richard P Feynman Robert B Leighton

and Matthew Sands The Feynman

Lectures on Physics New Millennium

Perseus Basic Books 2010

[18] Michael Glanzberg Truth In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2018

Metaphysics Research Lab Stanford

University 2018

[19] Hans Johann Glock Truth in the

Tractatus In Synthese 1482 (Jan

2006), pp. 345–368. issn: 1573-0964. doi: 10.1007/s11229-004-6226-2.


[20] Mario Gómez-Torrente. Alfred Tarski.

In The Stanford Encyclopedia of

Philosophy Ed by Edward N Zalta

Spring 2019 Metaphysics Research Lab

Stanford University 2019

[21] Paul Guyer and Rolf-Peter Horstmann

Idealism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Winter 2018

Metaphysics Research Lab Stanford

University 2018

[22] R Haberman Applied Partial Differential

Equations with Fourier Series and

Boundary Value Problems (Classic

Version) Pearson Modern Classics for

Advanced Mathematics Pearson Education

Canada 2018 isbn 9780134995434

[23] GWF Hegel and AV Miller

Phenomenology of Spirit Motilal

Banarsidass 1998 isbn 9788120814738

[24] Wilfrid Hodges Model Theory In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2018

Metaphysics Research Lab Stanford

University 2018

[25] Wilfrid Hodges. Tarski's Truth

Definitions In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Fall 2018 Metaphysics

Research Lab Stanford University 2018

[26] Peter Hylton and Gary Kemp Willard

Van Orman Quine In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Spring 2019 Metaphysics

Research Lab Stanford University 2019

[27] ET Jaynes et al Probability Theory The

Logic of Science Cambridge University

Press 2003 isbn 9780521592710


An excellent and comprehensive

introduction to probability theory

[28] I Kant P Guyer and AW Wood Critique

of Pure Reason The Cambridge Edition

of the Works of Immanuel Kant

Cambridge University Press 1999 isbn

9781107268333

[29] Juliette Kennedy. Kurt Gödel. In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Winter 2018

Metaphysics Research Lab Stanford

University 2018

[30] Drew Khlentzos Challenges to

Metaphysical Realism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Winter 2016

Metaphysics Research Lab Stanford

University 2016

[31] Peter Klein Skepticism In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Summer 2015

Metaphysics Research Lab Stanford

University 2015

[32] M Kline Mathematics The Loss

of Certainty A Galaxy book

Oxford University Press 1982 isbn

9780195030853 A detailed account

of the ldquoillogicalrdquo development of

mathematics and an exposition of

its therefore remarkable utility in

describing the world

[33] W Richard Kolk and Robert A Lerman

Nonlinear System Dynamics 1st ed

Springer US 1993 isbn 978-1-4684shy6496shy2

[34] Erwin Kreyszig Advanced Engineering

Mathematics 10th John Wiley amp Sons

Limited 2011 isbn 9781119571094 The

authoritative resource for engineering

mathematics It includes detailed

D Complex analysis Bibliography p 7

accounts of probability statistics

vector calculus linear algebra

fourier analysis ordinary and partial

differential equations and complex

analysis It also includes several other

topics with varying degrees of depth

Overall it is the best place to start

when seeking mathematical guidance

[35] John M Lee Introduction to Smooth

Manifolds second Vol 218 Graduate

Texts in Mathematics Springer 2012

[36] Catherine Legg and Christopher

Hookway Pragmatism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Spring 2019 Metaphysics

Research Lab Stanford University

2019 An introductory article on the

philsophical movement ldquopragmatismrdquo It

includes an important clarification of

the pragmatic slogan ldquotruth is the end of

inquiryrdquo

[37] Daniel Liberzon Calculus of Variations

and Optimal Control Theory A Concise

Introduction Princeton University Press

2012 isbn 9780691151878

[38] Panu Raatikainen Goumldelrsquos Incompleteness

Theorems In The Stanford Encyclopedia

of Philosophy Ed by Edward N Zalta

Fall 2018 Metaphysics Research Lab

Stanford University 2018 A through

and contemporary description of

Goumldelrsquos incompleteness theorems which

have significant implications for the

foundations and function of mathematics

and mathematical truth

[39] Paul Redding Georg Wilhelm Friedrich

Hegel In The Stanford Encyclopedia

of Philosophy Ed by Edward N Zalta

Summer 2018 Metaphysics Research Lab

Stanford University 2018

D Complex analysis Bibliography p 8

[40] HM Schey Div Grad Curl and All that An

Informal Text on Vector Calculus WW

Norton 2005 isbn 9780393925166

[41] Christopher Shields Aristotle In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Winter 2016

Metaphysics Research Lab Stanford

University 2016

[42] Steven S Skiena Calculated Bets

Computers Gambling and Mathematical

Modeling to Win Outlooks Cambridge

University Press 2001 doi 10 1017

CBO9780511547089 This includes a lucid

section on probability versus statistics

also available here https www3

csstonybrookedu~skienajaialai

excerptsnode12html

[43] George Smith Isaac Newton In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2008

Metaphysics Research Lab Stanford

University 2008

[44] Daniel Stoljar and Nic Damnjanovic

The Deflationary Theory of Truth

In The Stanford Encyclopedia of

Philosophy Ed by Edward N Zalta Fall

2014 Metaphysics Research Lab Stanford

University 2014

[45] WA Strauss Partial Differential

Equations An Introduction Wiley 2007

isbn 9780470054567 A thorough and yet

relatively compact introduction

[46] SH Strogatz and M Dichter Nonlinear

Dynamics and Chaos Second Studies

in Nonlinearity Avalon Publishing 2016

isbn 9780813350844

[47] Mark Textor States of Affairs In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Winter 2016

D Complex analysis Bibliography p 9

Metaphysics Research Lab Stanford

University 2016

[48] Pauli Virtanen et al SciPy

10ndashFundamental Algorithms for

Scientific Computing in Python In arXiv

eshyprints arXiv190710121 (July 2019)

arXiv190710121

[49] Wikipedia Algebra mdash Wikipedia The Free

Encyclopedia httpenwikipedia

orgwindexphptitle=Algebraampoldid=

920573802 [Online accessed 26shyOctober-

2019] 2019

[50] Wikipedia Carl Friedrich Gauss mdash

Wikipedia The Free Encyclopedia http

en wikipedia org w index php

title=Carl20Friedrich20Gaussampoldid=

922692291 [Online accessed 26shyOctober-

2019] 2019

[51] Wikipedia Euclid mdash Wikipedia The Free

Encyclopedia httpenwikipedia

orgwindexphptitle=Euclidampoldid=

923031048 [Online accessed 26shyOctober-

2019] 2019

[52] Wikipedia First-order logic mdash

Wikipedia The Free Encyclopedia http

enwikipediaorgwindexphptitle=

First-order20logicampoldid=921437906

[Online accessed 29shyOctobershy2019] 2019

[53] Wikipedia Fundamental interaction mdash

Wikipedia The Free Encyclopedia http

en wikipedia org w index php

title=Fundamental20interactionampoldid=

925884124 [Online accessed 16shyNovember-

2019] 2019

[54] Wikipedia Leonhard Euler mdash Wikipedia

The Free Encyclopedia http en

wikipediaorgwindexphptitle=

Leonhard 20Euler amp oldid = 921824700

[Online accessed 26shyOctobershy2019] 2019

D Complex analysis Bibliography p 10

[55] Wikipedia Linguistic turnmdashWikipedia

The Free Encyclopedia http en

wikipediaorgwindexphptitle=

Linguistic20turnampoldid=922305269

[Online accessed 23shyOctobershy2019] 2019

Hey we all do it

[56] Wikipedia Probability space mdash

Wikipedia The Free Encyclopedia http

enwikipediaorgwindexphptitle=

Probability20spaceampoldid=914939789

[Online accessed 31shyOctobershy2019] 2019

[57] Wikipedia Propositional calculus mdash

Wikipedia The Free Encyclopedia http

en wikipedia org w index php

title=Propositional20calculusampoldid=

914757384 [Online accessed 29shyOctober-

2019] 2019

[58] Wikipedia Quaternion mdash Wikipedia The

Free Encyclopedia httpenwikipedia

orgwindexphptitle=Quaternionamp

oldid=920710557 [Online accessed 26-

Octobershy2019] 2019

[59] Wikipedia Set-builder notation mdash

Wikipedia The Free Encyclopedia http

enwikipediaorgwindexphptitle=

Set-builder20notationampoldid=917328223

[Online accessed 29shyOctobershy2019] 2019

[60] Wikipedia William Rowan Hamilton mdash

Wikipedia The Free Encyclopedia http

enwikipediaorgwindexphptitle=

William 20Rowan 20Hamilton amp oldid =

923163451 [Online accessed 26shyOctober-

2019] 2019

[61] L Wittgenstein PMS Hacker and J

Schulte Philosophical Investigations

Wiley 2010 isbn 9781444307979

[62] Ludwig Wittgenstein Tractatus Logico-

Philosophicus Ed by C familyi=C

given=K Ogden giveni= O Project

Complex analysis Bibliography p 11

Gutenberg International Library of

Psychology Philosophy and Scientific

Method Kegan Paul Trench Trubner amp Co

Ltd 1922 A brilliant work on what is

possible to express in languagemdashand what

is not As Wittgenstein puts it ldquoWhat can

be said at all can be said clearly and

whereof one cannot speak thereof one

must be silentrdquo

[63] Slavoj Žižek Less Than Nothing

Hegel and the Shadow of Dialectical

Materialism Verso 2012 isbn

9781844678976 This is one of the most

interesting presentations of Hegel and

Lacan by one of the most exciting

contemporary philosophers


Random variables 60

Probability density and mass functions 63

Binomial PMF 64

Gaussian PDF 70

Expectation 72

Central moments 75

Exercises for Chapter 03 78

Exe. 03.1 5 78

04 Statistics 79

Populations, samples, and machine learning 81

Estimation of sample mean and variance 83

Estimation and sample statistics 83

Sample mean, variance, and standard deviation 83

Sample statistics as random variables 85

Nonstationary signal statistics 86

Confidence 98

Generate some data to test the central limit theorem 99

Sample statistics 101

The truth about sample means 101

Gaussian and probability 102

Student confidence 110

Multivariate probability and correlation 115

Marginal probability 117

Covariance 118

Conditional probability and dependence 122

Regression 125

Exercises for Chapter 04 131

05 Vector calculus 132

Divergence, surface integrals, and flux 135

Flux and surface integrals 135

Continuity 135

Divergence 136

Exploring divergence 137

Curl, line integrals, and circulation 144

Line integrals 144

Circulation 144

Curl 145

Zero curl, circulation, and path independence 146

Exploring curl 147

Gradient 153

Gradient 153

Vector fields from gradients are special 154

Exploring gradient 154

Stokes and divergence theorems 164

The divergence theorem 164

The Kelvin-Stokes theorem 165

Related theorems 165

Exercises for Chapter 05 166

Exe. 05.1 6 166

06 Fourier and orthogonality 168

Fourier series 169

Fourier transform 174

Generalized Fourier series and orthogonality 184

Exercises for Chapter 06 186

Exe. 06.1 secrets 186

Exe. 06.2 seesaw 189

Exe. 06.3 9 189

Exe. 06.4 10 190

Exe. 06.5 11 190

Exe. 06.6 12 190

Exe. 06.7 13 190

Exe. 06.8 14 190

Exe. 06.9 savage 191

Exe. 06.10 strawman 192

07 Partial differential equations 193

Classifying PDEs 196

Sturm-Liouville problems 199

Types of boundary conditions 200

PDE solution by separation of variables 205

The 1D wave equation 212

Exercises for Chapter 07 219

Exe. 07.1 poltergeist 219

Exe. 07.2 kathmandu 220

08 Optimization 222

Gradient descent 223

Stationary points 223

The gradient points the way 224

The classical method 225

The Barzilai and Borwein method 226

Constrained linear optimization 235

Feasible solutions form a polytope 235

Only the vertices matter 236

The simplex algorithm 238

Exercises for Chapter 08 245

Exe. 08.1 cummerbund 245

Exe. 08.2 melty 245

09 Nonlinear analysis 246

Nonlinear state-space models 248

Autonomous and nonautonomous systems 248

Equilibrium 248

Nonlinear system characteristics 250

Those in-common with linear systems 250

Stability 250

Qualities of equilibria 251

Nonlinear system simulation 254

Simulating nonlinear systems in Python 255

Exercises for Chapter 09 257

A Distribution tables 258

A01 Gaussian distribution table 259

A02 Student's t-distribution table 261

B Fourier and Laplace tables 262

B01 Laplace transforms 263

B02 Fourier transforms 265

C Mathematics reference 268

C01 Quadratic forms 269

Completing the square 269

C02 Trigonometry 270

Triangle identities 270

Reciprocal identities 270

Pythagorean identities 270

Co-function identities 271

Even-odd identities 271

Sum-difference formulas (AM or lock-in) 271

Double angle formulas 271

Power-reducing or half-angle formulas 271

Sum-to-product formulas 272

Product-to-sum formulas 272

Two-to-one formulas 272

C03 Matrix inverses 273

C04 Laplace transforms 274

D Complex analysis 275

D01 Euler's formulas 276

Bibliography 277

Mathematics itself

Truth

Footnote 1: For much of this lecture I rely on the thorough overview of Glanzberg (Michael Glanzberg. “Truth”. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018).

1. Before we can discuss mathematical truth, we should begin with a discussion of truth itself.¹ It is important to note that this is, obviously, extremely incomplete; my aim is to give a sense of the subject via brutal (mis)abbreviation.

2. Of course, the study of truth cannot but be entangled with the study of the world as such (metaphysics) and of knowledge (epistemology). Some of the following theories presuppose or imply a certain metaphysical and/or epistemological theory, but which these are is controversial.

Neo-classical theories of truth

3. The neo-classical theories of truth take for granted that there is truth and attempt to explain what its precise nature is (Michael Glanzberg. “Truth”. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018). What are provided here are modern understandings of theories developed primarily in the early 20th century.

The correspondence theory

4. A version of what is called the correspondence theory of truth is the following:

A proposition is true iff there is an existing entity in the world that corresponds with it.

Such existing entities are called facts. Facts are relational in that their parts (e.g. subject, predicate, etc.) are related in a certain way.

5. Under this theory, then, if a proposition does not correspond to a fact, it is false.

6. This theory of truth is rather intuitive and consistently popular (Marian David. “The Correspondence Theory of Truth”. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2016. Metaphysics Research Lab, Stanford University, 2016. A detailed overview of the correspondence theory of truth).

The coherence theory

7. The coherence theory of truth is adamant that the truth of any given proposition is only as good as its holistic system of propositions.² This includes (but typically goes beyond) a requirement for consistency of a given proposition with the whole and the self-consistency of the whole itself, sometimes called coherence.

Footnote 2: This is typically put in terms of “beliefs” or “judgments”, but for brevity and parallelism I have cast it in terms of propositions. It is to this theory I have probably committed the most violence.

8. For parallelism, let’s attempt a succinct formulation of this theory, cast in terms of propositions:

A proposition is true iff it has coherence with a system of propositions.

itself Mathematics itself tru Truth p 1

pragmatism

3. Pragmatism was an American philosophical movement of the early 20th century that valued the success of "practical" application of theories. For an introduction, see Legg and Hookway (Catherine Legg and Christopher Hookway, "Pragmatism," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Spring 2019, Metaphysics Research Lab, Stanford University, 2019; an introductory article on the philosophical movement "pragmatism." It includes an important clarification of the pragmatic slogan "truth is the end of inquiry").

4. This is especially congruent with the work of William James (ibid.).

9. Note that this has no reference to facts whatsoever. However, it need not necessarily preclude them.

The pragmatic theory

10. Of the neo-classical theories of truth, this is probably the least agreed upon as having a single, clear statement (Glanzberg, "Truth"). However, as with pragmatism in general,³ the pragmatic truth is oriented practically.

11. Perhaps the most important aspect of this theory is that it is thoroughly a correspondence theory, agreeing that true propositions are those that correspond to the world. However, there is a different focus here that differentiates it from correspondence theory proper: it values as more true that which has some sort of practical use in human life.

12. We'll try to summarize pragmatism in two slogans with slightly different emphases; here's the first, again cast in propositional parallel:

A proposition is true iff it works.⁴

Now there are two ways this can be understood: (a) the proposition "works" in that it empirically corresponds to the world, or (b) the proposition "works" in that it has an effect that some agent intends. The former is pretty standard correspondence theory. The latter is new and fairly obviously has ethical implications, especially today.

5. This is especially congruent with the work of Charles Sanders Peirce (Legg and Hookway, "Pragmatism").

13. Let us turn to a second formulation:

A proposition is true if it corresponds with a process of inquiry.⁵

This has two interesting facets: (a) an agent's active inquiry creates truth, and (b) it is a sort of correspondence theory that requires a correspondence of a proposition with a process of inquiry, not, as in the correspondence theory, with a fact about the world. The latter has shades of both correspondence theory and coherence theory.

The picture theory

Before we delve into this theory, we must take a moment to clarify some terminology.

States of affairs and facts

14. When discussing the correspondence theory, we have used the term fact to mean an actual state of things in the world. A problem arises in the correspondence theory here. It says that a proposition is true iff there is a fact that corresponds with it. What of a negative proposition like "there are no cows in Antarctica"? We would seem to need a corresponding "negative fact" in the world to make this true. If a fact is taken to be

6. But Barker and Jago (Stephen Barker and Mark Jago, "Being Positive About Negative Facts," in Philosophy and Phenomenological Research 85.1 [2012], pp. 117–138, DOI: 10.1111/j.1933-1592.2010.00479.x) have attempted just that.

7. See Wittgenstein (Ludwig Wittgenstein, Tractatus Logico-Philosophicus, ed. by C. K. Ogden, Project Gutenberg, International Library of Psychology, Philosophy and Scientific Method, Kegan Paul, Trench, Trubner & Co., Ltd., 1922; a brilliant work on what is possible to express in language—and what is not. As Wittgenstein puts it, "What can be said at all can be said clearly; and whereof one cannot speak thereof one must be silent"), Biletzki and Matar (Anat Biletzki and Anat Matar, "Ludwig Wittgenstein," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Summer 2018, Metaphysics Research Lab, Stanford University, 2018; an introduction to Wittgenstein and his thought), Glock (Hans-Johann Glock, "Truth in the Tractatus," in Synthese 148.2 [Jan. 2006], pp. 345–368), and Dolby (David Dolby, "Wittgenstein on Truth," in A Companion to Wittgenstein, John Wiley & Sons, Ltd., 2016, chap. 27, pp. 433–442, ISBN 9781118884607, DOI: 10.1002/9781118884607.ch27).

composed of a complex of actual objects and relations, it is hard to imagine such facts.⁶

15. Furthermore, if a proposition is true, it seems that it is the corresponding fact that makes it so; what, then, makes a proposition false, since there is no fact to support the falsity (Mark Textor, "States of Affairs," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Winter 2016, Metaphysics Research Lab, Stanford University, 2016)?

16. And what of nonsense? There are some propositions, like "there is a round cube," that are neither true nor false. However, the preceding correspondence theory cannot differentiate between false and nonsensical propositions.

17. A state of affairs is something possible that may or may not be actual (ibid.). If a state of affairs is actual, it is said to obtain. The picture theory will make central this concept instead of that of the fact.

The picture theory of meaning (and truth)

18. The picture theory of meaning uses the analogy of the model or picture to explain the meaningfulness of propositions:⁷

A proposition names possible objects and arranges these names to correspond to a state of affairs.

Figure 01.1: a representation of the picture theory.

See Figure 01.1. This also allows for an easy account of truth, falsity, and nonsense:

nonsense: A sentence that appears to be a proposition is actually not if the arrangement of named objects is impossible. Such a sentence is simply nonsense.

truth: A proposition is true if the state of affairs it depicts obtains.

falsity: A proposition is false if the state of affairs it depicts does not obtain.

19. Now, some (Hans-Johann Glock, "Truth in the Tractatus," in Synthese 148.2 [Jan. 2006], pp. 345–368, ISSN 1573-0964, DOI: 10.1007/s11229-004-6226-2) argue this is a correspondence theory, and others (David Dolby, "Wittgenstein on Truth," in A Companion to Wittgenstein, John Wiley & Sons, Ltd., 2016, chap. 27, pp. 433–442, ISBN 9781118884607, DOI: 10.1002/9781118884607.ch27) that it is not. In any case, it certainly solves some issues that have plagued the correspondence theory.

"What cannot be said must be shown"

20. Something the picture theory does is declare a limit on what can meaningfully be said. A proposition (as defined above) must be potentially true or false. Therefore, something that cannot be false (something necessarily true) cannot be a proposition (ibid.). And there are certain things that are necessarily true for language itself to be meaningful—paradigmatically, the logical structure of the world. What a proposition does, then, is show

8. See also Žižek (Slavoj Žižek, Less Than Nothing: Hegel and the Shadow of Dialectical Materialism, Verso, 2012, ISBN 9781844678976; this is one of the most interesting presentations of Hegel and Lacan by one of the most exciting contemporary philosophers, pp. 25–6), from whom I stole the section title.

9. This was one of the contributions to the "linguistic turn" (Wikipedia, "Linguistic turn — Wikipedia, The Free Encyclopedia," http://en.wikipedia.org/w/index.php?title=Linguistic%20turn&oldid=922305269 [online; accessed 23-October-2019], 2019; hey, we all do it) of philosophy in the early 20th century.

via its own logical structure the necessary (for there to be meaningful propositions at all) logical structure of the world.⁸

21. An interesting feature of this perspective is that it opens up language itself to analysis and limitation.⁹ And, furthermore, it suggests that the set of what is is smaller than the set of what can be meaningfully spoken about.

The relativity of truth

22. Each subject (i.e., agent) in the world, with their propositions, has a perspective: a given moment, a given place, an historical-cultural-linguistic situation. At the very least, the truth of propositions must account for this. Just how a theory of truth should do so is a matter of significant debate (Maria Baghramian and J. Adam Carter, "Relativism," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Winter 2019, Metaphysics Research Lab, Stanford University, 2019).

23. Some go so far as to be skeptical about truth (Peter Klein, "Skepticism," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Summer 2015, Metaphysics Research Lab, Stanford University, 2015), regarding it to be entirely impossible. Others say that while a proposition may or may not be true, we could never come to know this.

24. Often underlying this conversation is the question of there being a common world in which we all participate and, if so, whether or not we can properly represent this world

10. Especially notable here is the work of Alfred Tarski in the mid-20th century.

in language such that multiple subjects could come to justifiably agree or disagree on the truth of a proposition. If every proposition is so relative that it is relevant to only the proposer, truth would seem of little value. On the other hand, if truth is understood to be "objective"—independent of subjective perspective—a number of objections can be made (Baghramian and Carter, "Relativism"), such as that there is no non-subjective perspective from which to judge truth.

Other ideas about truth

25. There are too many credible ideas about truth to attempt a reasonable summary; however, I will attempt to highlight a few important ones.

Formal methods

26. A set of tools was developed for exploring theories of truth, especially correspondence theories.¹⁰ Focus turned from beliefs to sentences, which are akin to propositions. (Recall that the above theories have already been recast in the more modern language of propositions.) Another aspect of these sentences under consideration is that they begin to be taken as interpreted sentences: they already have meaning.

27. Beyond this, several technical apparatus are introduced that formalize criteria for truth. For instance, a sentence is given a sign φ. A need arises to distinguish between the quotation of sentence φ and the unquoted sentence φ, which is then given the quasi-quotation notation ⌜φ⌝. For instance, let φ stand for snow is white; then φ → snow is

11. Wilfrid Hodges, "Tarski's Truth Definitions," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Fall 2018, Metaphysics Research Lab, Stanford University, 2018; Mario Gómez-Torrente, "Alfred Tarski," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Spring 2019, Metaphysics Research Lab, Stanford University, 2019; Peter Hylton and Gary Kemp, "Willard Van Orman Quine," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Spring 2019, Metaphysics Research Lab, Stanford University, 2019.

white and ⌜φ⌝ → 'snow is white'. Tarski introduces Convention T, which states that, for a fixed language L with fully interpreted sentences (Glanzberg, "Truth"):

An adequate theory of truth for L must imply, for each sentence φ of L:

⌜φ⌝ is true if and only if φ.

Using the same example, then:

'snow is white' is true if and only if snow is white.

Convention T states a general rule for the adequacy of a theory of truth and is used in several contemporary theories.
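For a toy, fully interpreted language, the adequacy condition can be checked mechanically: a candidate truth predicate defined on quoted sentences must agree with each sentence's interpretation. A minimal sketch (the two-sentence language and its interpretation are invented for illustration):

```python
# A toy fully interpreted language L: each sentence (represented here by its
# quotation, a Python string) maps to its truth value under the intended
# interpretation.
interpretation = {
    "snow is white": True,
    "grass is purple": False,
}

def is_true(quoted_sentence):
    # A candidate truth predicate for L, defined on quotations of sentences.
    return interpretation[quoted_sentence]

# Convention T: for each sentence phi of L, "phi" is true if and only if phi
# (i.e., iff phi holds under the interpretation).
adequate = all(is_true(phi) == holds for phi, holds in interpretation.items())
print(adequate)  # True
```

The check is trivially satisfied here because the predicate is defined by disquotation; that circularity is precisely what makes the T-schema feel both obvious and deep.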

28. We can see that these formal methods get quite technical and fun. For more, see Hodges, Gómez-Torrente, and Hylton and Kemp.¹¹

Deflationary theories

29. Deflationary theories of truth try to minimize, or eliminate altogether, the concept of, or use of the term, 'truth'. For instance, the redundancy theory claims that (Glanzberg, "Truth"):

12. Daniel Stoljar and Nic Damnjanovic, "The Deflationary Theory of Truth," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Fall 2014, Metaphysics Research Lab, Stanford University, 2014.

To assert that ⌜φ⌝ is true is just to assert that φ.

Therefore, we can eliminate the use of 'is true'.

30. For more of less, see Stoljar and Damnjanovic.¹²

Language

31. It is important to recognize that language mediates truth; that is, truth is embedded in language. The way language in general affects theories of truth has been studied extensively. For instance, whether the truth-bearer is a belief or a proposition or a sentence—or something else—has been much discussed. The importance of the meaning of truth-bearers like sentences has played another large role. Theories of meaning, like the picture theory presented above, are often closely intertwined with theories of truth.

32. One of the most popular theories of meaning is called the theory of use:

For a large class of cases of the employment of the word "meaning" – though not for all – this word can be explained in this way: the meaning of a word is its use in the language. (L. Wittgenstein, P. M. S. Hacker, and J. Schulte, Philosophical Investigations, Wiley, 2010, ISBN 9781444307979)

13. Drew Khlentzos, "Challenges to Metaphysical Realism," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Winter 2016, Metaphysics Research Lab, Stanford University, 2016.

This theory is accompanied by the concept of language-games, which are loosely defined as rule-based contexts within which sentences have uses. The idea is that the meaning of a given sentence is its use in a network of meaning that is constantly evolving. This view tends to be understood as deflationary or relativistic about truth.

Metaphysical and epistemological considerations

33. We began with the recognition that truth is intertwined with metaphysics and epistemology. Let's consider a few such topics.

34. The first is metaphysical realism, which claims that there is a world existing objectively, independently of how we think about or describe it. This "realism" tends to be closely tied to, yet distinct from, scientific realism, which goes further, claiming the world is "actually" as science describes, independently of the scientific descriptions (e.g., there are actual objects corresponding to the phenomena we call atoms, molecules, light particles, etc.).

35. There have been many challenges to the realist claim (for some recent versions, see Khlentzos¹³), put forth by what is broadly called anti-realism. These vary but often challenge the ability of realists to properly link language to supposed objects in the world.

14. These definitions are explicated by Guyer and Horstmann (Paul Guyer and Rolf-Peter Horstmann, "Idealism," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Winter 2018, Metaphysics Research Lab, Stanford University, 2018).

36. Metaphysical idealism has been characterized as claiming that "mind" or "subjectivity" generates or completely composes the world, which has no being outside mind. Epistemological idealism, on the other hand, while perhaps conceding that there is a world independent of mind, claims all knowledge of the world is created through mind and for mind, and therefore can never escape a sort of mind-world gap.¹⁴ This epistemological idealism has been highly influential since the work of Immanuel Kant (I. Kant, P. Guyer, and A. W. Wood, Critique of Pure Reason, The Cambridge Edition of the Works of Immanuel Kant, Cambridge University Press, 1999, ISBN 9781107268333) in the late 18th century, which ushered in the idea of the noumenal world, in-itself, and the phenomenal world, which is how the noumenal world presents to us. Many have held that phenomena can be known through inquiry, whereas noumena are inaccessible. Furthermore, what can be known is restricted by the categories pre-existent in the knower.

37. Another approach, taken by Georg Wilhelm Friedrich Hegel (Paul Redding, "Georg Wilhelm Friedrich Hegel," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Summer 2018, Metaphysics Research Lab, Stanford University, 2018) and other German idealists following Kant, is to reframe reality as thoroughly integrating subjectivity (G. W. F. Hegel and A. V. Miller, Phenomenology of Spirit, Motilal Banarsidass, 1998, ISBN 9788120814738; Slavoj Žižek, Less Than Nothing: Hegel and the Shadow of Dialectical Materialism, Verso, 2012, ISBN 9781844678976; this is one of the most interesting presentations of Hegel and Lacan by one of the most exciting contemporary philosophers); that is, "everything turns on grasping and expressing the True, not only as Substance, but equally as Subject." A subject's proposition is true inasmuch as it corresponds with its Notion (approximately, the idea or meaning for the subject). Some hold that this idealism is compatible with a sort of metaphysical realism, at least as far as understanding is not independent of, but rather beholden to, reality (Žižek, Less Than Nothing: Hegel and the Shadow of Dialectical Materialism, p. 906 ff.).

38. Clearly, all these ideas have many implications for theories of truth, and vice versa.

Where this leaves us

39. The truth is hard. What may at first appear to be a simple concept becomes complex upon analysis. It is important to recognize that we have only sampled some highlights of the theories of truth. I recommend further study of this fascinating topic.

40. Despite the difficulties of finding definitive grounds for understanding truth, we are faced with the task of provisionally forging ahead. Much of what follows in the study of mathematics makes certain implicit and explicit assumptions about truth. However, we have found that the foundations of these assumptions may themselves be problematic. It is my contention that, despite the lack of clear foundations, it is still worth studying engineering analysis, its mathematical foundations, and the foundations of truth itself. My justification for this claim is that I find the utility and the beauty of this study highly rewarding.

15. Throughout this section, for the history of mathematics, I rely heavily on Kline (M. Kline, Mathematics: The Loss of Certainty, A Galaxy Book, Oxford University Press, 1982, ISBN 9780195030853; a detailed account of the "illogical" development of mathematics and an exposition of its therefore remarkable utility in describing the world).

itselffound The foundations of mathematics

1. Mathematics has long been considered exemplary for establishing truth. Primarily, it uses a method that begins with axioms—unproven propositions that include undefined terms—and uses logical deduction to prove other propositions (theorems), to show that they are necessarily true if the axioms are.

2. It may seem obvious that truth established in this way would always be relative to the truth of the axioms, but throughout history this footnote was often obscured by the "obvious" or "intuitive" universal truth of the axioms.¹⁵ For instance, Euclid (Wikipedia, "Euclid — Wikipedia, The Free Encyclopedia," http://en.wikipedia.org/w/index.php?title=Euclid&oldid=923031048 [online; accessed 26-October-2019], 2019) founded geometry—the study of mathematical objects traditionally considered to represent physical space, like points, lines, etc.—on axioms thought so solid that it was not until the early 19th century that Carl Friedrich Gauss (Wikipedia, "Carl Friedrich Gauss — Wikipedia, The Free Encyclopedia," http://en.wikipedia.org/w/index.php?title=Carl%20Friedrich%20Gauss&oldid=922692291 [online; accessed 26-October-2019], 2019) and others recognized this was only one among many possible geometries (M. Kline, Mathematics: The Loss of Certainty, A Galaxy Book, Oxford University Press, 1982, ISBN 9780195030853) resting on different axioms. Furthermore, Aristotle (Christopher Shields, "Aristotle," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Winter 2016, Metaphysics Research Lab, Stanford University, 2016) had acknowledged that reasoning must begin with undefined terms; however, even Euclid (presumably aware of Aristotle's work) seemed to forget this and provided definitions, obscuring the foundations of his work and starting mathematics on a path that, for over 2000 years, would forget its own relativity (Kline, Mathematics: The Loss of Certainty, pp. 101-2).

3. The foundations of Euclid were even shakier than its murky starting point: several unstated axioms were used in proofs, and some proofs were otherwise erroneous. However, for two millennia, mathematics was seen as the field wherein truth could be established beyond doubt.

Algebra ex nihilo

4. Although not much new geometry appeared during this period, the field of algebra (Wikipedia, "Algebra — Wikipedia, The Free Encyclopedia," http://en.wikipedia.org/w/index.php?title=Algebra&oldid=920573802 [online; accessed 26-October-2019], 2019)—the study of manipulations of symbols standing for numbers in general—began with no axiomatic foundation whatsoever. The Greeks had a notion of rational numbers: ratios of natural numbers (positive integers); and it was known that many solutions to algebraic equations were irrational (could not be expressed as a ratio of integers). But these irrational numbers, like virtually everything else in algebra, were

gradually accepted because they were so useful in solving practical problems (they could be approximated by rational numbers, and this seemed good enough). The rules of basic arithmetic were accepted as applying to these and other forms of new numbers that arose in algebraic solutions: negative, imaginary, and complex numbers.

The application of mathematics to science

5. During this time, mathematics was being applied to optics and astronomy. Sir Isaac Newton then built calculus upon algebra, applying it to what is now known as Newtonian mechanics—which was really more the product of Leonhard Euler (George Smith, "Isaac Newton," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Fall 2008, Metaphysics Research Lab, Stanford University, 2008; Wikipedia, "Leonhard Euler — Wikipedia, The Free Encyclopedia," http://en.wikipedia.org/w/index.php?title=Leonhard%20Euler&oldid=921824700 [online; accessed 26-October-2019], 2019). Calculus introduced its own dubious operations, but the success of mechanics in describing and predicting physical phenomena was astounding. Mathematics was hailed as the language of God (later, Nature).

The rigorization of mathematics

6. It was not until Gauss created non-Euclidean geometry—in which Euclid's were shown to be one of many possible axioms compatible with the world—and William Rowan Hamilton (Wikipedia, "William Rowan Hamilton — Wikipedia, The Free Encyclopedia," http://en.wikipedia.org/w/index.php?title=William%20Rowan%20Hamilton&oldid=923163451 [online; accessed 26-October-2019], 2019) created quaternions (Wikipedia, "Quaternion — Wikipedia, The Free Encyclopedia," http://en.wikipedia.org/w/index.php?title=Quaternion&oldid=920710557 [online; accessed 26-October-2019], 2019), a number system in which multiplication is noncommutative, that it became apparent something was fundamentally wrong with the way truth in mathematics had been understood. This started a period of rigorization in mathematics that set about axiomatizing and proving 19th-century mathematics. This included the development of symbolic logic, which aided in the process of deductive reasoning.

16. The branch of mathematics called model theory concerns itself with general types of models that can be made from a given formal system, like an axiomatic mathematical system. For more on model theory, see Hodges (Wilfrid Hodges, "Model Theory," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Fall 2018, Metaphysics Research Lab, Stanford University, 2018). It is noteworthy that the engineering/science use of the term "mathematical model" is only loosely a "model" in the sense of model theory.

7. An aspect of this rigorization is that mathematicians came to terms with the axioms that include undefined terms. For instance, a "point" might be such an undefined term in an axiom. A mathematical model is what we create when we attach these undefined terms to objects, which can be anything consistent with the axioms.¹⁶ The system that results from proving theorems would then apply to anything "properly" described by the axioms. So two masses might be assigned "points" in a Euclidean geometric space, from which we could be confident that, for instance, the "distance" between these masses is the Euclidean norm of the line drawn between the points. It could be said, then, that a "point" in Euclidean geometry is implicitly defined by its axioms and theorems, and nothing else. That is, mathematical objects are not inherently tied to the physical objects to which we tend to apply them. Euclidean geometry is not the study of physical space, as it was long considered—it is the study of the objects implicitly defined by its axioms and theorems.
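The modeling move just described can be made concrete. In this sketch, the positions of two masses (the coordinates are invented purely for illustration) are attached to the undefined term "point," and the question of the "distance" between the masses is then answered by the Euclidean norm:

```python
import math

# Attach the undefined term "point" to concrete objects: the positions
# (in arbitrary units) of two masses, chosen here for illustration only.
mass_a = (0.0, 3.0)
mass_b = (4.0, 0.0)

def euclidean_distance(p, q):
    # The "distance" implicitly defined by Euclidean geometry: the
    # Euclidean norm of the line segment between the two points.
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

print(euclidean_distance(mass_a, mass_b))  # 5.0
```

Nothing in the geometry cares that the points are masses; any objects consistent with the axioms would do, which is exactly the sense in which the model is separate from the mathematics.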

The foundations of mathematics are built


8. The building of the modern foundations of mathematics began with clear axioms, solid reasoning (with symbolic logic), and lofty, yet seemingly attainable, goals: prove theorems to support the already ubiquitous mathematical techniques in geometry, algebra, and calculus from axioms; furthermore, prove that these axioms (and things they imply) do not contradict each other, i.e., are consistent, and that the axioms are not results of each other (one that can be derived from others is a theorem, not an axiom).

9. Set theory is a type of formal axiomatic system that all modern mathematics is expressed with, so set theory is often called the foundation of mathematics (Joan Bagaria, "Set Theory," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Fall 2019, Metaphysics Research Lab, Stanford University, 2019). We will study the basics in Chapter 02. The primary objects in set theory are sets: informally, collections of mathematical objects. There is not just one single set of axioms that is used as the foundation of all mathematics, for reasons we will review in a moment. However, the most popular set theory is Zermelo-Fraenkel set theory with the axiom of choice (ZFC). The axioms of ZF (sans C) are as follows (ibid.):

extensionality: If two sets A and B have the same elements, then they are equal.

empty set: There exists a set, denoted by ∅ and called the empty set, which has no elements.

pair: Given any sets A and B, there exists a set, denoted by {A, B}, which contains A and B as its only elements. In particular, there exists the set {A}, which has A as its only element.

power set: For every set A, there exists a set, denoted by P(A) and called the power set of A, whose elements are all the subsets of A.

union: For every set A, there exists a set, denoted by ⋃A and called the union of A, whose elements are all the elements of the elements of A.

infinity: There exists an infinite set. In particular, there exists a set Z that contains ∅ and such that if A ∈ Z, then ⋃{A, {A}} ∈ Z.

separation: For every set A and every given property, there is a set containing exactly the elements of A that have that property. A property is given by a formula ϕ of the first-order language of set theory. Thus, separation is not a single axiom but an axiom schema, that is, an infinite list of axioms, one for each formula ϕ.

replacement: For every given definable function with domain a set A, there is a set whose elements are all the values of the function.

10. ZFC also has the axiom of choice (Bagaria, "Set Theory"):

choice: For every set A of pairwise-disjoint non-empty sets, there exists a set that contains exactly one element from each set in A.
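For finite sets, several of these axioms name constructions we can compute directly. A sketch using Python's frozenset (finite analogues only, of course; the axioms also govern infinite sets, where no such computation is possible):

```python
from itertools import chain, combinations

A = frozenset({1, 2})
B = frozenset({3})

# pair: the set {A, B}, containing A and B as its only elements
pair = frozenset({A, B})

# union: the elements of the elements of {A, B}
union = frozenset(chain.from_iterable(pair))

# power set: all subsets of A
power_set = frozenset(
    frozenset(c) for r in range(len(A) + 1) for c in combinations(A, r)
)

# separation: the elements of A with a given property (here, being even)
evens = frozenset(x for x in A if x % 2 == 0)

print(sorted(union))   # [1, 2, 3]
print(len(power_set))  # 4
```

Frozensets are used rather than plain sets so that sets can themselves be elements of sets, as the axioms require.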

17. Panu Raatikainen, "Gödel's Incompleteness Theorems," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Fall 2018, Metaphysics Research Lab, Stanford University, 2018; a thorough and contemporary description of Gödel's incompleteness theorems, which have significant implications for the foundations and function of mathematics and mathematical truth.

The foundations have cracks

11. The foundationalists' goal was to prove that some set of axioms, from which all of mathematics can be derived, is both consistent (contains no contradictions) and complete (every true statement is provable). The work of Kurt Gödel (Juliette Kennedy, "Kurt Gödel," in The Stanford Encyclopedia of Philosophy, ed. by Edward N. Zalta, Winter 2018, Metaphysics Research Lab, Stanford University, 2018) in the mid-20th century shattered this dream by proving, in his first incompleteness theorem, that any consistent formal system within which one can do some amount of basic arithmetic is incomplete. His argument is worth reviewing (see Raatikainen¹⁷), but at its heart is an undecidable statement like "This sentence is unprovable." Let U stand for this statement. If it is true, it is unprovable. If it is provable, it is false. Therefore, it is true iff it is unprovable. Then he shows that if a statement A that essentially says "arithmetic is consistent" is provable, then so is the undecidable statement U. But if the system is to be consistent, U cannot be provable, and therefore neither can A be provable.

12. This is problematic. It tells us virtually any conceivable axiomatic foundation of mathematics is incomplete. If one is complete, it is inconsistent (and therefore worthless). One problem this introduces is that a true theorem may be impossible to prove; but, it turns out, we can never know in advance of its proof whether it is provable.

13. But it gets worse. Gödel's second incompleteness theorem shows that such systems cannot even be shown to be consistent. This means at any moment someone could find an inconsistency in mathematics, and not only would we lose some of the theorems—we would lose them all. This is because, by what is called the material implication (Kline, Mathematics: The Loss of Certainty, pp. 187-8, 264), if one contradiction can be found, every proposition can be proven from it. And, if this is the case, all (even proven) theorems in the system would be suspect.

18. Kline, Mathematics: The Loss of Certainty.
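This explosive feature of material implication can be checked by brute force over truth assignments: (P ∧ ¬P) → Q comes out true under every valuation, so once a contradiction is proven, any Q follows. A small sketch:

```python
from itertools import product

def implies(a, b):
    # Material implication: a -> b is false only when a is true and b false.
    return (not a) or b

# From a contradiction, everything follows: (P and not P) -> Q is a tautology,
# as checking all four truth assignments for (P, Q) confirms.
explosion = all(
    implies(p and (not p), q)
    for p, q in product([True, False], repeat=2)
)
print(explosion)  # True
```

The antecedent P ∧ ¬P is false under every valuation, so the conditional is vacuously true no matter what Q says.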

14. Even though no contradiction has yet appeared in ZFC, its axiom of choice, which is required for the proof of most of what has thus far been proven, generates the Banach-Tarski paradox, which says a sphere of diameter x can be partitioned into a finite number of pieces and recombined to form two spheres of diameter x. Troubling, to say the least. Attempts were made for a while to eliminate the use of the axiom of choice, but our buddy Gödel later proved that if ZF is consistent, so is ZFC (ibid., p. 267).

Mathematics is considered empirical

15. Since its inception, mathematics has been applied extensively to the modeling of the world. Despite its cracked foundations, it has striking utility. Many recent leading minds of mathematics, philosophy, and science suggest we treat mathematics as empirical, like any science: subject to its success in describing and predicting events in the world. As Kline18 summarizes,

The upshot [...] is that sound mathematics must be determined not by any one foundation which may some day prove to be right. The "correctness" of mathematics must be judged by its application to the physical world. Mathematics is an empirical science much as Newtonian mechanics. It is correct only to the extent that it works and, when it does not, it must be modified. It is not a priori knowledge, even though it was so regarded for two thousand years. It is not absolute or unchangeable.

Mathematical reasoning

Mathematical topics in overview

What is mathematics for engineering?

Exercises for Chapter 01

02 Mathematical reasoning, logic, and set theory

In order to communicate mathematical ideas effectively, formal languages have been developed within which logic, i.e. deductive (mathematical) reasoning, can proceed. Propositions are statements that can be either true (⊤) or false (⊥). Axiomatic systems begin with statements (axioms) assumed true. Theorems are proven by deduction. In many forms of logic, like the propositional calculus (Wikipedia, "Propositional calculus — Wikipedia, The Free Encyclopedia," http://en.wikipedia.org/w/index.php?title=Propositional%20calculus&oldid=914757384, [Online; accessed 29-October-2019], 2019), compound propositions are constructed from atomic propositions via logical connectives like "and" and "or" (see the lecture on logical connectives and quantifiers). In others, like first-order logic (Wikipedia, "First-order logic — Wikipedia, The Free Encyclopedia," http://en.wikipedia.org/w/index.php?title=First-order%20logic&oldid=921437906, [Online; accessed 29-October-2019], 2019), there are also logical quantifiers like "for every" and "there exists."

The mathematical objects and operations about which most propositions are made are expressed in terms of set theory, which was introduced in the lecture on the foundations of mathematics and will be expanded upon in the lecture introducing set theory. We can say that mathematical reasoning is made up of mathematical objects and operations expressed in set theory, and logic allows us to reason therewith.

Introduction to set theory

1. K. Ciesielski, Set Theory for the Working Mathematician, London Mathematical Society Student Texts, Cambridge University Press, 1997, ISBN 9780521594653. A readable introduction to set theory.

2. H.B. Enderton, Elements of Set Theory, Elsevier Science, 1977, ISBN 9780080570426. A gentle introduction to set theory and mathematical reasoning; a great place to start.

Set theory is the language of the modern foundation of mathematics, as discussed in the lecture on the foundations of mathematics. It is unsurprising, then, that it arises throughout the study of mathematics. We will use set theory extensively in our study of probability theory.

The axioms of ZFC set theory were introduced in the lecture on the foundations of mathematics. Instead of proceeding in the pure-mathematics way of introducing axioms and proving theorems, we will opt for a more applied approach, in which we begin with some simple definitions and include basic operations. A more thorough and still readable treatment is given by Ciesielski,1 and a very gentle version by Enderton.2

A set is a collection of objects. Set theory gives us a way to describe these collections. Often the objects in a set are numbers or sets of numbers. However, a set could represent collections of zebras and trees and hairballs. For instance, here are some sets: {1, 7, 2}, {x, y, z}, and {{1, 2}, {3, 4}}.

A field is a set with special structure. This structure is provided by the addition (+) and multiplication (×) operators and their inverses, subtraction (–) and division (÷).

3. When the natural numbers include zero, we write N0.

The quintessential example of a field is the set of real numbers R, which admits these operators, making it a field. The reals R, the complex numbers C, the integers Z, and the natural numbers3 N are the number sets we typically consider (though of these, only R and C are fields).

Set membership is the belonging of an object to a set. It is denoted with the symbol ∈, which can be read "is an element of," for element x and set X. For instance, we might say 7 ∈ {1, 7, 2} or 4 ∉ {1, 7, 2}. Or we might declare that x is a real number by stating x ∈ R.

Set operations can be used to construct new sets from established sets. We consider a few common set operations now.

The union (∪) of sets is the set containing all the elements of the original sets (no repetition allowed). The union of sets A and B is denoted A ∪ B. For instance, let A = {1, 2, 3} and B = {–1, 3}; then

A ∪ B = {–1, 1, 2, 3}.

The intersection (∩) of sets is a set containing the elements common to all the original sets. The intersection of sets A and B is denoted A ∩ B. For instance, let A = {1, 2, 3} and B = {2, 3, 4}; then

A ∩ B = {2, 3}.

If two sets have no elements in common, the intersection is the empty set ∅ = {}, the unique set with no elements.

The set difference of two sets A and B is the set of elements in A that aren't also in B. It is denoted A \ B. For instance, let A = {1, 2, 3} and B = {2, 3, 4}; then

A \ B = {1}.

A subset (⊆) of a set is a set whose elements are all contained in the original set. If the two sets are equal, one is still considered a subset of the other. We call a subset that is not equal to the other set a proper subset (⊂).

For instance, let A = {1, 2, 3} and B = {1, 2}; then B ⊆ A and, moreover, B ⊂ A.

The complement of a subset is the set of elements of the original set that aren't in the subset. For instance, if B ⊆ A as above, then the complement of B, denoted B̄, is

B̄ = A \ B = {3}.

The cartesian product of two sets A and B is denoted A × B and is the set of all ordered pairs (a, b), where a ∈ A and b ∈ B. It's worthwhile considering the following notation for this definition:

A × B = {(a, b) | a ∈ A and b ∈ B},

which means "the cartesian product of A and B is the set of ordered pairs (a, b) such that a ∈ A and b ∈ B" in set-builder notation (Wikipedia, "Set-builder notation — Wikipedia, The Free Encyclopedia," http://en.wikipedia.org/w/index.php?title=Set-builder%20notation&oldid=917328223, [Online; accessed 29-October-2019], 2019).

Let A and B be sets. A map (or function) f from A to B is an assignment of an element b ∈ B to each element a ∈ A. The function is denoted f : A → B, and we say that f maps each element a ∈ A to an element f(a) ∈ B, called the value of a under f, written a ↦ f(a). We say that f has domain A and codomain B. The image of f is the subset of its codomain B that contains the values of all elements mapped by f from its domain A.
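These definitions can be illustrated concretely. The following is a brief sketch using Python's built-in set type; the particular sets are chosen to match the examples above.

```python
# Illustrating the set operations above with Python's built-in set type.
A = {1, 2, 3}
B = {2, 3, 4}

union = A | B          # A ∪ B
intersection = A & B   # A ∩ B
difference = A - B     # A \ B

# the cartesian product A × B as a set of ordered pairs (a, b)
product = {(a, b) for a in A for b in B}

# set membership
assert 7 in {1, 7, 2}
assert 4 not in {1, 7, 2}

# a map f: A → B, assigning an element of B to each element of A
f = {1: 2, 2: 3, 3: 2}
image = set(f.values())  # the image is a subset of the codomain B
```

Here `union` is {1, 2, 3, 4}, `difference` is {1}, and `image` is a subset of B, mirroring the definitions above.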

Logical connectives and quantifiers

In order to make compound propositions, we need to define logical connectives. In order to specify quantities of variables, we need to define logical quantifiers. The following is a form of first-order logic (Wikipedia, "First-order logic — Wikipedia, The Free Encyclopedia").

Logical connectives

A proposition can be either true (⊤) or false (⊥). When it does not contain a logical connective, it is called an atomic proposition.

To combine propositions into a compound proposition, we require logical connectives. The basic connectives are not (¬), and (∧), and or (∨). Table 2.1 is a truth table for a number of connectives.

Quantifiers

Logical quantifiers allow us to indicate the quantity of a variable. The universal quantifier ∀ means "for all." For instance, let A be a set; then ∀a ∈ A means "for all elements in A" and gives this quantity to variable a. The existential quantifier ∃ means "there exists at least one" or "for some." For instance, let A be a set; then ∃a ∈ A means "there exists at least one element a in A ...".

Table 2.1: a truth table for logical connectives. The first two columns are the truth values of propositions p and q; the rest are outputs.

          not   and     or      nand    nor     xor     xnor
p    q    ¬p    p ∧ q   p ∨ q   p ↑ q   p ↓ q   p ⊻ q   p ⇔ q
⊥    ⊥    ⊤     ⊥       ⊥       ⊤       ⊤       ⊥       ⊤
⊥    ⊤    ⊤     ⊥       ⊤       ⊤       ⊥       ⊤       ⊥
⊤    ⊥    ⊥     ⊥       ⊤       ⊤       ⊥       ⊤       ⊥
⊤    ⊤    ⊥     ⊤       ⊤       ⊥       ⊥       ⊥       ⊤
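The truth table for these connectives can be generated programmatically. Here is a minimal Python sketch in which each connective is a function of the truth values of p and q; the rows reproduce the table above.

```python
from itertools import product

# each logical connective as a function of the truth values p and q
connectives = {
    "not":  lambda p, q: not p,
    "and":  lambda p, q: p and q,
    "or":   lambda p, q: p or q,
    "nand": lambda p, q: not (p and q),
    "nor":  lambda p, q: not (p or q),
    "xor":  lambda p, q: p != q,
    "xnor": lambda p, q: p == q,
}

# rows of the truth table, keyed by the truth values of (p, q)
table = {
    (p, q): {name: f(p, q) for name, f in connectives.items()}
    for p, q in product([False, True], repeat=2)
}
```

For instance, the row for (⊥, ⊥) has nand = ⊤ and xnor = ⊤, matching the table.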

Exercises for Chapter 02

Exercise 2.1 (hardhat)

For the following, write the set described in set-builder notation.
a. A = {2, 3, 5, 9, 17, 33, · · · }
b. B is the set of integers divisible by 11.
c. C = {13, 14, 15, · · · }
d. D is the set of reals between –3 and 42.

Exercise 2.2 (2)

Let x, y ∈ Rn. Prove the Cauchy-Schwarz inequality

|x · y| ≤ ‖x‖‖y‖. (ex1)

Hint: you may find the geometric definition of the dot product helpful.

Exercise 2.3 (3)

Let x ∈ Rn. Prove that

x · x = ‖x‖². (ex2)

Hint: you may find the geometric definition of the dot product helpful.

Exercise 2.4 (4)

Let x, y ∈ Rn. Prove the triangle inequality

‖x + y‖ ≤ ‖x‖ + ‖y‖. (ex3)

Hint: you may find the Cauchy-Schwarz inequality helpful.

03 Probability

Probability and measurement

1. For a good introduction to probability theory, see Ash (Robert B. Ash, Basic Probability Theory, Dover Publications, Inc., 2008) or Jaynes et al. (E.T. Jaynes et al., Probability Theory: The Logic of Science, Cambridge University Press, 2003, ISBN 9780521592710; an excellent and comprehensive introduction to probability theory).

Probability theory is a well-defined branch of mathematics. Andrey Kolmogorov described a set of axioms in 1933 that are still in use today as the foundation of probability theory.1 We will implicitly use these axioms in our analysis. The interpretation of probability, however, is a contentious matter. Some believe probability quantifies the frequency of the occurrence of some event that is repeated in a large number of trials. Others believe it quantifies the state of our knowledge or belief that some event will occur.

In experiments, our measurements are tightly coupled to probability. This is apparent in the questions we ask. Here are some examples.

1. How common is a given event?
2. What is the probability we will reject a good theory based on experimental results?
3. How repeatable are the results?
4. How confident are we in the results?
5. What is the character of the fluctuations and drift in the data?
6. How much data do we need?

Basic probability theory

The mathematical model for a class of measurements is called the probability space and is composed of a mathematical triple of a sample space Ω, a σ-algebra F, and a probability measure P, typically denoted (Ω, F, P), each of which we will consider in turn (Wikipedia, "Probability space — Wikipedia, The Free Encyclopedia," http://en.wikipedia.org/w/index.php?title=Probability%20space&oldid=914939789, [Online; accessed 31-October-2019], 2019).

The sample space Ω of an experiment is the set representing all possible outcomes of the experiment. If a coin is flipped, the sample space is Ω = {H, T}, where H is heads and T is tails. If a coin is flipped twice, the sample space could be

Ω = {HH, HT, TH, TT}.

However, the same experiment can have different sample spaces. For instance, for two coin flips we could also choose a sample space that tracks only the number of heads,

Ω = {0, 1, 2}.

We base our choice of Ω on the problem at hand.

An event is a subset of the sample space. That is, an event corresponds to a yes-or-no question about the experiment. For instance, event A (remember, A ⊆ Ω) in the coin-flipping experiment (two flips) might be A = {HT, TH}; A is an event that corresponds to the question "Is the second flip different than the first?" A is the event for which the answer is "yes."

Algebra of events

Because events are sets, we can perform the usual set operations with them.

Example probprob-1: set operations with events

Problem Statement. Consider a toss of a single die. We choose the sample space to be Ω = {1, 2, 3, 4, 5, 6}. Let the following define events:

A ≡ the result is even = {2, 4, 6}
B ≡ the result is greater than 2 = {3, 4, 5, 6}.

Find the following event combinations: A ∪ B, A ∩ B, A \ B, B \ A, Ā, B̄.
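The event combinations of Example probprob-1 can be checked with Python's set operations; a minimal sketch:

```python
# sample space and events from Example probprob-1 (single die toss)
Omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}       # the result is even
B = {3, 4, 5, 6}    # the result is greater than 2

AuB = A | B         # union A ∪ B
AnB = A & B         # intersection A ∩ B
AmB = A - B         # set difference A \ B
BmA = B - A         # set difference B \ A
Ac = Omega - A      # complement of A
Bc = Omega - B      # complement of B
```

For instance, A ∪ B = {2, 3, 4, 5, 6} and A ∩ B = {4, 6}.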

The σ-algebra F is the collection of events of interest. Often F is the set of all possible events given a sample space Ω, which is just the power set of Ω (Wikipedia, "Probability space — Wikipedia, The Free Encyclopedia"). When referring to an event, we often state that it is an element of F; for instance, we might say an event A ∈ F.

We're finally ready to assign probabilities to events. We define the probability measure P : F → [0, 1] to be a function satisfying the following conditions.

1. For every event A ∈ F, the probability measure of A is greater than or equal to zero, i.e. P(A) ≥ 0.
2. If an event is the entire sample space, its probability measure is unity, i.e. if A = Ω, P(A) = 1.
3. If events A1, A2, · · · are disjoint sets (no elements in common), then P(A1 ∪ A2 ∪ · · · ) = P(A1) + P(A2) + · · · .

We conclude the basics by observing four facts that can be proven from the definitions above:

1. P(∅) = 0;
2. P(Ā) = 1 – P(A);
3. P(A ∪ B) = P(A) + P(B) – P(A ∩ B); and
4. if A ⊆ B, then P(A) ≤ P(B).
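For a finite sample space with equally likely outcomes, the probability measure reduces to counting, and the conditions above can be spot-checked numerically; a sketch:

```python
from fractions import Fraction

# uniform probability measure on the sample space of a single fair die
Omega = {1, 2, 3, 4, 5, 6}

def P(event):
    # probability of an event as the fraction of outcomes it contains
    return Fraction(len(event), len(Omega))

A = {2, 4, 6}       # the result is even
B = {3, 4, 5, 6}    # the result is greater than 2

assert P(A) >= 0                               # condition 1: nonnegativity
assert P(Omega) == 1                           # condition 2: P(Ω) = 1
assert P({1, 2} | {5}) == P({1, 2}) + P({5})   # condition 3: additivity for disjoint events
assert P(A | B) == P(A) + P(B) - P(A & B)      # inclusion-exclusion
```

Here P(A) = 1/2 and P(B) = 2/3, and each condition holds exactly in rational arithmetic.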

Independence and conditional probability

Two events A and B are independent if and only if

P(A ∩ B) = P(A)P(B).

If an experimenter must make a judgment without data about the independence of events, they base it on their knowledge of the events, as discussed in the following example.

Example probcondition-1: independence

Problem Statement. Answer the following questions and imperatives.

1. Consider a single fair die, rolled twice. What is the probability that both rolls are 6?
2. What changes if the die is biased by a weight such that P(6) = 1/7?
3. What changes if the die is biased by a magnet, rolled on a magnetic dice-rolling tray, such that P(6) = 1/7?
4. What changes if there are two dice, biased by weights such that for each P(6) = 1/7, rolled once, both resulting in 6?
5. What changes if there are two dice, biased by magnets, rolled together?

Conditional probability

If events A and B are somehow dependent, we need a way to compute the probability of B occurring given that A occurs. This is called the conditional probability of B given A and is denoted P(B | A). For P(A) > 0, it is defined as

P(B | A) = P(A ∩ B) / P(A). (condition1)

We can interpret this as a restriction of the sample space Ω to A, i.e. the new sample space Ω′ = A ⊆ Ω. Note that if A and B are independent, we obtain the obvious result P(B | A) = P(B).

Example probcondition-2: dependence

Problem Statement. Consider two unbiased dice, rolled once. Let events A = {sum of faces = 8} and B = {faces are equal}. What is the probability the faces are equal, given that their sum is 8?
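This example can be computed by enumerating the 36 equally likely outcomes; a sketch:

```python
from fractions import Fraction
from itertools import product

# all 36 equally likely outcomes of rolling two unbiased dice
Omega = set(product(range(1, 7), repeat=2))

def P(event):
    # uniform probability measure: count outcomes in the event
    return Fraction(len(event), len(Omega))

A = {w for w in Omega if sum(w) == 8}    # sum of faces is 8
B = {w for w in Omega if w[0] == w[1]}   # faces are equal

# conditional probability, Eq. (condition1)
P_B_given_A = P(A & B) / P(A)
```

Of the five outcomes summing to 8, only (4, 4) has equal faces, so P(B | A) = 1/5.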

Bayes' theorem

Given two events A and B, Bayes' theorem (a.k.a. Bayes' rule) states that

P(A | B) = P(B | A)P(A) / P(B). (bay1)

Sometimes this is written

P(A | B) = P(B | A)P(A) / (P(B | A)P(A) + P(B | ¬A)P(¬A)) (bay2)
         = 1 / (1 + (P(B | ¬A) / P(B | A)) · (P(¬A) / P(A))). (bay3)

This is a useful theorem for determining a test's effectiveness. If a test is performed to determine whether an event has occurred, we might ask questions like "if the test indicates that the event has occurred, what is the probability it has actually occurred?" Bayes' theorem can help compute an answer.

Testing outcomes

The test can be either positive or negative, meaning it can either indicate or not indicate that A has occurred. Furthermore, this result can be either true or false. There are four options, then. Consider an event A and an event that is a test result B indicating that event A has occurred. Table 3.1 shows these four possible test outcomes. The event A occurring can lead to a true positive or a false negative, whereas ¬A can lead to a true negative or a false positive.

Table 3.1: test outcome B for event A.

               A       ¬A
positive (B)   true    false
negative (¬B)  false   true

Terminology is important here:

• P(true positive) = P(B | A), a.k.a. the sensitivity or detection rate;
• P(true negative) = P(¬B | ¬A), a.k.a. the specificity;
• P(false positive) = P(B | ¬A);
• P(false negative) = P(¬B | A).

Clearly, the desirable result for any test is that it is true. However, no test is true 100 percent of the time. So sometimes it is desirable to err on the side of the false positive, as in the case of a medical diagnostic. Other times it is more desirable to err on the side of a false negative, as in the case of testing for defects in manufactured balloons (when a false negative isn't a big deal).

Posterior probabilities

Returning to Bayes' theorem, we can evaluate the posterior probability P(A | B) of the event A having occurred, given that the test B is positive, from information that includes the prior probability P(A) of A. The form in Equation (bay2) or (bay3) is typically useful, because it uses the commonly known test probabilities of the true positive P(B | A) and of the false positive P(B | ¬A). We calculate P(A | B) when we want to interpret test results.

Figure 3.1: for different high sensitivities, the probability P(A | B) that an event A occurred, given that a test for it B is positive, versus the probability P(A) that the event A occurs, under the assumption that the specificity equals the sensitivity (curves shown for P(B | A) = 0.9000, 0.9900, 0.9990, and 0.9999).

Some interesting results can be found from this. For instance, if we let P(B | A) = P(¬B | ¬A) (sensitivity equals specificity) and realize that P(B | ¬A) + P(¬B | ¬A) = 1 (given ¬A, either B or ¬B), we can derive the expression

P(B | ¬A) = 1 – P(B | A). (bay4)

Using this and P(¬A) = 1 – P(A) in Equation (bay3) gives (recall we've assumed sensitivity equals specificity)

P(A | B) = 1 / (1 + ((1 – P(B | A)) / P(B | A)) · ((1 – P(A)) / P(A))) (bay5)
         = 1 / (1 + (1/P(B | A) – 1)(1/P(A) – 1)). (bay6)

This expression is plotted in Figure 3.1. See that a positive result for a rare event (small P(A)) is hard to trust unless the sensitivity P(B | A) and specificity P(¬B | ¬A) are very high indeed.
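Equation (bay6) is easy to evaluate numerically. A sketch, using an illustrative rare event with prior P(A) = 0.001:

```python
# posterior probability P(A|B) from Eq. (bay6), assuming the
# sensitivity P(B|A) equals the specificity P(¬B|¬A)
def posterior(p_A, sensitivity):
    return 1 / (1 + (1 / sensitivity - 1) * (1 / p_A - 1))

# a rare event: even a 99%-sensitive test yields a posterior below 10%
p = posterior(p_A=0.001, sensitivity=0.99)
```

With sensitivity 0.99 the posterior is only about 0.09; raising the sensitivity to 0.9999 pushes it above 0.9, as Figure 3.1 suggests.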

Example probbay-1: Bayes' theorem

Problem Statement. Suppose 0.1% of springs manufactured at a given plant are defective. Suppose you need to design a test such that, when it indicates a defective part, the part is actually defective 99% of the time. What sensitivity should your test have, assuming it can be made equal to its specificity?

Problem Solution. The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: bayes_theorem_example_01.ipynb
notebook kernel: python3

from sympy import *  # for symbolics
import numpy as np  # for numerics
import matplotlib.pyplot as plt  # for plots
from IPython.display import display, Markdown, Latex

Define symbolic variables:

var('p_A p_nA p_B p_nB p_B_A p_B_nA p_A_B', real=True)

(p_A, p_nA, p_B, p_nB, p_B_A, p_B_nA, p_A_B)

Beginning with Bayes' theorem and assuming the sensitivity and specificity are equal by Equation (bay4), we can derive the following expression for the posterior probability P(A | B):

p_A_B_e1 = Eq(p_A_B, p_B_A*p_A/p_B).subs({
    p_B: p_B_A*p_A + p_B_nA*p_nA,  # conditional probability
    p_B_nA: 1 - p_B_A,  # Eq. (bay4)
    p_nA: 1 - p_A,
})
display(p_A_B_e1)

p_AB = p_A p_BA / (p_A p_BA + (1 – p_A)(1 – p_BA))

Solve this for P(B | A), the quantity we seek:

p_B_A_sol = solve(p_A_B_e1, p_B_A, dict=True)
p_B_A_eq1 = Eq(p_B_A, p_B_A_sol[0][p_B_A])
display(p_B_A_eq1)

p_BA = p_AB (1 – p_A) / (–2 p_A p_AB + p_A + p_AB)

Now let's substitute the given probabilities:

p_B_A_spec = p_B_A_eq1.subs({
    p_A: 0.001,
    p_A_B: 0.99,
})
display(p_B_A_spec)

p_BA = 0.999989888981011

That's a tall order!

Random variables

Probabilities are useful even when they do not deal strictly with events. It often occurs that we measure something that has randomness associated with it. We use random variables to represent these measurements.

A random variable X : Ω → R is a function that maps an outcome ω from the sample space Ω to a real number x ∈ R, as shown in Figure 3.2. A random variable is denoted with a capital letter (e.g. X and K), and a specific value that it maps to (the value) is denoted with a lowercase letter (e.g. x and k).

Figure 3.2: a random variable X maps an outcome ω ∈ Ω to an x ∈ R.

A discrete random variable K is one that takes on discrete values. A continuous random variable X is one that takes on continuous values.

Example probrando-1: dice again

Problem Statement. Roll two unbiased dice. Let K be a random variable representing the sum of the two. Let P(k) be the probability of the result k. Plot and interpret P(k).
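The distribution requested above can be tabulated by enumeration; a sketch:

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# distribution of K, the sum of two unbiased dice, over all 36 outcomes
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
pmf = {k: Fraction(c, 36) for k, c in counts.items()}
```

The result is triangular: P(7) = 6/36 is the largest value, and P(2) = P(12) = 1/36 the smallest, because more outcome pairs sum to 7 than to any other value.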

Example probrando-2: Johnson-Nyquist noise

Problem Statement. A resistor at nonzero temperature, without any applied voltage, exhibits an interesting phenomenon: its voltage randomly fluctuates. This is called Johnson-Nyquist noise and is a result of thermal excitation of charge carriers (electrons, typically). For a given resistor and measurement system, let the probability density function fV of the voltage V across an unrealistically hot resistor be

fV(V) = (1/√π) e^(–V²).

Plot and interpret the meaning of this function.

Probability density and mass functions

Consider an experiment that measures a random variable. We can plot the relative frequency of the measurand landing in different "bins" (ranges of values). This is called a frequency distribution or a probability mass function (PMF).

Consider, for instance, a probability mass function as plotted in Figure 3.3, where a frequency ai can be interpreted as an estimate of the probability of the measurand being in the ith interval. The sum of the frequencies must be unity:

a1 + a2 + · · · + ak = 1,

with k being the number of bins.

Figure 3.3: plot of a probability mass function.

The frequency density distribution is similar to the frequency distribution, but with ai ↦ ai/∆x, where ∆x is the bin width. If we let the number of bins approach infinity and the bin width approach zero, we derive the probability density function (PDF)

f(x) = lim_{k→∞, ∆x→0} aj/∆x, for x in the jth bin. (pxf1)

Figure 3.4: plot of a probability density function.

We typically think of a probability density function f, like the one in Figure 3.4, as a function that can be integrated over to find the probability of the random variable (measurand) being in an interval [a, b]:

P(x ∈ [a, b]) = ∫_a^b f(χ) dχ. (pxf2)

Of course, P(x ∈ (–∞, ∞)) = ∫_{–∞}^{∞} f(χ) dχ = 1.

We now consider a common PMF and a common PDF.

Binomial PMF

Consider a random binary sequence of length n, such that each element is a random 0 or 1, generated independently, like

(1, 0, 1, 1, 0, · · · , 1, 1). (pxf3)

Let events 1 and 0 be mutually exclusive and exhaustive, and P(1) = p. The probability of the sequence above occurring, if it contains k ones, is p^k (1 – p)^(n–k). There are "n choose k,"

(n choose k) = n! / (k! (n – k)!), (pxf4)

possible combinations of k ones for n bits. Therefore, the probability of any combination of k ones in a series is

f(k) = (n choose k) p^k (1 – p)^(n–k). (pxf5)

We call Equation (pxf5) the binomial distribution PMF.
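A minimal numeric sketch of Equation (pxf5), using the standard library's math.comb for the binomial coefficient:

```python
from math import comb

# binomial PMF of Eq. (pxf5)
def binomial_pmf(k, n, p):
    return comb(n, k) * p**k * (1 - p)**(n - k)

# evaluate over all k for an illustrative n and p
n, p = 100, 0.25
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]
```

The PMF sums to one, and its peak sits near np = 25, consistent with the plots in the following example.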

Example probpxf-1: binomial PMF

Problem Statement. Consider a field sensor that fails for a given measurement with probability p. Given n measurements, plot the binomial PMF as a function of k failed measurements for a few different probabilities of failure: p ∈ {0.04, 0.25, 0.5, 0.75, 0.96}.

Problem Solution. Figure 3.6 shows Matlab code for constructing the PMFs plotted in Figure 3.5. Note that the symmetry is due to the fact that events 1 and 0 are mutually exclusive and exhaustive.


Figure 3.5: binomial PMF for n = 100 measurements and different values of P(1) = p, the probability of a measurement error (curves for p = 0.04, 0.25, 0.5, 0.75, and 0.96). The plot is generated by the Matlab code of Figure 3.6.

% parameters
n = 100;
k_a = linspace(1,n,n);
p_a = [0.04,0.25,0.5,0.75,0.96];

% binomial function
f = @(n,k,p) nchoosek(n,k)*p^k*(1-p)^(n-k);

% loop through to construct an array
f_a = NaN*ones(length(k_a),length(p_a));
for i = 1:length(k_a)
    for j = 1:length(p_a)
        f_a(i,j) = f(n,k_a(i),p_a(j));
    end
end

% plot
figure
colors = jet(length(p_a));
for j = 1:length(p_a)
    bar(k_a,f_a(:,j),...
        'facecolor',colors(j,:),'facealpha',0.5,...
        'displayname',['$p = ',num2str(p_a(j)),'$']); hold on
end
leg = legend('show','location','north');
set(leg,'interpreter','latex');
hold off
xlabel('number of ones in sequence $k$')
ylabel('probability')
xlim([0,100])

Figure 3.6: a Matlab script for generating binomial PMFs.


Gaussian PDF

The Gaussian or normal random variable x has PDF

f(x) = (1 / (σ√(2π))) exp(–(x – µ)² / (2σ²)). (pxf6)

Although we're not quite ready to understand these quantities in detail, it can be shown that the parameters µ and σ have the following meanings:

• µ is the mean of x;
• σ is the standard deviation of x; and
• σ² is the variance of x.

Consider the "bell-shaped" Gaussian PDF in Figure 3.7. It is always symmetric. The mean µ is its central value, and the standard deviation σ is directly related to its width. We will continue to explore the Gaussian distribution in the following lectures, especially in the lecture on confidence intervals.

Figure 3.7: PDF for a Gaussian random variable x with mean µ = 0 and standard deviation σ = 1/√2.
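A quick numeric check of Equation (pxf6): the PDF is symmetric about µ and integrates to one. A sketch, using a crude Riemann sum and the same parameters as Figure 3.7:

```python
from math import exp, pi, sqrt

# Gaussian PDF of Eq. (pxf6), with µ = 0 and σ = 1/√2 as in Figure 3.7
def f(x, mu=0.0, sigma=1 / sqrt(2)):
    return exp(-(x - mu) ** 2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

# crude Riemann-sum check that the PDF integrates to one
dx = 1e-3
total = sum(f(-8 + i * dx) * dx for i in range(int(16 / dx)))
```

The sum comes out very close to one, and f(1) = f(–1) by symmetry.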

Expectation

Recall that a random variable is a function X : Ω → R that maps from the sample space to the reals. Random variables are the arguments of probability mass functions (PMFs) and probability density functions (PDFs).

The expected value (or expectation) of a random variable is akin to its "average value" and depends on its PMF or PDF. The expected value of a random variable X is denoted 〈X〉 or E[X]. There are two definitions of the expectation: one for a discrete random variable, the other for a continuous random variable. Before we define them, however, it is useful to predefine the most fundamental property of a random variable: its mean.

Definition probE.1: mean. The mean of a random variable X is defined as

mX = E[X].

Let's begin with a discrete random variable.

Definition probE.2: expectation of a discrete random variable. Let K be a discrete random variable and f its PMF. The expected value of K is defined as

E[K] = Σ_{∀k} k f(k).

prob Probability E Expectation p 2

Example probEshy1 re expectation of a discrete random variable

Problem Statement

Given a discrete random variable K with PMF

shown below what is its mean mK

Let us now turn to the expectation of a

continuous random variable

Definition probE3 expectation of a

continuous random

variable

Let X be a continuous random variable and f

its PDF The expected value of X is defined as

E[X]=

ˆ infinndashinfin xf(x)dx

Example probEshy2 re expectation of a continuous random variable

Problem Statement

Given a continuous random variable X with

Gaussian PDF f what is the expected value of

X

random variable x

prob Probability moments Expectation p 3

Due to its sum or integral form, the expected value E[·] has some familiar properties: for random variables X and Y and reals a and b,

E[a] = a, (E1a)
E[X + a] = E[X] + a, (E1b)
E[aX] = a E[X], (E1c)
E[E[X]] = E[X], (E1d)
E[aX + bY] = a E[X] + b E[Y]. (E1e)
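These properties can be sanity-checked numerically. The Python sketch below (mine; the PMFs are arbitrary, and X and Y are assumed independent so the joint PMF factors) verifies linearity (E1e):

```python
from itertools import product

# Illustrative discrete PMFs; X and Y are assumed independent
fX = {0: 0.2, 1: 0.5, 2: 0.3}
fY = {0: 0.6, 1: 0.4}

def E(pmf):
    # E = sum of value * probability
    return sum(v * p for v, p in pmf.items())

a, b = 3.0, -2.0

# E[aX + bY] computed from the joint PMF f(x, y) = fX(x) * fY(y)
E_lin = sum((a * x + b * y) * fX[x] * fY[y] for x, y in product(fX, fY))

# linearity (E1e): E[aX + bY] = a E[X] + b E[Y]
print(abs(E_lin - (a * E(fX) + b * E(fY))) < 1e-12)  # True
```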


probmoments Central moments

Given a probability mass function (PMF) or probability density function (PDF) of a random variable, several useful parameters of the random variable can be computed. These are called central moments because they quantify aspects of the distribution relative to its mean.

Definition probmoments1: central moments

The nth central moment of random variable X with PDF f is defined as

E[(X - \mu_X)^n] = \int_{-\infty}^{\infty} (x - \mu_X)^n f(x)\, dx.

For discrete random variable K with PMF f,

E[(K - \mu_K)^n] = \sum_{\forall k} (k - \mu_K)^n f(k).

Example probmoments-1: first central moment

Problem Statement: Prove that the first central moment of continuous random variable X is zero.

The second central moment of random variable X is called the variance and is denoted

\sigma_X^2, \quad \mathrm{Var}[X], \quad \text{or} \quad E[(X - \mu_X)^2]. (moments1)

The variance is a measure of the width or spread of the PMF or PDF. We usually compute the variance with the formula

\mathrm{Var}[X] = E[X^2] - \mu_X^2.

Other properties of variance include, for real constant c,

\mathrm{Var}[c] = 0, (moments2)
\mathrm{Var}[X + c] = \mathrm{Var}[X], (moments3)
\mathrm{Var}[cX] = c^2\, \mathrm{Var}[X]. (moments4)
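A quick numerical sketch (mine, in Python, with an arbitrary PMF) of properties (moments2) through (moments4):

```python
# Illustrative discrete PMF (arbitrary values)
f = {-1: 0.25, 0: 0.25, 2: 0.5}

def E(g):
    # expectation of g(X) under the PMF f
    return sum(g(x) * p for x, p in f.items())

def var(scale=1.0, shift=0.0):
    # Var[scale*X + shift], computed from the definition E[(Y - mu_Y)^2]
    m = E(lambda x: scale * x + shift)
    return E(lambda x: (scale * x + shift - m) ** 2)

c = 3.0
print(abs(var(0.0, c)) < 1e-12)           # Var[c] = 0           (moments2)
print(abs(var(1.0, c) - var()) < 1e-12)   # Var[X+c] = Var[X]    (moments3)
print(abs(var(c) - c**2 * var()) < 1e-9)  # Var[cX] = c^2 Var[X] (moments4)
```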

The standard deviation is defined as

\sigma_X = \sqrt{\sigma_X^2} = \sqrt{\mathrm{Var}[X]}.

Although the variance is mathematically more convenient, the standard deviation has the same physical units as X, so it is often the more physically meaningful quantity. Due to its meaning as the width or spread of the probability distribution, and its sharing of physical units with the random variable, it is a convenient choice for error bars on plots of a random variable.

The skewness Skew[X] is a normalized third central moment,

\mathrm{Skew}[X] = \frac{E[(X - \mu_X)^3]}{\sigma_X^3}. (moments5)

Skewness is a measure of asymmetry of a random variable's PDF or PMF. For a symmetric PMF or PDF, such as the Gaussian PDF, Skew[X] = 0.

The kurtosis Kurt[X] is a normalized fourth central moment,

\mathrm{Kurt}[X] = \frac{E[(X - \mu_X)^4]}{\sigma_X^4}. (moments6)

Kurtosis is a measure of the tailedness of a random variable's PDF or PMF: "heavier" tails yield higher kurtosis. A Gaussian random variable has PDF with kurtosis 3. Given that, for Gaussians, both skewness and kurtosis have nice values (0 and 3), we can think of skewness and kurtosis as measures of similarity to the Gaussian PDF.
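As an illustrative check (mine, not from the text, in Python), the normalized third and fourth central moments of a standard Gaussian can be estimated from a large random sample; the estimates should land near 0 and 3, respectively:

```python
import random

random.seed(0)
xs = [random.gauss(0.0, 1.0) for _ in range(200_000)]

n = len(xs)
mu = sum(xs) / n
sig = (sum((x - mu) ** 2 for x in xs) / n) ** 0.5

# normalized third and fourth central moments
skew = sum((x - mu) ** 3 for x in xs) / n / sig**3
kurt = sum((x - mu) ** 4 for x in xs) / n / sig**4

print(round(skew, 3), round(kurt, 3))  # estimates near 0 and 3
```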

probex Exercises for Chapter 03

Exercise 031 (5)

Several physical processes can be modeled with a random walk: a process of iteratively changing a quantity by some random amount. Infinitely many variations are possible, but common factors of variation include probability distribution, step size, dimensionality (e.g. one-dimensional, two-dimensional, etc.), and coordinate system. Graphical representations of these walks can be beautiful. Develop a computer program that generates random walks and corresponding graphics. Do it well and call it art, because it is.

stats

Statistics

Whereas probability theory is primarily focused on the relations among mathematical objects, statistics is concerned with making sense of the outcomes of observation. (Steven S. Skiena. Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win. Outlooks. Cambridge University Press, 2001. DOI: 10.1017/CBO9780511547089. This includes a lucid section on probability versus statistics, also available at https://www3.cs.stonybrook.edu/~skiena/jaialai/excerpts/node12.html.) However, we frequently use statistical methods to estimate probabilistic models. For instance, we will learn how to estimate the standard deviation of a random process we have some reason to expect has a Gaussian probability distribution.

Statistics has applications in nearly every applied science and engineering discipline. Any time measurements are made, statistical analysis is how one makes sense of the results. For instance, determining a reasonable level of confidence in a measured parameter requires statistics.

A particularly hot topic nowadays is machine learning, which seems to be a field with applications that continue to expand. This field is fundamentally built on statistics.

A good introduction to statistics appears at the end of Ash.1 A more involved introduction is given by Jaynes et al.2 The treatment by Kreyszig3 is rather incomplete, as will be our own.

1. Ash, Basic Probability Theory.
2. Jaynes et al., Probability Theory: The Logic of Science.
3. Erwin Kreyszig. Advanced Engineering Mathematics. 10th ed. John Wiley & Sons Limited, 2011. ISBN: 9781119571094. The authoritative resource for engineering mathematics: it includes detailed accounts of probability, statistics, vector calculus, linear algebra, Fourier analysis, ordinary and partial differential equations, and complex analysis, along with several other topics at varying depths. Overall, it is the best place to start when seeking mathematical guidance.

statsterms Populations, samples, and machine learning

An experiment's population is a complete collection of objects that we would like to study. These objects can be people, machines, processes, or anything else we would like to understand experimentally.

Of course, we typically can't measure all of the population. Instead, we take a subset of the population, called a sample, and infer the characteristics of the entire population from this sample.

However, this inference, that the sample is somehow representative of the population, assumes the sample size is sufficiently large and that the sampling is random, meaning selection of the sample should be such that no one group within the population is systematically over- or under-represented in the sample.
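A minimal sketch (mine, in Python) of simple random sampling with the standard library; the population of serial numbers is hypothetical:

```python
import random

random.seed(42)  # reproducibility of the illustration

# hypothetical population: 1000 serial numbers
population = list(range(1000))

# simple random sample of 50, without replacement: every unit equally likely
sample = random.sample(population, 50)

print(len(sample), len(set(sample)))  # 50 50
```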

Machine learning is a field that makes extensive use of measurements and statistical inference. In it, an algorithm is trained by exposure to sample data, which is called a training set. The variables measured are called features. Typically, a predictive model is developed that can be used to extrapolate from the data to a new situation. The methods of statistical analysis we introduce in this chapter are the foundation of most machine learning methods.

Example statsterms-1: combat boots

Problem Statement: Consider a robot, Pierre, with a particular gravitas and sense of style. He seeks just the right-looking pair of combat boots for wearing in the autumn rains. Pierre is to purchase the boots online via image recognition and decides to gather data by visiting a hipster hangout one evening to train his style. For contrast, he also watches footage of a White Nationalist rally, focusing special attention on the boots of wearers of khakis and polos. Comment on Pierre's methods.

statssample Estimation of sample mean and variance

Estimation and sample statistics

The mean and variance definitions of Lecture probE and Lecture probmoments apply only to a random variable for which we have a theoretical probability distribution. Typically, it is not until after having performed many measurements of a random variable that we can assign a good distribution model. Until then, measurements can help us estimate aspects of the data. We usually start by estimating basic parameters such as mean and variance before estimating a probability distribution.

There are two key aspects to randomness in the measurement of a random variable. First, of course, there is the underlying randomness, with its probability distribution, mean, standard deviation, etc., which we call the population statistics. Second, there is the statistical variability that is due to the fact that we are estimating the random variable's statistics, called its sample statistics, from some sample. Statistical variability is decreased with greater sample size and number of samples, whereas the underlying randomness of the random variable does not decrease; instead, our estimates of its probability distribution and statistics improve.

Sample mean, variance, and standard deviation

The arithmetic mean or sample mean of a measurand with sample size N, represented by random variable X, is defined as

\overline{x} = \frac{1}{N} \sum_{i=1}^{N} x_i. (sample1)

If the sample size is large, \overline{x} \to m_X (the sample mean approaches the mean). The population mean is another name for the mean m_X, which is equal to

m_X = \lim_{N \to \infty} \frac{1}{N} \sum_{i=1}^{N} x_i. (sample2)

Recall that the definition of the mean is m_X = E[X].

The sample variance of a measurand represented by random variable X is defined as

S_X^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \overline{x})^2. (sample3)

If the sample size is large, S_X^2 \to \sigma_X^2 (the sample variance approaches the variance). The population variance is another term for the variance \sigma_X^2 and can be expressed as

\sigma_X^2 = \lim_{N \to \infty} \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \overline{x})^2. (sample4)

Recall that the definition of the variance is \sigma_X^2 = E[(X - m_X)^2].

The sample standard deviation of a measurand represented by random variable X is defined as

S_X = \sqrt{S_X^2}. (sample5)

If the sample size is large, S_X \to \sigma_X (the sample standard deviation approaches the standard deviation). The population standard deviation is another term for the standard deviation \sigma_X and can be expressed as

\sigma_X = \lim_{N \to \infty} \sqrt{S_X^2}. (sample6)

Recall that the definition of the standard deviation is \sigma_X = \sqrt{\sigma_X^2}.
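The estimators (sample1), (sample3), and (sample5) are a few lines of code. This sketch (mine, in Python) checks hand-rolled versions against the standard library's statistics module, which uses the same N - 1 denominator:

```python
import math
import statistics

x = [4.1, 3.9, 4.4, 4.0, 3.8, 4.3]  # illustrative measurements
N = len(x)

xbar = sum(x) / N                                 # (sample1) sample mean
S2 = sum((xi - xbar) ** 2 for xi in x) / (N - 1)  # (sample3) sample variance
S = math.sqrt(S2)                                 # (sample5) sample standard deviation

# the standard library agrees (it also uses the N-1 denominator)
print(math.isclose(xbar, statistics.mean(x)))    # True
print(math.isclose(S2, statistics.variance(x)))  # True
print(math.isclose(S, statistics.stdev(x)))      # True
```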

Sample statistics as random variables

There is an ambiguity in our usage of the term "sample": it can mean just one measurement, or it can mean a collection of measurements gathered together. Hopefully it is clear from context.

In the latter sense, often we collect multiple samples, each of which has its own sample mean \overline{X}_i and standard deviation S_{X_i}. In this situation, \overline{X}_i and S_{X_i} are themselves random variables (meta af, I know). This means they have their own sample means \overline{\overline{X}_i} and \overline{S_{X_i}} and standard deviations S_{\overline{X}_i} and S_{S_{X_i}}.

The mean of means \overline{\overline{X}_i} is equivalent to a mean with a larger sample size and is therefore our best estimate of the mean of the underlying random process. The mean of standard deviations \overline{S_{X_i}} is our best estimate of the standard deviation of the underlying random process. The standard deviation of means S_{\overline{X}_i} is a measure of the spread in our estimates of the mean; it is our best estimate of the standard deviation of the statistical variation and should therefore tend to zero as sample size and number of samples increase. The standard deviation of standard deviations S_{S_{X_i}} is a measure of the spread in our estimates of the standard deviation of the underlying process; it should also tend to zero as sample size and number of samples increase.

Let N be the size of each sample. It can be shown that the standard deviation of the means S_{\overline{X}_i} can be estimated from a single sample standard deviation:

S_{\overline{X}_i} \approx \frac{S_{X_i}}{\sqrt{N}}. (sample7)

This shows that, as the sample size N increases, the statistical variability of the mean decreases (and in the limit approaches zero).
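A simulation sketch (mine, in Python) of (sample7): the spread of sample means shrinks roughly as 1/√N. Many samples of size N are drawn from a uniform distribution, whose standard deviation is √(1/12) ≈ 0.2887, and the standard deviation of their means is printed:

```python
import random
import statistics

random.seed(1)
M = 2000  # number of samples (illustrative)

def spread_of_means(N):
    # draw M samples of size N from uniform(0, 1); return std dev of their means
    means = [statistics.mean(random.random() for _ in range(N)) for _ in range(M)]
    return statistics.stdev(means)

# sigma of uniform(0, 1) is sqrt(1/12) ~ 0.2887, so spread ~ 0.2887/sqrt(N)
for N in (4, 16, 64):
    print(N, round(spread_of_means(N), 4))  # roughly 0.144, 0.072, 0.036
```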

Nonstationary signal statistics

The sample mean, variance, and standard deviation definitions above assume the random process is stationary; that is, its population mean does not vary with time. However, a great many measurement signals have populations that do vary with time, i.e. they are nonstationary. Sometimes the nonstationarity arises from a "drift" in the dc value of a signal or some other slowly changing variable. But dynamic signals can also change in a recognizable and predictable manner, as when, say, the temperature of a room changes when a window is opened, or when a water level changes with the tide.

Typically, we would like to minimize the effect of nonstationarity on the signal statistics. In certain cases, such as drift, the variation is a nuisance only, but other times it is the point of the measurement.


Two common techniques are used, depending on the overall type of nonstationarity. If it is periodic with some known or estimated period, the measurement data series can be "folded" or "reshaped" such that the ith measurement of each period corresponds to the ith measurement of all other periods. In this case, somewhat counterintuitively, we can consider the ith measurements to correspond to a sample of size N, where N is the number of periods over which measurements are made. When the signal is aperiodic, we often simply divide it into "small" (relative to its overall trend) intervals over which statistics are computed separately.

Note that in this discussion we have assumed that the nonstationarity of the signal is due to a variable that is deterministic (not random).
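The folding technique can be sketched in a few lines (my own illustration, in Python, with stand-in data): a series sampled over nT periods, with np points per period, is reshaped so that position i of every period lines up:

```python
# stand-in data: nT = 3 periods, np = 4 points per period
nT, np_ = 3, 4
signal = list(range(nT * np_))  # 0..11, time-ordered

# fold: one row per period; column i holds the ith measurement of each period
folded = [signal[p * np_:(p + 1) * np_] for p in range(nT)]

# column 0 is now a "sample" of size nT for the 0th instant in the period
sample_0 = [row[0] for row in folded]

print(folded)    # [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
print(sample_0)  # [0, 4, 8]
```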

Example statssample-1: measurement of gaussian noise on a nonstationary signal

Problem Statement: Consider the measurement of the temperature inside a desktop computer chassis via an inexpensive thermistor, a resistor that changes resistance with temperature. The processor and power supply heat the chassis in a manner that depends on processing demand. For the test protocol, the processors are cycled sinusoidally through processing power levels at a frequency of 50 mHz for nT = 12 periods and sampled at 1 Hz. Assume a temperature fluctuation between about 20 and 50 C and gaussian noise with standard deviation 4 C. Consider a sample to be the multiple measurements of a certain instant in the period.

1. Generate and plot simulated temperature data as a time series and as a histogram or frequency distribution. Comment on why the frequency distribution sucks.
2. Compute the sample mean and standard deviation for each sample in the cycle.
3. Subtract the mean from each sample in the period such that each sample distribution is centered at zero. Plot the composite frequency distribution of all samples together. This represents our best estimate of the frequency distribution of the underlying process.
4. Plot a comparison of the theoretical mean, which is 35, and the sample mean of means, with an error bar. Vary the number of samples nT and comment on its effect on the estimate.
5. Plot a comparison of the theoretical standard deviation and the sample mean of sample standard deviations, with an error bar. Vary the number of samples nT and comment on its effect on the estimate.
6. Plot the sample means over a single period with error bars of ± one sample standard deviation of the means. This represents our best estimate of the sinusoidal heating temperature. Vary the number of samples nT and comment on the estimate.

Problem Solution:


clear; close all; % clear kernel

Generate the temperature data

The temperature data can be generated by constructing an array that is passed to a sinusoid, then "randomized" by gaussian random numbers. Note that we add 1 to np and n to avoid the sneaky fencepost error.

f = 50e-3;    % Hz ... sinusoid frequency
a = 15;       % C ... amplitude of oscillation
dc = 35;      % C ... dc offset of oscillation
fs = 1;       % Hz ... sampling frequency
nT = 12;      % number of sinusoid periods
s = 4;        % C ... standard deviation
np = fs/f+1;  % number of samples per period
n = nT*np+1;  % total number of samples

t_a = linspace(0,nT/f,n);        % time array
sin_a = dc + a*sin(2*pi*f*t_a);  % sinusoidal array
rng(43);                         % seed the random number generator
noise_a = s*randn(size(t_a));    % gaussian noise
signal_a = sin_a + noise_a;      % sinusoid + noise

Now that we have an array of "data," we're ready to plot.

h = figure;
p = plot(t_a,signal_a,'o-', ...
    'Color',[.8,.8,.8], ...
    'MarkerFaceColor','b', ...
    'MarkerEdgeColor','none', ...
    'MarkerSize',3);
xlabel('time (s)');
ylabel('temperature (C)');
hgsave(h,'figures/temp');

Figure 041: temperature over time

This is something like what we might see for continuous measurement data. Now, the histogram.

h = figure;
histogram(signal_a, ...
    30, ...                            % number of bins
    'normalization','probability' ...  % for PMF
);
xlabel('temperature (C)');
ylabel('probability');
hgsave(h,'figures/temp');

Figure 042: a poor histogram due to nonstationarity of the signal

This sucks because we plot a frequency distribution to tell us about the random variation, but this data includes the sinusoid.

Sample mean, variance, and standard deviation

To compute the sample mean µ and standard deviation s for each sample in the period, we must "pick out" the nT data points that correspond to each other. Currently, they're in one long 1 × n array, signal_a. It is helpful to reshape the data so it is in an nT × np array, with each row corresponding to a new period. This leaves the correct points aligned in columns. It is important to note that we can do this "folding" operation only when we know rather precisely the period of the underlying sinusoid; it is given in the problem that it is a controlled experiment variable. If we did not know it, we would have to estimate it, too, from the data.

signal_ar = reshape(signal_a(1:end-1),[np,nT])'; % fold into periods
size(signal_ar)     % check size
signal_ar(1:3,1:4)  % print three rows, four columns

ans =

    12    21

ans =

   30.2718   40.0946   40.8341   44.7662
   40.1836   37.2245   49.4076   46.1137
   40.0571   40.9718   46.1627   41.9145

Define the mean, variance, and standard deviation functions as "anonymous functions." We "roll our own." These are not as efficient or flexible as the built-in Matlab functions mean, var, and std, which should typically be used.

my_mean = @(vec) sum(vec)/length(vec);
my_var = @(vec) sum((vec-my_mean(vec)).^2)/(length(vec)-1);
my_std = @(vec) sqrt(my_var(vec));

Now the sample mean, variance, and standard deviations can be computed. We proceed by looping through each column of the reshaped signal array.

mu_a = NaN*ones(1,np);   % initialize mean array
var_a = NaN*ones(1,np);  % initialize var array
s_a = NaN*ones(1,np);    % initialize std array

for i = 1:np % for each column
    mu_a(i) = my_mean(signal_ar(:,i));
    var_a(i) = my_var(signal_ar(:,i));
    s_a(i) = sqrt(var_a(i)); % touch of speed
end

Composite frequency distribution

The columns represent samples. We want to subtract the mean from each column. We use repmat to reproduce mu_a in nT rows so it can be easily subtracted.

signal_arz = signal_ar - repmat(mu_a,[nT,1]); % zero-mean samples
size(signal_arz)     % check size
signal_arz(1:3,1:4)  % print three rows, four columns

ans =

    12    21

ans =

   -5.0881    0.9525   -0.2909   -1.5700
    4.8237   -1.9176    8.2826   -0.2225
    4.6972    1.8297    5.0377   -4.4216

Now that all samples have the same mean, we can lump them into one big bin for the frequency distribution. There are some nice built-in functions to do a quick reshape and fit.

signal_arzr = reshape(signal_arz,[1,nT*np]);    % resize
size(signal_arzr)                               % check size
pdfit_model = fitdist(signal_arzr','normal');   % fit
x_a = linspace(-15,15,100);
pdfit_a = pdf(pdfit_model,x_a);
pdf_a = normpdf(x_a,0,s);                       % theoretical pdf

ans =

     1   252

Plot.

h = figure;
histogram(signal_arzr, ...
    round(s*sqrt(nT)), ...             % number of bins
    'normalization','probability' ...  % for PMF
);
hold on;
plot(x_a,pdfit_a,'b-','linewidth',2); hold on;
plot(x_a,pdf_a,'g--','linewidth',2);
legend('pmf','pdf est','pdf');
xlabel('zero-mean temperature (C)');
ylabel('probability mass/density');
hgsave(h,'figures/temp');

Figure 043: PMF and estimated and theoretical PDFs

Means comparison

The sample mean of means is simply the following.

mu_mu = my_mean(mu_a)

mu_mu =

   35.1175

The standard deviation that works as an error bar, which should reflect how well we can estimate the point plotted, is the standard deviation of the means. It is difficult to compute this directly for a nonstationary process. We use the estimate given above and improve upon it by using the mean of standard deviations instead of a single sample's standard deviation.

s_mu = mean(s_a)/sqrt(nT)

s_mu =

    1.1580

Now for the simple plot.

h = figure;
bar(mu_mu); hold on;                     % gives bar
errorbar(mu_mu,s_mu,'r','linewidth',2);  % error bar
ax = gca;                                % current axis
ax.XTickLabels = {'$\overline{\overline{X}}$'};
ax.TickLabelInterpreter = 'latex';
hgsave(h,'figures/temp');

Standard deviations comparison

The sample mean of standard deviations is simply the following.

mu_s = my_mean(s_a)

mu_s =

    4.0114

The standard deviation that works as an error bar, which should reflect how well we can estimate the point plotted, is the standard deviation of the standard deviations. We can compute this directly.

s_s = my_std(s_a)

s_s =

    0.8495

Now for the simple plot.

h = figure;
bar(mu_s); hold on;                    % gives bar
errorbar(mu_s,s_s,'r','linewidth',2);  % error bars
ax = gca;                              % current axis
ax.XTickLabels = {'$\overline{S_X}$'};
ax.TickLabelInterpreter = 'latex';
hgsave(h,'figures/temp');

Figure 044: sample mean of sample means

Figure 045: sample mean of sample standard deviations

Plot a period with error bars

Plotting the data with error bars is fairly straightforward with the built-in errorbar function. The main question is: which standard deviation? Since we're plotting the means, it makes sense to plot the error bars as a single sample standard deviation of the means.

h = figure;
e1 = errorbar(t_a(1:np),mu_a,s_mu*ones(1,np),'b');
hold on;
t_a2 = linspace(0,1/f,101);
e2 = plot(t_a2,dc + a*sin(2*pi*f*t_a2),'r-');
xlim([t_a(1),t_a(np)]);
grid on;
xlabel('folded time (s)');
ylabel('temperature (C)');
legend([e1 e2],'sample mean','population mean', ...
    'Location','NorthEast');
hgsave(h,'figures/temp');

Figure 046: sample means over a period


statsconfidence Confidence

One really ought to have it to give a lecture named for it, but we'll give it a try anyway. Confidence is used in the common sense, although we do endow it with a mathematical definition to scare business majors, who aren't actually impressed but indifferent. Approximately: if, under some reasonable assumptions (a probabilistic model), we estimate the probability of some event to be P, we say we have P confidence in it. I mean, business majors are all "Supply and demand? Let's call that a 'law'," so I think we're even.

So we're back to computing probability from distributions: probability density functions (PDFs) and probability mass functions (PMFs). Usually we care most about estimating the mean of our distribution. Recall from the previous lecture that when several samples are taken, each with its own mean, the mean is itself a random variable, with a mean of its own, of course. Meanception.

But the mean has a probability distribution of its own. The central limit theorem has, as one of its implications, that as the sample size N gets large, regardless of the sample distributions, this distribution of means approaches the Gaussian distribution.

But sometimes I worry I'm being lied to, so let's check.

clear; close all; % clear kernel

Generate some data to test the central limit theorem

Data can be generated by constructing an array using a (seeded, for consistency) random number generator. Let's use a uniformly distributed PDF between 0 and 1.

N = 150;       % sample size (number of measurements per sample)
M = 120;       % number of samples
n = N*M;       % total number of measurements
mu_pop = 0.5;  % because it's a uniform PDF between 0 and 1

rng(11);               % seed the random number generator
signal_a = rand(N,M);  % uniform PDF
size(signal_a)         % check the size

ans =

   150   120

Let's take a look at the data by plotting the first ten samples (columns), as shown in Figure 047. This is something like what we might see for continuous measurement data. Now, the histogram, shown in Figure 048.

samples_to_plot = 10;
h = figure;

Figure 047: raw data with colors corresponding to samples

Figure 048: a histogram showing the approximately uniform distribution of each sample (color)

c = jet(samples_to_plot); % color array
for j = 1:samples_to_plot
    histogram(signal_a(:,j), ...
        30, ...                            % number of bins
        'facecolor',c(j,:), ...
        'facealpha',.3, ...
        'normalization','probability' ...  % for PMF
    );
    hold on;
end
hold off;
xlim([-.05,1.05]);
xlabel('measurement');
ylabel('probability');
hgsave(h,'figures/temp');

This isn't a great plot, but it shows roughly that each sample is fairly uniformly distributed.

Sample statistics

Now let's check out the sample statistics. We want the sample mean and standard deviation of each column. Let's use the built-in functions mean and std.

mu_a = mean(signal_a,1);  % mean of each column
s_a = std(signal_a,0,1);  % std of each column

Now we can compute the mean statistics: both the mean of the mean \overline{\overline{X}} and the standard deviation of the mean S_{\overline{X}}, which we don't strictly need for this part, but we're curious. We choose to use the direct estimate instead of the S_X/\sqrt{N} formula, but they should be close.

mu_mu = mean(mu_a)
s_mu = std(mu_a)

mu_mu =

    0.4987

s_mu =

    0.0236

The truth about sample means

It's the moment of truth. Let's look at the distribution, shown in Figure 049.

Figure 049: a histogram showing the approximately normal distribution of the means

h = figure;
histogram(mu_a, ...
    'normalization','probability' ... % for PMF
);
hold off;
xlabel('measurement');
ylabel('probability');
hgsave(h,'figures/temp');

This looks like a Gaussian distribution about the mean of means, so I guess the central limit theorem is legit.

Gaussian and probability

We already know how to compute the probability P that a value of a random variable X lies in a certain interval from a PMF or PDF (the sum or the integral, respectively). This means that, for sufficiently large sample size N, such that we can assume from the central limit theorem that the sample means \overline{x}_i are normally distributed, the probability a sample mean value \overline{x}_i is in a certain interval is given by integrating the Gaussian PDF. The Gaussian PDF for random variable Y representing the sample means is

f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(\frac{-(y - \mu)^2}{2\sigma^2}\right), (confidence1)

where µ is the population mean and σ is the population standard deviation.

The integral of f over some interval is the probability a value will be in that interval. Unfortunately, that integral is uncool. It gives rise to the definition of the error function, which for the Gaussian random variable Y is

\mathrm{erf}(y_b) = \frac{1}{\sqrt{\pi}} \int_{-y_b}^{y_b} e^{-t^2}\, dt. (confidence2)

This expresses the probability of a sample mean being in the interval [-y_b, y_b] if Y has mean 0 and variance 1/2.

Matlab has this built in as erf, shown in Figure 0410.

y_a = linspace(0,3,100);
h = figure;
p1 = plot(y_a,erf(y_a));
p1.LineWidth = 2;
grid on;
xlabel('interval bound $y_b$','interpreter','latex');
ylabel('error function $\textrm{erf}(y_b)$', ...
    'interpreter','latex');
hgsave(h,'figures/temp');

Figure 0410: the error function

We could deal directly with the error function, but most people don't, and we're weird enough as it is. Instead, people use the Gaussian cumulative distribution function (CDF) Φ : ℝ → ℝ, which is defined as

\Phi(z) = \frac{1}{2}\left(1 + \mathrm{erf}\left(\frac{z}{\sqrt{2}}\right)\right) (confidence3)

and which expresses the probability of a Gaussian random variable Z with mean 0 and standard deviation 1 taking on a value in the interval (-\infty, z]. The Gaussian CDF and PDF are shown in Figure 0411. Values can be taken directly from the graph, but it's more accurate to use the table of values in Appendix A01.

That's great and all, but occasionally (always) we have Gaussian random variables with nonzero means and nonunity standard deviations. It turns out we can shift any Gaussian random variable by its mean and scale it by its standard deviation to make it have zero mean and unity standard deviation. We can then use Φ and interpret the results as being relative to the mean and standard deviation, using phrases like "the probability it is within two standard deviations of its mean." The transformed random variable Z and its values z are sometimes called the z-score. For a particular value x of a random variable X, we can compute its z-score (or value z of random variable Z) with the formula

z = \frac{x - \mu_X}{\sigma_X} (confidence4)

and compute the probability of X taking on a value within the interval, say x \in [x_{b-}, x_{b+}], from the table. (Sample statistics \overline{X} and S_X are appropriate when population statistics are unknown.)

Figure 0411: the Gaussian PDF and CDF for z-scores; (a) the Gaussian PDF, (b) the Gaussian CDF

For instance, compute the probability a Gaussian random variable X with \mu_X = 5 and \sigma_X = 2.34 takes on a value within the interval x \in [3, 6].

1. Compute the z-score of each endpoint of the interval:

z_3 = \frac{3 - \mu_X}{\sigma_X} \approx -0.85, (confidence5)
z_6 = \frac{6 - \mu_X}{\sigma_X} \approx 0.43. (confidence6)

2. Look up the CDF values for z_3 and z_6, which are \Phi(z_3) = 0.1977 and \Phi(z_6) = 0.6664.

3. The CDF values correspond to the probabilities x < 3 and x < 6. Therefore, to find the probability x lies in that interval, we subtract the lower-bound probability:

P(x \in [3, 6]) = P(x < 6) - P(x < 3) (confidence7)
= \Phi(z_6) - \Phi(z_3) (confidence8)
\approx 0.6664 - 0.1977 (confidence9)
\approx 0.4689. (confidence10)

So there is a 46.89% probability, and therefore we have 46.89% confidence, that x \in [3, 6].
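The same computation can be sketched with Python's standard library (my own check, not part of the text), using (confidence3), Φ(z) = (1 + erf(z/√2))/2:

```python
from math import erf, sqrt

mu_X, sigma_X = 5.0, 2.34

def Phi(z):
    # Gaussian CDF via the error function, per (confidence3)
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

z3 = (3 - mu_X) / sigma_X  # ~ -0.85
z6 = (6 - mu_X) / sigma_X  # ~ 0.43

P = Phi(z6) - Phi(z3)
print(round(P, 3))  # ~ 0.469
```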

Often we want to go the other way estimating

the symmetric interval [xbndash xb+] for which

there is a given probability In this case we

first look up the z-score corresponding to a

certain probability For concreteness given

the same population statistics above letrsquos find

the symmetric interval [xbndash xb+] over which we

have 90 confidence From the table we want

two symmetric z-scores that have CDFshyvalue

difference 09 Or in maths

Φ(zb+) ndashΦ(zbndash) = 09 and zb+ = ndashzbndash

(confidence11)

stats Statistics confidence Confidence p 6

Due to the latter relation and the additional fact that the Gaussian CDF has antisymmetry,

Φ(z_b+) + Φ(z_b–) = 1. (confidence12)

Adding the two Φ equations,

Φ(z_b+) = 1.9/2 (confidence13)
= 0.95, (confidence14)

and Φ(z_b–) = 0.05. From the table, these correspond (with a linear interpolation) to z_b = z_b+ = –z_b– ≈ 1.645. All that remains is to solve the z-score formula for x:

x = μ_X + zσ_X. (confidence15)

From this,

x_b+ = μ_X + z_b+σ_X ≈ 8.849 (confidence16)
x_b– = μ_X + z_b–σ_X ≈ 1.151, (confidence17)

and X has a 90% confidence interval [1.151, 8.849].
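This inverse calculation can be sketched with the standard library's inverse CDF in place of the table lookup (not part of the original text):

```python
# Find the symmetric 90% interval for X ~ N(5, 2.34) via the
# inverse CDF: Phi(z_b+) = 0.95 for a 90% symmetric interval.
from statistics import NormalDist

mu_X, sigma_X = 5, 2.34
z_b = NormalDist().inv_cdf(0.95)   # standard-normal quantile
x_lo = mu_X - z_b * sigma_X
x_hi = mu_X + z_b * sigma_X
print(round(z_b, 3), round(x_lo, 3), round(x_hi, 3))  # → 1.645 1.151 8.849
```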


Example statsconfidence-1: re gaussian confidence for a mean

Problem Statement: Consider the data set generated above. What is our 95% confidence interval in our estimate of the mean?

Problem Solution:

Assuming we have a sufficiently large data set, the distribution of means is approximately Gaussian. Following the same logic as above, we need the z-score that gives an upper CDF value of (1 + 0.95)/2 = 0.975. From the table, we obtain the z_b = z_b+ = –z_b– below:

z_b = 1.96;

Now we can estimate the mean using our sample and mean statistics:

μ_X = X̄ ± z_b S_X̄. (confidence18)

mu_x_95 = mu_mu + [-z_b,z_b]*s_mu

mu_x_95 =
    0.4526    0.5449

This is our 95% confidence interval in our estimate of the mean.
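A minimal sketch of the same recipe for an arbitrary sample (not from the text; the synthetic sample and its parameters are made up for illustration): for large N, the 95% interval for the mean is x̄ ± 1.96 s/√N.

```python
# 95% confidence interval for a mean from a synthetic sample
# (illustrative data, not the text's data set).
import math
import random
import statistics

random.seed(1)
sample = [random.gauss(0.5, 0.05) for _ in range(1000)]
xbar = statistics.mean(sample)
s = statistics.stdev(sample)
z_b = 1.96                              # 0.975 standard-normal quantile
half = z_b * s / math.sqrt(len(sample)) # half-width of the interval
print(round(xbar - half, 3), round(xbar + half, 3))
```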


Student's t-distribution

degrees of freedom

statsstudent Student confidence

The central limit theorem tells us that, for large sample size N, the distribution of the means is Gaussian. However, for small sample size, the Gaussian isn't as good an estimate. Student's t-distribution is superior for lower sample size and equivalent at higher sample size. Technically, if the population standard deviation σ_X is known, even for low sample size we should use the Gaussian distribution. However, this rarely arises in practice, so we can usually get away with an "always t" approach.

A way that the t-distribution accounts for low N is by having an entirely different distribution for each N (seems a bit of a cheat to me). Actually, instead of N, it uses the degrees of freedom ν, which is N minus the number of parameters required to compute the statistic. Since the standard deviation requires only the mean, for most of our cases ν = N – 1.

As with the Gaussian distribution, the t-distribution's integral is difficult to calculate. Typically we will use a t-table, such as the one given in Appendix A02. There are three points of note:

1. Since we are primarily concerned with going from probability/confidence values (e.g. P probability/confidence) to intervals, typically there is a column for each probability.


2. The extra parameter ν takes over one of the dimensions of the table, because three-dimensional tables are illegal.

3. Many of these tables are "two-sided," meaning their t-scores and probabilities assume you want the symmetric probability about the mean over the interval [–t_b, t_b], where t_b is your t-score bound.

Consider the following example.

Example statsstudent-1: re student confidence interval

Problem Statement: Write a Matlab script to generate a data set with 200 samples and sample sizes N ∈ {10, 20, 100}, using any old distribution. Compare the distribution of the means for the different N. Use the sample distributions and a t-table to compute 99% confidence intervals.

Problem Solution:

Generate the data set:

confidence = 0.99; % requirement
M = 200; % number of samples
N_a = [10,20,100]; % sample sizes
mu = 27; % population mean
sigma = 9; % population std
rng(1); % seed random number generator
data_a = mu + sigma*randn(N_a(end),M); % normal


size(data_a) % check size
data_a(1:10,1:5) % check 10 rows and five columns

ans =
   100   200

ans =
   21.1589   30.2894   27.8705   30.7835   28.3662
   37.6305   17.1264   28.2973   24.0811   34.3486
   20.1739   44.3719   43.7059   39.0699   32.2002
   17.0135   32.6064   36.9030   37.9230   36.5747
   19.3900   32.9156   23.7230   22.4749   19.7709
   21.8460   13.8295   31.2479   16.9527   34.1876
   21.9719   34.6854   19.4480   18.7014   24.1642
   28.6054   32.2244   22.2873   26.9906   37.6746
   25.2282   18.7326   14.5011   28.3814   27.7645
   32.2780   34.1538   27.0382   18.8643   14.1752

Compute the means for different sample sizes:

mu_a = NaN*ones(length(N_a),M);
for i = 1:length(N_a)
  mu_a(i,:) = mean(data_a(1:N_a(i),1:M),1);
end

Plotting the distribution of the means yields Figure 0412.

Figure 0412: a histogram showing the distribution of the means for each sample size, with intervals for N = 10, 20, and 100

It makes sense that the larger the sample size, the smaller the spread. A quantitative metric for the spread is, of course, the standard deviation of the means for each sample size:

S_mu = std(mu_a,0,2)

S_mu =
    2.8365
    2.0918
    1.0097

Look up t-table values, or use Matlab's tinv, for different sample sizes and 99% confidence. Use these, the mean of means, and the standard deviation of means to compute the 99% confidence interval for each N:

t_a = tinv(confidence,N_a-1)
for i = 1:length(N_a)
  interval = mean(mu_a(i,:)) + [-1,1]*t_a(i)*S_mu(i);
  disp(sprintf('interval for N = %i:',N_a(i)))
  disp(interval)
end


t_a =
    2.8214    2.5395    2.3646

interval for N = 10:
   19.0942   35.1000

interval for N = 20:
   21.6292   32.2535

interval for N = 100:
   24.7036   29.4787

As expected, the larger the sample size, the smaller the interval over which we have 99% confidence in the estimate.


joint PDF

joint PMF

statsmultivar Multivariate probability and correlation

Thus far, we have considered probability density and mass functions (PDFs and PMFs) of only one random variable. But, of course, often we measure multiple random variables X₁, X₂, ..., X_n during a single event, meaning a measurement consists of determining values x₁, x₂, ..., x_n of these random variables.

We can consider an n-tuple of continuous random variables to form a sample space Ω = ℝⁿ, on which a multivariate function f: ℝⁿ → ℝ, called the joint PDF, assigns a probability density to each outcome x ∈ ℝⁿ. The joint PDF must be greater than or equal to zero for all x ∈ ℝⁿ, the multiple integral over Ω must be unity, and the multiple integral over a subset of the sample space A ⊂ Ω is the probability of the event A.

We can consider an n-tuple of discrete random variables to form a sample space ℕ₀ⁿ, on which a multivariate function f: ℕ₀ⁿ → ℝ, called the joint PMF, assigns a probability to each outcome x ∈ ℕ₀ⁿ. The joint PMF must be greater than or equal to zero for all x ∈ ℕ₀ⁿ, the multiple sum over Ω must be unity, and the multiple sum over a subset of the sample space A ⊂ Ω is the probability of the event A.
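These joint-PMF properties can be checked on a small, made-up example (not from the text): the joint PMF of two fair dice.

```python
# Illustrative joint PMF for two fair dice: check that it sums to
# unity over the sample space, and compute P(A) for the event
# A = {x1 + x2 = 7} by summing the PMF over A.
import numpy as np

f = np.full((6, 6), 1/36)          # joint PMF of two fair dice
assert np.isclose(f.sum(), 1.0)    # multiple sum over Omega is unity
i, j = np.indices(f.shape)         # index grids for the outcomes
p_seven = f[(i + 1) + (j + 1) == 7].sum()
print(round(p_seven, 4))  # → 0.1667
```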


Example statsmultivar-1: re bivariate gaussian pdf

Problem Statement: Let's visualize multivariate PDFs by plotting a bivariate Gaussian using Matlab's mvnpdf function.

Problem Solution:

mu = [10,20]; % means
Sigma = [1,0;0,2]; % cov -- we'll talk about this
x1_a = linspace(...
  mu(1)-5*sqrt(Sigma(1,1)),mu(1)+5*sqrt(Sigma(1,1)),30);
x2_a = linspace(...
  mu(2)-5*sqrt(Sigma(2,2)),mu(2)+5*sqrt(Sigma(2,2)),30);
[X1,X2] = meshgrid(x1_a,x2_a);
f = mvnpdf([X1(:), X2(:)],mu,Sigma);
f = reshape(f,length(x2_a),length(x1_a));

h = figure;
p = surf(x1_a,x2_a,f);
xlabel('$x_1$','interpreter','latex')
ylabel('$x_2$','interpreter','latex')
zlabel('$f(x_1,x_2)$','interpreter','latex')
shading interp
hgsave(h,'figures/temp')

The result is Figure 0413. Note how the means and standard deviations affect the distribution.



marginal PDF

Figure 0413: two-variable Gaussian PDF

Marginal probability

The marginal PDF of a multivariate PDF is the PDF of some subspace of Ω, after one or more variables have been "integrated out," such that fewer random variables remain. Of course, these marginal PDFs must have the same properties as any PDF, such as integrating to unity.

Example statsmultivar-2: re bivariate gaussian marginal probability

Problem Statement: Let's demonstrate this by numerically integrating over x2 in the bivariate Gaussian above.

Problem Solution:

Continuing from where we left off, let's integrate:

f1 = trapz(x2_a,f); % trapezoidal integration over x2

Plotting this yields Figure 0414:

h = figure;
p = plot(x1_a,f1);
p.LineWidth = 2;
xlabel('$x_1$','interpreter','latex')
ylabel('$g(x_1)=\int_{-\infty}^{\infty} f(x_1,x_2)\,d x_2$','interpreter','latex')
hgsave(h,'figures/temp')

Figure 0414: marginal Gaussian PDF g(x1)

We should probably verify that this integrates to one:

disp(['integral over x_1 = ',sprintf('%0.7f',trapz(x1_a,f1))])

integral over x_1 = 0.9999986

Not bad.

Covariance

Very often, especially in machine learning or artificial intelligence applications, the question about two random variables X and Y is: how do they co-vary? That is, what is their covariance, defined as

Cov[X, Y] ≡ E((X – μ_X)(Y – μ_Y)) = E(XY) – μ_X μ_Y.

Note that when X = Y, the covariance is just the variance. When a covariance is large and positive, it is an indication that the random variables are strongly correlated. When it is large and negative, they are strongly anti-correlated. Zero covariance means the variables are uncorrelated. In fact, correlation is defined as

Cor[X, Y] = Cov[X, Y] / √(Var[X] Var[Y]).

This is essentially the covariance "normalized" to the interval [–1, 1].

Sample covariance

As with the other statistics we've considered, covariance can be estimated from measurement. The estimate, called the sample covariance q_XY, of random variables X and Y with sample size N is given by

q_XY = 1/(N – 1) ∑_{i=1}^{N} (x_i – X̄)(y_i – Ȳ).
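The sample-covariance formula can be checked against a library estimator. This sketch is not from the text; the synthetic data are made up, and it relies on the fact that NumPy's cov uses the same N – 1 divisor.

```python
# Sample covariance from the definition, compared to np.cov.
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=50)
y = 2 * x + rng.normal(size=50)   # correlated with x by construction
N = len(x)
q_xy = np.sum((x - x.mean()) * (y - y.mean())) / (N - 1)
print(round(q_xy, 3))
```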

Multivariate covariance

With n random variables X_i, one can compute the covariance of each pair. It is common practice to define an n × n matrix of covariances, called the covariance matrix Σ, such that each pair's covariance

Cov[X_i, X_j] (multivar1)

appears in its row-column combination (making it symmetric), as shown below:

Σ =
⎡ Cov[X₁, X₁]  Cov[X₁, X₂]  ···  Cov[X₁, X_n] ⎤
⎢ Cov[X₂, X₁]  Cov[X₂, X₂]  ···  Cov[X₂, X_n] ⎥
⎢      ⋮            ⋮                 ⋮       ⎥
⎣ Cov[X_n, X₁]  Cov[X_n, X₂]  ···  Cov[X_n, X_n] ⎦

The multivariate sample covariance matrix Q is the same as above, but with sample covariances q_{X_i X_j}.

Both covariance matrices have correlation analogs.
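The pairwise construction above can be sketched in NumPy (illustrative, not from the text): build Q entry by entry and confirm it is symmetric and matches the all-at-once estimate.

```python
# Build the sample covariance matrix Q pairwise and check it
# against NumPy's matrix estimator (rows are observations).
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=(100, 3))   # 100 samples of n = 3 variables
n = data.shape[1]
Q = np.empty((n, n))
for i in range(n):
    for j in range(n):
        Q[i, j] = np.cov(data[:, i], data[:, j])[0, 1]
print(Q.shape)
```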

Example statsmultivar-3: re car data sample covariance and correlation

Problem Statement: Let's use a built-in multivariate data set that describes different features of cars, listed below:

d = load('carsmall.mat'); % this is a struct

Let's compute the sample covariance and correlation matrices.

Problem Solution:

variables = {'MPG','Cylinders','Displacement',...
  'Horsepower','Weight','Acceleration','Model_Year'};
n = length(variables);
m = length(d.MPG);
data = NaN*ones(m,n); % preallocate
for i = 1:n
  data(:,i) = d.(variables{i});
end
cov_d = nancov(data); % sample covariance
cor_d = corrcov(cov_d); % sample correlation

This is highly correlated/anticorrelated data. Let's plot each variable versus each other variable to see the correlations of each. We use a red-to-blue colormap to contrast anticorrelation and correlation; purple, then, is uncorrelated.

The following builds the red-to-blue colormap:

n_colors = 10;
cmap_rb = NaN*ones(n_colors,3);
for i = 1:n_colors
  a = i/n_colors;
  cmap_rb(i,:) = (1-a)*[1,0,0]+a*[0,0,1];
end

h = figure;
for i = 1:n
  for j = 1:n
    subplot(n,n,sub2ind([n,n],j,i))
    p = plot(d.(variables{i}),d.(variables{j}),'.'); hold on
    this_color = cmap_rb(round((cor_d(i,j)+1)*(n_colors-1)/2)+1,:);
    p.MarkerFaceColor = this_color;
    p.MarkerEdgeColor = this_color;
  end
end
hgsave(h,'figures/temp')

Figure 0415: car data correlation

Conditional probability and dependence

Independent variables are uncorrelated. However, uncorrelated variables may or may not be independent. Therefore, we cannot use correlation alone as a test for independence. For instance, for random variables X and Y, where X has some even distribution and Y = X², clearly the variables are dependent. However, they are also uncorrelated (due to symmetry).


Example statsmultivar-4: re uncorrelated but dependent variables

Problem Statement: Using a uniform distribution U(–1, 1), show that X and Y are uncorrelated (but dependent), with Y = X², with some sampling. We compute the correlation for different sample sizes.

Problem Solution:

N_a = round(linspace(10,500,100));
qc_a = NaN*ones(size(N_a));
rng(6) % seed random number generator
x_a = -1 + 2*rand(max(N_a),1);
y_a = x_a.^2;
for i = 1:length(N_a)
  % should write incremental algorithm, but lazy
  q = cov(x_a(1:N_a(i)),y_a(1:N_a(i)));
  qc = corrcov(q);
  qc_a(i) = qc(2,1); % cross correlation
end

The absolute values of the correlations are shown in Figure 0416. Note that we should probably average several such curves to estimate how the correlation would drop off with N, but the single curve describes our understanding that the correlation, in fact, approaches zero in the large-sample limit.

h = figure;
p = plot(N_a,abs(qc_a));
p.LineWidth = 2;
xlabel('sample size $N$','interpreter','latex')
ylabel('absolute sample correlation','interpreter','latex')
hgsave(h,'figures/temp')

Figure 0416: absolute value of the sample correlation between X ∼ U(–1, 1) and Y = X² for different sample size N. In the limit, the population correlation should approach zero, and yet X and Y are not independent.


statsregression Regression

Suppose we have a sample with two measurands: (1) the force F through a spring and (2) its displacement X (not from equilibrium). We would like to determine an analytic function that relates the variables, perhaps for prediction of the force given another displacement.

There is some variation in the measurement. Let's say the following is the sample:

X_a = 1e-3*[10,21,30,41,49,50,61,71,80,92,100]; % m
F_a = [50.1,50.4,53.2,55.9,57.2,59.9,...
       61.0,63.9,67.0,67.9,70.3]; % N

Let's take a look at the data. The result is Figure 0417:

h = figure;
p = plot(X_a*1e3,F_a,'b.','MarkerSize',15);
xlabel('$X$ (mm)','interpreter','latex')
ylabel('$F$ (N)','interpreter','latex')
xlim([0,max(X_a*1e3)])
grid on
hgsave(h,'figures/temp')

Figure 0417: force-displacement data

How might we find an analytic function that agrees with the data? Broadly, our strategy will be to assume a general form of a function and use the data to set the parameters in the function such that the difference between the data and the function is minimal.

Let y be the analytic function that we would like to fit to the data. Let y_i denote the value of y(x_i), where x_i is the ith value of the random variable X from the sample. Then we want to minimize the differences between the force measurements F_i and y_i.

From calculus, recall that we can minimize a function by differentiating it and solving for the zero-crossings (which correspond to local maxima or minima).

First, we need such a function to minimize. Perhaps the simplest effective function D is constructed by squaring and summing the differences we want to minimize; for sample size N,

D(x_i) = ∑_{i=1}^{N} (F_i – y_i)² (regression1)


(recall that y_i = y(x_i), which makes D a function of x).

Now the form of y must be chosen. We consider only mth-order polynomial functions y, but others can be used in a similar manner:

y(x) = a₀ + a₁x + a₂x² + ··· + a_m x^m. (regression2)

If we treat D as a function of the polynomial coefficients a_j, i.e.

D(a₀, a₁, ···, a_m), (regression3)

and minimize D for each value of x_i, we must take the partial derivatives of D with respect to each a_j and set each equal to zero:

∂D/∂a₀ = 0, ∂D/∂a₁ = 0, ···, ∂D/∂a_m = 0.

This gives us N equations and m + 1 unknowns a_j. Writing the system in matrix form,

⎡ 1  x₁  x₁²  ···  x₁^m ⎤ ⎡ a₀ ⎤   ⎡ F₁ ⎤
⎢ 1  x₂  x₂²  ···  x₂^m ⎥ ⎢ a₁ ⎥   ⎢ F₂ ⎥
⎢ 1  x₃  x₃²  ···  x₃^m ⎥ ⎢ a₂ ⎥ = ⎢ F₃ ⎥ (regression4)
⎢ ⋮   ⋮    ⋮         ⋮  ⎥ ⎢ ⋮  ⎥   ⎢ ⋮  ⎥
⎣ 1  x_N  x_N² ··· x_N^m ⎦ ⎣ a_m ⎦   ⎣ F_N ⎦
   A, N×(m+1)            a, (m+1)×1   b, N×1

Typically N > m, and this is an overdetermined system. Therefore, we usually can't solve by taking A⁻¹, because A doesn't have an inverse. Instead, we either find the Moore-Penrose pseudo-inverse A† and have a = A†b as the solution, which is inefficient, or we can approximate b with an algorithm such as that used by Matlab's \ operator. In the latter case,

a_a = A\b_a;
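The same overdetermined setup can be sketched in NumPy, swapping Matlab's \ operator for numpy.linalg.lstsq (an assumption of this sketch, not the text's method): build the Vandermonde matrix A and solve A a = b in the least-squares sense. The data are the spring sample above.

```python
# Least-squares polynomial fit of the force-displacement sample.
import numpy as np

X_a = 1e-3 * np.array([10, 21, 30, 41, 49, 50, 61, 71, 80, 92, 100])  # m
F_a = np.array([50.1, 50.4, 53.2, 55.9, 57.2, 59.9,
                61.0, 63.9, 67.0, 67.9, 70.3])                        # N
m = 1                                        # first-order (line) fit
A = np.vander(X_a, m + 1, increasing=True)   # columns 1, x, ..., x^m
a, *_ = np.linalg.lstsq(A, F_a, rcond=None)  # solves A a = b
print(a.round(2))  # [intercept (N), slope (N/m)]
```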

Example statsregression-1: re regression

Problem Statement: Use Matlab's \ operator to find a good polynomial fit for the sample. There's the sometimes-difficult question: "what order should we fit?" Let's try out several and see what the squared-differences function D gives.

Problem Solution:

N = length(X_a); % sample size
m_a = 2:N; % all the orders up to N
A = NaN*ones(length(m_a),max(m_a),N);
for k = 1:length(m_a) % each order
  for j = 1:N % each measurement
    for i = 1:( m_a(k) + 1 ) % each coef
      A(k,j,i) = X_a(j)^(i-1);
    end
  end
end
disp(squeeze(A(2,:,1:5)))

    1.0000    0.0100    0.0001    0.0000       NaN
    1.0000    0.0210    0.0004    0.0000       NaN
    1.0000    0.0300    0.0009    0.0000       NaN
    1.0000    0.0410    0.0017    0.0001       NaN
    1.0000    0.0490    0.0024    0.0001       NaN
    1.0000    0.0500    0.0025    0.0001       NaN
    1.0000    0.0610    0.0037    0.0002       NaN
    1.0000    0.0710    0.0050    0.0004       NaN
    1.0000    0.0800    0.0064    0.0005       NaN
    1.0000    0.0920    0.0085    0.0008       NaN
    1.0000    0.1000    0.0100    0.0010       NaN

We've printed the first five columns of the third-order matrix, which only has four columns, so NaNs fill in the rest.

Now we can use the \ operator to solve for the coefficients:

a = NaN*ones(length(m_a),max(m_a));
warning('off','all')
for i = 1:length(m_a)
  A_now = squeeze(A(i,:,1:m_a(i)));
  a(i,1:m_a(i)) = (A_now(:,1:m_a(i))\F_a')';
end
warning('on','all')

n_plot = 100;
x_plot = linspace(min(X_a),max(X_a),n_plot);
y = NaN*ones(n_plot,length(m_a)); % preallocate
for i = 1:length(m_a)
  y(:,i) = polyval(fliplr(a(i,1:m_a(i))),x_plot);
end

h = figure;
for i = 1:2:length(m_a)-1
  p = plot(x_plot*1e3,y(:,i),'linewidth',1.5);
  hold on
  p.DisplayName = sprintf('order %i',(m_a(i)-1));


end
p = plot(X_a*1e3,F_a,'b.','MarkerSize',15);
xlabel('$X$ (mm)','interpreter','latex')
ylabel('$F$ (N)','interpreter','latex')
p.DisplayName = 'sample';
legend('show','location','southeast')
grid on
hgsave(h,'figures/temp')

Figure 0418: force-displacement data with curve fits of orders 1, 3, 5, 7, and 9


statsex Exercises for Chapter 04

calculus

limit

series

derivative

integral

vector calculus

vecs Vector calculus

A great many physical situations of interest to

engineers can be described by calculus It can

describe how quantities continuously change

over (say) time and gives tools for computing

other quantities We assume familiarity with

the fundamentals of calculus limit series

derivative and integral From these and a

basic grasp of vectors we will outline some

of the highlights of vector calculus Vector

calculus is particularly useful for describing

the physics of for instance the following

mechanics of particles wherein is studied

the motion of particles and the forcing

causes thereof

rigid-body mechanics wherein is studied the

motion rotational and translational and

its forcing causes of bodies considered

rigid (undeformable)

solid mechanics wherein is studied the

motion and deformation and their forcing

causes of continuous solid bodies (those

that retain a specific resting shape)


complex analysis

1. For an introduction to complex analysis, see Kreyszig, Advanced Engineering Mathematics, Part D.
2. Ibid., Chapters 9, 10.
3. H.M. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus. W.W. Norton, 2005. ISBN 9780393925166.

euclidean vector space

bases

components

basis vectors

invariant

coordinate system

Cartesian coordinates

fluid mechanics wherein is studied the

motion and its forcing causes of fluids

(liquids gases plasmas)

heat transfer wherein is studied the

movement of thermal energy through and

among bodies

electromagnetism wherein is studied

the motion and its forcing causes of

electrically charged particles

This last example was in fact very influential

in the original development of both vector

calculus and complex analysis1 It is not

an exaggeration to say that the topics above

comprise the majority of physical topics of

interest in engineering

A good introduction to vector calculus is

given by Kreyszig2 Perhaps the most famous

and enjoyable treatment is given by Schey3

in the adorably titled Div Grad Curl and All

that

It is important to note that in much of what

follows we will describe (typically the three-

dimensional space of our lived experience) as

a euclidean vector space, an n-dimensional

vector space isomorphic to Rn As we know

from linear algebra any vector v isin Rn can be

expressed in any number of bases That is the

vector v is a basis-free object with multiple

basis representations The components and

basis vectors of a vector change with basis

changes but the vector itself is invariant A

coordinate system is in fact just a basis We

are most familiar of course with Cartesian


manifolds

non-euclidean geometry

parallel postulate

differential geometry

4. John M. Lee, Introduction to Smooth Manifolds, second ed. Vol. 218. Graduate Texts in Mathematics. Springer, 2012.
5. Francesco Bullo and Andrew D. Lewis, Geometric Control of Mechanical Systems: Modeling, Analysis, and Design for Simple Mechanical Control Systems. Ed. by J.E. Marsden, L. Sirovich, and M. Golubitsky. Springer, 2005.

coordinates, which is the specific orthonormal basis b for ℝⁿ:

b₁ = [1, 0, ···, 0]ᵀ, b₂ = [0, 1, ···, 0]ᵀ, ···, b_n = [0, 0, ···, 1]ᵀ. (vecs1)

Manifolds are spaces that appear locally

as Rn but can be globally rather different

and can describe non-euclidean geometry

wherein euclidean geometryrsquos parallel

postulate is invalid Calculus on manifolds

is the focus of differential geometry a

subset of which we can consider our current

study A motivation for further study of

differential geometry is that it is very

convenient when dealing with advanced

applications of mechanics such as rigid-body

mechanics of robots and vehicles A very nice

mathematical introduction is given by Lee4 and

Bullo and Lewis5 give a compact presentation

in the context of robotics

Vector fields have several important

properties of interest wersquoll explore in this

chapter Our goal is to gain an intuition of

these properties and be able to perform basic

calculation


surface integral

flux

6. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 21-30.
7. Kreyszig, Advanced Engineering Mathematics, § 10.6.

source

sink

incompressible

vecsdiv Divergence, surface integrals, and flux

Flux and surface integrals

Consider a surface S. Let r(u, v) = [x(u, v), y(u, v), z(u, v)] be a parametric position vector on a Euclidean vector space ℝ³. Furthermore, let F: ℝ³ → ℝ³ be a vector-valued function of r, and let n be a unit-normal vector on the surface S. The surface integral is

∬_S F · n dS, (div1)

which integrates the normal component of F over the surface. We call this quantity the flux of F out of the surface S. This terminology comes from fluid flow, for which the flux is the mass flow rate out of S. In general, the flux is a measure of a quantity (or field) passing through a surface. For more on computing surface integrals, see Schey⁶ and Kreyszig⁷.

Continuity

Consider the flux out of a surface S that encloses a volume ΔV, divided by that volume:

(1/ΔV) ∬_S F · n dS. (div2)

This gives a measure of flux per unit volume for a volume of space. Consider its physical meaning when we interpret this as fluid flow: all fluid that enters the volume is negative flux and all that leaves is positive. If physical conditions are such that we expect no fluid to enter or exit the volume via what is called a source or a sink, and if we assume the density of the fluid is uniform (this is called an incompressible fluid), then all the fluid that enters the volume must exit, and we get

(1/ΔV) ∬_S F · n dS = 0. (div3)

This is called a continuity equation although

typically this name is given to equations of the

form in the next section This equation is one of

the governing equations in continuum mechanics

Divergence

Let's take the flux-per-volume as the volume ΔV → 0; we obtain the following.

Equation div4: divergence, integral form

lim_{ΔV→0} (1/ΔV) ∬_S F · n dS

This is called the divergence of F, and is defined at each point in ℝ³ by taking the volume to zero about it. It is given the shorthand div F.

What interpretation can we give this quantity? It is a measure of the vector field's flux outward through a surface containing an infinitesimal volume. When we consider a fluid, a positive divergence is a local decrease in density and a negative divergence is a density increase. If the fluid is incompressible and has no sources or sinks, we can write the continuity equation

div F = 0. (div5)

In the Cartesian basis, it can be shown that the divergence is easily computed from the field

F = F_x i + F_y j + F_z k (div6)

as follows.

Equation div7: divergence, differential form

div F = ∂_x F_x + ∂_y F_y + ∂_z F_z
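The integral and differential forms can be connected numerically (a sketch, not from the text): approximate the divergence at a point as the net flux through a tiny square divided by its area, for a field whose differential-form divergence is known.

```python
# Approximate div F at (x0, y0) as flux through an h-by-h square,
# divided by its area; for F = (x^2, y^2), div F = 2x + 2y.
def flux_per_area(Fx, Fy, x0, y0, h=1e-4):
    # midpoint rule on the four faces of the square
    flux = h * (Fx(x0 + h/2, y0) - Fx(x0 - h/2, y0)
              + Fy(x0, y0 + h/2) - Fy(x0, y0 - h/2))
    return flux / h**2

Fx = lambda x, y: x**2
Fy = lambda x, y: y**2
print(round(flux_per_area(Fx, Fy, 1.0, 2.0), 6))  # → 6.0, i.e. 2(1) + 2(2)
```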

Exploring divergence

Divergence is perhaps best explored by considering it for a vector field in ℝ². Such a field F = F_x i + F_y j can be represented as a "quiver" plot. If we overlay the quiver plot over a "color density" plot representing div F, we can increase our intuition about the divergence.

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: div_surface_integrals_flux.ipynb
notebook kernel: python3


First, load some Python packages:

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and functions:

var('x,y')
F_x = Function('F_x')(x,y)
F_y = Function('F_y')(x,y)

Rather than repeat code, let's write a single function quiver_plotter_2D to make several of these plots:

def quiver_plotter_2D(
    field={F_x: x*y, F_y: x*y},
    grid_width=3,
    grid_decimate_x=8,
    grid_decimate_y=8,
    norm=Normalize(),
    density_operation='div',
    print_density=True,
):
    # define symbolics
    var('x,y')
    F_x = Function('F_x')(x,y)
    F_y = Function('F_y')(x,y)
    field_sub = field
    # compute density
    if density_operation == 'div':
        den = F_x.diff(x) + F_y.diff(y)
    elif density_operation == 'curl':  # in the k direction
        den = F_y.diff(x) - F_x.diff(y)
    else:
        raise ValueError('div and curl are the only density operators')
    den_simp = den.subs(field_sub).doit().simplify()
    if den_simp.is_constant():
        print('Warning: density operator is constant (no density plot)')
    if print_density:
        print(f'The {density_operation} is:')
        display(den_simp)
    # lambdify for numerics
    F_x_sub = F_x.subs(field_sub)
    F_y_sub = F_y.subs(field_sub)
    F_x_fun = lambdify((x,y),F_x.subs(field_sub),'numpy')
    F_y_fun = lambdify((x,y),F_y.subs(field_sub),'numpy')
    if F_x_sub.is_constant():
        F_x_fun1 = F_x_fun  # dummy
        F_x_fun = lambda x,y: F_x_fun1(x,y)*np.ones(x.shape)
    if F_y_sub.is_constant():
        F_y_fun1 = F_y_fun  # dummy
        F_y_fun = lambda x,y: F_y_fun1(x,y)*np.ones(x.shape)
    if not den_simp.is_constant():
        den_fun = lambdify((x,y),den_simp,'numpy')
    # create grid
    w = grid_width
    Y, X = np.mgrid[-w:w:100j, -w:w:100j]
    # evaluate numerically
    F_x_num = F_x_fun(X,Y)
    F_y_num = F_y_fun(X,Y)
    if not den_simp.is_constant():
        den_num = den_fun(X,Y)
    # plot
    p = plt.figure()
    if not den_simp.is_constant():  # colormesh
        cmap = plt.get_cmap('PiYG')
        im = plt.pcolormesh(X,Y,den_num,cmap=cmap,norm=norm)
        plt.colorbar()
    # quiver
    dx = grid_decimate_y
    dy = grid_decimate_x
    plt.quiver(
        X[::dx,::dy],Y[::dx,::dy],
        F_x_num[::dx,::dy],F_y_num[::dx,::dy],
        units='xy',scale=10,
    )
    plt.title(f'F(x,y) = [{F_x.subs(field_sub)},{F_y.subs(field_sub)}]')
    return p

Note that, while we're at it, we included a hook for density plots of the curl of F; we'll return to this in a later lecture.

Let's inspect several cases.

p = quiver_plotter_2D(field={F_x: x**2, F_y: y**2})

The div is

2x + 2y

Figure 051: quiver plot of F(x,y) = [x², y²] over a color density plot of its divergence

p = quiver_plotter_2D(field={F_x: x*y, F_y: x*y})

The div is

x + y

p = quiver_plotter_2D(field={F_x: x**2+y**2, F_y: x**2+y**2})

The div is

2x + 2y

Figure 052: quiver plot of F(x,y) = [x·y, x·y] over a color density plot of its divergence

Figure 053: quiver plot of F(x,y) = [x² + y², x² + y²] over a color density plot of its divergence

p = quiver_plotter_2D(
    field={F_x: x**2/sqrt(x**2+y**2), F_y: y**2/sqrt(x**2+y**2)},
    norm=SymLogNorm(linthresh=3, linscale=3))

The div is

(–x³ – y³ + 2(x + y)(x² + y²)) / (x² + y²)^(3/2)

Figure 054: quiver plot of F(x,y) = [x²/√(x² + y²), y²/√(x² + y²)] over a color density plot of its divergence


line integral

8. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 63-74.
9. Kreyszig, Advanced Engineering Mathematics, §§ 10.1, 10.2.

force

work

circulation

the law of circulation

vecscurl Curl, line integrals, and circulation

Line integrals

Consider a curve C in a Euclidean vector space ℝ³. Let r(t) = [x(t), y(t), z(t)] be a parametric representation of C. Furthermore, let F: ℝ³ → ℝ³ be a vector-valued function of r, and let r′(t) be the tangent vector. The line integral is

∫_C F(r(t)) · r′(t) dt, (curl1)

which integrates F along the curve. For more on computing line integrals, see Schey⁸ and Kreyszig⁹.

If F is a force being applied to an object moving along the curve C, the line integral is the work done by the force. More generally, the line integral integrates F along the tangent of C.

Circulation

Consider the line integral over a closed curve C, denoted by

∮_C F(r(t)) · r′(t) dt. (curl2)

We call this quantity the circulation of F around C.

For certain vector-valued functions F, the circulation is zero for every curve. In these cases (static electric fields, for instance), this is sometimes called the law of circulation.


curl

Curl

Consider the division of the circulation around a curve in ℝ³ by the surface area it encloses, ΔS:

(1/ΔS) ∮_C F(r(t)) · r′(t) dt. (curl3)

In a manner analogous to the operation that gave us the divergence, let's consider shrinking this curve to a point and the surface area to zero:

lim_{ΔS→0} (1/ΔS) ∮_C F(r(t)) · r′(t) dt. (curl4)

We call this quantity the "scalar" curl of F, at each point in ℝ³, in the direction normal to ΔS as it shrinks to zero. Taking three (or n, for ℝⁿ) "scalar" curls in independent normal directions (enough to span the vector space), we obtain the curl proper, which is a vector-valued function curl: ℝ³ → ℝ³.

The curl is coordinate-independent. In cartesian coordinates, it can be shown to be equivalent to the following.

Equation curl5: curl, differential form, cartesian coordinates

curl F = [∂_y F_z – ∂_z F_y, ∂_z F_x – ∂_x F_z, ∂_x F_y – ∂_y F_x]ᵀ

But what does the curl of F represent? It quantifies the local rotation of F about each point. If F represents a fluid's velocity, curl F

is the local rotation of the fluid about each point, and it is called the vorticity.

10. A region is simply connected if every curve in it can shrink to a point without leaving the region. An example of a region that is not simply connected is the surface of a toroid.

Zero curl, circulation, and path independence

Circulation

It can be shown that if the circulation of F on all curves is zero, then in each direction n and at every point, curl F = 0 (i.e., n · curl F = 0). Conversely, for curl F = 0 in a simply connected region,10 F has zero circulation.

Succinctly, informally, and without the requisite qualifiers above:

zero circulation ⇒ zero curl    (curl.6)
zero curl + simply connected region ⇒ zero circulation    (curl.7)

Path independence

It can be shown that if the line integral of F on all curves between any two points is path-independent, then in each direction n and at every point, curl F = 0 (i.e., n · curl F = 0). Conversely, for curl F = 0 in a simply connected region, all line integrals are independent of path.

Succinctly, informally, and without the requisite qualifiers above:

path independence ⇒ zero curl    (curl.8)
zero curl + simply connected region ⇒ path independence    (curl.9)

Figure 0.5.5: an implication graph relating zero curl, zero circulation, path independence, and connectedness. Green edges represent implication (a implies b) and black edges represent logical conjunctions.

… and how they relate

It is also true that

path independence ⇔ zero circulation    (curl.10)

So, putting it all together, we get Figure 0.5.5.

Exploring curl

Curl is perhaps best explored by considering it for a vector field in R2. Such a field, in cartesian coordinates F = Fx i + Fy j, has curl

curl F = [∂y0 − ∂zFy, ∂zFx − ∂x0, ∂xFy − ∂yFx]ᵀ
       = [0 − 0, 0 − 0, ∂xFy − ∂yFx]ᵀ
       = [0, 0, ∂xFy − ∂yFx]ᵀ    (curl.11)

That is, curl F = (∂xFy − ∂yFx) k, and the only nonzero component is normal to the xy-plane.

If we overlay a quiver plot of F over a "color density" plot representing the k-component of curl F, we can increase our intuition about the curl.
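A minimal sympy helper (an illustration, not code from the notebook) computes this k-component directly for planar fields like those plotted below:

```python
import sympy as sp

x, y = sp.symbols('x y')

def scalar_curl(F_x, F_y):
    """k-component of curl for a planar field F = F_x i + F_y j."""
    return sp.simplify(sp.diff(F_y, x) - sp.diff(F_x, y))

print(scalar_curl(0, sp.cos(2*sp.pi*x)))  # -2*pi*sin(2*pi*x)
print(scalar_curl(-y, x))                 # 2 (constant rotation)
```

The first field varies its rotation with x; the second rotates rigidly, so its curl is constant.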

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: curl-and-line-integrals.ipynb
notebook kernel: python3

First, load some Python packages:

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and functions:

var('x,y')
F_x = Function('F_x')(x,y)
F_y = Function('F_y')(x,y)

We use the same function quiver_plotter_2D, defined in Lecture vecs.div, to make several of these plots.

Let's inspect several cases.

Figure 0.5.6: quiver plot of F(x, y) = [0, cos(2πx)] over a color density plot of the k-component of curl F

p = quiver_plotter_2D(
    field={F_x: 0, F_y: cos(2*pi*x)},
    density_operation='curl',
    grid_decimate_x=2,
    grid_decimate_y=10,
    grid_width=1
)

The curl is

−2π sin(2πx)

p = quiver_plotter_2D(
    field={F_x: 0, F_y: x**2},
    density_operation='curl',
    grid_decimate_x=2,
    grid_decimate_y=20
)

Figure 0.5.7: quiver plot of F(x, y) = [0, x²] over a color density plot of its curl

The curl is

2x

p = quiver_plotter_2D(
    field={F_x: y**2, F_y: x**2},
    density_operation='curl',
    grid_decimate_x=2,
    grid_decimate_y=20
)

Figure 0.5.8: quiver plot of F(x, y) = [y², x²] over a color density plot of its curl

The curl is

2x − 2y

p = quiver_plotter_2D(
    field={F_x: -y, F_y: x},
    density_operation='curl',
    grid_decimate_x=6,
    grid_decimate_y=6
)

Warning: density operator is constant (no density plot)

Figure 0.5.9: quiver plot of F(x, y) = [−y, x]

The curl is

2

vecs Vector calculus grad Gradient p 1


vecs.grad Gradient

Gradient

The gradient grad of a scalar-valued function f : R3 → R is a vector field F : R3 → R3; that is, grad f is a vector-valued function on R3. The gradient's local direction and magnitude are those of the local maximum rate of increase of f. This makes it useful in optimization (e.g., in the method of gradient descent).

In classical mechanics, quantum mechanics, relativity, string theory, thermodynamics, and continuum mechanics (and elsewhere), the principle of least action is taken as fundamental (Richard P. Feynman, Robert B. Leighton, and Matthew Sands, The Feynman Lectures on Physics, New Millennium, Perseus Basic Books, 2010). This principle tells us that nature's laws quite frequently seem to be derivable by assuming a certain quantity, called action, is minimized. Considering, then, that the gradient supplies us with a tool for optimizing functions, it is unsurprising that the gradient enters into the equations of motion of many physical quantities.

The gradient is coordinate-independent, but its coordinate-free definitions don't add much to our intuition. In cartesian coordinates, it can be shown to be equivalent to the following.

Equation grad.1: gradient, cartesian coordinates

grad f = [∂xf, ∂yf, ∂zf]ᵀ

11. This is nonstandard terminology, but we're bold.
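As a quick sympy sketch of the cartesian gradient (f = x² + y² is an assumed example here; a similar elliptic field appears later in this section):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = x**2 + y**2  # example scalar field (an assumption for illustration)
# gradient in cartesian coordinates: [d_x f, d_y f, d_z f]^T
grad_f = sp.Matrix([sp.diff(f, v) for v in (x, y, z)])
print(grad_f.T)  # Matrix([[2*x, 2*y, 0]])
```

At each point, this vector points radially outward, the direction of maximum rate of increase of f.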

Vector fields from gradients are special

Although all gradients are vector fields, not all vector fields are gradients. That is, given a vector field F, it may or may not be equal to the gradient of any scalar-valued function f. Let's say of a vector field that is a gradient that it has gradience.11 Those vector fields that are gradients have special properties. Surprisingly, those properties are connected to path independence and curl. It can be shown that iff a field is a gradient, line integrals of the field are path independent. That is, for a vector field,

gradience ⇔ path independence    (grad.2)

Considering what we know from Lecture vecs.curl about path independence, we can expand Figure 0.5.5 to obtain Figure 0.5.10. One implication is that gradients have zero curl. Many important fields that describe physical interactions (e.g., static electric fields, Newtonian gravitational fields) are gradients of scalar fields called potentials.

Exploring gradient

Gradient is perhaps best explored by considering it for a scalar field on R2. Such a field, in cartesian coordinates f(x, y), has

Figure 0.5.10: an implication graph relating gradience, zero curl, zero circulation, path independence, and connectedness. Green edges represent implication (a implies b) and black edges represent logical conjunctions.

gradient

grad f = [∂xf, ∂yf]ᵀ    (grad.3)

That is, grad f = F = ∂xf i + ∂yf j. If we overlay a quiver plot of F over a "color density" plot representing f, we can increase our intuition about the gradient.

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: grad.ipynb
notebook kernel: python3

First, load some Python packages:

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables:

var('x,y')

(x, y)

Rather than repeat code, let's write a single function grad_plotter_2D to make several of these plots.

Let's inspect several cases. While considering the following plots, remember that they all have zero curl.

p = grad_plotter_2D(field=x)

Figure 0.5.11: quiver plot of grad f over a color density plot of f(x, y) = x

The gradient is

[1  0]

p = grad_plotter_2D(field=x+y)

The gradient is

[1  1]

Figure 0.5.12: quiver plot of grad f over a color density plot of f(x, y) = x + y

p = grad_plotter_2D(field=1)

Warning: field is constant (no plot)

The gradient is

[0  0]

Gravitational potential

Gravitational potentials have the form 1/distance. Let's check out the gradient.

Figure 0.5.13: quiver plot of grad f over a color density plot of f(x, y) = 1/√(x² + y²)

p = grad_plotter_2D(
    field=1/sqrt(x**2+y**2),
    norm=SymLogNorm(linthresh=3, linscale=3),
    mask=True
)

The gradient is

[−x/(x² + y²)^(3/2)  −y/(x² + y²)^(3/2)]

Conic section fields

Gradients of conic section fields can be explored.

Figure 0.5.14: quiver plot of grad f over a color density plot of f(x, y) = x²

The following is called a parabolic field.

p = grad_plotter_2D(field=x**2)

The gradient is

[2x  0]

The following are called elliptic fields.

vecs Vector calculus grad Gradient p 3

3 2 1 0 1 2 33

2

1

0

1

2

3f(xy) = x2 + y2

2

4

6

8

10

12

14

16

18

Figure 0515 png

p = grad_plotter_2D(field=x**2+y**2)

The gradient is

[2x  2y]

p = grad_plotter_2D(field=-x**2-y**2)

Figure 0.5.16: quiver plot of grad f over a color density plot of f(x, y) = −x² − y²

The gradient is

[−2x  −2y]

The following is called a hyperbolic field.

p = grad_plotter_2D(field=x**2-y**2)

The gradient is

[2x  −2y]

Figure 0.5.17: quiver plot of grad f over a color density plot of f(x, y) = x² − y²

vecs Vector calculus stoked Stokes and divergence theorems p 1

12. A surface is orientable if a consistent normal direction can be defined. Most surfaces are orientable, but some, notably the Möbius strip, cannot be. See Kreyszig (Advanced Engineering Mathematics, § 10.6) for more.

13. Ibid., § 10.7.

14. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 45-52.

15. Ibid., pp. 49-52.

vecs.stoked Stokes and divergence theorems

Two theorems allow us to exchange certain integrals in R3 for others that are easier to evaluate.

The divergence theorem

The divergence theorem asserts the equality of the surface integral of a vector field F and the triple integral of div F over the volume V enclosed by the surface S in R3. That is,

∯_S F · n dS = ∭_V div F dV    (stoked.1)

Caveats are that V is a closed region bounded by the orientable12 surface S, and that F is continuous and continuously differentiable over a region containing V. This theorem makes some intuitive sense: we can think of the divergence inside the volume "accumulating" via the triple integration and equaling the corresponding surface integral. For more on the divergence theorem, see Kreyszig13 and Schey14.

A lovely application of the divergence theorem is that, for any quantity of conserved stuff (mass, charge, spin, etc.) distributed in a spatial R3 with time-dependent density ρ : R4 → R and velocity field v : R4 → R3, the divergence theorem can be applied to find that

∂tρ = − div(ρv)    (stoked.2)

which is a more general form of a continuity equation, one of the governing equations of many physical phenomena. For a derivation of this equation, see Schey15.
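The equality in Equation stoked.1 can be checked symbolically for a simple case; here is a sketch (the field F = [xy, yz, zx] and the unit cube are assumptions for illustration) that computes both sides:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
F = sp.Matrix([x*y, y*z, z*x])  # example field (an assumption, not from the text)

# right-hand side: triple integral of div F over the unit cube [0, 1]^3
divF = sp.diff(F[0], x) + sp.diff(F[1], y) + sp.diff(F[2], z)
volume_side = sp.integrate(divF, (x, 0, 1), (y, 0, 1), (z, 0, 1))

# left-hand side: surface integral of F . n over the six faces of the cube
faces = [
    (F[0].subs(x, 1), (y, z)), (-F[0].subs(x, 0), (y, z)),  # x = 1, x = 0
    (F[1].subs(y, 1), (x, z)), (-F[1].subs(y, 0), (x, z)),  # y = 1, y = 0
    (F[2].subs(z, 1), (x, y)), (-F[2].subs(z, 0), (x, y)),  # z = 1, z = 0
]
surface_side = sum(sp.integrate(fn, (u, 0, 1), (v, 0, 1)) for fn, (u, v) in faces)
print(surface_side, volume_side)  # 3/2 3/2
```

Both sides agree, as the theorem requires for this smooth field on a closed region.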

16. A surface is smooth if its normal is continuous everywhere. It is piecewise smooth if it is composed of a finite number of smooth surfaces.

17. Kreyszig, Advanced Engineering Mathematics, § 10.9.

18. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 93-102.

19. Kreyszig, Advanced Engineering Mathematics, § 10.9.

20. Lee, Introduction to Smooth Manifolds, Ch. 16.

The Kelvin-Stokes theorem

The Kelvin-Stokes theorem asserts the equality of the circulation of a vector field F over a closed curve C and the surface integral of curl F over a surface S that has boundary C. That is, for r(t) a parameterization of C and surface normal n,

∮_C F(r(t)) · r′(t) dt = ∯_S n · curl F dS    (stoked.3)

Caveats are that S is piecewise smooth,16 its boundary C is a piecewise smooth, simple, closed curve, and F is continuous and continuously differentiable over a region containing S. This theorem is also somewhat intuitive: we can think of the curl over the surface "accumulating" via the surface integration and equaling the corresponding circulation. For more on the Kelvin-Stokes theorem, see Kreyszig17 and Schey18.
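Equation stoked.3 can likewise be verified for a simple case; the following sketch (the field F = (−y, x, 0) and the flat unit disk are illustrative assumptions) compares the circulation around the unit circle with the flux of curl F through the disk it bounds:

```python
import sympy as sp

# boundary side: circulation of F = (-y, x, 0) around the unit circle
t = sp.symbols('t')
xt, yt = sp.cos(t), sp.sin(t)
F_on_C = sp.Matrix([-yt, xt, 0])
rprime = sp.Matrix([sp.diff(xt, t), sp.diff(yt, t), 0])
circulation = sp.integrate(F_on_C.dot(rprime), (t, 0, 2*sp.pi))

# surface side: flux of curl F through the flat unit disk (n = k)
x, y, r, th = sp.symbols('x y r theta')
curl_k = sp.diff(x, x) - sp.diff(-y, y)  # d_x F_y - d_y F_x = 2
flux = sp.integrate(curl_k*r, (r, 0, 1), (th, 0, 2*sp.pi))  # polar area element
print(circulation, flux)  # 2*pi 2*pi
```

Both sides give 2π, the constant curl (2) times the disk's area (π).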

Related theorems

Green's theorem is a two-dimensional special case of the Kelvin-Stokes theorem. It is described by Kreyszig19.

It turns out that all of the above theorems (and the fundamental theorem of calculus, which relates the derivative and integral) are special cases of the generalized Stokes' theorem, defined by differential geometry. We would need a deeper understanding of differential geometry to understand this theorem. For more, see Lee20.

vecs.ex Exercises for Chapter 05

Exercise 05.1 (6)

The PDE of Example pde.separation-1 can be used to describe the conduction of heat along a long, thin rod insulated along its length, where u(t, x) represents temperature. The initial and Dirichlet boundary conditions in that example would be interpreted as an initial temperature distribution along the bar and fixed temperatures of the ends. Now consider the same PDE,

∂tu(t, x) = k ∂²xx u(t, x)    (ex.1)

with real constant k, now with Neumann boundary conditions on interval x ∈ [0, L],

∂xu|x=0 = 0 and ∂xu|x=L = 0    (ex.2)

and with initial condition

u(0, x) = f(x)    (ex.3)

where f is some piecewise continuous function on [0, L]. This represents the complete insulation of the ends of the rod, such that no heat flows from the ends (or from anywhere else).

a. Assume a product solution, separate variables into X(x) and T(t), and set the separation constant to −λ.
b. Solve the boundary value problem for its eigenfunctions Xn and eigenvalues λn.
c. Solve for the general solution of the time-variable ODE.
d. Write the product solution and apply the initial condition f(x) by constructing it from a generalized fourier series of the product solution.

e. Let L = 1 and f(x) = 100 − (200/L)|x − L/2|, as shown in Figure 0.5.18. Compute the solution series components. Plot the sum of the first 50 terms over x and t.

Figure 0.5.18: initial condition for Exercise 05.1

four Fourier and orthogonality

1. It's important to note that the symbol ωn in this context is not the natural frequency but a frequency indexed by integer n.

four.series Fourier series

Fourier series are mathematical series that can represent a periodic signal as a sum of sinusoids at different amplitudes and frequencies. They are useful for solving for the response of a system to periodic inputs. However, they are probably most important conceptually: they are our gateway to thinking of signals in the frequency domain, that is, as functions of frequency (not time). To represent a function as a Fourier series is to analyze it as a sum of sinusoids at different frequencies1 ωn and amplitudes an. Its frequency spectrum is the functional representation of amplitudes an versus frequency ωn.

Let's begin with the definition.

Definition four.series.1: Fourier series, trigonometric form

The Fourier analysis of a periodic function y(t) is, for n ∈ N0, period T, and angular frequency ωn = 2πn/T,

an = (2/T) ∫_{−T/2}^{T/2} y(t) cos(ωnt) dt    (series.1)
bn = (2/T) ∫_{−T/2}^{T/2} y(t) sin(ωnt) dt    (series.2)

The Fourier synthesis of a periodic function y(t) with analysis components an and bn corresponding to ωn is

y(t) = a0/2 + Σ_{n=1}^{∞} (an cos(ωnt) + bn sin(ωnt))    (series.3)
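The analysis integrals can be exercised numerically; the sketch below (a unit-amplitude square wave with period 2π is an assumed example signal) approximates bn with the trapezoidal rule:

```python
import numpy as np

# square wave with amplitude 1 and period T = 2*pi (an assumed example signal)
T = 2*np.pi
t = np.linspace(-T/2, T/2, 200001)
y = np.sign(np.sin(t))

def trapezoid(f, t):
    """Trapezoidal rule, to approximate the analysis integrals."""
    return np.sum((f[1:] + f[:-1])/2*np.diff(t))

def b(n):
    """Sine component b_n = (2/T) * integral of y(t) sin(omega_n t) dt."""
    omega_n = 2*np.pi*n/T
    return 2/T*trapezoid(y*np.sin(omega_n*t), t)

# odd harmonics approach 4/(n*pi); even harmonics vanish
print(b(1), b(2), b(3))
```

Because the square wave is odd, the cosine components an are all zero, and only odd sine harmonics contribute.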

Let's consider the complex form of the Fourier series, which is analogous to Definition four.series.1. It may be helpful to review Euler's formula(s); see Appendix D.01.

Definition four.series.2: Fourier series, complex form

The Fourier analysis of a periodic function y(t) is, for n ∈ N0, period T, and angular frequency ωn = 2πn/T,

c±n = (1/T) ∫_{−T/2}^{T/2} y(t) e^{−jωnt} dt    (series.4)

The Fourier synthesis of a periodic function y(t) with analysis components cn corresponding to ωn is

y(t) = Σ_{n=−∞}^{∞} cn e^{jωnt}    (series.5)

We call the integer n a harmonic, and the frequency associated with it,

ωn = 2πn/T    (series.6)

the harmonic frequency. There is a special name for the first harmonic (n = 1): the fundamental frequency. It is called this because all other frequency components are integer multiples of it.

It is also possible to convert between the two representations above.

Definition four.series.3: Fourier series, converting between forms

The complex Fourier analysis of a periodic function y(t) is, for n ∈ N0 and an and bn as defined above,

c±n = (1/2)(a|n| ∓ j b|n|)    (series.7)

The sinusoidal Fourier analysis of a periodic function y(t) is, for n ∈ N0 and cn as defined above,

an = cn + c−n and    (series.8)
bn = j(cn − c−n)    (series.9)

The harmonic amplitude Cn is

Cn = √(an² + bn²)    (series.10)
   = 2√(cn c−n)    (series.11)

A magnitude line spectrum is a graph of the harmonic amplitudes as a function of the harmonic frequencies. The harmonic phase is

θn = −arctan2(bn, an) (see Appendix C.02)
   = arctan2(Im(cn), Re(cn))
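These conversions can be checked symbolically; a short sympy sketch:

```python
import sympy as sp

a_n, b_n = sp.symbols('a_n b_n', real=True)
c_pos = (a_n - sp.I*b_n)/2  # c_n for n > 0, from (series.7)
c_neg = (a_n + sp.I*b_n)/2  # c_{-n}

a_rec = sp.simplify(c_pos + c_neg)         # should recover a_n (series.8)
b_rec = sp.simplify(sp.I*(c_pos - c_neg))  # should recover b_n (series.9)

C_direct = sp.sqrt(a_n**2 + b_n**2)                       # (series.10)
C_complex = 2*sp.sqrt(sp.factor(sp.expand(c_pos*c_neg)))  # (series.11)
print(a_rec, b_rec, sp.simplify(C_direct - C_complex))  # a_n b_n 0
```

Both harmonic-amplitude expressions agree because cn and c−n are complex conjugates for real signals.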

The following illustration demonstrates how sinusoidal components sum to represent a square wave. A line spectrum is also shown.

[Figure: sinusoidal components in time summing to a square wave, with spectral amplitude versus frequency.]

Let us compute the associated spectral components in the following example.

Example four.series-1: Fourier series analysis, line spectrum

Problem statement: Compute the first five harmonic amplitudes that represent the line spectrum for a square wave in the figure above.

four.trans Fourier transform

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: fourier_series_to_transform.ipynb
notebook kernel: python3

We begin with the usual loading of modules:

import numpy as np               # for numerics
import sympy as sp               # for symbolics
import matplotlib.pyplot as plt  # for plots
from IPython.display import display, Markdown, Latex

Let's consider a periodic function f with period T (T). Each period, the function has a triangular pulse of width δ (pulse_width) and height δ/2.

period = 15      # period
pulse_width = 2  # pulse width

First, we plot the function f in the time domain. Let's begin by defining f:

def pulse_train(t, T, pulse_width):
    f = lambda x: pulse_width/2 - abs(x)  # pulse
    tm = np.mod(t, T)
    if tm <= pulse_width/2:
        return f(tm)
    elif tm >= T - pulse_width/2:
        return f(-(tm - T))
    else:
        return 0

Now we develop a numerical array in time to plot f:

N = 201  # number of points to plot
tpp = np.linspace(-period/2, 5*period/2, N)  # time values
fpp = np.array(np.zeros(tpp.shape))
for i, t_now in enumerate(tpp):
    fpp[i] = pulse_train(t_now, period, pulse_width)

axes = plt.figure(1)
plt.plot(tpp, fpp, 'b-', linewidth=2)  # plot
plt.xlabel('time (s)')
plt.xlim([-period/2, 3*period/2])
plt.xticks(
    [0, pulse_width/2, period],
    ['0', '$\\delta/2$', '$T=' + str(period) + '$ s']
)
plt.yticks([0, pulse_width/2], ['0', '$\\delta/2$'])
plt.show()  # display here

For δ = 2 and T ∈ [5, 15, 25], the left-hand column of Figure 0.6.1 shows two triangle pulses for each period T.

Consider the following argument. Just as a Fourier series is a frequency domain representation of a periodic signal, a Fourier transform is a frequency domain representation of an aperiodic signal (we will rigorously define it in a moment). The Fourier series components will have an analog, then, in the Fourier transform. Recall that they can be computed by integrating over a period of the signal. If we increase that period infinitely, the function is effectively aperiodic. The result (within a scaling factor) will be the Fourier transform analog of the Fourier series components.

Let us approach this understanding by actually computing the Fourier series components for increasing period T. We'll use sympy to compute the Fourier series cosine and sine components an and bn for component n (n) and period T (T):

sp.var('x, a_0, a_n, b_n', real=True)
sp.var('delta, T', positive=True)
sp.var('n', nonnegative=True)
a0 = 2/T*sp.integrate(
    (delta/2 - sp.Abs(x)),
    (x, -delta/2, delta/2)  # otherwise zero
).simplify()
an = sp.integrate(
    2/T*(delta/2 - sp.Abs(x))*sp.cos(2*sp.pi*n/T*x),
    (x, -delta/2, delta/2)  # otherwise zero
).simplify()
bn = 2/T*sp.integrate(
    (delta/2 - sp.Abs(x))*sp.sin(2*sp.pi*n/T*x),
    (x, -delta/2, delta/2)  # otherwise zero
).simplify()
display(sp.Eq(a_n, an), sp.Eq(b_n, bn))

an = { T(1 − cos(πδn/T))/(π²n²),  n ≠ 0;  δ²/(2T),  otherwise }

bn = 0

Furthermore, let us compute the harmonic amplitude (f_harmonic_amplitude)

Cn = √(an² + bn²)    (trans.1)

which we have also scaled by a factor T/δ in order to plot it with a convenient scale.

sp.var('C_n', positive=True)
cn = sp.sqrt(an**2 + bn**2)
display(sp.Eq(C_n, cn))

Cn = { T|cos(πδn/T) − 1|/(π²n²),  n ≠ 0;  δ²/(2T),  otherwise }

Now we lambdify the symbolic expression for a numpy function:

cn_f = sp.lambdify((n, T, delta), cn)

Now we can plot.

Figure 0.6.1: triangle pulse trains (left column), with longer periods descending, and their corresponding line spectra (right column), scaled for convenient comparison

omega_max = 12  # rad/s, max frequency in line spectrum
n_max = round(omega_max*period/(2*np.pi))  # max harmonic
n_a = np.linspace(0, n_max, n_max+1)
omega = 2*np.pi*n_a/period
axes = plt.figure(2)
markerline, stemlines, baseline = plt.stem(
    omega, period/pulse_width*cn_f(n_a, period, pulse_width),
    linefmt='b-', markerfmt='bo', basefmt='r-',
    use_line_collection=True
)
plt.xlabel('frequency $\\omega$ (rad/s)')
plt.xlim([0, omega_max])
plt.ylim([0, pulse_width/2])
plt.yticks([0, pulse_width/2], ['0', '$\\delta/2$'])
plt.show()  # show here

The line spectra are shown in the right-hand column of Figure 0.6.1. Note that, with our chosen scaling, as T increases the line spectra reveal a distinct waveform.

Figure 0.6.2: F(ω), our mysterious Fourier series amplitude analog

Let F be the continuous function of angular frequency ω,

F(ω) = (δ/2) · sin²(ωδ/4)/(ωδ/4)²    (trans.2)

First, we plot it:

F = lambda w: pulse_width/2 * \
    np.sin(w*pulse_width/(2*2))**2 / (w*pulse_width/(2*2))**2
N = 201  # number of points to plot
wpp = np.linspace(0.0001, omega_max, N)
Fpp = []
for i in range(0, N):
    Fpp.append(F(wpp[i]))  # build array of function values
axes = plt.figure(3)
plt.plot(wpp, Fpp, 'b-', linewidth=2)  # plot
plt.xlim([0, omega_max])
plt.yticks([0, pulse_width/2], ['0', '$\\delta/2$'])
plt.xlabel('frequency $\\omega$ (rad/s)')
plt.ylabel('$F(\\omega)$')
plt.show()

Let's consider the plot in Figure 0.6.2 of F. It's obviously the function emerging in Figure 0.6.1 from increasing the period of our pulse train.

Now we are ready to define the Fourier transform and its inverse.

Definition four.trans.1: Fourier transforms, trigonometric form

Fourier transform (analysis):

A(ω) = ∫_{−∞}^{∞} y(t) cos(ωt) dt    (trans.3)
B(ω) = ∫_{−∞}^{∞} y(t) sin(ωt) dt    (trans.4)

Inverse Fourier transform (synthesis):

y(t) = (1/2π) ∫_{−∞}^{∞} A(ω) cos(ωt) dω + (1/2π) ∫_{−∞}^{∞} B(ω) sin(ωt) dω    (trans.5)

Definition four.trans.2: Fourier transforms, complex form

Fourier transform F (analysis):

F(y(t)) = Y(ω) = ∫_{−∞}^{∞} y(t) e^{−jωt} dt    (trans.6)

Inverse Fourier transform F⁻¹ (synthesis):

F⁻¹(Y(ω)) = y(t) = (1/2π) ∫_{−∞}^{∞} Y(ω) e^{jωt} dω    (trans.7)

So now we have defined the Fourier transform. There are many applications, including solving differential equations and frequency domain representations, called spectra, of time domain functions.

There is a striking similarity between the Fourier transform and the Laplace transform, with which you are already acquainted. In fact, the Fourier transform is a special case of a Laplace transform, with Laplace transform variable s = jω instead of having some real component. Both transforms convert differential equations to algebraic equations, which can be solved and inversely transformed to find time-domain solutions. The Laplace transform is especially important to use when an input function to a differential equation is not absolutely integrable and the Fourier transform is undefined (for example, our definition will yield a transform for neither the unit step nor the unit ramp functions). However, the Laplace transform is also preferred for initial value problems, due to its convenient way of handling them. The two transforms are equally useful for solving steady state problems. Although the Laplace transform has many advantages, for spectral considerations the Fourier transform is the only game in town.

A table of Fourier transforms and their properties can be found in Appendix B.02.

Example four.trans-1: a Fourier transform

Problem statement: Consider the aperiodic signal y(t) = us(t) e^{−at}, with us the unit step function and a > 0. The signal is plotted below. Derive the complex frequency spectrum, and plot its magnitude and phase.

[Figure: y(t) = us(t) e^{−at} versus t.]

four Fourier and orthogonality trans Fourier transform p 9

Problem SolutionThe signal is aperiodic so the Fourier

transform can be computed from

Equation trans6

Y(ω) =

ˆ infinndashinfin y(t)endashjωtdt

=

ˆ infinndashinfin us(t)e

ndashatendashjωtdt (def of y)

=

ˆ infin0

endashatendashjωtdt (us effect)

=

ˆ infin0

e(ndashandashjω)tdt (multiply)

=1

ndasha ndash jωe(ndashandashjω)t

∣∣∣∣infin0dt (antiderivative)

=1

ndasha ndash jω

(limtrarrinfin e(ndashandashjω)t ndash e0

)(evaluate)

=1

ndasha ndash jω

(limtrarrinfin endashatendashjωt ndash 1

)(arrange)

=1

ndasha ndash jω

((0)(complex with mag 6 1) ndash 1

)(limit)

=ndash1

ndasha ndash jω(consequence)

=1

a + jω

=a ndash jω

a ndash jωmiddot 1

a + jω(rationalize)

=a ndash jω

a2 +ω2

The magnitude and phase of this complex

function are straightforward to compute

|Y(ω)| =radicRe(Y(ω))2 + Im(Y(ω))2

=1

a2 +ω2

radica2 +ω2

=1radic

a2 +ω2

angY(ω) = ndash arctan(ωa)
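The derived spectrum can be sanity-checked numerically; this sketch truncates the transform integral at a large upper limit (a = 1 and ω = 2 are arbitrary sample values, an assumption for illustration):

```python
import numpy as np

# y(t) = u_s(t) e^{-a t}; check Y(omega) = 1/(a + j*omega) at a sample frequency
a, w = 1.0, 2.0                 # arbitrary sample values (assumptions)
t = np.linspace(0, 50, 500001)  # the upper limit approximates infinity
y = np.exp(-a*t)

def trapezoid(f, t):
    """Trapezoidal rule for the truncated transform integral."""
    return np.sum((f[1:] + f[:-1])/2*np.diff(t))

Y = trapezoid(y*np.exp(-1j*w*t), t)
print(Y, 1/(a + 1j*w))  # both near 0.2 - 0.4j
```

The numeric result matches the closed-form 1/(a + jω), and its magnitude matches 1/√(a² + ω²).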

Now we can plot these functions of ω. Setting a = 1 (arbitrarily), we obtain the following plots.

[Figure: |Y(ω)| and ∠Y(ω) versus ω for a = 1.]

2. A function f is square-integrable if ∫_{−∞}^{∞} |f(x)|² dx < ∞.

3. This definition of the inner product can be extended to functions on R2 and R3 domains using double- and triple-integration. See Schey (Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, p. 261).

four.general Generalized fourier series and orthogonality

Let f : R → C, g : R → C, and w : R → C be complex functions. For square-integrable2 f, g, and w, the inner product of f and g with weight function w over the interval [a, b] ⊆ R is3

⟨f, g⟩w = ∫_a^b f(x) ḡ(x) w(x) dx    (general.1)

where ḡ denotes the complex conjugate of g. The inner product of functions can be considered analogous to the inner (or dot) product of vectors.

The fourier series components can be found by a special property of the sin and cos functions called orthogonality. In general, functions f and g from above are orthogonal over the interval [a, b] iff

⟨f, g⟩w = 0    (general.2)

for weight function w. Similar to how a set of orthogonal vectors can be a basis for a vector space, a set of orthogonal functions can be a basis for a function space: a vector space of functions from one set to another (with certain caveats).

In addition to some sets of sinusoids, there are several other important sets of functions that are orthogonal. For instance, sets of legendre polynomials (Erwin Kreyszig, Advanced Engineering Mathematics, 10th ed., John Wiley & Sons Limited, 2011, ISBN 9781119571094, § 5.2; the authoritative resource for engineering mathematics, it includes detailed accounts of probability, statistics, vector calculus, linear algebra, fourier analysis, ordinary and partial differential equations, and complex analysis) and bessel functions (ibid., § 5.4) are orthogonal.

As with sinusoids, the orthogonality of some sets of functions allows us to compute their series components. Let functions f0, f1, ··· be orthogonal with respect to weight function w on interval [a, b], and let α0, α1, ··· be real constants. A generalized fourier series is (ibid., § 11.6)

f(x) = Σ_{m=0}^{∞} αm fm(x)    (general.3)

and represents a function f as a convergent series. It can be shown that the Fourier components αm can be computed from

αm = ⟨f, fm⟩w / ⟨fm, fm⟩w    (general.4)

In keeping with our previous terminology for fourier series, Equation general.3 and Equation general.4 are called general fourier synthesis and analysis, respectively.
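The analysis formula can be exercised with the legendre polynomials; this sketch (f(x) = x² is an assumed example) recovers f exactly from three components:

```python
import sympy as sp

x = sp.symbols('x')
f = x**2  # an assumed example function on [-1, 1]

def inner(g, h):
    """Inner product with weight w = 1 on [-1, 1] (legendre orthogonality)."""
    return sp.integrate(g*h, (x, -1, 1))

# analysis: alpha_m = <f, P_m> / <P_m, P_m>, per the analysis formula above
alphas = [inner(f, sp.legendre(m, x))/inner(sp.legendre(m, x), sp.legendre(m, x))
          for m in range(3)]
# synthesis: the series recovers f exactly here, since f is a degree-2 polynomial
series = sum(alpha*sp.legendre(m, x) for m, alpha in enumerate(alphas))
print(alphas, sp.expand(series))  # [1/3, 0, 2/3] x**2
```

For functions that are not polynomials, the truncated series is instead the best approximation in the inner-product norm.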

For the aforementioned legendre and bessel functions, the generalized fourier series are called fourier-legendre and fourier-bessel series (ibid., § 11.6). These and the standard fourier series (Lecture four.series) are of particular interest for the solution of partial differential equations (Chapter 07).

four.ex Exercises for Chapter 06

Python code in this section was generated from a Jupyter notebook named random_signal_fft.ipynb with a python3 kernel.

Exercise 06.1 (secrets)

This exercise encodes a "secret word" into a sampled waveform for decoding via a discrete fourier transform (DFT). The nominal goal of the exercise is to decode the secret word. Along the way, plotting and interpreting the DFT will be important.

First, load relevant packages:

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

We define two functions: letter_to_number, to convert a letter into an integer index of the alphabet (a becomes 1, b becomes 2, etc.), and string_to_number_list, to convert a string to a list of ints, as shown in the example at the end.

def letter_to_number(letter):
    return ord(letter) - 96

def string_to_number_list(string):
    out = []  # list
    for i in range(0, len(string)):
        out.append(letter_to_number(string[i]))
    return out  # list

print(f"aces = {string_to_number_list('aces')}")

four Fourier and orthogonality ex Exercises for Chapter 06 p 2

aces = [1, 3, 5, 19]

Now we encode a code string code into a signal by beginning with "white noise," which is broadband (appears throughout the spectrum), and adding to it sin functions with amplitudes corresponding to the letter assignments of the code and harmonics corresponding to the position of the letter in the string. For instance, the string bad would be represented by noise plus the signal

$$2 \sin 2\pi t + 1 \sin 4\pi t + 4 \sin 6\pi t. \tag{ex.1}$$

Let's set this up for secret word chupcabra.

N = 2000
Tm = 30
T = float(Tm)/float(N)
fs = 1/T
x = np.linspace(0, Tm, N)
noise = 4*np.random.normal(0, 1, N)
code = 'chupcabra'  # the secret word
code_number_array = np.array(string_to_number_list(code))
y = np.array(noise)
for i in range(0, len(code)):
    y = y + code_number_array[i]*np.sin(2*np.pi*(i+1)*x)

For proper decoding later, it is important to know the fundamental frequency of the generated data.

print(f"fundamental frequency = {fs} Hz")


[Figure 06.3: the chupacabra signal, y_n versus time (s)]

fundamental frequency = 66.66666666666667 Hz

Now we plot:

fig, ax = plt.subplots()
plt.plot(x, y)
plt.xlim([0, Tm/4])
plt.xlabel('time (s)')
plt.ylabel('$y_n$')
plt.show()

Finally, we can save our data to a numpy file, secrets.npy, to distribute our message:

np.save('secrets', y)


Now, I have done this (for a different secret word) and saved the data; download it here:

ricopic.one/mathematical_foundations/source/secrets.npy

In order to load the npy file into Python, we can use the following command:

secret_array = np.load('secrets.npy')

Your job is to (a) perform a DFT, (b) plot the spectrum, and (c) decode the message. Here are a few hints:

1. Use from scipy import fft to do the DFT.
2. Use a hanning window to minimize the end-effects. See numpy.hanning, for instance. The fft call might then look like

   2*fft(np.hanning(N)*secret_array)/N

   where N = len(secret_array).
3. Use only the positive spectrum; you can lop off the negative side and double the positive side.
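As a sanity check of this recipe, here is a minimal sketch applied to a self-generated test signal standing in for secrets.npy (the single test "letter," its amplitude, and the fixed random seed are my own illustrative choices, not part of the exercise):

```python
import numpy as np
from scipy import fft

np.random.seed(0)  # reproducible noise for this sketch

# a test signal built like the exercise's: noise plus one "letter"
# of amplitude 3 at the first harmonic (1 Hz)
N, Tm = 2000, 30
x = np.linspace(0, Tm, N)
y = 4*np.random.normal(0, 1, N) + 3*np.sin(2*np.pi*1*x)

Y = 2*fft.fft(np.hanning(N)*y)/N      # hint 2: windowed, scaled DFT
Y_pos = np.abs(Y[:N//2])              # hint 3: keep the positive half
freq = fft.fftfreq(N, d=Tm/N)[:N//2]  # frequency axis in Hz

peak_hz = freq[np.argmax(Y_pos)]
print(peak_hz)  # near 1.0
```

The amplitude at each spectral peak, not just its frequency, is what decodes a letter; the peak here stands well above the noise floor because the noise energy is spread over the entire spectrum.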

Exercise 06.2 (seesaw)

Consider the periodic function f: R → R with period T, defined for one period as

$$f(t) = at \quad \text{for } t \in (-T/2, T/2], \tag{ex.2}$$

where a, T ∈ R. Perform a fourier analysis on f. Letting a = 5 and T = 1, plot f along with the partial sum of the fourier synthesis, the first 50 nonzero components, over t ∈ [-T, T].


Exercise 06.3 (9)

Consider the function f: R → R defined as

$$f(t) = \begin{cases} a - a|t|/T & t \in [-T, T] \\ 0 & \text{otherwise} \end{cases} \tag{ex.3}$$

where a, T ∈ R. Perform a fourier analysis on f, resulting in F(ω). Plot F for various a and T.

Exercise 06.4 (10)

Consider the function f: R → R defined as

$$f(t) = a e^{-b|t - T|} \tag{ex.4}$$

where a, b, T ∈ R. Perform a fourier analysis on f, resulting in F(ω). Plot F for various a, b, and T.

Exercise 06.5 (11)

Consider the function f: R → R defined as

$$f(t) = a \cos \omega_0 t + b \sin \omega_0 t \tag{ex.5}$$

where a, b, ω_0 ∈ R are constants. Perform a fourier analysis on f, resulting in F(ω).

Exercise 06.6 (12)

Derive a fourier transform property for expressions including function f: R → R, for

$$f(t) \cos(\omega_0 t + \psi)$$

where ω_0, ψ ∈ R.

Exercise 06.7 (13)

Consider the function f: R → R defined as

$$f(t) = a u_s(t) e^{-bt} \cos(\omega_0 t + \psi) \tag{ex.6}$$

where a, b, ω_0, ψ ∈ R and u_s(t) is the unit step function. Perform a fourier analysis on f, resulting in F(ω). Plot F for various a, b, ω_0, and ψ.


Exercise 06.8 (14)

Consider the function f: R → R defined as

$$f(t) = g(t) \cos(\omega_0 t) \tag{ex.7}$$

where ω_0 ∈ R and g: R → R will be defined in each part below. Perform a fourier analysis on f for each g below, for ω_1 ∈ R a constant, and consider how things change if ω_1 → ω_0.

a. g(t) = cos(ω_1 t)
b. g(t) = sin(ω_1 t)

Exercise 06.9 (savage)

An instrument called a "lock-in amplifier" can measure a sinusoidal signal A cos(ω_0 t + ψ) = a cos(ω_0 t) + b sin(ω_0 t) at a known frequency ω_0 with exceptional accuracy, even in the presence of significant noise N(t). The workings of these devices can be described in two operations: first, the following operations on the signal with its noise, f_1(t) = a cos(ω_0 t) + b sin(ω_0 t) + N(t):

$$f_2(t) = f_1(t) \cos(\omega_1 t) \quad \text{and} \quad f_3(t) = f_1(t) \sin(\omega_1 t), \tag{ex.8}$$

where ω_0, ω_1, a, b ∈ R. Note the relation of this operation to the fourier analysis of Exercise 06.8. The key is to know with some accuracy ω_0 such that the instrument can set ω_1 ≈ ω_0. The second operation on the signal is an aggressive low-pass filter. The filtered f_2 and f_3 are called the in-phase and quadrature components of the signal, and are typically given a complex representation:

(in-phase) + j (quadrature).

Explain, with fourier analyses on f_2 and f_3,


a. what F_2 = F(f_2) looks like,
b. what F_3 = F(f_3) looks like,
c. why we want ω_1 ≈ ω_0,
d. why a low-pass filter is desirable, and
e. what the time-domain signal will look like.

Exercise 06.10 (strawman)

Consider again the lock-in amplifier explored in Exercise 06.9. Investigate the lock-in amplifier numerically with the following steps:

a. Generate a noisy sinusoidal signal at some frequency ω_0. Include enough broadband white noise that the signal is invisible in a time-domain plot.
b. Generate f_2 and f_3 as described in Exercise 06.9.
c. Apply a time-domain discrete low-pass filter to each, f_2 ↦ φ_2 and f_3 ↦ φ_3, such as scipy's scipy.signal.sosfiltfilt, to complete the lock-in amplifier operation. Plot the results in time and as a complex (polar) plot.
d. Perform a discrete fourier transform on each, f_2 ↦ F_2 and f_3 ↦ F_3. Plot the spectra.
e. Construct a frequency-domain low-pass filter F and apply it (multiply) to each, F_2 ↦ F′_2 and F_3 ↦ F′_3. Plot the filtered spectra.
f. Perform an inverse discrete fourier transform on each, F′_2 ↦ f′_2 and F′_3 ↦ f′_3. Plot the results in time and as a complex (polar) plot.
g. Compare the two methods used, i.e. time-domain filtering versus frequency-domain filtering.


07 Partial differential equations

An ordinary differential equation is one with (ordinary) derivatives of functions of a single variable each (time, in many applications). These typically describe quantities in some sort of lumped-parameter way: mass as a "point particle," a spring's force as a function of time-varying displacement across it, a resistor's current as a function of time-varying voltage across it. Given the simplicity of such models in comparison to the wildness of nature, it is quite surprising how well they work for a great many phenomena. For instance, electronics, rigid body mechanics, population dynamics, bulk fluid mechanics, and bulk heat transfer can be lumped-parameter modeled. However, as we saw, there are many phenomena for which we require more detailed models. These include


1. There are some analytic techniques for gaining insight into PDEs for which there are no known solutions, such as considering the phase space. This is an active area of research; for more, see Bove, Colombini, and Del Santo (Antonio Bove, F. (Ferruccio) Colombini, and Daniele Del Santo, Phase Space Analysis of Partial Differential Equations, Progress in Nonlinear Differential Equations and Their Applications, v. 69, Boston/Berlin: Birkhäuser, 2006, ISBN 9780817645212).

• detailed fluid mechanics,
• detailed heat transfer,
• solid mechanics,
• electromagnetism, and
• quantum mechanics.

In many cases, what is required to account for is the time-varying spatial distribution of a quantity. In fluid mechanics, we treat a fluid as having quantities, such as density and velocity, that vary continuously over space and time. Deriving the governing equations for such phenomena typically involves vector calculus: we observed that statements about quantities like the divergence (e.g., continuity) can be made about certain scalar and vector fields. Such statements are governing equations (e.g., the continuity equation), and they are partial differential equations (PDEs) because the quantities of interest, called dependent variables (e.g., density and velocity), are both temporally and spatially varying (temporal and spatial variables are therefore called independent variables).

In this chapter, we explore the analytic solution of PDEs. This is related to, but distinct from, the numeric solution (i.e., simulation) of PDEs, which is another important topic. Many PDEs have no known analytic solution, so numeric solution is the best available option.1 However, it is important to note that the insight one can gain from an analytic solution is often much greater than that from a numeric solution. This is easily understood when one considers that a numeric solution is an approximation for a specific set

2. Kreyszig, Advanced Engineering Mathematics, Ch. 12.
3. W.A. Strauss, Partial Differential Equations: An Introduction, Wiley, 2007, ISBN 9780470054567. A thorough and yet relatively compact introduction.
4. R. Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), Pearson Modern Classics for Advanced Mathematics, Pearson Education Canada, 2018, ISBN 9780134995434.

of initial and boundary conditions. Typically, very little can be said of what would happen in general, although this is often what we seek to know. So, despite the importance of numeric solution, one should always prefer an analytic solution.

Three good texts on PDEs for further study are Kreyszig,2 Strauss,3 and Haberman.4


Classifying PDEs

PDEs often have an infinite number of solutions; however, when applying them to physical systems, we usually assume a deterministic, or at least a probabilistic, sequence of events will occur. Therefore, we impose additional constraints on a PDE, usually in the form of

1. initial conditions: values of the dependent variables over all space at an initial time, and
2. boundary conditions: values of the dependent variables (or their derivatives) on the spatial boundary, over all time.

Ideally, imposing such conditions leaves us with a well-posed problem, which has three aspects (Bove, Colombini, and Del Santo, Phase Space Analysis of Partial Differential Equations, § 1.5):

existence: there exists at least one solution;
uniqueness: there exists at most one solution; and
stability: if the PDE, boundary conditions, or initial conditions are changed slightly, the solution changes only slightly.

As with ODEs, PDEs can be linear or nonlinear; that is, the dependent variables and their derivatives can appear in only linear combinations (linear PDE) or in one or more nonlinear combinations (nonlinear PDE). As with ODEs, there are more known analytic solutions to linear PDEs than to nonlinear PDEs.

The order of a PDE is the order of its highest partial derivative. A great many physical models can be described by second-order PDEs, or systems thereof. Let u be a dependent scalar variable, a function of n temporal and spatial variables x_i ∈ R. A second-order linear PDE has the form, for coefficients α, γ, δ real functions of the x_i (W.A. Strauss, Partial Differential Equations: An Introduction, § 1.6),

$$\underbrace{\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_{ij}\,\partial^2_{x_i x_j} u}_{\text{second-order terms}} + \underbrace{\sum_{k=1}^{n}\left(\gamma_k\,\partial_{x_k} u + \delta_k u\right)}_{\text{first- and zeroth-order terms}} = \underbrace{f(x_1, \cdots, x_n)}_{\text{forcing}} \tag{class.1}$$

where f is called a forcing function. When f is zero, Equation class.1 is called homogeneous.

We can consider the coefficients α_ij to be components of a matrix A, with rows indexed by i and columns indexed by j. There are four prominent classes, defined by the eigenvalues of A:

elliptic: the eigenvalues all have the same sign;
parabolic: the eigenvalues have the same sign, except one that is zero;
hyperbolic: exactly one eigenvalue has the opposite sign of the others; and

ultrahyperbolic: at least two eigenvalues of each sign.

The first three of these have received extensive treatment. They are named after conic sections due to the similarity the equations have with polynomials when derivatives are considered analogous to powers of polynomial variables. For instance, here is a case of each of the first three classes:

$$\partial^2_{xx} u + \partial^2_{yy} u = 0 \tag{elliptic}$$
$$\partial^2_{xx} u - \partial^2_{yy} u = 0 \tag{hyperbolic}$$
$$\partial^2_{xx} u - \partial_t u = 0 \tag{parabolic}$$

When A depends on the x_i, it may have multiple classes across its domain. In general, this equation and its associated initial and boundary conditions do not comprise a well-posed problem; however, several special cases have been shown to be well-posed. Thus far, the most general statement of existence and uniqueness is the cauchy-kowalevski theorem for cauchy problems.
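The class of a constant-coefficient second-order PDE can be read off mechanically from the eigenvalues of A. Below is a minimal sketch of that test (the function classify and its zero tolerance are illustrative choices of mine, not from the text):

```python
import numpy as np

def classify(A, tol=1e-12):
    # classify by the signs of the eigenvalues of the (symmetric)
    # second-order coefficient matrix A
    ev = np.linalg.eigvalsh(np.asarray(A, dtype=float))
    pos = int(np.sum(ev > tol))
    neg = int(np.sum(ev < -tol))
    zero = len(ev) - pos - neg
    if zero == 0 and (pos == 0 or neg == 0):
        return 'elliptic'       # all eigenvalues share a sign
    if zero == 1 and (pos == 0 or neg == 0):
        return 'parabolic'      # same sign except one zero
    if zero == 0 and min(pos, neg) == 1:
        return 'hyperbolic'     # exactly one opposite sign
    return 'ultrahyperbolic or degenerate'

print(classify([[1, 0], [0, 1]]))   # laplace-type equation: elliptic
print(classify([[1, 0], [0, -1]]))  # wave-type equation: hyperbolic
print(classify([[1, 0], [0, 0]]))   # diffusion-type equation: parabolic
```

The three example matrices correspond to the three example PDEs above, with rows and columns indexed by the variables appearing in the second-order terms.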


5. For the S-L problem to be regular, it has the additional constraints that p, q, σ are continuous and p, σ > 0 on [a, b]. This is also sometimes called the sturm-liouville eigenvalue problem. See Haberman (Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 5.3) for the more general (non-regular) S-L problem, and Haberman (ibid., § 7.4) for the multi-dimensional analog.


6. These eigenvalues are closely related to, but distinct from, the "eigenvalues" that arise in systems of linear ODEs.
7. Ibid., § 5.3.

Sturm-liouville problems

Before we introduce an important solution method for PDEs in Lecture pdeseparation, we consider an ordinary differential equation that will arise in that method when dealing with a single spatial dimension x: the sturm-liouville (S-L) differential equation. Let p, q, σ be functions of x on open interval (a, b). Let X be the dependent variable and λ a constant. The regular S-L problem is the S-L ODE5

$$\frac{d}{dx}\left(p X'\right) + qX + \lambda\sigma X = 0 \tag{sturm.1a}$$

with boundary conditions

$$\beta_1 X(a) + \beta_2 X'(a) = 0 \tag{sturm.2a}$$
$$\beta_3 X(b) + \beta_4 X'(b) = 0 \tag{sturm.2b}$$

with coefficients β_i ∈ R. This is a type of boundary value problem.

This problem has nontrivial solutions called eigenfunctions X_n(x), with n ∈ Z+, corresponding to specific values λ = λ_n, called eigenvalues.6 There are several important theorems proven about this (see Haberman7). Of greatest interest to us are that

1. there exist an infinite number of eigenfunctions X_n (unique within a multiplicative constant);
2. there exists a unique corresponding real eigenvalue λ_n for each eigenfunction X_n;
3. the eigenvalues can be ordered as λ_1 < λ_2 < ···;


4. eigenfunction X_n has n - 1 zeros on open interval (a, b); and
5. the eigenfunctions X_n form an orthogonal basis with respect to weighting function σ, such that any piecewise continuous function f: [a, b] → R can be represented by a generalized fourier series on [a, b].

This last theorem will be of particular interest

in Lecture pdeseparation
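A quick numeric illustration of this orthogonality (a sketch, assuming scipy is available) uses the eigenfunctions X_n(x) = sin(nπx/L), which will reappear in Example pdesturm-1, with weight σ(x) = 1:

```python
import numpy as np
from scipy.integrate import quad

# weighted inner product <X_m, X_n>_sigma on [0, L] with sigma(x) = 1
L = 1.0
X = lambda n, x: np.sin(n*np.pi*x/L)
ip = lambda m, n: quad(lambda x: X(m, x)*X(n, x), 0, L)[0]

print(ip(1, 2))  # ~0: distinct eigenfunctions are orthogonal
print(ip(3, 3))  # L/2: the squared norm of an eigenfunction
```

The vanishing cross terms are exactly what make the generalized fourier analysis formula work: projecting f onto X_n isolates the single component α_n.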

Types of boundary conditions

Boundary conditions of the sturm-liouville kind (sturm.2) have four sub-types:

dirichlet: for just β_2, β_4 = 0;
neumann: for just β_1, β_3 = 0;
robin: for all β_i ≠ 0; and
mixed: if β_1 = 0, β_3 ≠ 0; if β_2 = 0, β_4 ≠ 0.

There are many problems that are not regular sturm-liouville problems. For instance, the right-hand sides of Equation sturm.2 are zero, making them homogeneous boundary conditions; however, these can also be nonzero. Another case is periodic boundary conditions:

$$X(a) = X(b) \tag{sturm.3a}$$
$$X'(a) = X'(b) \tag{sturm.3b}$$


Example pdesturm-1: a sturm-liouville problem with dirichlet boundary conditions

Problem Statement

Consider the differential equation

$$X'' + \lambda X = 0 \tag{sturm.4}$$

with dirichlet boundary conditions on the boundary of the interval [0, L]:

$$X(0) = 0 \quad \text{and} \quad X(L) = 0. \tag{sturm.5}$$

Solve for the eigenvalues and eigenfunctions.

Problem Solution

This is a sturm-liouville problem, so we know the eigenvalues are real. The well-known general solution to the ODE is

$$X(x) = \begin{cases} k_1 + k_2 x & \lambda = 0 \\ k_1 e^{j\sqrt{\lambda}x} + k_2 e^{-j\sqrt{\lambda}x} & \text{otherwise} \end{cases} \tag{sturm.6}$$

with constants k_1, k_2. The solution must also satisfy the boundary conditions. Let's apply them to the case of λ = 0 first:

$$X(0) = 0 \Rightarrow k_1 + k_2(0) = 0 \Rightarrow k_1 = 0 \tag{sturm.7}$$
$$X(L) = 0 \Rightarrow k_1 + k_2 L = 0 \Rightarrow k_2 = -k_1/L \tag{sturm.8}$$

Together these imply k_1 = k_2 = 0, which gives

the trivial solution X(x) = 0, in which we aren't interested. We say, then, that for nontrivial solutions, λ ≠ 0. Now let's check λ < 0. The solution becomes

$$X(x) = k_1 e^{-\sqrt{|\lambda|}\,x} + k_2 e^{\sqrt{|\lambda|}\,x} \tag{sturm.9}$$
$$\phantom{X(x)} = k_3 \cosh\left(\sqrt{|\lambda|}\,x\right) + k_4 \sinh\left(\sqrt{|\lambda|}\,x\right) \tag{sturm.10}$$

where k_3 and k_4 are real constants. Again applying the boundary conditions,

$$X(0) = 0 \Rightarrow k_3 \cosh(0) + k_4 \sinh(0) = 0 \Rightarrow k_3 = 0$$
$$X(L) = 0 \Rightarrow 0 \cdot \cosh\left(\sqrt{|\lambda|}\,L\right) + k_4 \sinh\left(\sqrt{|\lambda|}\,L\right) = 0 \Rightarrow k_4 \sinh\left(\sqrt{|\lambda|}\,L\right) = 0.$$

However, sinh(√|λ| L) ≠ 0 for L > 0, so k_4 = k_3 = 0: again, the trivial solution. Now let's try λ > 0. The solution can be written

$$X(x) = k_5 \cos\left(\sqrt{\lambda}\,x\right) + k_6 \sin\left(\sqrt{\lambda}\,x\right). \tag{sturm.11}$$

Applying the boundary conditions for this case,

$$X(0) = 0 \Rightarrow k_5 \cos(0) + k_6 \sin(0) = 0 \Rightarrow k_5 = 0$$
$$X(L) = 0 \Rightarrow 0 \cdot \cos\left(\sqrt{\lambda}\,L\right) + k_6 \sin\left(\sqrt{\lambda}\,L\right) = 0 \Rightarrow k_6 \sin\left(\sqrt{\lambda}\,L\right) = 0.$$

Now, sin(√λ L) = 0 for

$$\sqrt{\lambda}\,L = n\pi \Rightarrow \lambda = \left(\frac{n\pi}{L}\right)^2, \quad (n \in \mathbb{Z}^+).$$

Therefore, the only nontrivial solutions that satisfy both the ODE and the boundary conditions are the eigenfunctions

$$X_n(x) = \sin\left(\sqrt{\lambda_n}\,x\right) \tag{sturm.12a}$$
$$\phantom{X_n(x)} = \sin\left(\frac{n\pi}{L}x\right) \tag{sturm.12b}$$

with corresponding eigenvalues

$$\lambda_n = \left(\frac{n\pi}{L}\right)^2. \tag{sturm.13}$$

Note that because λ > 0, λ_1 is the lowest eigenvalue.

Plotting the eigenfunctions

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: eigenfunctions_example_plot.ipynb
notebook kernel: python3

First, load some Python packages:


import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Set L = 1 and compute values for the first four eigenvalues lambda_n and eigenfunctions X_n:

L = 1
x = np.linspace(0, L, 100)
n = np.linspace(1, 4, 4, dtype=int)
lambda_n = (n*np.pi/L)**2
X_n = np.zeros([len(n), len(x)])
for i, n_i in enumerate(n):
    X_n[i] = np.sin(np.sqrt(lambda_n[i])*x)

Plot the eigenfunctions:

for i, n_i in enumerate(n):
    plt.plot(
        x, X_n[i],
        linewidth=2,
        label='n = ' + str(n_i)
    )
plt.legend()
plt.show()  # display the plot

[Figure: the first four eigenfunctions X_n(x) on [0, 1], labeled n = 1, 2, 3, 4]


We see that the fourth of the S-L theorems appears true: n - 1 zeros of X_n exist on the open interval (0, 1).


PDE solution by separation of variables

We are now ready to learn one of the most important techniques for solving PDEs: separation of variables. It applies only to linear PDEs, since it requires the principle of superposition. Not all linear PDEs yield to this solution technique, but several important ones do.

The technique includes the following steps.

assume a product solution: Assume the solution can be written as a product solution u_p, the product of functions of each independent variable.

separate PDE: Substitute u_p into the PDE and rearrange such that at least one side of the equation has functions of a single independent variable. If this is possible, the PDE is called separable.

set equal to a constant: Each side of the equation depends on different independent variables; therefore, they must each equal the same constant, often called -λ.

repeat separation as needed: If there are more than two independent variables, there will be an ODE in the separated variable and a PDE (with one fewer variable) in the other independent variables. Attempt to separate the PDE until only ODEs remain.

solve each boundary value problem: Solve each boundary value problem ODE,


ignoring the initial conditions for now.

solve the time variable ODE: Solve for the general solution of the time variable ODE, sans initial conditions.

construct the product solution: Multiply the solution in each variable to construct the product solution u_p. If the boundary value problems were sturm-liouville, the product solution is a family of eigenfunctions, from which any function can be constructed via a generalized fourier series.

apply the initial condition: The product solutions individually usually do not meet the initial condition. However, a generalized fourier series of them nearly always does. Superposition tells us a linear combination of solutions to the PDE and boundary conditions is also a solution; the unique series that also satisfies the initial condition is the unique solution to the entire problem.

Example pdeseparation-1: 1D diffusion equation

Problem Statement

Consider the one-dimensional diffusion equation PDEa

$$\partial_t u(t, x) = k\,\partial^2_{xx} u(t, x) \tag{separation.1}$$

with real constant k, with dirichlet boundary conditions on interval x ∈ [0, L],

$$u(t, 0) = 0 \tag{separation.2a}$$
$$u(t, L) = 0 \tag{separation.2b}$$

and with initial condition

$$u(0, x) = f(x) \tag{separation.3}$$

where f is some piecewise continuous function on [0, L].

a. For more on the diffusion or heat equation, see Haberman (Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 2.3), Kreyszig (Advanced Engineering Mathematics, § 12.5), and Strauss (Partial Differential Equations: An Introduction, § 2.3).

Problem Solution

Assume a product solution

First, we assume a product solution of the form u_p(t, x) = T(t)X(x), where T and X are unknown functions on t > 0 and x ∈ [0, L].

Separate PDE

Second, we substitute the product solution into Equation separation.1 and separate variables:

$$T'X = kTX'' \Rightarrow \tag{separation.4}$$
$$\frac{T'}{kT} = \frac{X''}{X}. \tag{separation.5}$$

So it is separable. Note that we chose to group k with T, which was arbitrary but conventional.

Set equal to a constant

Since these two sides depend on different independent variables (t and x), they must equal the same constant, which we call -λ, so we have two ODEs:

$$\frac{T'}{kT} = -\lambda \Rightarrow T' + \lambda k T = 0 \tag{separation.6}$$
$$\frac{X''}{X} = -\lambda \Rightarrow X'' + \lambda X = 0. \tag{separation.7}$$

Solve the boundary value problem

The latter of these equations, with the boundary conditions (separation.2), is precisely the same sturm-liouville boundary value problem from Example pdesturm-1, which had eigenfunctions

$$X_n(x) = \sin\left(\sqrt{\lambda_n}\,x\right) \tag{separation.8a}$$
$$\phantom{X_n(x)} = \sin\left(\frac{n\pi}{L}x\right) \tag{separation.8b}$$

with corresponding (positive) eigenvalues

$$\lambda_n = \left(\frac{n\pi}{L}\right)^2. \tag{separation.9}$$

Solve the time variable ODE

The time variable ODE is homogeneous and has the familiar general solution

$$T(t) = c e^{-k\lambda t} \tag{separation.10}$$

with real constant c. However, the boundary value problem restricted values of λ to λ_n, so

$$T_n(t) = c e^{-k(n\pi/L)^2 t}. \tag{separation.11}$$

Construct the product solution

The product solution is

$$u_p(t, x) = T_n(t)X_n(x) = c e^{-k(n\pi/L)^2 t} \sin\left(\frac{n\pi}{L}x\right). \tag{separation.12}$$

This is a family of solutions that each satisfy only exotically specific initial conditions.

Apply the initial condition

The initial condition is u(0, x) = f(x). The eigenfunctions of the boundary value problem form a fourier series that satisfies the initial condition on the interval [0, L] if we extend f to be periodic and odd over x (Kreyszig, Advanced Engineering Mathematics, p. 550); we call the extension f*. The odd series synthesis can be written

$$f^*(x) = \sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi}{L}x\right) \tag{separation.13}$$

where the fourier analysis gives

$$b_n = \frac{2}{L} \int_0^L f^*(\chi) \sin\left(\frac{n\pi}{L}\chi\right) d\chi. \tag{separation.14}$$

So the complete solution is

$$u(t, x) = \sum_{n=1}^{\infty} b_n e^{-k(n\pi/L)^2 t} \sin\left(\frac{n\pi}{L}x\right). \tag{separation.15}$$

Notice this satisfies the PDE, the boundary conditions, and the initial condition.
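Before plotting, a symbolic spot-check (a sketch, assuming sympy is available) confirms that each term of the series satisfies the PDE and the dirichlet boundary conditions:

```python
import sympy as sp

t, x, k, L = sp.symbols('t x k L', positive=True)
n = sp.symbols('n', integer=True, positive=True)

# one term of the series solution (separation.15), with b_n omitted
u = sp.exp(-k*(n*sp.pi/L)**2*t)*sp.sin(n*sp.pi*x/L)

# the PDE residual u_t - k u_xx should vanish identically
print(sp.simplify(sp.diff(u, t) - k*sp.diff(u, x, 2)))  # 0
# the dirichlet boundary conditions at x = 0 and x = L
print(u.subs(x, 0), u.subs(x, L))  # 0 0
```

Since each term satisfies the linear homogeneous PDE and boundary conditions, so does the superposition; only the initial condition remains, which fixes the b_n.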

Plotting solutions

If we want to plot solutions we need to

specify an initial condition u(0 x) = flowast(x)

over [0 L] We can choose anything piecewise

continuous but for simplicity letrsquos let

f(x) = 1 (x isin [0 L])

The odd periodic extension is an odd square

wave The integral (separation14) gives

bn =4

nπ(1 ndash cos(nπ))

=

0 n even

4nπ n odd

(separation16)

Now we can write the solution as

$$u(t, x) = \sum_{n=1,\ n\ \text{odd}}^{\infty} \frac{4}{n\pi} e^{-k(n\pi/L)^2 t} \sin\left(\frac{n\pi}{L}x\right). \tag{separation.17}$$

Plotting in Python

First, load some Python packages:

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex


Set k = L = 1 and sum values for the first N terms of the solution:

L = 1
k = 1
N = 100
x = np.linspace(0, L, 300)
t = np.linspace(0, 2*(L/np.pi)**2, 100)
u_n = np.zeros([len(t), len(x)])
for n in range(N):
    n = n + 1  # because index starts at 0
    if n % 2 == 0:  # even
        pass  # already initialized to zeros
    else:  # odd
        u_n += 4/(n*np.pi)*np.outer(
            np.exp(-k*(n*np.pi/L)**2*t),
            np.sin(n*np.pi/L*x)
        )

Let's first plot the initial condition:

p = plt.figure()
plt.plot(x, u_n[0])
plt.xlabel('space $x$')
plt.ylabel('$u(0,x)$')
plt.show()

[Figure: the truncated series approximation of the initial condition, u(0, x) versus space x]


Now we plot the entire response:

p = plt.figure()
plt.contourf(t, x, u_n.T)
c = plt.colorbar()
c.set_label('$u(t,x)$')
plt.xlabel('time $t$')
plt.ylabel('space $x$')
plt.show()

[Figure: contour plot of u(t, x) over time t and space x]

We see the diffusive action proceeds as we expected.

Python code in this section was generated from a Jupyter notebook named pde_separation_example_01.ipynb with a python3 kernel.

8. Kreyszig, Advanced Engineering Mathematics, §§ 12.9, 12.10.
9. Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), §§ 4.5, 7.3.

The 1D wave equation

The one-dimensional wave equation is the linear PDE

$$\partial^2_{tt} u(t, x) = c^2\,\partial^2_{xx} u(t, x) \tag{wave.1}$$

with real constant c. This equation models such phenomena as strings, fluids, sound, and light. It is subject to initial and boundary conditions and can be extended to multiple spatial dimensions. For 2D and 3D examples in rectangular and polar coordinates, see Kreyszig8 and Haberman9.

Example pdewave-1: vibrating string, PDE solution by separation of variables

Problem Statement

Consider the one-dimensional wave equation PDE

$$\partial^2_{tt} u(t, x) = c^2\,\partial^2_{xx} u(t, x) \tag{wave.2}$$

with real constant c and with dirichlet boundary conditions on interval x ∈ [0, L],

$$u(t, 0) = 0 \quad \text{and} \quad u(t, L) = 0, \tag{wave.3a}$$

and with initial conditions (we need two because of the second time-derivative)

$$u(0, x) = f(x) \quad \text{and} \quad \partial_t u(0, x) = g(x) \tag{wave.4}$$

where f and g are some piecewise continuous functions on [0, L].

Problem Solution

Assume a product solution

First, we assume a product solution of the form u_p(t, x) = T(t)X(x), where T and X are unknown functions on t > 0 and x ∈ [0, L].

Separate PDE

Second, we substitute the product solution into Equation wave.2 and separate variables:

$$T''X = c^2 T X'' \Rightarrow \tag{wave.5}$$
$$\frac{T''}{c^2 T} = \frac{X''}{X}. \tag{wave.6}$$

So it is separable. Note that we chose to group c with T, which was arbitrary but conventional.

Set equal to a constant

Since these two sides depend on different independent variables (t and x), they must equal the same constant, which we call -λ, so we have two ODEs:

$$\frac{T''}{c^2 T} = -\lambda \Rightarrow T'' + \lambda c^2 T = 0 \tag{wave.7}$$
$$\frac{X''}{X} = -\lambda \Rightarrow X'' + \lambda X = 0. \tag{wave.8}$$

Solve the boundary value problem

The latter of these equations with the

boundary conditions (wave3) is precisely

the same sturm-liouville boundary value

problem from Example pdesturm-1 which had

eigenfunctions

Xn(x) = sin(radic

λnx)

(wave9a)

= sin(nπLx)

(wave9b)

with corresponding (positive) eigenvalues

λn =(nπL

)2 (wave10)

Solve the time variable ODE

The time variable ODE is homogeneous and, with λ restricted to the reals by the boundary value problem, has the familiar general solution

$$T(t) = k_1 \cos\left(c\sqrt{\lambda}\,t\right) + k_2 \sin\left(c\sqrt{\lambda}\,t\right) \tag{wave.11}$$

with real constants k_1 and k_2. However, the boundary value problem restricted values of λ to λ_n, so

$$T_n(t) = k_1 \cos\left(\frac{cn\pi}{L}t\right) + k_2 \sin\left(\frac{cn\pi}{L}t\right). \tag{wave.12}$$

Construct the product solution

The product solution is

$$u_p(t, x) = T_n(t)X_n(x) = k_1 \sin\left(\frac{n\pi}{L}x\right)\cos\left(\frac{cn\pi}{L}t\right) + k_2 \sin\left(\frac{n\pi}{L}x\right)\sin\left(\frac{cn\pi}{L}t\right).$$

This is a family of solutions that each satisfy only exotically specific initial conditions.

Apply the initial conditions

Recall that superposition tells us that any

linear combination of the product solution is

also a solution Therefore

u(t x) =infinsumn=1

an sin(nπLx)cos

(cnπLt)+ bn sin

(nπLx)sin

(cnπLt)

(wave13)

is a solution If an and bn are properly

selected to satisfy the initial conditions

Equation wave13 will be the solution to the

entire problem Substituting t = 0 into our

potential solution gives

u(0 x) =infinsumn=1

an sin(nπLx)

(wave14a)

parttu(t x)|t=0 =infinsumn=1

bncnπ

Lsin

(nπLx) (wave14b)

Let us extend f and g to be periodic and odd

over x we call the extensions flowast and glowast From

Equation wave14 the intial conditions are

satsified if

flowast(x) =infinsumn=1

an sin(nπLx)

(wave15a)

glowast(x) =infinsumn=1

bncnπ

Lsin

(nπLx) (wave15b)


We identify these as two odd fourier syntheses. The corresponding fourier analyses are

an = (2/L) ∫₀^L f∗(χ) sin(nπχ/L) dχ (wave16a)
bn (cnπ/L) = (2/L) ∫₀^L g∗(χ) sin(nπχ/L) dχ. (wave16b)

So the complete solution is Equation wave13

with components given by Equation wave16

Notice this satisfies the PDE the boundary

conditions and the initial condition
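These fourier analyses rest on the orthogonality of the eigenfunctions sin(nπx/L) on [0, L]. The following is a quick numerical sanity check of that orthogonality (our own sketch, not part of the original development; L = 1 and the indices 3 and 5 are arbitrary choices):

```python
import numpy as np

# Verify orthogonality of the eigenfunctions sin(n pi x / L) on [0, L]:
# the inner-product integral is 0 for n != m and L/2 for n == m.
L = 1.0
x = np.linspace(0.0, L, 20001)
phi = lambda n: np.sin(n * np.pi * x / L)

off_diag = np.trapz(phi(3) * phi(5), x)  # n != m: should be ~0
diag = np.trapz(phi(3) * phi(3), x)      # n == m: should be ~L/2
```

This orthogonality is why each analysis integral of Equation wave16 isolates a single coefficient.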

Discussion

It can be shown that this series solution is equivalent to two traveling waves that are interfering (see Haberman^a and Kreyszig^b).

This is convenient because computing the

series solution exactly requires an infinite

summation We show in the following section

that the approximation by partial summation is

still quite good

Choosing specific initial conditions

If we want to plot solutions, we need to specify initial conditions over [0, L]. Let's model a string being suddenly struck from rest as

f(x) = 0 (wave17)
g(x) = δ(x − ∆L), (wave18)

where δ is the dirac delta distribution and ∆ ∈ [0, 1] is a fraction of L representing the location ∆L of the string being struck. The odd periodic extension is an odd pulse train. The integrals of (wave16) give

an = 0 (wave19a)
bn = (2/(cnπ)) ∫₀^L δ(χ − ∆L) sin(nπχ/L) dχ
   = (2/(cnπ)) sin(nπ∆). (sifting property)


Now we can write the solution as

u(t, x) = ∑_{n=1}^∞ (2/(cnπ)) sin(nπ∆) sin(nπx/L) sin(cnπt/L). (wave20)

Plotting in Python

First load some Python packages

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Set c = L = 1 and sum values for the first

N terms of the solution for some striking

location ∆

Delta = 0.1  # 0 <= Delta <= L
L = 1
c = 1
N = 150
t = np.linspace(0, 30*(L/np.pi)**2, 100)
x = np.linspace(0, L, 150)
t_b, x_b = np.meshgrid(t, x)
u_n = np.zeros([len(x), len(t)])
for n in range(N):
    n = n+1  # because index starts at 0
    u_n += 4/(c*n*np.pi) * \
        np.sin(n*np.pi*Delta) * \
        np.sin(c*n*np.pi/L*t_b) * \
        np.sin(n*np.pi/L*x_b)

Let's first plot some early snapshots of the response.


import seaborn as sns
n_snaps = 7
sns.set_palette(sns.diverging_palette(
    240, 10, n=n_snaps, center="dark"
))
fig, ax = plt.subplots()
it = np.linspace(2, 77, n_snaps, dtype=int)
for i in range(len(it)):
    ax.plot(x, u_n[:, it[i]], label=f"t = {t[it[i]]:.3f}")
lgd = ax.legend(
    bbox_to_anchor=(1.05, 1),
    loc="upper left",
)
plt.xlabel("space $x$")
plt.ylabel("$u(t,x)$")
plt.show()

[Figure: early snapshots of the response u(t, x) over space x, at seven times from t = 0.000 to t = 0.184.]

Now we plot the entire response

fig, ax = plt.subplots()
p = ax.contourf(t, x, u_n)
c = fig.colorbar(p, ax=ax)
c.set_label("$u(t,x)$")
plt.xlabel("time $t$")
plt.ylabel("space $x$")
plt.show()


[Figure: contour plot of the entire response u(t, x) over time t and space x.]

We see a wave develop and travel reflecting

and inverting off each boundary

a. Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 4.4.
b. Kreyszig, Advanced Engineering Mathematics, § 12.2.
Python code in this section was generated from a Jupyter notebook named pde_separation_example_02.ipynb with a python3 kernel.


pdeex Exercises for Chapter 07

Exercise 071 (poltergeist)

The PDE of Example pdeseparation-1 can be used

to describe the conduction of heat along

a long thin rod insulated along its length

where u(t x) represents temperature The initial

and dirichlet boundary conditions in that

example would be interpreted as an initial

temperature distribution along the bar and

fixed temperatures of the ends Now consider

the same PDE

∂t u(t, x) = k ∂²xx u(t, x) (ex1)

with real constant k, now with neumann boundary conditions on interval x ∈ [0, L],

∂x u|_{x=0} = 0 and ∂x u|_{x=L} = 0, (ex2a)

and with initial condition

u(0, x) = f(x), (ex3)

where f is some piecewise continuous function

on [0 L] This represents the complete insulation

of the ends of the rod such that no heat flows

from the ends (or from anywhere else)

a Assume a product solution separate

variables into X(x) and T(t) and set the

separation constant to ndashλ

b Solve the boundary value problem for its

eigenfunctions Xn and eigenvalues λn

c Solve for the general solution of the

time variable ODE

d Write the product solution and apply the

initial condition f(x) by constructing it

from a generalized fourier series of the

product solution


e. Let L = k = 1 and f(x) = 100 − (200/L)|x − L/2| as shown in Figure 071. Compute the solution series components. Plot the sum of the first 50 terms over x and t.

[Figure 071: initial condition f(x) for Exercise 071; a triangular profile over x ∈ [0, 1] peaking at 100 at x = L/2.]

Exercise 072 (kathmandu)

Consider the free vibration of a uniform and relatively thin beam, with modulus of elasticity E, second moment of cross-sectional area I, and mass-per-length μ, pinned at each end. The PDE describing this is a version of the euler-bernoulli beam equation for vertical motion u:

∂²tt u(t, x) = −α² ∂⁴xxxx u(t, x) (ex4)

with real constant α defined as

α² = EI/μ. (ex5)

Pinned supports fix vertical motion such that we have boundary conditions on interval x ∈ [0, L]:

u(t, 0) = 0 and u(t, L) = 0. (ex6a)

Additionally, pinned supports cannot provide a moment, so

∂²xx u|_{x=0} = 0 and ∂²xx u|_{x=L} = 0. (ex6b)

Furthermore, consider the initial conditions

u(0, x) = f(x) and ∂t u|_{t=0} = 0, (ex7a)

where f is some piecewise continuous function

on [0 L]


a Assume a product solution separate

variables into X(x) and T(t) and set the

separation constant to ndashλ

b Solve the boundary value problem for

its eigenfunctions Xn and eigenvalues λn

Assume real λ gt 0 (itrsquos true but tedious to

show)

c Solve for the general solution of the

time variable ODE

d Write the product solution and apply

the initial conditions by constructing

it from a generalized fourier series of

the product solution

e. Let L = α = 1 and f(x) = sin(10πx/L) as shown in Figure 072. Compute the solution series components. Plot the sum of the first 50 terms over x and t.

[Figure 072: initial condition f(x) for Exercise 072; a sinusoid over x ∈ [0, 1] with amplitude 1.]

opt

Optimization



optgrad Gradient descent

Consider a multivariate function f : ℝⁿ → ℝ that represents some cost or value. This is called an objective function, and we often want to find an X ∈ ℝⁿ that yields f's extremum: minimum or maximum, depending on whichever is desirable.

It is important to note however that some

functions have no finite extremum Other

functions have multiple Finding a global

extremum is generally difficult; however, many good methods exist for finding a local extremum, an extremum for some region R ⊂ ℝⁿ.

The method explored here is called gradient

descent It will soon become apparent why it

has this name

Stationary points

Recall from basic calculus that a function f of a single variable has potential local extrema where df(x)/dx = 0. The multivariate version of this for multivariate function f is

grad f = 0. (grad1)

A value X for which Equation grad1 holds is called a stationary point. However, as in the univariate case, a stationary point may not be a local extremum; in these cases it is called a saddle point.

Consider the hessian matrix H with values, for independent variables xi,

Hij = ∂²f/(∂xi ∂xj). (grad2)



For a stationary point X, the second partial derivative test tells us if it is a local maximum, local minimum, or saddle point:

minimum: if H(X) is positive definite (all its eigenvalues are positive), X is a local minimum;
maximum: if H(X) is negative definite (all its eigenvalues are negative), X is a local maximum;
saddle: if H(X) is indefinite (it has both positive and negative eigenvalues), X is a saddle point.

These are sometimes called tests for concavity: minima occur where f is convex and maxima where f is concave (i.e. where −f is convex).
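As a concrete sketch (ours, with an assumed example function), consider f(x1, x2) = x1² − x2², whose only stationary point is the origin; its constant hessian has mixed-sign eigenvalues, so the test classifies the origin as a saddle point:

```python
import numpy as np

# Hessian of f(x1, x2) = x1**2 - x2**2 (constant for this f)
H = np.array([[2.0, 0.0],
              [0.0, -2.0]])
eigs = np.linalg.eigvalsh(H)               # eigenvalues, ascending
is_min = bool(np.all(eigs > 0))            # positive definite?
is_max = bool(np.all(eigs < 0))            # negative definite?
is_saddle = (not is_min) and (not is_max)  # indefinite here: saddle
```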

It turns out however that solving

Equation grad1 directly for stationary points is

generally hard Therefore we will typically use

an iterative technique for estimating them

The gradient points the way

Although Equation grad1 isnrsquot usually directly

useful for computing stationary points it

suggests iterative techniques that are Several

techniques rely on the insight that the

gradient points toward stationary points

Recall from Lecture vecsgrad that grad f is a

vector field that points in the direction of

greatest increase in f



Consider starting at some point x0 and

wanting to move iteratively closer to a

stationary point So if one is seeking a

maximum of f then choose x1 to be in the

direction of grad f If one is seeking a

minimum of f then choose x1 to be opposite the

direction of grad f

The question becomes: how far, α, should we go in (or opposite) the direction of the gradient? Surely too-small α will require more iteration, and too-large α will lead to poor convergence or missing minima altogether. This framing of the problem is called line search. There are a few common methods for choosing α, called the step size, some more computationally efficient than others.

Two methods for choosing the step size

are described below Both are framed as

minimization methods but changing the sign of

the step turns them into maximization methods

The classical method

Let

gk = grad f(xk), (grad3)

the gradient at the algorithm's current estimate xk of the minimum. The classical method of choosing α is to attempt to solve analytically for

αk = argmin_α f(xk − α gk). (grad4)

This solution approximates the function f

as one varies α It is approximate because

as α varies so should x But even with α as

the only variable Equation grad4 may be


1. Kreyszig, Advanced Engineering Mathematics, § 22.1.
2. Jonathan Barzilai and Jonathan M. Borwein. "Two-Point Step Size Gradient Methods". In: IMA Journal of Numerical Analysis 8.1 (Jan. 1988), pp. 141-148. issn: 0272-4979. doi: 10.1093/imanum/8.1.141. This includes an innovative line search method.

difficult or impossible to solve. However, this is sometimes called the "optimal" choice for α. Here "optimality" refers not to practicality but to ideality. This method is rarely used to solve practical problems.

The algorithm of the classical gradient descent

method can be summarized in the pseudocode

of Algorithm 1 It is described further in

Kreyszig1

Algorithm 1: Classical gradient descent

1: procedure classical_minimizer(f, x0, T)
2:   while δx > T do  ▷ until threshold T is met
3:     gk ← grad f(xk)
4:     αk ← argmin_α f(xk − α gk)
5:     xk+1 ← xk − αk gk
6:     δx ← ‖xk+1 − xk‖
7:     k ← k + 1
8:   end while
9:   return xk  ▷ the threshold was reached
10: end procedure

The Barzilai and Borwein method

In practice several non-classical methods are

used for choosing step size α Most of these

construct criteria for step sizes that are too

small and too large and prescribe choosing

some α that (at least in certain cases) must

be in the sweet-spot in between Barzilai and

Borwein2 developed such a prescription which

we now present

Let ∆xk = xk − xk−1 and ∆gk = gk − gk−1. This method minimizes ‖∆x − α∆g‖² by choosing

αk = (∆xk · ∆gk)/(∆gk · ∆gk). (grad5)
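Equation grad5 is enough to write a bare-bones descent loop. Here is a minimal sketch (our own toy example, distinct from the class developed below): it minimizes an assumed quadratic f(x) = ½ x · Ax − b · x, whose gradient is Ax − b, bootstrapping the first step with a small fixed size since αk needs two iterates:

```python
import numpy as np

# Barzilai-Borwein descent on an assumed quadratic objective
# f(x) = 1/2 x.A x - b.x, with gradient g(x) = A x - b.
A = np.array([[20.0, 0.0],
              [0.0, 10.0]])
b = np.array([1.0, 1.0])
grad = lambda x: A @ x - b

x_prev = np.array([5.0, 5.0])
g_prev = grad(x_prev)
x = x_prev - 1e-3 * g_prev             # bootstrap first step
for _ in range(100):
    g = grad(x)
    dx, dg = x - x_prev, g - g_prev
    if dg @ dg < 1e-30:                # gradient no longer changes: stop
        break
    alpha = (dx @ dg) / (dg @ dg)      # the step size of (grad5)
    x_prev, g_prev = x, g
    x = x - alpha * g

x_star = np.linalg.solve(A, b)         # exact minimizer, for comparison
```

For a quadratic like this, the exact minimizer is A⁻¹b, so the loop's result can be checked directly.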


3. Barzilai and Borwein, "Two-Point Step Size Gradient Methods".

The algorithm of this gradient descent

method can be summarized in the pseudocode

of Algorithm 2 It is described further in

Barzilai and Borwein3

Algorithm 2: Barzilai and Borwein gradient descent

1: procedure barzilai_minimizer(f, x0, T)
2:   while δx > T do  ▷ until threshold T is met
3:     gk ← grad f(xk)
4:     ∆gk ← gk − gk−1
5:     ∆xk ← xk − xk−1
6:     αk ← (∆xk · ∆gk)/(∆gk · ∆gk)
7:     xk+1 ← xk − αk gk
8:     δx ← ‖xk+1 − xk‖
9:     k ← k + 1
10:  end while
11:  return xk  ▷ the threshold was reached
12: end procedure

Example optgrad-1 re Barzilai and Borwein gradient descent

Problem Statement

Consider the functions (a) f1 : ℝ² → ℝ and (b) f2 : ℝ² → ℝ defined as

f1(x) = (x1 − 25)² + 13(x2 + 10)² (grad6)
f2(x) = (1/2) x · Ax − b · x, (grad7)

where

A = [ 10  0
       0 20 ] and (grad8a)
b = [ 1 1 ]ᵀ. (grad8b)

Use the method of Barzilai and Borwein,^a starting at some x0, to find a minimum of each function.

a. Ibid.

Problem Solution

First load some Python packages


from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex
from tabulate import tabulate

We begin by writing a class gradient_descent_min to perform the gradient descent. This is not optimized for speed.

class gradient_descent_min():
    """A Barzilai and Borwein gradient descent class.

    Inputs:
      f:  Python function of x variables
      x:  list of symbolic variables (e.g. [x1, x2])
      x0: list, numeric initial guess of a min of f
      T:  step size threshold for stopping the descent

    To execute the gradient descent, call the descend method.

    nb: This is only for gradients in cartesian coordinates.
    Further work would be to implement this in multiple or
    generalized coordinates. See the grad method below for
    the implementation.
    """
    def __init__(self, f, x, x0, T):
        self.f = f
        self.x = Array(x)
        self.x0 = np.array(x0)
        self.T = T
        self.n = len(x0)  # size of x
        self.g = lambdify(x, self.grad(f, x), "numpy")
        self.xk = np.array(x0)
        self.table = {}

    def descend(self):
        # unpack variables
        f = self.f
        x = self.x
        x0 = self.x0
        T = self.T
        g = self.g
        # initialize variables
        N = 0
        x_k = x0
        dx = 2*T  # can't be zero
        x_km1 = .9*x0 - .1  # can't equal x0
        g_km1 = np.array(g(*x_km1))
        N_max = 100  # max iterations
        table_data = [[N, x0, np.array(g(*x0)), 0]]
        while (dx > T and N < N_max) or N < 1:
            N += 1  # increment index
            g_k = np.array(g(*x_k))
            dg_k = g_k - g_km1
            dx_k = x_k - x_km1
            alpha_k = dx_k.dot(dg_k)/dg_k.dot(dg_k)
            x_km1 = x_k  # store
            x_k = x_k - alpha_k*g_k
            # save
            table_data.append([N, x_k, g_k, alpha_k])
            self.xk = np.vstack((self.xk, x_k))
            # store other variables
            g_km1 = g_k
            dx = np.linalg.norm(x_k - x_km1)  # check
        self.tabulater(table_data)

    def tabulater(self, table_data):
        import tabulate as tabulate_module
        np.set_printoptions(precision=2)
        tabulate_module.LATEX_ESCAPE_RULES = {}
        self.table["python"] = tabulate(
            table_data,
            headers=["N", "x_k", "g_k", "alpha"],
        )
        self.table["latex"] = tabulate(
            table_data,
            headers=["$N$", "$\\bm{x}_k$", "$\\bm{g}_k$", "$\\alpha$"],
            tablefmt="latex_raw",
        )

    def grad(self, f, x):  # cartesian coords gradient
        return derive_by_array(f(x), x)

First consider f1

var("x1 x2")
x = Array([x1, x2])
f1 = lambda x: (x[0]-25)**2 + 13*(x[1]+10)**2
gd = gradient_descent_min(f=f1, x=x, x0=[-50, 40], T=1e-8)


Perform the gradient descent

gd.descend()

Print the interesting variables

print(gd.table["python"])

N  x_k              g_k                     α
0  [-50  40]        [-150 1300]             0
1  [-43.65 -15.03]  [-150 1300]             0.0423296
2  [-38.36 -10.  ]  [-137.3  -130.74]       0.0384979
3  [-33.11 -10.  ]  [-1.27e+02  1.24e-01]   0.041454
4  [ 24.99 -10.  ]  [-1.16e+02 -9.62e-03]   0.499926
5  [ 25.   -10.05]  [-0.02  0.12]           0.499999
6  [ 25.   -10.  ]  [-1.84e-08 -1.38e+00]   0.0385225
7  [ 25.   -10.  ]  [-1.70e-08  2.19e-03]   0.0384615
8  [ 25.   -10.  ]  [-1.57e-08  0.00e+00]   0.0384615

Now let's lambdify the function f1 so we can plot:

f1_lambda = lambdify((x1, x2), f1(x), "numpy")

Now let's plot a contour plot with the gradient descent overlaid:


fig, ax = plt.subplots()
# contour plot
X1 = np.linspace(-100, 100, 100)
X2 = np.linspace(-50, 50, 100)
X1, X2 = np.meshgrid(X1, X2)
F1 = f1_lambda(X1, X2)
plt.contourf(X1, X2, F1)
# gradient descent plot
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.collections import LineCollection
xX1 = gd.xk[:, 0]
xX2 = gd.xk[:, 1]
points = np.array([xX1, xX2]).T.reshape(-1, 1, 2)
segments = np.concatenate(
    [points[:-1], points[1:]], axis=1
)
lc = LineCollection(
    segments,
    cmap=plt.get_cmap("Reds"),
)
lc.set_array(np.linspace(0, 1, len(xX1)))  # color segs
lc.set_linewidth(3)
ax.autoscale(False)  # avoid the scatter changing lims
ax.add_collection(lc)
ax.scatter(
    xX1, xX2, zorder=1, marker="o",
    color=plt.cm.Reds(np.linspace(0, 1, len(xX1))),
    edgecolor="none",
)
plt.show()

[Figure: contour plot of f1 over x1 ∈ [−100, 100] and x2 ∈ [−50, 50] with the gradient descent steps overlaid, converging to the minimum at (25, −10).]

Now consider f2

A = Matrix([[10, 0], [0, 20]])
b = Matrix([[1, 1]])
def f2(x):
    X = Array([x]).tomatrix().T
    return 1/2*X.dot(A*X) - b.dot(X)

gd = gradient_descent_min(f=f2, x=x, x0=[50, -40], T=1e-8)

Perform the gradient descent

gd.descend()

Print the interesting variables

print(gd.table["python"])


N   x_k            g_k                     α
0   [ 50 -40]      [ 499 -801]             0
1   [17.58 12.04]  [ 499 -801]             0.0649741
2   [ 8.07 -1.01]  [174.78 239.88]         0.0544221
3   [3.62 0.17]    [ 79.66 -21.22]         0.0558582
4   [ 0.49 -0.05]  [35.16  2.49]           0.0889491
5   [0.1  0.14]    [ 3.89 -1.94]           0.0990201
6   [0.1  0.  ]    [0.04 1.9 ]             0.0750849
7   [0.1  0.05]    [ 0.01 -0.95]           0.050005
8   [0.1  0.05]    [4.74e-03 9.58e-05]     0.0500012
9   [0.1  0.05]    [ 2.37e-03 -2.38e-09]   0.0999186
10  [0.1  0.05]    [1.93e-06 2.37e-09]     0.1
11  [0.1  0.05]    [ 0.00e+00 -2.37e-09]   0.0999997

Now let's lambdify the function f2 so we can plot:

f2_lambda = lambdify((x1, x2), f2(x), "numpy")

Now let's plot a contour plot with the gradient descent overlaid:

fig, ax = plt.subplots()
# contour plot
X1 = np.linspace(-100, 100, 100)
X2 = np.linspace(-50, 50, 100)
X1, X2 = np.meshgrid(X1, X2)
F2 = f2_lambda(X1, X2)
plt.contourf(X1, X2, F2)
# gradient descent plot
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.collections import LineCollection
xX1 = gd.xk[:, 0]
xX2 = gd.xk[:, 1]
points = np.array([xX1, xX2]).T.reshape(-1, 1, 2)
segments = np.concatenate(
    [points[:-1], points[1:]], axis=1
)
lc = LineCollection(
    segments,
    cmap=plt.get_cmap("Reds"),
)
lc.set_array(np.linspace(0, 1, len(xX1)))  # color segs
lc.set_linewidth(3)
ax.autoscale(False)  # avoid the scatter changing lims
ax.add_collection(lc)
ax.scatter(
    xX1, xX2, zorder=1, marker="o",
    color=plt.cm.Reds(np.linspace(0, 1, len(xX1))),
    edgecolor="none",
)
plt.show()

[Figure: contour plot of f2 over x1 ∈ [−100, 100] and x2 ∈ [−50, 50] with the gradient descent steps overlaid.]

Python code in this section was generated from a Jupyter notebook named gradient_descent.ipynb with a python3 kernel.



optlin Constrained linear optimization

Consider a linear objective function f : ℝⁿ → ℝ with variables xi in vector x and coefficients ci in vector c:

f(x) = c · x (lin1)

subject to the linear constraints (restrictions on the xi)

Ax ≤ a, (lin2a)
Bx = b, and (lin2b)
l ≤ x ≤ u, (lin2c)

where A and B are constant matrices and a, b, l, u are n-vectors. This is one formulation

of what is called a linear programming

problem Usually we want to maximize f over

the constraints Such problems frequently

arise throughout engineering for instance

in manufacturing transportation operations

etc They are called constrained because

there are constraints on x they are called

linear because the objective function and the

constraints are linear

We call a pair (x f(x)) for which x satisfies

Equation lin2 a feasible solution Of course

not every feasible solution is optimal a

feasible solution is optimal iff there exists no

other feasible solution for which f is greater

(assuming we're maximizing). We call the vector subspace of feasible solutions S ⊂ ℝⁿ.

Feasible solutions form a polytope

Consider the effect of the constraints Each

of the equalities and inequalities defines a

linear hyperplane in Rn (ie a linear subspace



of dimension n − 1), either as a boundary of

S (inequality) or as a restriction of S to the

hyperplane When joined these hyperplanes

are the boundary of S (equalities restrict S

to lower dimension) So we see that each of

the boundaries of S is flat which makes S a

polytope (in R2 a polygon) What makes this

especially interesting is that polytopes have

vertices where the hyperplanes intersect

Solutions at the vertices are called basic

feasible solutions

Only the vertices matter

Our objective function f is linear so for

some constant h, f(x) = h defines a level set that is itself a hyperplane H in ℝⁿ. If this hyperplane intersects S at a point x, (x, f(x) = h) is the corresponding solution. There are three possibilities when H intersects S:

1. H ∩ S is a vertex of S,
2. H ∩ S is a boundary hyperplane of S, or
3. H ∩ S slices through the interior of S.

However this third option implies that there

exists a level set G corresponding to f(x) =

g such that G intersects S and g gt h so

solutions on H cap S are not optimal (We have

not proven this but it may be clear from

our progression) We conclude that either the

first or second case must be true for optimal

solutions And notice that in both cases a

(potentially optimal) solution occurs at at

least one vertex The key insight then is that

an optimal solution occurs at a vertex of

S


This means we don't need to search all of S, or even its boundary: we need only search the vertices. Helpful as this is, it still leaves a number of potentially optimal solutions that grows combinatorially with the number of constraints, usually still too many to search in a naïve way. In Lecture optsimplex, this is mitigated by introducing a powerful searching method.
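To make the vertex insight concrete, a small LP can be brute-forced by intersecting constraint pairs and keeping the feasible intersections; the optimum is then simply the best vertex. This sketch is ours, with an assumed toy problem (not one from the text):

```python
import numpy as np
from itertools import combinations

# Brute-force vertex search for max c.x subject to A x <= a in R^2.
# Every vertex of the feasible polygon is the intersection of two
# constraint lines; keep the feasible ones and take the best.
c = np.array([5.0, 2.0])
A = np.array([[4.0, 1.0],     # 4 x1 +   x2 <= 40
              [1.0, 3.0],     #   x1 + 3 x2 <= 35
              [-1.0, 0.0],    #  x1 >= 0
              [1.0, 0.0],     #  x1 <= 10
              [0.0, -1.0],    #  x2 >= -5
              [0.0, 1.0]])    #  x2 <= 15
a = np.array([40.0, 35.0, 0.0, 10.0, 5.0, 15.0])

best_val, best_x = -np.inf, None
for i, j in combinations(range(len(a)), 2):
    M = A[[i, j]]
    if abs(np.linalg.det(M)) < 1e-12:
        continue                       # parallel lines: no vertex
    v = np.linalg.solve(M, a[[i, j]])
    if np.all(A @ v <= a + 1e-9):      # feasible vertex
        if c @ v > best_val:
            best_val, best_x = c @ v, v
```

This enumeration scales combinatorially with the number of constraints, which is exactly why the searching method of the next lecture is needed.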



4. Another Python package, pulp (PuLP), is probably more popular for linear programming; however, we choose scipy.optimize because it has applications beyond linear programming.

optsimplex The simplex algorithm

The simplex algorithm (or ldquomethodrdquo) is an

iterative technique for finding an optimal

solution of the linear programming problem

of Equations lin1 and lin2 The details of

the algorithm are somewhat involved but

the basic idea is to start at a vertex of the

feasible solution space S and traverse an edge

of the polytope that leads to another vertex

with a greater value of f Then repeat this

process until there is no neighboring vertex

with a greater value of f at which point the

solution is guaranteed to be optimal

Rather than present the details of the algorithm, we choose to show an example using Python. There have been some improvements on the original algorithm that have been implemented into many standard software packages, including Python's scipy package (Pauli Virtanen et al. "SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python". In: arXiv e-prints, arXiv:1907.10121 [July 2019]), module scipy.optimize.^4

Example optsimplex-1 re simplex method using scipy.optimize

Problem Statement

Maximize the objective function

f(x) = c · x (simplex1a)

for x ∈ ℝ² and

c = [ 5 2 ]ᵀ (simplex1b)


subject to constraints

0 ≤ x1 ≤ 10 (simplex2a)
−5 ≤ x2 ≤ 15 (simplex2b)
4x1 + x2 ≤ 40 (simplex2c)
x1 + 3x2 ≤ 35 (simplex2d)
−8x1 − x2 ≥ −75. (simplex2e)

Problem Solution

First load some Python packages

from scipy.optimize import linprog
import numpy as np
import matplotlib as mpl
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Encoding the problem

Before we can use linprog, we must first encode Equations simplex1 and simplex2 into a form linprog will recognize. We begin with f, which we can write as c · x with the coefficients of c as follows.

c = [-5, -2]  # negative to find max

We've negated each constant because linprog minimizes f and we want to maximize f. Now let's encode the inequality constraints. We will write the left-hand-side coefficients in the matrix A and the right-hand-side values in vector a such that

Ax ≤ a. (simplex3)

Notice that one of our constraint inequalities is ≥ instead of ≤. We can flip this by multiplying the inequality by −1. We use simple lists to encode A and a:

A = [
    [4, 1],
    [1, 3],
    [8, 1],
]
a = [40, 35, 75]

Now we need to define the lower l and upper u bounds of x. The function linprog expects these to be in a single list of lower- and upper-bounds of each xi:

lu = [
    (0, 10),
    (-5, 15),
]

We want to keep track of each step linprog takes. We can access these by defining a function callback to be passed to linprog:

x = []  # for storing the steps
def callback(res):  # called at each step
    global x
    print(f"nit = {res.nit}, x_k = {res.x}")
    x.append(res.x.copy())  # store

Now we need to call linprog. We don't have any equality constraints, so we need only use the keyword arguments A_ub=A and b_ub=a. For demonstration purposes, we tell it to use the simplex method, which is not as good as its other methods, which use better algorithms based on the simplex.

res = linprog(
    c, A_ub=A, b_ub=a, bounds=lu,
    method="simplex", callback=callback,
)
x = np.array(x)

nit = 0, x_k = [ 0. -5.]
nit = 1, x_k = [10. -5.]
nit = 2, x_k = [8.75 5.  ]
nit = 3, x_k = [7.72727273 9.09090909]
nit = 4, x_k = [7.72727273 9.09090909]
nit = 5, x_k = [7.72727273 9.09090909]
nit = 5, x_k = [7.72727273 9.09090909]

So the optimal solution (x, f(x)) is as follows:

print(f"optimum x: {res.x}")
print(f"optimum f(x): {-res.fun}")

optimum x: [7.72727273 9.09090909]
optimum f(x): 56.81818181818182

The last point was repeated

1. once because there was no adjacent vertex with greater f(x), and
2. twice because the algorithm calls callback twice on the last step.

Plotting

When the solution space is in R2 it is

helpful to graphically represent the solution

space constraints and the progression of

the algorithm We begin by defining the

inequality lines from A and a over the bounds

of x1

n = len(c)  # number of variables x
m = np.shape(A)[0]  # number of inequality constraints
x2 = np.empty([m, 2])
for i in range(0, m):
    x2[i] = -A[i][0]/A[i][1]*np.array(lu[0]) + a[i]/A[i][1]

Now we plot a contour plot of f over the bounds of x1 and x2 and overlay the inequality constraints and the steps of the algorithm stored in x:

lu_array = np.array(lu)
fig, ax = plt.subplots()
mpl.rcParams["lines.linewidth"] = 3
# contour plot
X1 = np.linspace(*lu_array[0], 100)
X2 = np.linspace(*lu_array[1], 100)
X1, X2 = np.meshgrid(X1, X2)
F2 = -c[0]*X1 + -c[1]*X2  # negative because max hack
con = ax.contourf(X1, X2, F2)
cbar = fig.colorbar(con, ax=ax)
cbar.ax.set_ylabel("objective function")
# bounds on x
un = np.array([1, 1])
opts = {"c": "w", "label": None, "linewidth": 6}
plt.plot(lu_array[0], lu_array[1][0]*un, **opts)
plt.plot(lu_array[0], lu_array[1][1]*un, **opts)
plt.plot(lu_array[0][0]*un, lu_array[1], **opts)
plt.plot(lu_array[0][1]*un, lu_array[1], **opts)
# inequality constraints
for i in range(0, m):
    p = plt.plot(lu[0], x2[i], c="w")
p[0].set_label("constraint")
# steps
plt.plot(
    x[:, 0], x[:, 1], "-o", c="r", clip_on=False,
    zorder=20, label="simplex",
)
plt.ylim(lu_array[1])
plt.xlabel("$x_1$")
plt.ylabel("$x_2$")
plt.legend()
plt.show()


[Figure: contour plot of the objective function over the bounds of x1 and x2, with the constraint lines and the simplex steps overlaid.]

Python code in this section was generated from a Jupyter notebook named simplex_linear_programming.ipynb with a python3 kernel.


5. Barzilai and Borwein, "Two-Point Step Size Gradient Methods".

optex Exercises for Chapter 08

Exercise 081 (cummerbund)

Consider the functions (a) f1 : ℝ² → ℝ and (b) f2 : ℝ² → ℝ defined as

f1(x) = 4(x1 − 16)² + (x2 + 64)² + x1 sin² x1 (ex1)
f2(x) = (1/2) x · Ax − b · x, (ex2)

where

A = [ 5  0
      0 15 ] and (ex3a)
b = [ −2 1 ]ᵀ. (ex3b)

Use the method of Barzilai and Borwein,^5 starting at some x0, to find a minimum of each function.

Exercise 082 (melty)

Maximize the objective function

f(x) = c · x (ex4a)

for x ∈ ℝ³ and

c = [ 3 −8 1 ]ᵀ (ex4b)

subject to constraints

0 ≤ x1 ≤ 20 (ex5a)
−5 ≤ x2 ≤ 0 (ex5b)
5 ≤ x3 ≤ 17 (ex5c)
x1 + 4x2 ≤ 50 (ex5d)
2x1 + x3 ≤ 43 (ex5e)
−4x1 + x2 − 5x3 ≥ −99. (ex5f)

1. As is customary, we frequently say "system" when we mean "mathematical system model". Recall that multiple models may be used for any given physical system, depending on what one wants to know.

nlin

Nonlinear analysis

1 The ubiquity of near-linear systems and

the tools we have for analyses thereof can

sometimes give the impression that nonlinear

systems are exotic or even downright

flamboyant. However, a great many systems^1 important for a mechanical engineer are frequently hopelessly nonlinear. Here are some examples of such systems:

• A robot arm
• Viscous fluid flow (usually modelled by the navier-stokes equations)
•
• Anything that "fills up" or "saturates"
• Nonlinear optics
• Einstein's field equations (gravitation in general relativity)
• Heat radiation and nonlinear heat conduction
• Fracture mechanics
•


2. S.H. Strogatz and M. Dichter. Nonlinear Dynamics and Chaos. Second. Studies in Nonlinearity. Avalon Publishing, 2016. isbn: 9780813350844.
3. W. Richard Kolk and Robert A. Lerman. Nonlinear System Dynamics. 1st ed. Springer US, 1993. isbn: 978-1-4684-6496-2.

2. Lest we think this is merely an inconvenience, we should keep in mind that it is actually the nonlinearity that makes many phenomena useful. For instance, the ________ depends on the nonlinearity of its optics. Similarly, transistors and the digital circuits made thereby (including the microprocessor) wouldn't function if their physics were linear.

3. In this chapter, we will see some ways to formulate, characterize, and simulate nonlinear systems. Purely ________ are few for nonlinear systems. Most are beyond the scope of this text, but we describe a few, mostly in Lecture nlinchar. Simulation via numerical integration of nonlinear dynamical equations is the most accessible technique, so it is introduced.

4. We skip a discussion of linearization; of course, if this is an option, it is preferable. Instead, we focus on the ________.

5. For a good introduction to nonlinear dynamics, see Strogatz and Dichter.^2 A more engineer-oriented introduction is Kolk and Lerman.^3



4. Strogatz and Dichter, Nonlinear Dynamics and Chaos.


nlinss Nonlinear state-space models

1. A state-space model has the general form

dx/dt = f(x, u, t) (ss1a)
y = g(x, u, t), (ss1b)

where f and g are vector-valued functions that depend on the system. Nonlinear state-space models are those for which f is a nonlinear function of either x or u.

For instance, a state variable x1 might appear as x1², or two state variables might combine as x1 x2, or an input u1 might enter the equations as log u1.

Autonomous and nonautonomous systems

2 An autonomous system is one for which

f(x) with neither time nor input appearing

explicitly A nonautonomous system is one

for which either t or u do appear explicitly

in f It turns out that we can always write

nonautonomous systems as autonomous by

substituting in u(t) and introducing an extra

for t4
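As a sketch of this augmentation (our notation; x_{n+1} is the added state standing in for t):

```latex
\dot{x} = f(x, u(t), t)
\quad\Longrightarrow\quad
\frac{d}{dt}
\begin{bmatrix} x \\ x_{n+1} \end{bmatrix}
=
\begin{bmatrix} f\big(x,\, u(x_{n+1}),\, x_{n+1}\big) \\ 1 \end{bmatrix},
\qquad x_{n+1}(0) = 0.
```

The right-hand side no longer contains t explicitly, so the augmented system is autonomous.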

3 Therefore without loss of generality we

will focus on ways of analyzing autonomous

systems

Equilibrium

4. An equilibrium state (also called a stationary point) x̄ is one for which dx/dt = 0. In most cases, this occurs only when the input u is a constant ū and, for time-varying systems, at a given time t. For autonomous systems, equilibrium occurs when the following holds:

f(x̄) = 0.  (ss.2)

This is a system of nonlinear algebraic equations, which can be challenging to solve for x̄. However, frequently several solutions, that is, equilibrium states, do exist.
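To make the equilibrium calculation concrete, here is a minimal sketch (not from the text) that solves f(x̄) = 0 symbolically with SymPy for a hypothetical autonomous system, a damped pendulum with x1 the angle and x2 the angular velocity; the system and names are assumptions for illustration.

```python
import sympy as sp

# A hypothetical autonomous system (damped pendulum):
#   dx1/dt = x2,  dx2/dt = -sin(x1) - x2
x1, x2 = sp.symbols("x1 x2")
f = [x2, -sp.sin(x1) - x2]

# Equilibria satisfy f(x) = 0
equilibria = sp.solve(f, [x1, x2], dict=True)
print(equilibria)
```

SymPy reports the hanging (x1 = 0) and inverted (x1 = π) pendulum equilibria, both with zero angular velocity.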

nlin.char Nonlinear system characteristics

1. Characterizing nonlinear systems can be challenging without the tools developed for linear system characterization. However, there are ways of characterizing nonlinear systems, and we'll here explore a few. (See Equation ss.1a.)

Those in common with linear systems

2. As with linear systems, the system order is either the number of state variables required to describe the system or, equivalently, the highest-order derivative in a single scalar differential equation describing the system.

3. Similarly, nonlinear systems can have state variables that depend on time alone or those that also depend on space (or some other independent variable). The former lead to ordinary differential equations (ODEs) and the latter to partial differential equations (PDEs).

4. Equilibrium was already considered in Lecture nlin.ss.

Stability

5. In terms of system performance, perhaps no other criterion is as important as stability.

5. William L. Brogan, Modern Control Theory, Third, Prentice Hall, 1991, Ch. 10.
6. A. Choukchou-Braham et al., Analysis and Control of Underactuated Mechanical Systems, SpringerLink Bücher, Springer International Publishing, 2013, isbn 9783319026367, App. A.

Definition nlin.char.1: Stability

If x is perturbed from an equilibrium state x̄, the response x(t) can:

1. asymptotically return to x̄ (asymptotically stable),
2. diverge from x̄ (unstable), or
3. remain perturbed or oscillate about x̄ with a constant amplitude (marginally stable).

Notice that this definition is actually local: stability in the neighborhood of one equilibrium may not be the same as in the neighborhood of another.

6. Other than nonlinear systems' lack of linear systems' eigenvalues, poles, and roots of the characteristic equation from which to compute it, the primary difference between the stability of linear and nonlinear systems is that nonlinear system stability is often difficult to establish. Using a linear system's eigenvalues, it is straightforward to establish stable, unstable, and marginally stable subspaces of state-space (via transforming to an eigenvector basis). For nonlinear systems, no such method exists. However, we are not without tools to explore nonlinear system stability. One mathematical tool to consider is Lyapunov stability theory, which is beyond the scope of this course but has good treatments in5 and6.

Qualities of equilibria

7. Equilibria (i.e., stationary points) come in a variety of qualities. It is instructive to consider the first-order differential equation in state variable x with real constant r:

x′ = rx − x^3.  (char.1)

If we plot x′ versus x for different values of r, we obtain the plots of Figure 09.1.

Figure 09.1: plots of x′ versus x for Equation char.1: (a) r < 0, (b) r = 0, (c) r > 0.

8. By definition, equilibria occur when x′ = 0, so the x-axis crossings of Figure 09.1 are equilibria. The blue arrows on the x-axis show the direction of state change x′ quantified by the plots. For both (a) and (b), only one equilibrium exists: x = 0. Note that the blue arrows in both plots point toward the equilibrium. In such cases, that is, when a neighborhood exists around an equilibrium for which state changes point toward the equilibrium, the equilibrium is called an attractor or sink. Note that attractors are stable.

9. Now consider (c) of Figure 09.1. When r > 0, three equilibria emerge. This change of the number of equilibria with the changing of a parameter is called a bifurcation.


A plot of bifurcations versus the parameter is called a bifurcation diagram. The x = 0 equilibrium now has arrows that point away from it. Such an equilibrium is called a repeller or source and is unstable. The other two equilibria here are (stable) attractors. Consider a very small initial condition x(0) = ε. If ε > 0, the repeller pushes away x and the positive attractor pulls x to itself. Conversely, if ε < 0, the repeller again pushes away x and the negative attractor pulls x to itself.
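The behavior just described can be checked symbolically; a minimal sketch for Equation char.1, assuming the representative value r = 1, classifying each equilibrium by the sign of d(x′)/dx there:

```python
import sympy as sp

# Equation (char.1): x' = r x - x^3, with the assumed representative r = 1
x, r = sp.symbols("x r", real=True)
xprime = r*x - x**3

# Equilibria: solutions of x' = 0
eqs = sorted(sp.solve(xprime.subs(r, 1), x))
print(eqs)  # three equilibria for r > 0

# Slope of x' at each equilibrium:
# negative -> attractor (stable), positive -> repeller (unstable)
slopes = [sp.diff(xprime, x).subs([(r, 1), (x, e)]) for e in eqs]
print(slopes)
```

The negative slopes at x = ±1 mark the attractors; the positive slope at x = 0 marks the repeller.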

10. Another type of equilibrium is called the saddle, which acts as an attractor along some lines and as a repeller along others. We will see this type in the following example.

Example nlin.char-1: Saddle bifurcation

Problem Statement

Consider the dynamical equation

x′ = x^2 + r  (char.2)

with r a real constant. Sketch x′ vs. x for negative, zero, and positive r. Identify and classify each of the equilibria.

nlin.sim Nonlinear system simulation

nlin.pysim Simulating nonlinear systems in Python

Example nlin.pysim-1: a nonlinear unicycle

Problem Statement

Problem Solution

First, load some Python packages.

import numpy as np
import sympy as sp
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

The state equation can be encoded via the following function f.

def f(t, x, u, c):
    dxdt = [
        x[3]*np.cos(x[2]),
        x[3]*np.sin(x[2]),
        x[4],
        1/c[0] * u(t)[0],
        1/c[1] * u(t)[1]
    ]
    return dxdt

The input function u must also be defined.

def u(t):
    return [
        1.5*(1 + np.cos(t)),
        2.5*np.sin(3*t)
    ]

# Define time spans, initial values, and constants
tspan = np.linspace(0, 50, 300)
xinit = [0, 0, 0, 0, 0]
mass = 1.0
inertia = 1.0
c = [mass, inertia]

# Solve differential equation
sol = solve_ivp(
    lambda t, x: f(t, x, u, c),
    [tspan[0], tspan[-1]],
    xinit,
    t_eval=tspan
)

Let's first plot the trajectory and instantaneous velocity.

Figure 09.2: trajectory in the x-y plane with instantaneous-velocity arrows.

xp = sol.y[3]*np.cos(sol.y[2])
yp = sol.y[3]*np.sin(sol.y[2])
p = plt.figure()
plt.plot(sol.y[0], sol.y[1])
plt.quiver(sol.y[0], sol.y[1], xp, yp)
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.show()

Python code in this section was generated from a Jupyter notebook named horcrux.ipynb with a python3 kernel.

nlin.ex Exercises for Chapter 09

A Distribution tables

A Distribution tables A01 Gaussian distribution table p 1

A01 Gaussian distribution table

Below are plots of the Gaussian probability density function f and cumulative distribution function Φ. Below them is Table A.1 of CDF values.

Figure A.1: the Gaussian PDF and CDF for z-scores: (a) the Gaussian PDF f(z), with the area Φ(z_b) up to the interval bound z_b; (b) the Gaussian CDF Φ(z_b) versus the interval bound z_b.

Table A.1: z-score table, Φ(z_b) = P(z ∈ (−∞, z_b])

z_b  .00 .01 .02 .03 .04 .05 .06 .07 .08 .09
-3.4 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
-3.3 0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
-3.2 0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
-3.1 0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
-3.0 0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
-2.9 0.0019 0.0018 0.0018 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
-2.8 0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
-2.7 0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
-2.6 0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
-2.5 0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
-2.4 0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
-2.3 0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
-2.2 0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
-2.1 0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
-2.0 0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
-1.9 0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
-1.8 0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
-1.7 0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
-1.6 0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455

Table A.1 (continued): z-score table, Φ(z_b) = P(z ∈ (−∞, z_b])

z_b  .00 .01 .02 .03 .04 .05 .06 .07 .08 .09

-1.5 0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
-1.4 0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
-1.3 0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
-1.2 0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
-1.1 0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
-1.0 0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
-0.9 0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
-0.8 0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
-0.7 0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
-0.6 0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
-0.5 0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
-0.4 0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121
-0.3 0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.3483
-0.2 0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859
-0.1 0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
-0.0 0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641
0.0 0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
0.1 0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
0.2 0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
0.3 0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
0.4 0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
0.5 0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
0.6 0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
0.7 0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
0.8 0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
0.9 0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
1.0 0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
1.1 0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
1.2 0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
1.3 0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
1.4 0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
1.5 0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
1.6 0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
1.7 0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
1.8 0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
1.9 0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
2.0 0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
2.1 0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
2.2 0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
2.3 0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
2.4 0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
2.5 0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
2.6 0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
2.7 0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
2.8 0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
2.9 0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
3.0 0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
3.1 0.9990 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993
3.2 0.9993 0.9993 0.9994 0.9994 0.9994 0.9994 0.9994 0.9995 0.9995 0.9995
3.3 0.9995 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
3.4 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998
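Table values can also be reproduced programmatically; a quick check of two Table A.1 entries using SciPy's standard normal CDF (SciPy is already used elsewhere in this text):

```python
# Spot-check Table A.1 with the standard normal CDF
from scipy.stats import norm

print(round(norm.cdf(1.00), 4))   # row 1.0, column .00 -> 0.8413
print(round(norm.cdf(-1.96), 4))  # row -1.9, column .06 -> 0.025
```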

A02 Student's t-distribution table

Table A.2: inverse Student's t-distribution table; each entry is the value t for which P(t_ν ⩽ t) equals the given percent probability

percent probability
ν 60.0 66.7 75.0 80.0 87.5 90.0 95.0 97.5 99.0 99.5 99.9

1 0.325 0.577 1.000 1.376 2.414 3.078 6.314 12.706 31.821 63.657 318.31
2 0.289 0.500 0.816 1.061 1.604 1.886 2.920 4.303 6.965 9.925 22.327
3 0.277 0.476 0.765 0.978 1.423 1.638 2.353 3.182 4.541 5.841 10.215
4 0.271 0.464 0.741 0.941 1.344 1.533 2.132 2.776 3.747 4.604 7.173
5 0.267 0.457 0.727 0.920 1.301 1.476 2.015 2.571 3.365 4.032 5.893
6 0.265 0.453 0.718 0.906 1.273 1.440 1.943 2.447 3.143 3.707 5.208
7 0.263 0.449 0.711 0.896 1.254 1.415 1.895 2.365 2.998 3.499 4.785
8 0.262 0.447 0.706 0.889 1.240 1.397 1.860 2.306 2.896 3.355 4.501
9 0.261 0.445 0.703 0.883 1.230 1.383 1.833 2.262 2.821 3.250 4.297
10 0.260 0.444 0.700 0.879 1.221 1.372 1.812 2.228 2.764 3.169 4.144
11 0.260 0.443 0.697 0.876 1.214 1.363 1.796 2.201 2.718 3.106 4.025
12 0.259 0.442 0.695 0.873 1.209 1.356 1.782 2.179 2.681 3.055 3.930
13 0.259 0.441 0.694 0.870 1.204 1.350 1.771 2.160 2.650 3.012 3.852
14 0.258 0.440 0.692 0.868 1.200 1.345 1.761 2.145 2.624 2.977 3.787
15 0.258 0.439 0.691 0.866 1.197 1.341 1.753 2.131 2.602 2.947 3.733
16 0.258 0.439 0.690 0.865 1.194 1.337 1.746 2.120 2.583 2.921 3.686
17 0.257 0.438 0.689 0.863 1.191 1.333 1.740 2.110 2.567 2.898 3.646
18 0.257 0.438 0.688 0.862 1.189 1.330 1.734 2.101 2.552 2.878 3.610
19 0.257 0.438 0.688 0.861 1.187 1.328 1.729 2.093 2.539 2.861 3.579
20 0.257 0.437 0.687 0.860 1.185 1.325 1.725 2.086 2.528 2.845 3.552
21 0.257 0.437 0.686 0.859 1.183 1.323 1.721 2.080 2.518 2.831 3.527
22 0.256 0.437 0.686 0.858 1.182 1.321 1.717 2.074 2.508 2.819 3.505
23 0.256 0.436 0.685 0.858 1.180 1.319 1.714 2.069 2.500 2.807 3.485
24 0.256 0.436 0.685 0.857 1.179 1.318 1.711 2.064 2.492 2.797 3.467
25 0.256 0.436 0.684 0.856 1.178 1.316 1.708 2.060 2.485 2.787 3.450
26 0.256 0.436 0.684 0.856 1.177 1.315 1.706 2.056 2.479 2.779 3.435
27 0.256 0.435 0.684 0.855 1.176 1.314 1.703 2.052 2.473 2.771 3.421
28 0.256 0.435 0.683 0.855 1.175 1.313 1.701 2.048 2.467 2.763 3.408
29 0.256 0.435 0.683 0.854 1.174 1.311 1.699 2.045 2.462 2.756 3.396
30 0.256 0.435 0.683 0.854 1.173 1.310 1.697 2.042 2.457 2.750 3.385
35 0.255 0.434 0.682 0.852 1.170 1.306 1.690 2.030 2.438 2.724 3.340
40 0.255 0.434 0.681 0.851 1.167 1.303 1.684 2.021 2.423 2.704 3.307
45 0.255 0.434 0.680 0.850 1.165 1.301 1.679 2.014 2.412 2.690 3.281
50 0.255 0.433 0.679 0.849 1.164 1.299 1.676 2.009 2.403 2.678 3.261
55 0.255 0.433 0.679 0.848 1.163 1.297 1.673 2.004 2.396 2.668 3.245
60 0.254 0.433 0.679 0.848 1.162 1.296 1.671 2.000 2.390 2.660 3.232
∞ 0.253 0.431 0.674 0.842 1.150 1.282 1.645 1.960 2.326 2.576 3.090
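Entries can be reproduced with SciPy's inverse CDF (percent-point function); a spot check of two table entries:

```python
# Recover Table A.2 entries: t.ppf(p, df) gives the t with CDF equal to p
from scipy.stats import t

print(round(t.ppf(0.95, df=10), 3))  # nu = 10, 95.0 column -> 1.812
print(round(t.ppf(0.975, df=1), 3))  # nu = 1, 97.5 column -> 12.706
```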

B Fourier and Laplace tables

B Fourier and Laplace tables B01 Laplace transforms p 1

B01 Laplace transforms

Table B.1 is a table with functions of time f(t) on the left and corresponding Laplace transforms F(s) on the right. Where applicable, s = σ + jω is the Laplace transform variable, T is the time-domain period, ω0 = 2π/T is the corresponding angular frequency, j = √−1, a ∈ R+, and b, t0 ∈ R are constants.

Table B.1: Laplace transform identities

function of time t → function of Laplace s

a1 f1(t) + a2 f2(t) → a1 F1(s) + a2 F2(s)
f(t − t0) → F(s) e^{−t0 s}
f′(t) → s F(s) − f(0)
d^n f(t)/dt^n → s^n F(s) − s^{n−1} f(0) − s^{n−2} f′(0) − ··· − f^{(n−1)}(0)
∫_0^t f(τ) dτ → (1/s) F(s)
t f(t) → −F′(s)
f1(t) ∗ f2(t) = ∫_{−∞}^{∞} f1(τ) f2(t − τ) dτ → F1(s) F2(s)
δ(t) → 1
us(t) → 1/s
ur(t) → 1/s^2
t^{n−1}/(n − 1)! → 1/s^n
e^{−at} → 1/(s + a)
t e^{−at} → 1/(s + a)^2
(1/(n − 1)!) t^{n−1} e^{−at} → 1/(s + a)^n
(1/(a − b)) (e^{at} − e^{bt}) → 1/((s − a)(s − b))  (a ≠ b)
(1/(a − b)) (a e^{at} − b e^{bt}) → s/((s − a)(s − b))  (a ≠ b)
sin ωt → ω/(s^2 + ω^2)
cos ωt → s/(s^2 + ω^2)
e^{at} sin ωt → ω/((s − a)^2 + ω^2)
e^{at} cos ωt → (s − a)/((s − a)^2 + ω^2)
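A couple of Table B.1 entries can be spot-checked with SymPy's symbolic Laplace transform; a minimal sketch:

```python
# Verify two Table B.1 entries symbolically
import sympy as sp

t, s = sp.symbols("t s", positive=True)
a, w = sp.symbols("a omega", positive=True)

L1 = sp.laplace_transform(sp.exp(-a*t), t, s, noconds=True)  # expect 1/(s + a)
L2 = sp.laplace_transform(sp.sin(w*t), t, s, noconds=True)   # expect w/(s^2 + w^2)
print(L1, L2)
```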

B02 Fourier transforms

Table B.2 is a table with functions of time f(t) on the left and corresponding Fourier transforms F(ω) on the right. Where applicable, T is the time-domain period, ω0 = 2π/T is the corresponding angular frequency, j = √−1, a ∈ R+, and b, t0 ∈ R are constants. Furthermore, fe and fo are even and odd functions of time, respectively, and it can be shown that any function f can be written as the sum f(t) = fe(t) + fo(t) (Hsu1967).

Table B.2: Fourier transform identities

function of time t → function of frequency ω

a1 f1(t) + a2 f2(t) → a1 F1(ω) + a2 F2(ω)
f(at) → (1/|a|) F(ω/a)
f(−t) → F(−ω)
f(t − t0) → F(ω) e^{−jωt0}
f(t) cos ω0t → (1/2) F(ω − ω0) + (1/2) F(ω + ω0)
f(t) sin ω0t → (1/(j2)) F(ω − ω0) − (1/(j2)) F(ω + ω0)
fe(t) → Re F(ω)
fo(t) → j Im F(ω)
F(t) → 2π f(−ω)
f′(t) → jω F(ω)
d^n f(t)/dt^n → (jω)^n F(ω)
∫_{−∞}^t f(τ) dτ → (1/(jω)) F(ω) + π F(0) δ(ω)
−jt f(t) → F′(ω)
(−jt)^n f(t) → d^n F(ω)/dω^n
f1(t) ∗ f2(t) = ∫_{−∞}^{∞} f1(τ) f2(t − τ) dτ → F1(ω) F2(ω)
f1(t) f2(t) → (1/(2π)) F1(ω) ∗ F2(ω) = (1/(2π)) ∫_{−∞}^{∞} F1(α) F2(ω − α) dα
e^{−at} us(t) → 1/(jω + a)
e^{−a|t|} → 2a/(a^2 + ω^2)
e^{−at^2} → √(π/a) e^{−ω^2/(4a)}
1 for |t| < a/2, else 0 → a sin(aω/2)/(aω/2)
t e^{−at} us(t) → 1/(a + jω)^2
(t^{n−1}/(n − 1)!) e^{−at} us(t) → 1/(a + jω)^n
1/(a^2 + t^2) → (π/a) e^{−a|ω|}
δ(t) → 1
δ(t − t0) → e^{−jωt0}
us(t) → π δ(ω) + 1/(jω)
us(t − t0) → π δ(ω) + (1/(jω)) e^{−jωt0}
1 → 2π δ(ω)
t → 2πj δ′(ω)
t^n → 2π j^n d^n δ(ω)/dω^n
e^{jω0t} → 2π δ(ω − ω0)
cos ω0t → π δ(ω − ω0) + π δ(ω + ω0)
sin ω0t → −jπ δ(ω − ω0) + jπ δ(ω + ω0)
us(t) cos ω0t → jω/(ω0^2 − ω^2) + (π/2) δ(ω − ω0) + (π/2) δ(ω + ω0)
us(t) sin ω0t → ω0/(ω0^2 − ω^2) + (π/(2j)) δ(ω − ω0) − (π/(2j)) δ(ω + ω0)
t us(t) → jπ δ′(ω) − 1/ω^2
1/t → πj − 2πj us(ω)
1/t^n → ((−jω)^{n−1}/(n − 1)!) (πj − 2πj us(ω))
sgn t → 2/(jω)
Σ_{n=−∞}^{∞} δ(t − nT) → ω0 Σ_{n=−∞}^{∞} δ(ω − nω0)

C Mathematics reference

C Mathematics reference C01 Quadratic forms p 1

C01 Quadratic forms

The solution to equations of the form ax^2 + bx + c = 0 is

x = (−b ± √(b^2 − 4ac))/(2a).  (C01.1)

Completing the square

This is accomplished by re-writing the quadratic in the form of the left-hand side (LHS) of this equality, which describes factorization:

x^2 + 2xh + h^2 = (x + h)^2.  (C01.2)
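A quick symbolic check of Equation C01.1, with an assumed numeric example x^2 − 3x + 2 = 0:

```python
# Verify the quadratic formula with SymPy's symbolic solver
import sympy as sp

a, b, c, x = sp.symbols("a b c x")
roots = sp.solve(a*x**2 + b*x + c, x)  # the two branches of (C01.1)

# Numeric spot check: x^2 - 3x + 2 = 0 has roots 1 and 2
vals = {a: 1, b: -3, c: 2}
print(sorted(float(rt.subs(vals)) for rt in roots))  # [1.0, 2.0]
```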

C02 Trigonometry

Triangle identities

With reference to the below figure (a triangle with sides a, b, c opposite angles α, β, γ, respectively), the law of sines is

sin α / a = sin β / b = sin γ / c  (C02.1)

and the law of cosines is

c^2 = a^2 + b^2 − 2ab cos γ  (C02.2a)
b^2 = a^2 + c^2 − 2ac cos β  (C02.2b)
a^2 = c^2 + b^2 − 2cb cos α  (C02.2c)

Reciprocal identities

csc u = 1/sin u  (C02.3a)
sec u = 1/cos u  (C02.3b)
cot u = 1/tan u  (C02.3c)

Pythagorean identities

1 = sin^2 u + cos^2 u  (C02.4a)
sec^2 u = 1 + tan^2 u  (C02.4b)
csc^2 u = 1 + cot^2 u  (C02.4c)

Co-function identities

sin(π/2 − u) = cos u  (C02.5a)
cos(π/2 − u) = sin u  (C02.5b)
tan(π/2 − u) = cot u  (C02.5c)
csc(π/2 − u) = sec u  (C02.5d)
sec(π/2 − u) = csc u  (C02.5e)
cot(π/2 − u) = tan u  (C02.5f)

Even-odd identities

sin(−u) = −sin u  (C02.6a)
cos(−u) = cos u  (C02.6b)
tan(−u) = −tan u  (C02.6c)

Sum-difference formulas (AM or lock-in)

sin(u ± v) = sin u cos v ± cos u sin v  (C02.7a)
cos(u ± v) = cos u cos v ∓ sin u sin v  (C02.7b)
tan(u ± v) = (tan u ± tan v)/(1 ∓ tan u tan v)  (C02.7c)

Double angle formulas

sin(2u) = 2 sin u cos u  (C02.8a)
cos(2u) = cos^2 u − sin^2 u  (C02.8b)
        = 2 cos^2 u − 1  (C02.8c)
        = 1 − 2 sin^2 u  (C02.8d)
tan(2u) = 2 tan u/(1 − tan^2 u)  (C02.8e)

Power-reducing or half-angle formulas

sin^2 u = (1 − cos(2u))/2  (C02.9a)
cos^2 u = (1 + cos(2u))/2  (C02.9b)
tan^2 u = (1 − cos(2u))/(1 + cos(2u))  (C02.9c)

Sum-to-product formulas

sin u + sin v = 2 sin((u + v)/2) cos((u − v)/2)  (C02.10a)
sin u − sin v = 2 cos((u + v)/2) sin((u − v)/2)  (C02.10b)
cos u + cos v = 2 cos((u + v)/2) cos((u − v)/2)  (C02.10c)
cos u − cos v = −2 sin((u + v)/2) sin((u − v)/2)  (C02.10d)

Product-to-sum formulas

sin u sin v = (1/2)[cos(u − v) − cos(u + v)]  (C02.11a)
cos u cos v = (1/2)[cos(u − v) + cos(u + v)]  (C02.11b)
sin u cos v = (1/2)[sin(u + v) + sin(u − v)]  (C02.11c)
cos u sin v = (1/2)[sin(u + v) − sin(u − v)]  (C02.11d)

Two-to-one formulas

A sin u + B cos u = C sin(u + φ)  (C02.12a)
                  = C cos(u + ψ), where  (C02.12b)
C = √(A^2 + B^2)  (C02.12c)
φ = arctan(B/A)  (C02.12d)
ψ = −arctan(A/B)  (C02.12e)
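A numeric spot check of the two-to-one formula C02.12a, with assumed sample values A = 3, B = 4:

```python
# Check A sin(u) + B cos(u) = C sin(u + phi) numerically
import math

A, B, u = 3.0, 4.0, 0.7     # assumed sample values
C = math.hypot(A, B)        # sqrt(A^2 + B^2), per (C02.12c)
phi = math.atan2(B, A)      # arctan(B/A) for A > 0, per (C02.12d)

lhs = A*math.sin(u) + B*math.cos(u)
rhs = C*math.sin(u + phi)
print(abs(lhs - rhs))  # agreement to floating-point precision
```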

C03 Matrix inverses

This is a guide to inverting 1×1, 2×2, and n×n matrices.

Let A be the 1×1 matrix

A = [a].

The inverse is simply the reciprocal:

A^{−1} = [1/a].

Let B be the 2×2 matrix

B = [b11 b12; b21 b22].

It can be shown that the inverse follows a simple pattern:

B^{−1} = (1/det B) [b22 −b12; −b21 b11]
       = (1/(b11 b22 − b12 b21)) [b22 −b12; −b21 b11].

Let C be an n×n matrix. It can be shown that its inverse is

C^{−1} = (1/det C) adj C,

where adj is the adjoint of C.
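The 2×2 pattern can be spot-checked numerically; a sketch with an assumed sample matrix:

```python
# Check the 2x2 inverse pattern against numpy.linalg.inv
import numpy as np

B = np.array([[1.0, 2.0],
              [3.0, 4.0]])  # an assumed sample matrix

detB = B[0, 0]*B[1, 1] - B[0, 1]*B[1, 0]
Binv = np.array([[B[1, 1], -B[0, 1]],
                 [-B[1, 0], B[0, 0]]]) / detB

print(np.allclose(Binv, np.linalg.inv(B)))  # True
```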

C04 Laplace transforms

The definitions of the one-sided Laplace and inverse Laplace transforms follow.

Definition C04.1: Laplace transforms (one-sided)

Laplace transform L:

L(y(t)) = Y(s) = ∫_0^∞ y(t) e^{−st} dt  (C04.1)

Inverse Laplace transform L^{−1}:

L^{−1}(Y(s)) = y(t) = (1/(2πj)) ∫_{σ−j∞}^{σ+j∞} Y(s) e^{st} ds  (C04.2)

See Table B.1 for a list of properties and common transforms.

D Complex analysis

D01 Euler's formulas

Euler's formula is our bridge back and forth between trigonometric forms (cos θ and sin θ) and complex exponential form (e^{jθ}):

e^{jθ} = cos θ + j sin θ.  (D01.1)

Here are a few useful identities implied by Euler's formula:

e^{−jθ} = cos θ − j sin θ  (D01.2a)
cos θ = Re(e^{jθ})  (D01.2b)
      = (1/2)(e^{jθ} + e^{−jθ})  (D01.2c)
sin θ = Im(e^{jθ})  (D01.2d)
      = (1/(j2))(e^{jθ} − e^{−jθ})  (D01.2e)
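A numeric sanity check of Euler's formula using Python's cmath, with an arbitrary angle:

```python
# Check e^{j theta} = cos(theta) + j sin(theta)
import cmath
import math

theta = 0.8  # an arbitrary angle
lhs = cmath.exp(1j*theta)
rhs = complex(math.cos(theta), math.sin(theta))
print(abs(lhs - rhs))  # agreement to floating-point precision
```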

Bibliography

[1] Robert B. Ash. Basic Probability Theory. Dover Publications, Inc., 2008.
[2] Joan Bagaria. "Set Theory". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2019. Metaphysics Research Lab, Stanford University, 2019.
[3] Maria Baghramian and J. Adam Carter. "Relativism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2019. Metaphysics Research Lab, Stanford University, 2019.
[4] Stephen Barker and Mark Jago. "Being Positive About Negative Facts". In: Philosophy and Phenomenological Research 85.1 (2012), pp. 117-138. doi: 10.1111/j.1933-1592.2010.00479.x.
[5] Jonathan Barzilai and Jonathan M. Borwein. "Two-Point Step Size Gradient Methods". In: IMA Journal of Numerical Analysis 8.1 (Jan. 1988), pp. 141-148. issn: 0272-4979. doi: 10.1093/imanum/8.1.141. This includes an innovative line search method.
[6] Anat Biletzki and Anat Matar. "Ludwig Wittgenstein". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2018. Metaphysics Research Lab, Stanford University, 2018. An introduction to Wittgenstein and his thought.
[7] Antonio Bove, F. (Ferruccio) Colombini, and Daniele Del Santo. Phase space analysis of partial differential equations. Progress in nonlinear differential equations and their applications v. 69. Boston/Berlin: Birkhäuser, 2006. isbn: 9780817645212.
[8] William L. Brogan. Modern Control Theory. Third. Prentice Hall, 1991.
[9] Francesco Bullo and Andrew D. Lewis. Geometric control of mechanical systems: modeling, analysis, and design for simple mechanical control systems. Ed. by J.E. Marsden, L. Sirovich, and M. Golubitsky. Springer, 2005.
[10] Francesco Bullo and Andrew D. Lewis. Supplementary Chapters for Geometric Control of Mechanical Systems. Jan. 2005.
[11] A. Choukchou-Braham et al. Analysis and Control of Underactuated Mechanical Systems. SpringerLink Bücher. Springer International Publishing, 2013. isbn: 9783319026367.
[12] K. Ciesielski. Set Theory for the Working Mathematician. London Mathematical Society Student Texts. Cambridge University Press, 1997. isbn: 9780521594653. A readable introduction to set theory.
[13] Marian David. "The Correspondence Theory of Truth". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2016. Metaphysics Research Lab, Stanford University, 2016. A detailed overview of the correspondence theory of truth.
[14] David Dolby. "Wittgenstein on Truth". In: A Companion to Wittgenstein. John Wiley & Sons, Ltd, 2016. Chap. 27, pp. 433-442. isbn: 9781118884607. doi: 10.1002/9781118884607.ch27.
[15] Peter J. Eccles. An Introduction to Mathematical Reasoning: Numbers, Sets and Functions. Cambridge University Press, 1997. doi: 10.1017/CBO9780511801136. A gentle introduction to mathematical reasoning. It includes introductory treatments of set theory and number systems.
[16] H.B. Enderton. Elements of Set Theory. Elsevier Science, 1977. isbn: 9780080570426. A gentle introduction to set theory and mathematical reasoning, a great place to start.
[17] Richard P. Feynman, Robert B. Leighton, and Matthew Sands. The Feynman Lectures on Physics. New Millennium. Perseus Basic Books, 2010.
[18] Michael Glanzberg. "Truth". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018.
[19] Hans Johann Glock. "Truth in the Tractatus". In: Synthese 148.2 (Jan. 2006), pp. 345-368. issn: 1573-0964. doi: 10.1007/s11229-004-6226-2.
[20] Mario Gómez-Torrente. "Alfred Tarski". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019.
[21] Paul Guyer and Rolf-Peter Horstmann. "Idealism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2018. Metaphysics Research Lab, Stanford University, 2018.
[22] R. Haberman. Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version). Pearson Modern Classics for Advanced Mathematics. Pearson Education Canada, 2018. isbn: 9780134995434.
[23] G.W.F. Hegel and A.V. Miller. Phenomenology of Spirit. Motilal Banarsidass, 1998. isbn: 9788120814738.
[24] Wilfrid Hodges. "Model Theory". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018.
[25] Wilfrid Hodges. "Tarski's Truth Definitions". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018.
[26] Peter Hylton and Gary Kemp. "Willard Van Orman Quine". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019.
[27] E.T. Jaynes et al. Probability Theory: The Logic of Science. Cambridge University Press, 2003. isbn: 9780521592710. An excellent and comprehensive introduction to probability theory.
[28] I. Kant, P. Guyer, and A.W. Wood. Critique of Pure Reason. The Cambridge Edition of the Works of Immanuel Kant. Cambridge University Press, 1999. isbn: 9781107268333.
[29] Juliette Kennedy. "Kurt Gödel". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2018. Metaphysics Research Lab, Stanford University, 2018.
[30] Drew Khlentzos. "Challenges to Metaphysical Realism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016.
[31] Peter Klein. "Skepticism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2015. Metaphysics Research Lab, Stanford University, 2015.
[32] M. Kline. Mathematics: The Loss of Certainty. A Galaxy book. Oxford University Press, 1982. isbn: 9780195030853. A detailed account of the "illogical" development of mathematics and an exposition of its therefore remarkable utility in describing the world.
[33] W. Richard Kolk and Robert A. Lerman. Nonlinear System Dynamics. 1st ed. Springer US, 1993. isbn: 978-1-4684-6496-2.
[34] Erwin Kreyszig. Advanced Engineering Mathematics. 10th. John Wiley & Sons Limited, 2011. isbn: 9781119571094. The authoritative resource for engineering mathematics. It includes detailed accounts of probability, statistics, vector calculus, linear algebra, Fourier analysis, ordinary and partial differential equations, and complex analysis. It also includes several other topics with varying degrees of depth. Overall, it is the best place to start when seeking mathematical guidance.
[35] John M. Lee. Introduction to Smooth Manifolds. Second. Vol. 218. Graduate Texts in Mathematics. Springer, 2012.
[36] Catherine Legg and Christopher Hookway. "Pragmatism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019. An introductory article on the philosophical movement "pragmatism". It includes an important clarification of the pragmatic slogan "truth is the end of inquiry".
[37] Daniel Liberzon. Calculus of Variations and Optimal Control Theory: A Concise Introduction. Princeton University Press, 2012. isbn: 9780691151878.
[38] Panu Raatikainen. "Gödel's Incompleteness Theorems". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018. A thorough and contemporary description of Gödel's incompleteness theorems, which have significant implications for the foundations and function of mathematics and mathematical truth.
[39] Paul Redding. "Georg Wilhelm Friedrich Hegel". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2018. Metaphysics Research Lab, Stanford University, 2018.
[40] H.M. Schey. Div, Grad, Curl, and All That: An Informal Text on Vector Calculus. W.W. Norton, 2005. isbn: 9780393925166.
[41] Christopher Shields. "Aristotle". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016.
[42] Steven S. Skiena. Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win. Outlooks. Cambridge University Press, 2001. doi: 10.1017/CBO9780511547089. This includes a lucid section on probability versus statistics.
[43] George Smith. "Isaac Newton". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2008. Metaphysics Research Lab, Stanford University, 2008.
[44] Daniel Stoljar and Nic Damnjanovic. "The Deflationary Theory of Truth". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2014. Metaphysics Research Lab, Stanford University, 2014.
[45] W.A. Strauss. Partial Differential Equations: An Introduction. Wiley, 2007. isbn: 9780470054567. A thorough and yet relatively compact introduction.
[46] S.H. Strogatz and M. Dichter. Nonlinear Dynamics and Chaos. Second. Studies in Nonlinearity. Avalon Publishing, 2016. isbn: 9780813350844.
[47] Mark Textor. "States of Affairs". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016.
[48] Pauli Virtanen et al. "SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python". In: arXiv e-prints, arXiv:1907.10121 (July 2019). arXiv: 1907.10121.
[49] Wikipedia. Algebra. The Free Encyclopedia. [Online; accessed 26-October-2019]. 2019.
[50] Wikipedia. Carl Friedrich Gauss. The Free Encyclopedia. [Online; accessed 26-October-2019]. 2019.
[51] Wikipedia. Euclid. The Free Encyclopedia. [Online; accessed 26-October-2019]. 2019.
[52] Wikipedia. First-order logic. The Free Encyclopedia. [Online; accessed 29-October-2019]. 2019.
[53] Wikipedia. Fundamental interaction. The Free Encyclopedia. [Online; accessed 16-November-2019]. 2019.
[54] Wikipedia. Leonhard Euler. The Free Encyclopedia. [Online; accessed 26-October-2019]. 2019.
[55] Wikipedia. Linguistic turn. The Free Encyclopedia. [Online; accessed 23-October-2019]. 2019. Hey, we all do it.
[56] Wikipedia. Probability space. The Free Encyclopedia. [Online; accessed 31-October-2019]. 2019.
[57] Wikipedia. Propositional calculus. The Free Encyclopedia. [Online; accessed 29-October-2019]. 2019.
[58] Wikipedia. Quaternion. The Free Encyclopedia. [Online; accessed 26-October-2019]. 2019.
[59] Wikipedia. Set-builder notation. The Free Encyclopedia. [Online; accessed 29-October-2019]. 2019.
[60] Wikipedia. William Rowan Hamilton. The Free Encyclopedia. [Online; accessed 26-October-2019]. 2019.
[61] L. Wittgenstein, P.M.S. Hacker, and J. Schulte. Philosophical Investigations. Wiley, 2010. isbn: 9781444307979.
[62] Ludwig Wittgenstein. Tractatus Logico-Philosophicus. Ed. by C. K. Ogden. Project Gutenberg International Library of Psychology, Philosophy and Scientific Method. Kegan Paul, Trench, Trubner & Co., Ltd., 1922. A brilliant work on what is possible to express in language, and what is not. As Wittgenstein puts it, "What can be said at all can be said clearly; and whereof one cannot speak thereof one must be silent."
[63] Slavoj Žižek. Less Than Nothing: Hegel and the Shadow of Dialectical

Materialism Verso 2012 isbn

9781844678976 This is one of the most

interesting presentations of Hegel and

Lacan by one of the most exciting

contemporary philosophers

Generate some data to test the central limit theorem 99
Sample statistics 101
The truth about sample means 101
Gaussian and probability 102
Student confidence 110
Multivariate probability and correlation 115
Marginal probability 117
Covariance 118
Conditional probability and dependence 122
Regression 125
Exercises for Chapter 04 131

05 Vector calculus 132
Divergence, surface integrals, and flux 135
Flux and surface integrals 135
Continuity 135
Divergence 136
Exploring divergence 137
Curl, line integrals, and circulation 144
Line integrals 144
Circulation 144
Curl 145
Zero curl, circulation, and path independence 146
Exploring curl 147
Gradient 153
Gradient 153
Vector fields from gradients are special 154
Exploring gradient 154
Stokes' and divergence theorems 164
The divergence theorem 164
The Kelvin-Stokes theorem 165
Related theorems 165
Exercises for Chapter 05 166
Exe. 05.1 6 166

06 Fourier and orthogonality 168
Fourier series 169
Fourier transform 174
Generalized Fourier series and orthogonality 184
Exercises for Chapter 06 186
Exe. 06.1 secrets 186
Exe. 06.2 seesaw 189
Exe. 06.3 9 189
Exe. 06.4 10 190
Exe. 06.5 11 190
Exe. 06.6 12 190
Exe. 06.7 13 190
Exe. 06.8 14 190
Exe. 06.9 savage 191
Exe. 06.10 strawman 192

07 Partial differential equations 193
Classifying PDEs 196
Sturm-Liouville problems 199
Types of boundary conditions 200
PDE solution by separation of variables 205
The 1D wave equation 212
Exercises for Chapter 07 219
Exe. 07.1 poltergeist 219
Exe. 07.2 kathmandu 220

08 Optimization 222
Gradient descent 223
Stationary points 223
The gradient points the way 224
The classical method 225
The Barzilai and Borwein method 226
Constrained linear optimization 235
Feasible solutions form a polytope 235
Only the vertices matter 236
The simplex algorithm 238
Exercises for Chapter 08 245
Exe. 08.1 cummerbund 245
Exe. 08.2 melty 245

09 Nonlinear analysis 246
Nonlinear state-space models 248
Autonomous and nonautonomous systems 248
Equilibrium 248
Nonlinear system characteristics 250
Those in common with linear systems 250
Stability 250
Qualities of equilibria 251
Nonlinear system simulation 254
Simulating nonlinear systems in Python 255
Exercises for Chapter 09 257

A Distribution tables 258
A.01 Gaussian distribution table 259
A.02 Student's t-distribution table 261

B Fourier and Laplace tables 262
B.01 Laplace transforms 263
B.02 Fourier transforms 265

C Mathematics reference 268
C.01 Quadratic forms 269
Completing the square 269
C.02 Trigonometry 270
Triangle identities 270
Reciprocal identities 270
Pythagorean identities 270
Co-function identities 271
Even-odd identities 271
Sum-difference formulas (AM or lock-in) 271
Double angle formulas 271
Power-reducing or half-angle formulas 271
Sum-to-product formulas 272
Product-to-sum formulas 272
Two-to-one formulas 272
C.03 Matrix inverses 273
C.04 Laplace transforms 274

D Complex analysis 275
D.01 Euler's formulas 276

Bibliography 277

01 Mathematics itself

Truth

1. Before we can discuss mathematical truth, we should begin with a discussion of truth itself. (For much of this lecture I rely on the thorough overview of Glanzberg: Michael Glanzberg. "Truth". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018.) It is important to note that this is obviously extremely incomplete; my aim is to give a sense of the subject via brutal (mis)abbreviation.

2. Of course, the study of truth cannot but be entangled with the study of the world as such (metaphysics) and of knowledge (epistemology). Some of the following theories presuppose or imply a certain metaphysical and/or epistemological theory, but which these are is controversial.

Neo-classical theories of truth

3. The neo-classical theories of truth take for granted that there is truth and attempt to explain what its precise nature is (Michael Glanzberg. "Truth". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018). What are provided here are modern understandings of theories developed primarily in the early 20th century.

The correspondence theory

4. A version of what is called the correspondence theory of truth is the following:

    A proposition is true iff there is an existing entity in the world that corresponds with it.

Such existing entities are called facts. Facts are relational in that their parts (e.g., subject, predicate, etc.) are related in a certain way.

5. Under this theory, then, if a proposition does not correspond to a fact, it is false.

6. This theory of truth is rather intuitive and consistently popular (Marian David. "The Correspondence Theory of Truth". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2016. Metaphysics Research Lab, Stanford University, 2016. A detailed overview of the correspondence theory of truth).

The coherence theory

7. The coherence theory of truth is adamant that the truth of any given proposition is only as good as its holistic system of propositions. (This is typically put in terms of "beliefs" or "judgments", but for brevity and parallelism I have cast it in terms of propositions. It is to this theory I have probably committed the most violence.) This includes (but typically goes beyond) a requirement for consistency of a given proposition with the whole and the self-consistency of the whole itself, sometimes called coherence.

8. For parallelism, let's attempt a succinct formulation of this theory, cast in terms of propositions:

    A proposition is true iff it has coherence with a system of propositions.

9. Note that this has no reference to facts whatsoever. However, it need not necessarily preclude them.

The pragmatic theory

10. Of the neo-classical theories of truth, this is probably the least agreed upon as having a single, clear statement (Glanzberg, "Truth"). However, as with pragmatism in general, the pragmatic truth is oriented practically. (Pragmatism was an American philosophical movement of the early 20th century that valued the success of "practical" application of theories. For an introduction, see Legg and Hookway: Catherine Legg and Christopher Hookway. "Pragmatism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019. An introductory article on the philosophical movement "pragmatism"; it includes an important clarification of the pragmatic slogan "truth is the end of inquiry".)

11. Perhaps the most important aspect of this theory is that it is thoroughly a correspondence theory, agreeing that true propositions are those that correspond to the world. However, there is a different focus here that differentiates it from correspondence theory proper: it values as more true that which has some sort of practical use in human life.

12. We'll try to summarize pragmatism in two slogans with slightly different emphases; here's the first, again cast in propositional parallel:

    A proposition is true iff it works.

This is especially congruent with the work of William James (ibid.). Now, there are two ways this can be understood: (a) the proposition "works" in that it empirically corresponds to the world, or (b) the proposition "works" in that it has an effect that some agent intends. The former is pretty standard correspondence theory. The latter is new and, fairly obviously, has ethical implications, especially today.

13. Let us turn to a second formulation:

    A proposition is true if it corresponds with a process of inquiry.

This is especially congruent with the work of Charles Sanders Peirce (Legg and Hookway, "Pragmatism"). This has two interesting facets: (a) an agent's active inquiry creates truth, and (b) it is a sort of correspondence theory that requires a correspondence of a proposition with a process of inquiry, not, as in the correspondence theory, with a fact about the world. The latter has shades of both correspondence theory and coherence theory.

The picture theory

Before we delve into this theory, we must take a moment to clarify some terminology.

States of affairs and facts

14. When discussing the correspondence theory, we have used the term fact to mean an actual state of things in the world. A problem arises in the correspondence theory here. It says that a proposition is true iff there is a fact that corresponds with it. What of a negative proposition like "there are no cows in Antarctica"? We would seem to need a corresponding "negative fact" in the world to make this true. If a fact is taken to be composed of a complex of actual objects and relations, it is hard to imagine such facts. (But Barker and Jago have attempted just that: Stephen Barker and Mark Jago. "Being Positive About Negative Facts". In: Philosophy and Phenomenological Research 85.1 (2012), pp. 117-138. doi: 10.1111/j.1933-1592.2010.00479.x.)

15. Furthermore, if a proposition is true, it seems that it is the corresponding fact that makes it so; what, then, makes a proposition false, since there is no fact to support the falsity? (Mark Textor. "States of Affairs". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016.)

16. And what of nonsense? There are some propositions, like "there is a round cube", that are neither true nor false. However, the preceding correspondence theory cannot differentiate between false and nonsensical propositions.

17. A state of affairs is something possible that may or may not be actual (ibid.). If a state of affairs is actual, it is said to obtain. The picture theory will make central this concept instead of that of the fact.

The picture theory of meaning (and truth)

18. The picture theory of meaning uses the analogy of the model or picture to explain the meaningfulness of propositions. (See Wittgenstein: Ludwig Wittgenstein. Tractatus Logico-Philosophicus. Ed. by C. K. Ogden. Project Gutenberg International Library of Psychology, Philosophy and Scientific Method. Kegan Paul, Trench, Trubner & Co., Ltd., 1922. Also Biletzki and Matar: Anat Biletzki and Anat Matar. "Ludwig Wittgenstein". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2018. Metaphysics Research Lab, Stanford University, 2018. An introduction to Wittgenstein and his thought. Glock: Hans-Johann Glock. "Truth in the Tractatus". In: Synthese 148.2 (Jan. 2006), pp. 345-368. And Dolby: David Dolby. "Wittgenstein on Truth". In: A Companion to Wittgenstein. John Wiley & Sons, Ltd, 2016. Chap. 27, pp. 433-442. isbn: 9781118884607. doi: 10.1002/9781118884607.ch27.)

    A proposition names possible objects and arranges these names to correspond to a state of affairs.

Figure 01.1: a representation of the picture theory.

See Figure 01.1. This also allows for an easy account of truth, falsity, and nonsense:

nonsense: A sentence that appears to be a proposition is actually not if the arrangement of named objects is impossible. Such a sentence is simply nonsense.

truth: A proposition is true if the state of affairs it depicts obtains.

falsity: A proposition is false if the state of affairs it depicts does not obtain.
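The trichotomy above can be made concrete with a minimal sketch in Python, treating states of affairs as labels. This is only an illustration of the classification rule, not part of the theory itself; the sets `possible` and `obtains` and the helper `classify` are invented for the example.

```python
# Toy model of the picture theory's trichotomy (illustrative only).
possible = {"cat on mat", "cat off mat"}  # possible states of affairs
obtains = {"cat on mat"}                  # those that actually obtain

def classify(depicted_state):
    """Classify a sentence by the state of affairs it depicts."""
    if depicted_state not in possible:
        return "nonsense"  # impossible arrangement: not a proposition at all
    if depicted_state in obtains:
        return "true"      # the depicted state of affairs obtains
    return "false"         # possible, but does not obtain

print(classify("cat on mat"))   # true
print(classify("cat off mat"))  # false
print(classify("round cube"))   # nonsense
```

Note that "round cube" is not merely false: it depicts no possible state of affairs, so it falls outside the true/false pair entirely, exactly the distinction the correspondence theory could not draw.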

19. Now, some (Hans-Johann Glock. "Truth in the Tractatus". In: Synthese 148.2 (Jan. 2006), pp. 345-368. issn: 1573-0964. doi: 10.1007/s11229-004-6226-2) argue this is a correspondence theory, and others (David Dolby. "Wittgenstein on Truth". In: A Companion to Wittgenstein. John Wiley & Sons, Ltd, 2016. Chap. 27, pp. 433-442. isbn: 9781118884607. doi: 10.1002/9781118884607.ch27) that it is not. In any case, it certainly solves some issues that have plagued the correspondence theory.

"What cannot be said must be shown"

20. Something the picture theory does is declare a limit on what can meaningfully be said. A proposition (as defined above) must be potentially true or false. Therefore, something that cannot be false (something necessarily true) cannot be a proposition (ibid.). And there are certain things that are necessarily true for language itself to be meaningful: paradigmatically, the logical structure of the world. What a proposition does, then, is show

8. See also Slavoj Žižek. Less Than Nothing: Hegel and the Shadow of Dialectical Materialism. Verso, 2012. ISBN: 9781844678976 (one of the most interesting presentations of Hegel and Lacan by one of the most exciting contemporary philosophers), pp. 25-6, from whom I stole the section title.

9. This was one of the contributions to the "linguistic turn" (Wikipedia. Linguistic turn — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Linguistic%20turn&oldid=922305269 [Online; accessed 23-October-2019]. 2019. Hey, we all do it) of philosophy in the early 20th century.

via its own logical structure the necessary (for there to be meaningful propositions at all) logical structure of the world.8

21. An interesting feature of this perspective is that it opens up language itself to analysis and limitation.9 Furthermore, it suggests that the set of what is is smaller than the set of what can be meaningfully spoken about.

The relativity of truth

22. Each subject (i.e., agent) in the world, with their propositions, has a perspective: a given moment, a given place, an historical-cultural-linguistic situation. At the very least, the truth of propositions must account for this. Just how a theory of truth should do so is a matter of significant debate (Maria Baghramian and J. Adam Carter. "Relativism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2019. Metaphysics Research Lab, Stanford University, 2019).

23. Some go so far as to be skeptical about truth (Peter Klein. "Skepticism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2015. Metaphysics Research Lab, Stanford University, 2015), regarding it to be entirely impossible. Others say that, while a proposition may or may not be true, we could never come to know this.

24. Often underlying this conversation is the question of there being a common world in which we all participate and, if so, whether or not we can properly represent this world

10. Especially notable here is the work of Alfred Tarski in the mid-20th century.

in language such that multiple subjects could come to justifiably agree or disagree on the truth of a proposition. If every proposition is so relative that it is relevant to only the proposer, truth would seem of little value. On the other hand, if truth is understood to be "objective" (independent of subjective perspective), a number of objections can be made (Baghramian and Carter, "Relativism"), such as that there is no non-subjective perspective from which to judge truth.

Other ideas about truth

25. There are too many credible ideas about truth to attempt a reasonable summary; however, I will attempt to highlight a few important ones.

Formal methods

26. A set of tools was developed for exploring theories of truth, especially correspondence theories.10 Focus turned from beliefs to sentences, which are akin to propositions. (Recall that the above theories have already been recast in the more modern language of propositions.) Another aspect of these sentences under consideration is that they begin to be taken as interpreted sentences: they already have meaning.

27. Beyond this, several technical apparatuses are introduced that formalize criteria for truth. For instance, a sentence is given a sign φ. A need arises to distinguish between the quotation of sentence φ and the unquoted sentence φ, which is then given the quasi-quotation notation ⌜φ⌝. For instance, let φ stand for "snow is white"; then φ → snow is

11. Wilfrid Hodges. "Tarski's Truth Definitions". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018; Mario Gómez-Torrente. "Alfred Tarski". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019; Peter Hylton and Gary Kemp. "Willard Van Orman Quine". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019.

white and ⌜φ⌝ → "snow is white". Tarski introduces Convention T, which states that, for a fixed language L with fully interpreted sentences (Glanzberg, "Truth"):

An adequate theory of truth for L must imply, for each sentence φ of L,

⌜φ⌝ is true if and only if φ.

Using the same example, then,

"snow is white" is true if and only if snow is white.

Convention T states a general rule for the adequacy of a theory of truth and is used in several contemporary theories.
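As a playful and highly informal illustration (not part of Tarski's formal apparatus), the distinction between a quoted sentence and its truth condition can be mimicked in code: the quotation ⌜φ⌝ becomes a string, the unquoted φ becomes a condition about the "world," and an instance of the T-schema becomes a checkable equivalence. All names here are hypothetical, chosen for the example.

```python
# The "world": the fact that determines truth (an assumption for this toy example).
snow_is_white = True

# Truth conditions: maps a quoted sentence (a string) to the unquoted
# condition it expresses.
truth_conditions = {
    "snow is white": lambda: snow_is_white,
}

def is_true(quoted_sentence):
    """A toy 'truth predicate' over quoted sentences."""
    return truth_conditions[quoted_sentence]()

# A Convention T instance: "snow is white" is true if and only if snow is white.
assert is_true("snow is white") == snow_is_white
```

The point of the sketch is only the shape of the T-schema: the truth predicate applies to the quotation, while the right-hand side uses the sentence itself.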

28. We can see that these formal methods get quite technical and fun. For more, see Hodges, Gómez-Torrente, and Hylton and Kemp.11

Deflationary theories

29. Deflationary theories of truth try to minimize, or eliminate altogether, the concept of, or use of the term, "truth." For instance, the redundancy theory claims that (Glanzberg, "Truth"):

12. Daniel Stoljar and Nic Damnjanovic. "The Deflationary Theory of Truth". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2014. Metaphysics Research Lab, Stanford University, 2014.

To assert that ⌜φ⌝ is true is just to assert that φ.

Therefore, we can eliminate the use of "is true."

30. For more of less, see Stoljar and Damnjanovic.12

Language

31. It is important to recognize that language mediates truth; that is, truth is embedded in language. The way language in general affects theories of truth has been studied extensively. For instance, whether the truth-bearer is a belief or a proposition or a sentence, or something else, has been much discussed. The importance of the meaning of truth-bearers like sentences has played another large role. Theories of meaning, like the picture theory presented above, are often closely intertwined with theories of truth.

32. One of the most popular theories of meaning is called the theory of use:

For a large class of cases of the employment of the word "meaning" – though not for all – this word can be explained in this way: the meaning of a word is its use in the language. (L. Wittgenstein, P.M.S. Hacker, and J. Schulte. Philosophical

Investigations. Wiley, 2010. ISBN: 9781444307979)

13. Drew Khlentzos. "Challenges to Metaphysical Realism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016.

This theory is accompanied by the concept of language-games, which are loosely defined as rule-based contexts within which sentences have uses. The idea is that the meaning of a given sentence is its use in a network of meaning that is constantly evolving. This view tends to be understood as deflationary or relativistic about truth.

Metaphysical and epistemological considerations

33. We began with the recognition that truth is intertwined with metaphysics and epistemology. Let's consider a few such topics.

34. The first is metaphysical realism, which claims that there is a world existing objectively, independently of how we think about or describe it. This "realism" tends to be closely tied to, yet distinct from, scientific realism, which goes further, claiming the world is "actually" as science describes, independently of the scientific descriptions (e.g., there are actual objects corresponding to the phenomena we call atoms, molecules, light particles, etc.).

35. There have been many challenges to the realist claim (for some recent versions, see Khlentzos13), put forth by what is broadly called anti-realism. These vary, but often challenge the ability of realists to properly

14. These definitions are explicated by Guyer and Horstmann (Paul Guyer and Rolf-Peter Horstmann. "Idealism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2018. Metaphysics Research Lab, Stanford University, 2018).

link language to supposed objects in the world.

36. Metaphysical idealism has been characterized as claiming that "mind" or "subjectivity" generates or completely composes the world, which has no being outside mind. Epistemological idealism, on the other hand, while perhaps conceding that there is a world independent of mind, claims all knowledge of the world is created through mind and for mind, and therefore can never escape a sort of mind-world gap.14 This epistemological idealism has been highly influential since the work of Immanuel Kant (I. Kant, P. Guyer, and A.W. Wood. Critique of Pure Reason. The Cambridge Edition of the Works of Immanuel Kant. Cambridge University Press, 1999. ISBN: 9781107268333) in the late 18th century, which ushered in the idea of the noumenal world in-itself and the phenomenal world, which is how the noumenal world presents to us. Many have held that phenomena can be known through inquiry, whereas noumena are inaccessible. Furthermore, what can be known is restricted by the categories pre-existent in the knower.

37. Another approach, taken by Georg Wilhelm Friedrich Hegel (Paul Redding. "Georg Wilhelm Friedrich Hegel". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2018. Metaphysics Research Lab, Stanford University, 2018) and other German idealists following Kant, is to reframe reality as thoroughly integrating subjectivity (G.W.F. Hegel and A.V. Miller. Phenomenology of Spirit. Motilal Banarsidass, 1998. ISBN:

9788120814738; Slavoj Žižek. Less Than Nothing: Hegel and the Shadow of Dialectical Materialism. Verso, 2012. ISBN: 9781844678976. This is one of the most interesting presentations of Hegel and Lacan by one of the most exciting contemporary philosophers); that is, "everything turns on grasping and expressing the True, not only as Substance, but equally as Subject." A subject's proposition is true inasmuch as it corresponds with its Notion (approximately, the idea or meaning for the subject). Some hold that this idealism is compatible with a sort of metaphysical realism, at least as far as understanding is not independent of, but rather beholden to, reality (Žižek, Less Than Nothing, p. 906 ff.).

38. Clearly, all these ideas have many implications for theories of truth, and vice versa.

Where this leaves us

39. The truth is hard. What may at first appear to be a simple concept becomes complex upon analysis. It is important to recognize that we have only sampled some highlights of the theories of truth. I recommend further study of this fascinating topic.

40. Despite the difficulties of finding definitive grounds for understanding truth, we are faced with the task of provisionally forging ahead. Much of what follows in the study of mathematics makes certain implicit and explicit assumptions about truth. However, we have found that the foundations of these assumptions may themselves be problematic. It is my contention that, despite the lack of clear foundations, it is still worth studying engineering analysis, its mathematical foundations, and the foundations of truth itself. My justification for this claim is that I find the utility and the beauty of this study highly rewarding.

15. Throughout this section, for the history of mathematics I rely heavily on Kline (M. Kline. Mathematics: The Loss of Certainty. A Galaxy book. Oxford University Press, 1982. ISBN: 9780195030853. A detailed account of the "illogical" development of mathematics and an exposition of its therefore remarkable utility in describing the world).

The foundations of mathematics

1. Mathematics has long been considered exemplary for establishing truth. Primarily, it uses a method that begins with axioms (unproven propositions that include undefined terms) and uses logical deduction to prove other propositions (theorems), to show that they are necessarily true if the axioms are.

2. It may seem obvious that truth established in this way would always be relative to the truth of the axioms, but throughout history this footnote was often obscured by the "obvious" or "intuitive" universal truth of the axioms.15 For instance, Euclid (Wikipedia. Euclid — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Euclid&oldid=923031048 [Online; accessed 26-October-2019]. 2019) founded geometry (the study of mathematical objects traditionally considered to represent physical space, like points, lines, etc.) on axioms thought so solid that it was not until the early 19th century that Carl Friedrich Gauss (Wikipedia. Carl Friedrich Gauss — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Carl%20Friedrich%20Gauss&oldid=922692291 [Online; accessed 26-October-2019]. 2019) and others recognized this was only one among many possible geometries (Kline, Mathematics: The Loss of Certainty), resting on

different axioms. Furthermore, Aristotle (Christopher Shields. "Aristotle". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016) had acknowledged that reasoning must begin with undefined terms; however, even Euclid (presumably aware of Aristotle's work) seemed to forget this and provided definitions, obscuring the foundations of his work and starting mathematics on a path that, for over 2000 years, would forget its own relativity (Kline, Mathematics: The Loss of Certainty, pp. 101-2).

3. The foundations of Euclid's geometry were even shakier than its murky starting point: several unstated axioms were used in proofs, and some proofs were otherwise erroneous. However, for two millennia, mathematics was seen as the field wherein truth could be established beyond doubt.

Algebra ex nihilo

4. Although not much new geometry appeared during this period, the field of algebra (Wikipedia. Algebra — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Algebra&oldid=920573802 [Online; accessed 26-October-2019]. 2019), the study of manipulations of symbols standing for numbers in general, began with no axiomatic foundation whatsoever. The Greeks had a notion of rational numbers: ratios of natural numbers (positive integers). And it was known that many solutions to algebraic equations were irrational (could not be expressed as a ratio of integers). But these irrational numbers, like virtually everything else in algebra, were

gradually accepted because they were so useful in solving practical problems (they could be approximated by rational numbers, and this seemed good enough). The rules of basic arithmetic were accepted as applying to these and other forms of new numbers that arose in algebraic solutions: negative, imaginary, and complex numbers.

The application of mathematics to science

5. During this time, mathematics was being applied to optics and astronomy. Sir Isaac Newton then built calculus upon algebra, applying it to what is now known as Newtonian mechanics, which was really more the product of Leonhard Euler (George Smith. "Isaac Newton". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2008. Metaphysics Research Lab, Stanford University, 2008; Wikipedia. Leonhard Euler — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Leonhard%20Euler&oldid=921824700 [Online; accessed 26-October-2019]. 2019). Calculus introduced its own dubious operations, but the success of mechanics in describing and predicting physical phenomena was astounding. Mathematics was hailed as the language of God (later, Nature).

The rigorization of mathematics

6. It was not until Gauss created non-Euclidean geometry, in which Euclid's were shown to be one of many possible axioms compatible with the world, and William Rowan Hamilton (Wikipedia. William Rowan Hamilton — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=William%20Rowan%20Hamilton&oldid=923163451 [Online; accessed 26-October-

16. The branch of mathematics called model theory concerns itself with general types of models that can be made from a given formal system, like an axiomatic mathematical system. For more on model theory, see Hodges (Wilfrid Hodges. "Model Theory". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018). It is noteworthy that the engineering/science use of the term "mathematical model" is only loosely a "model" in the sense of model theory.

2019]. 2019) created quaternions (Wikipedia. Quaternion — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Quaternion&oldid=920710557 [Online; accessed 26-October-2019]. 2019), a number system in which multiplication is noncommutative, that it became apparent something was fundamentally wrong with the way truth in mathematics had been understood. This started a period of rigorization in mathematics that set about axiomatizing and proving 19th-century mathematics. This included the development of symbolic logic, which aided in the process of deductive reasoning.

7. An aspect of this rigorization is that mathematicians came to terms with the axioms that include undefined terms. For instance, a "point" might be such an undefined term in an axiom. A mathematical model is what we create when we attach these undefined terms to objects, which can be anything consistent with the axioms.16 The system that results from proving theorems would then apply to anything "properly" described by the axioms. So two masses might be assigned "points" in a Euclidean geometric space, from which we could be confident that, for instance, the "distance" between these masses is the Euclidean norm of the line drawn between the points. It could be said, then, that a "point" in Euclidean geometry is implicitly defined by its axioms and theorems, and nothing else. That is, mathematical objects are not inherently tied to the physical objects to which we tend to apply them. Euclidean geometry is not the study of physical space, as it was long considered; it is the study of the objects implicitly defined by its axioms and theorems.

The foundations of mathematics are built


8. The building of the modern foundations of mathematics began with clear axioms, solid reasoning (with symbolic logic), and lofty yet seemingly attainable goals: prove theorems to support the already ubiquitous mathematical techniques in geometry, algebra, and calculus from axioms; furthermore, prove that these axioms (and things they imply) do not contradict each other, i.e., are consistent, and that the axioms are not results of each other (one that can be derived from others is a theorem, not an axiom).

9. Set theory is a type of formal axiomatic system with which all modern mathematics is expressed, so set theory is often called the foundation of mathematics (Joan Bagaria. "Set Theory". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2019. Metaphysics Research Lab, Stanford University, 2019). We will study the basics in Lecture setssetintro. The primary objects in set theory are sets: informally, collections of mathematical objects. There is not a single set of axioms that is used as the foundation of all mathematics, for reasons we will review in a moment. However, the most popular set theory is Zermelo-Fraenkel set theory with the axiom of choice (ZFC). The axioms of ZF, sans C, are as follows (ibid.):

extensionality: If two sets A and B have the same elements, then they are equal.

empty set: There exists a set, denoted by ∅ and called the empty set, which has no elements.

pair: Given any sets A and B, there exists a set, denoted by {A, B}, which contains A and B as its only elements. In particular, there exists the set {A}, which has A as its only element.

power set: For every set A, there exists a set, denoted by P(A) and called the power set of A, whose elements are all the subsets of A.

union: For every set A, there exists a set, denoted by ⋃A and called the union of A, whose elements are all the elements of the elements of A.

infinity: There exists an infinite set. In particular, there exists a set Z that contains ∅ and such that if A ∈ Z, then ⋃{A, {A}} ∈ Z.

separation: For every set A and every given property, there is a set containing exactly the elements of A that have that property. A property is given by a formula ϕ of the first-order language of set theory. Thus, separation is not a single axiom but an axiom schema, that is, an infinite list of axioms, one for each formula ϕ.

replacement: For every given definable function with domain a set A, there is a set whose elements are all the values of the function.

10. ZFC also has the axiom of choice (Bagaria, "Set Theory"):

choice: For every set A of pairwise-disjoint, non-empty sets, there exists a set that contains exactly one element from each set in A.
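Several of these axioms describe constructions that can be imitated concretely for finite sets. The following Python sketch (an informal illustration using `frozenset`, not a formal model of ZFC) builds the pair {A, B}, the union ⋃{A, B}, and the power set P(A) for small example sets.

```python
from itertools import chain, combinations

A = frozenset({1, 2})
B = frozenset({2, 3})

# pair axiom: the set {A, B} containing A and B as its only elements
pair = frozenset({A, B})

# union axiom: the elements of the elements of pair, i.e. A ∪ B here
big_union = frozenset(chain.from_iterable(pair))

# power set axiom: all subsets of A
def power_set(s):
    return frozenset(
        frozenset(c) for r in range(len(s) + 1) for c in combinations(s, r)
    )

print(sorted(big_union))    # the union contains 1, 2, and 3
print(len(power_set(A)))    # 4 subsets: the empty set, {1}, {2}, {1, 2}
```

Using immutable `frozenset` objects is what allows sets to be elements of other sets here, mirroring the set-of-sets constructions the axioms describe.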

17. Panu Raatikainen. "Gödel's Incompleteness Theorems". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018. A thorough and contemporary description of Gödel's incompleteness theorems, which have significant implications for the foundations and function of mathematics and mathematical truth.

The foundations have cracks

11. The foundationalists' goal was to prove that some set of axioms from which all of mathematics can be derived is both consistent (contains no contradictions) and complete (every true statement is provable). The work of Kurt Gödel (Juliette Kennedy. "Kurt Gödel". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2018. Metaphysics Research Lab, Stanford University, 2018) in the mid-20th century shattered this dream by proving, in his first incompleteness theorem, that any consistent formal system within which one can do some amount of basic arithmetic is incomplete. His argument is worth reviewing (see Raatikainen17), but at its heart is an undecidable statement like "This sentence is unprovable." Let U stand for this statement. If it is true, it is unprovable. If it is provable, it is false. Therefore, it is true iff it is unprovable. Then he shows that if a statement A that essentially says "arithmetic is consistent" is provable, then so is the undecidable statement U. But if the system is to be consistent, U cannot be provable, and therefore neither can A be provable.

12. This is problematic. It tells us virtually any conceivable axiomatic foundation of mathematics is incomplete. If one is complete, it is inconsistent (and therefore worthless). One problem this introduces is that a true theorem may be impossible to prove, but it turns out we can never know, in advance of its proof, if it is provable.

13. But it gets worse. Gödel's second

18. Kline, Mathematics: The Loss of Certainty.

incompleteness theorem shows that such systems cannot even be shown to be consistent. This means at any moment someone could find an inconsistency in mathematics, and not only would we lose some of the theorems, we would lose them all. This is because, by what is called the material implication (Kline, Mathematics: The Loss of Certainty, pp. 187-8, 264), if one contradiction can be found, every proposition can be proven from it. And if this is the case, all (even proven) theorems in the system would be suspect.
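The claim that one contradiction proves every proposition (via material implication) can be checked mechanically for propositional logic. This short Python sketch enumerates the truth table of (p ∧ ¬p) → q and confirms it holds under every assignment.

```python
from itertools import product

def implies(a, b):
    # material implication: "if a then b" is false only when a is true and b is false
    return (not a) or b

# (p and not p) -> q is true under every assignment of p and q,
# so a single contradiction lets us derive any proposition q
assert all(implies(p and not p, q) for p, q in product([True, False], repeat=2))
print("ex falso quodlibet confirmed for all assignments")
```

Since the antecedent p ∧ ¬p is false in every row, the implication is vacuously true in every row, which is exactly why an inconsistent system proves everything.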

14. Even though no contradiction has yet appeared in ZFC, its axiom of choice, which is required for the proof of most of what has thus far been proven, generates the Banach-Tarski paradox, which says a sphere of diameter x can be partitioned into a finite number of pieces and recombined to form two spheres of diameter x. Troubling, to say the least. Attempts were made for a while to eliminate the use of the axiom of choice, but our buddy Gödel later proved that if ZF is consistent, so is ZFC (ibid., p. 267).

Mathematics is considered empirical

15. Since its inception, mathematics has been applied extensively to the modeling of the world. Despite its cracked foundations, it has striking utility. Many recent leading minds of mathematics, philosophy, and science suggest we treat mathematics as empirical, like any science: subject to its success in describing and predicting events in the world. As Kline18 summarizes,

The upshot […] is that sound mathematics must be determined not by any one foundation which may some day prove to be right. The "correctness" of mathematics must be judged by its application to the physical world. Mathematics is an empirical science much as Newtonian mechanics. It is correct only to the extent that it works and, when it does not, it must be modified. It is not a priori knowledge even though it was so regarded for two thousand years. It is not absolute or unchangeable.

Mathematical reasoning

Mathematical topics in overview

What is mathematics for engineering

Exercises for Chapter 01


Mathematical reasoning, logic, and set theory

In order to communicate mathematical ideas effectively, formal languages have been developed within which logic, i.e., deductive (mathematical) reasoning, can proceed. Propositions are statements that can be either true (⊤) or false (⊥). Axiomatic systems begin with statements (axioms) assumed true. Theorems are proven by deduction. In many forms of logic, like the propositional calculus (Wikipedia. Propositional calculus — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Propositional%20calculus&oldid=914757384 [Online; accessed 29-October-2019]. 2019), compound propositions are constructed via logical connectives, like "and" and "or," of atomic propositions (see Lecture setslogic). In others, like first-order logic (Wikipedia. First-order logic — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=First-order%20logic&oldid=921437906 [Online; accessed 29-October-2019]. 2019), there are also

logical quantifiers, like "for every" and "there exists."
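As a concrete, informal illustration, Python's boolean operators play the role of the connectives, and `all`/`any` over a finite domain behave like the quantifiers "for every" and "there exists."

```python
p, q = True, False

# logical connectives on atomic propositions
print(p and q)   # conjunction, p ∧ q: False
print(p or q)    # disjunction, p ∨ q: True
print(not p)     # negation, ¬p: False

# quantifiers over a finite domain X
X = {1, 2, 3}
print(all(x > 0 for x in X))   # "for every x in X, x > 0": True
print(any(x > 2 for x in X))   # "there exists x in X with x > 2": True
```

The analogy is only for finite domains; genuine first-order quantification ranges over domains that need not be enumerable this way.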

The mathematical objects and operations about which most propositions are made are expressed in terms of set theory, which was introduced in Lecture itselffound and will be expanded upon in Lecture setssetintro. We can say that mathematical reasoning comprises mathematical objects and operations expressed in set theory, and logic allows us to reason therewith.

1. K. Ciesielski. Set Theory for the Working Mathematician. London Mathematical Society Student Texts. Cambridge University Press, 1997. ISBN: 9780521594653. A readable introduction to set theory.

2. H.B. Enderton. Elements of Set Theory. Elsevier Science, 1977. ISBN: 9780080570426. A gentle introduction to set theory and mathematical reasoning; a great place to start.


Introduction to set theory

Set theory is the language of the modern foundation of mathematics, as discussed in Lecture itselffound. It is unsurprising, then, that it arises throughout the study of mathematics. We will use set theory extensively in Chapter 03 on probability theory.

The axioms of ZFC set theory were introduced in Lecture itselffound. Instead of proceeding in the pure mathematics way of introducing and proving theorems, we will opt for a more applied approach, in which we begin with some simple definitions and include basic operations. A more thorough and still readable treatment is given by Ciesielski,1 and a very gentle version by Enderton.2

A set is a collection of objects. Set theory gives us a way to describe these collections. Often the objects in a set are numbers or sets of numbers. However, a set could represent collections of zebras and trees and hairballs. For instance, here are some sets:
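The example sets here are left as blanks to be filled in by hand in the notes. As a hypothetical illustration, here are a few sets written in Python, whose set literals closely mirror the usual mathematical brace notation.

```python
A = {1, 2, 3}                          # a set of numbers
B = {"zebra", "tree", "hairball"}      # a set of (names of) things
C = {frozenset({1, 2}), frozenset()}   # a set whose elements are themselves sets
print(A)
print(B)
print(C)
```

Note that a plain Python `set` is mutable and therefore unhashable, so sets of sets require the immutable `frozenset`.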

A field is a set with special structure. This structure is provided by the addition (+) and multiplication (×) operators and their inverses, subtraction (–) and division (÷).

3. When the natural numbers include zero, we write N0.

The quintessential example of a field is the set of real numbers R, which admits these operators, making it a field. The reals R, the complex numbers C, the integers Z, and the natural numbers3 N are the number sets we typically consider, although, strictly speaking, only R and C among these are fields (Z and N lack multiplicative inverses).

Set membership is the belonging of an object to a set. It is denoted with the symbol ∈, which can be read "is an element of," for element x and set X.

For instance, we might say 7 ∈ {1, 7, 2} or 4 ∉ {1, 7, 2}. Or we might declare that x is a real number by stating x ∈ R.

Set operations can be used to construct new sets from established sets. We consider a few common set operations now.

The union (∪) of sets is the set containing all the elements of the original sets (no repetition allowed). The union of sets A and B is denoted A ∪ B. For instance, let A = {1, 2, 3} and B = {–1, 3}; then


The intersection (∩) of sets is a set containing the elements common to all the original sets. The intersection of sets A and B is denoted A ∩ B. For instance, let A = {1, 2, 3} and B = {2, 3, 4}; then

A ∩ B = {2, 3}.

If two sets have no elements in common, the intersection is the empty set ∅ = {}, the unique set with no elements.

The set difference of two sets A and B is the set of elements in A that aren't also in B. It is denoted A \ B. For instance, let A = {1, 2, 3} and B = {2, 3, 4}; then

A \ B = {1}.

A subset (⊆) of a set is a set the elements of which are contained in the original set. If the two sets are equal, one is still considered a subset of the other. We call a subset that is not equal to the other set a proper subset (⊂).


For instance, let A = {1, 2, 3} and B = {1, 2}. Then B ⊂ A.

The complement of a subset is the set of elements of the original set that aren't in the subset. For instance, if B ⊆ A, then the complement of B, denoted B̄, is

B̄ = A \ B.

The cartesian product of two sets A and B is denoted A × B and is the set of all ordered pairs (a, b), where a ∈ A and b ∈ B. It's worthwhile considering the following notation for this definition:

A × B = {(a, b) | a ∈ A and b ∈ B},

which means "the cartesian product of A and B is the set of ordered pairs (a, b) such that a ∈ A and b ∈ B" in set-builder notation (Wikipedia, Set-builder notation — Wikipedia, The Free Encyclopedia, [Online; accessed 29 October 2019], 2019).

Let A and B be sets. A map or function f from A to B is an assignment of an element b ∈ B to each element a ∈ A. The function is denoted f : A → B, and we say that f maps each element a ∈ A to an element f(a) ∈ B, called the value of a under f, written a ↦ f(a). We say that f has domain A and codomain B. The image of f is the subset of its codomain B that contains the values of all elements mapped by f from its domain A.
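These definitions map directly onto a language with a built-in set type. The following is a minimal sketch in Python (illustrative only; the set contents are arbitrary choices, not from the lecture) checking each operation above.

```python
A = {1, 2, 3}
B = {2, 3, 4}

member = 7 in {1, 7, 2}                   # set membership: True
union = A | B                             # {1, 2, 3, 4}
intersection = A & B                      # {2, 3}
difference = A - B                        # A \ B = {1}
proper_subset = {1, 2} < A                # True
product = {(a, b) for a in A for b in B}  # cartesian product: 9 ordered pairs

print(union, intersection, difference)
```

Python's `|`, `&`, and `-` operators on sets correspond directly to ∪, ∩, and \.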


setslogic Logical connectives and quantifiers

In order to make compound propositions, we need to define logical connectives. In order to specify quantities of variables, we need to define logical quantifiers. The following is a form of first-order logic (Wikipedia, First-order logic — Wikipedia, The Free Encyclopedia).

Logical connectives

A proposition can be either true (⊤) or false (⊥). When it does not contain a logical connective, it is called an atomistic proposition. To combine propositions into a compound proposition, we require logical connectives. They are not (¬), and (∧), and or (∨). Table 02.1 is a truth table for a number of connectives.

Quantifiers

Logical quantifiers allow us to indicate the quantity of a variable. The universal quantifier ∀ means "for all." For instance, let A be a set; then ∀a ∈ A means "for all elements in A" and gives this quantity to variable a. The existential quantifier ∃ means "there exists at least one" or "for some." For instance, let A be a set; then ∃a ∈ A

Table 02.1: a truth table for logical connectives. The first two columns are the truth values of propositions p and q; the rest are outputs.

p  q  | ¬p | p ∧ q | p ∨ q | p ↑ q (nand) | p ↓ q (nor) | p ⊻ q (xor) | p ⇔ q (xnor)
⊥  ⊥  | ⊤  | ⊥     | ⊥     | ⊤            | ⊤           | ⊥           | ⊤
⊥  ⊤  | ⊤  | ⊥     | ⊤     | ⊤            | ⊥           | ⊤           | ⊥
⊤  ⊥  | ⊥  | ⊥     | ⊤     | ⊤            | ⊥           | ⊤           | ⊥
⊤  ⊤  | ⊥  | ⊤     | ⊤     | ⊥            | ⊥           | ⊥           | ⊤

means "there exists at least one element a in A …"
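The connectives of Table 02.1 can be checked programmatically; a short Python sketch (with True standing for ⊤ and False for ⊥):

```python
from itertools import product

# each row: (p, q, not, and, or, nand, nor, xor, xnor), matching Table 02.1
table = [
    (p, q, not p, p and q, p or q,
     not (p and q), not (p or q), p != q, p == q)
    for p, q in product([False, True], repeat=2)
]

for row in table:
    print(" ".join("T" if v else "F" for v in row))
```

Note that xor is inequality of truth values and xnor (⇔) is equality, which is why `!=` and `==` implement them.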

setsex Exercises for Chapter 02

Exercise 02.1 (hardhat)

For the following, write the set described in set-builder notation:

a. A = {2, 3, 5, 9, 17, 33, · · · }
b. B is the set of integers divisible by 11
c. C = {13, 14, 15, · · · }
d. D is the set of reals between −3 and 42

Exercise 02.2 (2)

Let x, y ∈ ℝⁿ. Prove the Cauchy-Schwarz Inequality:

|x · y| ≤ ‖x‖‖y‖.  (ex.1)

Hint: you may find the geometric definition of the dot product helpful.

Exercise 02.3 (3)

Let x ∈ ℝⁿ. Prove that

x · x = ‖x‖².  (ex.2)

Hint: you may find the geometric definition of the dot product helpful.

Exercise 02.4 (4)

Let x, y ∈ ℝⁿ. Prove the Triangle Inequality:

‖x + y‖ ≤ ‖x‖ + ‖y‖.  (ex.3)

Hint: you may find the Cauchy-Schwarz Inequality helpful.

Probability

probmeas Probability and measurement

1. For a good introduction to probability theory, see Ash (Robert B. Ash, Basic Probability Theory, Dover Publications, Inc., 2008) or Jaynes et al. (E.T. Jaynes et al., Probability Theory: The Logic of Science, Cambridge University Press, 2003, ISBN 9780521592710, an excellent and comprehensive introduction to probability theory).

Probability theory is a well-defined branch of mathematics. Andrey Kolmogorov described a set of axioms in 1933 that are still in use today as the foundation of probability theory.1

We will implicitly use these axioms in our analysis. The interpretation of probability is a contentious matter. Some believe probability quantifies the frequency of the occurrence of some event that is repeated in a large number of trials. Others believe it quantifies the state of our knowledge, or belief, that some event will occur.

In experiments, our measurements are tightly coupled to probability. This is apparent in the questions we ask. Here are some examples:

1. How common is a given event?
2. What is the probability we will reject a good theory based on experimental results?
3. How repeatable are the results?
4. How confident are we in the results?
5. What is the character of the fluctuations and drift in the data?
6. How much data do we need?

probprob Basic probability theory

The mathematical model for a class of measurements is called the probability space and is composed of a mathematical triple of a sample space Ω, σ-algebra F, and probability measure P, typically denoted (Ω, F, P), each of which we will consider in turn (Wikipedia, Probability space — Wikipedia, The Free Encyclopedia, [Online; accessed 31 October 2019], 2019).

The sample space Ω of an experiment is the set representing all possible outcomes of the experiment. If a coin is flipped, the sample space is Ω = {H, T}, where H is heads and T is tails. If a coin is flipped twice, the sample space could be

Ω = {HH, HT, TH, TT}.

However, the same experiment can have different sample spaces. For instance, for two coin flips, we could also choose:

We base our choice of Ω on the problem at hand.

An event is a subset of the sample space. That is, an event corresponds to a yes-or-no question about the experiment. For instance, event A (remember A ⊆ Ω) in the coin flipping experiment (two flips) might be A = {HT, TH}. A is an event that corresponds to the question "Is the second flip different than the first?" A is the event for which the answer is "yes."

Algebra of events

Because events are sets, we can perform the usual set operations with them.

Example probprob-1: set operations with events

Problem Statement: Consider a toss of a single die. We choose the sample space to be Ω = {1, 2, 3, 4, 5, 6}. Let the following define events:

A ≡ the result is even = {2, 4, 6}
B ≡ the result is greater than 2 = {3, 4, 5, 6}

Find the following event combinations: A ∪ B, A ∩ B, A \ B, B \ A, Ā, B̄.
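A minimal sketch of this example in Python, using built-in sets (the event definitions follow the problem statement; the complement of an event E is Ω \ E):

```python
Omega = {1, 2, 3, 4, 5, 6}   # sample space of one die toss
A = {2, 4, 6}                # result is even
B = {3, 4, 5, 6}             # result is greater than 2

print("A union B =", A | B)            # {2, 3, 4, 5, 6}
print("A intersect B =", A & B)        # {4, 6}
print("A minus B =", A - B)            # {2}
print("B minus A =", B - A)            # {3, 5}
print("complement of A =", Omega - A)  # {1, 3, 5}
print("complement of B =", Omega - B)  # {1, 2}
```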


The σ-algebra F is the collection of events of interest. Often F is the set of all possible events given a sample space Ω, which is just the power set of Ω (Wikipedia, Probability space — Wikipedia, The Free Encyclopedia). When referring to an event, we often state that it is an element of F. For instance, we might say an event A ∈ F.

We're finally ready to assign probabilities to events. We define the probability measure P : F → [0, 1] to be a function satisfying the following conditions:

1. For every event A ∈ F, the probability measure of A is greater than or equal to zero, i.e. P(A) ≥ 0.
2. If an event is the entire sample space, its probability measure is unity, i.e. if A = Ω, P(A) = 1.
3. If events A1, A2, · · · are disjoint sets (no elements in common), then P(A1 ∪ A2 ∪ · · · ) = P(A1) + P(A2) + · · · .

We conclude the basics by observing four facts

that can be proven from the definitions above

1

2

3

4

probcondition Independence and conditional probability

Two events A and B are independent if and only if

P(A ∩ B) = P(A)P(B).

If an experimenter must make a judgment without data about the independence of events, they base it on their knowledge of the events, as discussed in the following example.

Example probcondition-1: independence

Problem Statement: Answer the following questions and imperatives.

1. Consider a single fair die rolled twice. What is the probability that both rolls are 6?
2. What changes if the die is biased by a weight such that P(6) = 1/7?
3. What changes if the die is biased by a magnet, rolled on a magnetic dice-rolling tray, such that P(6) = 1/7?
4. What changes if there are two dice, biased by weights such that for each P(6) = 1/7, rolled once, both resulting in 6?
5. What changes if there are two dice, biased by magnets, rolled together?


Conditional probability

If events A and B are somehow dependent, we need a way to compute the probability of B occurring given that A occurs. This is called the conditional probability of B given A and is denoted P(B | A). For P(A) > 0, it is defined as

P(B | A) = P(A ∩ B) / P(A).  (condition.1)

We can interpret this as a restriction of the sample space Ω to A, i.e. the new sample space Ω′ = A ⊆ Ω. Note that if A and B are independent, we obtain the obvious result

P(B | A) = P(B).

Example probcondition-2: dependence

Problem Statement: Consider two unbiased dice rolled once. Let events A = {sum of faces = 8} and B = {faces are equal}. What is the probability the faces are equal given that their sum is 8?
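One way to check the answer is to enumerate the 36 equally likely outcomes; a sketch (not part of the original example) applying Equation (condition.1) directly:

```python
from fractions import Fraction
from itertools import product

Omega = list(product(range(1, 7), repeat=2))   # all 36 ordered outcomes
A = [w for w in Omega if w[0] + w[1] == 8]     # sum of faces is 8
A_and_B = [w for w in A if w[0] == w[1]]       # ...and faces are equal

P_A = Fraction(len(A), len(Omega))
P_A_and_B = Fraction(len(A_and_B), len(Omega))
P_B_given_A = P_A_and_B / P_A                  # Equation (condition.1)
print(P_B_given_A)   # 1/5
```

Of the five outcomes summing to 8, only (4, 4) has equal faces, so P(B | A) = 1/5.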

probbay Bayes' theorem

Given two events A and B, Bayes' theorem (aka Bayes' rule) states that

P(A | B) = P(B | A)P(A) / P(B).  (bay.1)

Sometimes this is written

P(A | B) = P(B | A)P(A) / (P(B | A)P(A) + P(B | ¬A)P(¬A))  (bay.2)
         = 1 / (1 + (P(B | ¬A)/P(B | A)) · (P(¬A)/P(A))).  (bay.3)

This is a useful theorem for determining a test's effectiveness. If a test is performed to determine whether an event has occurred, we might ask questions like "if the test indicates that the event has occurred, what is the probability it has actually occurred?" Bayes' theorem can help compute an answer.

Testing outcomes

The test can be either positive or negative, meaning it can either indicate or not indicate that A has occurred. Furthermore, this result can be either true or false. There are four options, then. Consider an event A and an event that is a test result B indicating that event A has occurred. Table 03.1 shows these four possible test outcomes. The event A occurring can lead to a true positive or a false negative, whereas ¬A can lead to a true negative or a false positive.

Table 03.1: test outcome B for event A.

              A      ¬A
positive (B)  true   false
negative (¬B) false  true

Terminology is important here:

• P(true positive) = P(B | A), aka sensitivity or detection rate
• P(true negative) = P(¬B | ¬A), aka specificity
• P(false positive) = P(B | ¬A)
• P(false negative) = P(¬B | A)

Clearly, the desirable result for any test is that it is true. However, no test is true 100 percent of the time. So sometimes it is desirable to err on the side of the false positive, as in the case of a medical diagnostic. Other times it is more desirable to err on the side of a false negative, as in the case of testing for defects in manufactured balloons (when a false negative isn't a big deal).

Posterior probabilities

Returning to Bayes' theorem, we can evaluate the posterior probability P(A | B) of the event A having occurred, given that the test B is positive, from information that includes the prior probability P(A) of A. The form in Equation (bay.2) or (bay.3) is typically useful because it uses commonly known test probabilities of the true positive P(B | A) and of the false positive P(B | ¬A). We calculate P(A | B) when we want to interpret test results.

Figure 03.1: for different high sensitivities (P(B | A) = 0.9000, 0.9900, 0.9990, 0.9999), the probability P(A | B) that an event A occurred given that a test for it B is positive, versus the probability P(A) that the event A occurs, under the assumption the specificity equals the sensitivity.

Some interesting results can be found from this. For instance, if we let P(B | A) = P(¬B | ¬A) (sensitivity equals specificity) and realize that P(B | ¬A) + P(¬B | ¬A) = 1 (given ¬A, either B or ¬B must occur), we can derive the expression

P(B | ¬A) = 1 − P(B | A).  (bay.4)

Using this and P(¬A) = 1 − P(A) in Equation (bay.3) gives (recall we've assumed sensitivity equals specificity)

P(A | B) = 1 / (1 + ((1 − P(B | A))/P(B | A)) · ((1 − P(A))/P(A)))  (bay.5)
         = 1 / (1 + (1/P(B | A) − 1)(1/P(A) − 1)).  (bay.6)

This expression is plotted in Figure 03.1. See that a positive result for a rare event (small P(A)) is hard to trust unless the sensitivity P(B | A) and specificity P(¬B | ¬A) are very high indeed.

Example probbay-1: Bayes' theorem

Problem Statement: Suppose 0.1% of springs manufactured at a given plant are defective. Suppose you need to design a test such that, when it indicates a defective part, the part is actually defective 99% of the time. What sensitivity should your test have, assuming it can be made equal to its specificity?

Problem Solution: The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: bayes_theorem_example_01.ipynb
notebook kernel: python3

from sympy import *  # for symbolics
import numpy as np  # for numerics
import matplotlib.pyplot as plt  # for plots
from IPython.display import display, Markdown, Latex

Define symbolic variables:

var('p_A p_nA p_B p_nB p_B_A p_B_nA p_A_B', real=True)

(p_A, p_nA, p_B, p_nB, p_B_A, p_B_nA, p_A_B)

Beginning with Bayes' theorem and assuming the sensitivity and specificity are equal, by Equation (bay.4) we can derive the following expression for the posterior probability P(A | B):

p_A_B_e1 = Eq(p_A_B, p_B_A*p_A/p_B).subs({
    p_B: p_B_A*p_A + p_B_nA*p_nA,  # conditional prob
    p_B_nA: 1 - p_B_A,  # Eq. (bay.4)
    p_nA: 1 - p_A,
})
display(p_A_B_e1)

p_A_B = p_A p_B_A / (p_A p_B_A + (1 − p_A)(1 − p_B_A))

Solve this for P(B | A), the quantity we seek:

p_B_A_sol = solve(p_A_B_e1, p_B_A, dict=True)
p_B_A_eq1 = Eq(p_B_A, p_B_A_sol[0][p_B_A])
display(p_B_A_eq1)

p_B_A = p_A_B (1 − p_A) / (−2 p_A p_A_B + p_A + p_A_B)

Now let's substitute the given probabilities:

p_B_A_spec = p_B_A_eq1.subs({
    p_A: 0.001,
    p_A_B: 0.99,
})
display(p_B_A_spec)

p_B_A = 0.999989888981011

That's a tall order!


probrando Random variables

Probabilities are useful even when they do not deal strictly with events. It often occurs that we measure something that has randomness associated with it. We use random variables to represent these measurements.

A random variable X : Ω → ℝ is a function that maps an outcome ω from the sample space Ω to a real number x ∈ ℝ, as shown in Figure 03.2. A random variable will be denoted with a capital letter (e.g. X and K), and a specific value that it maps to (the value) will be denoted with a lowercase letter (e.g. x and k).

A discrete random variable K is one that takes on discrete values. A continuous random variable X is one that takes on continuous values.

Example probrando-1: dice again

Problem Statement: Roll two unbiased dice. Let K be a random variable representing the sum of the two. Let P(k) be the probability of the result k ∈ K. Plot and interpret P(k).

Figure 03.2: a random variable X maps an outcome ω ∈ Ω to an x ∈ ℝ.
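A numeric sketch of this example (enumeration in Python rather than a derivation; the plotting step is replaced by printing):

```python
from fractions import Fraction
from itertools import product
from collections import Counter

# count how many of the 36 equally likely outcomes yield each sum k
counts = Counter(d1 + d2 for d1, d2 in product(range(1, 7), repeat=2))
P = {k: Fraction(n, 36) for k, n in sorted(counts.items())}

for k, p in P.items():
    print(k, p)   # P(k) is triangular, peaking at k = 7
```

The PMF is triangular because midrange sums can be formed from more outcome pairs than extreme sums can.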

Example probrando-2: Johnson-Nyquist noise

Problem Statement: A resistor at nonzero temperature, without any applied voltage, exhibits an interesting phenomenon: its voltage randomly fluctuates. This is called Johnson-Nyquist noise and is a result of thermal excitation of charge carriers (electrons, typically). For a given resistor and measurement system, let the probability density function f_V of the voltage V across an unrealistically hot resistor be

f_V(V) = (1/√π) e^(−V²).

Plot and interpret the meaning of this function.

Figure 03.3: plot of a probability mass function.

probpxf Probability density and mass functions

Consider an experiment that measures a random variable. We can plot the relative frequency of the measurand landing in different "bins" (ranges of values). This is called a frequency distribution or a probability mass function (PMF).

Consider, for instance, a probability mass function as plotted in Figure 03.3, where a frequency ai can be interpreted as an estimate of the probability of the measurand being in the ith interval. The sum of the frequencies must be unity:

Σ_{i=1}^{k} a_i = 1,

with k being the number of bins.

The frequency density distribution is similar to the frequency distribution, but with a_i ↦ a_i/∆x, where ∆x is the bin width. If we let the bin width approach zero, we derive the probability density function (PDF):

f(x) = lim_{k→∞, ∆x→0} Σ_{j=1}^{k} a_j/∆x.  (pxf.1)

Figure 03.4: plot of a probability density function.

We typically think of a probability density function f, like the one in Figure 03.4, as a function that can be integrated over to find the probability of the random variable (measurand) being in an interval [a, b]:

P(x ∈ [a, b]) = ∫_a^b f(χ) dχ.  (pxf.2)

Of course,

∫_{−∞}^{∞} f(x) dx = 1.

We now consider a common PMF and a common PDF.
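Equation (pxf.2) can be checked numerically. A sketch using the PDF f_V(V) = e^(−V²)/√π from Example probrando-2 (the integration range and rectangle-rule step size are arbitrary choices, not from the lecture):

```python
import numpy as np

f = lambda v: np.exp(-v**2) / np.sqrt(np.pi)  # PDF from the resistor example

x = np.linspace(-6, 6, 12_001)
dx = x[1] - x[0]

total = np.sum(f(x)) * dx            # total probability, should be ~1
in_ab = (x >= -1) & (x <= 1)
P_ab = np.sum(f(x[in_ab])) * dx      # P(V in [-1, 1]), Equation (pxf.2)
print(total, P_ab)
```

P(V ∈ [−1, 1]) comes out near 0.84, i.e. most of the probability mass is concentrated near zero voltage.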

Binomial PMF

Consider a random binary sequence of length n, such that each element is a random 0 or 1, generated independently, like

(1, 0, 1, 1, 0, · · · , 1, 1).  (pxf.3)

Let events 1 and 0 be mutually exclusive and exhaustive, and P(1) = p. The probability of the sequence above occurring is p^k (1 − p)^(n−k), where k is the number of ones in the sequence.

There are n choose k,

(n k) = n! / (k!(n − k)!),  (pxf.4)

possible combinations of k ones for n bits. Therefore, the probability of any combination of k ones in a series is

f(k) = (n k) p^k (1 − p)^(n−k).  (pxf.5)

We call Equation (pxf.5) the binomial distribution PMF.

Example probpxf-1: Binomial PMF

Problem Statement: Consider a field sensor that fails for a given measurement with probability p. Given n measurements, plot the binomial PMF as a function of k failed measurements for a few different probabilities of failure p ∈ {0.04, 0.25, 0.5, 0.75, 0.96}.

Problem Solution: Figure 03.6 shows Matlab code for constructing the PMFs plotted in Figure 03.5. Note that the symmetry is due to the fact that events 1 and 0 are mutually exclusive and exhaustive.


Figure 03.5: binomial PMF for n = 100 measurements and different values of P(1) = p, the probability of a measurement error. The plot is generated by the Matlab code of Figure 03.6.

% parameters
n = 100;
k_a = linspace(1,n,n);
p_a = [.04,.25,.5,.75,.96];

% binomial function
f = @(n,k,p) nchoosek(n,k)*p^k*(1-p)^(n-k);

% loop through to construct an array
f_a = NaN*ones(length(k_a),length(p_a));
for i = 1:length(k_a)
    for j = 1:length(p_a)
        f_a(i,j) = f(n,k_a(i),p_a(j));
    end
end

% plot
figure
colors = jet(length(p_a));
for j = 1:length(p_a)
    bar(k_a,f_a(:,j), ...
        'facecolor',colors(j,:), ...
        'facealpha',0.5, ...
        'displayname',['$p = ',num2str(p_a(j)),'$'] ...
    ); hold on
end
leg = legend('show','location','north');
set(leg,'interpreter','latex');
hold off
xlabel('number of ones in sequence $k$')
ylabel('probability')
xlim([0,100])

Figure 03.6: a Matlab script for generating binomial PMFs.



The Gaussian or normal random variable x has PDF

f(x) = (1/(σ√(2π))) exp(−(x − µ)²/(2σ²)).  (pxf.6)

Although we're not quite ready to understand these quantities in detail, it can be shown that the parameters µ and σ have the following meanings:

• µ is the mean of x,
• σ is the standard deviation of x, and
• σ² is the variance of x.

Consider the "bell-shaped" Gaussian PDF in Figure 03.7. It is always symmetric. The mean µ is its central value, and the standard deviation σ is directly related to its width. We will continue to explore the Gaussian distribution in the following lectures, especially in Lecture statsconfidence.
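A small Python sketch of Equation (pxf.6), using the Figure 03.7 parameters (µ = 0, σ = 1/√2) as illustrative defaults:

```python
import numpy as np

def gaussian_pdf(x, mu=0.0, sigma=1/np.sqrt(2)):
    """Gaussian PDF, Equation (pxf.6)."""
    return np.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * np.sqrt(2 * np.pi))

x = np.linspace(-3, 3, 6001)
peak = gaussian_pdf(0.0)                         # peak value 1/(sigma sqrt(2 pi))
area = np.sum(gaussian_pdf(x)) * (x[1] - x[0])   # total probability, ~1
print(peak, area)
```

Plotting `gaussian_pdf(x)` against `x` reproduces the bell curve of Figure 03.7.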

Figure 03.7: PDF for Gaussian random variable x, mean µ = 0 and standard deviation σ = 1/√2.


probE Expectation

Recall that a random variable is a function X : Ω → ℝ that maps from the sample space to the reals. Random variables are the arguments of probability mass functions (PMFs) and probability density functions (PDFs).

The expected value (or expectation) of a random variable is akin to its "average value" and depends on its PMF or PDF. The expected value of a random variable X is denoted ⟨X⟩ or E[X]. There are two definitions of the expectation: one for a discrete random variable, the other for a continuous random variable. Before we define them, however, it is useful to predefine the most fundamental property of a random variable: its mean.

Definition probE.1: mean

The mean of a random variable X is defined as

m_X = E[X].

Let's begin with a discrete random variable.

Definition probE.2: expectation of a discrete random variable

Let K be a discrete random variable and f its PMF. The expected value of K is defined as

E[K] = Σ_{∀k} k f(k).

Example probE-1: expectation of a discrete random variable

Problem Statement: Given a discrete random variable K with PMF shown below, what is its mean m_K?

Let us now turn to the expectation of a continuous random variable.

Definition probE.3: expectation of a continuous random variable

Let X be a continuous random variable and f its PDF. The expected value of X is defined as

E[X] = ∫_{−∞}^{∞} x f(x) dx.

Example probE-2: expectation of a continuous random variable

Problem Statement: Given a continuous random variable X with Gaussian PDF f, what is the expected value of X?
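This expectation can be evaluated symbolically; a sketch with SymPy, assuming the Gaussian PDF of Equation (pxf.6):

```python
from sympy import symbols, exp, sqrt, pi, integrate, oo, simplify

x, mu = symbols('x mu', real=True)
sigma = symbols('sigma', positive=True)

# Gaussian PDF, Equation (pxf.6)
f = exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * sqrt(2 * pi))

# E[X] per Definition probE.3
E_X = integrate(x * f, (x, -oo, oo))
print(simplify(E_X))
```

The integral evaluates to µ, consistent with the earlier claim that µ is the mean of a Gaussian random variable.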

Due to its sum or integral form, the expected value E[·] has some familiar properties: for random variables X and Y and reals a and b,

E[a] = a,  (E.1a)
E[X + a] = E[X] + a,  (E.1b)
E[aX] = a E[X],  (E.1c)
E[E[X]] = E[X],  (E.1d)
E[aX + bY] = a E[X] + b E[Y].  (E.1e)


probmoments Central moments

Given a probability mass function (PMF) or probability density function (PDF) of a random variable, several useful parameters of the random variable can be computed. These are called central moments, which quantify parameters relative to its mean.

Definition probmoments.1: central moments

The nth central moment of random variable X with PDF f is defined as

E[(X − µ_X)^n] = ∫_{−∞}^{∞} (x − µ_X)^n f(x) dx.

For discrete random variable K with PMF f,

E[(K − µ_K)^n] = Σ_{∀k} (k − µ_K)^n f(k).

Example probmoments-1: first moment

Problem Statement: Prove that the first central moment of continuous random variable X is zero.

The second central moment of random variable X is called the variance and is denoted

σ²_X, Var[X], or E[(X − µ_X)²].  (moments.1)

The variance is a measure of the width or spread of the PMF or PDF. We usually compute the variance with the formula

Var[X] = E[X²] − µ²_X.

Other properties of variance include, for real constant c:

Var[c] = 0,  (moments.2)
Var[X + c] = Var[X],  (moments.3)
Var[cX] = c² Var[X].  (moments.4)

The standard deviation is defined as

σ_X = √(Var[X]).

Although the variance is mathematically more convenient, the standard deviation has the same physical units as X, so it is often the more physically meaningful quantity. Due to its meaning as the width or spread of the probability distribution and its sharing of physical units, it is a convenient choice for error bars on plots of a random variable.

The skewness Skew[X] is a normalized third central moment,

Skew[X] = E[(X − µ_X)³] / σ³_X.  (moments.5)

Skewness is a measure of asymmetry of a random variable's PDF or PMF. For a symmetric PMF or PDF, such as the Gaussian PDF, Skew[X] = 0.

The kurtosis Kurt[X] is a normalized fourth central moment,

Kurt[X] = E[(X − µ_X)⁴] / σ⁴_X.  (moments.6)

Kurtosis is a measure of the tailedness of a random variable's PDF or PMF. "Heavier" tails yield higher kurtosis. A Gaussian random variable has PDF with kurtosis 3. Given that for Gaussians both skewness and kurtosis have nice values (0 and 3), we can think of skewness and kurtosis as measures of similarity to the Gaussian PDF.
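These central moments can be estimated from samples. A sketch with NumPy (the sample size and distribution parameters here are arbitrary illustrative choices), which should recover skewness ≈ 0 and kurtosis ≈ 3 for Gaussian data:

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(loc=2.0, scale=3.0, size=100_000)  # Gaussian samples

mu = x.mean()                                     # estimate of the mean
sigma = np.sqrt(((x - mu)**2).mean())             # standard deviation
skew = ((x - mu)**3).mean() / sigma**3            # Equation (moments.5)
kurt = ((x - mu)**4).mean() / sigma**4            # Equation (moments.6)
print(mu, sigma, skew, kurt)
```

With these samples, the estimates land near µ = 2, σ = 3, skewness 0, and kurtosis 3, as the Gaussian theory predicts.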

probex Exercises for Chapter 03

Several physical processes can be modeled

with a random walk a process of interatively

changing a quantity by some random

amount Infinitely many variations are

possible but common factors of variation

include probability distribution step

size dimensionality (eg oneshydimensional

twoshydimensional etc) and coordinate system

Graphical representations of these walks can be

beautiful Develop a computer program that

generates random walks and corresponding

graphics Do it well and call it art because it

is
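As a starting point only (a minimal one-dimensional sketch; the step distribution and walk length are arbitrary choices, and the exercise asks for much more):

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000
steps = rng.choice([-1, 1], size=n)  # equally likely unit steps
walk = np.cumsum(steps)              # position after each step

print(walk[-1])  # final position
# graphics, e.g. matplotlib's plt.plot(walk), are left to the artist
```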

Statistics

Whereas probability theory is primarily focused on the relations among mathematical objects, statistics is concerned with making sense of the outcomes of observation (Steven S. Skiena, Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win, Outlooks, Cambridge University Press, 2001, DOI 10.1017/CBO9780511547089; this includes a lucid section on probability versus statistics). However, we frequently use statistical methods to estimate probabilistic models. For instance, we will learn how to estimate the standard deviation of a random process we have some reason to expect has a Gaussian probability distribution.

Statistics has applications in nearly every applied science and engineering discipline. Any time measurements are made, statistical analysis is how one makes sense of the results. For instance, determining a reasonable level of confidence in a measured parameter

requires statistics.

1. Ash, Basic Probability Theory.
2. Jaynes et al., Probability Theory: The Logic of Science.
3. Erwin Kreyszig, Advanced Engineering Mathematics, 10th ed., John Wiley & Sons Limited, 2011, ISBN 9781119571094. The authoritative resource for engineering mathematics. It includes detailed accounts of probability, statistics, vector calculus, linear algebra, fourier analysis, ordinary and partial differential equations, and complex analysis. It also includes several other topics with varying degrees of depth. Overall, it is the best place to start when seeking mathematical guidance.

A particularly hot topic nowadays is machine learning, which seems to be a field with applications that continue to expand. This field is fundamentally built on statistics. A good introduction to statistics appears at the end of Ash.1 A more involved introduction is given by Jaynes et al.2 The treatment by Kreyszig3 is rather incomplete, as will be our own.

statsterms Populations, samples, and machine learning

An experiment's population is a complete collection of objects that we would like to study. These objects can be people, machines, processes, or anything else we would like to understand experimentally.

Of course, we typically can't measure all of the population. Instead, we take a subset of the population, called a sample, and infer the characteristics of the entire population from this sample.

However, this inference, that the sample is somehow representative of the population, assumes the sample size is sufficiently large and that the sampling is random. This means selection of the sample should be such that no one group within a population is systematically over- or under-represented in the sample.

Machine learning is a field that makes extensive use of measurements and statistical inference. In it, an algorithm is trained by exposure to sample data, which is called a training set. The variables measured are called features. Typically, a predictive model is developed that can be used to extrapolate from the data to a new situation. The methods of statistical analysis we introduce in this chapter are the foundation of most machine learning methods.
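As an entirely hypothetical sketch of these terms, the following Python/NumPy snippet "trains" a linear predictive model on a small training set by least squares. The feature matrix `X`, the relation generating `y`, and all names here are invented for illustration, not taken from the text.

```python
import numpy as np

# Hypothetical training set: each row is one measured object,
# each column a feature; y is the quantity we want to predict.
rng = np.random.default_rng(0)
X = rng.uniform(0, 1, size=(50, 2))        # features
y = 3.0 * X[:, 0] - 2.0 * X[:, 1] + 0.5    # underlying relation (noiseless here)

# "Train" a linear predictive model by least squares.
A = np.column_stack([X, np.ones(len(X))])  # append an intercept column
coef, *_ = np.linalg.lstsq(A, y, rcond=None)

# Extrapolate from the data to a new situation (a new feature vector).
x_new = np.array([0.2, 0.7, 1.0])
y_pred = coef @ x_new
```

Because the training data are noiseless, the fit recovers the generating coefficients essentially exactly; with real measurements, the statistical tools of this chapter quantify how good such an estimate is.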

Example statsterms-1 re: combat boots

Problem Statement: Consider a robot, Pierre, with a particular gravitas and sense of style. He seeks just the right-looking pair of combat boots for wearing in the autumn rains. Pierre is to purchase the boots online via image recognition and decides to gather data by visiting a hipster hangout one evening to train his style. For contrast, he also watches footage of a White Nationalist rally, focusing special attention on the boots of wearers of khakis and polos. Comment on Pierre's methods.

statssample Estimation of sample mean and variance

Estimation and sample statistics

The mean and variance definitions of Lecture probE and Lecture probmoments apply only to a random variable for which we have a theoretical probability distribution. Typically, it is not until after having performed many measurements of a random variable that we can assign a good distribution model. Until then, measurements can help us estimate aspects of the data. We usually start by estimating basic parameters such as mean and variance before estimating a probability distribution.

There are two key aspects to randomness in the measurement of a random variable. First, of course, there is the underlying randomness, with its probability distribution, mean, standard deviation, etc., which we call the population statistics. Second, there is the statistical variability that is due to the fact that we are estimating the random variable's statistics, called its sample statistics, from some sample. Statistical variability is decreased with greater sample size and number of samples, whereas the underlying randomness of the random variable does not decrease. Instead, our estimates of its probability distribution and statistics improve.

Sample mean, variance, and standard deviation

The arithmetic mean or sample mean of a measurand with sample size N, represented by random variable X, is defined as

\overline{x} = \frac{1}{N} \sum_{i=1}^{N} x_i.  (sample1)

If the sample size is large, \overline{x} \rightarrow m_X (the sample mean approaches the mean). The population mean is another name for the mean m_X, which is equal to

m_X = \lim_{N \rightarrow \infty} \frac{1}{N} \sum_{i=1}^{N} x_i.  (sample2)

Recall that the definition of the mean is m_X = E[X].

The sample variance of a measurand represented by random variable X is defined as

S_X^2 = \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \overline{x})^2.  (sample3)

If the sample size is large, S_X^2 \rightarrow \sigma_X^2 (the sample variance approaches the variance). The population variance is another term for the variance \sigma_X^2 and can be expressed as

\sigma_X^2 = \lim_{N \rightarrow \infty} \frac{1}{N-1} \sum_{i=1}^{N} (x_i - \overline{x})^2.  (sample4)

Recall that the definition of the variance is \sigma_X^2 = E[(X - m_X)^2].

The sample standard deviation of a measurand represented by random variable X is defined as

S_X = \sqrt{S_X^2}.  (sample5)

If the sample size is large, S_X \rightarrow \sigma_X (the sample standard deviation approaches the standard deviation). The population standard deviation is another term for the standard deviation \sigma_X and can be expressed as

\sigma_X = \lim_{N \rightarrow \infty} \sqrt{S_X^2}.  (sample6)

Recall that the definition of the standard deviation is \sigma_X = \sqrt{\sigma_X^2}.
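Since the worked examples below are in Matlab, here is a small cross-check of equations (sample1), (sample3), and (sample5) in Python/NumPy; the data values are made up. Note that NumPy's built-ins need ddof=1 to select the N - 1 normalization used above.

```python
import numpy as np

x = np.array([4.9, 5.1, 5.0, 4.8, 5.2])  # hypothetical measurements
N = len(x)

x_bar = np.sum(x) / N                     # sample mean, (sample1)
S2_X = np.sum((x - x_bar)**2) / (N - 1)   # sample variance, (sample3)
S_X = np.sqrt(S2_X)                       # sample standard deviation, (sample5)

# NumPy's built-ins agree when ddof=1 selects the N-1 normalization.
assert np.isclose(S2_X, np.var(x, ddof=1))
assert np.isclose(S_X, np.std(x, ddof=1))
```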

Sample statistics as random variables

There is an ambiguity in our usage of the term "sample": it can mean just one measurement, or it can mean a collection of measurements gathered together. Hopefully it is clear from context.

In the latter sense, often we collect multiple samples, each of which has its own sample mean \overline{X}_i and standard deviation S_{X_i}. In this situation, \overline{X}_i and S_{X_i} are themselves random variables (meta af, I know). This means they have their own sample means \overline{\overline{X}_i} and \overline{S_{X_i}} and standard deviations S_{\overline{X}_i} and S_{S_{X_i}}.

The mean of means \overline{\overline{X}_i} is equivalent to a mean with a larger sample size and is therefore our best estimate of the mean of the underlying random process. The mean of standard deviations \overline{S_{X_i}} is our best estimate of the standard deviation of the underlying random process. The standard deviation of means S_{\overline{X}_i} is a measure of the spread in our estimates of the mean. It is our best estimate of the standard deviation of the statistical variation, and should therefore tend to zero as sample size and number of samples increase. The standard deviation of standard deviations S_{S_{X_i}} is a measure of the spread in our estimates of the standard deviation of the underlying process. It should also tend to zero as sample size and number of samples increase.

Let N be the size of each sample. It can be shown that the standard deviation of the means S_{\overline{X}_i} can be estimated from a single sample standard deviation:

S_{\overline{X}_i} \approx \frac{S_{X_i}}{\sqrt{N}}.  (sample7)

This shows that as the sample size N increases, the statistical variability of the mean decreases (and in the limit approaches zero).
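Equation (sample7) is easy to check numerically. This Python/NumPy simulation (parameters invented, loosely echoing the example below) draws many samples of size N and compares the spread of their means against \sigma/\sqrt{N}.

```python
import numpy as np

# Draw M samples of size N from a gaussian process with sigma = 4 and
# compare the spread of the sample means against sigma/sqrt(N), (sample7).
rng = np.random.default_rng(42)
sigma, M = 4.0, 2000
spread = {}
for N in (10, 100):
    means = rng.normal(35.0, sigma, size=(M, N)).mean(axis=1)
    spread[N] = means.std(ddof=1)
print(spread)  # each entry should be near sigma/sqrt(N)
```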

Nonstationary signal statistics

The sample mean, variance, and standard deviation definitions above assume the random process is stationary; that is, its population mean does not vary with time. However, a great many measurement signals have populations that do vary with time, i.e. they are nonstationary. Sometimes the nonstationarity arises from a "drift" in the dc value of a signal or some other slowly changing variable. But dynamic signals can also change in a recognizable and predictable manner, as when, say, the temperature of a room changes when a window is opened, or when a water level changes with the tide.

Typically, we would like to minimize the effect of nonstationarity on the signal statistics. In certain cases, such as drift, the variation is a nuisance only, but other times it is the point of the measurement.

Two common techniques are used, depending on the overall type of nonstationarity. If it is periodic, with some known or estimated period, the measurement data series can be "folded" or "reshaped" such that the ith measurement of each period corresponds to the ith measurement of all other periods. In this case, somewhat counterintuitively, we can consider the ith measurements to correspond to a sample of size N, where N is the number of periods over which measurements are made. When the signal is aperiodic, we often simply divide it into "small" (relative to its overall trend) intervals, over which statistics are computed separately.

Note that in this discussion we have assumed that the nonstationarity of the signal is due to a variable that is deterministic (not random).
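The "folding" technique for a periodic signal can be sketched in a few lines of Python/NumPy. The signal here is hypothetical, mimicking the example that follows; the per-period sample count is named n_per to avoid clashing with NumPy's np alias.

```python
import numpy as np

nT, n_per = 12, 21  # number of periods, points per period (hypothetical)
t = np.arange(nT * n_per)
rng = np.random.default_rng(1)
signal = 35 + 15*np.sin(2*np.pi*t/n_per) + rng.normal(0, 4, t.size)

folded = signal.reshape(nT, n_per)  # row j holds period j
mu = folded.mean(axis=0)            # sample mean at each instant in the period
s = folded.std(axis=0, ddof=1)      # sample std at each instant in the period
```

Each column of `folded` collects the ith measurement of every period, so the column statistics are sample statistics of size N = nT, as described above.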

Example statssample-1 re: measurement of gaussian noise on a nonstationary signal

Problem Statement: Consider the measurement of the temperature inside a desktop computer chassis via an inexpensive thermistor, a resistor that changes resistance with temperature. The processor and power supply heat the chassis in a manner that depends on processing demand. For the test protocol, the processors are cycled sinusoidally through processing power levels at a frequency of 50 mHz for nT = 12 periods and sampled at 1 Hz. Assume a temperature fluctuation between about 20 and 50 C and gaussian noise with standard deviation 4 C. Consider a sample to be the multiple measurements of a certain instant in the period.

1. Generate and plot simulated temperature data as a time series and as a histogram or frequency distribution. Comment on why the frequency distribution sucks.
2. Compute the sample mean and standard deviation for each sample in the cycle.
3. Subtract the mean from each sample in the period, such that each sample distribution is centered at zero. Plot the composite frequency distribution of all samples together. This represents our best estimate of the frequency distribution of the underlying process.
4. Plot a comparison of the theoretical mean, which is 35, and the sample mean of means, with an error bar. Vary the number of samples nT and comment on its effect on the estimate.
5. Plot a comparison of the theoretical standard deviation and the sample mean of sample standard deviations, with an error bar. Vary the number of samples nT and comment on its effect on the estimate.
6. Plot the sample means over a single period with error bars of ± one sample standard deviation of the means. This represents our best estimate of the sinusoidal heating temperature. Vary the number of samples nT and comment on the estimate.

Problem Solution

clear; close all; % clear kernel

Generate the temperature data

The temperature data can be generated by constructing an array that is passed to a sinusoid, then "randomized" by gaussian random numbers. Note that we add 1 to np and n to avoid the sneaky fencepost error.

f = 50e-3; % Hz, sinusoid frequency
a = 15; % C, amplitude of oscillation
dc = 35; % C, dc offset of oscillation
fs = 1; % Hz, sampling frequency
nT = 12; % number of sinusoid periods
s = 4; % C, standard deviation
np = fs/f + 1; % number of samples per period
n = nT*np + 1; % total number of samples

t_a = linspace(0,nT/f,n); % time array
sin_a = dc + a*sin(2*pi*f*t_a); % sinusoidal array
rng(43); % seed the random number generator
noise_a = s*randn(size(t_a)); % gaussian noise
signal_a = sin_a + noise_a; % sinusoid + noise

Now that we have an array of "data," we're ready to plot.

h = figure;
p = plot(t_a,signal_a,'o-',...
  'Color',[.8,.8,.8],...
  'MarkerFaceColor','b',...
  'MarkerEdgeColor','none',...
  'MarkerSize',3);
xlabel('time (s)');
ylabel('temperature (C)');
hgsave(h,'figures/temp');

Figure 04.1: temperature over time.

This is something like what we might see for continuous measurement data. Now, the histogram.

h = figure;
histogram(signal_a,...
  30,... % number of bins
  'normalization','probability'); % for PMF
xlabel('temperature (C)');
ylabel('probability');
hgsave(h,'figures/temp');

Figure 04.2: a poor histogram due to nonstationarity of the signal.

This sucks, because we plot a frequency distribution to tell us about the random variation, but this data includes the sinusoid.

Sample mean, variance, and standard deviation

To compute the sample mean \mu and standard deviation s for each sample in the period, we must "pick out" the nT data points that correspond to each other. Currently, they're in one long 1 x n array, signal_a. It is helpful to reshape the data so it is in an nT x np array, with each row corresponding to a new period. This leaves the correct points aligned in columns. It is important to note that we can do this "folding" operation only when we know rather precisely the period of the underlying sinusoid. It is given in the problem that it is a controlled experiment variable. If we did not know it, we would have to estimate it, too, from the data.

signal_ar = reshape(signal_a(1:end-1),[np,nT])'; % fold into rows
size(signal_ar) % check size
signal_ar(1:3,1:4) % print three rows, four columns

ans =

    12    21

ans =

   30.2718   40.0946   40.8341   44.7662
   40.1836   37.2245   49.4076   46.1137
   40.0571   40.9718   46.1627   41.9145

Define the mean, variance, and standard deviation functions as "anonymous functions." We "roll our own." These are not as efficient or flexible as the built-in Matlab functions mean, var, and std, which should typically be used.

my_mean = @(vec) sum(vec)/length(vec);
my_var = @(vec) sum((vec-my_mean(vec)).^2)/(length(vec)-1);
my_std = @(vec) sqrt(my_var(vec));

Now the sample mean, variance, and standard deviations can be computed. We proceed by looping through each column of the reshaped signal array.

mu_a = NaN*ones(1,np); % initialize mean array
var_a = NaN*ones(1,np); % initialize var array
s_a = NaN*ones(1,np); % initialize std array

for i = 1:np % for each column
  mu_a(i) = my_mean(signal_ar(:,i));
  var_a(i) = my_var(signal_ar(:,i));
  s_a(i) = sqrt(var_a(i)); % touch of speed
end

Composite frequency distribution

The columns represent samples. We want to subtract the mean from each column. We use repmat to reproduce mu_a in nT rows so it can be easily subtracted.

signal_arz = signal_ar - repmat(mu_a,[nT,1]);
size(signal_arz) % check size
signal_arz(1:3,1:4) % print three rows, four columns

ans =

    12    21

ans =

   -5.0881    0.9525   -0.2909   -1.5700
    4.8237   -1.9176    8.2826   -0.2225
    4.6972    1.8297    5.0377   -4.4216

Now that all samples have the same mean, we can lump them into one big bin for the frequency distribution. There are some nice built-in functions to do a quick reshape and fit.

signal_arzr = reshape(signal_arz,[1,nT*np]); % resize
size(signal_arzr) % check size
pdfit_model = fitdist(signal_arzr','normal'); % fit
x_a = linspace(-15,15,100);
pdfit_a = pdf(pdfit_model,x_a);
pdf_a = normpdf(x_a,0,s); % theoretical pdf

ans =

     1   252

Plot.

h = figure;
histogram(signal_arzr,...
  round(s*sqrt(nT)),... % number of bins
  'normalization','probability'); % for PMF
hold on;
plot(x_a,pdfit_a,'b-','linewidth',2); hold on;
plot(x_a,pdf_a,'g--','linewidth',2);
legend('pmf','pdf est','pdf');
xlabel('zero-mean temperature (C)');
ylabel('probability mass/density');
hgsave(h,'figures/temp');

Figure 04.3: PMF and estimated and theoretical PDFs.

Means comparison

The sample mean of means is simply the following.

mu_mu = my_mean(mu_a)

mu_mu =

   35.1175

The standard deviation that works as an error bar, which should reflect how well we can estimate the point plotted, is the standard deviation of the means. It is difficult to compute this directly for a nonstationary process. We use the estimate given above and improve upon it by using the mean of standard deviations instead of a single sample's standard deviation.

s_mu = mean(s_a)/sqrt(nT)

s_mu =

    1.1580

Now for the simple plot.

h = figure;
bar(mu_mu); hold on; % gives bar
errorbar(mu_mu,s_mu,'r','linewidth',2); % error bar
ax = gca; % current axis
ax.XTickLabels = {'$\overline{\overline{X}}$'};
ax.TickLabelInterpreter = 'latex';
hgsave(h,'figures/temp');

Standard deviations comparison

The sample mean of standard deviations is simply the following.

mu_s = my_mean(s_a)

mu_s =

    4.0114

The standard deviation that works as an error bar, which should reflect how well we can estimate the point plotted, is the standard deviation of the standard deviations. We can compute this directly.

s_s = my_std(s_a)

s_s =

    0.8495

Now for the simple plot.

h = figure;
bar(mu_s); hold on; % gives bar
errorbar(mu_s,s_s,'r','linewidth',2); % error bars
ax = gca; % current axis
ax.XTickLabels = {'$\overline{S_X}$'};
ax.TickLabelInterpreter = 'latex';
hgsave(h,'figures/temp');

Figure 04.4: sample mean of sample means.

Figure 04.5: sample mean of sample standard deviations.

Plot a period with error bars

Plotting the data with error bars is fairly straightforward with the built-in errorbar function. The main question is: "which standard deviation?" Since we're plotting the means, it makes sense to plot the error bars as a single sample standard deviation of the means.

h = figure;
e1 = errorbar(t_a(1:np),mu_a,s_mu*ones(1,np),'b');
hold on;
t_a2 = linspace(0,1/f,101);
e2 = plot(t_a2,dc + a*sin(2*pi*f*t_a2),'r-');
xlim([t_a(1),t_a(np)]);
grid on;
xlabel('folded time (s)');
ylabel('temperature (C)');
legend([e1,e2],'sample mean','population mean',...
  'Location','NorthEast');
hgsave(h,'figures/temp');

Figure 04.6: sample means over a period.

statsconfidence Confidence

One really ought to have it to give a lecture named for it, but we'll give it a try anyway. Confidence is used in the common sense, although we do endow it with a mathematical definition, to scare business majors, who aren't actually impressed but indifferent. Approximately: if, under some reasonable assumptions (probabilistic model), we estimate the probability of some event to be P, we say we have P confidence in it. I mean, business majors are all "Supply and demand? Let's call that a 'law,'" so I think we're even.

So we're back to computing probability from distributions: probability density functions (PDFs) and probability mass functions (PMFs). Usually we care most about estimating the mean of our distribution. Recall from the previous lecture that when several samples are taken, each with its own mean, the mean is itself a random variable, with a mean of its own, of course. Meanception.

But the mean has a probability distribution of its own. The central limit theorem has, as one of its implications, that as the sample size N gets large, regardless of the sample distributions, this distribution of means approaches the Gaussian distribution.

But sometimes I always worry I'm being lied to, so let's check.

clear; close all; % clear kernel

Generate some data to test the central limit theorem

Data can be generated by constructing an array using a (seeded, for consistency) random number generator. Let's use a uniformly distributed PDF between 0 and 1.

N = 150; % sample size (number of measurements per sample)
M = 120; % number of samples
n = N*M; % total number of measurements
mu_pop = 0.5; % because it's a uniform PDF between 0 and 1

rng(11); % seed the random number generator
signal_a = rand(N,M); % uniform PDF
size(signal_a) % check the size

ans =

   150   120

Let's take a look at the data by plotting the first ten samples (columns), as shown in Figure 04.7. This is something like what we might see for continuous measurement data. Now, the histogram, shown in Figure 04.8.

samples_to_plot = 10;
h = figure;

Figure 04.7: raw data with colors corresponding to samples.

Figure 04.8: a histogram showing the approximately uniform distribution of each sample (color).

c = jet(samples_to_plot); % color array
for j = 1:samples_to_plot
  histogram(signal_a(:,j),...
    30,... % number of bins
    'facecolor',c(j,:),...
    'facealpha',.3,...
    'normalization','probability'); % for PMF
  hold on;
end
hold off;
xlim([-0.05,1.05]);
xlabel('measurement');
ylabel('probability');
hgsave(h,'figures/temp');

This isn't a great plot, but it shows roughly that each sample is fairly uniformly distributed.

Sample statistics

Now let's check out the sample statistics. We want the sample mean and standard deviation of each column. Let's use the built-in functions mean and std.

mu_a = mean(signal_a,1); % mean of each column
s_a = std(signal_a,[],1); % std of each column

Now we can compute the mean statistics: both the mean of the means \overline{\overline{X}} and the standard deviation of the means S_{\overline{X}}, which we don't strictly need for this part, but we're curious. We choose to use the direct estimate instead of the S_X/\sqrt{N} formula, but they should be close.

mu_mu = mean(mu_a)
s_mu = std(mu_a)

mu_mu =

    0.4987

s_mu =

    0.0236

The truth about sample means

It's the moment of truth. Let's look at the distribution, shown in Figure 04.9.

Figure 04.9: a histogram showing the approximately normal distribution of the means.

h = figure;
histogram(mu_a,...
  'normalization','probability'); % for PMF
hold off;
xlabel('measurement');
ylabel('probability');
hgsave(h,'figures/temp');

This looks like a Gaussian distribution about the mean of means, so I guess the central limit theorem is legit.

Gaussian and probability

We already know how to compute the probability P that a value of a random variable X lies in a certain interval from a PMF or PDF (the sum or the integral, respectively). This means that, for sufficiently large sample size N, such that we can assume from the central limit theorem that the sample means \overline{x}_i are normally distributed, the probability a sample mean value \overline{x}_i is in a certain interval is given by integrating the Gaussian PDF. The Gaussian PDF for random variable Y representing the sample means is

f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left(\frac{-(y - \mu)^2}{2\sigma^2}\right)  (confidence1)

where \mu is the population mean and \sigma is the population standard deviation.

The integral of f over some interval is the probability a value will be in that interval. Unfortunately, that integral is uncool. It gives rise to the definition of the error function, which for the Gaussian random variable Y is

\mathrm{erf}(y_b) = \frac{1}{\sqrt{\pi}} \int_{-y_b}^{y_b} e^{-t^2}\,dt.  (confidence2)

This expresses the probability of a sample mean being in the interval [-y_b, y_b] if Y has mean 0 and variance 1/2.

Matlab has this built-in as erf, shown in Figure 04.10.

y_a = linspace(0,3,100);
h = figure;
p1 = plot(y_a,erf(y_a));
p1.LineWidth = 2;
grid on;
xlabel('interval bound $y_b$','interpreter','latex');
ylabel('error function $\textrm{erf}(y_b)$',...
  'interpreter','latex');
hgsave(h,'figures/temp');

We could deal directly with the error function, but most people don't, and we're weird enough as it is. Instead, people use the Gaussian cumulative distribution function (CDF) \Phi: \mathbb{R} \rightarrow \mathbb{R}, which is defined as

\Phi(z) = \frac{1}{2}\left(1 + \mathrm{erf}\left(\frac{z}{\sqrt{2}}\right)\right)  (confidence3)

and which expresses the probability of a Gaussian random variable Z, with mean 0 and standard deviation 1, taking on a value in the interval (-\infty, z]. The Gaussian CDF and PDF are shown in Figure 04.11. Values can be taken directly from the graph, but it's more accurate to use the table of values in Appendix A01.

Figure 04.10: the error function.
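Equation (confidence3) is one line in any language with a built-in error function; here is a Python sketch (the function name is ours):

```python
import math

def gaussian_cdf(z):
    """Phi(z) = (1 + erf(z/sqrt(2)))/2, eq. (confidence3)."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Sanity checks: symmetry about the mean and the familiar 68% rule.
print(gaussian_cdf(0.0))                       # 0.5
print(gaussian_cdf(1.0) - gaussian_cdf(-1.0))  # about 0.6827
```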

That's great and all, but occasionally (always) we have Gaussian random variables with nonzero means and nonunity standard deviations. It turns out we can shift any Gaussian random variable by its mean and scale it by its standard deviation to make it have zero mean and unity standard deviation. We can then use \Phi and interpret the results as being relative to the mean and standard deviation, using phrases like "the probability it is within two standard deviations of its mean." The transformed random variable Z and its values z are sometimes called the z-score.

Figure 04.11: the Gaussian PDF and CDF for z-scores: (a) the Gaussian PDF f(z); (b) the Gaussian CDF \Phi(z_b).

For a particular value x of a random variable X, we can compute its z-score (or value z of random variable Z) with the formula

z = \frac{x - \mu_X}{\sigma_X}  (confidence4)

and compute the probability of X taking on a value within the interval, say, x \in [x_{b-}, x_{b+}], from the table. (Sample statistics \overline{X} and S_X are appropriate when population statistics are unknown.)

For instance, compute the probability a Gaussian random variable X with \mu_X = 5 and \sigma_X = 2.34 takes on a value within the interval x \in [3, 6].

1. Compute the z-score of each endpoint of the interval:

z_3 = \frac{3 - \mu_X}{\sigma_X} \approx -0.85  (confidence5)
z_6 = \frac{6 - \mu_X}{\sigma_X} \approx 0.43  (confidence6)

2. Look up the CDF values for z_3 and z_6, which are \Phi(z_3) = 0.1977 and \Phi(z_6) = 0.6664.

3. The CDF values correspond to the probabilities x < 3 and x < 6. Therefore, to find the probability x lies in that interval, we subtract the lower-bound probability:

P(x \in [3, 6]) = P(x < 6) - P(x < 3)  (confidence7)
= \Phi(z_6) - \Phi(z_3)  (confidence8)
\approx 0.6664 - 0.1977  (confidence9)
\approx 0.4687.  (confidence10)

So there is a 46.87% probability, and therefore we have 46.87% confidence, that x \in [3, 6].
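The same computation can be checked without a table by evaluating \Phi directly from its erf form, eq. (confidence3). A Python sketch; the exact value, about 0.469, agrees with the table-based result to table precision.

```python
import math

def gaussian_cdf(z):
    # Phi(z) = (1 + erf(z/sqrt(2)))/2, eq. (confidence3)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

mu_X, sigma_X = 5.0, 2.34
z3 = (3 - mu_X) / sigma_X  # about -0.85
z6 = (6 - mu_X) / sigma_X  # about 0.43
P = gaussian_cdf(z6) - gaussian_cdf(z3)
print(P)  # about 0.469
```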

Often we want to go the other way: estimating the symmetric interval [x_{b-}, x_{b+}] for which there is a given probability. In this case, we first look up the z-score corresponding to a certain probability. For concreteness, given the same population statistics above, let's find the symmetric interval [x_{b-}, x_{b+}] over which we have 90% confidence. From the table, we want two symmetric z-scores that have CDF-value difference 0.9. Or, in maths,

\Phi(z_{b+}) - \Phi(z_{b-}) = 0.9 and z_{b+} = -z_{b-}.  (confidence11)

Due to the latter relation and the additional fact that the Gaussian CDF has antisymmetry,

\Phi(z_{b+}) + \Phi(z_{b-}) = 1.  (confidence12)

Adding the two \Phi equations,

\Phi(z_{b+}) = 1.9/2  (confidence13)
= 0.95  (confidence14)

and \Phi(z_{b-}) = 0.05. From the table, these correspond (with a linear interpolation) to z_b = z_{b+} = -z_{b-} \approx 1.645. All that remains is to solve the z-score formula for x:

x = \mu_X + z\sigma_X.  (confidence15)

From this,

x_{b+} = \mu_X + z_{b+}\sigma_X \approx 8.849  (confidence16)
x_{b-} = \mu_X + z_{b-}\sigma_X \approx 1.151  (confidence17)

and X has a 90% confidence interval [1.151, 8.849].
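Going "the other way" amounts to inverting \Phi; without a table, one can bisect on the monotonic CDF. A Python sketch of the 90% interval above (the helper names are ours):

```python
import math

def gaussian_cdf(z):
    # Phi(z) = (1 + erf(z/sqrt(2)))/2, eq. (confidence3)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def gaussian_icdf(p):
    # invert the monotonic Phi by bisection on [-10, 10]
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = 0.5 * (lo + hi)
        if gaussian_cdf(mid) < p:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

mu_X, sigma_X = 5.0, 2.34
z_b = gaussian_icdf(0.95)    # about 1.645
x_hi = mu_X + z_b * sigma_X  # about 8.849
x_lo = mu_X - z_b * sigma_X  # about 1.151
```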

Example statsconfidence-1 re: gaussian confidence for a mean

Problem Statement: Consider the data set generated above. What is our 95% confidence interval in our estimate of the mean?

Problem Solution: Assuming we have a sufficiently large data set, the distribution of means is approximately Gaussian. Following the same logic as above, we need the z-score that gives an upper CDF value of 0.975. From the table, we obtain the z_b = z_{b+} = -z_{b-} below.

z_b = 1.96;

Now we can estimate the mean using our sample and mean statistics:

\overline{X} = \overline{\overline{X}} \pm z_b S_{\overline{X}}.  (confidence18)

mu_x_95 = mu_mu + [-z_b,z_b]*s_mu

mu_x_95 =

    0.4526    0.5449

This is our 95% confidence interval in our estimate of the mean.

statsstudent Student confidence

The central limit theorem tells us that, for large sample size N, the distribution of the means is Gaussian. However, for small sample size, the Gaussian isn't as good of an estimate. Student's t-distribution is superior for lower sample size and equivalent at higher sample size. Technically, if the population standard deviation \sigma_X is known, even for low sample size we should use the Gaussian distribution. However, this rarely arises in practice, so we can usually get away with an "always t" approach.

A way that the t-distribution accounts for low N is by having an entirely different distribution for each N (seems a bit of a cheat to me). Actually, instead of N, it uses the degrees of freedom \nu, which is N minus the number of parameters required to compute the statistic. Since the standard deviation requires only the mean, for most of our cases \nu = N - 1.

As with the Gaussian distribution, the t-distribution's integral is difficult to calculate. Typically, we will use a t-table, such as the one given in Appendix A02. There are three points of note.

1. Since we are primarily concerned with going from probability/confidence values (e.g. P probability/confidence) to intervals, typically there is a column for each probability.
2. The extra parameter \nu takes over one of the dimensions of the table, because three-dimensional tables are illegal.
3. Many of these tables are "two-sided," meaning their t-scores and probabilities assume you want the symmetric probability about the mean, over the interval [-t_b, t_b], where t_b is your t-score bound.

Consider the following example.

Example statsstudent-1 re: student confidence interval

Problem Statement: Write a Matlab script to generate a data set with 200 samples and sample sizes N \in {10, 20, 100}, using any old distribution. Compare the distribution of the means for the different N. Use the sample distributions and a t-table to compute 99% confidence intervals.

Problem Solution: Generate the data set.

confidence = 0.99; % requirement
M = 200; % number of samples
N_a = [10,20,100]; % sample sizes
mu = 27; % population mean
sigma = 9; % population std
rng(1); % seed random number generator
data_a = mu + sigma*randn(N_a(end),M); % normal
size(data_a) % check size
data_a(1:10,1:5) % check 10 rows and five columns

ans =

   100   200

ans =

   21.1589   30.2894   27.8705   30.7835   28.3662
   37.6305   17.1264   28.2973   24.0811   34.3486
   20.1739   44.3719   43.7059   39.0699   32.2002
   17.0135   32.6064   36.9030   37.9230   36.5747
   19.3900   32.9156   23.7230   22.4749   19.7709
   21.8460   13.8295   31.2479   16.9527   34.1876
   21.9719   34.6854   19.4480   18.7014   24.1642
   28.6054   32.2244   22.2873   26.9906   37.6746
   25.2282   18.7326   14.5011   28.3814   27.7645
   32.2780   34.1538   27.0382   18.8643   14.1752

Compute the means for different sample sizes.

mu_a = NaN*ones(length(N_a),M);
for i = 1:length(N_a)
  mu_a(i,:) = mean(data_a(1:N_a(i),1:M),1);
end

Plotting the distribution of the means yields Figure 04.12.

Figure 04.12: a histogram showing the distribution of the means for each sample size (N = 10, 20, 100).

It makes sense that the larger the sample size, the smaller the spread. A quantitative metric for the spread is, of course, the standard deviation of the means for each sample size.

S_mu = std(mu_a,0,2)

S_mu =

    2.8365
    2.0918
    1.0097

Look up t-table values (or use Matlab's tinv) for the different sample sizes and 99% confidence. Use these, the mean of means, and the standard deviation of means to compute the 99% confidence interval for each N.

t_a = tinv(confidence,N_a-1)
for i = 1:length(N_a)
  interval = mean(mu_a(i,:)) + [-1,1]*t_a(i)*S_mu(i);
  disp(sprintf('interval for N = %i',N_a(i)));
  disp(interval);
end

t_a =

    2.8214    2.5395    2.3646

interval for N = 10
   19.0942   35.1000

interval for N = 20
   21.6292   32.2535

interval for N = 100
   24.7036   29.4787

As expected, the larger the sample size, the smaller the interval over which we have 99% confidence in the estimate.

stats Statistics multivar Multivariate probability and correlation p 1

joint PDF

joint PMF

statsmultivar Multivariate

probability and

correlation

Thus far, we have considered probability density and mass functions (PDFs and PMFs) of only one random variable. But, of course, often we measure multiple random variables X1, X2, ..., Xn during a single event, meaning a measurement consists of determining values x1, x2, ..., xn of these random variables.

We can consider an n-tuple of continuous random variables to form a sample space Ω = Rⁿ, on which a multivariate function f: Rⁿ → R, called the joint PDF, assigns a probability density to each outcome x ∈ Rⁿ. The joint PDF must be greater than or equal to zero for all x ∈ Rⁿ, the multiple integral over Ω must be unity, and the multiple integral over a subset of the sample space A ⊂ Ω is the probability of the event A.

We can consider an n-tuple of discrete random variables to form a sample space N₀ⁿ, on which a multivariate function f: N₀ⁿ → R, called the joint PMF, assigns a probability to each outcome x ∈ N₀ⁿ. The joint PMF must be greater than or equal to zero for all x ∈ N₀ⁿ, the multiple sum over Ω must be unity, and the multiple sum over a subset of the sample space A ⊂ Ω is the probability of the event A.
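The three defining properties of a joint PMF are easy to check numerically. The joint PMF below, over two discrete random variables each taking values in {0, 1, 2}, is a hypothetical example of our own, not from the text:

```python
import numpy as np

# A hypothetical joint PMF f(x1, x2) for X1, X2 in {0, 1, 2}.
f = np.array([[0.10, 0.05, 0.05],
              [0.10, 0.20, 0.10],
              [0.05, 0.15, 0.20]])

nonnegative = bool((f >= 0).all())  # f >= 0 everywhere
total = f.sum()                     # multiple sum over Omega: must be unity
P_A = f[0, :].sum()                 # P(A) for the event A = {X1 = 0}
```

The probability of any event is the sum of the PMF over the outcomes in that event's subset of the sample space.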


Example stats.multivar-1: bivariate gaussian pdf

Problem Statement: Let's visualize multivariate PDFs by plotting a bivariate gaussian using Matlab's mvnpdf function.

Problem Solution

mu = [10,20]; % means
Sigma = [1,0;0,2]; % covariance (we'll talk about this)
x1_a = linspace(...
  mu(1)-5*sqrt(Sigma(1,1)),mu(1)+5*sqrt(Sigma(1,1)),30);
x2_a = linspace(mu(2)-5*sqrt(Sigma(2,2)),mu(2)+5*sqrt(Sigma(2,2)),30);
[X1,X2] = meshgrid(x1_a,x2_a);
f = mvnpdf([X1(:) X2(:)],mu,Sigma);
f = reshape(f,length(x2_a),length(x1_a));

h = figure;
p = surf(x1_a,x2_a,f);
xlabel('$x_1$','interpreter','latex');
ylabel('$x_2$','interpreter','latex');
zlabel('$f(x_1,x_2)$','interpreter','latex');
shading interp
hgsave(h,'figures/temp');

The result is Figure 04.13. Note how the means and standard deviations affect the distribution.


marginal PDF

Figure 04.13: two-variable gaussian PDF

Marginal probability

The marginal PDF of a multivariate PDF is the PDF of some subspace of Ω after one or more variables have been "integrated out," such that fewer random variables remain. Of course, these marginal PDFs must have the same properties as any PDF, such as integrating to unity.
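Marginalization is just a sum or integral over the variables being removed. A numeric sketch, using an independent bivariate standard normal of our own choosing (not the text's Matlab example): integrating the joint PDF over x2 recovers the univariate normal PDF in x1.

```python
import numpy as np

# Joint PDF of two independent standard normal random variables.
x1 = np.linspace(-6, 6, 801)
x2 = np.linspace(-6, 6, 801)
dx2 = x2[1] - x2[0]
X1, X2 = np.meshgrid(x1, x2, indexing="ij")
f = np.exp(-(X1**2 + X2**2) / 2) / (2 * np.pi)

g = f.sum(axis=1) * dx2  # "integrate out" x2 (rectangle rule)
univariate = np.exp(-x1**2 / 2) / np.sqrt(2 * np.pi)
err = np.max(np.abs(g - univariate))
total = g.sum() * (x1[1] - x1[0])  # the marginal must integrate to ~1
```

The same axis-wise sum works for discrete joint PMFs, with no integration step width needed.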

Example stats.multivar-2: bivariate gaussian marginal probability

Problem Statement: Let's demonstrate this by numerically integrating over x2 in the bivariate Gaussian above.

Problem Solution

Continuing from where we left off, let's integrate:

f1 = trapz(x2_a,f); % trapezoidal integration over x2

Plotting this yields Figure 04.14.

h = figure;
p = plot(x1_a,f1);
p.LineWidth = 2;
xlabel('$x_1$','interpreter','latex');
ylabel('$g(x_1)=\int_{-\infty}^{\infty} f(x_1,x_2)\,d x_2$','interpreter','latex');
hgsave(h,'figures/temp');


Figure 04.14: marginal Gaussian PDF g(x1)

We should probably verify that this integrates to one:

disp(['integral over x_1 = ',sprintf('%0.7f',trapz(x1_a,f1))])

integral over x_1 = 0.9999986

Not bad.

Covariance

Very often especially in machine learning


artificial intelligence

covariance

correlation

sample covariance

covariance matrix

or artificial intelligence applications, the question about two random variables X and Y is: how do they co-vary? That is, what is their covariance, defined as

Cov[X, Y] ≡ E((X − μ_X)(Y − μ_Y)) = E(XY) − μ_X μ_Y.

Note that when X = Y, the covariance is just the variance. When a covariance is large and positive, it is an indication that the random variables are strongly correlated. When it is large and negative, they are strongly anti-correlated. Zero covariance means the variables are uncorrelated. In fact, correlation is defined as

Cor[X, Y] = Cov[X, Y] / √(Var[X] Var[Y]).

This is essentially the covariance "normalized" to the interval [−1, 1].
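A quick numerical illustration, with synthetic data of our own choosing (not the text's): a variable built as y = 2x + noise is strongly correlated with x, and the correlation is just the covariance normalized to [−1, 1].

```python
import numpy as np

rng = np.random.default_rng(0)
x = rng.normal(size=10_000)
y = 2 * x + rng.normal(size=10_000)  # strongly correlated with x

C = np.cov(x, y)       # 2x2 covariance matrix; C[0, 0] is Var[X]
R = np.corrcoef(x, y)  # correlation matrix; R[0, 1] = Cor[X, Y]
```

Here the population correlation is 2/√5 ≈ 0.89, so the sample value R[0, 1] should land close to that.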

Sample covariance

As with the other statistics we've considered, covariance can be estimated from measurement. The estimate, called the sample covariance q_XY of random variables X and Y with sample size N, is given by

q_XY = (1/(N − 1)) Σ_{i=1}^{N} (xᵢ − x̄)(yᵢ − ȳ),

where x̄ and ȳ are the sample means.
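The sample covariance formula translates directly into code. A dependency-free sketch, with small illustrative data of our own choosing:

```python
# Sample covariance q_XY = (1/(N-1)) * sum_i (x_i - x_bar)(y_i - y_bar).
def sample_cov(x, y):
    N = len(x)
    x_bar = sum(x) / N
    y_bar = sum(y) / N
    return sum((xi - x_bar) * (yi - y_bar)
               for xi, yi in zip(x, y)) / (N - 1)

x = [1.0, 2.0, 3.0, 4.0]
y = [2.0, 4.0, 6.0, 8.0]  # y = 2x exactly, so q_XY = 2 * q_XX
q = sample_cov(x, y)
```

With x = [1, 2, 3, 4] the sample variance q_XX is 5/3, so q_XY = 10/3, matching the scaling property Cov[X, 2X] = 2 Var[X].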

Multivariate covariance

With n random variables Xi, one can compute the covariance of each pair. It is common practice to define an n × n matrix of covariances, called the covariance matrix Σ,


sample covariance matrix

such that each pair's covariance

Cov[Xi, Xj]    (multivar.1)

appears in its row-column combination (making it symmetric), as shown below:

Σ = ⎡ Cov[X1, X1]  Cov[X1, X2]  ···  Cov[X1, Xn] ⎤
    ⎢ Cov[X2, X1]  Cov[X2, X2]  ···  Cov[X2, Xn] ⎥
    ⎢      ⋮             ⋮                ⋮      ⎥
    ⎣ Cov[Xn, X1]  Cov[Xn, X2]  ···  Cov[Xn, Xn] ⎦

The multivariate sample covariance matrix Q is the same as above, but with sample covariances q_XiXj.

Both covariance matrices have correlation analogs.
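In Python, np.cov applied to an n-by-N data matrix returns the n × n sample covariance matrix Q directly, and np.corrcoef returns its correlation analog. Synthetic data of our own, for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
data = rng.normal(size=(5, 200))  # 5 random variables, 200 samples each

Q = np.cov(data)       # 5 x 5 sample covariance matrix
R = np.corrcoef(data)  # correlation analog
symmetric = bool(np.allclose(Q, Q.T))
```

The matrix is symmetric by construction, and every diagonal entry of the correlation analog is 1, since each variable is perfectly correlated with itself.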

Example stats.multivar-3: car data sample covariance and correlation

Problem Statement: Let's use a built-in multivariate data set that describes different features of cars, listed below.

d = load('carsmall.mat'); % this is a struct

Let's compute the sample covariance and correlation matrices.

Problem Solution

variables = {'MPG','Cylinders','Displacement',...
  'Horsepower','Weight','Acceleration','Model_Year'};
n = length(variables);
m = length(d.MPG);
data = NaN*ones(m,n); % preallocate
for i = 1:n
  data(:,i) = d.(variables{i});
end
cov_d = nancov(data) % sample covariance
cor_d = corrcov(cov_d) % sample correlation

This is highly correlated/anticorrelated data. Let's plot each variable versus each other variable to see the correlations of each. We use a red-to-blue colormap to contrast anticorrelation and correlation. Purple, then, is uncorrelated.

The following builds the red-to-blue colormap:

n_colors = 10;
cmap_rb = NaN*ones(n_colors,3);
for i = 1:n_colors
  a = i/n_colors;
  cmap_rb(i,:) = (1-a)*[1,0,0]+a*[0,0,1];
end

h = figure;
for i = 1:n
  for j = 1:n
    subplot(n,n,sub2ind([n,n],j,i));
    p = plot(d.(variables{i}),d.(variables{j}),'.'); hold on;
    this_color = cmap_rb(1+round((cor_d(i,j)+1)*(n_colors-1)/2),:);
    p.MarkerFaceColor = this_color;
    p.MarkerEdgeColor = this_color;
  end
end
hgsave(h,'figures/temp');

Figure 04.15: car data correlation

Conditional probability and dependence

Independent variables are uncorrelated. However, uncorrelated variables may or may not be independent. Therefore, we cannot use correlation alone as a test for independence. For instance, consider random variables X and Y, where X has some even distribution and Y = X²: clearly, the variables are dependent. However, they are also uncorrelated (due to symmetry).


Example stats.multivar-4: uncorrelated but dependent random variables

Problem Statement: Using a uniform distribution U(−1, 1), show that X and Y are uncorrelated (but dependent), with Y = X², with some sampling. We compute the correlation for different sample sizes.

Problem Solution

N_a = round(linspace(10,500,100));
qc_a = NaN*ones(size(N_a));
rng(6);
x_a = -1 + 2*rand(max(N_a),1);
y_a = x_a.^2;
for i = 1:length(N_a)
  % should write incremental algorithm, but lazy
  q = cov(x_a(1:N_a(i)),y_a(1:N_a(i)));
  qc = corrcov(q);
  qc_a(i) = qc(2,1); % cross correlation
end

The absolute values of the correlations are shown in Figure 04.16. Note that we should probably average several such curves to estimate how the correlation would drop off with N, but the single curve describes our understanding that the correlation, in fact, approaches zero in the large-sample limit.

h = figure;
p = plot(N_a,abs(qc_a));
p.LineWidth = 2;
xlabel('sample size $N$','interpreter','latex');
ylabel('absolute sample correlation','interpreter','latex');
hgsave(h,'figures/temp');


Figure 04.16: absolute value of the sample correlation between X ∼ U(−1, 1) and Y = X² for different sample sizes N. In the limit, the population correlation should approach zero, and yet X and Y are not independent.


stats.regression Regression

Suppose we have a sample with two measurands: (1) the force F through a spring, and (2) its displacement X (not from equilibrium). We would like to determine an analytic function that relates the variables, perhaps for prediction of the force given another displacement.

There is some variation in the measurement. Let's say the following is the sample:

X_a = 1e-3*[10,21,30,41,49,50,61,71,80,92,100]; % m
F_a = [50.1,50.4,53.2,55.9,57.2,59.9,...
  61.0,63.9,67.0,67.9,70.3]; % N

Let's take a look at the data. The result is Figure 04.17.

h = figure;
p = plot(X_a*1e3,F_a,'b.','MarkerSize',15);
xlabel('$X$ (mm)','interpreter','latex');
ylabel('$F$ (N)','interpreter','latex');
xlim([0,max(X_a*1e3)]);
grid on
hgsave(h,'figures/temp');

How might we find an analytic function that agrees with the data? Broadly, our strategy will be to assume a general form of a function and use the data to set the parameters in the function such that the difference between the data and the function is minimal.

Figure 04.17: force-displacement data

Let y be the analytic function that we would like to fit to the data. Let yi denote the value of y(xi), where xi is the ith value of the random variable X from the sample. Then we want to minimize the differences between the force measurements Fi and yi.

From calculus, recall that we can minimize a function by differentiating it and solving for the zero-crossings (which correspond to local maxima or minima).

First, we need such a function to minimize. Perhaps the simplest effective function D is constructed by squaring and summing the differences we want to minimize, for sample size N:

D(xi) = Σ_{i=1}^{N} (Fi − yi)²    (regression.1)


(recall that yi = y(xi), which makes D a function of x).

Now the form of y must be chosen. We consider only mth-order polynomial functions y, but others can be used in a similar manner:

y(x) = a0 + a1 x + a2 x² + ··· + am xᵐ.    (regression.2)

If we treat D as a function of the polynomial coefficients aj, i.e.

D(a0, a1, ···, am),    (regression.3)

and minimize D for each value of xi, we must take the partial derivatives of D with respect to each aj and set each equal to zero:

∂D/∂a0 = 0,  ∂D/∂a1 = 0,  ···,  ∂D/∂am = 0.

This gives us N equations and m + 1 unknowns aj. Writing the system in matrix form,

⎡ 1  x1  x1²  ···  x1ᵐ ⎤ ⎡ a0 ⎤   ⎡ F1 ⎤
⎢ 1  x2  x2²  ···  x2ᵐ ⎥ ⎢ a1 ⎥   ⎢ F2 ⎥
⎢ 1  x3  x3²  ···  x3ᵐ ⎥ ⎢ a2 ⎥ = ⎢ F3 ⎥    (regression.4)
⎢ ⋮                 ⋮  ⎥ ⎢ ⋮  ⎥   ⎢ ⋮  ⎥
⎣ 1  xN  xN²  ···  xNᵐ ⎦ ⎣ am ⎦   ⎣ FN ⎦

with A the N × (m + 1) matrix on the left, a the (m + 1) × 1 vector of coefficients, and b the N × 1 vector of force measurements.

Typically N > m, and this is an overdetermined system. Therefore, we usually can't solve by taking A⁻¹, because A doesn't have an inverse. Instead, we can either find the Moore-Penrose pseudo-inverse A† and have a = A†b as the solution, which is inefficient, or we can solve the least-squares problem with an algorithm such as that used by Matlab's \ operator. In the latter case,

a_a = A\b_a;
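The same least-squares solve can be sketched in Python, where numpy's lstsq plays the role of Matlab's backslash operator. The data are the force-displacement sample from the text; the choice of a first-order (linear) model here is our own illustrative assumption.

```python
import numpy as np

# Spring force-displacement sample from the text.
X = 1e-3 * np.array([10, 21, 30, 41, 49, 50, 61, 71, 80, 92, 100])  # m
F = np.array([50.1, 50.4, 53.2, 55.9, 57.2, 59.9,
              61.0, 63.9, 67.0, 67.9, 70.3])  # N

m = 1  # polynomial order (linear fit, assumed for illustration)
A = np.vander(X, m + 1, increasing=True)  # columns 1, x, ..., x^m
a, res, rank, sv = np.linalg.lstsq(A, F, rcond=None)  # a = "A \ F"
F_hat = A @ a  # fitted forces
```

For a spring, the fitted slope a[1] is an estimate of the stiffness in N/m; here it lands near 240 N/m.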

Example stats.regression-1: regression

Problem Statement: Use Matlab's \ operator to find a good polynomial fit for the sample. There's the sometimes-difficult question: "what order should we fit?" Let's try out several and see what the squared-differences function D gives.

Problem Solution

N = length(X_a); % sample size
m_a = 2:N; % all the orders up to N

A = NaN*ones(length(m_a),max(m_a),N);
for k = 1:length(m_a) % each order
  for j = 1:N % each measurement
    for i = 1:( m_a(k) + 1 ) % each coefficient
      A(k,j,i) = X_a(j)^(i-1);
    end
  end
end
disp(squeeze(A(2,:,1:5)))

    1.0000    0.0100    0.0001    0.0000       NaN
    1.0000    0.0210    0.0004    0.0000       NaN
    1.0000    0.0300    0.0009    0.0000       NaN
    1.0000    0.0410    0.0017    0.0001       NaN
    1.0000    0.0490    0.0024    0.0001       NaN
    1.0000    0.0500    0.0025    0.0001       NaN
    1.0000    0.0610    0.0037    0.0002       NaN
    1.0000    0.0710    0.0050    0.0004       NaN
    1.0000    0.0800    0.0064    0.0005       NaN
    1.0000    0.0920    0.0085    0.0008       NaN
    1.0000    0.1000    0.0100    0.0010       NaN

We've printed the first five columns of the third-order matrix, which only has four columns, so NaNs fill in the rest.

Now we can use the \ operator to solve for the coefficients:

a = NaN*ones(length(m_a),max(m_a));

warning('off','all');
for i = 1:length(m_a)
  A_now = squeeze(A(i,:,1:m_a(i)));
  a(i,1:m_a(i)) = (A_now(:,1:m_a(i))\F_a')';
end
warning('on','all');

n_plot = 100;
x_plot = linspace(min(X_a),max(X_a),n_plot);
y = NaN*ones(n_plot,length(m_a)); % preallocate
for i = 1:length(m_a)
  y(:,i) = polyval(fliplr(a(i,1:m_a(i))),x_plot);
end

h = figure;
for i = 1:2:length(m_a)-1
  p = plot(x_plot*1e3,y(:,i),'linewidth',1.5); hold on;
  p.DisplayName = sprintf(...
    'order %i', (m_a(i)-1)...
  );
end
p = plot(X_a*1e3,F_a,'b.','MarkerSize',15);
xlabel('$X$ (mm)','interpreter','latex');
ylabel('$F$ (N)','interpreter','latex');
p.DisplayName = 'sample';
legend('show','location','southeast');
grid on
hgsave(h,'figures/temp');

Figure 04.18: force-displacement data with curve fits of orders 1, 3, 5, 7, and 9


stats.ex Exercises for Chapter 04

calculus

limit

series

derivative

integral

vector calculus

vecs

Vector calculus

A great many physical situations of interest to engineers can be described by calculus. It can describe how quantities continuously change over (say) time and gives tools for computing other quantities. We assume familiarity with the fundamentals of calculus: limit, series, derivative, and integral. From these and a basic grasp of vectors, we will outline some of the highlights of vector calculus. Vector calculus is particularly useful for describing the physics of, for instance, the following:

mechanics of particles, wherein is studied the motion of particles and the forcing causes thereof;

rigid-body mechanics, wherein is studied the motion, rotational and translational, and its forcing causes, of bodies considered rigid (undeformable);

solid mechanics, wherein is studied the motion and deformation, and their forcing causes, of continuous solid bodies (those that retain a specific resting shape);


complex analysis

1. For an introduction to complex analysis, see Kreyszig (Kreyszig, Advanced Engineering Mathematics, Part D).
2. Ibid., Chapters 9, 10.
3. H.M. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus. W.W. Norton, 2005. ISBN 9780393925166.

euclidean vector space

bases

components

basis vectors

invariant

coordinate system

Cartesian coordinates

fluid mechanics, wherein is studied the motion, and its forcing causes, of fluids (liquids, gases, plasmas);

heat transfer, wherein is studied the movement of thermal energy through and among bodies; and

electromagnetism, wherein is studied the motion, and its forcing causes, of electrically charged particles.

This last example was, in fact, very influential in the original development of both vector calculus and complex analysis.¹ It is not an exaggeration to say that the topics above comprise the majority of physical topics of interest in engineering.

A good introduction to vector calculus is given by Kreyszig.² Perhaps the most famous and enjoyable treatment is given by Schey³ in the adorably titled Div, Grad, Curl, and All That.

It is important to note that in much of what follows, we will describe space (typically the three-dimensional space of our lived experience) as a euclidean vector space: an n-dimensional vector space isomorphic to Rⁿ. As we know from linear algebra, any vector v ∈ Rⁿ can be expressed in any number of bases. That is, the vector v is a basis-free object with multiple basis representations. The components and basis vectors of a vector change with basis changes, but the vector itself is invariant. A coordinate system is, in fact, just a basis. We are most familiar, of course, with Cartesian


manifolds

non-euclidean geometry

parallel postulate

differential geometry

4. John M. Lee, Introduction to Smooth Manifolds, second ed. Vol. 218. Graduate Texts in Mathematics. Springer, 2012.
5. Francesco Bullo and Andrew D. Lewis, Geometric Control of Mechanical Systems: Modeling, Analysis, and Design for Simple Mechanical Control Systems. Ed. by J.E. Marsden, L. Sirovich, and M. Golubitsky. Springer, 2005.

coordinates, which is the specific orthonormal basis b for Rⁿ:

b1 = [1, 0, ···, 0]ᵀ,  b2 = [0, 1, ···, 0]ᵀ,  ···,  bn = [0, 0, ···, 1]ᵀ.    (vecs.1)

Manifolds are spaces that appear locally as Rⁿ but can be globally rather different, and they can describe non-euclidean geometry, wherein euclidean geometry's parallel postulate is invalid. Calculus on manifolds is the focus of differential geometry, a subset of which we can consider our current study. A motivation for further study of differential geometry is that it is very convenient when dealing with advanced applications of mechanics, such as the rigid-body mechanics of robots and vehicles. A very nice mathematical introduction is given by Lee,⁴ and Bullo and Lewis⁵ give a compact presentation in the context of robotics.

Vector fields have several important properties of interest that we'll explore in this chapter. Our goal is to gain an intuition for these properties and to be able to perform basic calculations.


surface integral

flux

6. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 21-30.
7. Kreyszig, Advanced Engineering Mathematics, § 10.6.

source

sink

incompressible

vecs.div Divergence, surface integrals, and flux

Flux and surface integrals

Consider a surface S. Let r(u, v) = [x(u, v), y(u, v), z(u, v)] be a parametric position vector on a Euclidean vector space R³. Furthermore, let F: R³ → R³ be a vector-valued function of r, and let n be a unit-normal vector on the surface S. The surface integral is

∬_S F · n dS,    (div.1)

which integrates the normal component of F over the surface. We call this quantity the flux of F out of the surface S. This terminology comes from fluid flow, for which the flux is the mass flow rate out of S. In general, the flux is a measure of a quantity (or field) passing through a surface. For more on computing surface integrals, see Schey⁶ and Kreyszig⁷.
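As a numeric sanity check on the flux definition, consider an example of our own choosing (not from the text): for F = (x, y, z) on the unit sphere, F · n = 1 everywhere, so the flux should equal the surface area 4π. Evaluating the surface integral in spherical coordinates, where dS = sin(θ) dθ dφ on the unit sphere:

```python
import math

n_th, n_ph = 400, 100
d_th = math.pi / n_th
d_ph = 2 * math.pi / n_ph
flux = 0.0
for i in range(n_th):
    th = (i + 0.5) * d_th  # polar angle, midpoint rule
    for j in range(n_ph):
        # F . n = 1 on the unit sphere, and dS = sin(th) dth dph
        flux += 1.0 * math.sin(th) * d_th * d_ph
```

The double loop mirrors the double integral over the parametric surface; with this discretization, flux lands within about 10⁻⁴ of 4π.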

Continuity

Consider the flux out of a surface S that encloses a volume ∆V, divided by that volume:

(1/∆V) ∬_S F · n dS.    (div.2)

This gives a measure of flux per unit volume for a volume of space. Consider its physical meaning when we interpret this as fluid flow: all fluid that enters the volume is negative flux, and all that leaves is positive. If physical conditions are such that we expect no fluid to enter or exit the volume via what is called a source or a sink, and if we assume the density of the fluid is uniform (this is called an incompressible fluid), then all the fluid that


continuity equation

divergence

enters the volume must exit, and we get

(1/∆V) ∬_S F · n dS = 0.    (div.3)

This is called a continuity equation, although typically this name is given to equations of the form in the next section. This equation is one of the governing equations in continuum mechanics.

Divergence

Let's take the flux-per-volume as the volume ∆V → 0; we obtain the following.

Equation div.4: divergence, integral form:

lim_{∆V→0} (1/∆V) ∬_S F · n dS

This is called the divergence of F and is defined at each point in R³ by taking the volume to zero about it. It is given the shorthand div F.

What interpretation can we give this quantity? It is a measure of the vector field's flux outward through a surface containing an infinitesimal volume. When we consider a fluid, a positive divergence is a local decrease in density, and a negative divergence is a density increase. If the fluid is incompressible and has no sources or sinks, we can write the continuity


equation

div F = 0.    (div.5)

In the Cartesian basis, it can be shown that the divergence is easily computed from the field

F = Fx i + Fy j + Fz k    (div.6)

as follows.

Equation div.7: divergence, differential form:

div F = ∂x Fx + ∂y Fy + ∂z Fz
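The differential form is a one-liner symbolically. A sketch with sympy (which the notebooks below also use), for an example field of our own choosing:

```python
import sympy as sp

# div F = dFx/dx + dFy/dy + dFz/dz for F = (x^2, y^2, z^2).
x, y, z = sp.symbols("x y z")
Fx, Fy, Fz = x**2, y**2, z**2

div_F = sp.diff(Fx, x) + sp.diff(Fy, y) + sp.diff(Fz, z)
```

Here div F = 2x + 2y + 2z, positive in the first octant where the field spreads outward.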

Exploring divergence

Divergence is perhaps best explored by considering it for a vector field in R². Such a field F = Fx i + Fy j can be represented as a "quiver" plot. If we overlay the quiver plot over a "color density" plot representing div F, we can increase our intuition about the divergence.

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: div_surface_integrals_flux.ipynb
notebook kernel: python3


First, load some Python packages:

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and functions:

var('x,y')
F_x = Function('F_x')(x,y)
F_y = Function('F_y')(x,y)

Rather than repeat code, let's write a single function, quiver_plotter_2D, to make several of these plots:

def quiver_plotter_2D(
    field={F_x: x*y, F_y: x*y},
    grid_width=3,
    grid_decimate_x=8,
    grid_decimate_y=8,
    norm=Normalize(),
    density_operation='div',
    print_density=True,
):
    # define symbolics
    var('x,y')
    F_x = Function('F_x')(x,y)
    F_y = Function('F_y')(x,y)
    field_sub = field
    # compute density
    if density_operation == 'div':
        den = F_x.diff(x) + F_y.diff(y)
    elif density_operation == 'curl':  # in the k direction
        den = F_y.diff(x) - F_x.diff(y)
    else:
        raise ValueError('div and curl are the only density operators')
    den_simp = den.subs(field_sub).doit().simplify()
    if den_simp.is_constant():
        print('Warning: density operator is constant (no density plot)')
    if print_density:
        print(f'The {density_operation} is:')
        display(den_simp)
    # lambdify for numerics
    F_x_sub = F_x.subs(field_sub)
    F_y_sub = F_y.subs(field_sub)
    F_x_fun = lambdify((x,y), F_x.subs(field_sub), 'numpy')
    F_y_fun = lambdify((x,y), F_y.subs(field_sub), 'numpy')
    if F_x_sub.is_constant():
        F_x_fun1 = F_x_fun  # dummy
        F_x_fun = lambda x, y: F_x_fun1(x,y)*np.ones(x.shape)
    if F_y_sub.is_constant():
        F_y_fun1 = F_y_fun  # dummy
        F_y_fun = lambda x, y: F_y_fun1(x,y)*np.ones(x.shape)
    if not den_simp.is_constant():
        den_fun = lambdify((x,y), den_simp, 'numpy')
    # create grid
    w = grid_width
    Y, X = np.mgrid[-w:w:100j, -w:w:100j]
    # evaluate numerically
    F_x_num = F_x_fun(X,Y)
    F_y_num = F_y_fun(X,Y)
    if not den_simp.is_constant():
        den_num = den_fun(X,Y)
    # plot
    p = plt.figure()
    # colormesh
    if not den_simp.is_constant():
        cmap = plt.get_cmap('PiYG')
        im = plt.pcolormesh(X, Y, den_num, cmap=cmap, norm=norm)
        plt.colorbar()
    # quiver
    dx = grid_decimate_y
    dy = grid_decimate_x
    plt.quiver(X[::dx,::dy], Y[::dx,::dy],
               F_x_num[::dx,::dy], F_y_num[::dx,::dy],
               units='xy', scale=10)
    plt.title(f'F(x,y) = [{F_x.subs(field_sub)},{F_y.subs(field_sub)}]')
    return p

Note that, while we're at it, we included a hook for density plots of the curl of F; we'll return to this in a later lecture.

Let's inspect several cases.

p = quiver_plotter_2D(
    field={F_x: x**2, F_y: y**2},
)

The div is:

2x + 2y

p = quiver_plotter_2D(
    field={F_x: x*y, F_y: x*y},
)

The div is:

vecs Vector calculus div Divergence surface integrals and flux p 5

3 2 1 0 1 2 33

2

1

0

1

2

3F(xy) = [x2y2]

10

5

0

5

10

Figure 051 png

x + y

p = quiver_plotter_2D(
    field={F_x: x**2+y**2, F_y: x**2+y**2},
)

The div is:

2x + 2y

p = quiver_plotter_2D(
    field={F_x: x**2/sqrt(x**2+y**2), F_y: y**2/sqrt(x**2+y**2)},

Figure 05.2: F(x, y) = [xy, xy]

Figure 05.3: F(x, y) = [x² + y², x² + y²]


Figure 05.4: F(x, y) = [x²/√(x² + y²), y²/√(x² + y²)]

    norm=SymLogNorm(linthresh=3, linscale=3))

The div is:

(−x³ − y³ + 2(x + y)(x² + y²)) / (x² + y²)^(3/2)


line integral

8. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 63-74.
9. Kreyszig, Advanced Engineering Mathematics, §§ 10.1, 10.2.

force

work

circulation

the law of circulation

vecs.curl Curl, line integrals, and circulation

Line integrals

Consider a curve C in a Euclidean vector space R³. Let r(t) = [x(t), y(t), z(t)] be a parametric representation of C. Furthermore, let F: R³ → R³ be a vector-valued function of r, and let r′(t) be the tangent vector. The line integral is

∫_C F(r(t)) · r′(t) dt,    (curl.1)

which integrates F along the curve. For more on computing line integrals, see Schey⁸ and Kreyszig⁹.

If F is a force being applied to an object moving along the curve C, the line integral is the work done by the force. More generally, the line integral integrates F along the tangent of C.

Circulation

Consider the line integral over a closed curve C, denoted by

∮_C F(r(t)) · r′(t) dt.    (curl.2)

We call this quantity the circulation of F around C.

For certain vector-valued functions F, the circulation is zero for every curve. In these cases (static electric fields, for instance), this is sometimes called the law of circulation.


curl


Curl

Consider the division of the circulation around a curve in R³ by the surface area it encloses, ∆S:

(1/∆S) ∮_C F(r(t)) · r′(t) dt.    (curl.3)

In a manner analogous to the operation that gave us the divergence, let's consider shrinking this curve to a point and the surface area to zero:

lim_{∆S→0} (1/∆S) ∮_C F(r(t)) · r′(t) dt.    (curl.4)

We call this quantity the "scalar" curl of F, at each point in R³, in the direction normal to ∆S as it shrinks to zero. Taking three (or n, for Rⁿ) "scalar" curls in independent normal directions (enough to span the vector space), we obtain the curl proper, which is a vector-valued function curl: R³ → R³.

The curl is coordinate-independent. In cartesian coordinates, it can be shown to be equivalent to the following.

Equation curl.5: curl, differential form, cartesian coordinates:

curl F = [∂y Fz − ∂z Fy, ∂z Fx − ∂x Fz, ∂x Fy − ∂y Fx]ᵀ
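The differential form is also easy to evaluate symbolically. A sketch with sympy, for the rigid-rotation field F = [−y, x, 0] (an example of our own choosing), whose curl should be the constant vector [0, 0, 2]:

```python
import sympy as sp

# curl F = [dFz/dy - dFy/dz, dFx/dz - dFz/dx, dFy/dx - dFx/dy]^T
x, y, z = sp.symbols("x y z")
Fx, Fy, Fz = -y, x, sp.Integer(0)  # rigid rotation about k

curl_F = sp.Matrix([
    sp.diff(Fz, y) - sp.diff(Fy, z),
    sp.diff(Fx, z) - sp.diff(Fz, x),
    sp.diff(Fy, x) - sp.diff(Fx, y),
])
```

This same field reappears in the quiver plots below, where its constant curl triggers the "no density plot" warning.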

But what does the curl of F represent? It quantifies the local rotation of F about each point. If F represents a fluid's velocity, curl F


vorticity

10. A region is simply connected if every curve in it can shrink to a point without leaving the region. An example of a region that is not simply connected is the surface of a toroid.

path independence

is the local rotation of the fluid about each point, and it is called the vorticity.

Zero curl, circulation, and path independence

Circulation

It can be shown that if the circulation of F on all curves is zero, then in each direction n and at every point, curl F = 0 (i.e., n · curl F = 0). Conversely, for curl F = 0 in a simply connected region,¹⁰ F has zero circulation.

Succinctly, informally, and without the requisite qualifiers above:

zero circulation ⇒ zero curl    (curl.6)
zero curl + simply connected region ⇒ zero circulation    (curl.7)

Path independence

It can be shown that if the path integral of F on all curves between any two points is path-independent, then in each direction n and at every point, curl F = 0 (i.e., n · curl F = 0). Conversely, for curl F = 0 in a simply connected region, all line integrals are independent of path.

Succinctly, informally, and without the requisite qualifiers above:

path independence ⇒ zero curl    (curl.8)
zero curl + simply connected region ⇒ path independence    (curl.9)
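Path independence can be seen numerically. For the curl-free field F = (2x, 2y) (the gradient of x² + y², an example of our own choosing), the line integral from (0, 0) to (1, 1) should come out the same along any path; here we compare a straight line and a parabola:

```python
# Midpoint-rule line integral of F along a parametric path on t in [0, 1].
def line_integral(F, path, n=1000):
    total = 0.0
    for k in range(n):
        x0, y0 = path(k / n)
        x1, y1 = path((k + 1) / n)
        Fx, Fy = F((x0 + x1) / 2, (y0 + y1) / 2)
        total += Fx * (x1 - x0) + Fy * (y1 - y0)
    return total

F = lambda x, y: (2 * x, 2 * y)  # = grad(x^2 + y^2), so curl F = 0
I_straight = line_integral(F, lambda t: (t, t))
I_parabola = line_integral(F, lambda t: (t, t**2))
```

Both integrals equal f(1, 1) − f(0, 0) = 2 for the potential f = x² + y², as the gradient section below makes precise.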


Figure 05.5: an implication graph relating zero curl, zero circulation, path independence, and connectedness. Green edges represent implication (a implies b) and black edges represent logical conjunctions.

... and how they relate

It is also true that

path independence ⇔ zero circulation.    (curl.10)

So, putting it all together, we get Figure 05.5.

Exploring curl

Curl is perhaps best explored by considering it for a vector field in R². Such a field, in cartesian coordinates F = Fx i + Fy j, has curl

curl F = [∂y 0 − ∂z Fy, ∂z Fx − ∂x 0, ∂x Fy − ∂y Fx]ᵀ
       = [0 − 0, 0 − 0, ∂x Fy − ∂y Fx]ᵀ
       = [0, 0, ∂x Fy − ∂y Fx]ᵀ.    (curl.11)

That is, curl F = (∂x Fy − ∂y Fx) k, and the only nonzero component is normal to the xy-plane. If we overlay a quiver plot of F over a "color density" plot representing the k-component of curl F, we can increase our intuition about the curl.

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: curl-and-line-integrals.ipynb
notebook kernel: python3

First, load some Python packages:

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and functions:

var('x,y')
F_x = Function('F_x')(x,y)
F_y = Function('F_y')(x,y)

We use the same function defined in Lecture vecs.div, quiver_plotter_2D, to make several of these plots. Let's inspect several cases.


Figure 05.6: F(x, y) = [0, cos(2πx)]

p = quiver_plotter_2D(
    field={F_x: 0, F_y: cos(2*pi*x)},
    density_operation='curl',
    grid_decimate_x=2,
    grid_decimate_y=10,
    grid_width=1,
)

The curl is:

−2π sin(2πx)

p = quiver_plotter_2D(
    field={F_x: 0, F_y: x**2},
    density_operation='curl',
    grid_decimate_x=2,
    grid_decimate_y=20,
)

Figure 05.7: F(x, y) = [0, x²]

The curl is:

2x

p = quiver_plotter_2D(
    field={F_x: y**2, F_y: x**2},
    density_operation='curl',
    grid_decimate_x=2,
    grid_decimate_y=20,
)

Figure 05.8: F(x, y) = [y², x²]

The curl is:

2x − 2y

p = quiver_plotter_2D(
    field={F_x: -y, F_y: x},
    density_operation='curl',
    grid_decimate_x=6,
    grid_decimate_y=6,
)

Warning: density operator is constant (no density plot)

The curl is:

2

Figure 05.9: F(x, y) = [−y, x]


gradient

direction

magnitude

principle of least action

vecs.grad Gradient

Gradient

The gradient grad f of a scalar-valued function f: R³ → R is a vector field F: R³ → R³; that is, grad f is a vector-valued function on R³. The gradient's local direction and magnitude are those of the local maximum rate of increase of f. This makes it useful in optimization (e.g., in the method of gradient descent).
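Gradient descent uses exactly this property: since grad f points in the direction of steepest increase, stepping against it decreases f. A minimal sketch on a convex example of our own choosing, f(x, y) = (x − 1)² + (y + 2)², whose gradient is (2(x − 1), 2(y + 2)) and whose minimum is at (1, −2):

```python
def grad_f(x, y):
    # grad f for f(x, y) = (x - 1)^2 + (y + 2)^2
    return 2 * (x - 1), 2 * (y + 2)

x, y = 0.0, 0.0  # initial guess (assumed)
eta = 0.1        # step size (assumed)
for _ in range(200):
    gx, gy = grad_f(x, y)
    x, y = x - eta * gx, y - eta * gy  # step *against* the gradient
```

Each iteration contracts the distance to the minimizer by a factor of 1 − 2η = 0.8, so the iterate converges to (1, −2).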

In classical mechanics, quantum mechanics, relativity, string theory, thermodynamics, and continuum mechanics (and elsewhere), the principle of least action is taken as fundamental (Richard P. Feynman, Robert B. Leighton, and Matthew Sands, The Feynman Lectures on Physics. New Millennium. Perseus Basic Books, 2010). This principle tells us that nature's laws quite frequently seem to be derivable by assuming a certain quantity, called action, is minimized. Considering, then, that the gradient supplies us with a tool for optimizing functions, it is unsurprising that the gradient enters into the equations of motion of many physical quantities.

The gradient is coordinate-independent, but its coordinate-free definitions don't add much to our intuition. In cartesian coordinates, it can be shown to be equivalent to the following.


11. This is nonstandard terminology, but we're bold.

Equation grad.1: gradient, cartesian coordinates

grad f = [∂x f  ∂y f  ∂z f]ᵀ (grad.1)
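As a quick check of Equation grad.1 (an illustration, not part of the original notebook; sympy is assumed available), we can compute a gradient symbolically for an arbitrary scalar-valued function:

```python
import sympy as sp

x, y, z = sp.symbols("x y z")
f = x**2 * y + sp.sin(z)  # an arbitrary scalar-valued function on R^3

# grad f = [d/dx f, d/dy f, d/dz f]^T, per Equation grad.1
grad_f = sp.Matrix([sp.diff(f, v) for v in (x, y, z)])
print(grad_f.T)  # Matrix([[2*x*y, x**2, cos(z)]])
```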

Vector fields from gradients are special

Although all gradients are vector fields, not all vector fields are gradients. That is, given a vector field F, it may or may not be equal to the gradient of any scalar-valued function f. Let's say of a vector field that is a gradient that it has gradience.¹¹ Those vector fields that are gradients have special properties. Surprisingly, those properties are connected to path independence and curl. It can be shown that iff a field is a gradient, line integrals of the field are path independent. That is, for a vector field,

gradience ⇔ path independence. (grad.2)

Considering what we know from Lecture vecs.curl about path independence, we can expand Figure 05.5 to obtain Figure 05.10. One implication is that gradients have zero curl. Many important fields that describe physical interactions (e.g., static electric fields, Newtonian gravitational fields) are gradients of scalar fields called potentials.
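The "gradients have zero curl" implication can be sketched with sympy's vector module (an illustration, not the text's code):

```python
import sympy as sp
from sympy.vector import CoordSys3D, gradient, curl

R = CoordSys3D("R")
f = R.x**2 * R.y + sp.sin(R.z)  # an arbitrary potential (scalar field)
F = gradient(f)                 # F has "gradience": it is a gradient
print(curl(F))                  # the zero vector: gradients are curl-free
```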

Exploring gradient

Gradient is perhaps best explored by considering it for a scalar field on R². Such a field in cartesian coordinates, f(x, y), has


Figure 05.10: an implication graph relating gradience, zero curl, zero circulation, path independence, and connectedness. Green edges represent implication (a implies b) and black edges represent logical conjunctions.

gradient

grad f = [∂x f  ∂y f]ᵀ. (grad.3)

That is, grad f = F = ∂x f î + ∂y f ĵ. If we overlay a quiver plot of F on a "color density" plot representing f, we can increase our intuition about the gradient.

The following was generated from a Jupyter notebook with the following filename and kernel.

notebook filename: grad.ipynb
notebook kernel: python3

First, load some Python packages.


from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and functions.

var('x,y')

(x, y)

Rather than repeat code, let's write a single function grad_plotter_2D to make several of these plots.
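A minimal sketch of such a function, assuming the interface used below (a sympy expression in x and y, with optional norm and mask arguments), might look like this; it is an illustration, not the author's original code:

```python
import numpy as np
import sympy as sp
import matplotlib
matplotlib.use("Agg")  # non-interactive backend so this runs headlessly
import matplotlib.pyplot as plt

x, y = sp.symbols("x y")

def grad_plotter_2D(field, lim=3, n=20, norm=None, mask=False):
    """Overlay a quiver plot of grad(field) on a color density plot of field.

    field: sympy expression in x and y (the scalar field f).
    Returns the matplotlib figure.
    """
    field = sp.sympify(field)
    fx, fy = sp.diff(field, x), sp.diff(field, y)  # gradient components
    print("The gradient is")
    sp.pprint(sp.Matrix([[fx, fy]]))
    f_num, gx, gy = (sp.lambdify((x, y), e, "numpy") for e in (field, fx, fy))
    X, Y = np.meshgrid(np.linspace(-lim, lim, n), np.linspace(-lim, lim, n))
    F = np.broadcast_to(f_num(X, Y), X.shape).astype(float)
    U = np.broadcast_to(gx(X, Y), X.shape).astype(float)
    V = np.broadcast_to(gy(X, Y), X.shape).astype(float)
    if mask:  # mask very large vectors (e.g., near a singularity)
        big = np.hypot(U, V) > 10 * np.nanmedian(np.hypot(U, V))
        U, V = np.where(big, np.nan, U), np.where(big, np.nan, V)
    fig, ax = plt.subplots()
    if np.ptp(F) == 0:
        print("Warning: field is constant (no plot)")
    else:
        ax.pcolormesh(X, Y, F, shading="auto", norm=norm)
    ax.quiver(X, Y, U, V)
    ax.set_title(f"f(x,y) = {sp.sstr(field)}")
    return fig
```

The calls below (field=x, field=x+y, and so on) then produce plots like those in Figures 05.11–05.17.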

Let's inspect several cases. While considering the following plots, remember that they all have zero curl.

p = grad_plotter_2D(field=x)


Figure 05.11: quiver plot of grad f over a color density plot of f(x, y) = x (png)

The gradient is

[1  0]

p = grad_plotter_2D(field=x+y)

The gradient is

[1  1]


Figure 05.12: quiver plot of grad f over a color density plot of f(x, y) = x + y (png)

p = grad_plotter_2D(field=1)

Warning: field is constant (no plot)

The gradient is

[0  0]

Gravitational potential

Gravitational potentials have the form of 1/distance. Let's check out the gradient.


Figure 05.13: quiver plot of grad f over a color density plot of f(x, y) = 1/√(x² + y²), with a logarithmic color scale (png)


p = grad_plotter_2D(
    field=1/sqrt(x**2 + y**2),
    norm=SymLogNorm(linthresh=3, linscale=3),
    mask=True
)

The gradient is

[−x/(x² + y²)^(3/2)  −y/(x² + y²)^(3/2)]

Conic section fields

Gradients of conic section fields can be explored.


Figure 05.14: quiver plot of grad f over a color density plot of f(x, y) = x² (png)


The following is called a parabolic field.

p = grad_plotter_2D(field=x**2)

The gradient is

[2x  0]

The following are called elliptic fields.


Figure 05.15: quiver plot of grad f over a color density plot of f(x, y) = x² + y² (png)

p = grad_plotter_2D(field=x**2 + y**2)

The gradient is

[2x  2y]

p = grad_plotter_2D(field=-x**2 - y**2)


Figure 05.16: quiver plot of grad f over a color density plot of f(x, y) = −x² − y² (png)


The gradient is

[−2x  −2y]

The following is called a hyperbolic field.

p = grad_plotter_2D(field=x**2 - y**2)

The gradient is


Figure 05.17: quiver plot of grad f over a color density plot of f(x, y) = x² − y² (png)

[2x  −2y]


12. A surface is orientable if a consistent normal direction can be defined. Most surfaces are orientable, but some, notably the Möbius strip, cannot be. See Kreyszig (Advanced Engineering Mathematics, § 10.6) for more.
13. Ibid., § 10.7.
14. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 45–52.
15. Ibid., pp. 49–52.

vecs.stoked Stokes and divergence theorems

Two theorems allow us to exchange certain integrals in R³ for others that are easier to evaluate.

The divergence theorem

The divergence theorem asserts the equality of the surface integral of a vector field F and the triple integral of div F over the volume V enclosed by the surface S in R³. That is,

∬_S F · n dS = ∭_V div F dV. (stoked.1)

Caveats are that V is a closed region bounded by the orientable¹² surface S and that F is continuous and continuously differentiable over a region containing V. This theorem makes some intuitive sense: we can think of the divergence inside the volume "accumulating" via the triple integration and equaling the corresponding surface integral. For more on the divergence theorem, see Kreyszig¹³ and Schey.¹⁴

A lovely application of the divergence theorem is that, for any quantity of conserved stuff (mass, charge, spin, etc.) distributed in a spatial R³ with time-dependent density ρ : R⁴ → R and velocity field v : R⁴ → R³, the divergence theorem can be applied to find that

∂t ρ = −div(ρv), (stoked.2)

which is a more general form of a continuity equation, one of the governing equations of many physical phenomena. For a derivation of this equation, see Schey.¹⁵
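As an illustration (not from the text), sympy can verify Equation stoked.1 for the simple field F = [x, y, z] over the unit sphere; both sides evaluate to 4π:

```python
import sympy as sp

x, y, z = sp.symbols("x y z")
r, theta, phi = sp.symbols("r theta phi", positive=True)

F = sp.Matrix([x, y, z])  # a simple vector field
divF = sum(sp.diff(F[i], v) for i, v in enumerate((x, y, z)))  # = 3

# triple integral of div F over the unit ball, in spherical coordinates
volume_integral = sp.integrate(
    divF * r**2 * sp.sin(theta),
    (r, 0, 1), (theta, 0, sp.pi), (phi, 0, 2 * sp.pi),
)

# on the unit sphere, n = [x, y, z] and F . n = x^2 + y^2 + z^2 = 1,
# so the surface integral is just the sphere's area
surface_integral = sp.integrate(
    sp.sin(theta), (theta, 0, sp.pi), (phi, 0, 2 * sp.pi)
)

print(volume_integral, surface_integral)  # both 4*pi
```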


16. A surface is smooth if its normal is continuous everywhere. It is piecewise smooth if it is composed of a finite number of smooth surfaces.
17. Kreyszig, Advanced Engineering Mathematics, § 10.9.
18. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 93–102.
19. Kreyszig, Advanced Engineering Mathematics, § 10.9.
20. Lee, Introduction to Smooth Manifolds, Ch. 16.

The Kelvin-Stokes' theorem

The Kelvin-Stokes' theorem asserts the equality of the circulation of a vector field F over a closed curve C and the surface integral of curl F over a surface S that has boundary C. That is, for r(t) a parameterization of C and surface normal n,

∮_C F(r(t)) · r′(t) dt = ∬_S n · curl F dS. (stoked.3)

Caveats are that S is piecewise smooth,¹⁶ its boundary C is a piecewise smooth, simple, closed curve, and F is continuous and continuously differentiable over a region containing S. This theorem is also somewhat intuitive: we can think of the curl over the surface "accumulating" via the surface integration and equaling the corresponding circulation. For more on the Kelvin-Stokes' theorem, see Kreyszig¹⁷ and Schey.¹⁸

Related theorems

Green's theorem is a two-dimensional special case of the Kelvin-Stokes' theorem. It is described by Kreyszig.¹⁹
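As a small symbolic check of this planar special case (an illustration, not from the text), take F = [−y, x], whose curl is the constant 2: its circulation around the unit circle equals the integral of its curl over the unit disk, both 2π.

```python
import sympy as sp

t, x, y = sp.symbols("t x y")
Fx, Fy = -y, x  # the planar field F = [-y, x]
curl_z = sp.diff(Fy, x) - sp.diff(Fx, y)  # = 2

# circulation around the unit circle r(t) = (cos t, sin t)
rx, ry = sp.cos(t), sp.sin(t)
integrand = (Fx.subs({x: rx, y: ry}) * sp.diff(rx, t)
             + Fy.subs({x: rx, y: ry}) * sp.diff(ry, t))
circulation = sp.integrate(integrand, (t, 0, 2 * sp.pi))

# surface integral of the curl over the unit disk, in polar coordinates
rho, th = sp.symbols("rho theta", positive=True)
flux = sp.integrate(curl_z * rho, (rho, 0, 1), (th, 0, 2 * sp.pi))

print(circulation, flux)  # both 2*pi
```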

It turns out that all of the above theorems (and the fundamental theorem of calculus, which relates the derivative and integral) are special cases of the generalized Stokes' theorem defined by differential geometry. We would need a deeper understanding of differential geometry to understand this theorem. For more, see Lee.²⁰


vecs.ex Exercises for Chapter 05

Exercise 05.1 (6)

The PDE of Example pde.separation-1 can be used to describe the conduction of heat along a long, thin rod insulated along its length, where u(t, x) represents temperature. The initial and Dirichlet boundary conditions in that example would be interpreted as an initial temperature distribution along the bar and fixed temperatures of the ends. Now consider the same PDE,

∂t u(t, x) = k ∂²xx u(t, x), (ex.1)

with real constant k, now with Neumann boundary conditions on interval x ∈ [0, L],

∂x u|x=0 = 0 and ∂x u|x=L = 0, (ex.2a)

and with initial condition

u(0, x) = f(x), (ex.3)

where f is some piecewise continuous function on [0, L]. This represents the complete insulation of the ends of the rod such that no heat flows from the ends (or from anywhere else).

a. Assume a product solution, separate variables into X(x) and T(t), and set the separation constant to −λ.
b. Solve the boundary value problem for its eigenfunctions Xn and eigenvalues λn.
c. Solve for the general solution of the time-variable ODE.
d. Write the product solution and apply the initial condition f(x) by constructing it from a generalized Fourier series of the product solution.


e. Let L = 1 and f(x) = 100 − (200/L)|x − L/2|, as shown in Figure 05.18. Compute the solution series components. Plot the sum of the first 50 terms over x and t.

Figure 05.18: initial condition for Exercise 05.1

four

Fourier and orthogonality

1. It's important to note that the symbol ωn in this context is not the natural frequency but a frequency indexed by integer n.

four.series Fourier series

Fourier series are mathematical series that can represent a periodic signal as a sum of sinusoids at different amplitudes and frequencies. They are useful for solving for the response of a system to periodic inputs. However, they are probably most important conceptually: they are our gateway to thinking of signals in the frequency domain, that is, as functions of frequency (not time). To represent a function as a Fourier series is to analyze it as a sum of sinusoids at different frequencies¹ ωn and amplitudes an. Its frequency spectrum is the functional representation of amplitudes an versus frequency ωn.

Let's begin with the definition.

Definition four.series1: Fourier series, trigonometric form

The Fourier analysis of a periodic function y(t) is, for n ∈ N₀, period T, and angular frequency ωn = 2πn/T,

an = (2/T) ∫_{−T/2}^{T/2} y(t) cos(ωn t) dt (series.1)

bn = (2/T) ∫_{−T/2}^{T/2} y(t) sin(ωn t) dt. (series.2)

The Fourier synthesis of a periodic function y(t) with analysis components an and bn corresponding to ωn is

y(t) = a₀/2 + Σ_{n=1}^{∞} (an cos(ωn t) + bn sin(ωn t)). (series.3)
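Numerically, the analysis integrals above can be approximated by Riemann sums; the following sketch (assumed, not from the text) applies them to a unit-amplitude square wave and recovers the familiar bn = 4/(nπ) for odd n:

```python
import numpy as np

T = 2 * np.pi                      # period
t = np.linspace(-T / 2, T / 2, 20001)
dt = t[1] - t[0]
y = np.sign(np.sin(t))             # unit-amplitude square wave

def fourier_coeffs(n):
    # Riemann-sum approximations of series.1 and series.2
    wn = 2 * np.pi * n / T
    an = (2 / T) * np.sum(y * np.cos(wn * t)) * dt
    bn = (2 / T) * np.sum(y * np.sin(wn * t)) * dt
    return an, bn

for n in (1, 2, 3):
    print(n, fourier_coeffs(n))    # bn near 4/(n*pi) for odd n, near 0 for even n
```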



Let's consider the complex form of the Fourier series, which is analogous to Definition four.series1. It may be helpful to review Euler's formula(s); see Appendix D.01.

Definition four.series2: Fourier series, complex form

The Fourier analysis of a periodic function y(t) is, for n ∈ N₀, period T, and angular frequency ωn = 2πn/T,

c±n = (1/T) ∫_{−T/2}^{T/2} y(t) e^{−jωn t} dt. (series.4)

The Fourier synthesis of a periodic function y(t) with analysis components cn corresponding to ωn is

y(t) = Σ_{n=−∞}^{∞} cn e^{jωn t}. (series.5)

We call the integer n a harmonic and the frequency associated with it,

ωn = 2πn/T, (series.6)

the harmonic frequency. There is a special name for the first harmonic (n = 1): the fundamental frequency. It is called this because all other frequency components are integer multiples of it.

It is also possible to convert between the two representations above.



Definition four.series3: Fourier series, converting between forms

The complex Fourier analysis of a periodic function y(t) is, for n ∈ N₀ and an and bn as defined above,

c±n = (1/2)(a|n| ∓ j b|n|). (series.7)

The sinusoidal Fourier analysis of a periodic function y(t) is, for n ∈ N₀ and cn as defined above,

an = cn + c−n and (series.8)
bn = j(cn − c−n). (series.9)

The harmonic amplitude Cn is

Cn = √(an² + bn²) (series.10)
   = 2√(cn c−n). (series.11)

A magnitude line spectrum is a graph of the harmonic amplitudes as a function of the harmonic frequencies. The harmonic phase is

θn = −arctan2(bn, an) (see Appendix C.02)
   = arctan2(Im(cn), Re(cn)).
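These conversions can be spot-checked numerically for arbitrary (assumed) component values:

```python
import numpy as np

an, bn = 1.5, -0.7                  # arbitrary sinusoidal components
c_pos = (an - 1j * bn) / 2          # c_{+n}, per series.7
c_neg = (an + 1j * bn) / 2          # c_{-n}

assert np.isclose(c_pos + c_neg, an)         # series.8
assert np.isclose(1j * (c_pos - c_neg), bn)  # series.9

Cn = np.sqrt(an**2 + bn**2)                  # series.10
assert np.isclose(2 * np.sqrt(c_pos * c_neg).real, Cn)  # series.11
```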

The following illustration demonstrates how sinusoidal components sum to represent a square wave. A line spectrum is also shown.

(figure: time-domain sinusoidal components and their sum, with spectral amplitude versus frequency)

Let us compute the associated spectral components in the following example.

Example four.series-1: Fourier series analysis, line spectrum

Problem Statement: Compute the first five harmonic amplitudes that represent the line spectrum for a square wave in the figure above.
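A sketch of the computation with sympy (assuming a unit-amplitude square wave that is +1 on the first half-period and −1 on the second):

```python
import sympy as sp

t, T = sp.symbols("t T", positive=True)
n = sp.symbols("n", positive=True, integer=True)
wn = 2 * sp.pi * n / T

# square wave: +1 on (0, T/2), -1 on (-T/2, 0)
an = (2 / T) * (sp.integrate(sp.cos(wn * t), (t, 0, T / 2))
                - sp.integrate(sp.cos(wn * t), (t, -T / 2, 0)))
bn = (2 / T) * (sp.integrate(sp.sin(wn * t), (t, 0, T / 2))
                - sp.integrate(sp.sin(wn * t), (t, -T / 2, 0)))
Cn = sp.simplify(sp.sqrt(an**2 + bn**2))  # series.10

# odd harmonics come out as 4/(n*pi); even harmonics vanish
print([sp.simplify(Cn.subs(n, k)) for k in range(1, 6)])
```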


four.trans Fourier transform

The following was generated from a Jupyter notebook with the following filename and kernel.

notebook filename: fourier_series_to_transform.ipynb
notebook kernel: python3

We begin with the usual loading of modules.

import numpy as np               # for numerics
import sympy as sp               # for symbolics
import matplotlib.pyplot as plt  # for plots
from IPython.display import display, Markdown, Latex

Let's consider a periodic function f with period T (T). Each period, the function has a triangular pulse of width δ (pulse_width) and height δ/2.

period = 15      # period
pulse_width = 2  # pulse width

First, we plot the function f in the time domain. Let's begin by defining f.


def pulse_train(t, T, pulse_width):
    f = lambda x: pulse_width/2 - abs(x)  # pulse
    tm = np.mod(t, T)
    if tm <= pulse_width/2:
        return f(tm)
    elif tm >= T - pulse_width/2:
        return f(-(tm - T))
    else:
        return 0

Now we develop a numerical array in time to plot f.

N = 201  # number of points to plot
tpp = np.linspace(-period/2, 5*period/2, N)  # time values
fpp = np.array(np.zeros(tpp.shape))
for i, t_now in enumerate(tpp):
    fpp[i] = pulse_train(t_now, period, pulse_width)

axes = plt.figure(1)
plt.plot(tpp, fpp, 'b-', linewidth=2)  # plot
plt.xlabel('time (s)')
plt.xlim([-period/2, 3*period/2])
plt.xticks(
    [0, pulse_width/2, period],
    ['0', r'$\delta/2$', '$T=' + str(period) + '$ s']
)
plt.yticks([0, pulse_width/2], ['0', r'$\delta/2$'])
plt.show()  # display here

For δ = 2 and T ∈ [5, 15, 25], the left-hand column of Figure 06.1 shows two triangle pulses for each period T.


Consider the following argument. Just as a Fourier series is a frequency domain representation of a periodic signal, a Fourier transform is a frequency domain representation of an aperiodic signal (we will rigorously define it in a moment). The Fourier series components will have an analog, then, in the Fourier transform. Recall that they can be computed by integrating over a period of the signal. If we increase that period infinitely, the function is effectively aperiodic. The result (within a scaling factor) will be the Fourier transform analog of the Fourier series components.

Let us approach this understanding by actually computing the Fourier series components for increasing period T. We'll use sympy to compute the Fourier series cosine and sine components an and bn for component n (n) and period T (T).

sp.var('x, a_0, a_n, b_n', real=True)
sp.var('delta, T', positive=True)
sp.var('n', nonnegative=True)
a0 = (2/T*sp.integrate(
    (delta/2 - sp.Abs(x)), (x, -delta/2, delta/2)  # otherwise zero
)).simplify()
an = sp.integrate(
    2/T*(delta/2 - sp.Abs(x))*sp.cos(2*sp.pi*n/T*x),
    (x, -delta/2, delta/2)  # otherwise zero
).simplify()
bn = (2/T*sp.integrate(
    (delta/2 - sp.Abs(x))*sp.sin(2*sp.pi*n/T*x),
    (x, -delta/2, delta/2)  # otherwise zero
)).simplify()
display(sp.Eq(a_n, an), sp.Eq(b_n, bn))


an = { T(1 − cos(πδn/T))/(π²n²)  for n ≠ 0
     { δ²/(2T)                   otherwise

bn = 0

Furthermore, let us compute the harmonic amplitude (f_harmonic_amplitude)

Cn = √(an² + bn²), (trans.1)

which we have also scaled by a factor T/δ in order to plot it with a convenient scale.

sp.var('C_n', positive=True)
cn = sp.sqrt(an**2 + bn**2)
display(sp.Eq(C_n, cn))

Cn = { T|cos(πδn/T) − 1|/(π²n²)  for n ≠ 0
     { δ²/(2T)                   otherwise

Now we lambdify the symbolic expression for a numpy function.

cn_f = sp.lambdify((n, T, delta), cn)

Now we can plot.


Figure 06.1: triangle pulse trains (left column) with longer periods descending, and their corresponding line spectra (right column), scaled for convenient comparison

omega_max = 12  # rad/s, max frequency in line spectrum
n_max = round(omega_max*period/(2*np.pi))  # max harmonic
n_a = np.linspace(0, n_max, n_max+1)
omega = 2*np.pi*n_a/period
axes = plt.figure(2)
markerline, stemlines, baseline = plt.stem(
    omega, period/pulse_width*cn_f(n_a, period, pulse_width),
    linefmt='b-', markerfmt='bo', basefmt='r-',
    use_line_collection=True
)
plt.xlabel(r'frequency $\omega$ (rad/s)')
plt.xlim([0, omega_max])
plt.ylim([0, pulse_width/2])
plt.yticks([0, pulse_width/2], ['0', r'$\delta/2$'])
plt.show()  # show here

The line spectra are shown in the right-hand column of Figure 06.1. Note that, with our chosen scaling, as T increases the line spectra reveal a distinct waveform.


Figure 06.2: F(ω), our mysterious Fourier series amplitude analog

Let F be the continuous function of angular frequency ω,

F(ω) = (δ/2) · sin²(ωδ/4)/(ωδ/4)². (trans.2)

First, we plot it.

F = lambda w: (pulse_width/2
    * np.sin(w*pulse_width/(2*2))**2 / (w*pulse_width/(2*2))**2)

N = 201  # number of points to plot
wpp = np.linspace(0.0001, omega_max, N)
Fpp = []
for i in range(0, N):
    Fpp.append(F(wpp[i]))  # build array of function values
axes = plt.figure(3)
plt.plot(wpp, Fpp, 'b-', linewidth=2)  # plot
plt.xlim([0, omega_max])
plt.yticks([0, pulse_width/2], ['0', r'$\delta/2$'])
plt.xlabel(r'frequency $\omega$ (rad/s)')
plt.ylabel(r'$F(\omega)$')
plt.show()

Let's consider the plot in Figure 06.2 of F. It's obviously the function emerging in Figure 06.1 from increasing the period of our pulse train.


Now we are ready to define the Fourier

transform and its inverse

Definition four.trans1: Fourier transforms, trigonometric form

Fourier transform (analysis):

A(ω) = ∫_{−∞}^{∞} y(t) cos(ωt) dt (trans.3)

B(ω) = ∫_{−∞}^{∞} y(t) sin(ωt) dt. (trans.4)

Inverse Fourier transform (synthesis):

y(t) = (1/(2π)) ∫_{−∞}^{∞} A(ω) cos(ωt) dω + (1/(2π)) ∫_{−∞}^{∞} B(ω) sin(ωt) dω. (trans.5)

Definition four.trans2: Fourier transforms, complex form

Fourier transform F (analysis):

F(y(t)) = Y(ω) = ∫_{−∞}^{∞} y(t) e^{−jωt} dt. (trans.6)

Inverse Fourier transform F⁻¹ (synthesis):

F⁻¹(Y(ω)) = y(t) = (1/(2π)) ∫_{−∞}^{∞} Y(ω) e^{jωt} dω. (trans.7)
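As a sanity check on the transform pair (an illustration, not from the text; sympy is assumed), applying Equation trans.6 and then trans.7 to a Gaussian recovers the original signal:

```python
import sympy as sp

t, w = sp.symbols("t omega", real=True)
y = sp.exp(-t**2)  # an absolutely integrable signal

# analysis (trans.6)
Y = sp.integrate(y * sp.exp(-sp.I * w * t), (t, -sp.oo, sp.oo))
# synthesis (trans.7)
y_back = sp.integrate(Y * sp.exp(sp.I * w * t), (w, -sp.oo, sp.oo)) / (2 * sp.pi)

print(sp.simplify(Y))       # sqrt(pi)*exp(-omega**2/4)
print(sp.simplify(y_back))  # exp(-t**2): the pair inverts correctly
```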

So now we have defined the Fourier transform. There are many applications, including solving differential equations and frequency domain representations, called spectra, of time domain functions.

There is a striking similarity between the Fourier transform and the Laplace transform, with which you are already acquainted. In fact, the Fourier transform is a special case of a Laplace transform, with Laplace transform variable s = jω instead of having some real component. Both transforms convert differential equations to algebraic equations, which can be solved and inversely transformed to find time-domain solutions. The Laplace transform is especially important to use when an input function to a differential equation is not absolutely integrable and the Fourier transform is undefined (for example, our definition will yield a transform for neither the unit step nor the unit ramp functions). However, the Laplace transform is also preferred for initial value problems due to its convenient way of handling them. The two transforms are equally useful for solving steady state problems. Although the Laplace transform has many advantages, for spectral considerations the Fourier transform is the only game in town.

A table of Fourier transforms and their properties can be found in Appendix B.02.

Example four.trans-1: a Fourier transform

Problem Statement: Consider the aperiodic signal y(t) = u_s(t) e^{−at}, with u_s the unit step function and a > 0. The signal is plotted below. Derive the complex frequency spectrum and plot its magnitude and phase.

(figure: y(t) versus t, a decaying exponential beginning at t = 0)


Problem Solution: The signal is aperiodic, so the Fourier transform can be computed from Equation trans.6:

Y(ω) = ∫_{−∞}^{∞} y(t) e^{−jωt} dt
     = ∫_{−∞}^{∞} u_s(t) e^{−at} e^{−jωt} dt         (def. of y)
     = ∫_{0}^{∞} e^{−at} e^{−jωt} dt                 (u_s effect)
     = ∫_{0}^{∞} e^{(−a−jω)t} dt                     (multiply)
     = (1/(−a − jω)) e^{(−a−jω)t} |₀^∞               (antiderivative)
     = (1/(−a − jω)) (lim_{t→∞} e^{(−a−jω)t} − e⁰)   (evaluate)
     = (1/(−a − jω)) (lim_{t→∞} e^{−at} e^{−jωt} − 1)  (arrange)
     = (1/(−a − jω)) ((0)(complex with mag ⩽ 1) − 1)   (limit)
     = −1/(−a − jω)                                  (consequence)
     = 1/(a + jω)
     = ((a − jω)/(a − jω)) · 1/(a + jω)              (rationalize)
     = (a − jω)/(a² + ω²).

The magnitude and phase of this complex function are straightforward to compute:

|Y(ω)| = √(Re(Y(ω))² + Im(Y(ω))²)
       = (1/(a² + ω²)) √(a² + ω²)
       = 1/√(a² + ω²)

∠Y(ω) = −arctan(ω/a).
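This result can be checked numerically (an illustration; the parameter a = 1 and the truncation point are arbitrary assumptions) by approximating the Fourier integral with a Riemann sum:

```python
import numpy as np

a = 1.0
t = np.linspace(0, 60, 600001)   # e^{-at} is negligible beyond t = 60
dt = t[1] - t[0]
y = np.exp(-a * t)               # u_s(t) e^{-at} for t >= 0

def Y_numeric(w):
    # Riemann-sum approximation of Y(w) = int y(t) e^{-jwt} dt
    return np.sum(y * np.exp(-1j * w * t)) * dt

for w in (0.0, 1.0, 5.0):
    exact = 1 / (a + 1j * w)     # the derived closed form
    print(w, abs(Y_numeric(w) - exact))  # small errors
```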


Now we can plot these functions of ω. Setting a = 1 (arbitrarily), we obtain the following plots.

(figure: |Y(ω)| and ∠Y(ω) plotted over ω ∈ [−10, 10])


2. A function f is square-integrable if ∫_{−∞}^{∞} |f(x)|² dx < ∞.
3. This definition of the inner product can be extended to functions on R² and R³ domains using double- and triple-integration. See Schey (Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, p. 261).

four.general Generalized fourier series and orthogonality

Let f : R → C, g : R → C, and w : R → C be complex functions. For square-integrable² f, g, and w, the inner product of f and g with weight function w over the interval [a, b] ⊆ R is³

⟨f, g⟩_w = ∫_a^b f(x) g̅(x) w(x) dx, (general.1)

where g̅ denotes the complex conjugate of g. The inner product of functions can be considered analogous to the inner (or dot) product of vectors.

The fourier series components can be found by a special property of the sin and cos functions called orthogonality. In general, functions f and g from above are orthogonal over the interval [a, b] iff

⟨f, g⟩_w = 0 (general.2)

for weight function w. Similar to how a set of orthogonal vectors can be a basis for a vector space, a set of orthogonal functions can be a basis for a function space: a vector space of functions from one set to another (with certain caveats).
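A quick sympy illustration (not from the text) of Equations general.1 and general.2 for sinusoids on [−π, π] with unit weight:

```python
import sympy as sp

x = sp.symbols("x", real=True)

def inner(f, g, a, b, w=1):
    # inner product <f, g>_w per general.1 (conjugate the second argument)
    return sp.integrate(f * sp.conjugate(g) * w, (x, a, b))

print(inner(sp.sin(2 * x), sp.sin(3 * x), -sp.pi, sp.pi))  # 0: orthogonal
print(inner(sp.sin(2 * x), sp.cos(2 * x), -sp.pi, sp.pi))  # 0: orthogonal
print(inner(sp.sin(2 * x), sp.sin(2 * x), -sp.pi, sp.pi))  # pi: not orthogonal to itself
```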

In addition to some sets of sinusoids, there are several other important sets of functions that are orthogonal. For instance, sets of legendre polynomials (Erwin Kreyszig, Advanced Engineering Mathematics, 10th ed., John Wiley & Sons, 2011, § 5.2) and bessel functions (Kreyszig, Advanced Engineering Mathematics, § 5.4) are orthogonal.

As with sinusoids, the orthogonality of some sets of functions allows us to compute their series components. Let functions f₀, f₁, ... be orthogonal with respect to weight function w on interval [a, b], and let α₀, α₁, ... be real constants. A generalized fourier series is (ibid., § 11.6)

f(x) = Σ_{m=0}^{∞} αm fm(x) (general.3)

and represents a function f as a convergent series. It can be shown that the Fourier components αm can be computed from

αm = ⟨f, fm⟩_w / ⟨fm, fm⟩_w. (general.4)

In keeping with our previous terminology for fourier series, Equation general.3 and Equation general.4 are called general fourier synthesis and analysis, respectively.

For the aforementioned legendre and bessel functions, the generalized fourier series are called fourier-legendre and fourier-bessel series (ibid., § 11.6). These and the standard fourier series (Lecture four.series) are of particular interest for the solution of partial differential equations (Chapter 07).
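A sketch of a fourier-legendre analysis (an illustration, assuming f(x) = |x| on [−1, 1] with unit weight), applying Equation general.4 directly:

```python
import sympy as sp

x = sp.symbols("x", real=True)
f = sp.Abs(x)  # function to expand on [-1, 1]

def alpha(m):
    # general.4 with fm the m-th legendre polynomial and w = 1
    Pm = sp.legendre(m, x)
    num = sp.integrate(f * Pm, (x, -1, 1))
    den = sp.integrate(Pm**2, (x, -1, 1))  # = 2/(2m + 1)
    return sp.simplify(num / den)

coeffs = [alpha(m) for m in range(5)]
print(coeffs)  # [1/2, 0, 5/8, 0, -3/16]
```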


3. Python code in this section was generated from a Jupyter notebook named random_signal_fft.ipynb with a python3 kernel.

four.ex Exercises for Chapter 06

Exercise 06.1 (secrets)

This exercise encodes a "secret word" into a sampled waveform for decoding via a discrete fourier transform (DFT). The nominal goal of the exercise is to decode the secret word. Along the way, plotting and interpreting the DFT will be important.

First, load relevant packages.

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

We define two functions: letter_to_number, to convert a letter into an integer index of the alphabet (a becomes 1, b becomes 2, etc.), and string_to_number_list, to convert a string to a list of ints, as shown in the example at the end.

def letter_to_number(letter):
    return ord(letter) - 96

def string_to_number_list(string):
    out = []  # list
    for i in range(0, len(string)):
        out.append(letter_to_number(string[i]))
    return out  # list

print(f"aces = {string_to_number_list('aces')}")


aces = [1, 3, 5, 19]

Now we encode a code string code into a signal by beginning with "white noise," which is broadband (appears throughout the spectrum), and adding to it sin functions with amplitudes corresponding to the letter assignments of the code and harmonics corresponding to the position of the letter in the string. For instance, the string bad would be represented by noise plus the signal

2 sin 2πt + 1 sin 4πt + 4 sin 6πt. (ex.1)

Let's set this up for secret word chupcabra.

N = 2000
Tm = 30
T = float(Tm)/float(N)
fs = 1/T
x = np.linspace(0, Tm, N)
noise = 4*np.random.normal(0, 1, N)
code = 'chupcabra'  # the secret word
code_number_array = np.array(string_to_number_list(code))
y = np.array(noise)
for i in range(0, len(code)):
    y = y + code_number_array[i]*np.sin(2*np.pi*(i+1)*x)

For proper decoding later, it is important to know the fundamental frequency of the generated data.

print(f"fundamental frequency = {fs} Hz")

Figure 06.3: the chupacabra signal

fundamental frequency = 6666666666666667 Hz

Now we plot.

fig, ax = plt.subplots()
plt.plot(x, y)
plt.xlim([0, Tm/4])
plt.xlabel('time (s)')
plt.ylabel('$y_n$')
plt.show()

Finally, we can save our data to a numpy file, secrets.npy, to distribute our message.

np.save('secrets', y)


Now, I have done this (for a different secret word) and saved the data; download it here:

ricopic.one/mathematical_foundations/source/secrets.npy

In order to load the npy file into Python, we can use the following command.

secret_array = np.load('secrets.npy')

Your job is to (a) perform a DFT, (b) plot the spectrum, and (c) decode the message. Here are a few hints:

1. Use from scipy import fft to do the DFT.
2. Use a hanning window to minimize the end-effects. See numpy.hanning, for instance. The fft call might then look like 2*fft(np.hanning(N)*secret_array)/N, where N = len(secret_array).
3. Use only the positive spectrum; you can lop off the negative side and double the positive side.

Exercise 06.2 (seesaw)

Consider the periodic function f : R → R with period T, defined for one period as

f(t) = at for t ∈ (−T/2, T/2], (ex.2)

where a, T ∈ R. Perform a fourier analysis on f. Letting a = 5 and T = 1, plot f along with the partial sum of the fourier synthesis, the first 50 nonzero components, over t ∈ [−T, T].


Exercise 06.3 (9)

Consider the function f : R → R defined as

f(t) = { a − a|t|/T for t ∈ [−T, T]
       { 0          otherwise, (ex.3)

where a, T ∈ R. Perform a fourier analysis on f, resulting in F(ω). Plot F for various a and T.

Exercise 06.4 (10)

Consider the function f : R → R defined as

f(t) = a e^{−b|t−T|}, (ex.4)

where a, b, T ∈ R. Perform a fourier analysis on f, resulting in F(ω). Plot F for various a, b, and T.

Exercise 06.5 (11)

Consider the function f : R → R defined as

f(t) = a cos ω₀t + b sin ω₀t, (ex.5)

where a, b, ω₀ ∈ R are constants. Perform a fourier analysis on f, resulting in F(ω).

Exercise 06.6 (12)

Derive a fourier transform property for expressions including function f : R → R, for

f(t) cos(ω₀t + ψ),

where ω₀, ψ ∈ R.

Exercise 06.7 (13)

Consider the function f : R → R defined as

f(t) = a u_s(t) e^{−bt} cos(ω₀t + ψ), (ex.6)

where a, b, ω₀, ψ ∈ R and u_s(t) is the unit step function. Perform a fourier analysis on f, resulting in F(ω). Plot F for various a, b, ω₀, ψ, and T.


Exercise 06.8 (14)

Consider the function f : R → R defined as

f(t) = g(t) cos(ω₀t), (ex.7)

where ω₀ ∈ R and g : R → R will be defined in each part below. Perform a fourier analysis on f for each g below, for ω₁ ∈ R a constant, and consider how things change if ω₁ → ω₀.

a. g(t) = cos(ω₁t)
b. g(t) = sin(ω₁t)

Exercise 06.9 (savage)

An instrument called a "lock-in amplifier" can measure a sinusoidal signal A cos(ω₀t + ψ) = a cos(ω₀t) + b sin(ω₀t) at a known frequency ω₀ with exceptional accuracy, even in the presence of significant noise N(t). The workings of these devices can be described in two operations. First, the following operations on the signal with its noise, f₁(t) = a cos(ω₀t) + b sin(ω₀t) + N(t):

f₂(t) = f₁(t) cos(ω₁t) and f₃(t) = f₁(t) sin(ω₁t), (ex.8)

where ω₀, ω₁, a, b ∈ R. Note the relation of this operation to the Fourier analysis of Exercise 06.8. The key is to know ω₀ with some accuracy such that the instrument can set ω₁ ≈ ω₀. The second operation on the signal is an aggressive low-pass filter. The filtered f₂ and f₃ are called the in-phase and quadrature components of the signal and are typically given a complex representation,

(in-phase) + j (quadrature).

Explain with fourier analyses on f₂ and f₃


a. what F₂ = F(f₂) looks like,
b. what F₃ = F(f₃) looks like,
c. why we want ω₁ ≈ ω₀,
d. why a low-pass filter is desirable, and
e. what the time-domain signal will look like.

Exercise 06.10 (strawman)

Consider again the lock-in amplifier explored in Exercise 06.9. Investigate the lock-in amplifier numerically with the following steps.

a. Generate a noisy sinusoidal signal at some frequency ω₀. Include enough broadband white noise that the signal is invisible in a time-domain plot.
b. Generate f₂ and f₃ as described in Exercise 06.9.
c. Apply a time-domain discrete low-pass filter to each, f₂ ↦ φ₂ and f₃ ↦ φ₃, such as scipy's scipy.signal.sosfiltfilt, to complete the lock-in amplifier operation. Plot the results in time and as a complex (polar) plot.
d. Perform a discrete fourier transform on each, f₂ ↦ F₂ and f₃ ↦ F₃. Plot the spectra.
e. Construct a frequency domain low-pass filter F and apply it (multiply) to each, F₂ ↦ F′₂ and F₃ ↦ F′₃. Plot the filtered spectra.
f. Perform an inverse discrete fourier transform on each, F′₂ ↦ f′₂ and F′₃ ↦ f′₃. Plot the results in time and as a complex (polar) plot.
g. Compare the two methods used, i.e., time-domain filtering versus frequency-domain filtering.


pde

Partial differential equations

An ordinary differential equation is one with (ordinary) derivatives of functions of a single variable each (time, in many applications). These typically describe quantities in some sort of lumped-parameter way: mass as a "point particle," a spring's force as a function of time-varying displacement across it, a resistor's current as a function of time-varying voltage across it. Given the simplicity of such models in comparison to the wildness of nature, it is quite surprising how well they work for a great many phenomena. For instance, electronics, rigid body mechanics, population dynamics, bulk fluid mechanics, and bulk heat transfer can be lumped-parameter modeled. However, as we have seen, there are many phenomena for which we require more detailed models. These include


1. There are some analytic techniques for gaining insight into PDEs for which there are no known solutions, such as considering the phase space. This is an active area of research; for more, see Bove, Colombini, and Del Santo (Antonio Bove, F. (Ferruccio) Colombini, and Daniele Del Santo, Phase Space Analysis of Partial Differential Equations, Progress in Nonlinear Differential Equations and Their Applications v. 69, Boston/Berlin: Birkhäuser, 2006, ISBN 9780817645212).

• detailed fluid mechanics,

• detailed heat transfer,

• solid mechanics,

• electromagnetism, and

• quantum mechanics.

In many cases, what must be accounted for is the time-varying spatial distribution of a quantity. In fluid mechanics, we treat a fluid as having quantities, such as density and velocity, that vary continuously over space and time. Deriving the governing equations for such phenomena typically involves vector calculus: we have observed that statements about quantities like the divergence (e.g. continuity) can be made about certain scalar and vector fields. Such statements are governing equations (e.g. the continuity equation), and they are partial differential equations (PDEs) because the quantities of interest, called dependent variables (e.g. density and velocity), are both temporally and spatially varying (temporal and spatial variables are therefore called independent variables).

In this chapter, we explore the analytic solution of PDEs. This is related to, but distinct from, the numeric solution (i.e. simulation) of PDEs, which is another important topic. Many PDEs have no known analytic solution, so numeric solution is the best available option.1 However, it is important to note that the insight one can gain from an analytic solution is often much greater than that from a numeric solution. This is easily understood when one considers that a numeric solution is an approximation for a specific set


2. Kreyszig, Advanced Engineering Mathematics, Ch. 12.

3. W.A. Strauss, Partial Differential Equations: An Introduction, Wiley, 2007, ISBN 9780470054567. A thorough and yet relatively compact introduction.

4. R. Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), Pearson Modern Classics for Advanced Mathematics, Pearson Education Canada, 2018, ISBN 9780134995434.

of initial and boundary conditions. Typically, very little can be said of what would happen in general, although this is often what we seek to know. So despite the importance of numeric solution, one should always prefer an analytic solution.

Three good texts on PDEs for further study are Kreyszig,2 Strauss,3 and Haberman.4


pdeclass Classifying PDEs

PDEs often have an infinite number of solutions; however, when applying them to physical systems, we usually assume that a deterministic, or at least a probabilistic, sequence of events will occur. Therefore, we impose additional constraints on a PDE, usually in the form of

1. initial conditions: values of the dependent variables over all space at an initial time, and

2. boundary conditions: values of the dependent variables (or their derivatives) over all time at the spatial boundary.

Ideally, imposing such conditions leaves us with a well-posed problem, which has three aspects (Bove, Colombini, and Del Santo, Phase Space Analysis of Partial Differential Equations, § 1.5):

existence: there exists at least one solution,

uniqueness: there exists at most one solution, and

stability: if the PDE, boundary conditions, or initial conditions are changed slightly, the solution changes only slightly.

As with ODEs, PDEs can be linear or nonlinear; that is, the dependent variables and their derivatives can appear in only linear combinations (linear PDE) or in one or more nonlinear combinations (nonlinear PDE). As with ODEs, there are more known analytic solutions to linear PDEs than to nonlinear PDEs.

The order of a PDE is the order of its highest partial derivative. A great many physical models can be described by second-order PDEs, or systems thereof. Let u be a dependent scalar variable, a function of the n temporal and spatial independent variables xi. A second-order linear PDE has the form, for coefficients αij, γk, δk real functions of the xi (Strauss, Partial Differential Equations: An Introduction, § 1.6),

Σⁿᵢ₌₁ Σⁿⱼ₌₁ αij ∂²ₓᵢₓⱼu  (second-order terms)
+ Σⁿₖ₌₁ (γk ∂ₓₖu + δk u)  (first- and zeroth-order terms)
= f(x1, · · · , xn),  (forcing)  (class1)

where f is called a forcing function. When f is zero, Equation class1 is called homogeneous.

We can consider the coefficients αij to be components of a matrix A with rows indexed by i and columns indexed by j. There are four prominent classes of PDE, defined by the eigenvalues of A:

elliptic: the eigenvalues all have the same sign,

parabolic: the eigenvalues have the same sign except one that is zero,

hyperbolic: exactly one eigenvalue has the opposite sign of the others, and

ultrahyperbolic: at least two eigenvalues of each sign.

The first three of these have received extensive treatment. They are named after conic sections due to the similarity the equations have with polynomials when derivatives are considered analogous to powers of polynomial variables. For instance, here is a case of each of the first three classes:

∂²ₓₓu + ∂²ᵧᵧu = 0,  (elliptic)

∂²ₓₓu - ∂²ᵧᵧu = 0,  (hyperbolic)

∂²ₓₓu - ∂ₜu = 0.  (parabolic)
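This classification can be checked mechanically from the signs of the eigenvalues of A. The following is a minimal sketch (the `classify` helper and the diagonal example matrices are illustrative assumptions, not from the text), applied to the three example equations above:

```python
import numpy as np

def classify(A):
    """Classify a second-order PDE by the eigenvalue signs of its
    (symmetric) coefficient matrix A."""
    lam = np.linalg.eigvalsh(A)  # real eigenvalues of a symmetric matrix
    n = len(lam)
    pos = np.sum(lam > 0)
    neg = np.sum(lam < 0)
    zero = np.sum(np.isclose(lam, 0))
    if pos == n or neg == n:                            # all one sign
        return "elliptic"
    if zero == 1 and (pos == n - 1 or neg == n - 1):    # one zero
        return "parabolic"
    if zero == 0 and (pos == 1 or neg == 1):            # one odd sign
        return "hyperbolic"
    return "ultrahyperbolic"

# u_xx + u_yy = 0: A = diag(1, 1)
print(classify(np.diag([1.0, 1.0])))   # elliptic
# u_xx - u_yy = 0: A = diag(1, -1)
print(classify(np.diag([1.0, -1.0])))  # hyperbolic
# u_xx - u_t = 0: no second t-derivative, A = diag(1, 0)
print(classify(np.diag([1.0, 0.0])))   # parabolic
```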

When A depends on the xi, a PDE may have multiple classes across its domain. In general, this equation and its associated initial and boundary conditions do not comprise a well-posed problem; however, several special cases have been shown to be well-posed. Thus far, the most general statement of existence and uniqueness is the cauchy-kowalevski theorem for cauchy problems.


5. For the S-L problem to be regular, it has the additional constraints that p, q, σ are continuous and p, σ > 0 on [a, b]. This is also sometimes called the sturm-liouville eigenvalue problem. See Haberman (Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 5.3) for the more general (non-regular) S-L problem and ibid., § 7.4, for the multi-dimensional analog.

6. These eigenvalues are closely related to, but distinct from, the "eigenvalues" that arise in systems of linear ODEs.

7. Ibid., § 5.3.

pdesturm Sturm-liouville problems

Before we introduce an important solution method for PDEs in Lecture pdeseparation, we consider an ordinary differential equation that will arise in that method when dealing with a single spatial dimension x: the sturm-liouville (S-L) differential equation. Let p, q, σ be functions of x on the open interval (a, b). Let X be the dependent variable and λ a constant. The regular S-L problem is the S-L ODE5

d/dx (pX′) + qX + λσX = 0,  (sturm1a)

with boundary conditions

β1X(a) + β2X′(a) = 0,  (sturm2a)
β3X(b) + β4X′(b) = 0,  (sturm2b)

with coefficients βi ∈ ℝ. This is a type of boundary value problem.

This problem has nontrivial solutions, called eigenfunctions, Xn(x) with n ∈ ℤ⁺, corresponding to specific values λ = λn, called eigenvalues.6 There are several important theorems proven about this (see Haberman7). Of greatest interest to us are that

1. there exist an infinite number of eigenfunctions Xn (unique within a multiplicative constant),

2. there exists a unique corresponding real eigenvalue λn for each eigenfunction Xn,

3. the eigenvalues can be ordered as λ1 < λ2 < · · · ,

4. eigenfunction Xn has n - 1 zeros on the open interval (a, b), and

5. the eigenfunctions Xn form an orthogonal basis with respect to weighting function σ, such that any piecewise continuous function f : [a, b] → ℝ can be represented by a generalized fourier series on [a, b].

This last theorem will be of particular interest in Lecture pdeseparation.
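The orthogonality in theorem 5 can be spot-checked numerically. A minimal sketch (assuming the eigenfunctions sin(nπx/L) with weighting function σ = 1, which arise in the example that follows; the `inner` helper is ours) approximates the inner products by the trapezoid rule:

```python
import numpy as np

def inner(f, g, x):
    """Trapezoid-rule approximation of the inner product int f*g dx
    on the grid x (weighting function sigma = 1 assumed)."""
    return float(np.sum(0.5*(f[1:]*g[1:] + f[:-1]*g[:-1])*np.diff(x)))

L = 1.0
x = np.linspace(0, L, 2001)
X = lambda n: np.sin(n*np.pi*x/L)  # S-L eigenfunctions for sigma = 1

print(inner(X(1), X(2), x))  # ~0: distinct eigenfunctions are orthogonal
print(inner(X(2), X(3), x))  # ~0
print(inner(X(1), X(1), x))  # ~L/2: nonzero norm
```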

Types of boundary conditions

Boundary conditions of the sturm-liouville kind (sturm2) have four sub-types:

dirichlet: for just β2, β4 = 0,

neumann: for just β1, β3 = 0,

robin: for all βi ≠ 0, and

mixed: if β1 = 0, β3 ≠ 0 or if β2 = 0, β4 ≠ 0.

There are many problems that are not regular sturm-liouville problems. For instance, the right-hand sides of Equation sturm2 are zero, making them homogeneous boundary conditions; however, these can also be nonzero. Another case is periodic boundary conditions:

X(a) = X(b),  (sturm3a)
X′(a) = X′(b).  (sturm3b)


Example pdesturm-1: a sturm-liouville problem with dirichlet boundary conditions

Problem Statement

Consider the differential equation

X″ + λX = 0,  (sturm4)

with dirichlet boundary conditions on the boundary of the interval [0, L]:

X(0) = 0 and X(L) = 0.  (sturm5)

Solve for the eigenvalues and eigenfunctions.

Problem Solution

This is a sturm-liouville problem, so we know the eigenvalues are real. The well-known general solution to the ODE is

X(x) = k1 + k2x,  λ = 0;
X(x) = k1 e^(j√λ x) + k2 e^(-j√λ x),  otherwise,  (sturm6)

with real constants k1, k2. The solution must also satisfy the boundary conditions. Let's apply them to the case of λ = 0 first:

X(0) = 0 ⇒ k1 + k2(0) = 0 ⇒ k1 = 0,  (sturm7)
X(L) = 0 ⇒ k1 + k2(L) = 0 ⇒ k2 = -k1/L.  (sturm8)

Together these imply k1 = k2 = 0, which gives the trivial solution X(x) = 0, in which we aren't interested. We say, then, that for nontrivial solutions, λ ≠ 0. Now let's check λ < 0. The solution becomes

X(x) = k1 e^(-√|λ| x) + k2 e^(√|λ| x)  (sturm9)
     = k3 cosh(√|λ| x) + k4 sinh(√|λ| x),  (sturm10)

where k3 and k4 are real constants. Again applying the boundary conditions,

X(0) = 0 ⇒ k3 cosh(0) + k4 sinh(0) = 0 ⇒ k3 + 0 = 0 ⇒ k3 = 0,
X(L) = 0 ⇒ 0 · cosh(√|λ| L) + k4 sinh(√|λ| L) = 0 ⇒ k4 sinh(√|λ| L) = 0.

However, sinh(√|λ| L) ≠ 0 for L > 0, so k4 = k3 = 0: again, the trivial solution. Now


let's try λ > 0. The solution can be written

X(x) = k5 cos(√λ x) + k6 sin(√λ x).  (sturm11)

Applying the boundary conditions for this case,

X(0) = 0 ⇒ k5 cos(0) + k6 sin(0) = 0 ⇒ k5 + 0 = 0 ⇒ k5 = 0,
X(L) = 0 ⇒ 0 · cos(√λ L) + k6 sin(√λ L) = 0 ⇒ k6 sin(√λ L) = 0.

Now, sin(√λ L) = 0 for

√λ L = nπ ⇒ λ = (nπ/L)²,  (n ∈ ℤ⁺).

Therefore, the only nontrivial solutions that satisfy both the ODE and the boundary conditions are the eigenfunctions

Xn(x) = sin(√λn x)  (sturm12a)
      = sin(nπx/L),  (sturm12b)

with corresponding eigenvalues

λn = (nπ/L)².  (sturm13)

Note that, because λ > 0, λ1 is the lowest eigenvalue.

Plotting the eigenfunctions

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: eigenfunctions_example_plot.ipynb
notebook kernel: python3

First, load some Python packages:

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Set L = 1 and compute values for the first four eigenvalues lambda_n and eigenfunctions X_n:

L = 1
x = np.linspace(0, L, 100)
n = np.linspace(1, 4, 4, dtype=int)
lambda_n = (n*np.pi/L)**2
X_n = np.zeros([len(n), len(x)])
for i, n_i in enumerate(n):
    X_n[i] = np.sin(np.sqrt(lambda_n[i])*x)

Plot the eigenfunctions:

for i, n_i in enumerate(n):
    plt.plot(
        x, X_n[i], linewidth=2, label='n = '+str(n_i)
    )
plt.legend()
plt.show()  # display the plot

[Figure: the first four eigenfunctions, n = 1, 2, 3, 4, plotted over x ∈ [0, 1].]

We see that the fourth of the S-L theorems appears true: n - 1 zeros of Xn exist on the open interval (0, 1).


pdeseparation PDE solution by separation of variables

We are now ready to learn one of the most important techniques for solving PDEs: separation of variables. It applies only to linear PDEs, since it will require the principle of superposition. Not all linear PDEs yield to this solution technique, but several important ones do.

The technique includes the following steps.

assume a product solution: Assume the solution can be written as a product solution up: the product of functions of each independent variable.

separate PDE: Substitute up into the PDE and rearrange such that at least one side of the equation has functions of a single independent variable. If this is possible, the PDE is called separable.

set equal to a constant: Each side of the equation depends on different independent variables; therefore, they must each equal the same constant, often called -λ.

repeat separation as needed: If there are more than two independent variables, there will be an ODE in the separated variable and a PDE (with one fewer variable) in the other independent variables. Attempt to separate the PDE until only ODEs remain.

solve each boundary value problem: Solve each boundary value problem ODE,

ignoring the initial conditions for now.

solve the time variable ODE: Solve for the general solution of the time variable ODE, sans initial conditions.

construct the product solution: Multiply the solution in each variable to construct the product solution up. If the boundary value problems were sturm-liouville, the product solution is a family of eigenfunctions from which any function can be constructed via a generalized fourier series.

apply the initial condition: The product solutions individually usually do not meet the initial condition. However, a generalized fourier series of them nearly always does. Superposition tells us a linear combination of solutions to the PDE and boundary conditions is also a solution; the unique series that also satisfies the initial condition is the unique solution to the entire problem.

Example pdeseparation-1: 1D diffusion equation

Problem Statement

Consider the one-dimensional diffusion equation PDEa

∂ₜu(t, x) = k ∂²ₓₓu(t, x),  (separation1)

with real constant k, with dirichlet boundary conditions on the interval x ∈ [0, L],

u(t, 0) = 0,  (separation2a)
u(t, L) = 0,  (separation2b)

and with initial condition

u(0, x) = f(x),  (separation3)

where f is some piecewise continuous function on [0, L].

a. For more on the diffusion or heat equation, see Haberman (Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 2.3), Kreyszig (Advanced Engineering Mathematics, § 12.5), and Strauss (Partial Differential Equations: An Introduction, § 2.3).

Problem Solution

Assume a product solution

First, we assume a product solution of the form up(t, x) = T(t)X(x), where T and X are unknown functions on t > 0 and x ∈ [0, L].

Separate PDE

Second, we substitute the product solution into Equation separation1 and separate variables:

T′X = kTX″ ⇒  (separation4)
T′/(kT) = X″/X.  (separation5)

So it is separable. Note that we chose to group k with T, which was arbitrary but conventional.

Set equal to a constant

Since these two sides depend on different independent variables (t and x), they must equal the same constant, which we call -λ, so we have two ODEs:

T′/(kT) = -λ ⇒ T′ + λkT = 0,  (separation6)
X″/X = -λ ⇒ X″ + λX = 0.  (separation7)

Solve the boundary value problem

The latter of these equations, with the boundary conditions (separation2), is precisely the same sturm-liouville boundary value problem from Example pdesturm-1, which had eigenfunctions

Xn(x) = sin(√λn x)  (separation8a)
      = sin(nπx/L),  (separation8b)

with corresponding (positive) eigenvalues

λn = (nπ/L)².  (separation9)

Solve the time variable ODE

The time variable ODE is homogeneous and has the familiar general solution

T(t) = c e^(-kλt),  (separation10)

with real constant c. However, the boundary value problem restricted values of λ to λn, so

Tn(t) = c e^(-k(nπ/L)²t).  (separation11)

Construct the product solution

The product solution is

up(t, x) = Tn(t)Xn(x) = c e^(-k(nπ/L)²t) sin(nπx/L).  (separation12)

This is a family of solutions that each satisfy only exotically specific initial conditions.

Apply the initial condition

The initial condition is u(0, x) = f(x). The eigenfunctions of the boundary value problem form a fourier series that satisfies the initial condition on the interval [0, L] if we extend f to be periodic and odd over x (Kreyszig, Advanced Engineering Mathematics, p. 550); we call the extension f*. The odd series synthesis can be written

f*(x) = Σ^∞_{n=1} bn sin(nπx/L),  (separation13)

where the fourier analysis gives

bn = (2/L) ∫₀ᴸ f*(χ) sin(nπχ/L) dχ.  (separation14)

So the complete solution is

u(t, x) = Σ^∞_{n=1} bn e^(-k(nπ/L)²t) sin(nπx/L).  (separation15)

Notice this satisfies the PDE, the boundary conditions, and the initial condition.

Plotting solutions

If we want to plot solutions, we need to specify an initial condition u(0, x) = f*(x) over [0, L]. We can choose anything piecewise continuous, but for simplicity let's let

f(x) = 1,  (x ∈ [0, L]).

The odd periodic extension is an odd square wave. The integral (separation14) gives

bn = (2/(nπ))(1 - cos(nπ))
   = 0, n even;  4/(nπ), n odd.  (separation16)

Now we can write the solution as

u(t, x) = Σ^∞_{n=1, n odd} (4/(nπ)) e^(-k(nπ/L)²t) sin(nπx/L).  (separation17)

Plotting in Python

First, load some Python packages:

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Set k = L = 1 and sum values for the first N terms of the solution:

L = 1
k = 1
N = 100
x = np.linspace(0, L, 300)
t = np.linspace(0, 2*(L/np.pi)**2, 100)
u_n = np.zeros([len(t), len(x)])
for n in range(N):
    n = n + 1  # because index starts at 0
    if n % 2 == 0:  # even
        pass  # already initialized to zeros
    else:  # odd
        u_n += 4/(n*np.pi)*np.outer(
            np.exp(-k*(n*np.pi/L)**2*t),
            np.sin(n*np.pi/L*x)
        )

Let's first plot the initial condition:

p = plt.figure()
plt.plot(x, u_n[0])
plt.xlabel('space $x$')
plt.ylabel('$u(0,x)$')
plt.show()

[Figure: the initial condition u(0, x), a truncated-series approximation of f(x) = 1 over x ∈ [0, 1].]

Now we plot the entire response:

p = plt.figure()
plt.contourf(t, x, u_n.T)
c = plt.colorbar()
c.set_label('$u(t,x)$')
plt.xlabel('time $t$')
plt.ylabel('space $x$')
plt.show()

[Figure: contour plot of u(t, x) over time t ∈ [0, 0.2] and space x ∈ [0, 1], showing the distribution decaying toward zero.]

We see the diffusive action proceeds as we expected.
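As a sanity check, one can verify numerically that the truncated series approximately satisfies the PDE. A sketch (rebuilding the solution array with the same parameters as above; the finite-difference check itself is our addition) compares finite-difference estimates of ∂ₜu and k∂²ₓₓu away from t = 0, where the truncated series still rings:

```python
import numpy as np

# rebuild the truncated series solution (same parameters as above)
L = k = 1
N = 100
x = np.linspace(0, L, 300)
t = np.linspace(0, 2*(L/np.pi)**2, 100)
u = np.zeros([len(t), len(x)])
for n in range(1, N + 1):
    if n % 2:  # odd terms only; even coefficients vanish
        u += 4/(n*np.pi)*np.outer(
            np.exp(-k*(n*np.pi/L)**2*t), np.sin(n*np.pi/L*x))

# finite-difference residual of u_t = k u_xx on interior points,
# at later times where the truncation error has decayed
u_t = np.gradient(u, t, axis=0)
u_xx = np.gradient(np.gradient(u, x, axis=1), x, axis=1)
resid = u_t[50:-1, 5:-5] - k*u_xx[50:-1, 5:-5]
print(np.max(np.abs(resid)))  # small compared to max|u| ~ 1
```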

Python code in this section was generated from a Jupyter notebook named pde_separation_example_01.ipynb with a python3 kernel.

8. Kreyszig, Advanced Engineering Mathematics, § 12.9, 12.10.

9. Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 4.5, 7.3.

pdewave The 1D wave equation

The one-dimensional wave equation is the linear PDE

∂²ₜₜu(t, x) = c² ∂²ₓₓu(t, x),  (wave1)

with real constant c. This equation models such phenomena as strings, fluids, sound, and light. It is subject to initial and boundary conditions and can be extended to multiple spatial dimensions. For 2D and 3D examples in rectangular and polar coordinates, see Kreyszig8 and Haberman.9

Example pdewave-1: vibrating string PDE solution by separation of variables

Problem Statement

Consider the one-dimensional wave equation PDE

∂²ₜₜu(t, x) = c² ∂²ₓₓu(t, x),  (wave2)

with real constant c, with dirichlet boundary conditions on the interval x ∈ [0, L],

u(t, 0) = 0 and u(t, L) = 0,  (wave3a)

and with initial conditions (we need two because of the second time-derivative)

u(0, x) = f(x) and ∂ₜu(0, x) = g(x),  (wave4)

where f and g are some piecewise continuous functions on [0, L].

Problem Solution

Assume a product solution

First, we assume a product solution of the form up(t, x) = T(t)X(x), where T and X are unknown functions on t > 0 and x ∈ [0, L].

Separate PDE

Second, we substitute the product solution into Equation wave2 and separate variables:

T″X = c²TX″ ⇒  (wave5)
T″/(c²T) = X″/X.  (wave6)

So it is separable. Note that we chose to group c² with T, which was arbitrary but conventional.

Set equal to a constant

Since these two sides depend on different independent variables (t and x), they must equal the same constant, which we call -λ, so we have two ODEs:

T″/(c²T) = -λ ⇒ T″ + λc²T = 0,  (wave7)
X″/X = -λ ⇒ X″ + λX = 0.  (wave8)

Solve the boundary value problem

The latter of these equations, with the boundary conditions (wave3), is precisely the same sturm-liouville boundary value problem from Example pdesturm-1, which had eigenfunctions

Xn(x) = sin(√λn x)  (wave9a)
      = sin(nπx/L),  (wave9b)

with corresponding (positive) eigenvalues

λn = (nπ/L)².  (wave10)

Solve the time variable ODE

The time variable ODE is homogeneous and, with λ restricted to positive values by the boundary value problem, has the familiar general solution

T(t) = k1 cos(c√λ t) + k2 sin(c√λ t),  (wave11)

with real constants k1 and k2. However, the boundary value problem restricted values of λ to λn, so

Tn(t) = k1 cos(cnπt/L) + k2 sin(cnπt/L).  (wave12)

Construct the product solution

The product solution is

up(t, x) = Tn(t)Xn(x)
         = k1 sin(nπx/L) cos(cnπt/L) + k2 sin(nπx/L) sin(cnπt/L).

This is a family of solutions that each satisfy only exotically specific initial conditions.

Apply the initial conditions

Recall that superposition tells us that any linear combination of the product solution is also a solution. Therefore,

u(t, x) = Σ^∞_{n=1} an sin(nπx/L) cos(cnπt/L) + bn sin(nπx/L) sin(cnπt/L)  (wave13)

is a solution. If an and bn are properly selected to satisfy the initial conditions, Equation wave13 will be the solution to the entire problem. Substituting t = 0 into our potential solution gives

u(0, x) = Σ^∞_{n=1} an sin(nπx/L),  (wave14a)
∂ₜu(t, x)|_{t=0} = Σ^∞_{n=1} bn (cnπ/L) sin(nπx/L).  (wave14b)

Let us extend f and g to be periodic and odd over x; we call the extensions f* and g*. From Equation wave14, the initial conditions are satisfied if

f*(x) = Σ^∞_{n=1} an sin(nπx/L),  (wave15a)
g*(x) = Σ^∞_{n=1} bn (cnπ/L) sin(nπx/L).  (wave15b)

We identify these as two odd fourier syntheses. The corresponding fourier analyses are

an = (2/L) ∫₀ᴸ f*(χ) sin(nπχ/L) dχ,  (wave16a)
bn (cnπ/L) = (2/L) ∫₀ᴸ g*(χ) sin(nπχ/L) dχ.  (wave16b)

So the complete solution is Equation wave13 with components given by Equation wave16. Notice this satisfies the PDE, the boundary conditions, and the initial conditions.

Discussion

It can be shown that this series solution is equivalent to two traveling waves that are interfering (see Habermanᵃ and Kreyszigᵇ). This is convenient because computing the series solution exactly requires an infinite summation. We show in the following section that the approximation by partial summation is still quite good.

Choosing specific initial conditions

If we want to plot solutions, we need to specify initial conditions over [0, L]. Let's model a string being suddenly struck from rest as

f(x) = 0,  (wave17)
g(x) = δ(x - ∆L),  (wave18)

where δ is the dirac delta distribution and ∆ ∈ [0, 1] is a fraction of L representing the location of the string being struck. The odd periodic extension is an odd pulse train. The integrals of (wave16) give

an = 0,  (wave19a)
bn = (2/(cnπ)) ∫₀ᴸ δ(χ - ∆L) sin(nπχ/L) dχ
   = (2/(cnπ)) sin(nπ∆).  (sifting property)

Now we can write the solution as

u(t, x) = Σ^∞_{n=1} (2/(cnπ)) sin(nπ∆) sin(nπx/L) sin(cnπt/L).  (wave20)

Plotting in Python

First, load some Python packages:

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Set c = L = 1 and sum values for the first N terms of the solution for some striking location ∆:

Delta = 0.1  # 0 <= Delta <= 1
L = 1
c = 1
N = 150
t = np.linspace(0, 30*(L/np.pi)**2, 100)
x = np.linspace(0, L, 150)
t_b, x_b = np.meshgrid(t, x)
u_n = np.zeros([len(x), len(t)])
for n in range(N):
    n = n + 1  # because index starts at 0
    u_n += 2/(c*n*np.pi) * \
        np.sin(n*np.pi*Delta) * \
        np.sin(c*n*np.pi/L*t_b) * \
        np.sin(n*np.pi/L*x_b)

Let's first plot some early snapshots of the response:

import seaborn as sns
n_snaps = 7
sns.set_palette(sns.diverging_palette(
    240, 10, n=n_snaps, center='dark'
))

fig, ax = plt.subplots()
it = np.linspace(2, 77, n_snaps, dtype=int)
for i in range(len(it)):
    ax.plot(x, u_n[:, it[i]], label=f't = {t[it[i]]:.3f}')
lgd = ax.legend(
    bbox_to_anchor=(1.05, 1), loc='upper left'
)
plt.xlabel('space $x$')
plt.ylabel('$u(t,x)$')
plt.show()

[Figure: snapshots of u(t, x) over x at seven early times, from t = 0.000 to t ≈ 0.184, showing the struck-string pulse forming.]

Now we plot the entire response:

fig, ax = plt.subplots()
p = ax.contourf(t, x, u_n)
c = fig.colorbar(p, ax=ax)
c.set_label('$u(t,x)$')
plt.xlabel('time $t$')
plt.ylabel('space $x$')
plt.show()

[Figure: contour plot of u(t, x) over time t ∈ [0, 3] and space x ∈ [0, 1].]

We see a wave develop and travel, reflecting and inverting off each boundary.

a. Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 4.4.

b. Kreyszig, Advanced Engineering Mathematics, § 12.2.

Python code in this section was generated from a Jupyter notebook named pde_separation_example_02.ipynb with a python3 kernel.

pdeex Exercises for Chapter 07

Exercise 071 (poltergeist)

The PDE of Example pdeseparation-1 can be used to describe the conduction of heat along a long, thin rod insulated along its length, where u(t, x) represents temperature. The initial and dirichlet boundary conditions in that example would be interpreted as an initial temperature distribution along the bar and fixed temperatures of the ends. Now consider the same PDE,

∂ₜu(t, x) = k ∂²ₓₓu(t, x),  (ex1)

with real constant k, now with neumann boundary conditions on the interval x ∈ [0, L],

∂ₓu|ₓ₌₀ = 0 and ∂ₓu|ₓ₌L = 0,  (ex2a)

and with initial condition

u(0, x) = f(x),  (ex3)

where f is some piecewise continuous function on [0, L]. This represents the complete insulation of the ends of the rod, such that no heat flows from the ends (or from anywhere else).

a. Assume a product solution, separate variables into X(x) and T(t), and set the separation constant to -λ.

b. Solve the boundary value problem for its eigenfunctions Xn and eigenvalues λn.

c. Solve for the general solution of the time variable ODE.

d. Write the product solution and apply the initial condition f(x) by constructing it from a generalized fourier series of the product solution.

e. Let L = k = 1 and f(x) = 100 - (200/L)|x - L/2|, as shown in Figure 071. Compute the solution series components. Plot the sum of the first 50 terms over x and t.

[Figure 071: initial condition for Exercise 071: a triangular temperature profile over x ∈ [0, 1], peaking at 100 at x = L/2.]

Exercise 072 (kathmandu)

Consider the free vibration of a uniform and relatively thin beam, with modulus of elasticity E, second moment of cross-sectional area I, and mass-per-length μ, pinned at each end. The PDE describing this is a version of the euler-bernoulli beam equation for vertical motion u:

∂²ₜₜu(t, x) = -α² ∂⁴ₓₓₓₓu(t, x),  (ex4)

with real constant α defined as

α² = EI/μ.  (ex5)

Pinned supports fix vertical motion, such that we have boundary conditions on the interval x ∈ [0, L]:

u(t, 0) = 0 and u(t, L) = 0.  (ex6a)

Additionally, pinned supports cannot provide a moment, so

∂²ₓₓu|ₓ₌₀ = 0 and ∂²ₓₓu|ₓ₌L = 0.  (ex6b)

Furthermore, consider the initial conditions

u(0, x) = f(x) and ∂ₜu|ₜ₌₀ = 0,  (ex7a)

where f is some piecewise continuous function on [0, L].

opt Partial differential equations Exercises for Chapter 07 p 2

a. Assume a product solution, separate variables into X(x) and T(t), and set the separation constant to −λ.

b. Solve the boundary value problem for its eigenfunctions Xn and eigenvalues λn. Assume real λ > 0 (it's true but tedious to show).

c. Solve for the general solution of the time-variable ODE.

d. Write the product solution and apply the initial conditions by constructing it from a generalized Fourier series of the product solution.

e. Let L = α = 1 and f(x) = sin(10πx/L), as shown in Figure 072. Compute the solution series components. Plot the sum of the first 50 terms over x and t.
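Because this f is itself an eigenfunction, orthogonality leaves a single term. A quick sketch (my illustration, assuming the pinned-pinned eigenfunctions sin(nπx/L) with temporal factor cos(α(nπ/L)²t)):

```python
import numpy as np

# Sketch (not the text's worked solution): for the pinned-pinned beam the
# eigenfunctions are sin(n pi x / L) with temporal factor
# cos(alpha (n pi / L)**2 t). With f(x) = sin(10 pi x / L), orthogonality
# leaves only the n = 10 term of the generalized Fourier series.
L, alpha = 1.0, 1.0

def u(t, x, N=50):
    s = np.zeros_like(x)
    for n in range(1, N + 1):
        b_n = 1.0 if n == 10 else 0.0  # Fourier sine coefficients of f
        s += b_n*np.sin(n*np.pi*x/L)*np.cos(alpha*(n*np.pi/L)**2*t)
    return s

print(u(0.0, np.array([0.05])))  # sin(pi/2) = 1 at the first crest
```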

Figure 072: initial condition for Exercise 072

opt

Optimization


optgrad Gradient descent

Consider a multivariate function f : Rⁿ → R that represents some cost or value. This is called an objective function, and we often want to find an x ∈ Rⁿ that yields f's extremum: minimum or maximum, depending on whichever is desirable.

It is important to note, however, that some functions have no finite extremum, and other functions have multiple. Finding a global extremum is generally difficult; however, many good methods exist for finding a local extremum: an extremum for some region R ⊂ Rⁿ. The method explored here is called gradient descent. It will soon become apparent why it has this name.

Stationary points

Recall from basic calculus that a function f of a single variable has potential local extrema where df(x)/dx = 0. The multivariate version of this for multivariate function f is

∇f = 0.  (grad1)

A value X for which Equation grad1 holds is called a stationary point. However, as in the univariate case, a stationary point may not be a local extremum; in these cases it is called a saddle point.

Consider the Hessian matrix H, with values, for independent variables xᵢ,

Hᵢⱼ = ∂²f/∂xᵢ∂xⱼ.  (grad2)


For a stationary point X, the second partial derivative test tells us if it is a local maximum, local minimum, or saddle point:

minimum: if H(X) is positive definite (all its eigenvalues are positive), X is a local minimum;
maximum: if H(X) is negative definite (all its eigenvalues are negative), X is a local maximum;
saddle: if H(X) is indefinite (it has both positive and negative eigenvalues), X is a saddle point.

These are sometimes called tests for concavity: minima occur where f is convex and maxima where f is concave (i.e., where −f is convex).
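The test can be sketched in a few lines (my illustration, not from the text): compute the Hessian's eigenvalues at the stationary point and check their signs.

```python
import numpy as np

def classify_stationary_point(H):
    """Second partial derivative test via the Hessian's eigenvalues."""
    w = np.linalg.eigvalsh(np.asarray(H, dtype=float))  # H is symmetric
    if np.all(w > 0):
        return "minimum"      # positive definite
    if np.all(w < 0):
        return "maximum"      # negative definite
    if np.any(w > 0) and np.any(w < 0):
        return "saddle"       # indefinite
    return "inconclusive"     # zero eigenvalues: the test fails

# f(x) = x1**2 - x2**2 has H = diag(2, -2) at its stationary point (0, 0)
print(classify_stationary_point(np.diag([2.0, -2.0])))  # saddle
```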

It turns out, however, that solving Equation grad1 directly for stationary points is generally hard. Therefore, we will typically use an iterative technique for estimating them.

The gradient points the way

Although Equation grad1 isn't usually directly useful for computing stationary points, it suggests iterative techniques that are. Several techniques rely on the insight that the gradient points toward stationary points. Recall from Lecture vecsgrad that ∇f is a vector field that points in the direction of greatest increase in f.


Consider starting at some point x0 and wanting to move iteratively closer to a stationary point. So, if one is seeking a maximum of f, then choose x1 to be in the direction of ∇f; if one is seeking a minimum of f, then choose x1 to be opposite the direction of ∇f.

The question becomes: how far α should we go in (or opposite) the direction of the gradient? Surely too-small α will require more iteration, and too-large α will lead to poor convergence or missing minima altogether. This framing of the problem is called line search. There are a few common methods for choosing α, called the step size, some more computationally efficient than others.

Two methods for choosing the step size are described below. Both are framed as minimization methods, but changing the sign of the step turns them into maximization methods.

The classical method

Let

gₖ = ∇f(xₖ),  (grad3)

the gradient at the algorithm's current estimate xₖ of the minimum. The classical method of choosing α is to attempt to solve analytically for

αₖ = argmin_α f(xₖ − αgₖ).  (grad4)

This solution approximates the function f as one varies α. It is approximate because, as α varies, so should x. But even with α as the only variable, Equation grad4 may be


1. Kreyszig, Advanced Engineering Mathematics, § 22.1.
2. Jonathan Barzilai and Jonathan M. Borwein. "Two-Point Step Size Gradient Methods". In: IMA Journal of Numerical Analysis 8.1 (Jan. 1988), pp. 141–148. issn: 0272-4979. doi: 10.1093/imanum/8.1.141. This includes an innovative line search method.

difficult or impossible to solve. However, this is sometimes called the "optimal" choice for α. Here "optimality" refers not to practicality but to ideality. This method is rarely used to solve practical problems.

The algorithm of the classical gradient descent method can be summarized in the pseudocode of Algorithm 1. It is described further in Kreyszig.¹
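A minimal Python sketch of the classical procedure (my own illustration, with scipy's minimize_scalar standing in for the analytical argmin over α):

```python
import numpy as np
from scipy.optimize import minimize_scalar

def classical_minimizer(f, grad_f, x0, T=1e-8, N_max=500):
    """Classical gradient descent with a numerical line search."""
    xk = np.asarray(x0, dtype=float)
    for _ in range(N_max):
        gk = grad_f(xk)
        alpha = minimize_scalar(lambda a: f(xk - a*gk)).x  # line search
        x_next = xk - alpha*gk
        if np.linalg.norm(x_next - xk) <= T:  # threshold T reached
            return x_next
        xk = x_next
    return xk

# a quadratic test function with minimum at (25, -10)
f = lambda x: (x[0] - 25)**2 + 13*(x[1] + 10)**2
g = lambda x: np.array([2*(x[0] - 25), 26*(x[1] + 10)])
print(classical_minimizer(f, g, [-50, 40]))
```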

Algorithm 1: Classical gradient descent

1: procedure classical_minimizer(f, x0, T)
2:   while δx > T do  ▷ until threshold T is met
3:     gk ← ∇f(xk)
4:     αk ← argmin_α f(xk − αgk)
5:     xk+1 ← xk − αk gk
6:     δx ← ‖xk+1 − xk‖
7:     k ← k + 1
8:   end while
9:   return xk  ▷ the threshold was reached
10: end procedure

The Barzilai and Borwein method

In practice, several non-classical methods are used for choosing step size α. Most of these construct criteria for step sizes that are too small and too large and prescribe choosing some α that (at least in certain cases) must be in the sweet-spot in between. Barzilai and Borwein² developed such a prescription, which we now present.

Let Δxₖ = xₖ − xₖ₋₁ and Δgₖ = gₖ − gₖ₋₁. This method minimizes ‖Δx − αΔg‖² by choosing

αₖ = (Δxₖ · Δgₖ)/(Δgₖ · Δgₖ).  (grad5)


3. Barzilai and Borwein, "Two-Point Step Size Gradient Methods".

The algorithm of this gradient descent method can be summarized in the pseudocode of Algorithm 2. It is described further in Barzilai and Borwein.³

Algorithm 2: Barzilai and Borwein gradient descent

1: procedure barzilai_minimizer(f, x0, T)
2:   while δx > T do  ▷ until threshold T is met
3:     gk ← ∇f(xk)
4:     Δgk ← gk − gk−1
5:     Δxk ← xk − xk−1
6:     αk ← (Δxk · Δgk)/(Δgk · Δgk)
7:     xk+1 ← xk − αk gk
8:     δx ← ‖xk+1 − xk‖
9:     k ← k + 1
10:  end while
11:  return xk  ▷ the threshold was reached
12: end procedure

Example optgrad-1: Barzilai and Borwein gradient descent

Problem Statement

Consider the functions (a) f1 : R² → R and (b) f2 : R² → R defined as

f1(x) = (x1 − 25)² + 13(x2 + 10)²  (grad6)
f2(x) = ½ x · Ax − b · x  (grad7)

where

A = [ 20  0
       0 10 ]  and  (grad8a)

b = [1 1]ᵀ.  (grad8b)

Use the method of Barzilai and Borweinᵃ, starting at some x0, to find a minimum of each function.

a. Ibid.

Problem Solution

First, load some Python packages:

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex
from tabulate import tabulate

We begin by writing a class gradient_descent_min to perform the gradient descent. This is not optimized for speed.

class gradient_descent_min():
    """A Barzilai and Borwein gradient descent class.

    Inputs:
      f:  Python function of x variables
      x:  list of symbolic variables (e.g., [x1, x2])
      x0: list, numeric initial guess of a min of f
      T:  step size threshold for stopping the descent

    To execute the gradient descent, call the descend method.

    nb: This is only for gradients in cartesian coordinates.
    Further work would be to implement this in multiple or
    generalized coordinates. See the grad method below for
    implementation.
    """

    def __init__(self, f, x, x0, T):
        self.f = f
        self.x = Array(x)
        self.x0 = np.array(x0)
        self.T = T
        self.n = len(x0)  # size of x
        self.g = lambdify(x, self.grad(f, x), "numpy")
        self.xk = np.array(x0)
        self.table = {}

    def descend(self):
        # unpack variables
        f, x, x0, T, g = self.f, self.x, self.x0, self.T, self.g
        # initialize variables
        N = 0
        x_k = x0
        dx = 2*T  # can't be zero
        x_km1 = 0.9*x0 - 0.1  # can't equal x0
        g_km1 = np.array(g(*x_km1))
        N_max = 100  # max iterations
        table_data = [[N, x0, np.array(g(*x0)), 0]]
        while (dx > T and N < N_max) or N < 1:
            N += 1  # increment index
            g_k = np.array(g(*x_k))
            dg_k = g_k - g_km1
            dx_k = x_k - x_km1
            alpha_k = dx_k.dot(dg_k)/dg_k.dot(dg_k)
            x_km1 = x_k  # store
            x_k = x_k - alpha_k*g_k
            table_data.append([N, x_k, g_k, alpha_k])  # save
            self.xk = np.vstack((self.xk, x_k))
            # store other variables
            g_km1 = g_k
            dx = np.linalg.norm(x_k - x_km1)  # check
        self.tabulater(table_data)

    def tabulater(self, table_data):
        np.set_printoptions(precision=2)
        tabulate.LATEX_ESCAPE_RULES = {}
        self.table["python"] = tabulate(
            table_data, headers=["N", "x_k", "g_k", "alpha"],
        )
        self.table["latex"] = tabulate(
            table_data,
            headers=["$N$", "$\\bm{x}_k$", "$\\bm{g}_k$", "$\\alpha$"],
            tablefmt="latex_raw",
        )

    def grad(self, f, x):  # cartesian coords gradient
        return derive_by_array(f(x), x)

First, consider f1:

var("x1, x2")
x = Array([x1, x2])
f1 = lambda x: (x[0] - 25)**2 + 13*(x[1] + 10)**2
gd = gradient_descent_min(f=f1, x=x, x0=[-50, 40], T=1e-8)

Perform the gradient descent:

gd.descend()

Print the interesting variables:

print(gd.table["python"])

N  x_k              g_k                    α
0  [-50  40]        [-150 1300]            0
1  [-43.65 -15.03]  [-150 1300]            0.0423296
2  [-38.36 -10.  ]  [-137.3  -130.74]      0.0384979
3  [-33.11 -10.  ]  [-1.27e+02  1.24e-01]  0.041454
4  [ 24.99 -10.  ]  [-1.16e+02 -9.62e-03]  0.499926
5  [ 25.   -10.05]  [-0.02  0.12]          0.499999
6  [ 25. -10.]      [-1.84e-08 -1.38e+00]  0.0385225
7  [ 25. -10.]      [-1.70e-08  2.19e-03]  0.0384615
8  [ 25. -10.]      [-1.57e-08  0.00e+00]  0.0384615

Now let's lambdify the function f1 so we can plot:

f1_lambda = lambdify((x1, x2), f1(x), "numpy")

Now let's plot a contour plot with the gradient descent overlaid:

fig, ax = plt.subplots()
# contour plot
X1 = np.linspace(-100, 100, 100)
X2 = np.linspace(-50, 50, 100)
X1, X2 = np.meshgrid(X1, X2)
F1 = f1_lambda(X1, X2)
plt.contourf(X1, X2, F1)
# gradient descent plot
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.collections import LineCollection
xX1 = gd.xk[:, 0]
xX2 = gd.xk[:, 1]
points = np.array([xX1, xX2]).T.reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)
lc = LineCollection(segments, cmap=plt.get_cmap("Reds"))
lc.set_array(np.linspace(0, 1, len(xX1)))  # color segs
lc.set_linewidth(3)
ax.autoscale(False)  # avoid the scatter changing lims
ax.add_collection(lc)
ax.scatter(xX1, xX2, zorder=1, marker="o",
    color=plt.cm.Reds(np.linspace(0, 1, len(xX1))),
    edgecolor="none")
plt.show()

(figure: contour plot of f1 with the gradient descent path overlaid)


Now consider f2:

A = Matrix([[10, 0], [0, 20]])
b = Matrix([[1, 1]])
def f2(x):
    X = Array([x]).tomatrix().T
    return (1/2)*X.dot(A*X) - b.dot(X)

gd = gradient_descent_min(f=f2, x=x, x0=[50, -40], T=1e-8)

Perform the gradient descent:

gd.descend()

Print the interesting variables:

print(gd.table["python"])


N   x_k            g_k                    α
0   [ 50 -40]      [ 499 -801]            0
1   [17.58 12.04]  [ 499 -801]            0.0649741
2   [ 8.07 -1.01]  [174.78 239.88]        0.0544221
3   [3.62 0.17]    [ 79.66 -21.22]        0.0558582
4   [ 0.49 -0.05]  [35.16  2.49]          0.0889491
5   [0.1  0.14]    [ 3.89 -1.94]          0.0990201
6   [0.1 0.  ]     [0.04 1.9 ]            0.0750849
7   [0.1  0.05]    [ 0.01 -0.95]          0.050005
8   [0.1  0.05]    [4.74e-03 9.58e-05]    0.0500012
9   [0.1  0.05]    [ 2.37e-03 -2.38e-09]  0.0999186
10  [0.1  0.05]    [1.93e-06 2.37e-09]    0.1
11  [0.1  0.05]    [ 0.00e+00 -2.37e-09]  0.0999997

Now let's lambdify the function f2 so we can plot:

f2_lambda = lambdify((x1, x2), f2(x), "numpy")

Now let's plot a contour plot with the gradient descent overlaid:

fig, ax = plt.subplots()
# contour plot
X1 = np.linspace(-100, 100, 100)
X2 = np.linspace(-50, 50, 100)
X1, X2 = np.meshgrid(X1, X2)
F2 = f2_lambda(X1, X2)
plt.contourf(X1, X2, F2)
# gradient descent plot
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.collections import LineCollection
xX1 = gd.xk[:, 0]
xX2 = gd.xk[:, 1]
points = np.array([xX1, xX2]).T.reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)
lc = LineCollection(segments, cmap=plt.get_cmap("Reds"))
lc.set_array(np.linspace(0, 1, len(xX1)))  # color segs
lc.set_linewidth(3)
ax.autoscale(False)  # avoid the scatter changing lims
ax.add_collection(lc)
ax.scatter(xX1, xX2, zorder=1, marker="o",
    color=plt.cm.Reds(np.linspace(0, 1, len(xX1))),
    edgecolor="none")
plt.show()

(figure: contour plot of f2 with the gradient descent path overlaid)

Python code in this section was generated from a Jupyter notebook named gradient_descent.ipynb with a python3 kernel.


optlin Constrained linear optimization

Consider a linear objective function f : Rⁿ → R with variables xᵢ in vector x and coefficients cᵢ in vector c:

f(x) = c · x  (lin1)

subject to the linear constraints—restrictions on xᵢ—

Ax ≤ a,  (lin2a)
Bx = b, and  (lin2b)
l ≤ x ≤ u,  (lin2c)

where A and B are constant matrices and a, b, l, u are n-vectors. This is one formulation of what is called a linear programming problem. Usually we want to maximize f over the constraints. Such problems frequently arise throughout engineering, for instance in manufacturing, transportation, operations, etc. They are called constrained because there are constraints on x; they are called linear because the objective function and the constraints are linear.

We call a pair (x, f(x)) for which x satisfies Equation lin2 a feasible solution. Of course, not every feasible solution is optimal: a feasible solution is optimal iff there exists no other feasible solution for which f is greater (assuming we're maximizing). We call the vector subspace of feasible solutions S ⊂ Rⁿ.

Feasible solutions form a polytope

Consider the effect of the constraints. Each of the equalities and inequalities defines a linear hyperplane in Rⁿ (i.e., a linear subspace

of dimension n − 1), either as a boundary of S (inequality) or as a restriction of S to the hyperplane (equality). When joined, these hyperplanes are the boundary of S (equalities restrict S to lower dimension). So we see that each of the boundaries of S is flat, which makes S a polytope (in R², a polygon). What makes this especially interesting is that polytopes have vertices where the hyperplanes intersect. Solutions at the vertices are called basic feasible solutions.

Only the vertices matter

Our objective function f is linear, so for some constant h, f(x) = h defines a level set that is itself a hyperplane H in Rⁿ. If this hyperplane intersects S at a point x, (x, f(x) = h) is the corresponding solution. There are three possibilities when H intersects S:

1. H ∩ S is a vertex of S,
2. H ∩ S is a boundary hyperplane of S, or
3. H ∩ S slices through the interior of S.

However, this third option implies that there exists a level set G, corresponding to f(x) = g, such that G intersects S and g > h, so solutions on H ∩ S are not optimal. (We have not proven this, but it may be clear from our progression.) We conclude that either the first or second case must be true for optimal solutions. And notice that, in both cases, a (potentially optimal) solution occurs at at least one vertex. The key insight, then, is that an optimal solution occurs at a vertex of S.


This means we don't need to search all of S, or even its boundary: we need only search the vertices. Helpful as this is, it still leaves a number of potentially optimal solutions that grows combinatorially with the number of constraints—usually still too many to search in a naïve way. In Lecture optsimplex this is mitigated by introducing a powerful searching method.
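The vertex insight can be illustrated by brute force (a small hypothetical 2-D problem of my own, not from the text): intersect every pair of constraint boundaries, keep the feasible intersections, and take the vertex with the greatest f.

```python
import itertools
import numpy as np

# Hypothetical LP: maximize c.x subject to Ax <= a
# (x >= 0 is encoded as two extra inequality rows).
c = np.array([5, 2])
A = np.array([[4, 1], [1, 3], [-1, 0], [0, -1]], dtype=float)
a = np.array([40, 35, 0, 0], dtype=float)

vertices = []
for i, j in itertools.combinations(range(len(a)), 2):
    M, rhs = A[[i, j]], a[[i, j]]
    if abs(np.linalg.det(M)) > 1e-12:     # the two boundaries intersect
        v = np.linalg.solve(M, rhs)       # candidate vertex
        if np.all(A @ v <= a + 1e-9):     # feasible?
            vertices.append(v)

best = max(vertices, key=lambda v: c @ v)
print(best, c @ best)
```

This enumerates all pairs of boundary hyperplanes, which is exactly the combinatorial blow-up the simplex algorithm avoids.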


4. Another Python package, pulp (PuLP), is probably more popular for linear programming; however, we choose scipy.optimize because it has applications beyond linear programming.

optsimplex The simplex algorithm

The simplex algorithm (or "method") is an iterative technique for finding an optimal solution of the linear programming problem of Equations lin1 and lin2. The details of the algorithm are somewhat involved, but the basic idea is to start at a vertex of the feasible solution space S and traverse an edge of the polytope that leads to another vertex with a greater value of f. Then repeat this process until there is no neighboring vertex with a greater value of f, at which point the solution is guaranteed to be optimal.

Rather than present the details of the algorithm, we choose to show an example using Python. There have been some improvements on the original algorithm that have been implemented into many standard software packages, including Python's scipy package (Pauli Virtanen et al. "SciPy 1.0—Fundamental Algorithms for Scientific Computing in Python". In: arXiv e-prints, arXiv:1907.10121 [July 2019]), module scipy.optimize.⁴

Example optsimplex-1: simplex method using scipy.optimize

Problem Statement

Maximize the objective function

f(x) = c · x  (simplex1a)

for x ∈ R² and

c = [5 2]ᵀ,  (simplex1b)


subject to constraints

0 ≤ x1 ≤ 10,  (simplex2a)
−5 ≤ x2 ≤ 15,  (simplex2b)
4x1 + x2 ≤ 40,  (simplex2c)
x1 + 3x2 ≤ 35,  (simplex2d)
−8x1 − x2 ≥ −75.  (simplex2e)

Problem Solution

First, load some Python packages:

from scipy.optimize import linprog
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Encoding the problem

Before we can use linprog, we must first encode Equations simplex1 and simplex2 into a form linprog will recognize. We begin with f, which we can write as c · x with the coefficients of c as follows:

c = [-5, -2]  # negative to find max

We've negated each constant because linprog minimizes f and we want to maximize f. Now let's encode the inequality constraints. We will write the left-hand-side coefficients in the matrix A and the


right-hand-side values in vector a, such that

Ax ≤ a.  (simplex3)

Notice that one of our constraint inequalities is ≥ instead of ≤. We can flip this by multiplying the inequality by −1. We use simple lists to encode A and a:

A = [
    [4, 1],
    [1, 3],
    [8, 1],
]
a = [40, 35, 75]

Now we need to define the lower l and upper u bounds of x. The function linprog expects these to be in a single list of lower- and upper-bounds of each xᵢ:

lu = [
    (0, 10),
    (-5, 15),
]

We want to keep track of each step linprog takes. We can access these by defining a function callback to be passed to linprog:

x = []  # for storing the steps
def callback(res):  # called at each step
    global x
    print(f"nit = {res.nit}, x_k = {res.x}")
    x.append(res.x.copy())  # store

Now we need to call linprog. We don't have any equality constraints, so we need only use the keyword arguments A_ub=A and b_ub=a. For demonstration purposes, we tell it to use the simplex method, which is not as good as its other methods, which use better algorithms based on the simplex:

res = linprog(
    c, A_ub=A, b_ub=a, bounds=lu,
    method="simplex", callback=callback,
)
x = np.array(x)

nit = 0, x_k = [ 0. -5.]
nit = 1, x_k = [10. -5.]
nit = 2, x_k = [8.75 5.  ]
nit = 3, x_k = [7.72727273 9.09090909]
nit = 4, x_k = [7.72727273 9.09090909]
nit = 5, x_k = [7.72727273 9.09090909]
nit = 5, x_k = [7.72727273 9.09090909]

So the optimal solution (x, f(x)) is as follows:


print(f"optimum x: {res.x}")
print(f"optimum f(x): {-res.fun}")

optimum x: [7.72727273 9.09090909]
optimum f(x): 56.81818181818182

The last point was repeated:

1. once because there was no adjacent vertex with greater f(x), and
2. twice because the algorithm calls `callback` twice on the last step.

Plotting

When the solution space is in R², it is helpful to graphically represent the solution space, constraints, and the progression of the algorithm. We begin by defining the inequality lines from A and a over the bounds of x1:

n = len(c)  # number of variables x
m = np.shape(A)[0]  # number of inequality constraints
x2 = np.empty([m, 2])
for i in range(0, m):
    x2[i] = -A[i][0]/A[i][1]*np.array(lu[0]) + a[i]/A[i][1]

Now we plot a contour plot of f over the bounds of x1 and x2 and overlay the inequality constraints and the steps of the algorithm stored in x:

lu_array = np.array(lu)
fig, ax = plt.subplots()
plt.rcParams["lines.linewidth"] = 3
# contour plot
X1 = np.linspace(*lu_array[0], 100)
X2 = np.linspace(*lu_array[1], 100)
X1, X2 = np.meshgrid(X1, X2)
F2 = -c[0]*X1 + -c[1]*X2  # negative because max hack
con = ax.contourf(X1, X2, F2)
cbar = fig.colorbar(con, ax=ax)
cbar.ax.set_ylabel("objective function")
# bounds on x
un = np.array([1, 1])
opts = {"c": "w", "label": None, "linewidth": 6}
plt.plot(lu_array[0], lu_array[1, 0]*un, **opts)
plt.plot(lu_array[0], lu_array[1, 1]*un, **opts)
plt.plot(lu_array[0, 0]*un, lu_array[1], **opts)
plt.plot(lu_array[0, 1]*un, lu_array[1], **opts)
# inequality constraints
for i in range(0, m):
    p, = plt.plot(lu[0], x2[i], c="w")
p.set_label("constraint")
# steps
plt.plot(x[:, 0], x[:, 1], "-o", c="r", clip_on=False,
    zorder=20, label="simplex")
plt.ylim(lu_array[1])
plt.xlabel("$x_1$")
plt.ylabel("$x_2$")
plt.legend()
plt.show()


(figure: contour plot of the objective function over the bounds of x1 and x2, with the constraint lines and simplex steps overlaid)

Python code in this section was generated from a Jupyter notebook named simplex_linear_programming.ipynb with a python3 kernel.


5. Barzilai and Borwein, "Two-Point Step Size Gradient Methods".

optex Exercises for Chapter 08

Exercise 081 (cummerbund)

Consider the functions (a) f1 : R² → R and (b) f2 : R² → R defined as

f1(x) = 4(x1 − 16)² + (x2 + 64)² + x1 sin² x1  (ex1)
f2(x) = ½ x · Ax − b · x  (ex2)

where

A = [ 5  0
      0 15 ]  and  (ex3a)

b = [−2 1]ᵀ.  (ex3b)

Use the method of Barzilai and Borwein,⁵ starting at some x0, to find a minimum of each function.

Exercise 082 (melty)

Maximize the objective function

f(x) = c · x  (ex4a)

for x ∈ R³ and

c = [3 −8 1]ᵀ,  (ex4b)

subject to constraints

0 ≤ x1 ≤ 20,  (ex5a)
−5 ≤ x2 ≤ 0,  (ex5b)
5 ≤ x3 ≤ 17,  (ex5c)
x1 + 4x2 ≤ 50,  (ex5d)
2x1 + x3 ≤ 43,  (ex5e)
−4x1 + x2 − 5x3 ≥ −99.  (ex5f)

1. As is customary, we frequently say "system" when we mean "mathematical system model". Recall that multiple models may be used for any given physical system, depending on what one wants to know.

nlin

Nonlinear analysis

1. The ubiquity of near-linear systems and the tools we have for analyses thereof can sometimes give the impression that nonlinear systems are exotic or even downright flamboyant. However, a great many systems¹ important for a mechanical engineer are frequently hopelessly nonlinear. Here are some examples of such systems:

• a robot arm,
• viscous fluid flow (usually modelled by the Navier-Stokes equations),
• ________,
• anything that "fills up" or "saturates",
• nonlinear optics,
• Einstein's field equations (gravitation in general relativity),
• heat radiation and nonlinear heat conduction,
• fracture mechanics,
• ________.


2. S. H. Strogatz and M. Dichter. Nonlinear Dynamics and Chaos. Second. Studies in Nonlinearity. Avalon Publishing, 2016. isbn: 9780813350844.
3. W. Richard Kolk and Robert A. Lerman. Nonlinear System Dynamics. 1st ed. Springer US, 1993. isbn: 978-1-4684-6496-2.

2. Lest we think this is merely an inconvenience, we should keep in mind that it is actually the nonlinearity that makes many phenomena useful. For instance, the ________ depends on the nonlinearity of its optics. Similarly, transistors, and the digital circuits made thereby (including the microprocessor), wouldn't function if their physics were linear.

3. In this chapter, we will see some ways to formulate, characterize, and simulate nonlinear systems. Purely ________ are few for nonlinear systems. Most are beyond the scope of this text, but we describe a few, mostly in Lecture nlinchar. Simulation via numerical integration of nonlinear dynamical equations is the most accessible technique, so it is introduced.

4. We skip a discussion of linearization; of course, if this is an option, it is preferable. Instead, we focus on the ________.

5. For a good introduction to nonlinear dynamics, see Strogatz and Dichter.² A more engineer-oriented introduction is Kolk and Lerman.³

4. Strogatz and Dichter, Nonlinear Dynamics and Chaos.

nlinss Nonlinear state-space models

1. A state-space model has the general form

dx/dt = f(x, u, t)  (ss1a)
y = g(x, u, t),  (ss1b)

where f and g are vector-valued functions that depend on the system. Nonlinear state-space models are those for which f is a nonlinear function of either x or u. For instance, a state variable x1 might appear as x1², or two state variables might combine as x1 x2, or an input u1 might enter the equations as log u1.

Autonomous and nonautonomous systems

2. An autonomous system is one for which f = f(x), with neither time nor input appearing explicitly. A nonautonomous system is one for which either t or u does appear explicitly in f. It turns out that we can always write nonautonomous systems as autonomous by substituting in u(t) and introducing an extra state variable for t.⁴
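This standard construction can be sketched as follows (my illustration, not the text's): augment the state with x_{n+1} = t, so that

```latex
\frac{d}{dt}
\begin{bmatrix} \mathbf{x} \\ x_{n+1} \end{bmatrix}
=
\begin{bmatrix}
  \mathbf{f}\bigl(\mathbf{x},\, \mathbf{u}(x_{n+1}),\, x_{n+1}\bigr) \\
  1
\end{bmatrix},
\qquad x_{n+1}(0) = 0,
```

which has the autonomous form dx/dt = f(x), since neither t nor u appears explicitly on the right-hand side.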

3. Therefore, without loss of generality, we will focus on ways of analyzing autonomous systems.

Equilibrium

4. An equilibrium state (also called a stationary point) x̄ is one for which dx/dt = 0. In most cases, this occurs only when the input u is a constant ū and, for time-varying systems, at a given time t. For autonomous systems, equilibrium occurs when the following holds:

f(x̄) = 0.  (ss2)

This is a system of nonlinear algebraic equations, which can be challenging to solve for x̄. However, frequently several solutions—that is, equilibrium states—do exist.
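For small systems, sympy can solve the equilibrium equations symbolically. A sketch with a hypothetical damped-pendulum state equation (my example, not from the text):

```python
import sympy as sp

# Hypothetical autonomous system: x1' = x2, x2' = -sin(x1) - x2.
x1, x2 = sp.symbols("x1 x2", real=True)
f = sp.Matrix([x2, -sp.sin(x1) - x2])

# equilibria satisfy f(x) = 0
equilibria = sp.solve(f, [x1, x2])
print(equilibria)
```

Here two equilibrium states exist on the principal branch: the hanging position (0, 0) and the inverted position (π, 0), illustrating that several solutions of f(x̄) = 0 commonly coexist.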


nlinchar Nonlinear system characteristics

1. Characterizing nonlinear systems can be challenging without the tools developed for linear system characterization. However, there are ways of characterizing nonlinear systems, and we'll here explore a few. (See Equation ss1a.)

Those in-common with linear systems

2. As with linear systems, the system order is either the number of state-variables required to describe the system or, equivalently, the highest-order derivative in a single scalar differential equation describing the system.

3. Similarly, nonlinear systems can have state variables that depend on time alone or those that also depend on space (or some other independent variable). The former lead to ordinary differential equations (ODEs) and the latter to partial differential equations (PDEs).

4. Equilibrium was already considered in Lecture nlinss.

Stability

5. In terms of system performance, perhaps no other criterion is as important as stability.


5. William L. Brogan. Modern Control Theory. Third. Prentice Hall, 1991, Ch. 10.
6. A. Choukchou-Braham et al. Analysis and Control of Underactuated Mechanical Systems. SpringerLink: Bücher. Springer International Publishing, 2013. isbn: 9783319026367, App. A.

Definition nlinchar1: Stability

If x is perturbed from an equilibrium state x̄, the response x(t) can

1. asymptotically return to x̄ (asymptotically stable),
2. diverge from x̄ (unstable), or
3. remain perturbed or oscillate about x̄ with a constant amplitude (marginally stable).

Notice that this definition is actually local: stability in the neighborhood of one equilibrium may not be the same as in the neighborhood of another.

6. Other than nonlinear systems' lack of linear systems' eigenvalues, poles, and roots of the characteristic equation from which to compute it, the primary difference between the stability of linear and nonlinear systems is that nonlinear system stability is often difficult to establish. Using a linear system's eigenvalues, it is straightforward to establish stable, unstable, and marginally stable subspaces of state-space (via transforming to an eigenvector basis). For nonlinear systems, no such method exists. However, we are not without tools to explore nonlinear system stability. One mathematical tool to consider is Lyapunov stability theory, which is beyond the scope of this course but has good treatments in⁵ and.⁶

Qualities of equilibria

7. Equilibria (i.e., stationary points) come in a variety of qualities. It is instructive to consider the first-order differential equation in state variable x with real constant r:

x′ = rx − x³.  (char1)

If we plot x′ versus x for different values of r, we obtain the plots of Figure 091.

Figure 091: plots of x′ versus x for Equation char1: (a) r < 0, (b) r = 0, (c) r > 0

8. By definition, equilibria occur when x′ = 0, so the x-axis crossings of Figure 091 are equilibria. The blue arrows on the x-axis show the direction of state change x′ quantified by the plots. For both (a) and (b), only one equilibrium exists: x̄ = 0. Note that the blue arrows in both plots point toward the equilibrium. In such cases—that is, when a neighborhood exists around an equilibrium for which state changes point toward the equilibrium—the equilibrium is called an attractor or sink. Note that attractors are stable.

9. Now consider (c) of Figure 091. When r > 0, three equilibria emerge. This change of the number of equilibria with the changing of a parameter is called a bifurcation.


A plot of bifurcations versus the parameter is called a bifurcation diagram. The x̄ = 0 equilibrium now has arrows that point away from it. Such an equilibrium is called a repeller or source and is unstable. The other two equilibria here are (stable) attractors. Consider a very small initial condition x(0) = ε. If ε > 0, the repeller pushes away x and the positive attractor pulls x to itself. Conversely, if ε < 0, the repeller again pushes away x and the negative attractor pulls x to itself.

10. Another type of equilibrium is called the saddle: one which acts as an attractor along some lines and as a repeller along others. We will see this type in the following example.

Example nlinchar-1: Saddle bifurcation

Problem Statement

Consider the dynamical equation

x′ = x² + r,  (char2)

with r a real constant. Sketch x′ vs. x for negative, zero, and positive r. Identify and classify each of the equilibria.
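The computation behind such a classification can be sketched as follows (my illustration, not the text's worked solution): find the real equilibria of x′ = x² + r for each sign of r and check the sign of dx′/dx at each one.

```python
import sympy as sp

x, r = sp.symbols("x r", real=True)
xprime = x**2 + r

for rval in (-1, 0, 1):
    eq = sp.solve(xprime.subs(r, rval), x)  # real equilibria
    slopes = [sp.diff(xprime, x).subs({r: rval, x: e}) for e in eq]
    # slope < 0: attractor; slope > 0: repeller; slope = 0: inconclusive
    print(rval, eq, slopes)
```

For r < 0 there are two equilibria (one attractor, one repeller); at r = 0 they merge into one; for r > 0 no real equilibria remain.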


nlinsim Nonlinear system simulation


nlinpysim Simulating nonlinear systems in Python

Example nlinpysim-1: a nonlinear unicycle

Problem Statement

asdf

Problem Solution

First load some Python packages

import numpy as npimport sympy as spfrom scipyintegrate import solve_ivpimport matplotlibpyplot as pltfrom IPythondisplay import display Markdown Latex

The state equation can be encoded via the

following function f

def f(t x u c)dxdt = [

x[3]npcos(x[2])x[3]npsin(x[2])x[4]1c[0] u(t)[0]1c[1] u(t)[1]

]return dxdt

The input function u must also be defined:

def u(t):
    return [
        15*(1 + np.cos(t)),
        25*np.sin(3*t),
    ]

# Define time spans, initial values, and constants
tspan = np.linspace(0, 50, 300)
xinit = [0, 0, 0, 0, 0]
mass = 10
inertia = 10
c = [mass, inertia]

# Solve differential equation
sol = solve_ivp(
    lambda t, x: f(t, x, u, c),
    [tspan[0], tspan[-1]],
    xinit,
    t_eval=tspan,
)

Letrsquos first plot the trajectory and

instantaneous velocity

Figure 09.2: the unicycle's trajectory in the x-y plane, with arrows showing instantaneous velocity

xp = sol.y[3]*np.cos(sol.y[2])
yp = sol.y[3]*np.sin(sol.y[2])
p = plt.figure()
plt.plot(sol.y[0], sol.y[1])
plt.quiver(sol.y[0], sol.y[1], xp, yp)
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.show()

Python code in this section was generated from a Jupyter notebook named horcrux.ipynb with a python3 kernel.

Nonlinear analysis Exercises for Chapter 09 p 1

nlinex Exercises for Chapter 09

A

Distribution tables

A Distribution tables A01 Gaussian distribution table p 1

A01 Gaussian distribution table

Below are plots of the Gaussian probability density function f and cumulative distribution function Φ. Below them is Table A.1 of CDF values.

Figure A.1: the Gaussian PDF and CDF for z-scores: (a) the Gaussian PDF f(z), with the area Φ(zb) shaded up to the interval bound zb; (b) the Gaussian CDF Φ(zb)

Table A.1: z-score table, Φ(zb) = P(z ∈ (−∞, zb])

zb     .00     .01     .02     .03     .04     .05     .06     .07     .08     .09
-3.4  0.0003  0.0003  0.0003  0.0003  0.0003  0.0003  0.0003  0.0003  0.0003  0.0002
-3.3  0.0005  0.0005  0.0005  0.0004  0.0004  0.0004  0.0004  0.0004  0.0004  0.0003
-3.2  0.0007  0.0007  0.0006  0.0006  0.0006  0.0006  0.0006  0.0005  0.0005  0.0005
-3.1  0.0010  0.0009  0.0009  0.0009  0.0008  0.0008  0.0008  0.0008  0.0007  0.0007
-3.0  0.0013  0.0013  0.0013  0.0012  0.0012  0.0011  0.0011  0.0011  0.0010  0.0010
-2.9  0.0019  0.0018  0.0018  0.0017  0.0016  0.0016  0.0015  0.0015  0.0014  0.0014
-2.8  0.0026  0.0025  0.0024  0.0023  0.0023  0.0022  0.0021  0.0021  0.0020  0.0019
-2.7  0.0035  0.0034  0.0033  0.0032  0.0031  0.0030  0.0029  0.0028  0.0027  0.0026
-2.6  0.0047  0.0045  0.0044  0.0043  0.0041  0.0040  0.0039  0.0038  0.0037  0.0036
-2.5  0.0062  0.0060  0.0059  0.0057  0.0055  0.0054  0.0052  0.0051  0.0049  0.0048
-2.4  0.0082  0.0080  0.0078  0.0075  0.0073  0.0071  0.0069  0.0068  0.0066  0.0064
-2.3  0.0107  0.0104  0.0102  0.0099  0.0096  0.0094  0.0091  0.0089  0.0087  0.0084
-2.2  0.0139  0.0136  0.0132  0.0129  0.0125  0.0122  0.0119  0.0116  0.0113  0.0110
-2.1  0.0179  0.0174  0.0170  0.0166  0.0162  0.0158  0.0154  0.0150  0.0146  0.0143
-2.0  0.0228  0.0222  0.0217  0.0212  0.0207  0.0202  0.0197  0.0192  0.0188  0.0183
-1.9  0.0287  0.0281  0.0274  0.0268  0.0262  0.0256  0.0250  0.0244  0.0239  0.0233
-1.8  0.0359  0.0351  0.0344  0.0336  0.0329  0.0322  0.0314  0.0307  0.0301  0.0294
-1.7  0.0446  0.0436  0.0427  0.0418  0.0409  0.0401  0.0392  0.0384  0.0375  0.0367
-1.6  0.0548  0.0537  0.0526  0.0516  0.0505  0.0495  0.0485  0.0475  0.0465  0.0455
-1.5  0.0668  0.0655  0.0643  0.0630  0.0618  0.0606  0.0594  0.0582  0.0571  0.0559
-1.4  0.0808  0.0793  0.0778  0.0764  0.0749  0.0735  0.0721  0.0708  0.0694  0.0681
-1.3  0.0968  0.0951  0.0934  0.0918  0.0901  0.0885  0.0869  0.0853  0.0838  0.0823
-1.2  0.1151  0.1131  0.1112  0.1093  0.1075  0.1056  0.1038  0.1020  0.1003  0.0985
-1.1  0.1357  0.1335  0.1314  0.1292  0.1271  0.1251  0.1230  0.1210  0.1190  0.1170
-1.0  0.1587  0.1562  0.1539  0.1515  0.1492  0.1469  0.1446  0.1423  0.1401  0.1379
-0.9  0.1841  0.1814  0.1788  0.1762  0.1736  0.1711  0.1685  0.1660  0.1635  0.1611
-0.8  0.2119  0.2090  0.2061  0.2033  0.2005  0.1977  0.1949  0.1922  0.1894  0.1867
-0.7  0.2420  0.2389  0.2358  0.2327  0.2296  0.2266  0.2236  0.2206  0.2177  0.2148
-0.6  0.2743  0.2709  0.2676  0.2643  0.2611  0.2578  0.2546  0.2514  0.2483  0.2451
-0.5  0.3085  0.3050  0.3015  0.2981  0.2946  0.2912  0.2877  0.2843  0.2810  0.2776
-0.4  0.3446  0.3409  0.3372  0.3336  0.3300  0.3264  0.3228  0.3192  0.3156  0.3121
-0.3  0.3821  0.3783  0.3745  0.3707  0.3669  0.3632  0.3594  0.3557  0.3520  0.3483
-0.2  0.4207  0.4168  0.4129  0.4090  0.4052  0.4013  0.3974  0.3936  0.3897  0.3859
-0.1  0.4602  0.4562  0.4522  0.4483  0.4443  0.4404  0.4364  0.4325  0.4286  0.4247
-0.0  0.5000  0.4960  0.4920  0.4880  0.4840  0.4801  0.4761  0.4721  0.4681  0.4641
 0.0  0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
 0.1  0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
 0.2  0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
 0.3  0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
 0.4  0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
 0.5  0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
 0.6  0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
 0.7  0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
 0.8  0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
 0.9  0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
 1.0  0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
 1.1  0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
 1.2  0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
 1.3  0.9032  0.9049  0.9066  0.9082  0.9099  0.9115  0.9131  0.9147  0.9162  0.9177
 1.4  0.9192  0.9207  0.9222  0.9236  0.9251  0.9265  0.9279  0.9292  0.9306  0.9319
 1.5  0.9332  0.9345  0.9357  0.9370  0.9382  0.9394  0.9406  0.9418  0.9429  0.9441
 1.6  0.9452  0.9463  0.9474  0.9484  0.9495  0.9505  0.9515  0.9525  0.9535  0.9545
 1.7  0.9554  0.9564  0.9573  0.9582  0.9591  0.9599  0.9608  0.9616  0.9625  0.9633
 1.8  0.9641  0.9649  0.9656  0.9664  0.9671  0.9678  0.9686  0.9693  0.9699  0.9706
 1.9  0.9713  0.9719  0.9726  0.9732  0.9738  0.9744  0.9750  0.9756  0.9761  0.9767
 2.0  0.9772  0.9778  0.9783  0.9788  0.9793  0.9798  0.9803  0.9808  0.9812  0.9817
 2.1  0.9821  0.9826  0.9830  0.9834  0.9838  0.9842  0.9846  0.9850  0.9854  0.9857
 2.2  0.9861  0.9864  0.9868  0.9871  0.9875  0.9878  0.9881  0.9884  0.9887  0.9890
 2.3  0.9893  0.9896  0.9898  0.9901  0.9904  0.9906  0.9909  0.9911  0.9913  0.9916
 2.4  0.9918  0.9920  0.9922  0.9925  0.9927  0.9929  0.9931  0.9932  0.9934  0.9936
 2.5  0.9938  0.9940  0.9941  0.9943  0.9945  0.9946  0.9948  0.9949  0.9951  0.9952
 2.6  0.9953  0.9955  0.9956  0.9957  0.9959  0.9960  0.9961  0.9962  0.9963  0.9964
 2.7  0.9965  0.9966  0.9967  0.9968  0.9969  0.9970  0.9971  0.9972  0.9973  0.9974
 2.8  0.9974  0.9975  0.9976  0.9977  0.9977  0.9978  0.9979  0.9979  0.9980  0.9981
 2.9  0.9981  0.9982  0.9982  0.9983  0.9984  0.9984  0.9985  0.9985  0.9986  0.9986
 3.0  0.9987  0.9987  0.9987  0.9988  0.9988  0.9989  0.9989  0.9989  0.9990  0.9990
 3.1  0.9990  0.9991  0.9991  0.9991  0.9992  0.9992  0.9992  0.9992  0.9993  0.9993
 3.2  0.9993  0.9993  0.9994  0.9994  0.9994  0.9994  0.9994  0.9995  0.9995  0.9995
 3.3  0.9995  0.9995  0.9995  0.9996  0.9996  0.9996  0.9996  0.9996  0.9996  0.9997
 3.4  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9998
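Individual table entries can be reproduced with scipy (assuming it is available), since each entry is just the standard Gaussian CDF evaluated at zb.

```python
from scipy.stats import norm

# Each table entry is Phi(z_b) = P(z <= z_b) for a standard Gaussian.
print(round(norm.cdf(-3.4), 4))  # 0.0003
print(round(norm.cdf(0.0), 4))   # 0.5
print(round(norm.cdf(1.96), 4))  # 0.975
```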

A Distribution tables Student's t-distribution table p 1

A02 Student's t-distribution table

Table A.2: two-tail inverse Student's t-distribution table

                                    percent probability
ν     60.0   66.7   75.0   80.0   87.5   90.0   95.0    97.5    99.0    99.5    99.9
1    0.325  0.577  1.000  1.376  2.414  3.078  6.314  12.706  31.821  63.657  318.31
2    0.289  0.500  0.816  1.061  1.604  1.886  2.920   4.303   6.965   9.925  22.327
3    0.277  0.476  0.765  0.978  1.423  1.638  2.353   3.182   4.541   5.841  10.215
4    0.271  0.464  0.741  0.941  1.344  1.533  2.132   2.776   3.747   4.604   7.173
5    0.267  0.457  0.727  0.920  1.301  1.476  2.015   2.571   3.365   4.032   5.893
6    0.265  0.453  0.718  0.906  1.273  1.440  1.943   2.447   3.143   3.707   5.208
7    0.263  0.449  0.711  0.896  1.254  1.415  1.895   2.365   2.998   3.499   4.785
8    0.262  0.447  0.706  0.889  1.240  1.397  1.860   2.306   2.896   3.355   4.501
9    0.261  0.445  0.703  0.883  1.230  1.383  1.833   2.262   2.821   3.250   4.297
10   0.260  0.444  0.700  0.879  1.221  1.372  1.812   2.228   2.764   3.169   4.144
11   0.260  0.443  0.697  0.876  1.214  1.363  1.796   2.201   2.718   3.106   4.025
12   0.259  0.442  0.695  0.873  1.209  1.356  1.782   2.179   2.681   3.055   3.930
13   0.259  0.441  0.694  0.870  1.204  1.350  1.771   2.160   2.650   3.012   3.852
14   0.258  0.440  0.692  0.868  1.200  1.345  1.761   2.145   2.624   2.977   3.787
15   0.258  0.439  0.691  0.866  1.197  1.341  1.753   2.131   2.602   2.947   3.733
16   0.258  0.439  0.690  0.865  1.194  1.337  1.746   2.120   2.583   2.921   3.686
17   0.257  0.438  0.689  0.863  1.191  1.333  1.740   2.110   2.567   2.898   3.646
18   0.257  0.438  0.688  0.862  1.189  1.330  1.734   2.101   2.552   2.878   3.610
19   0.257  0.438  0.688  0.861  1.187  1.328  1.729   2.093   2.539   2.861   3.579
20   0.257  0.437  0.687  0.860  1.185  1.325  1.725   2.086   2.528   2.845   3.552
21   0.257  0.437  0.686  0.859  1.183  1.323  1.721   2.080   2.518   2.831   3.527
22   0.256  0.437  0.686  0.858  1.182  1.321  1.717   2.074   2.508   2.819   3.505
23   0.256  0.436  0.685  0.858  1.180  1.319  1.714   2.069   2.500   2.807   3.485
24   0.256  0.436  0.685  0.857  1.179  1.318  1.711   2.064   2.492   2.797   3.467
25   0.256  0.436  0.684  0.856  1.178  1.316  1.708   2.060   2.485   2.787   3.450
26   0.256  0.436  0.684  0.856  1.177  1.315  1.706   2.056   2.479   2.779   3.435
27   0.256  0.435  0.684  0.855  1.176  1.314  1.703   2.052   2.473   2.771   3.421
28   0.256  0.435  0.683  0.855  1.175  1.313  1.701   2.048   2.467   2.763   3.408
29   0.256  0.435  0.683  0.854  1.174  1.311  1.699   2.045   2.462   2.756   3.396
30   0.256  0.435  0.683  0.854  1.173  1.310  1.697   2.042   2.457   2.750   3.385
35   0.255  0.434  0.682  0.852  1.170  1.306  1.690   2.030   2.438   2.724   3.340
40   0.255  0.434  0.681  0.851  1.167  1.303  1.684   2.021   2.423   2.704   3.307
45   0.255  0.434  0.680  0.850  1.165  1.301  1.679   2.014   2.412   2.690   3.281
50   0.255  0.433  0.679  0.849  1.164  1.299  1.676   2.009   2.403   2.678   3.261
55   0.255  0.433  0.679  0.848  1.163  1.297  1.673   2.004   2.396   2.668   3.245
60   0.254  0.433  0.679  0.848  1.162  1.296  1.671   2.000   2.390   2.660   3.232
∞    0.253  0.431  0.674  0.842  1.150  1.282  1.645   1.960   2.326   2.576   3.090
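Entries of the t-distribution table can likewise be reproduced with scipy: each entry is the t value at which the CDF of Student's t-distribution with ν degrees of freedom equals the column's percent probability (expressed as a decimal).

```python
from scipy.stats import t

# nu = 10 row: columns 95.0 and 97.5
print(round(t.ppf(0.95, 10), 3))      # 1.812
print(round(t.ppf(0.975, 10), 3))     # 2.228
# very large nu approaches the Gaussian limit in the last row
print(round(t.ppf(0.975, 10**9), 3))  # about 1.96
```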

B

Fourier and Laplace tables

B Fourier and Laplace tables B01 Laplace transforms p 1

B01 Laplace transforms

Table B.1 is a table with functions of time f(t) on the left and corresponding Laplace transforms F(s) on the right. Where applicable, s = σ + jω is the Laplace transform variable, T is the time-domain period, ω0 = 2π/T is the corresponding angular frequency, j = √−1, a ∈ R+, and b, t0 ∈ R are constants.

Table B.1: Laplace transform identities

function of time t                              function of Laplace s
a1 f1(t) + a2 f2(t)                             a1 F1(s) + a2 F2(s)
f(t − t0)                                       F(s) e^(−t0 s)
f′(t)                                           s F(s) − f(0)
d^n f(t)/dt^n                                   s^n F(s) − s^(n−1) f(0) − s^(n−2) f′(0) − ··· − f^(n−1)(0)
∫₀^t f(τ) dτ                                    (1/s) F(s)
t f(t)                                          −F′(s)
f1(t) ∗ f2(t) = ∫_{−∞}^{∞} f1(τ) f2(t − τ) dτ   F1(s) F2(s)
δ(t)                                            1
us(t)                                           1/s
ur(t)                                           1/s²
t^(n−1)/(n − 1)!                                1/s^n
e^(−at)                                         1/(s + a)
t e^(−at)                                       1/(s + a)²
t^(n−1) e^(−at)/(n − 1)!                        1/(s + a)^n
(e^(at) − e^(bt))/(a − b)                       1/((s − a)(s − b)), a ≠ b
(a e^(at) − b e^(bt))/(a − b)                   s/((s − a)(s − b)), a ≠ b
sin ωt                                          ω/(s² + ω²)
cos ωt                                          s/(s² + ω²)
e^(at) sin ωt                                   ω/((s − a)² + ω²)
e^(at) cos ωt                                   (s − a)/((s − a)² + ω²)

B Fourier and Laplace tables B02 Fourier transforms p 1

B02 Fourier transforms

Table B.2 is a table with functions of time f(t) on the left and corresponding Fourier transforms F(ω) on the right. Where applicable, T is the time-domain period, ω0 = 2π/T is the corresponding angular frequency, j = √−1, a ∈ R+, and b, t0 ∈ R are constants. Furthermore, fe and fo are even and odd functions of time, respectively, and it can be shown that any function f can be written as the sum f(t) = fe(t) + fo(t) (Hsu1967).

Table B.2: Fourier transform identities

function of time t                              function of frequency ω
a1 f1(t) + a2 f2(t)                             a1 F1(ω) + a2 F2(ω)
f(at)                                           (1/|a|) F(ω/a)
f(−t)                                           F(−ω)
f(t − t0)                                       F(ω) e^(−jωt0)
f(t) cos ω0t                                    (1/2) F(ω − ω0) + (1/2) F(ω + ω0)
f(t) sin ω0t                                    (1/j2) F(ω − ω0) − (1/j2) F(ω + ω0)
fe(t)                                           Re F(ω)
fo(t)                                           j Im F(ω)
F(t)                                            2π f(−ω)
f′(t)                                           jω F(ω)
d^n f(t)/dt^n                                   (jω)^n F(ω)
∫_{−∞}^{t} f(τ) dτ                              (1/jω) F(ω) + π F(0) δ(ω)
−jt f(t)                                        F′(ω)
(−jt)^n f(t)                                    d^n F(ω)/dω^n
f1(t) ∗ f2(t) = ∫_{−∞}^{∞} f1(τ) f2(t − τ) dτ   F1(ω) F2(ω)
f1(t) f2(t)                                     (1/2π) F1(ω) ∗ F2(ω) = (1/2π) ∫_{−∞}^{∞} F1(α) F2(ω − α) dα
e^(−at) us(t)                                   1/(jω + a)
e^(−a|t|)                                       2a/(a² + ω²)
e^(−at²)                                        √(π/a) e^(−ω²/(4a))
1 for |t| < a/2, else 0                         a sin(aω/2)/(aω/2)
t e^(−at) us(t)                                 1/(a + jω)²
t^(n−1) e^(−at) us(t)/(n − 1)!                  1/(a + jω)^n
1/(a² + t²)                                     (π/a) e^(−a|ω|)
δ(t)                                            1
δ(t − t0)                                       e^(−jωt0)
us(t)                                           πδ(ω) + 1/(jω)
us(t − t0)                                      πδ(ω) + (1/jω) e^(−jωt0)
1                                               2πδ(ω)
t                                               2πj δ′(ω)
t^n                                             2πj^n d^n δ(ω)/dω^n
e^(jω0t)                                        2πδ(ω − ω0)
cos ω0t                                         πδ(ω − ω0) + πδ(ω + ω0)
sin ω0t                                         −jπδ(ω − ω0) + jπδ(ω + ω0)
us(t) cos ω0t                                   jω/(ω0² − ω²) + (π/2) δ(ω − ω0) + (π/2) δ(ω + ω0)
us(t) sin ω0t                                   ω0/(ω0² − ω²) + (π/2j) δ(ω − ω0) − (π/2j) δ(ω + ω0)
t us(t)                                         jπδ′(ω) − 1/ω²
1/t                                             πj − 2πj us(ω)
1/t^n                                           ((−jω)^(n−1)/(n − 1)!) (πj − 2πj us(ω))
sgn t                                           2/(jω)
Σ_{n=−∞}^{∞} δ(t − nT)                          ω0 Σ_{n=−∞}^{∞} δ(ω − nω0)

C

Mathematics reference

C Mathematics reference C01 Quadratic forms p 1

C01 Quadratic forms

The solution to equations of the form ax² + bx + c = 0 is

x = (−b ± √(b² − 4ac))/(2a). (C01.1)

Completing the square

This is accomplished by re-writing the quadratic formula in the form of the left-hand side (LHS) of this equality, which describes factorization:

x² + 2xh + h² = (x + h)². (C01.2)
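Completing the square is exactly how the quadratic formula is derived; a short derivation:

```latex
ax^2 + bx + c = 0
\quad\Rightarrow\quad x^2 + \frac{b}{a}x = -\frac{c}{a}
\quad\Rightarrow\quad \left(x + \frac{b}{2a}\right)^2 = \frac{b^2}{4a^2} - \frac{c}{a} = \frac{b^2 - 4ac}{4a^2}
\quad\Rightarrow\quad x = \frac{-b \pm \sqrt{b^2 - 4ac}}{2a}.
```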

C Mathematics reference C02 Trigonometry p 1

C02 Trigonometry

Triangle identities

With reference to the below figure, the law of sines is

sin α / a = sin β / b = sin γ / c (C02.1)

and the law of cosines is

c² = a² + b² − 2ab cos γ (C02.2a)
b² = a² + c² − 2ac cos β (C02.2b)
a² = c² + b² − 2cb cos α (C02.2c)

(figure: a triangle with sides a, b, c opposite angles α, β, γ)

Reciprocal identities

csc u = 1/sin u (C02.3a)
sec u = 1/cos u (C02.3b)
cot u = 1/tan u (C02.3c)

Pythagorean identities

1 = sin² u + cos² u (C02.4a)
sec² u = 1 + tan² u (C02.4b)
csc² u = 1 + cot² u (C02.4c)


Co-function identities

sin(π/2 − u) = cos u (C02.5a)
cos(π/2 − u) = sin u (C02.5b)
tan(π/2 − u) = cot u (C02.5c)
csc(π/2 − u) = sec u (C02.5d)
sec(π/2 − u) = csc u (C02.5e)
cot(π/2 − u) = tan u (C02.5f)

Even-odd identities

sin(−u) = −sin u (C02.6a)
cos(−u) = cos u (C02.6b)
tan(−u) = −tan u (C02.6c)

Sum-difference formulas (AM or lock-in)

sin(u ± v) = sin u cos v ± cos u sin v (C02.7a)
cos(u ± v) = cos u cos v ∓ sin u sin v (C02.7b)
tan(u ± v) = (tan u ± tan v)/(1 ∓ tan u tan v) (C02.7c)

Double angle formulas

sin(2u) = 2 sin u cos u (C02.8a)
cos(2u) = cos² u − sin² u (C02.8b)
        = 2 cos² u − 1 (C02.8c)
        = 1 − 2 sin² u (C02.8d)
tan(2u) = 2 tan u/(1 − tan² u) (C02.8e)

Power-reducing or half-angle formulas

sin² u = (1 − cos 2u)/2 (C02.9a)
cos² u = (1 + cos 2u)/2 (C02.9b)
tan² u = (1 − cos 2u)/(1 + cos 2u) (C02.9c)


Sum-to-product formulas

sin u + sin v = 2 sin((u + v)/2) cos((u − v)/2) (C02.10a)
sin u − sin v = 2 cos((u + v)/2) sin((u − v)/2) (C02.10b)
cos u + cos v = 2 cos((u + v)/2) cos((u − v)/2) (C02.10c)
cos u − cos v = −2 sin((u + v)/2) sin((u − v)/2) (C02.10d)

Product-to-sum formulas

sin u sin v = (1/2)[cos(u − v) − cos(u + v)] (C02.11a)
cos u cos v = (1/2)[cos(u − v) + cos(u + v)] (C02.11b)
sin u cos v = (1/2)[sin(u + v) + sin(u − v)] (C02.11c)
cos u sin v = (1/2)[sin(u + v) − sin(u − v)] (C02.11d)

Two-to-one formulas

A sin u + B cos u = C sin(u + φ) (C02.12a)
                  = C cos(u + ψ), where (C02.12b)
C = √(A² + B²) (C02.12c)
φ = arctan(B/A) (C02.12d)
ψ = −arctan(A/B) (C02.12e)
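A quick numerical check of the two-to-one formulas (using numpy's quadrant-safe arctan2 for the arctangents):

```python
import numpy as np

# A sin(u) + B cos(u) = C sin(u + phi) = C cos(u + psi)
A, B = 3.0, 4.0
C = np.sqrt(A**2 + B**2)
phi = np.arctan2(B, A)   # arctan(B/A), quadrant-safe
psi = -np.arctan2(A, B)  # -arctan(A/B)
u = np.linspace(0, 2*np.pi, 101)
print(np.allclose(A*np.sin(u) + B*np.cos(u), C*np.sin(u + phi)))  # True
print(np.allclose(A*np.sin(u) + B*np.cos(u), C*np.cos(u + psi)))  # True
```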

C Mathematics reference C03 Matrix inverses p 1


C03 Matrix inverses

This is a guide to inverting 1×1, 2×2, and n×n matrices.

Let A be the 1×1 matrix

A = [a].

The inverse is simply the reciprocal:

A⁻¹ = [1/a].

Let B be the 2×2 matrix

B = [b11 b12; b21 b22].

It can be shown that the inverse follows a simple pattern:

B⁻¹ = (1/det B) [b22 −b12; −b21 b11]
    = (1/(b11 b22 − b12 b21)) [b22 −b12; −b21 b11].

Let C be an n×n matrix. It can be shown that its inverse is

C⁻¹ = (1/det C) adj C,

where adj is the adjoint of C.
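The 2×2 pattern can be checked against numpy's general-purpose inverse; a small sketch with an arbitrary matrix:

```python
import numpy as np

# Check the 2x2 inverse pattern against numpy.linalg.inv
B = np.array([[1.0, 2.0],
              [3.0, 4.0]])
detB = B[0, 0]*B[1, 1] - B[0, 1]*B[1, 0]
Binv = (1/detB)*np.array([[ B[1, 1], -B[0, 1]],
                          [-B[1, 0],  B[0, 0]]])
print(np.allclose(Binv, np.linalg.inv(B)))  # True
print(np.allclose(B @ Binv, np.eye(2)))     # True
```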

C Mathematics reference Laplace transforms p 1

C04 Laplace transforms

The definitions of the one-sided Laplace and inverse Laplace transforms follow.

Definition C04.1: Laplace transforms (one-sided)

Laplace transform L:

L(y(t)) = Y(s) = ∫₀^∞ y(t) e^(−st) dt (C04.1)

Inverse Laplace transform L⁻¹:

L⁻¹(Y(s)) = y(t) = (1/2πj) ∫_{σ−j∞}^{σ+j∞} Y(s) e^(st) ds (C04.2)

See Table B.1 for a list of properties and common transforms.
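sympy (used elsewhere in this text) can compute one-sided Laplace transforms symbolically; for example, the e^(−at) entry of Table B.1:

```python
import sympy as sp

t, s, a = sp.symbols('t s a', positive=True)
# Transform of e^{-a t}; Table B.1 lists 1/(s + a)
F = sp.laplace_transform(sp.exp(-a*t), t, s, noconds=True)
print(F)
```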

D

Complex analysis

D Complex analysis D01 Euler's formulas p 1

D01 Euler's formulas

Euler's formula is our bridge back and forth between trigonometric forms (cos θ and sin θ) and complex exponential form (e^(jθ)):

e^(jθ) = cos θ + j sin θ. (D01.1)

Here are a few useful identities implied by Euler's formula:

e^(−jθ) = cos θ − j sin θ (D01.2a)
cos θ = Re(e^(jθ)) (D01.2b)
      = (1/2)(e^(jθ) + e^(−jθ)) (D01.2c)
sin θ = Im(e^(jθ)) (D01.2d)
      = (1/j2)(e^(jθ) − e^(−jθ)) (D01.2e)
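These identities are easy to spot-check numerically with Python's standard library (the angle is arbitrary):

```python
import cmath
import math

theta = 0.7  # arbitrary test angle
z = cmath.exp(1j*theta)
# Euler's formula and the cos/sin identities it implies
print(abs(z - complex(math.cos(theta), math.sin(theta))) < 1e-12)   # True
print(abs((z + cmath.exp(-1j*theta))/2 - math.cos(theta)) < 1e-12)  # True
print(abs((z - cmath.exp(-1j*theta))/2j - math.sin(theta)) < 1e-12) # True
```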

Bibliography

[1] Robert B Ash Basic Probability Theory

Dover Publications Inc 2008

[2] Joan Bagaria Set Theory In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2019

Metaphysics Research Lab Stanford

University 2019

[3] Maria Baghramian and J Adam

Carter Relativism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Winter 2019

Metaphysics Research Lab Stanford

University 2019

[4] Stephen Barker and Mark Jago Being

Positive About Negative Facts In

Philosophy and Phenomenological

Research 85.1 (2012), pp. 117–138. doi: 10.1111/j.1933-1592.2010.00479.x.

[5] Jonathan Barzilai and Jonathan M

Borwein Two-Point Step Size Gradient

Methods In IMA Journal of Numerical

Analysis 8.1 (Jan. 1988), pp. 141–148. issn: 0272-4979. doi: 10.1093/imanum/8.1.141.

This includes an innovative line search

method

D Complex analysis Bibliography p 3


[6] Anat Biletzki and Anat Matar Ludwig

Wittgenstein In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Summer 2018

Metaphysics Research Lab Stanford

University 2018 An introduction to

Wittgenstein and his thought

[7] Antonio Bove F (Ferruccio) Colombini

and Daniele Del Santo Phase space

analysis of partial differential

equations eng Progress in nonlinear

differential equations and their

applications v 69 Boston Berlin

Birkhäuser, 2006. isbn: 9780817645212.

[8] William L Brogan Modern Control Theory

Third Prentice Hall 1991

[9] Francesco Bullo and Andrew D Lewis

Geometric control of mechanical systems

modeling analysis and design for simple

mechanical control systems Ed by JE

Marsden L Sirovich and M Golubitsky

Springer 2005

[10] Francesco Bullo and Andrew D Lewis

Supplementary Chapters for Geometric

Control of Mechanical Systems1 Jan

2005

[11] A Choukchou-Braham et al Analysis and

Control of Underactuated Mechanical

Systems. SpringerLink Bücher. Springer

International Publishing 2013 isbn

9783319026367

[12] K Ciesielski Set Theory for the

Working Mathematician London

Mathematical Society Student Texts

Cambridge University Press 1997 isbn

9780521594653 A readable introduction

to set theory


[13] Marian David The Correspondence

Theory of Truth In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Fall 2016 Metaphysics

Research Lab Stanford University

2016 A detailed overview of the

correspondence theory of truth

[14] David Dolby Wittgenstein on Truth

In A Companion to Wittgenstein John

Wiley & Sons, Ltd, 2016. Chap. 27, pp. 433–442. isbn: 9781118884607. doi: 10.1002/9781118884607.ch27.

[15] Peter J Eccles An Introduction to

Mathematical Reasoning Numbers

Sets and Functions Cambridge

University Press, 1997. doi: 10.1017/CBO9780511801136. A gentle

introduction to mathematical reasoning

It includes introductory treatments of

set theory and number systems

[16] HB Enderton Elements of Set

Theory Elsevier Science 1977 isbn

9780080570426 A gentle introduction to

set theory and mathematical reasoning — a great place to start.

[17] Richard P Feynman Robert B Leighton

and Matthew Sands The Feynman

Lectures on Physics New Millennium

Perseus Basic Books 2010

[18] Michael Glanzberg Truth In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2018

Metaphysics Research Lab Stanford

University 2018

[19] Hans Johann Glock Truth in the

Tractatus. In Synthese 148.2 (Jan. 2006), pp. 345–368. issn: 1573-0964. doi: 10.1007/s11229-004-6226-2.


[20] Mario Gómez-Torrente. Alfred Tarski.

In The Stanford Encyclopedia of

Philosophy Ed by Edward N Zalta

Spring 2019 Metaphysics Research Lab

Stanford University 2019

[21] Paul Guyer and Rolf-Peter Horstmann

Idealism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Winter 2018

Metaphysics Research Lab Stanford

University 2018

[22] R Haberman Applied Partial Differential

Equations with Fourier Series and

Boundary Value Problems (Classic

Version) Pearson Modern Classics for

Advanced Mathematics Pearson Education

Canada 2018 isbn 9780134995434

[23] GWF Hegel and AV Miller

Phenomenology of Spirit Motilal

Banarsidass 1998 isbn 9788120814738

[24] Wilfrid Hodges Model Theory In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2018

Metaphysics Research Lab Stanford

University 2018

[25] Wilfrid Hodges. Tarski's Truth

Definitions In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Fall 2018 Metaphysics

Research Lab Stanford University 2018

[26] Peter Hylton and Gary Kemp Willard

Van Orman Quine In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Spring 2019 Metaphysics

Research Lab Stanford University 2019

[27] ET Jaynes et al Probability Theory The

Logic of Science Cambridge University

Press 2003 isbn 9780521592710


An excellent and comprehensive

introduction to probability theory

[28] I Kant P Guyer and AW Wood Critique

of Pure Reason The Cambridge Edition

of the Works of Immanuel Kant

Cambridge University Press 1999 isbn

9781107268333

[29] Juliette Kennedy. Kurt Gödel. In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Winter 2018

Metaphysics Research Lab Stanford

University 2018

[30] Drew Khlentzos Challenges to

Metaphysical Realism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Winter 2016

Metaphysics Research Lab Stanford

University 2016

[31] Peter Klein Skepticism In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Summer 2015

Metaphysics Research Lab Stanford

University 2015

[32] M Kline Mathematics The Loss

of Certainty A Galaxy book

Oxford University Press 1982 isbn

9780195030853 A detailed account

of the "illogical" development of

mathematics and an exposition of

its therefore remarkable utility in

describing the world

[33] W Richard Kolk and Robert A Lerman

Nonlinear System Dynamics 1st ed

Springer US, 1993. isbn: 978-1-4684-6496-2.

[34] Erwin Kreyszig Advanced Engineering

Mathematics. 10th. John Wiley & Sons Limited, 2011. isbn: 9781119571094. The

authoritative resource for engineering

mathematics It includes detailed


accounts of probability statistics

vector calculus linear algebra

fourier analysis ordinary and partial

differential equations and complex

analysis It also includes several other

topics with varying degrees of depth

Overall it is the best place to start

when seeking mathematical guidance

[35] John M Lee Introduction to Smooth

Manifolds second Vol 218 Graduate

Texts in Mathematics Springer 2012

[36] Catherine Legg and Christopher

Hookway Pragmatism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Spring 2019 Metaphysics

Research Lab Stanford University

2019. An introductory article on the philosophical movement "pragmatism". It includes an important clarification of the pragmatic slogan "truth is the end of inquiry".

[37] Daniel Liberzon Calculus of Variations

and Optimal Control Theory A Concise

Introduction Princeton University Press

2012 isbn 9780691151878

[38] Panu Raatikainen. Gödel's Incompleteness

Theorems In The Stanford Encyclopedia

of Philosophy Ed by Edward N Zalta

Fall 2018 Metaphysics Research Lab

Stanford University, 2018. A thorough and contemporary description of Gödel's incompleteness theorems, which

have significant implications for the

foundations and function of mathematics

and mathematical truth

[39] Paul Redding Georg Wilhelm Friedrich

Hegel In The Stanford Encyclopedia

of Philosophy Ed by Edward N Zalta

Summer 2018 Metaphysics Research Lab

Stanford University 2018


[40] HM Schey Div Grad Curl and All that An

Informal Text on Vector Calculus WW

Norton 2005 isbn 9780393925166

[41] Christopher Shields Aristotle In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Winter 2016

Metaphysics Research Lab Stanford

University 2016

[42] Steven S Skiena Calculated Bets

Computers Gambling and Mathematical

Modeling to Win Outlooks Cambridge

University Press, 2001. doi: 10.1017/CBO9780511547089. This includes a lucid section on probability versus statistics, also available here: https://www3.cs.stonybrook.edu/~skiena/jaialai/excerpts/node12.html

[43] George Smith Isaac Newton In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2008

Metaphysics Research Lab Stanford

University 2008

[44] Daniel Stoljar and Nic Damnjanovic

The Deflationary Theory of Truth

In The Stanford Encyclopedia of

Philosophy Ed by Edward N Zalta Fall

2014 Metaphysics Research Lab Stanford

University 2014

[45] WA Strauss Partial Differential

Equations An Introduction Wiley 2007

isbn 9780470054567 A thorough and yet

relatively compact introduction

[46] SH Strogatz and M Dichter Nonlinear

Dynamics and Chaos Second Studies

in Nonlinearity Avalon Publishing 2016

isbn 9780813350844

[47] Mark Textor States of Affairs In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Winter 2016


Metaphysics Research Lab Stanford

University 2016

[48] Pauli Virtanen et al SciPy

10ndashFundamental Algorithms for

Scientific Computing in Python In arXiv

eshyprints arXiv190710121 (July 2019)

arXiv190710121

[49] Wikipedia. Algebra — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Algebra&oldid=920573802 [Online; accessed 26-October-2019]. 2019.

[50] Wikipedia. Carl Friedrich Gauss — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Carl%20Friedrich%20Gauss&oldid=922692291 [Online; accessed 26-October-2019]. 2019.

[51] Wikipedia. Euclid — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Euclid&oldid=923031048 [Online; accessed 26-October-2019]. 2019.

[52] Wikipedia. First-order logic — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=First-order%20logic&oldid=921437906 [Online; accessed 29-October-2019]. 2019.

[53] Wikipedia. Fundamental interaction — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Fundamental%20interaction&oldid=925884124 [Online; accessed 16-November-2019]. 2019.

[54] Wikipedia. Leonhard Euler — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Leonhard%20Euler&oldid=921824700 [Online; accessed 26-October-2019]. 2019.


[55] Wikipedia. Linguistic turn — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Linguistic%20turn&oldid=922305269 [Online; accessed 23-October-2019]. 2019. Hey, we all do it.

[56] Wikipedia. Probability space — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Probability%20space&oldid=914939789 [Online; accessed 31-October-2019]. 2019.

[57] Wikipedia. Propositional calculus — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Propositional%20calculus&oldid=914757384 [Online; accessed 29-October-2019]. 2019.

[58] Wikipedia. Quaternion — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Quaternion&oldid=920710557 [Online; accessed 26-October-2019]. 2019.

[59] Wikipedia. Set-builder notation — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Set-builder%20notation&oldid=917328223 [Online; accessed 29-October-2019]. 2019.

[60] Wikipedia. William Rowan Hamilton — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=William%20Rowan%20Hamilton&oldid=923163451 [Online; accessed 26-October-2019]. 2019.

[61] L Wittgenstein PMS Hacker and J

Schulte Philosophical Investigations

Wiley 2010 isbn 9781444307979

[62] Ludwig Wittgenstein. Tractatus Logico-Philosophicus. Ed. by C. K. Ogden. Project Gutenberg International Library of Psychology, Philosophy and Scientific Method. Kegan Paul, Trench, Trubner & Co., Ltd., 1922. A brilliant work on what is possible to express in language — and what is not. As Wittgenstein puts it, "What can be said at all can be said clearly; and whereof one cannot speak thereof one must be silent."

[63] Slavoj Žižek. Less Than Nothing: Hegel and the Shadow of Dialectical Materialism. Verso, 2012. isbn: 9781844678976. This is one of the most interesting presentations of Hegel and Lacan by one of the most exciting contemporary philosophers.

                                                                                      • Exe 051 6
                                                                                          • Fourier and orthogonality
                                                                                            • Fourier series
                                                                                            • Fourier transform
                                                                                            • Generalized fourier series and orthogonality
                                                                                            • Exercises for Chapter 06
                                                                                              • Exe 061 secrets
                                                                                              • Exe 062 seesaw
                                                                                              • Exe 063 9
                                                                                              • Exe 064 10
                                                                                              • Exe 065 11
                                                                                              • Exe 066 12
                                                                                              • Exe 067 13
                                                                                              • Exe 068 14
                                                                                              • Exe 069 savage
                                                                                              • Exe 0610 strawman
                                                                                                  • Partial differential equations
                                                                                                    • Classifying PDEs
                                                                                                    • Sturm-liouville problems
                                                                                                      • Types of boundary conditions
                                                                                                        • PDE solution by separation of variables
                                                                                                        • The 1D wave equation
                                                                                                        • Exercises for Chapter 07
                                                                                                          • Exe 071 poltergeist
                                                                                                          • Exe 072 kathmandu
                                                                                                              • Optimization
                                                                                                                • Gradient descent
                                                                                                                  • Stationary points
                                                                                                                  • The gradient points the way
                                                                                                                  • The classical method
                                                                                                                  • The Barzilai and Borwein method
                                                                                                                    • Constrained linear optimization
                                                                                                                      • Feasible solutions form a polytope
                                                                                                                      • Only the vertices matter
                                                                                                                        • The simplex algorithm
                                                                                                                        • Exercises for Chapter 08
                                                                                                                          • Exe 081 cummerbund
                                                                                                                          • Exe 082 melty
                                                                                                                              • Nonlinear analysis
                                                                                                                                • Nonlinear state-space models
                                                                                                                                  • Autonomous and nonautonomous systems
                                                                                                                                  • Equilibrium
                                                                                                                                    • Nonlinear system characteristics
                                                                                                                                      • Those in-common with linear systems
                                                                                                                                      • Stability
                                                                                                                                      • Qualities of equilibria
                                                                                                                                        • Nonlinear system simulation
                                                                                                                                        • Simulating nonlinear systems in Python
                                                                                                                                        • Exercises for Chapter 09
                                                                                                                                          • Distribution tables
                                                                                                                                            • Gaussian distribution table
                                                                                                                                            • Students t-distribution table
                                                                                                                                              • Fourier and Laplace tables
                                                                                                                                                • Laplace transforms
                                                                                                                                                • Fourier transforms
                                                                                                                                                  • Mathematics reference
                                                                                                                                                    • Quadratic forms
                                                                                                                                                      • Completing the square
                                                                                                                                                        • Trigonometry
                                                                                                                                                          • Triangle identities
                                                                                                                                                          • Reciprocal identities
                                                                                                                                                          • Pythagorean identities
                                                                                                                                                          • Co-function identities
                                                                                                                                                          • Even-odd identities
                                                                                                                                                          • Sum-difference formulas (AM or lock-in)
                                                                                                                                                          • Double angle formulas
                                                                                                                                                          • Power-reducing or half-angle formulas
                                                                                                                                                          • Sum-to-product formulas
                                                                                                                                                          • Product-to-sum formulas
                                                                                                                                                          • Two-to-one formulas
                                                                                                                                                            • Matrix inverses
                                                                                                                                                            • Laplace transforms
                                                                                                                                                              • Complex analysis
                                                                                                                                                                • Eulers formulas
                                                                                                                                                                  • Bibliography

Contents Contents p 5

    Exe 051 6 166

06 Fourier and orthogonality 168
  Fourier series 169
  Fourier transform 174
  Generalized Fourier series and orthogonality 184
  Exercises for Chapter 06 186
    Exe 061 secrets 186
    Exe 062 seesaw 189
    Exe 063 9 189
    Exe 064 10 190
    Exe 065 11 190
    Exe 066 12 190
    Exe 067 13 190
    Exe 068 14 190
    Exe 069 savage 191
    Exe 0610 strawman 192

07 Partial differential equations 193
  Classifying PDEs 196
  Sturm-Liouville problems 199
    Types of boundary conditions 200
  PDE solution by separation of variables 205
  The 1D wave equation 212
  Exercises for Chapter 07 219
    Exe 071 poltergeist 219
    Exe 072 kathmandu 220

08 Optimization 222
  Gradient descent 223
    Stationary points 223
    The gradient points the way 224
    The classical method 225
    The Barzilai and Borwein method 226
  Constrained linear optimization 235
    Feasible solutions form a polytope 235
    Only the vertices matter 236
  The simplex algorithm 238
  Exercises for Chapter 08 245


    Exe 081 cummerbund 245
    Exe 082 melty 245

09 Nonlinear analysis 246
  Nonlinear state-space models 248
    Autonomous and nonautonomous systems 248
    Equilibrium 248
  Nonlinear system characteristics 250
    Those in-common with linear systems 250
    Stability 250
    Qualities of equilibria 251
  Nonlinear system simulation 254
  Simulating nonlinear systems in Python 255
  Exercises for Chapter 09 257

A Distribution tables 258
  A01 Gaussian distribution table 259
  A02 Student's t-distribution table 261

B Fourier and Laplace tables 262
  B01 Laplace transforms 263
  B02 Fourier transforms 265

C Mathematics reference 268
  C01 Quadratic forms 269
    Completing the square 269
  C02 Trigonometry 270
    Triangle identities 270
    Reciprocal identities 270
    Pythagorean identities 270
    Co-function identities 271
    Even-odd identities 271
    Sum-difference formulas (AM or lock-in) 271
    Double angle formulas 271
    Power-reducing or half-angle formulas 271
    Sum-to-product formulas 272


    Product-to-sum formulas 272
    Two-to-one formulas 272
  C03 Matrix inverses 273
  C04 Laplace transforms 274

D Complex analysis 275
  D01 Euler's formulas 276

Bibliography 277

Mathematics itself

itself Mathematics itself tru Truth p 1

Truth

¹ For much of this lecture I rely on the thorough overview of Glanzberg (Michael Glanzberg. "Truth". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018).

1. Before we can discuss mathematical truth, we should begin with a discussion of truth itself.¹ It is important to note that this is obviously extremely incomplete. My aim is to give a sense of the subject via brutal (mis)abbreviation.

2. Of course, the study of truth cannot but be entangled with the study of the world as such (metaphysics) and of knowledge (epistemology). Some of the following theories presuppose or imply a certain metaphysical and/or epistemological theory, but which these are is controversial.

Neo-classical theories of truth

3. The neo-classical theories of truth take for granted that there is truth and attempt to explain what its precise nature is (Michael Glanzberg. "Truth". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018). What are provided here are modern understandings of theories developed primarily in the early 20th century.

The correspondence theory

4. A version of what is called the correspondence theory of truth is the following:

    A proposition is true iff there is an existing entity in the world that corresponds with it.


² This is typically put in terms of "beliefs" or "judgments", but for brevity and parallelism I have cast it in terms of propositions. It is to this theory I have probably committed the most violence.

Such existing entities are called facts. Facts are relational in that their parts (e.g. subject, predicate, etc.) are related in a certain way.

5. Under this theory, then, if a proposition does not correspond to a fact, it is false.

6. This theory of truth is rather intuitive and consistently popular (Marian David. "The Correspondence Theory of Truth". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2016. Metaphysics Research Lab, Stanford University, 2016. A detailed overview of the correspondence theory of truth).

The coherence theory

7. The coherence theory of truth is adamant that the truth of any given proposition is only as good as its holistic system of propositions.² This includes (but typically goes beyond) a requirement for consistency of a given proposition with the whole and the self-consistency of the whole itself, sometimes called coherence.

8. For parallelism, let's attempt a succinct formulation of this theory cast in terms of propositions:

    A proposition is true iff it has coherence with a system of propositions.


³ Pragmatism was an American philosophical movement of the early 20th century that valued the success of "practical" application of theories. For an introduction, see Legg and Hookway (Catherine Legg and Christopher Hookway. "Pragmatism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019. An introductory article on the philosophical movement "pragmatism". It includes an important clarification of the pragmatic slogan "truth is the end of inquiry").

⁴ This is especially congruent with the work of William James (ibid.).

9. Note that this has no reference to facts whatsoever. However, it need not necessarily preclude them.

The pragmatic theory

10. Of the neo-classical theories of truth, this is probably the least agreed upon as having a single clear statement (Glanzberg, "Truth"). However, as with pragmatism in general,³ the pragmatic truth is oriented practically.

11. Perhaps the most important aspect of this theory is that it is thoroughly a correspondence theory, agreeing that true propositions are those that correspond to the world. However, there is a different focus here that differentiates it from correspondence theory proper: it values as more true that which has some sort of practical use in human life.

12. We'll try to summarize pragmatism in two slogans with slightly different emphases; here's the first, again cast in propositional parallel:

    A proposition is true iff it works.⁴

Now, there are two ways this can be understood: (a) the proposition "works" in that it empirically corresponds to the world, or (b) the proposition "works" in that it has an effect that some agent intends. The former is pretty standard correspondence theory. The latter is new and fairly obviously has ethical implications, especially today.

⁵ This is especially congruent with the work of Charles Sanders Peirce (Legg and Hookway, "Pragmatism").

13. Let us turn to a second formulation:

    A proposition is true if it corresponds with a process of inquiry.⁵

This has two interesting facets: (a) an agent's active inquiry creates truth, and (b) it is a sort of correspondence theory that requires a correspondence of a proposition with a process of inquiry, not, as in the correspondence theory, with a fact about the world. The latter has shades of both correspondence theory and coherence theory.

The picture theory

Before we delve into this theory, we must take a moment to clarify some terminology.

States of affairs and facts

14. When discussing the correspondence theory, we have used the term fact to mean an actual state of things in the world. A problem arises in the correspondence theory here. It says that a proposition is true iff there is a fact that corresponds with it. What of a negative proposition like "there are no cows in Antarctica"? We would seem to need a corresponding "negative fact" in the world to make this true. If a fact is taken to be composed of a complex of actual objects and relations, it is hard to imagine such facts.⁶

⁶ But Barker and Jago (Stephen Barker and Mark Jago. "Being Positive About Negative Facts". In: Philosophy and Phenomenological Research 85.1 [2012], pp. 117–138. DOI: 10.1111/j.1933-1592.2010.00479.x) have attempted just that.

⁷ See Wittgenstein (Ludwig Wittgenstein. Tractatus Logico-Philosophicus. Ed. by C. K. Ogden. Project Gutenberg International Library of Psychology, Philosophy and Scientific Method. Kegan Paul, Trench, Trubner & Co., Ltd., 1922. A brilliant work on what is possible to express in language, and what is not. As Wittgenstein puts it, "What can be said at all can be said clearly; and whereof one cannot speak, thereof one must be silent."), Biletzki and Matar (Anat Biletzki and Anat Matar. "Ludwig Wittgenstein". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2018. Metaphysics Research Lab, Stanford University, 2018. An introduction to Wittgenstein and his thought), Glock ("Truth in the Tractatus"), and Dolby (David Dolby. "Wittgenstein on Truth". In: A Companion to Wittgenstein. John Wiley & Sons, Ltd, 2016. Chap. 27, pp. 433–442. ISBN: 9781118884607. DOI: 10.1002/9781118884607.ch27).

15. Furthermore, if a proposition is true, it seems that it is the corresponding fact that makes it so; what then makes a proposition false, since there is no fact to support the falsity (Mark Textor. "States of Affairs". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016)?

16. And what of nonsense? There are some propositions, like "there is a round cube", that are neither true nor false. However, the preceding correspondence theory cannot differentiate between false and nonsensical propositions.

17. A state of affairs is something possible that may or may not be actual (ibid.). If a state of affairs is actual, it is said to obtain. The picture theory will make this concept central instead of that of the fact.

The picture theory of meaning (and truth)

18. The picture theory of meaning uses the analogy of the model or picture to explain the meaningfulness of propositions:⁷

    A proposition names possible objects and arranges these names to correspond to a state of affairs.


Figure 011: a representation of the picture theory.

See Figure 011. This also allows for an easy account of truth, falsity, and nonsense:

nonsense: A sentence that appears to be a proposition is actually not if the arrangement of named objects is impossible. Such a sentence is simply nonsense.

truth: A proposition is true if the state of affairs it depicts obtains.

falsity: A proposition is false if the state of affairs it depicts does not obtain.

19. Now, some (Hans-Johann Glock. "Truth in the Tractatus". In: Synthese 148.2 [Jan. 2006], pp. 345–368. ISSN: 1573-0964. DOI: 10.1007/s11229-004-6226-2) argue this is a correspondence theory and others (David Dolby. "Wittgenstein on Truth". In: A Companion to Wittgenstein. John Wiley & Sons, Ltd, 2016. Chap. 27, pp. 433–442. ISBN: 9781118884607. DOI: 10.1002/9781118884607.ch27) that it is not. In any case, it certainly solves some issues that have plagued the correspondence theory.

"What cannot be said must be shown"

20. Something the picture theory does is declare a limit on what can meaningfully be said. A proposition (as defined above) must be potentially true or false. Therefore, something that cannot be false (something necessarily true) cannot be a proposition (ibid.). And there are certain things that are necessarily true for language itself to be meaningful, paradigmatically the logical structure of the world. What a proposition does, then, is show, via its own logical structure, the necessary (for there to be meaningful propositions at all) logical structure of the world.⁸

⁸ See also Žižek (Slavoj Žižek. Less Than Nothing: Hegel and the Shadow of Dialectical Materialism. Verso, 2012. ISBN: 9781844678976. This is one of the most interesting presentations of Hegel and Lacan by one of the most exciting contemporary philosophers, pp. 25–6), from whom I stole the section title.

⁹ This was one of the contributions to the "linguistic turn" (Wikipedia. Linguistic turn - Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Linguistic%20turn&oldid=922305269 [Online; accessed 23-October-2019]. 2019. Hey, we all do it.) of philosophy in the early 20th century.

21. An interesting feature of this perspective is that it opens up language itself to analysis and limitation.⁹ And furthermore, it suggests that the set of what is is smaller than the set of what can be meaningfully spoken about.

The relativity of truth

22. Each subject (i.e. agent) in the world, with their propositions, has a perspective: a given moment, a given place, an historical-cultural-linguistic situation. At the very least, the truth of propositions must account for this. Just how a theory of truth should do so is a matter of significant debate (Maria Baghramian and J. Adam Carter. "Relativism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2019. Metaphysics Research Lab, Stanford University, 2019).

23. Some go so far as to be skeptical about truth (Peter Klein. "Skepticism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2015. Metaphysics Research Lab, Stanford University, 2015), regarding it to be entirely impossible. Others say that while a proposition may or may not be true, we could never come to know this.

24. Often underlying this conversation is the question of there being a common world in which we all participate and, if so, whether or not we can properly represent this world in language such that multiple subjects could come to justifiably agree or disagree on the truth of a proposition. If every proposition is so relative that it is relevant to only the proposer, truth would seem of little value. On the other hand, if truth is understood to be "objective", independent of subjective perspective, a number of objections can be made (Baghramian and Carter, "Relativism"), such as that there is no non-subjective perspective from which to judge truth.

¹⁰ Especially notable here is the work of Alfred Tarski in the mid-20th century.

Other ideas about truth

25. There are too many credible ideas about truth to attempt a reasonable summary; however, I will attempt to highlight a few important ones.

Formal methods

26. A set of tools was developed for exploring theories of truth, especially correspondence theories.¹⁰ Focus turned from beliefs to sentences, which are akin to propositions. (Recall that the above theories have already been recast in the more modern language of propositions.) Another aspect of these sentences under consideration is that they begin to be taken as interpreted sentences: they already have meaning.

27. Beyond this, several technical apparatus are introduced that formalize criteria for truth. For instance, a sentence is given a sign φ. A need arises to distinguish between the quotation of sentence φ and the unquoted sentence φ, which is then given the quasi-quotation notation ⌜φ⌝. For instance, let φ stand for snow is white; then φ → snow is white and ⌜φ⌝ → "snow is white". Tarski introduces Convention T, which states that, for a fixed language L with fully interpreted sentences (Glanzberg, "Truth"):

    An adequate theory of truth for L must imply, for each sentence φ of L,
    ⌜φ⌝ is true if and only if φ.

Using the same example, then:

    "snow is white" is true if and only if snow is white.

Convention T states a general rule for the adequacy of a theory of truth and is used in several contemporary theories.

28. We can see that these formal methods get quite technical and fun. For more, see Hodges, Gómez-Torrente, and Hylton and Kemp.¹¹

¹¹ Wilfrid Hodges. "Tarski's Truth Definitions". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018; Mario Gómez-Torrente. "Alfred Tarski". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019; Peter Hylton and Gary Kemp. "Willard Van Orman Quine". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019.
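Convention T is concrete enough to sketch in code. The following toy model is an illustration of my own, not from the text: the two-sentence language, the `world` interpretation, and both candidate truth predicates are invented. A Python string stands in for the quoted sentence ⌜φ⌝, and Convention T becomes the requirement that an adequate truth predicate satisfy every instance of the T-schema.

```python
# Toy model of Tarski's Convention T for a tiny, fully interpreted
# language L. Each string plays the role of a quoted sentence; the
# dict `world` records which expressed states of affairs obtain.

world = {
    "snow is white": True,
    "there are cows in Antarctica": False,
}

def t_schema_holds(is_true, phi):
    """One instance of the T-schema: ⌜phi⌝ is true iff phi."""
    return is_true(phi) == world[phi]

def adequate(is_true):
    """Convention T: an adequate theory of truth for L must imply
    every instance of the T-schema."""
    return all(t_schema_holds(is_true, phi) for phi in world)

def disquotational(phi):
    return world[phi]   # "is true" merely removes the quotation

def credulous(phi):
    return True         # calls every sentence of L true

print(adequate(disquotational))  # True: satisfies Convention T
print(adequate(credulous))       # False: fails on a false sentence
```

That the disquotational predicate passes trivially is also the intuition behind the redundancy theory discussed below: asserting that ⌜φ⌝ is true adds nothing to asserting φ.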

Deflationary theories

29. Deflationary theories of truth try to minimize, or eliminate altogether, the concept of, or use of the term, "truth". For instance, the redundancy theory claims that (Glanzberg, "Truth"):

    To assert that ⌜φ⌝ is true is just to assert that φ.

Therefore, we can eliminate the use of "is true".

30. For more of less, see Stoljar and Damnjanovic.¹²

¹² Daniel Stoljar and Nic Damnjanovic. "The Deflationary Theory of Truth". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2014. Metaphysics Research Lab, Stanford University, 2014.

Language

31. It is important to recognize that language mediates truth; that is, truth is embedded in language. The way language in general affects theories of truth has been studied extensively. For instance, whether the truth-bearer is a belief or a proposition or a sentence (or something else) has been much discussed. The importance of the meaning of truth-bearers like sentences has played another large role. Theories of meaning, like the picture theory presented above, are often closely intertwined with theories of truth.

32 One of the most popular theories of meaning is called the theory of use:

For a large class of cases of the employment of the word "meaning" (though not for all) this word can be explained in this way: the meaning of a word is its use in the language. (L. Wittgenstein, P. M. S. Hacker, and J. Schulte. Philosophical Investigations. Wiley, 2010. ISBN 9781444307979)

13 Drew Khlentzos. "Challenges to Metaphysical Realism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016.

This theory is accompanied by the concept of language-games, which are loosely defined as rule-based contexts within which sentences have uses. The idea is that the meaning of a given sentence is its use in a network of meaning that is constantly evolving. This view tends to be understood as deflationary or relativistic about truth.

Metaphysical and epistemological considerations

33 We began with the recognition that truth is intertwined with metaphysics and epistemology. Let's consider a few such topics.

34 The first is metaphysical realism, which claims that there is a world existing objectively, independently of how we think about or describe it. This "realism" tends to be closely tied to, yet distinct from, scientific realism, which goes further, claiming the world is "actually" as science describes, independently of the scientific descriptions (e.g. there are actual objects corresponding to the phenomena we call atoms, molecules, light particles, etc.).

35 There have been many challenges to the realist claim (for some recent versions, see Khlentzos13), put forth by what is broadly called anti-realism. These vary, but often challenge the ability of realists to properly link language to supposed objects in the world.

14 These definitions are explicated by Guyer and Horstmann (Paul Guyer and Rolf-Peter Horstmann. "Idealism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2018. Metaphysics Research Lab, Stanford University, 2018).

36 Metaphysical idealism has been characterized as claiming that "mind" or "subjectivity" generates or completely composes the world, which has no being outside mind. Epistemological idealism, on the other hand, while perhaps conceding that there is a world independent of mind, claims all knowledge of the world is created through mind and for mind, and therefore can never escape a sort of mind-world gap.14 This epistemological idealism has been highly influential since the work of Immanuel Kant (I. Kant, P. Guyer, and A. W. Wood. Critique of Pure Reason. The Cambridge Edition of the Works of Immanuel Kant. Cambridge University Press, 1999. ISBN 9781107268333) in the late 18th century, which ushered in the idea of the noumenal world, in-itself, and the phenomenal world, which is how the noumenal world presents to us. Many have held that phenomena can be known through inquiry, whereas noumena are inaccessible. Furthermore, what can be known is restricted by the categories pre-existent in the knower.

37 Another approach, taken by Georg Wilhelm Friedrich Hegel (Paul Redding. "Georg Wilhelm Friedrich Hegel". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2018. Metaphysics Research Lab, Stanford University, 2018) and other German idealists following Kant, is to reframe reality as thoroughly integrating subjectivity (G. W. F. Hegel and A. V. Miller. Phenomenology of Spirit. Motilal Banarsidass, 1998. ISBN 9788120814738; Slavoj Žižek. Less Than Nothing: Hegel and the Shadow of Dialectical Materialism. Verso, 2012. ISBN 9781844678976. This is one of the most interesting presentations of Hegel and Lacan by one of the most exciting contemporary philosophers); that is, "everything turns on grasping and expressing the True, not only as Substance, but equally as Subject." A subject's proposition is true inasmuch as it corresponds with its Notion (approximately, the idea or meaning for the subject). Some hold that this idealism is compatible with a sort of metaphysical realism, at least as far as understanding is not independent of, but rather beholden to, reality (Žižek, Less Than Nothing, p. 906 ff.).

38 Clearly, all these ideas have many implications for theories of truth, and vice versa.

Where this leaves us

39 The truth is hard. What may at first appear to be a simple concept becomes complex upon analysis. It is important to recognize that we have only sampled some highlights of the theories of truth. I recommend further study of this fascinating topic.

40 Despite the difficulties of finding definitive grounds for understanding truth, we are faced with the task of provisionally forging ahead. Much of what follows in the study of mathematics makes certain implicit and explicit assumptions about truth. However, we have found that the foundations of these assumptions may themselves be problematic. It is my contention that, despite the lack of clear foundations, it is still worth studying engineering analysis, its mathematical foundations, and the foundations of truth itself. My justification for this claim is that I find the utility and the beauty of this study highly rewarding.

15 Throughout this section, for the history of mathematics, I rely heavily on Kline (M. Kline. Mathematics: The Loss of Certainty. A Galaxy book. Oxford University Press, 1982. ISBN 9780195030853. A detailed account of the "illogical" development of mathematics and an exposition of its therefore remarkable utility in describing the world).

itselffound The foundations of mathematics

1 Mathematics has long been considered exemplary for establishing truth. Primarily, it uses a method that begins with axioms (unproven propositions that include undefined terms) and uses logical deduction to prove other propositions (theorems), showing that they are necessarily true if the axioms are.

2 It may seem obvious that truth established in this way would always be relative to the truth of the axioms, but throughout history this footnote was often obscured by the "obvious" or "intuitive" universal truth of the axioms.15 For instance, Euclid (Wikipedia. Euclid — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Euclid&oldid=923031048. [Online; accessed 26-October-2019]. 2019) founded geometry (the study of mathematical objects traditionally considered to represent physical space, like points, lines, etc.) on axioms thought so solid that it was not until the early 19th century that Carl Friedrich Gauss (Wikipedia. Carl Friedrich Gauss — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Carl%20Friedrich%20Gauss&oldid=922692291. [Online; accessed 26-October-2019]. 2019) and others recognized this was only one among many possible geometries (Kline, Mathematics: The Loss of Certainty), resting on


different axioms. Furthermore, Aristotle (Christopher Shields. "Aristotle". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016) had acknowledged that reasoning must begin with undefined terms; however, even Euclid (presumably aware of Aristotle's work) seemed to forget this and provided definitions, obscuring the foundations of his work and starting mathematics on a path that, for over 2000 years, would forget its own relativity (Kline, Mathematics: The Loss of Certainty, pp. 101-2).

3 The foundations of Euclid were even shakier than its murky starting point: several unstated axioms were used in proofs, and some proofs were otherwise erroneous. However, for two millennia, mathematics was seen as the field wherein truth could be established beyond doubt.

Algebra ex nihilo

4 Although not much new geometry appeared during this period, the field of algebra, the study of manipulations of symbols standing for numbers in general (Wikipedia. Algebra — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Algebra&oldid=920573802. [Online; accessed 26-October-2019]. 2019), began with no axiomatic foundation whatsoever. The Greeks had a notion of rational numbers: ratios of natural numbers (positive integers). And it was known that many solutions to algebraic equations were irrational (could not be expressed as a ratio of integers). But these irrational numbers, like virtually everything else in algebra, were


gradually accepted because they were so useful in solving practical problems (they could be approximated by rational numbers, and this seemed good enough). The rules of basic arithmetic were accepted as applying to these and other forms of new numbers that arose in algebraic solutions: negative, imaginary, and complex numbers.

The application of mathematics to science

5 During this time, mathematics was being applied to optics and astronomy. Sir Isaac Newton then built calculus upon algebra, applying it to what is now known as Newtonian mechanics, which was really more the product of Leonhard Euler (George Smith. "Isaac Newton". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2008. Metaphysics Research Lab, Stanford University, 2008; Wikipedia. Leonhard Euler — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Leonhard%20Euler&oldid=921824700. [Online; accessed 26-October-2019]. 2019). Calculus introduced its own dubious operations, but the success of mechanics in describing and predicting physical phenomena was astounding. Mathematics was hailed as the language of God (later, Nature).

The rigorization of mathematics

6 It was not until Gauss created non-Euclidean geometry, in which Euclid's were shown to be one of many possible axioms compatible with the world, and William Rowan Hamilton (Wikipedia. William Rowan Hamilton — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=William%20Rowan%20Hamilton&oldid=923163451. [Online; accessed 26-October-2019]. 2019) created quaternions (Wikipedia. Quaternion — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Quaternion&oldid=920710557. [Online; accessed 26-October-2019]. 2019), a number system in which multiplication is noncommutative, that it became apparent something was fundamentally wrong with the way truth in mathematics had been understood. This started a period of rigorization in mathematics that set about axiomatizing and proving 19th-century mathematics. This included the development of symbolic logic, which aided in the process of deductive reasoning.

16 The branch of mathematics called model theory concerns itself with general types of models that can be made from a given formal system, like an axiomatic mathematical system. For more on model theory, see Hodges (Wilfrid Hodges. "Model Theory". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018). It is noteworthy that the engineering/science use of the term "mathematical model" is only loosely a "model" in the sense of model theory.

7 An aspect of this rigorization is that mathematicians came to terms with the axioms that include undefined terms. For instance, a "point" might be such an undefined term in an axiom. A mathematical model is what we create when we attach these undefined terms to objects, which can be anything consistent with the axioms.16 The system that results from proving theorems would then apply to anything "properly" described by the axioms. So two masses might be assigned "points" in a Euclidean geometric space, from which we could be confident that, for instance, the "distance" between these masses is the Euclidean norm of the line drawn between the points. It could be said, then, that a "point" in Euclidean geometry is implicitly defined by its axioms and theorems, and nothing else. That is, mathematical objects are not inherently tied to the physical objects to which we tend to apply them. Euclidean geometry is not the study of physical space, as it was long considered; it is the study of the objects implicitly defined by its axioms and theorems.

The foundations of mathematics are built


8 The building of the modern foundations of mathematics began with clear axioms, solid reasoning (with symbolic logic), and lofty yet seemingly attainable goals: prove theorems to support the already ubiquitous mathematical techniques in geometry, algebra, and calculus from axioms; furthermore, prove that these axioms (and the things they imply) do not contradict each other, i.e. are consistent, and that the axioms are not results of each other (one that can be derived from others is a theorem, not an axiom).

9 Set theory is a type of formal axiomatic system with which all modern mathematics is expressed, so set theory is often called the foundation of mathematics (Joan Bagaria. "Set Theory". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2019. Metaphysics Research Lab, Stanford University, 2019). We will study the basics in Lecture setssetintro. The primary objects in set theory are sets: informally, collections of mathematical objects. There is not a single set of axioms that is used as the foundation of all mathematics, for reasons we will review in a moment. However, the most popular set theory is Zermelo-Fraenkel set theory with the axiom of choice (ZFC). The axioms of ZF, sans C, are as follows (ibid.):

extensionality: If two sets A and B have the same elements, then they are equal.

empty set: There exists a set, denoted by ∅ and called the empty set, which has no elements.

pair: Given any sets A and B, there exists a set, denoted by {A, B}, which contains A and B as its only elements. In particular, there exists the set {A}, which has A as its only element.

power set: For every set A, there exists a set, denoted by P(A) and called the power set of A, whose elements are all the subsets of A.

union: For every set A, there exists a set, denoted by ⋃A and called the union of A, whose elements are all the elements of the elements of A.

infinity: There exists an infinite set. In particular, there exists a set Z that contains ∅ and such that if A ∈ Z, then A ∪ {A} ∈ Z.

separation: For every set A and every given property, there is a set containing exactly the elements of A that have that property. A property is given by a formula ϕ of the first-order language of set theory. Thus, separation is not a single axiom but an axiom schema, that is, an infinite list of axioms, one for each formula ϕ.

replacement: For every given definable function with domain a set A, there is a set whose elements are all the values of the function.

10 ZFC also has the axiom of choice (Bagaria, "Set Theory"):

choice: For every set A of pairwise-disjoint, non-empty sets, there exists a set that contains exactly one element from each set in A.
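These axioms assert the existence of sets in general, including infinite ones, but several have direct finite analogues. The following is a sketch (an illustration added here, not part of the formal theory) of the pair, union, power set, and choice constructions on finite Python `frozenset`s:

```python
from itertools import combinations

# Finite sketches of a few ZFC constructions. Illustrations only:
# the axioms assert existence for arbitrary sets, not just finite ones.

def pair(A, B):
    """Pairing: the set {A, B}."""
    return frozenset({A, B})

def big_union(A):
    """Union: all elements of the elements of A."""
    return frozenset(x for S in A for x in S)

def power_set(A):
    """Power set: the set of all subsets of A."""
    return frozenset(frozenset(c) for r in range(len(A) + 1)
                     for c in combinations(A, r))

def choose(A):
    """Choice: one element from each set in A (trivial when A is finite)."""
    return frozenset(next(iter(S)) for S in A)

S = frozenset({1, 2})
print(big_union({S, frozenset({3})}) == frozenset({1, 2, 3}))  # True
print(len(power_set(S)))                                       # 4
```

For infinite sets, choice is the interesting case: no such one-line "pick an element" procedure is available in general, which is why it must be taken as an axiom.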

17 Panu Raatikainen. "Gödel's Incompleteness Theorems". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018. A thorough and contemporary description of Gödel's incompleteness theorems, which have significant implications for the foundations and function of mathematics and mathematical truth.

The foundations have cracks

11 The foundationalists' goal was to prove that some set of axioms, from which all of mathematics can be derived, is both consistent (contains no contradictions) and complete (every true statement is provable). The work of Kurt Gödel (Juliette Kennedy. "Kurt Gödel". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2018. Metaphysics Research Lab, Stanford University, 2018) in the mid-20th century shattered this dream by proving, in his first incompleteness theorem, that any consistent formal system within which one can do some amount of basic arithmetic is incomplete. His argument is worth reviewing (see Raatikainen17), but at its heart is an undecidable statement like "This sentence is unprovable." Let U stand for this statement. If it is true, it is unprovable. If it is provable, it is false. Therefore, it is true iff it is unprovable. Then he shows that if a statement A that essentially says "arithmetic is consistent" is provable, then so is the undecidable statement U. But if the system is to be consistent, U cannot be provable, and therefore neither can A be provable.

12 This is problematic. It tells us virtually any conceivable axiomatic foundation of mathematics is incomplete. If one is complete, it is inconsistent (and therefore worthless). One problem this introduces is that a true theorem may be impossible to prove, but it turns out we can never know, in advance of its proof, if it is provable.

13 But it gets worse. Gödel's second incompleteness theorem shows that such systems cannot even be shown to be consistent. This means at any moment someone could find an inconsistency in mathematics, and not only would we lose some of the theorems, we would lose them all. This is because, by what is called the material implication (Kline, Mathematics: The Loss of Certainty, pp. 187-8, 264), if one contradiction can be found, every proposition can be proven from it. And if this is the case, all (even proven) theorems in the system would be suspect.

18 Kline, Mathematics: The Loss of Certainty.

14 Even though no contradiction has yet appeared in ZFC, its axiom of choice, which is required for the proof of most of what has thus far been proven, generates the Banach-Tarski paradox, which says a sphere of diameter x can be partitioned into a finite number of pieces and recombined to form two spheres of diameter x. Troubling, to say the least. Attempts were made for a while to eliminate the use of the axiom of choice, but our buddy Gödel later proved that if ZF is consistent, so is ZFC (ibid., p. 267).

Mathematics is considered empirical

15 Since its inception, mathematics has been applied extensively to the modeling of the world. Despite its cracked foundations, it has striking utility. Many recent leading minds of mathematics, philosophy, and science suggest we treat mathematics as empirical, like any science: subject to its success in describing and predicting events in the world. As Kline18 summarizes,


The upshot […] is that sound mathematics must be determined not by any one foundation which may some day prove to be right. The "correctness" of mathematics must be judged by its application to the physical world. Mathematics is an empirical science much as Newtonian mechanics. It is correct only to the extent that it works and, when it does not, it must be modified. It is not a priori knowledge even though it was so regarded for two thousand years. It is not absolute or unchangeable.

itselfreason Mathematical reasoning

itselfoverview Mathematical topics in overview

itselfengmath What is mathematics for engineering

itselfex Exercises for Chapter 01

Mathematical reasoning, logic, and set theory

In order to communicate mathematical ideas effectively, formal languages have been developed within which logic, i.e. deductive (mathematical) reasoning, can proceed. Propositions are statements that can be either true (⊤) or false (⊥). Axiomatic systems begin with statements (axioms) assumed true. Theorems are proven by deduction. In many forms of logic, like propositional calculus (Wikipedia. Propositional calculus — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Propositional%20calculus&oldid=914757384. [Online; accessed 29-October-2019]. 2019), compound propositions are constructed via logical connectives, like "and" and "or," of atomic propositions (see Lecture setslogic). In others, like first-order logic (Wikipedia. First-order logic — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=First-order%20logic&oldid=921437906. [Online; accessed 29-October-2019]. 2019), there are also

logical quantifiers, like "for every" and "there exists."

The mathematical objects and operations about which most propositions are made are expressed in terms of set theory, which was introduced in Lecture itselffound and will be expanded upon in Lecture setssetintro. We can say that mathematical reasoning is comprised of mathematical objects and operations expressed in set theory, and logic allows us to reason therewith.

1 K. Ciesielski. Set Theory for the Working Mathematician. London Mathematical Society Student Texts. Cambridge University Press, 1997. ISBN 9780521594653. A readable introduction to set theory.

2 H. B. Enderton. Elements of Set Theory. Elsevier Science, 1977. ISBN 9780080570426. A gentle introduction to set theory and mathematical reasoning, a great place to start.

setssetintro Introduction to set theory

Set theory is the language of the modern foundation of mathematics, as discussed in Lecture itselffound. It is unsurprising, then, that it arises throughout the study of mathematics. We will use set theory extensively in our study of probability theory.

The axioms of ZFC set theory were introduced in Lecture itselffound. Instead of proceeding in the pure mathematics way of introducing and proving theorems, we will opt for a more applied approach, in which we begin with some simple definitions and include basic operations. A more thorough and still readable treatment is given by Ciesielski,1 and a very gentle version by Enderton.2

A set is a collection of objects. Set theory gives us a way to describe these collections. Often the objects in a set are numbers or sets of numbers. However, a set could represent collections of zebras and trees and hairballs. For instance, {1, 2, 3} and {zebra, tree, hairball} are sets.
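As a side illustration (added here, not part of the original notes), Python's built-in `set` type behaves much like these finite mathematical sets:

```python
# Finite sets in Python are unordered collections with no repeats.
numbers = {1, 2, 3}
words = {"zebra", "tree", "hairball"}  # elements needn't be numbers

print(len(numbers))             # 3
print({1, 2, 2, 3} == numbers)  # True: repetition is ignored
```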

A field is a set with special structure. This structure is provided by the addition (+) and multiplication (×) operators and their inverses, subtraction (–) and division (÷).

3 When the natural numbers include zero, we write N0.

The quintessential example of a field is the set of real numbers R, which admits these operators, making it a field. The complex numbers C also form a field. The integers Z and the natural numbers3 N, though not themselves fields (e.g. they lack multiplicative inverses), are number systems we also typically consider.

Set membership is the belonging of an object to a set. It is denoted with the symbol ∈, which can be read "is an element of": for an element x and set X, we write x ∈ X. For instance, we might say 7 ∈ {1, 7, 2} or 4 ∉ {1, 7, 2}. Or we might declare that x is a real number by stating x ∈ R.

Set operations can be used to construct new sets from established sets. We consider a few common set operations now.

The union ∪ of sets is the set containing all the elements of the original sets (no repetition allowed). The union of sets A and B is denoted A ∪ B. For instance, let A = {1, 2, 3} and B = {–1, 3}; then A ∪ B = {–1, 1, 2, 3}.
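In Python (an aside, not part of the notes), the union is the `|` operator on sets:

```python
A = {1, 2, 3}
B = {-1, 3}
print(A | B == {-1, 1, 2, 3})  # True: the repeated 3 appears only once
```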


The intersection ∩ of sets is the set containing the elements common to all the original sets. The intersection of sets A and B is denoted A ∩ B. For instance, let A = {1, 2, 3} and B = {2, 3, 4}; then A ∩ B = {2, 3}.
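In Python, the intersection is the `&` operator:

```python
A = {1, 2, 3}
B = {2, 3, 4}
print(A & B == {2, 3})  # True: elements common to A and B
```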

If two sets have no elements in common, the intersection is the empty set ∅ = {}, the unique set with no elements.

The set difference of two sets A and B is the set of elements in A that aren't also in B. It is denoted A \ B. For instance, let A = {1, 2, 3} and B = {2, 3, 4}; then A \ B = {1}.
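In Python, the set difference is the `-` operator:

```python
A = {1, 2, 3}
B = {2, 3, 4}
print(A - B == {1})  # True: elements of A not also in B
```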

A subset ⊆ of a set is a set, the elements of which are contained in the original set. If the two sets are equal, one is still considered a subset of the other. We call a subset that is not equal to the other set a proper subset ⊂.


For instance, let A = {1, 2, 3} and B = {1, 2}; then B ⊂ A.
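Python's comparison operators on sets mirror ⊆ and ⊂ directly:

```python
A = {1, 2, 3}
B = {1, 2}
print(B <= A)  # True:  B is a subset of A
print(B < A)   # True:  B is a proper subset of A
print(A <= A)  # True:  a set is a subset of itself...
print(A < A)   # False: ...but not a proper subset of itself
```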

The complement of a subset is the set of elements of the original set that aren't in the subset. For instance, if B ⊆ A, then the complement of B, denoted B̄, is B̄ = A \ B.
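Since the complement is taken relative to the containing set, it is just a set difference in Python:

```python
A = {1, 2, 3}
B = {1, 2}                # B is a subset of A
complement_of_B = A - B   # complement of B relative to A
print(complement_of_B == {3})  # True
```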

The cartesian product of two sets A and B is denoted A × B and is the set of all ordered pairs (a, b), where a ∈ A and b ∈ B. It's worthwhile considering the following notation for this definition:

A × B = {(a, b) | a ∈ A and b ∈ B},

which means "the cartesian product of A and B is the set of ordered pairs (a, b) such that a ∈ A and b ∈ B," in set-builder notation (Wikipedia. Set-builder notation — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Set-builder%20notation&oldid=917328223. [Online; accessed 29-October-2019]. 2019).
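For finite sets, the cartesian product can be computed with the standard library's `itertools.product` (an illustration added here):

```python
from itertools import product

A = {1, 2}
B = {'x', 'y'}
AxB = set(product(A, B))  # all ordered pairs (a, b) with a in A, b in B
print(AxB == {(1, 'x'), (1, 'y'), (2, 'x'), (2, 'y')})  # True
```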

Let A and B be sets. A map or function f from A to B is an assignment, to each element a ∈ A, of some element b ∈ B. The function is denoted f: A → B, and we say that f maps each element a ∈ A to an element f(a) ∈ B, called the value of a under f, written a ↦ f(a). We say that f has domain A and codomain B. The image of f is the subset of its codomain B that contains the values of all elements mapped by f from its domain A.
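A finite function can be sketched as a Python dict; the particular values below are hypothetical, chosen only to distinguish image from codomain:

```python
A = {1, 2, 3}                 # domain
B = {'a', 'b', 'c'}           # codomain
f = {1: 'a', 2: 'a', 3: 'b'}  # assigns some element of B to each a in A

image = set(f.values())
print(set(f.keys()) == A)   # True: f is defined on all of its domain
print(image <= B)           # True: the image is a subset of the codomain
print(image == {'a', 'b'})  # True: 'c' is in B but not in the image
```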


setslogic Logical connectives and quantifiers

In order to make compound propositions, we need to define logical connectives. In order to specify quantities of variables, we need to define logical quantifiers. The following is a form of first-order logic (Wikipedia. First-order logic — Wikipedia, The Free Encyclopedia).

Logical connectives

A proposition can be either true (⊤) or false (⊥). When it does not contain a logical connective, it is called an atomistic proposition. To combine propositions into a compound proposition, we require logical connectives. They are not (¬), and (∧), and or (∨). Table 021 is a truth table for a number of connectives.

Quantifiers

Logical quantifiers allow us to indicate the quantity of a variable. The universal quantifier symbol ∀ means "for all." For instance, let A be a set; then ∀a ∈ A means "for all elements in A" and gives this quantity to variable a. The existential quantifier ∃ means "there exists at least one" or "for some." For instance, let A be a set; then ∃a ∈ A

Table 021: a truth table for logical connectives. The first two columns are the truth values of propositions p and q; the rest are outputs.

             not  and    or     nand   nor    xor    xnor
p    q    |  ¬p   p ∧ q  p ∨ q  p ↑ q  p ↓ q  p ⊻ q  p ⇔ q
⊥    ⊥    |  ⊤    ⊥      ⊥      ⊤      ⊤      ⊥      ⊤
⊥    ⊤    |  ⊤    ⊥      ⊤      ⊤      ⊥      ⊤      ⊥
⊤    ⊥    |  ⊥    ⊥      ⊤      ⊤      ⊥      ⊤      ⊥
⊤    ⊤    |  ⊥    ⊤      ⊤      ⊥      ⊥      ⊥      ⊤
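The rows of the table can be checked with Python's boolean operators, mapping ⊤ to True and ⊥ to False (an illustration added here):

```python
# Verify derived connectives against their definitions for all rows.
for p in (False, True):
    for q in (False, True):
        assert (not (p and q)) == ((not p) or (not q))  # nand (De Morgan)
        assert (not (p or q)) == ((not p) and (not q))  # nor (De Morgan)
        assert (p != q) == (not (p == q))               # xor is not-xnor
print("all rows check out")
```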

means "there exists at least one element a in A …"
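Over a finite set, the quantifiers have direct Python analogues (a sketch added here): `all()` plays the role of ∀ and `any()` the role of ∃:

```python
A = {2, 4, 6}
print(all(a % 2 == 0 for a in A))  # True:  for all a in A, a is even
print(any(a > 5 for a in A))       # True:  there exists a in A with a > 5
print(any(a > 7 for a in A))       # False: no such element exists
```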

setsex Exercises for Chapter 02

Exercise 021 (hardhat)

For the following, write the set described in set-builder notation:

a. A = {2, 3, 5, 9, 17, 33, · · · }
b. B is the set of integers divisible by 11
c. C = {13, 14, 15, · · · }
d. D is the set of reals between –3 and 42

Exercise 022 (2)

Let x, y ∈ Rⁿ. Prove the Cauchy-Schwarz Inequality:

|x · y| ≤ ‖x‖‖y‖. (ex1)

Hint: you may find the geometric definition of the dot product helpful.

Exercise 023 (3)

Let x ∈ Rⁿ. Prove that

x · x = ‖x‖². (ex2)

Hint: you may find the geometric definition of the dot product helpful.

Exercise 024 (4)

Let x, y ∈ Rⁿ. Prove the Triangle Inequality:

‖x + y‖ ≤ ‖x‖ + ‖y‖. (ex3)

Hint: you may find the Cauchy-Schwarz Inequality helpful.

prob

Probability

1 For a good introduction to probability theory, see Ash (Robert B. Ash. Basic Probability Theory. Dover Publications, Inc., 2008) or Jaynes et al. (E. T. Jaynes et al. Probability Theory: The Logic of Science. Cambridge University Press, 2003. ISBN 9780521592710. An excellent and comprehensive introduction to probability theory).

probmeas Probability and measurement

Probability theory is a well-defined branch of mathematics. Andrey Kolmogorov described a set of axioms in 1933 that are still in use today as the foundation of probability theory.1

We will implicitly use these axioms in our analysis. The interpretation of probability is a contentious matter. Some believe probability quantifies the frequency of the occurrence of some event that is repeated in a large number of trials. Others believe it quantifies the state of our knowledge, or belief, that some event will occur.

In experiments, our measurements are tightly coupled to probability. This is apparent in the questions we ask. Here are some examples:

1. How common is a given event?
2. What is the probability we will reject a good theory based on experimental results?
3. How repeatable are the results?
4. How confident are we in the results?
5. What is the character of the fluctuations and drift in the data?
6. How much data do we need?

probprob Basic probability theory

The mathematical model for a class of measurements is called the probability space and is composed of a mathematical triple of a sample space Ω, σ-algebra F, and probability measure P, typically denoted (Ω, F, P), each of which we will consider in turn (Wikipedia. Probability space — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Probability%20space&oldid=914939789. [Online; accessed 31-October-2019]. 2019).

The sample space Ω of an experiment is the set representing all possible outcomes of the experiment. If a coin is flipped, the sample space is Ω = {H, T}, where H is heads and T is tails. If a coin is flipped twice, the sample space could be

Ω = {HH, HT, TH, TT}.

However, the same experiment can have different sample spaces. For instance, for two coin flips we could also choose a coarser description, such as one that records only the number of heads. We base our choice of Ω on the problem at hand.

An event is a subset of the sample space. That is, an event corresponds to a yes-or-no question about the experiment. For instance, event A (remember, A ⊆ Ω) in the coin-flipping experiment (two flips) might be A = {HT, TH}. A is an event that corresponds to the question "Is the second flip different from the first?" A is the event for which the answer is "yes."

Algebra of events

Because events are sets, we can perform the usual set operations with them.

Example probprob-1: set operations with events

Problem Statement. Consider a toss of a single die. We choose the sample space to be Ω = {1, 2, 3, 4, 5, 6}. Let the following define events:

A ≡ (the result is even) = {2, 4, 6}
B ≡ (the result is greater than 2) = {3, 4, 5, 6}.

Find the following event combinations: A ∪ B, A ∩ B, A \ B, B \ A, and the complements of A and B.
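These set operations map directly onto Python's built-in set type; a minimal sketch (the variable names are ours, for illustration only):

```python
# Sample space and events for a single die toss
Omega = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}          # the result is even
B = {3, 4, 5, 6}       # the result is greater than 2

print(A | B)       # union A ∪ B
print(A & B)       # intersection A ∩ B
print(A - B)       # set difference A \ B
print(B - A)       # set difference B \ A
print(Omega - A)   # complement of A relative to Ω
print(Omega - B)   # complement of B relative to Ω
```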

The σ-algebra F is the collection of events of interest. Often F is the set of all possible events given a sample space Ω, which is just the power set of Ω (Wikipedia, Probability space — Wikipedia, The Free Encyclopedia). When referring to an event, we often state that it is an element of F. For instance, we might say an event A ∈ F.

We're finally ready to assign probabilities to events. We define the probability measure P : F → [0, 1] to be a function satisfying the following conditions.

1. For every event A ∈ F, the probability measure of A is greater than or equal to zero, i.e. P(A) ≥ 0.
2. If an event is the entire sample space, its probability measure is unity, i.e. if A = Ω, P(A) = 1.
3. If events A1, A2, · · · are disjoint sets (no elements in common), then P(A1 ∪ A2 ∪ · · · ) = P(A1) + P(A2) + · · · .

We conclude the basics by observing four facts that can be proven from the definitions above.

1. P(∅) = 0.
2. P(¬A) = 1 − P(A).
3. If A ⊆ B, then P(A) ≤ P(B).
4. P(A ∪ B) = P(A) + P(B) − P(A ∩ B).

Independence and conditional probability

Two events A and B are independent if and only if

P(A ∩ B) = P(A) P(B).

If an experimenter must make a judgment without data about the independence of events, they base it on their knowledge of the events, as discussed in the following example.

Example probcondition-1: independence

Problem Statement. Answer the following questions and imperatives.

1. Consider a single fair die rolled twice. What is the probability that both rolls are 6?
2. What changes if the die is biased by a weight such that P(6) = 1/7?
3. What changes if the die is biased by a magnet, rolled on a magnetic dice-rolling tray, such that P(6) = 1/7?
4. What changes if there are two dice, biased by weights such that for each P(6) = 1/7, rolled once, both resulting in 6?
5. What changes if there are two dice, biased by magnets, rolled together?
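For the weighted (independent) cases, the product rule applies directly; a quick numeric sketch using the example's assumed bias P(6) = 1/7:

```python
from fractions import Fraction

p6_fair = Fraction(1, 6)      # fair die
p6_weighted = Fraction(1, 7)  # weighted die from the example

# Two independent rolls: multiply the per-roll probabilities
both_sixes_fair = p6_fair * p6_fair        # fair die rolled twice
both_sixes_weighted = p6_weighted ** 2     # two weighted dice

print(both_sixes_fair)      # → 1/36
print(both_sixes_weighted)  # → 1/49
```

Note that the magnet cases are the interesting ones: magnets coupling the dice (or die and tray) can make the rolls dependent, in which case this simple product no longer applies.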

Conditional probability

If events A and B are somehow dependent, we need a way to compute the probability of B occurring given that A occurs. This is called the conditional probability of B given A and is denoted P(B | A). For P(A) > 0, it is defined as

P(B | A) = P(A ∩ B) / P(A). (condition1)

We can interpret this as a restriction of the sample space Ω to A, i.e. the new sample space Ω′ = A ⊆ Ω. Note that if A and B are independent, we obtain the obvious result P(B | A) = P(B).

Example probcondition-2: dependence

Problem Statement. Consider two unbiased dice rolled once. Let events A = (sum of faces = 8) and B = (faces are equal). What is the probability the faces are equal, given that their sum is 8?
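One way to check an answer to this kind of question is brute-force enumeration of the 36 equally likely outcomes; a sketch of that check (our own, not the text's solution method):

```python
from fractions import Fraction
from itertools import product

outcomes = list(product(range(1, 7), repeat=2))  # all 36 equally likely rolls
A = [o for o in outcomes if sum(o) == 8]         # sum of faces is 8
A_and_B = [o for o in A if o[0] == o[1]]         # ... and faces are equal

# P(B | A) = P(A ∩ B) / P(A) = |A ∩ B| / |A| under equal likelihood
p_B_given_A = Fraction(len(A_and_B), len(A))
print(p_B_given_A)  # → 1/5
```

Here A = {(2,6), (3,5), (4,4), (5,3), (6,2)}, of which only (4,4) has equal faces, so restricting the sample space to A gives 1/5.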

Bayes' theorem

Given two events A and B, Bayes' theorem (aka Bayes' rule) states that

P(A | B) = P(B | A) P(A) / P(B). (bay1)

Sometimes this is written

P(A | B) = P(B | A) P(A) / [ P(B | A) P(A) + P(B | ¬A) P(¬A) ] (bay2)
         = 1 / ( 1 + [P(B | ¬A) / P(B | A)] · [P(¬A) / P(A)] ). (bay3)

This is a useful theorem for determining a test's effectiveness. If a test is performed to determine whether an event has occurred, we might ask questions like "if the test indicates that the event has occurred, what is the probability it has actually occurred?" Bayes' theorem can help compute an answer.

Testing outcomes

The test can be either positive or negative, meaning it can either indicate or not indicate that A has occurred. Furthermore, this result can be either true or false. There are four options, then. Consider an event A and an event that is a test result B indicating that event A has occurred. Table 03.1 shows these four possible test outcomes. The event A occurring can lead to a true positive or a false negative, whereas ¬A can lead to a true negative or a false positive.

              A      ¬A
positive (B)  true   false
negative (¬B) false  true

Table 03.1: test outcomes B for event A.

Terminology is important here.

• P(true positive) = P(B | A), aka sensitivity or detection rate
• P(true negative) = P(¬B | ¬A), aka specificity
• P(false positive) = P(B | ¬A)
• P(false negative) = P(¬B | A)

Clearly, the desirable result for any test is that it is true. However, no test is true 100 percent of the time. So sometimes it is desirable to err on the side of the false positive, as in the case of a medical diagnostic. Other times it is more desirable to err on the side of a false negative, as in the case of testing for defects in manufactured balloons (when a false negative isn't a big deal).

Posterior probabilities

Returning to Bayes' theorem, we can evaluate the posterior probability P(A | B) of the event A having occurred, given that the test B is positive, from information that includes the prior probability P(A) of A. The form in Equation (bay2) or (bay3) is typically useful because it uses commonly known test probabilities: the true positive P(B | A) and the false positive P(B | ¬A). We calculate P(A | B) when we want to interpret test results.

Figure 03.1: for different high sensitivities (P(B | A) = 0.9000, 0.9900, 0.9990, 0.9999), the probability P(A | B) that an event A occurred given that a test for it, B, is positive, versus the probability P(A) that the event A occurs, under the assumption that the specificity equals the sensitivity.

Some interesting results can be found from this. For instance, if we let P(B | A) = P(¬B | ¬A) (sensitivity equal to specificity) and realize that P(B | ¬A) + P(¬B | ¬A) = 1 (given ¬A, either B or ¬B), we can derive the expression

P(B | ¬A) = 1 − P(B | A). (bay4)

Using this and P(¬A) = 1 − P(A) in Equation (bay3) gives (recall we've assumed sensitivity equals specificity)

P(A | B) = 1 / ( 1 + [(1 − P(B | A)) / P(B | A)] · [(1 − P(A)) / P(A)] ) (bay5)
         = 1 / ( 1 + (1/P(B | A) − 1)(1/P(A) − 1) ). (bay6)

This expression is plotted in Figure 03.1. See that a positive result for a rare event (small P(A)) is hard to trust unless the sensitivity P(B | A) and specificity P(¬B | ¬A) are very high indeed.

Example probbay-1: Bayes' theorem

Problem Statement. Suppose 0.1% of springs manufactured at a given plant are defective. Suppose you need to design a test such that, when it indicates a defective part, the part is actually defective 99% of the time. What sensitivity should your test have, assuming it can be made equal to its specificity?

Problem Solution. The following was generated from a Jupyter notebook with the following filename and kernel.

notebook filename: bayes_theorem_example_01.ipynb
notebook kernel: python3

    from sympy import *                  # for symbolics
    import numpy as np                   # for numerics
    import matplotlib.pyplot as plt      # for plots
    from IPython.display import display, Markdown, Latex

Define symbolic variables.

    var('p_A p_nA p_B p_nB p_B_A p_B_nA p_A_B', real=True)

Beginning with Bayes' theorem and assuming the sensitivity and specificity are equal by Equation (bay4), we can derive the following expression for the posterior probability P(A | B).

    p_A_B_e1 = Eq(p_A_B, p_B_A*p_A/p_B).subs({
        p_B: p_B_A*p_A + p_B_nA*p_nA,  # conditional prob.
        p_B_nA: 1 - p_B_A,             # Eq. (bay4)
        p_nA: 1 - p_A,
    })
    display(p_A_B_e1)

p_AB = p_A p_BA / ( p_A p_BA + (1 − p_A)(1 − p_BA) )

Solve this for P(B | A), the quantity we seek.

    p_B_A_sol = solve(p_A_B_e1, p_B_A, dict=True)
    p_B_A_eq1 = Eq(p_B_A, p_B_A_sol[0][p_B_A])
    display(p_B_A_eq1)

p_BA = p_AB (1 − p_A) / ( −2 p_A p_AB + p_A + p_AB )

Now let's substitute the given probabilities.

    p_B_A_spec = p_B_A_eq1.subs({
        p_A: 0.001,
        p_A_B: 0.99,
    })
    display(p_B_A_spec)

p_BA = 0.999989888981011

That's a tall order!

Random variables

Probabilities are useful even when they do not deal strictly with events. It often occurs that we measure something that has randomness associated with it. We use random variables to represent these measurements.

A random variable X : Ω → R is a function that maps an outcome ω from the sample space Ω to a real number x ∈ R, as shown in Figure 03.2. A random variable will be denoted with a capital letter (e.g. X and K), and a specific value that it maps to will be denoted with a lowercase letter (e.g. x and k).

A discrete random variable K is one that takes on discrete values. A continuous random variable X is one that takes on continuous values.

Figure 03.2: a random variable X maps an outcome ω ∈ Ω to an x ∈ R.

Example probrando-1: dice again

Problem Statement. Roll two unbiased dice. Let K be a random variable representing the sum of the two. Let P(k) be the probability of the result k ∈ K. Plot and interpret P(k).
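A quick way to build such a P(k) is to count outcomes; a minimal sketch (our own, not the text's solution):

```python
from collections import Counter
from fractions import Fraction
from itertools import product

# Count how many of the 36 equally likely rolls produce each sum k
counts = Counter(a + b for a, b in product(range(1, 7), repeat=2))
P = {k: Fraction(c, 36) for k, c in sorted(counts.items())}

for k, p in P.items():
    print(k, p)  # k runs from 2 to 12; P peaks at k = 7
```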

Example probrando-2: Johnson-Nyquist noise

Problem Statement. A resistor at nonzero temperature, without any applied voltage, exhibits an interesting phenomenon: its voltage randomly fluctuates. This is called Johnson-Nyquist noise and is a result of thermal excitation of charge carriers (electrons, typically). For a given resistor and measurement system, let the probability density function f_V of the voltage V across an unrealistically hot resistor be

f_V(V) = (1/√π) e^(−V²).

Plot and interpret the meaning of this function.

Probability density and mass functions

Consider an experiment that measures a random variable. We can plot the relative frequency of the measurand landing in different "bins" (ranges of values). This is called a frequency distribution or a probability mass function (PMF).

Consider, for instance, a probability mass function as plotted in Figure 03.3, where a frequency a_i can be interpreted as an estimate of the probability of the measurand being in the ith interval. The sum of the frequencies must be unity:

∑_{i=1}^{k} a_i = 1,

with k being the number of bins.

Figure 03.3: plot of a probability mass function.

The frequency density distribution is similar to the frequency distribution, but with a_i ↦ a_i/∆x, where ∆x is the bin width. If we let the number of bins k → ∞ so that the bin width ∆x → 0, we derive the probability density function (PDF): for x in the ith bin,

f(x) = lim_{k→∞, ∆x→0} a_i / ∆x. (pxf1)

Figure 03.4: plot of a probability density function.

We typically think of a probability density function f, like the one in Figure 03.4, as a function that can be integrated over to find the probability of the random variable (measurand) being in an interval [a, b]:

P(x ∈ [a, b]) = ∫_a^b f(χ) dχ. (pxf2)

Of course,

∫_{−∞}^{∞} f(x) dx = 1.

We now consider a common PMF and a common PDF.

Binomial PMF

Consider a random binary sequence of length n, such that each element is a random 0 or 1, generated independently, like

(1, 0, 1, 1, 0, · · · , 1, 1). (pxf3)

Let events 1 and 0 be mutually exclusive and exhaustive, and P(1) = p. The probability of the sequence above occurring, if it contains k ones, is p^k (1 − p)^(n−k). There are n choose k,

(n choose k) = n! / ( k! (n − k)! ), (pxf4)

possible combinations of k ones for n bits. Therefore, the probability of any combination of k ones in a series is

f(k) = (n choose k) p^k (1 − p)^(n−k). (pxf5)

We call Equation (pxf5) the binomial distribution PMF.

Example probpxf-1: Binomial PMF

Problem Statement. Consider a field sensor that fails for a given measurement with probability p. Given n measurements, plot the binomial PMF as a function of k failed measurements, for a few different probabilities of failure p ∈ {0.04, 0.25, 0.5, 0.75, 0.96}.

Problem Solution. Figure 03.6 shows Matlab code for constructing the PMFs plotted in Figure 03.5. Note that the symmetry is due to the fact that events 1 and 0 are mutually exclusive and exhaustive.


Figure 03.5: binomial PMF for n = 100 measurements and different values of P(1) = p, the probability of a measurement error (p = 0.04, 0.25, 0.5, 0.75, 0.96). The plot is generated by the Matlab code of Figure 03.6.

    % parameters
    n = 100;
    k_a = linspace(1,n,n);
    p_a = [.04,.25,.5,.75,.96];

    % binomial function
    f = @(n,k,p) nchoosek(n,k)*p^k*(1-p)^(n-k);

    % loop through to construct an array
    f_a = NaN*ones(length(k_a),length(p_a));
    for i = 1:length(k_a)
      for j = 1:length(p_a)
        f_a(i,j) = f(n,k_a(i),p_a(j));
      end
    end

    % plot
    figure
    colors = jet(length(p_a));
    for j = 1:length(p_a)
      bar(k_a,f_a(:,j), ...
        'facecolor',colors(j,:), ...
        'facealpha',0.5, ...
        'displayname',['$p = ',num2str(p_a(j)),'$'] ...
      ); hold on
    end
    leg = legend('show','location','north');
    set(leg,'interpreter','latex');
    hold off
    xlabel('number of ones in sequence $k$')
    ylabel('probability')
    xlim([0,100])

Figure 03.6: a Matlab script for generating binomial PMFs.


Gaussian PDF

The Gaussian or normal random variable x has PDF

f(x) = 1/(σ√(2π)) exp( −(x − μ)² / (2σ²) ). (pxf6)

Although we're not quite ready to understand these quantities in detail, it can be shown that the parameters μ and σ have the following meanings:

• μ is the mean of x,
• σ is the standard deviation of x, and
• σ² is the variance of x.

Consider the "bell-shaped" Gaussian PDF in Figure 03.7. It is always symmetric. The mean μ is its central value, and the standard deviation σ is directly related to its width. We will continue to explore the Gaussian distribution in the following lectures, especially in Lecture stats.confidence.

Figure 03.7: PDF for Gaussian random variable x with mean μ = 0 and standard deviation σ = 1/√2.

Expectation

Recall that a random variable is a function X : Ω → R that maps from the sample space to the reals. Random variables are the arguments of probability mass functions (PMFs) and probability density functions (PDFs).

The expected value (or expectation) of a random variable is akin to its "average value" and depends on its PMF or PDF. The expected value of a random variable X is denoted ⟨X⟩ or E[X]. There are two definitions of the expectation: one for a discrete random variable, the other for a continuous random variable. Before we define them, however, it is useful to predefine the most fundamental property of a random variable: its mean.

Definition probE.1: mean. The mean of a random variable X is defined as

m_X = E[X].

Let's begin with a discrete random variable.

Definition probE.2: expectation of a discrete random variable. Let K be a discrete random variable and f its PMF. The expected value of K is defined as

E[K] = ∑_{∀k} k f(k).

Example probE-1: expectation of a discrete random variable

Problem Statement. Given a discrete random variable K with PMF shown below, what is its mean m_K?

Let us now turn to the expectation of a continuous random variable.

Definition probE.3: expectation of a continuous random variable. Let X be a continuous random variable and f its PDF. The expected value of X is defined as

E[X] = ∫_{−∞}^{∞} x f(x) dx.

Example probE-2: expectation of a continuous random variable

Problem Statement. Given a continuous random variable X with Gaussian PDF f, what is the expected value of X?

Due to its sum or integral form, the expected value E[·] has some familiar properties. For random variables X and Y and reals a and b:

E[a] = a (E1a)
E[X + a] = E[X] + a (E1b)
E[aX] = a E[X] (E1c)
E[E[X]] = E[X] (E1d)
E[aX + bY] = a E[X] + b E[Y]. (E1e)
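These properties are easy to sanity-check numerically, since a sample average is itself a (finite) sum; a minimal sketch using a sample-average estimate of E[·] (our own illustration, not from the text):

```python
import random

random.seed(0)
xs = [random.gauss(0, 1) for _ in range(100_000)]  # samples of X

def E(samples):
    """Sample-average estimate of the expectation."""
    return sum(samples) / len(samples)

a, b = 3.0, 2.0
# By (E1b), (E1c): E[aX + b] = a E[X] + b; for finite sample averages
# this linearity holds exactly (up to floating-point round-off).
lhs = E([a * x + b for x in xs])
rhs = a * E(xs) + b
print(abs(lhs - rhs))
```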

Central moments

Given a probability mass function (PMF) or probability density function (PDF) of a random variable, several useful parameters of the random variable can be computed. These are called central moments, which quantify parameters relative to its mean.

Definition probmoments.1: central moments. The nth central moment of random variable X with PDF f is defined as

E[(X − μ_X)ⁿ] = ∫_{−∞}^{∞} (x − μ_X)ⁿ f(x) dx.

For discrete random variable K with PMF f,

E[(K − μ_K)ⁿ] = ∑_{∀k} (k − μ_K)ⁿ f(k).

Example probmoments-1: first moment

Problem Statement. Prove that the first central moment of continuous random variable X is zero.

The second central moment of random variable X is called the variance and is denoted

σ²_X, Var[X], or E[(X − μ_X)²]. (moments1)

The variance is a measure of the width or spread of the PMF or PDF. We usually compute the variance with the formula

Var[X] = E[X²] − μ²_X.

Other properties of variance include, for real constant c,

Var[c] = 0 (moments2)
Var[X + c] = Var[X] (moments3)
Var[cX] = c² Var[X]. (moments4)

The standard deviation is defined as

σ_X = √(Var[X]).

Although the variance is mathematically more convenient, the standard deviation has the same physical units as X, so it is often the more physically meaningful quantity. Due to its meaning as the width or spread of the probability distribution and its sharing of physical units, it is a convenient choice for error bars on plots of a random variable.

The skewness Skew[X] is a normalized third central moment,

Skew[X] = E[(X − μ_X)³] / σ³_X. (moments5)

Skewness is a measure of asymmetry of a random variable's PDF or PMF. For a symmetric PMF or PDF, such as the Gaussian PDF, Skew[X] = 0.

The kurtosis Kurt[X] is a normalized fourth central moment,

Kurt[X] = E[(X − μ_X)⁴] / σ⁴_X. (moments6)

Kurtosis is a measure of the tailedness of a random variable's PDF or PMF. "Heavier" tails yield higher kurtosis.

A Gaussian random variable has PDF with kurtosis 3. Given that for Gaussians both skewness and kurtosis have nice values (0 and 3), we can think of skewness and kurtosis as measures of similarity to the Gaussian PDF.

Exercises for Chapter 03

Exercise 03.1 (5)

Several physical processes can be modeled with a random walk: a process of iteratively changing a quantity by some random amount. Infinitely many variations are possible, but common factors of variation include probability distribution, step size, dimensionality (e.g. one-dimensional, two-dimensional, etc.), and coordinate system. Graphical representations of these walks can be beautiful. Develop a computer program that generates random walks and corresponding graphics. Do it well, and call it art, because it is.
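As a starting point, a minimal two-dimensional walk with unit-length steps in uniformly random directions might look like the following sketch (the function name and parameters are ours; the exercise asks for your own variations and graphics):

```python
import math
import random

def random_walk_2d(n_steps, step_size=1.0, seed=None):
    """Return x- and y-coordinate lists of a 2D random walk from the origin."""
    rng = random.Random(seed)
    x, y = [0.0], [0.0]
    for _ in range(n_steps):
        theta = rng.uniform(0, 2 * math.pi)  # random step direction
        x.append(x[-1] + step_size * math.cos(theta))
        y.append(y[-1] + step_size * math.sin(theta))
    return x, y

xs, ys = random_walk_2d(1000, seed=42)
print(len(xs))  # → 1001 points, including the origin
```

From here, one could vary the step-size distribution, move to polar or 3D coordinates, and feed the coordinates to a plotting library for the graphics.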

Statistics

Whereas probability theory is primarily focused on the relations among mathematical objects, statistics is concerned with making sense of the outcomes of observation (Steven S. Skiena. Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win. Outlooks. Cambridge University Press, 2001. DOI: 10.1017/CBO9780511547089. This includes a lucid section on probability versus statistics, also available here: https://www3.cs.stonybrook.edu/~skiena/jaialai/excerpts/node12.html). However, we frequently use statistical methods to estimate probabilistic models. For instance, we will learn how to estimate the standard deviation of a random process we have some reason to expect has a Gaussian probability distribution.

Statistics has applications in nearly every applied science and engineering discipline. Any time measurements are made, statistical analysis is how one makes sense of the results. For instance, determining a reasonable level of confidence in a measured parameter requires statistics.

A particularly hot topic nowadays is machine learning, which seems to be a field with applications that continue to expand. This field is fundamentally built on statistics.

A good introduction to statistics appears at the end of Ash.¹ A more involved introduction is given by Jaynes et al.² The treatment by Kreyszig³ is rather incomplete, as will be our own.

1. Ash, Basic Probability Theory.
2. Jaynes et al., Probability Theory: The Logic of Science.
3. Erwin Kreyszig. Advanced Engineering Mathematics. 10th ed. John Wiley & Sons, Limited, 2011. ISBN: 9781119571094. The authoritative resource for engineering mathematics. It includes detailed accounts of probability, statistics, vector calculus, linear algebra, Fourier analysis, ordinary and partial differential equations, and complex analysis. It also includes several other topics with varying degrees of depth. Overall, it is the best place to start when seeking mathematical guidance.

Populations, samples, and machine learning

An experiment's population is a complete collection of objects that we would like to study. These objects can be people, machines, processes, or anything else we would like to understand experimentally.

Of course, we typically can't measure all of the population. Instead, we take a subset of the population, called a sample, and infer the characteristics of the entire population from this sample.

However, this inference, that the sample is somehow representative of the population, assumes the sample size is sufficiently large and that the sampling is random. This means selection of the sample should be such that no one group within a population is systematically over- or under-represented in the sample.

Machine learning is a field that makes extensive use of measurements and statistical inference. In it, an algorithm is trained by exposure to sample data, which is called a training set. The variables measured are called features. Typically, a predictive model is developed that can be used to extrapolate from the data to a new situation. The methods of statistical analysis we introduce in this chapter are the foundation of most machine learning methods.

Example statsterms-1: combat boots

Problem Statement. Consider a robot, Pierre, with a particular gravitas and sense of style. He seeks just the right-looking pair of combat boots for wearing in the autumn rains. Pierre is to purchase the boots online via image recognition, and decides to gather data to train his style by visiting a hipster hangout one evening. For contrast, he also watches footage of a White Nationalist rally, focusing special attention on the boots of wearers of khakis and polos. Comment on Pierre's methods.

Estimation of sample mean and variance

Estimation and sample statistics

The mean and variance definitions of Lecture probE and Lecture probmoments apply only to a random variable for which we have a theoretical probability distribution. Typically, it is not until after having performed many measurements of a random variable that we can assign a good distribution model. Until then, measurements can help us estimate aspects of the data. We usually start by estimating basic parameters, such as mean and variance, before estimating a probability distribution.

There are two key aspects to randomness in the measurement of a random variable. First, of course, there is the underlying randomness, with its probability distribution, mean, standard deviation, etc., which we call the population statistics. Second, there is the statistical variability that is due to the fact that we are estimating the random variable's statistics, called its sample statistics, from some sample. Statistical variability is decreased with greater sample size and number of samples, whereas the underlying randomness of the random variable does not decrease. Instead, our estimates of its probability distribution and statistics improve.

Sample mean, variance, and standard deviation

The arithmetic mean or sample mean of a measurand with sample size N, represented by random variable X, is defined as

x̄ = (1/N) ∑_{i=1}^{N} x_i. (sample1)

If the sample size is large, x̄ → m_X (the sample mean approaches the mean). The population mean is another name for the mean m_X, which is equal to

m_X = lim_{N→∞} (1/N) ∑_{i=1}^{N} x_i. (sample2)

Recall that the definition of the mean is m_X = E[X].

The sample variance of a measurand represented by random variable X is defined as

S²_X = 1/(N − 1) ∑_{i=1}^{N} (x_i − x̄)². (sample3)

If the sample size is large, S²_X → σ²_X (the sample variance approaches the variance). The population variance is another term for the variance σ²_X, and can be expressed as

σ²_X = lim_{N→∞} 1/(N − 1) ∑_{i=1}^{N} (x_i − x̄)². (sample4)

Recall that the definition of the variance is σ²_X = E[(X − m_X)²].

The sample standard deviation of a measurand represented by random variable X is defined as

S_X = √(S²_X). (sample5)

If the sample size is large, S_X → σ_X (the sample standard deviation approaches the standard deviation). The population standard deviation is another term for the standard deviation σ_X, and can be expressed as

σ_X = lim_{N→∞} √(S²_X). (sample6)

Recall that the definition of the standard deviation is σ_X = √(σ²_X).
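These estimators take only a few lines of Python; a sketch over a hypothetical set of measurements (note the standard library's `statistics.stdev` uses the same N − 1 divisor as (sample3)):

```python
import math
import statistics

x = [4.9, 5.1, 5.0, 4.8, 5.2, 5.0]  # hypothetical measurements

N = len(x)
xbar = sum(x) / N                                  # sample mean, (sample1)
S2 = sum((xi - xbar) ** 2 for xi in x) / (N - 1)   # sample variance, (sample3)
S = math.sqrt(S2)                                  # sample std. dev., (sample5)

# Cross-check against the standard library
assert abs(xbar - statistics.mean(x)) < 1e-9
assert abs(S - statistics.stdev(x)) < 1e-9
print(xbar, S2, S)
```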

Sample statistics as random variables

There is an ambiguity in our usage of the term "sample": it can mean just one measurement, or it can mean a collection of measurements gathered together. Hopefully it is clear from context.

In the latter sense, often we collect multiple samples, each of which has its own sample mean X̄_i and standard deviation S_X_i. In this situation, X̄_i and S_X_i are themselves random variables (meta af, I know). This means they have their own sample means (the mean of means and the mean of standard deviations) and their own standard deviations (the standard deviation of means S_X̄ and the standard deviation of standard deviations).

The mean of means is equivalent to a mean with a larger sample size and is therefore our best estimate of the mean of the underlying random process. The mean of standard deviations is our best estimate of the standard deviation of the underlying random process. The standard deviation of means S_X̄ is a measure of the spread in our estimates of the mean. It is our best estimate of the standard deviation of the statistical variation, and should therefore tend to zero as sample size and number of samples increase. The standard deviation of standard deviations is a measure of the spread in our estimates of the standard deviation of the underlying process. It should also tend to zero as sample size and number of samples increase.

Let N be the size of each sample. It can be shown that the standard deviation of the means S_X̄ can be estimated from a single sample standard deviation:

S_X̄ ≈ S_X / √N. (sample7)

This shows that as the sample size N increases, the statistical variability of the mean decreases (and in the limit approaches zero).
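The 1/√N behavior of (sample7) is easy to see in simulation; a sketch of our own, assuming a unit-variance Gaussian process:

```python
import random
import statistics

random.seed(0)

def spread_of_means(n_samples, sample_size):
    """Std. dev. of the sample means across many samples of a unit Gaussian."""
    means = [
        statistics.mean(random.gauss(0, 1) for _ in range(sample_size))
        for _ in range(n_samples)
    ]
    return statistics.stdev(means)

# The spread of the mean shrinks roughly like 1/sqrt(N)
print(spread_of_means(500, 10))    # roughly 1/sqrt(10) ≈ 0.32
print(spread_of_means(500, 1000))  # roughly 1/sqrt(1000) ≈ 0.032
```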

Nonstationary signal statistics

The sample mean variance and standard

deviation definitions above assume the random

process is stationarymdashthat is its population

mean does not vary with time However

a great many measurement signals have

populations that do vary with time ie they are

nonstationary Sometimes the nonstationarity

arises from a ldquodriftrdquo in the dc value of a

signal or some other slowly changing variable

But dynamic signals can also change in a

recognizable and predictable manner as when

say the temperature of a room changes when

a window is opened or when a water level

changes with the tide

Typically we would like to minimize the effect

of nonstationarity on the signal statistics In

certain cases such as drift the variation is a

nuissance only but other times it is the point

of the measurement

stats Statistics sample Estimation of sample mean and variance p 1

Two common techniques are used, depending on the overall type of nonstationarity. If it is periodic with some known or estimated period, the measurement data series can be "folded" or "reshaped" such that the ith measurement of each period corresponds to the ith measurement of all other periods. In this case, somewhat counterintuitively, we can consider the ith measurements to correspond to a sample of size N, where N is the number of periods over which measurements are made.

When the signal is aperiodic, we often simply divide it into "small" (relative to its overall trend) intervals, over which statistics are computed separately.
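The folding idea can be sketched compactly outside Matlab as well; here is a minimal Python version (with made-up data, not the lecture's) that reshapes a periodic series so that the i-th measurements of all periods form one sample:

```python
# Fold a periodic series of n_periods * period_len points into
# per-phase samples, then compute a mean for each phase.
period_len = 4
n_periods = 3
series = [10, 20, 30, 20,   # period 1
          11, 21, 31, 21,   # period 2
          12, 22, 32, 22]   # period 3

# folded[i] collects the i-th measurement of every period
folded = [series[i::period_len] for i in range(period_len)]
phase_means = [sum(s) / len(s) for s in folded]
print(folded[0])      # [10, 11, 12]: the i = 0 sample across periods
print(phase_means)    # [11.0, 21.0, 31.0, 21.0]
```

Each `folded[i]` is a sample of size `n_periods`, exactly the counterintuitive regrouping described above.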

Note that in this discussion we have assumed that the nonstationarity of the signal is due to a variable that is deterministic (not random).

Example statssample-1: measurement of gaussian noise on a nonstationary signal

Problem Statement: Consider the measurement of the temperature inside a desktop computer chassis via an inexpensive thermistor, a resistor that changes resistance with temperature. The processor and power supply heat the chassis in a manner that depends on processing demand. For the test protocol, the processors are cycled sinusoidally through processing power levels at a frequency of 50 mHz for nT = 12 periods and sampled at 1 Hz. Assume a temperature fluctuation between about 20 and 50 C and gaussian noise with standard deviation 4 C. Consider a sample to be the multiple measurements of a certain instant in the period.

1. Generate and plot simulated temperature data as a time series and as a histogram or frequency distribution. Comment on why the frequency distribution sucks.
2. Compute the sample mean and standard deviation for each sample in the cycle.
3. Subtract the mean from each sample in the period, such that each sample distribution is centered at zero. Plot the composite frequency distribution of all samples together. This represents our best estimate of the frequency distribution of the underlying process.
4. Plot a comparison of the theoretical mean, which is 35, and the sample mean of means, with an error bar. Vary the number of samples nT and comment on its effect on the estimate.
5. Plot a comparison of the theoretical standard deviation and the sample mean of sample standard deviations, with an error bar. Vary the number of samples nT and comment on its effect on the estimate.
6. Plot the sample means over a single period with error bars of ± one sample standard deviation of the means. This represents our best estimate of the sinusoidal heating temperature. Vary the number of samples nT and comment on the estimate.

Problem Solution

clear; close all; % clear kernel

Generate the temperature data

The temperature data can be generated by constructing an array that is passed to a sinusoid, then "randomized" by gaussian random numbers. Note that we add 1 to np and n to avoid the sneaky fencepost error.

f = 50e-3; % Hz, sinusoid frequency
a = 15; % C, amplitude of oscillation
dc = 35; % C, dc offset of oscillation
fs = 1; % Hz, sampling frequency
nT = 12; % number of sinusoid periods
s = 4; % C, standard deviation
np = fs/f + 1; % number of samples per period
n = nT*np + 1; % total number of samples

t_a = linspace(0,nT/f,n); % time array
sin_a = dc + a*sin(2*pi*f*t_a); % sinusoidal array
rng(43); % seed the random number generator
noise_a = s*randn(size(t_a)); % gaussian noise
signal_a = sin_a + noise_a; % sinusoid + noise

Now that we have an array of "data," we're ready to plot.

h = figure;
p = plot(t_a,signal_a,'o-',...
  'Color',[.8,.8,.8],...
  'MarkerFaceColor','b',...
  'MarkerEdgeColor','none',...
  'MarkerSize',3);
xlabel('time (s)');
ylabel('temperature (C)');
hgsave(h,'figures/temp');

Figure 04.1: temperature over time

This is something like what we might see for continuous measurement data. Now, the histogram:

h = figure;
histogram(signal_a,...
  30,... % number of bins
  'normalization','probability'); % for PMF
xlabel('temperature (C)');
ylabel('probability');
hgsave(h,'figures/temp');

Figure 04.2: a poor histogram due to nonstationarity of the signal

This sucks because we plot a frequency distribution to tell us about the random variation, but this data includes the sinusoid.

Sample mean, variance, and standard deviation

To compute the sample mean and standard deviation for each sample in the period, we must "pick out" the nT data points that correspond to each other. Currently, they're in one long 1 × n array, signal_a. It is helpful to reshape the data so it is in an nT × np array, with each row corresponding to a new period. This leaves the correct points aligned in columns. It is important to note that we can do this "folding" operation only when we know rather precisely the period of the underlying sinusoid. It is given in the problem that it is a controlled experiment variable. If we did not know it, we would have to estimate it, too, from the data.

signal_ar = reshape(signal_a(1:end-1),[np,nT])'; % fold
size(signal_ar) % check size
signal_ar(1:3,1:4) % print three rows, four columns

ans =
    12    21

ans =
   30.2718   40.0946   40.8341   44.7662
   40.1836   37.2245   49.4076   46.1137
   40.0571   40.9718   46.1627   41.9145

Define the mean, variance, and standard deviation functions as "anonymous functions." We "roll our own." These are not as efficient or flexible as the built-in Matlab functions mean, var, and std, which should typically be used.

my_mean = @(vec) sum(vec)/length(vec);
my_var = @(vec) sum((vec - my_mean(vec)).^2)/(length(vec) - 1);
my_std = @(vec) sqrt(my_var(vec));

Now the sample mean, variance, and standard deviations can be computed. We proceed by looping through each column of the reshaped signal array.

mu_a = NaN*ones(1,np); % initialize mean array
var_a = NaN*ones(1,np); % initialize var array
s_a = NaN*ones(1,np); % initialize std array

for i = 1:np % for each column
  mu_a(i) = my_mean(signal_ar(:,i));
  var_a(i) = my_var(signal_ar(:,i));
  s_a(i) = sqrt(var_a(i)); % touch of speed
end

Composite frequency distribution

The columns represent samples. We want to subtract the mean from each column. We use repmat to reproduce mu_a in nT rows, so it can be easily subtracted.

signal_arz = signal_ar - repmat(mu_a,[nT,1]);
size(signal_arz) % check size
signal_arz(1:3,1:4) % print three rows, four columns

ans =
    12    21

ans =
   -5.0881    0.9525   -0.2909   -1.5700
    4.8237   -1.9176    8.2826   -0.2225
    4.6972    1.8297    5.0377   -4.4216

Now that all samples have the same mean, we can lump them into one big bin for the frequency distribution. There are some nice built-in functions to do a quick reshape and fit.

signal_arzr = reshape(signal_arz,[1,nT*np]); % resize
size(signal_arzr) % check size
pdfit_model = fitdist(signal_arzr','normal'); % fit
x_a = linspace(-15,15,100);
pdfit_a = pdf(pdfit_model,x_a);
pdf_a = normpdf(x_a,0,s); % theoretical pdf

ans =
     1   252

Plot:

h = figure;
histogram(signal_arzr,...
  round(s*sqrt(nT)),... % number of bins
  'normalization','probability'); % for PMF
hold on;
plot(x_a,pdfit_a,'b-','linewidth',2); hold on;
plot(x_a,pdf_a,'g--','linewidth',2);
legend('pmf','pdf est','pdf');
xlabel('zero-mean temperature (C)');
ylabel('probability mass/density');
hgsave(h,'figures/temp');

Figure 04.3: PMF and estimated and theoretical PDFs

Means comparison

The sample mean of means is simply the following:

mu_mu = my_mean(mu_a)

mu_mu =
   35.1175

The standard deviation that works as an error bar, which should reflect how well we can estimate the point plotted, is the standard deviation of the means. It is difficult to compute this directly for a nonstationary process. We use the estimate given above and improve upon it by using the mean of standard deviations instead of a single sample's standard deviation.

s_mu = mean(s_a)/sqrt(nT)

s_mu =
    1.1580

Now for the simple plot:

h = figure;
bar(mu_mu); hold on; % gives bar
errorbar(mu_mu,s_mu,'r','linewidth',2); % error bar
ax = gca; % current axis
ax.XTickLabels = {'$\overline{\overline{X}}$'};
ax.TickLabelInterpreter = 'latex';
hgsave(h,'figures/temp');

Standard deviations comparison

The sample mean of standard deviations is simply the following:

mu_s = my_mean(s_a)

mu_s =
    4.0114

The standard deviation that works as an error bar, which should reflect how well we can estimate the point plotted, is the standard deviation of the standard deviations. We can compute this directly.

s_s = my_std(s_a)

s_s =
    0.8495

Now for the simple plot:

h = figure;
bar(mu_s); hold on; % gives bar
errorbar(mu_s,s_s,'r','linewidth',2); % error bars
ax = gca; % current axis
ax.XTickLabels = {'$\overline{S_X}$'};
ax.TickLabelInterpreter = 'latex';
hgsave(h,'figures/temp');

Figure 04.4: sample mean of sample means

Figure 04.5: sample standard deviation of sample means

Plot a period with error bars

Plotting the data with error bars is fairly straightforward with the built-in errorbar function. The main question is: "which standard deviation?" Since we're plotting the means, it makes sense to plot the error bars as a single sample standard deviation of the means.

h = figure;
e1 = errorbar(t_a(1:np),mu_a,s_mu*ones(1,np),'b');
hold on;
t_a2 = linspace(0,1/f,101);
e2 = plot(t_a2,dc + a*sin(2*pi*f*t_a2),'r-');
xlim([t_a(1),t_a(np)]);
grid on;
xlabel('folded time (s)');
ylabel('temperature (C)');
legend([e1,e2],'sample mean','population mean',...
  'Location','NorthEast');
hgsave(h,'figures/temp');

Figure 04.6: sample means over a period


confidence

central limit theorem

statsconfidence Confidence

One really ought to have it to give a lecture named it, but we'll give it a try anyway. Confidence is used in the common sense, although we do endow it with a mathematical definition, to scare business majors, who aren't actually impressed but indifferent. Approximately: if, under some reasonable assumptions (probabilistic model), we estimate the probability of some event to be P, we say we have P confidence in it. I mean, business majors are all "Supply and demand? Let's call that a 'law,'" so I think we're even.

So we're back to computing probability from distributions: probability density functions (PDFs) and probability mass functions (PMFs). Usually we care most about estimating the mean of our distribution. Recall from the previous lecture that, when several samples are taken, each with its own mean, the mean is itself a random variable, with a mean of its own, of course. Meanception.

But the mean has a probability distribution of its own. The central limit theorem has, as one of its implications, that as the sample size N gets large, regardless of the sample distributions, this distribution of means approaches the Gaussian distribution.

But sometimes I always worry I'm being lied to, so let's check.

clear; close all; % clear kernel

Generate some data to test the central limit theorem

Data can be generated by constructing an array using a (seeded, for consistency) random number generator. Let's use a uniformly distributed PDF between 0 and 1.

N = 150; % sample size (number of measurements per sample)
M = 120; % number of samples
n = N*M; % total number of measurements
mu_pop = 0.5; % because it's a uniform PDF between 0 and 1

rng(11); % seed the random number generator
signal_a = rand(N,M); % uniform PDF
size(signal_a) % check the size

ans =
   150   120

Let's take a look at the data by plotting the first ten samples (columns), as shown in Figure 04.7. This is something like what we might see for continuous measurement data. Now consider the histogram, shown in Figure 04.8.

Figure 04.7: raw data with colors corresponding to samples

Figure 04.8: a histogram showing the approximately uniform distribution of each sample (color)

samples_to_plot = 10;
h = figure;
c = jet(samples_to_plot); % color array
for j = 1:samples_to_plot
  histogram(signal_a(:,j),...
    30,... % number of bins
    'facecolor',c(j,:),...
    'facealpha',.3,...
    'normalization','probability'); % for PMF
  hold on;
end
hold off;
xlim([-.05,1.05]);
xlabel('measurement');
ylabel('probability');
hgsave(h,'figures/temp');

This isn't a great plot, but it shows roughly that each sample is fairly uniformly distributed.

Sample statistics

Now let's check out the sample statistics. We want the sample mean and standard deviation of each column. Let's use the built-in functions mean and std.

mu_a = mean(signal_a,1); % mean of each column
s_a = std(signal_a,0,1); % std of each column

Now we can compute the mean statistics: both the mean of the means and the standard deviation of the means, which we don't strictly need for this part, but we're curious. We choose to use the direct estimate instead of the $S_X/\sqrt{N}$ formula, but they should be close.

mu_mu = mean(mu_a)
s_mu = std(mu_a)

mu_mu =
    0.4987

s_mu =
    0.0236

The truth about sample means

It's the moment of truth. Let's look at the distribution, shown in Figure 04.9.

Figure 04.9: a histogram showing the approximately normal distribution of the means

h = figure;
histogram(mu_a,...
  'normalization','probability'); % for PMF
hold off;
xlabel('measurement');
ylabel('probability');
hgsave(h,'figures/temp');

This looks like a Gaussian distribution about the mean of means, so I guess the central limit theorem is legit.

Gaussian and probability

We already know how to compute the probability P that a value of a random variable X lies in a certain interval from a PMF or PDF (the sum or the integral, respectively). This means that, for sufficiently large sample size N, such that we can assume from the central limit theorem that the sample means are normally distributed, the probability a sample mean value is in a certain interval is given by integrating the Gaussian PDF. The Gaussian PDF for random variable Y representing the sample means is
$$f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\frac{-(y-\mu)^2}{2\sigma^2}, \qquad \text{(confidence.1)}$$

where µ is the population mean and σ is the population standard deviation.

The integral of f over some interval is the probability a value will be in that interval. Unfortunately, that integral is uncool. It gives rise to the definition of the error function, which for the Gaussian random variable Y is
$$\operatorname{erf}(y_b) = \frac{1}{\sqrt{\pi}} \int_{-y_b}^{y_b} e^{-t^2}\, dt. \qquad \text{(confidence.2)}$$
This expresses the probability of a sample mean being in the interval $[-y_b, y_b]$ if Y has mean 0 and variance 1/2.
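We can sanity-check this definition numerically (a Python sketch using only the standard library): the midpoint-rule integral of $e^{-t^2}/\sqrt{\pi}$ over $[-y_b, y_b]$ should match the built-in erf.

```python
import math

def erf_numeric(yb, n=200000):
    """Midpoint-rule approximation of (1/sqrt(pi)) times the
    integral of exp(-t^2) over [-yb, yb]."""
    dt = 2 * yb / n
    total = sum(
        math.exp(-(-yb + (k + 0.5) * dt) ** 2) for k in range(n)
    )
    return total * dt / math.sqrt(math.pi)

for yb in (0.5, 1.0, 2.0):
    print(yb, erf_numeric(yb), math.erf(yb))
# the two computed columns agree to several decimal places
```

The agreement confirms that erf is exactly the integral in (confidence.2), just tabulated once and for all.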

Matlab has this built-in as erf, shown in Figure 04.10.

y_a = linspace(0,3,100);
h = figure;
p1 = plot(y_a,erf(y_a));
p1.LineWidth = 2;
grid on;
xlabel('interval bound $y_b$','interpreter','latex');
ylabel('error function $\textrm{erf}(y_b)$',...
  'interpreter','latex');
hgsave(h,'figures/temp');

Figure 04.10: the error function

We could deal directly with the error function, but most people don't, and we're weird enough as it is. Instead, people use the Gaussian cumulative distribution function

(CDF) $\Phi: \mathbb{R} \rightarrow \mathbb{R}$, which is defined as
$$\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}\left(\frac{z}{\sqrt{2}}\right)\right) \qquad \text{(confidence.3)}$$
and which expresses the probability of a Gaussian random variable Z, with mean 0 and standard deviation 1, taking on a value in the interval $(-\infty, z]$. The Gaussian CDF and PDF are shown in Figure 04.11. Values can be taken directly from the graph, but it's more accurate to use the table of values in Appendix A01.

That's great and all, but occasionally (always) we have Gaussian random variables with nonzero means and nonunity standard deviations. It turns out we can shift any Gaussian random variable by its mean and scale it by its standard deviation to make it have zero mean and unity standard deviation. We can then use Φ and interpret the results as being relative to the mean and standard deviation, using phrases like "the probability it is within two standard deviations of its mean." The transformed random variable Z and its values z are sometimes called the z-score. For a particular value x of a random variable X, we can compute its z-score (or value z of

Figure 04.11: the Gaussian PDF and CDF for z-scores: (a) the Gaussian PDF f(z); (b) the Gaussian CDF Φ(z_b)

random variable Z) with the formula
$$z = \frac{x - \mu_X}{\sigma_X} \qquad \text{(confidence.4)}$$
and compute the probability of X taking on a value within the interval, say $x \in [x_{b-}, x_{b+}]$, from the table. (Sample statistics $\overline{X}$ and $S_X$ are appropriate when population statistics are unknown.)

For instance, compute the probability a Gaussian random variable X with $\mu_X = 5$ and $\sigma_X = 2.34$ takes on a value within the interval $x \in [3, 6]$:

1. Compute the z-score of each endpoint of the interval:
$$z_3 = \frac{3 - \mu_X}{\sigma_X} \approx -0.85 \qquad \text{(confidence.5)}$$
$$z_6 = \frac{6 - \mu_X}{\sigma_X} \approx 0.43 \qquad \text{(confidence.6)}$$
2. Look up the CDF values for z_3 and z_6, which are $\Phi(z_3) = 0.1977$ and $\Phi(z_6) = 0.6664$.
3. The CDF values correspond to the probabilities x < 3 and x < 6. Therefore, to find the probability x lies in that interval, we subtract the lower-bound probability:
$$P(x \in [3, 6]) = P(x < 6) - P(x < 3) \qquad \text{(confidence.7)}$$
$$= \Phi(z_6) - \Phi(z_3) \qquad \text{(confidence.8)}$$
$$\approx 0.6664 - 0.1977 \qquad \text{(confidence.9)}$$
$$\approx 0.4689. \qquad \text{(confidence.10)}$$

So there is a 46.89% probability, and therefore we have 46.89% confidence, that $x \in [3, 6]$.
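The same computation can be checked against the exact CDF (a Python sketch; any small discrepancy from the tabulated 0.4689 is just the table's rounding):

```python
import math

def Phi(z):
    """Standard Gaussian CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

mu_X, sigma_X = 5, 2.34
p = Phi((6 - mu_X) / sigma_X) - Phi((3 - mu_X) / sigma_X)
print(p)  # approximately 0.469
```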

Often we want to go the other way, estimating the symmetric interval $[x_{b-}, x_{b+}]$ for which there is a given probability. In this case, we first look up the z-score corresponding to a certain probability. For concreteness, given the same population statistics above, let's find the symmetric interval $[x_{b-}, x_{b+}]$ over which we have 90% confidence. From the table, we want two symmetric z-scores that have CDF-value difference 0.9. Or, in maths,
$$\Phi(z_{b+}) - \Phi(z_{b-}) = 0.9 \quad \text{and} \quad z_{b+} = -z_{b-}. \qquad \text{(confidence.11)}$$

Due to the latter relation and the additional fact that the Gaussian CDF has antisymmetry,
$$\Phi(z_{b+}) + \Phi(z_{b-}) = 1. \qquad \text{(confidence.12)}$$
Adding the two Φ equations,
$$\Phi(z_{b+}) = 1.9/2 \qquad \text{(confidence.13)}$$
$$= 0.95 \qquad \text{(confidence.14)}$$
and $\Phi(z_{b-}) = 0.05$. From the table, these correspond (with a linear interpolation) to $z_b = z_{b+} = -z_{b-} \approx 1.645$. All that remains is to solve the z-score formula for x:
$$x = \mu_X + z\sigma_X. \qquad \text{(confidence.15)}$$
From this,
$$x_{b+} = \mu_X + z_{b+}\sigma_X \approx 8.849 \qquad \text{(confidence.16)}$$
$$x_{b-} = \mu_X + z_{b-}\sigma_X \approx 1.151 \qquad \text{(confidence.17)}$$
and X has a 90% confidence interval $[1.151, 8.849]$.
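Without a table, the z-score for a given confidence can be obtained by inverting Φ numerically; here is a Python sketch using simple bisection (the helper name z_for_upper_cdf is ours, not a library function):

```python
import math

def Phi(z):
    """Standard Gaussian CDF via the error function."""
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

def z_for_upper_cdf(p, lo=-10.0, hi=10.0):
    """Bisection solve of Phi(z) = p; Phi is monotone increasing."""
    for _ in range(100):
        mid = (lo + hi) / 2
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

mu_X, sigma_X = 5, 2.34
z_b = z_for_upper_cdf(0.95)                      # about 1.645
interval = (mu_X - z_b * sigma_X, mu_X + z_b * sigma_X)
print(z_b, interval)  # about 1.645 and (1.151, 8.849)
```

This reproduces the table-and-interpolation result above to more digits than the table provides.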


Example statsconfidence-1: gaussian confidence for a mean

Problem Statement: Consider the data set generated above. What is our 95% confidence interval in our estimate of the mean?

Problem Solution: Assuming we have a sufficiently large data set, the distribution of means is approximately Gaussian. Following the same logic as above, we need the z-score that gives an upper CDF value of (1 + 0.95)/2 = 0.975. From the table, we obtain the z_b = z_{b+} = -z_{b-} below.

z_b = 1.96;

Now we can estimate the mean using our sample and mean statistics:
$$\mu_X = \overline{\overline{X}} \pm z_b S_{\overline{X}}. \qquad \text{(confidence.18)}$$

mu_x_95 = mu_mu + [-z_b,z_b]*s_mu

mu_x_95 =
    0.4526    0.5449

This is our 95% confidence interval in our estimate of the mean.

Student's t-distribution

degrees of freedom

statsstudent Student confidence

The central limit theorem tells us that, for large sample size N, the distribution of the means is Gaussian. However, for small sample size, the Gaussian isn't as good of an estimate. Student's t-distribution is superior for lower sample size and equivalent at higher sample size. Technically, if the population standard deviation σ_X is known, even for low sample size we should use the Gaussian distribution. However, this rarely arises in practice, so we can usually get away with an "always t" approach.

A way that the t-distribution accounts for low N is by having an entirely different distribution for each N (seems a bit of a cheat to me). Actually, instead of N, it uses the degrees of freedom ν, which is N minus the number of parameters required to compute the statistic. Since the standard deviation requires only the mean, for most of our cases ν = N − 1.
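For reference (this formula is standard, though not given in the notes above), the t-distribution's PDF for ν degrees of freedom is

```latex
f(t) = \frac{\Gamma\!\left(\frac{\nu+1}{2}\right)}
            {\sqrt{\nu\pi}\,\Gamma\!\left(\frac{\nu}{2}\right)}
       \left(1 + \frac{t^2}{\nu}\right)^{-\frac{\nu+1}{2}},
```

which has heavier tails than the Gaussian for small ν and approaches the Gaussian PDF as ν → ∞, consistent with the "equivalent at higher sample size" claim above.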

As with the Gaussian distribution, the t-distribution's integral is difficult to calculate. Typically, we will use a t-table, such as the one given in Appendix A02. There are three points of note:

1. Since we are primarily concerned with going from probability/confidence values to intervals, typically there is a column for each probability.
2. The extra parameter ν takes over one of the dimensions of the table, because three-dimensional tables are illegal.
3. Many of these tables are "two-sided," meaning their t-scores and probabilities assume you want the symmetric probability about the mean over the interval [−t_b, t_b], where t_b is your t-score bound.

Consider the following example.

Example statsstudent-1: student confidence interval

Problem Statement: Write a Matlab script to generate a data set with 200 samples and sample sizes N ∈ {10, 20, 100}, using any old distribution. Compare the distribution of the means for the different N. Use the sample distributions and a t-table to compute 99% confidence intervals.

Problem Solution: Generate the data set.

confidence = 0.99; % requirement
M = 200; % number of samples
N_a = [10,20,100]; % sample sizes
mu = 27; % population mean
sigma = 9; % population std
rng(1); % seed random number generator
data_a = mu + sigma*randn(N_a(end),M); % normal
size(data_a) % check size
data_a(1:10,1:5) % check 10 rows and five columns

ans =
   100   200

ans =
   21.1589   30.2894   27.8705   30.7835   28.3662
   37.6305   17.1264   28.2973   24.0811   34.3486
   20.1739   44.3719   43.7059   39.0699   32.2002
   17.0135   32.6064   36.9030   37.9230   36.5747
   19.3900   32.9156   23.7230   22.4749   19.7709
   21.8460   13.8295   31.2479   16.9527   34.1876
   21.9719   34.6854   19.4480   18.7014   24.1642
   28.6054   32.2244   22.2873   26.9906   37.6746
   25.2282   18.7326   14.5011   28.3814   27.7645
   32.2780   34.1538   27.0382   18.8643   14.1752

Compute the means for different sample sizes.

mu_a = NaN*ones(length(N_a),M);
for i = 1:length(N_a)
  mu_a(i,:) = mean(data_a(1:N_a(i),1:M),1);
end

Plotting the distribution of the means yields Figure 04.12.

Figure 04.12: a histogram showing the distribution of the means for each sample size, with intervals for N = 10, N = 20, and N = 100

It makes sense that the larger the sample size, the smaller the spread. A quantitative metric for the spread is, of course, the standard deviation of the means for each sample size.

S_mu = std(mu_a,0,2)

S_mu =
    2.8365
    2.0918
    1.0097

Look up t-table values (or use Matlab's tinv) for the different sample sizes and 99% confidence. Use these, the mean of means, and the standard deviation of means to compute the 99% confidence interval for each N.

t_a = tinv(confidence,N_a-1);
for i = 1:length(N_a)
  interval = mean(mu_a(i,:)) + [-1,1]*t_a(i)*S_mu(i);
  disp(sprintf('interval for N = %i:',N_a(i)));
  disp(interval);
end

t_a =
    2.8214    2.5395    2.3646

interval for N = 10:
   19.0942   35.1000

interval for N = 20:
   21.6292   32.2535

interval for N = 100:
   24.7036   29.4787

As expected, the larger the sample size, the smaller the interval over which we have 99% confidence in the estimate.

joint PDF

joint PMF

statsmultivar Multivariate probability and correlation

Thus far, we have considered probability density and mass functions (PDFs and PMFs) of only one random variable. But, of course, often we measure multiple random variables X_1, X_2, ..., X_n during a single event, meaning a measurement consists of determining values x_1, x_2, ..., x_n of these random variables.

We can consider an n-tuple of continuous random variables to form a sample space $\Omega = \mathbb{R}^n$, on which a multivariate function $f: \mathbb{R}^n \rightarrow \mathbb{R}$, called the joint PDF, assigns a probability density to each outcome $x \in \mathbb{R}^n$. The joint PDF must be greater than or equal to zero for all $x \in \mathbb{R}^n$, the multiple integral over Ω must be unity, and the multiple integral over a subset of the sample space $A \subset \Omega$ is the probability of the event A.

We can consider an n-tuple of discrete random variables to form a sample space $\mathbb{N}_0^n$, on which a multivariate function $f: \mathbb{N}_0^n \rightarrow \mathbb{R}$, called the joint PMF, assigns a probability to each outcome $x \in \mathbb{N}_0^n$. The joint PMF must be greater than or equal to zero for all $x \in \mathbb{N}_0^n$, the multiple sum over Ω must be unity, and the multiple sum over a subset of the sample space $A \subset \Omega$ is the probability of the event A.

Example statsmultivar-1: bivariate gaussian pdf

Problem Statement: Let's visualize multivariate PDFs by plotting a bivariate gaussian using Matlab's mvnpdf function.

Problem Solution:

mu = [10,20]; % means
Sigma = [1,0;0,2]; % cov; we'll talk about this
x1_a = linspace(...
  mu(1)-5*sqrt(Sigma(1,1)),...
  mu(1)+5*sqrt(Sigma(1,1)),30);
x2_a = linspace(...
  mu(2)-5*sqrt(Sigma(2,2)),...
  mu(2)+5*sqrt(Sigma(2,2)),30);
[X1,X2] = meshgrid(x1_a,x2_a);
f = mvnpdf([X1(:),X2(:)],mu,Sigma);
f = reshape(f,length(x2_a),length(x1_a));

h = figure;
p = surf(x1_a,x2_a,f);
xlabel('$x_1$','interpreter','latex');
ylabel('$x_2$','interpreter','latex');
zlabel('$f(x_1,x_2)$','interpreter','latex');
shading interp;
hgsave(h,'figures/temp');

The result is Figure 04.13. Note how the means and standard deviations affect the distribution.

Figure 04.13: two-variable gaussian PDF

Marginal probability

The marginal PDF of a multivariate PDF is the PDF of some subspace of Ω after one or more variables have been "integrated out," such that a fewer number of random variables remain. Of course, these marginal PDFs must have the same properties of any PDF, such as integrating to unity.

Example statsmultivar-2: bivariate gaussian marginal probability

Problem Statement: Let's demonstrate this by numerically integrating over x_2 in the bivariate Gaussian above.

Problem Solution: Continuing from where we left off, let's integrate.

f1 = trapz(x2_a,f); % trapezoidal integration over x_2

Plotting this yields Figure 04.14.

h = figure;
p = plot(x1_a,f1);
p.LineWidth = 2;
xlabel('$x_1$','interpreter','latex');
ylabel(['$g(x_1)=\int_{-\infty}^{\infty} ',...
  'f(x_1,x_2)\, d x_2$'],'interpreter','latex');
hgsave(h,'figures/temp');

Figure 04.14: marginal Gaussian PDF g(x_1)

We should probably verify that this integrates to one.

disp(['integral over x_1 = ',...
  sprintf('%0.7f',trapz(x1_a,f1))]);

integral over x_1 = 0.9999986

Not bad.

Covariance

Very often, especially in machine learning or artificial intelligence applications, the question about two random variables X and Y is: how do they co-vary? That is, what is their covariance, defined as
$$\operatorname{Cov}[X,Y] \equiv E\left((X - \mu_X)(Y - \mu_Y)\right) = E(XY) - \mu_X\mu_Y.$$

Note that, when X = Y, the covariance is just the variance. When a covariance is large and positive, it is an indication that the random variables are strongly correlated. When it is large and negative, they are strongly anti-correlated. Zero covariance means the variables are uncorrelated. In fact, correlation is defined as
$$\operatorname{Cor}[X,Y] = \frac{\operatorname{Cov}[X,Y]}{\sqrt{\operatorname{Var}[X]\operatorname{Var}[Y]}}.$$
This is essentially the covariance "normalized" to the interval $[-1, 1]$.

Sample covariance

As with the other statistics we've considered, covariance can be estimated from measurement. The estimate, called the sample covariance $q_{XY}$, of random variables X and Y with sample size N is given by
$$q_{XY} = \frac{1}{N-1}\sum_{i=1}^{N}(x_i - \overline{X})(y_i - \overline{Y}).$$
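A from-scratch Python sketch of these estimators, in the same "roll our own" spirit as the anonymous functions earlier (the data here is made up for illustration):

```python
import math

def sample_cov(x, y):
    """Sample covariance q_XY with the 1/(N-1) normalization."""
    n = len(x)
    xbar = sum(x) / n
    ybar = sum(y) / n
    return sum((xi - xbar) * (yi - ybar)
               for xi, yi in zip(x, y)) / (n - 1)

def sample_cor(x, y):
    """Sample correlation: covariance normalized to [-1, 1]."""
    return sample_cov(x, y) / math.sqrt(
        sample_cov(x, x) * sample_cov(y, y))

x = [1.0, 2.0, 3.0, 4.0]
y = [2.1, 3.9, 6.2, 7.8]   # roughly 2*x
print(sample_cov(x, y), sample_cor(x, y))  # correlation near +1
```

Because y is nearly a positive multiple of x, the correlation lands just below 1, as the definition predicts.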

Multivariate covariance

With n random variables X_i, one can compute the covariance of each pair. It is common practice to define an n × n matrix of covariances, called the covariance matrix Σ, such that each pair's covariance
$$\operatorname{Cov}[X_i, X_j] \qquad \text{(multivar.1)}$$
appears in its row-column combination (making it symmetric), as shown below:
$$\Sigma = \begin{bmatrix}
\operatorname{Cov}[X_1,X_1] & \operatorname{Cov}[X_1,X_2] & \cdots & \operatorname{Cov}[X_1,X_n] \\
\operatorname{Cov}[X_2,X_1] & \operatorname{Cov}[X_2,X_2] & \cdots & \operatorname{Cov}[X_2,X_n] \\
\vdots & \vdots & \ddots & \vdots \\
\operatorname{Cov}[X_n,X_1] & \operatorname{Cov}[X_n,X_2] & \cdots & \operatorname{Cov}[X_n,X_n]
\end{bmatrix}$$

The multivariate sample covariance matrix Q is the same as above, but with sample covariances $q_{X_i X_j}$. Both covariance matrices have correlation analogs.

Example statsmultivar-3: car data sample covariance and correlation

Problem Statement: Let's use a built-in multivariate data set that describes different features of cars, listed below.

d = load('carsmall.mat'); % this is a struct

Let's compute the sample covariance and correlation matrices.

Problem Solution:

variables = {'MPG','Cylinders','Displacement',...
  'Horsepower','Weight','Acceleration','Model_Year'};
n = length(variables);
m = length(d.MPG);

data = NaN*ones(m,n); % preallocate
for i = 1:n
  data(:,i) = d.(variables{i});
end

cov_d = nancov(data); % sample covariance
cor_d = corrcov(cov_d); % sample correlation

This is highly correlated/anticorrelated data. Let's plot each variable versus each other variable to see the correlations of each. We use a red-to-blue colormap to contrast anticorrelation and correlation; purple, then, is uncorrelated.

The following builds the red-to-blue colormap:

n_colors = 10;
cmap_rb = NaN*ones(n_colors,3);
for i = 1:n_colors
  a = i/n_colors;
  cmap_rb(i,:) = (1-a)*[1,0,0] + a*[0,0,1];
end

h = figure;
for i = 1:n
  for j = 1:n
    subplot(n,n,sub2ind([n,n],j,i));
    p = plot(d.(variables{i}),d.(variables{j}),...
      '.'); hold on;
    this_color = cmap_rb(...
      round((cor_d(i,j)+1)*(n_colors-1)/2)+1,:);
    p.MarkerFaceColor = this_color;
    p.MarkerEdgeColor = this_color;
  end
end
hgsave(h,'figures/temp');

Figure 04.15: car data correlation

Conditional probability and dependence

Independent variables are uncorrelated. However, uncorrelated variables may or may not be independent. Therefore, we cannot use correlation alone as a test for independence. For instance, for random variables X and Y, where X has some even distribution and Y = X², clearly the variables are dependent. However, they are also uncorrelated (due to symmetry).
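The symmetry argument can be made explicit with a short derivation (not spelled out in the notes above): for X distributed symmetrically about zero, all odd moments vanish, so with Y = X²,

```latex
\operatorname{Cov}[X, Y]
  = E(XY) - \mu_X \mu_Y
  = E(X^3) - E(X)\,E(X^2)
  = 0 - 0 \cdot E(X^2)
  = 0,
```

even though Y is completely determined by X, i.e. maximally dependent on it.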


Example statsmultivar-4: uncorrelated but dependent random variables

Problem Statement: Using a uniform distribution U(−1, 1), show that X and Y are uncorrelated (but dependent), with Y = X², with some sampling. We compute the correlation for different sample sizes.

Problem Solution:

N_a = round(linspace(10,500,100));
qc_a = NaN*ones(size(N_a));
rng(6);
x_a = -1 + 2*rand(max(N_a),1);
y_a = x_a.^2;
for i = 1:length(N_a)
  % should write incremental algorithm, but lazy
  q = cov(x_a(1:N_a(i)),y_a(1:N_a(i)));
  qc = corrcov(q);
  qc_a(i) = qc(2,1); % cross correlation
end

The absolute values of the correlations are

shown in Figure 0416 Note that we should

probably average several such curves to

estimate how the correlation would drop off

with N but the single curve describes our

understanding that the correlation in fact

approaches zero in the large-sample limit

h = figure;
p = plot(N_a,abs(qc_a));
p.LineWidth = 2;
xlabel('sample size $N$','interpreter','latex')
ylabel('absolute sample correlation','interpreter','latex')
hgsave(h,'figures/temp')

Figure 04.16: absolute value of the sample correlation between X ∼ U(−1, 1) and Y = X² for different sample sizes N. In the limit, the population correlation should approach zero, and yet X and Y are not independent.


statsregression Regression

Suppose we have a sample with two

measurands (1) the force F through a spring and

(2) its displacement X (not from equilibrium)

We would like to determine an analytic

function that relates the variables perhaps

for prediction of the force given another

displacement

There is some variation in the measurement. Let's say the following is the sample:

X_a = 1e-3*[10,21,30,41,49,50,61,71,80,92,100]; % m
F_a = [50.1,50.4,53.2,55.9,57.2,59.9,61.0,63.9,67.0,67.9,70.3]; % N

Let's take a look at the data. The result is Figure 04.17.

h = figure;
p = plot(X_a*1e3,F_a,'b.','MarkerSize',15);
xlabel('$X$ (mm)','interpreter','latex')
ylabel('$F$ (N)','interpreter','latex')
xlim([0,max(X_a*1e3)])
grid on
hgsave(h,'figures/temp')

How might we find an analytic function that agrees with the data? Broadly, our strategy will be to assume a general form of a function and use the data to set the parameters in the function such that the difference between the data and the function is minimal.

Figure 04.17: force-displacement data.

Let y be the analytic function that we would

like to fit to the data Let yi denote the

value of y(xi) where xi is the ith value of the

random variable X from the sample Then we

want to minimize the differences between the

force measurements Fi and yi

From calculus recall that we can minimize a

function by differentiating it and solving for

the zero-crossings (which correspond to local

maxima or minima)

First we need such a function to minimize

Perhaps the simplest effective function D is

constructed by squaring and summing the

differences we want to minimize for sample

size N

D(x_i) = ∑_{i=1}^{N} (F_i − y_i)². (regression.1)


(recall that yi = y(xi) which makes D a function

of x)

Now the form of y must be chosen We consider

only mth-order polynomial functions y but

others can be used in a similar manner

y(x) = a_0 + a_1 x + a_2 x² + ⋯ + a_m x^m. (regression.2)

If we treat D as a function of the polynomial

coefficients aj ie

D(a_0, a_1, ⋯, a_m), (regression.3)

and minimize D for each value of xi we must

take the partial derivatives of D with respect

to each aj and set each equal to zero

∂D/∂a_0 = 0, ∂D/∂a_1 = 0, ⋯, ∂D/∂a_m = 0.

This gives us N equations and m + 1 unknowns aj

Writing the system in matrix form

\begin{bmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^m \\
1 & x_2 & x_2^2 & \cdots & x_2^m \\
1 & x_3 & x_3^2 & \cdots & x_3^m \\
\vdots & & & & \vdots \\
1 & x_N & x_N^2 & \cdots & x_N^m
\end{bmatrix}_{A:\; N\times(m+1)}
\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_m \end{bmatrix}_{a:\;(m+1)\times 1}
=
\begin{bmatrix} F_1 \\ F_2 \\ F_3 \\ \vdots \\ F_N \end{bmatrix}_{b:\; N\times 1}
(regression.4)

Typically N > m, and this is an overdetermined system. Therefore, we usually can't solve by taking A⁻¹, because A doesn't have an inverse.

Instead, we either find the Moore-Penrose pseudo-inverse A† and have a = A†b as the solution (which is inefficient), or we can approximate the solution with an algorithm such as that used by Matlab's \ operator. In the latter case,

a_a = A\b_a
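The same overdetermined least-squares solve can be sketched in Python with NumPy, where `np.linalg.lstsq` plays the role of Matlab's \ operator (the sample values are the spring data as best recoverable from the text; the first-order fit is an illustrative choice):

```python
import numpy as np

# spring sample: displacement (m) and force (N)
X = 1e-3 * np.array([10, 21, 30, 41, 49, 50, 61, 71, 80, 92, 100])
F = np.array([50.1, 50.4, 53.2, 55.9, 57.2, 59.9,
              61.0, 63.9, 67.0, 67.9, 70.3])

m = 1                                      # polynomial order
A = np.vander(X, m + 1, increasing=True)   # columns: 1, x, ..., x^m
a, *_ = np.linalg.lstsq(A, F, rcond=None)  # least-squares solution of A a = b
F_fit = A @ a                              # fitted forces at the sample points
```

The returned `a` holds the coefficients a_0, …, a_m; higher orders only change `m`.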

Example statsregression-1 re: regression

Problem Statement: Use Matlab's \ operator to find a good polynomial fit for the sample. There's the sometimes-difficult question: "what order should we fit?" Let's try out several and see what the squared-differences function D gives.

Problem Solution:

N = length(X_a); % sample size
m_a = 2:N; % all the orders up to N
A = NaN*ones(length(m_a),N,max(m_a)+1); % preallocate
for k = 1:length(m_a) % each order
  for j = 1:N % each measurement
    for i = 1:( m_a(k) + 1 ) % each coef
      A(k,j,i) = X_a(j)^(i-1);
    end
  end
end
disp(squeeze(A(2,:,1:5)))

1.0000  0.0100  0.0001  0.0000  NaN
1.0000  0.0210  0.0004  0.0000  NaN
1.0000  0.0300  0.0009  0.0000  NaN
1.0000  0.0410  0.0017  0.0001  NaN
1.0000  0.0490  0.0024  0.0001  NaN
1.0000  0.0500  0.0025  0.0001  NaN
1.0000  0.0610  0.0037  0.0002  NaN
1.0000  0.0710  0.0050  0.0004  NaN
1.0000  0.0800  0.0064  0.0005  NaN
1.0000  0.0920  0.0085  0.0008  NaN
1.0000  0.1000  0.0100  0.0010  NaN

We've printed the first five columns of the third-order matrix, which only has four columns, so NaNs fill in the rest.

Now we can use the \ operator to solve for the coefficients.

a = NaN*ones(length(m_a),max(m_a));
warning('off','all')
for i = 1:length(m_a)
  A_now = squeeze(A(i,:,1:m_a(i)));
  a(i,1:m_a(i)) = (A_now\F_a')'; % least-squares solve via backslash
end
warning('on','all')

n_plot = 100;
x_plot = linspace(min(X_a),max(X_a),n_plot);
y = NaN*ones(n_plot,length(m_a)); % preallocate
for i = 1:length(m_a)
  y(:,i) = polyval(fliplr(a(i,1:m_a(i))),x_plot);
end

h = figure;
for i = 1:2:length(m_a)-1
  p = plot(x_plot*1e3,y(:,i),'linewidth',1.5); hold on
  p.DisplayName = sprintf('order %i',(m_a(i)-1));
end
p = plot(X_a*1e3,F_a,'b.','MarkerSize',15);
xlabel('$X$ (mm)','interpreter','latex')
ylabel('$F$ (N)','interpreter','latex')
p.DisplayName = 'sample';
legend('show','location','southeast')
grid on
hgsave(h,'figures/temp')

Figure 04.18: force-displacement data with curve fits (orders 1, 3, 5, 7, 9).

statsex Exercises for Chapter 04

vecs Vector calculus

A great many physical situations of interest to

engineers can be described by calculus It can

describe how quantities continuously change

over (say) time and gives tools for computing

other quantities We assume familiarity with

the fundamentals of calculus limit series

derivative and integral From these and a

basic grasp of vectors we will outline some

of the highlights of vector calculus Vector

calculus is particularly useful for describing

the physics of for instance the following

mechanics of particles wherein is studied

the motion of particles and the forcing

causes thereof

rigid-body mechanics wherein is studied the

motion rotational and translational and

its forcing causes of bodies considered

rigid (undeformable)

solid mechanics wherein is studied the

motion and deformation and their forcing

causes of continuous solid bodies (those

that retain a specific resting shape)


1. For an introduction to complex analysis, see Kreyszig (Kreyszig, Advanced Engineering Mathematics, Part D).
2. Ibid., Chapters 9, 10.
3. H.M. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus. W.W. Norton, 2005. ISBN 9780393925166.


fluid mechanics wherein is studied the

motion and its forcing causes of fluids

(liquids gases plasmas)

heat transfer wherein is studied the

movement of thermal energy through and

among bodies

electromagnetism wherein is studied

the motion and its forcing causes of

electrically charged particles

This last example was in fact very influential

in the original development of both vector

calculus and complex analysis1 It is not

an exaggeration to say that the topics above

comprise the majority of physical topics of

interest in engineering

A good introduction to vector calculus is

given by Kreyszig2 Perhaps the most famous

and enjoyable treatment is given by Schey3

in the adorably titled Div Grad Curl and All

that

It is important to note that in much of what

follows we will describe (typically the three-

dimensional space of our lived experience) as

a euclidean vector space, an n-dimensional

vector space isomorphic to Rn As we know

from linear algebra any vector v isin Rn can be

expressed in any number of bases That is the

vector v is a basis-free object with multiple

basis representations The components and

basis vectors of a vector change with basis

changes but the vector itself is invariant A

coordinate system is in fact just a basis We

are most familiar of course with Cartesian


coordinates, which is the specific orthonormal basis b for Rⁿ:

b_1 = [1, 0, …, 0]ᵀ, b_2 = [0, 1, …, 0]ᵀ, ⋯, b_n = [0, 0, …, 1]ᵀ. (vecs.1)

4. John M. Lee, Introduction to Smooth Manifolds, second ed. Vol. 218, Graduate Texts in Mathematics. Springer, 2012.
5. Francesco Bullo and Andrew D. Lewis, Geometric Control of Mechanical Systems: Modeling, Analysis, and Design for Simple Mechanical Control Systems. Ed. by J.E. Marsden, L. Sirovich, and M. Golubitsky. Springer, 2005.

Manifolds are spaces that appear locally

as Rn but can be globally rather different

and can describe non-euclidean geometry

wherein euclidean geometryrsquos parallel

postulate is invalid Calculus on manifolds

is the focus of differential geometry a

subset of which we can consider our current

study A motivation for further study of

differential geometry is that it is very

convenient when dealing with advanced

applications of mechanics such as rigid-body

mechanics of robots and vehicles A very nice

mathematical introduction is given by Lee4 and

Bullo and Lewis5 give a compact presentation

in the context of robotics

Vector fields have several important properties of interest we'll explore in this chapter. Our goal is to gain an intuition of these properties and be able to perform basic calculations.

vecsdiv Divergence, surface integrals, and flux

6. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 21-30.
7. Kreyszig, Advanced Engineering Mathematics, § 10.6.

Flux and surface integrals

Consider a surface S. Let r(u, v) = [x(u, v), y(u, v), z(u, v)] be a parametric position vector on a Euclidean vector space R³. Furthermore, let F: R³ → R³ be a vector-valued function of r, and let n be a unit-normal vector on a surface S. Consider the surface integral

∬_S F · n dS, (div.1)

which integrates the normal component of F over the surface. We call this quantity the flux of F

out of the surface S This terminology comes

from fluid flow for which the flux is the

mass flow rate out of S In general the flux

is a measure of a quantity (or field) passing

through a surface For more on computing

surface integrals see Schey6 and Kreyszig7
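As a concrete check of the definition, a flux can be computed symbolically. The following SymPy sketch uses an illustrative choice not from the text: the radial field F(x, y, z) = [x, y, z] through the unit sphere, whose flux equals the sphere's surface area, 4π:

```python
import sympy as sp

u, v = sp.symbols('u v')  # polar and azimuthal parameters
# parameterize the unit sphere S
r = sp.Matrix([sp.sin(u)*sp.cos(v), sp.sin(u)*sp.sin(v), sp.cos(u)])
ru, rv = r.diff(u), r.diff(v)
N = ru.cross(rv)          # outward normal; |N| du dv is the area element dS
F = r                     # F(x, y, z) = [x, y, z] evaluated on the surface
flux = sp.integrate(F.dot(N), (v, 0, 2*sp.pi), (u, 0, sp.pi))
print(sp.simplify(flux))  # 4*pi
```

For this field, F · n = 1 on the sphere, so the flux is just the area, in agreement with the symbolic result.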

Continuity

Consider the flux out of a surface S that encloses a volume ΔV, divided by that volume:

(1/ΔV) ∬_S F · n dS. (div.2)

This gives a measure of flux per unit volume

for a volume of space. Consider its physical meaning: when we interpret this as fluid flow, all fluid that enters the volume is negative flux,

and all that leaves is positive If physical

conditions are such that we expect no fluid to

enter or exit the volume via what is called a

source or a sink and if we assume the density

of the fluid is uniform (this is called an

incompressible fluid) then all the fluid that

enters the volume must exit, and we get

(1/ΔV) ∬_S F · n dS = 0. (div.3)

This is called a continuity equation although

typically this name is given to equations of the

form in the next section This equation is one of

the governing equations in continuum mechanics

Divergence

Let's take the flux-per-volume as the volume ΔV → 0; we obtain the following.

Equation div.4: divergence, integral form

lim_{ΔV→0} (1/ΔV) ∬_S F · n dS

This is called the divergence of F and is

defined at each point in R3 by taking the

volume to zero about it It is given the

shorthand div F

What interpretation can we give this quantity

It is a measure of the vector fieldrsquos flux

outward through a surface containing an

infinitesimal volume When we consider a fluid

a positive divergence is a local decrease in

density and a negative divergence is a density

increase If the fluid is incompressible and has

no sources or sinks we can write the continuity

equation

div F = 0. (div.5)

In the Cartesian basis it can be shown that the

divergence is easily computed from the field

F = Fx i + Fy j + Fz k, (div.6)

as follows

Equation div7 divergence

differential form

div F = ∂x Fx + ∂y Fy + ∂z Fz
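The differential form is straightforward to compute symbolically; here is a sketch with SymPy's vector module (the field is an illustrative choice, not from the text):

```python
import sympy as sp
from sympy.vector import CoordSys3D, divergence

C = CoordSys3D('C')  # Cartesian coordinates C.x, C.y, C.z
F = C.x**2*C.i + C.x*C.y*C.j + C.z*C.k  # an illustrative field
div_F = divergence(F)  # = ∂x(x^2) + ∂y(x y) + ∂z(z) = 3x + 1
```

The result is a scalar function of position, as expected of a divergence.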

Exploring divergence

Divergence is perhaps best explored by considering it for a vector field in R². Such a field F = Fx i + Fy j can be represented as a "quiver" plot. If we overlay the quiver plot over a "color density" plot representing div F, we can increase our intuition about the divergence.

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: div_surface_integrals_flux.ipynb
notebook kernel: python3

First load some Python packages

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and

functions

var('x,y')
F_x = Function('F_x')(x,y)
F_y = Function('F_y')(x,y)

Rather than repeat code, let's write a single function quiver_plotter_2D to make several of these plots.

def quiver_plotter_2D(
    field={F_x: x*y, F_y: x*y},
    grid_width=3,
    grid_decimate_x=8,
    grid_decimate_y=8,
    norm=Normalize(),
    density_operation='div',
    print_density=True,
):
    # define symbolics
    var('x,y')
    F_x = Function('F_x')(x,y)
    F_y = Function('F_y')(x,y)
    field_sub = field
    # compute density
    if density_operation == 'div':
        den = F_x.diff(x) + F_y.diff(y)
    elif density_operation == 'curl':  # in the k direction
        den = F_y.diff(x) - F_x.diff(y)
    else:
        raise ValueError('div and curl are the only density operators')
    den_simp = den.subs(field_sub).doit().simplify()
    if den_simp.is_constant():
        print('Warning: density operator is constant (no density plot)')
    if print_density:
        print(f'The {density_operation} is:')
        display(den_simp)
    # lambdify for numerics
    F_x_sub = F_x.subs(field_sub)
    F_y_sub = F_y.subs(field_sub)
    F_x_fun = lambdify((x,y), F_x.subs(field_sub), 'numpy')
    F_y_fun = lambdify((x,y), F_y.subs(field_sub), 'numpy')
    if F_x_sub.is_constant():
        F_x_fun1 = F_x_fun  # dummy
        F_x_fun = lambda x,y: F_x_fun1(x,y)*np.ones(x.shape)
    if F_y_sub.is_constant():
        F_y_fun1 = F_y_fun  # dummy
        F_y_fun = lambda x,y: F_y_fun1(x,y)*np.ones(x.shape)
    if not den_simp.is_constant():
        den_fun = lambdify((x,y), den_simp, 'numpy')
    # create grid
    w = grid_width
    Y, X = np.mgrid[-w:w:100j, -w:w:100j]
    # evaluate numerically
    F_x_num = F_x_fun(X,Y)
    F_y_num = F_y_fun(X,Y)
    if not den_simp.is_constant():
        den_num = den_fun(X,Y)
    # plot
    p = plt.figure()
    # colormesh
    if not den_simp.is_constant():
        cmap = plt.get_cmap('PiYG')
        im = plt.pcolormesh(X, Y, den_num, cmap=cmap, norm=norm)
        plt.colorbar()
    # quiver
    dx = grid_decimate_y
    dy = grid_decimate_x
    plt.quiver(
        X[::dx,::dy], Y[::dx,::dy],
        F_x_num[::dx,::dy], F_y_num[::dx,::dy],
        units='xy', scale=10,
    )
    plt.title(f'F(x,y) = [{F_x.subs(field_sub)},{F_y.subs(field_sub)}]')
    return p

Note that, while we're at it, we included a hook for density plots of the curl of F; we'll return to this in a later lecture.

Let's inspect several cases.

p = quiver_plotter_2D(
    field={F_x: x**2, F_y: y**2}
)

The div is

2x + 2y

p = quiver_plotter_2D(
    field={F_x: x*y, F_y: x*y}
)

The div is

Figure 05.1: quiver plot of F(x,y) = [x², y²] over a color-density plot of its divergence.

x + y

p = quiver_plotter_2D(
    field={F_x: x**2 + y**2, F_y: x**2 + y**2}
)

The div is

2x + 2y

Figure 05.2: quiver plot of F(x,y) = [x·y, x·y] over its divergence.

Figure 05.3: quiver plot of F(x,y) = [x² + y², x² + y²] over its divergence.

p = quiver_plotter_2D(
    field={F_x: x**2/sqrt(x**2 + y**2), F_y: y**2/sqrt(x**2 + y**2)},
    norm=SymLogNorm(linthresh=3, linscale=3)
)

Figure 05.4: quiver plot of F(x,y) = [x²/√(x² + y²), y²/√(x² + y²)] over its divergence.

The div is

(−x³ − y³ + 2(x + y)(x² + y²)) / (x² + y²)^(3/2)

vecscurl Curl, line integrals, and circulation

8. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 63-74.
9. Kreyszig, Advanced Engineering Mathematics, § 10.1, 10.2.

Line integrals

Consider a curve C in a Euclidean vector space R³. Let r(t) = [x(t), y(t), z(t)] be a parametric representation of C. Furthermore, let F: R³ → R³ be a vector-valued function of r, and let r′(t) be the tangent vector. The line integral is

∫_C F(r(t)) · r′(t) dt, (curl.1)

which integrates F along the curve For more

on computing line integrals see Schey8 and

Kreyszig9

If F is a force being applied to an object

moving along the curve C the line integral is

the work done by the force More generally the

line integral integrates F along the tangent of

C

Circulation

Consider the line integral over a closed curve

C denoted by

∮_C F(r(t)) · r′(t) dt. (curl.2)

We call this quantity the circulation of F

around C

For certain vectorshyvalued functions F the

circulation is zero for every curve In these

cases (static electric fields, for instance) this is sometimes called the law of circulation.


Curl

Consider the division of the circulation around a curve in R³ by the surface area it encloses, ΔS:

(1/ΔS) ∮_C F(r(t)) · r′(t) dt. (curl.3)

In a manner analogous to the operation that gave us the divergence, let's consider shrinking this curve to a point and the surface area to zero:

lim_{ΔS→0} (1/ΔS) ∮_C F(r(t)) · r′(t) dt. (curl.4)

We call this quantity the ldquoscalarrdquo curl of F

at each point in R3 in the direction normal

to ΔS as it shrinks to zero. Taking three (or n for Rⁿ) "scalar" curls in independent normal directions (enough to span the vector space),

we obtain the curl proper which is a vector-

valued function curl R3 rarr R3

The curl is coordinate-independent In

cartesian coordinates it can be shown to be

equivalent to the following

Equation curl.5: curl, differential form, cartesian coordinates

curl F = [∂y Fz − ∂z Fy, ∂z Fx − ∂x Fz, ∂x Fy − ∂y Fx]ᵀ
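The differential form can be checked symbolically; a sketch with SymPy's vector module (the field is an illustrative choice, not from the text):

```python
import sympy as sp
from sympy.vector import CoordSys3D, curl

C = CoordSys3D('C')     # Cartesian coordinates
F = -C.y*C.i + C.x*C.j  # rigid rotation about the z-axis
curl_F = curl(F)        # 2 k: uniform, twice the angular rate
```

The constant result reflects that this field rotates at the same rate everywhere.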

But what does the curl of F represent It

quantifies the local rotation of F about each

point If F represents a fluidrsquos velocity curl F

10. A region is simply connected if every curve in it can shrink to a point without leaving the region. An example of a region that is not simply connected is the surface of a toroid.

is the local rotation of the fluid about each

point and it is called the vorticity

Zero curl circulation and path

independence

Circulation

It can be shown that if the circulation of F on

all curves is zero then in each direction n and

at every point curl F = 0 (ie n middot curl F = 0)

Conversely for curl F = 0 in a simply connected

region10 F has zero circulation

Succinctly, informally, and without the requisite qualifiers above:

zero circulation ⇒ zero curl, (curl.6)
zero curl + simply connected region ⇒ zero circulation. (curl.7)

Path independence

It can be shown that if the path integral of F

on all curves between any two points is path-

independent then in each direction n and

at every point curl F = 0 (ie n middot curl F = 0)

Conversely for curl F = 0 in a simply connected

region all line integrals are independent of

path

Succinctly, informally, and without the requisite qualifiers above:

path independence ⇒ zero curl, (curl.8)
zero curl + simply connected region ⇒ path independence. (curl.9)

Figure 05.5: an implication graph relating zero curl, zero circulation, path independence, and connectedness. Green edges represent implication (a implies b) and black edges represent logical conjunctions.

… and how they relate

It is also true that

path independence ⇔ zero circulation. (curl.10)

So, putting it all together, we get Figure 05.5.

Exploring curl

Curl is perhaps best explored by considering it for a vector field in R². Such a field in cartesian coordinates, F = Fx i + Fy j, has curl

curl F = [∂y(0) − ∂z Fy, ∂z Fx − ∂x(0), ∂x Fy − ∂y Fx]ᵀ
       = [0 − 0, 0 − 0, ∂x Fy − ∂y Fx]ᵀ
       = [0, 0, ∂x Fy − ∂y Fx]ᵀ. (curl.11)

That is, curl F = (∂x Fy − ∂y Fx) k, and the only nonzero component is normal to the xy-plane. If we overlay a quiver plot of F over a "color density" plot representing the k-component of curl F, we can increase our intuition about the curl.

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: curl-and-line-integrals.ipynb
notebook kernel: python3

First load some Python packages

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and

functions

var('x,y')
F_x = Function('F_x')(x,y)
F_y = Function('F_y')(x,y)

We use the same function defined in Lecture vecsdiv, quiver_plotter_2D, to make several of these plots.

Let's inspect several cases.

Figure 05.6: F(x,y) = [0, cos(2πx)] over its curl.

p = quiver_plotter_2D(
    field={F_x: 0, F_y: cos(2*pi*x)},
    density_operation='curl',
    grid_decimate_x=2,
    grid_decimate_y=10,
    grid_width=1
)

The curl is

−2π sin(2πx)

p = quiver_plotter_2D(
    field={F_x: 0, F_y: x**2},
    density_operation='curl',
    grid_decimate_x=2,
    grid_decimate_y=20
)

Figure 05.7: F(x,y) = [0, x²] over its curl.

The curl is

2x

p = quiver_plotter_2D(
    field={F_x: y**2, F_y: x**2},
    density_operation='curl',
    grid_decimate_x=2,
    grid_decimate_y=20
)

Figure 05.8: F(x,y) = [y², x²] over its curl.

The curl is

2x − 2y

p = quiver_plotter_2D(
    field={F_x: -y, F_y: x},
    density_operation='curl',
    grid_decimate_x=6,
    grid_decimate_y=6
)

Warning: density operator is constant (no density plot)

The curl is

2

Figure 05.9: F(x,y) = [−y, x].

vecsgrad Gradient

Gradient

The gradient grad of a scalar-valued function f: R³ → R is a vector field F: R³ → R³; that is, grad f is a vector-valued function on R³. The gradient's local direction and magnitude are those of the local maximum rate of increase of f. This makes it useful in optimization (e.g., in the method of gradient descent).

In classical mechanics quantum mechanics

relativity string theory thermodynamics

and continuum mechanics (and elsewhere)

the principle of least action is taken as

fundamental (Richard P Feynman Robert B

Leighton and Matthew Sands The Feynman

Lectures on Physics New Millennium Perseus

Basic Books 2010) This principle tells us that

naturersquos laws quite frequently seem to be

derivable by assuming a certain quantitymdash

called actionmdashis minimized Considering then

that the gradient supplies us with a tool for

optimizing functions it is unsurprising that the

gradient enters into the equations of motion of

many physical quantities

The gradient is coordinate-independent, but its coordinate-free definitions don't add much to our intuition. In cartesian coordinates, it can be shown to be equivalent to the following.

Equation grad.1: gradient, cartesian coordinates

grad f = [∂x f, ∂y f, ∂z f]ᵀ

11. This is nonstandard terminology, but we're bold.

Vector fields from gradients are special

Although all gradients are vector fields not

all vector fields are gradients That is given

a vector field F it may or may not be equal to

the gradient of any scalar-valued function f.

Let's say of a vector field that is a gradient that it has gradience.11 Those vector fields that are gradients have special properties. Surprisingly, those properties are connected to path independence and curl. It can be shown that iff a field is a gradient, line integrals of the field are path independent. That is, for a vector field,

gradience ⇔ path independence. (grad.2)

Considering what we know from Lecture vecscurl about path independence, we can expand Figure 05.5 to obtain Figure 05.10.

One implication is that gradients have zero

curl Many important fields that describe

physical interactions (eg static electric fields

Newtonian gravitational fields) are gradients

of scalar fields called potentials

Exploring gradient

Gradient is perhaps best explored by considering it for a scalar field on R². Such a field in cartesian coordinates, f(x, y), has gradient

grad f = [∂x f, ∂y f]ᵀ. (grad.3)

That is, grad f = F = ∂x f i + ∂y f j. If we overlay a quiver plot of F over a "color density" plot representing f, we can increase our intuition about the gradient.

Figure 05.10: an implication graph relating gradience, zero curl, zero circulation, path independence, and connectedness. Green edges represent implication (a implies b) and black edges represent logical conjunctions.

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: grad.ipynb
notebook kernel: python3

First load some Python packages

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and

functions

var('x,y')

(x, y)

Rather than repeat code, let's write a single function grad_plotter_2D to make several of these plots.

Let's inspect several cases. While considering the following plots, remember that they all have zero curl.

p = grad_plotter_2D(field=x)

Figure 05.11: f(x,y) = x with its gradient field.

The gradient is

[1, 0]

p = grad_plotter_2D(field=x+y)

The gradient is

[1, 1]

Figure 05.12: f(x,y) = x + y with its gradient field.

p = grad_plotter_2D(field=1)

Warning: field is constant (no plot)
The gradient is

[0, 0]

Gravitational potential

Gravitational potentials have the form of 1/distance. Let's check out the gradient.

Figure 05.13: f(x,y) = 1/√(x² + y²) with its gradient field.

p = grad_plotter_2D(
    field=1/sqrt(x**2 + y**2),
    norm=SymLogNorm(linthresh=3, linscale=3),
    mask=True
)

The gradient is

[−x/(x² + y²)^(3/2), −y/(x² + y²)^(3/2)]

Conic section fields

Gradients of conic section fields can be

explored

Figure 05.14: f(x,y) = x² with its gradient field.

The following is called a parabolic field

p = grad_plotter_2D(field=x**2)

The gradient is

[2x, 0]

The following are called elliptic fields.

Figure 05.15: f(x,y) = x² + y² with its gradient field.

p = grad_plotter_2D(field=x**2 + y**2)

The gradient is

[2x, 2y]

p = grad_plotter_2D(field=-x**2 - y**2)

Figure 05.16: f(x,y) = −x² − y² with its gradient field.

The gradient is

[−2x, −2y]

The following is called a hyperbolic field

p = grad_plotter_2D(field=x**2 - y**2)

The gradient is

Figure 05.17: f(x,y) = x² − y² with its gradient field.

[2x, −2y]

vecsstoked Stokes and divergence theorems

12. A surface is orientable if a consistent normal direction can be defined. Most surfaces are orientable, but some, notably the Möbius strip, cannot be. See Kreyszig (Kreyszig, Advanced Engineering Mathematics, § 10.6) for more.
13. Ibid., § 10.7.
14. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 45-52.
15. Ibid., pp. 49-52.

Two theorems allow us to exchange certain

integrals in R3 for others that are easier to

evaluate

The divergence theorem

The divergence theorem asserts the equality

of the surface integral of a vector field F and

the triple integral of div F over the volume V

enclosed by the surface S in R³. That is,

∬_S F · n dS = ∭_V div F dV. (stoked.1)

Caveats are that V is a closed region bounded

by the orientable12 surface S and that F is

continuous and continuously differentiable

over a region containing V This theorem makes

some intuitive sense we can think of the

divergence inside the volume ldquoaccumulatingrdquo

via the triple integration and equaling the

corresponding surface integral For more on the

divergence theorem see Kreyszig13 and Schey14
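The theorem can be verified symbolically on a simple region. A SymPy sketch with illustrative choices (the field F = [x², y, z] and the unit cube are not from the text):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
F = sp.Matrix([x**2, y, z])  # illustrative field over the unit cube [0,1]^3
div_F = F[0].diff(x) + F[1].diff(y) + F[2].diff(z)  # 2x + 2

# right side: triple integral of div F over the cube
volume_side = sp.integrate(div_F, (x, 0, 1), (y, 0, 1), (z, 0, 1))

# left side: flux through the six faces (outward normals are ±i, ±j, ±k)
surface_side = (
    sp.integrate(F[0].subs(x, 1) - F[0].subs(x, 0), (y, 0, 1), (z, 0, 1))
    + sp.integrate(F[1].subs(y, 1) - F[1].subs(y, 0), (x, 0, 1), (z, 0, 1))
    + sp.integrate(F[2].subs(z, 1) - F[2].subs(z, 0), (x, 0, 1), (y, 0, 1))
)
# both sides equal 3, as the theorem asserts
```

The cube makes the surface integral split cleanly into six flat faces, which is why it is a convenient test region.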

A lovely application of the divergence theorem

is that for any quantity of conserved stuff

(mass, charge, spin, etc.) distributed in a spatial R³ with time-dependent density ρ: R⁴ → R and velocity field v: R⁴ → R³, the divergence theorem can be applied to find that

∂t ρ = − div(ρ v), (stoked.2)

which is a more general form of a continuity

equation one of the governing equations of

many physical phenomena For a derivation of

this equation see Schey15

16. A surface is smooth if its normal is continuous everywhere. It is piecewise smooth if it is composed of a finite number of smooth surfaces.
17. Kreyszig, Advanced Engineering Mathematics, § 10.9.
18. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 93-102.
19. Kreyszig, Advanced Engineering Mathematics, § 10.9.
20. Lee, Introduction to Smooth Manifolds, Ch. 16.

The Kelvin-Stokes' theorem

The Kelvin-Stokes' theorem asserts the equality of the circulation of a vector field F over a closed curve C and the surface integral of curl F over a surface S that has boundary C. That is, for r(t) a parameterization of C and surface normal n,

∮_C F(r(t)) · r′(t) dt = ∬_S n · curl F dS. (stoked.3)

Caveats are that S is piecewise smooth16 its

boundary C is a piecewise smooth simple closed

curve and F is continuous and continuously

differentiable over a region containing S

This theorem is also somewhat intuitive we

can think of the divergence over the surface

ldquoaccumulatingrdquo via the surface integration and

equaling the corresponding circulation For more

on the Kelvin-Stokesrsquo theorem see Kreyszig17

and Schey18
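A quick symbolic check of Equation stoked3, again an illustrative sketch rather than part of the text: take $\mathbf{F} = (-y, x, 0)$, with $S$ the unit disk in the $z = 0$ plane and $C$ its boundary circle.

```python
import sympy as sp

t, r, theta = sp.symbols('t r theta', real=True)

# illustrative case (an assumption): F = (-y, x, 0), boundary curve
# r(t) = (cos t, sin t, 0)
rx, ry = sp.cos(t), sp.sin(t)
Fx, Fy = -ry, rx  # F evaluated on the curve

# left side of (stoked3): circulation of F around C
circ = sp.integrate(Fx*sp.diff(rx, t) + Fy*sp.diff(ry, t), (t, 0, 2*sp.pi))

# right side of (stoked3): curl F = (0, 0, 2) and n = (0, 0, 1), so
# n·curl F = 2; integrate over the unit disk in polar coordinates
surf = sp.integrate(2*r, (r, 0, 1), (theta, 0, 2*sp.pi))
```

Both sides evaluate to $2\pi$.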

Related theorems

Green's theorem is a two-dimensional special case of the Kelvin–Stokes theorem. It is described by Kreyszig.¹⁹

It turns out that all of the above theorems (and the fundamental theorem of calculus, which relates the derivative and integral) are special cases of the generalized Stokes theorem defined by differential geometry. We would need a deeper understanding of differential geometry to understand this theorem; for more, see Lee.²⁰

Exercises for Chapter 05

Exercise 05.1 (6)

The PDE of Example pdeseparation-1 can be used to describe the conduction of heat along a long, thin rod insulated along its length, where $u(t, x)$ represents temperature. The initial and Dirichlet boundary conditions in that example would be interpreted as an initial temperature distribution along the bar and fixed temperatures of the ends. Now consider the same PDE,

$$\partial_t u(t, x) = k\, \partial^2_{xx} u(t, x), \tag{ex1}$$

with real constant $k$, now with Neumann boundary conditions on interval $x \in [0, L]$,

$$\partial_x u|_{x=0} = 0 \quad\text{and}\quad \partial_x u|_{x=L} = 0, \tag{ex2}$$

and with initial condition

$$u(0, x) = f(x), \tag{ex3}$$

where $f$ is some piecewise continuous function on $[0, L]$. This represents the complete insulation of the ends of the rod, such that no heat flows from the ends (or from anywhere else).

a. Assume a product solution, separate variables into $X(x)$ and $T(t)$, and set the separation constant to $-\lambda$.
b. Solve the boundary value problem for its eigenfunctions $X_n$ and eigenvalues $\lambda_n$.
c. Solve for the general solution of the time-variable ODE.
d. Write the product solution and apply the initial condition $f(x)$ by constructing it from a generalized fourier series of the product solution.


e. Let $L = 1$ and $f(x) = 100 - \frac{200}{L}\left|x - \frac{L}{2}\right|$, as shown in Figure 05.18. Compute the solution series components. Plot the sum of the first 50 terms over $x$ and $t$.

Figure 05.18: initial condition for Exercise 05.1 (a triangular distribution over $[0, 1]$, zero at the ends and peaking at 100 at $x = L/2$).

Fourier and orthogonality

Fourier series

1. It's important to note that the symbol $\omega_n$ in this context is not the natural frequency but a frequency indexed by integer $n$.

Fourier series are mathematical series that can represent a periodic signal as a sum of sinusoids at different amplitudes and frequencies. They are useful for solving for the response of a system to periodic inputs. However, they are probably most important conceptually: they are our gateway to thinking of signals in the frequency domain, that is, as functions of frequency (not time). To represent a function as a Fourier series is to analyze it as a sum of sinusoids at different frequencies¹ $\omega_n$ and amplitudes $a_n$. Its frequency spectrum is the functional representation of amplitudes $a_n$ versus frequency $\omega_n$.

Let's begin with the definition.

Definition fourseries.1: Fourier series, trigonometric form

The Fourier analysis of a periodic function $y(t)$ is, for $n \in \mathbb{N}_0$, period $T$, and angular frequency $\omega_n = 2\pi n/T$,

$$a_n = \frac{2}{T}\int_{-T/2}^{T/2} y(t)\cos(\omega_n t)\, dt \tag{series1}$$
$$b_n = \frac{2}{T}\int_{-T/2}^{T/2} y(t)\sin(\omega_n t)\, dt \tag{series2}$$

The Fourier synthesis of a periodic function $y(t)$ with analysis components $a_n$ and $b_n$ corresponding to $\omega_n$ is

$$y(t) = \frac{a_0}{2} + \sum_{n=1}^{\infty}\left[a_n\cos(\omega_n t) + b_n\sin(\omega_n t)\right]. \tag{series3}$$
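The analysis integrals series1 and series2 can also be evaluated numerically. The sketch below is an illustrative assumption (a unit-amplitude odd square wave with period $2\pi$, not a function from the text); it recovers the classic components $b_n = 4/(n\pi)$ for odd $n$, with all other components vanishing.

```python
import numpy as np
from scipy.integrate import quad

T = 2*np.pi  # period (an arbitrary illustrative choice)

def y(t):  # odd square wave of unit amplitude
    return 1.0 if np.mod(t, T) < T/2 else -1.0

def a(n):  # cosine component, per (series1)
    wn = 2*np.pi*n/T
    return 2/T*quad(lambda t: y(t)*np.cos(wn*t), -T/2, T/2, points=[0])[0]

def b(n):  # sine component, per (series2)
    wn = 2*np.pi*n/T
    return 2/T*quad(lambda t: y(t)*np.sin(wn*t), -T/2, T/2, points=[0])[0]
```

The `points=[0]` argument tells the quadrature where the integrand jumps.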

Let's consider the complex form of the Fourier series, which is analogous to Definition fourseries.1. It may be helpful to review Euler's formulas; see Appendix D.01.

Definition fourseries.2: Fourier series, complex form

The Fourier analysis of a periodic function $y(t)$ is, for $n \in \mathbb{N}_0$, period $T$, and angular frequency $\omega_n = 2\pi n/T$,

$$c_{\pm n} = \frac{1}{T}\int_{-T/2}^{T/2} y(t)e^{-j\omega_n t}\, dt. \tag{series4}$$

The Fourier synthesis of a periodic function $y(t)$ with analysis components $c_n$ corresponding to $\omega_n$ is

$$y(t) = \sum_{n=-\infty}^{\infty} c_n e^{j\omega_n t}. \tag{series5}$$

We call the integer $n$ a harmonic, and the frequency associated with it,

$$\omega_n = 2\pi n/T, \tag{series6}$$

the harmonic frequency. There is a special name for the first harmonic ($n = 1$): the fundamental frequency. It is called this because all other frequency components are integer multiples of it.

It is also possible to convert between the two representations above.

Definition fourseries.3: Fourier series, converting between forms

The complex Fourier analysis of a periodic function $y(t)$ is, for $n \in \mathbb{N}_0$ and $a_n$ and $b_n$ as defined above,

$$c_{\pm n} = \frac{1}{2}\left(a_{|n|} \mp j b_{|n|}\right). \tag{series7}$$

The sinusoidal Fourier analysis of a periodic function $y(t)$ is, for $n \in \mathbb{N}_0$ and $c_n$ as defined above,

$$a_n = c_n + c_{-n} \quad\text{and} \tag{series8}$$
$$b_n = j\left(c_n - c_{-n}\right). \tag{series9}$$

The harmonic amplitude $C_n$ is

$$C_n = \sqrt{a_n^2 + b_n^2} \tag{series10}$$
$$= 2\sqrt{c_n c_{-n}}. \tag{series11}$$

A magnitude line spectrum is a graph of the harmonic amplitudes as a function of the harmonic frequencies. The harmonic phase is

$$\theta_n = -\operatorname{arctan2}(b_n, a_n) \quad \text{(see Appendix C.02)}$$
$$= \operatorname{arctan2}(\operatorname{Im}(c_n), \operatorname{Re}(c_n)).$$
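A numeric sanity check of these conversions, using made-up components $a_n = 3$, $b_n = 4$ for a single harmonic (an assumption for illustration, not values from the text):

```python
import numpy as np

a_n, b_n = 3.0, 4.0  # hypothetical trigonometric components

# (series7): complex components from the trigonometric ones
c_pos = (a_n - 1j*b_n)/2  # c_{+n}
c_neg = (a_n + 1j*b_n)/2  # c_{-n}

# (series8) and (series9): recover a_n and b_n
a_back = (c_pos + c_neg).real
b_back = (1j*(c_pos - c_neg)).real

# (series10) and (series11): harmonic amplitude, computed both ways
C_trig = np.sqrt(a_n**2 + b_n**2)
C_cplx = 2*np.sqrt((c_pos*c_neg).real)
```

The round trip returns the original components, and both amplitude formulas agree ($C_n = 5$ here).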

The following illustration demonstrates how sinusoidal components sum to represent a square wave; a line spectrum is also shown. [Figure: sinusoidal components shown in time and in frequency, with amplitude and spectral amplitude axes.]

Let us compute the associated spectral components in the following example.

Example fourseries-1: Fourier series analysis, line spectrum

Problem Statement: Compute the first five harmonic amplitudes that represent the line spectrum for a square wave in the figure above.

Fourier transform

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: fourier_series_to_transform.ipynb
notebook kernel: python3

We begin with the usual loading of modules.

```python
import numpy as np                # for numerics
import sympy as sp                # for symbolics
import matplotlib.pyplot as plt   # for plots
from IPython.display import display, Markdown, Latex
```

Let's consider a periodic function $f$ with period $T$ (T). Each period, the function has a triangular pulse of width $\delta$ (pulse_width) and height $\delta/2$.

```python
period = 15      # period
pulse_width = 2  # pulse width
```

First, we plot the function $f$ in the time domain. Let's begin by defining f.

```python
def pulse_train(t, T, pulse_width):
    f = lambda x: pulse_width/2 - abs(x)  # pulse
    tm = np.mod(t, T)
    if tm <= pulse_width/2:
        return f(tm)
    elif tm >= T - pulse_width/2:
        return f(-(tm - T))
    else:
        return 0
```

Now we develop a numerical array in time to plot f.

```python
N = 201  # number of points to plot
tpp = np.linspace(-period/2, 5*period/2, N)  # time values
fpp = np.array(np.zeros(tpp.shape))
for i, t_now in enumerate(tpp):
    fpp[i] = pulse_train(t_now, period, pulse_width)

axes = plt.figure(1)
plt.plot(tpp, fpp, 'b-', linewidth=2)  # plot
plt.xlabel('time (s)')
plt.xlim([-period/2, 3*period/2])
plt.xticks(
    [0, pulse_width/2, period],
    ['0', r'$\delta/2$', r'$T = ' + str(period) + r'$ s']
)
plt.yticks([0, pulse_width/2], ['0', r'$\delta/2$'])
plt.show()  # display here
```

For $\delta = 2$ and $T \in \{5, 15, 25\}$, the left-hand column of Figure 06.1 shows two triangle pulses for each period $T$.

Consider the following argument. Just as a Fourier series is a frequency domain representation of a periodic signal, a Fourier transform is a frequency domain representation of an aperiodic signal (we will rigorously define it in a moment). The Fourier series components will have an analog, then, in the Fourier transform. Recall that they can be computed by integrating over a period of the signal. If we increase that period infinitely, the function is effectively aperiodic. The result (within a scaling factor) will be the Fourier transform analog of the Fourier series components.

Let us approach this understanding by actually computing the Fourier series components for increasing period $T$. We'll use sympy to compute the Fourier series cosine and sine components $a_n$ and $b_n$ for component $n$ (n) and period $T$ (T).

```python
sp.var('x, a_0, a_n, b_n', real=True)
sp.var('delta, T', positive=True)
sp.var('n', nonnegative=True)
a0 = (2/T*sp.integrate(
    (delta/2 - sp.Abs(x)), (x, -delta/2, delta/2)  # otherwise zero
)).simplify()
an = sp.integrate(
    2/T*(delta/2 - sp.Abs(x))*sp.cos(2*sp.pi*n/T*x),
    (x, -delta/2, delta/2)  # otherwise zero
).simplify()
bn = (2/T*sp.integrate(
    (delta/2 - sp.Abs(x))*sp.sin(2*sp.pi*n/T*x),
    (x, -delta/2, delta/2)  # otherwise zero
)).simplify()
display(sp.Eq(a_n, an), sp.Eq(b_n, bn))
```

$$a_n = \begin{cases} \dfrac{T\left(1 - \cos\left(\frac{\pi\delta n}{T}\right)\right)}{\pi^2 n^2} & \text{for } n \neq 0 \\[1ex] \dfrac{\delta^2}{2T} & \text{otherwise} \end{cases} \qquad b_n = 0$$

Furthermore, let us compute the harmonic amplitude (f_harmonic_amplitude)

$$C_n = \sqrt{a_n^2 + b_n^2}, \tag{trans1}$$

which we have also scaled by a factor $T/\delta$ in order to plot it with a convenient scale.

```python
sp.var('C_n', positive=True)
cn = sp.sqrt(an**2 + bn**2)
display(sp.Eq(C_n, cn))
```

$$C_n = \begin{cases} \dfrac{T\left|\cos\left(\frac{\pi\delta n}{T}\right) - 1\right|}{\pi^2 n^2} & \text{for } n \neq 0 \\[1ex] \dfrac{\delta^2}{2T} & \text{otherwise} \end{cases}$$

Now we lambdify the symbolic expression for a numpy function.

```python
cn_f = sp.lambdify((n, T, delta), cn)
```

Now we can plot.

```python
omega_max = 12  # rad/s, max frequency in line spectrum
n_max = round(omega_max*period/(2*np.pi))  # max harmonic
n_a = np.linspace(0, n_max, n_max + 1)
omega = 2*np.pi*n_a/period
axes = plt.figure(2)
markerline, stemlines, baseline = plt.stem(
    omega, period/pulse_width*cn_f(n_a, period, pulse_width),
    linefmt='b-', markerfmt='bo', basefmt='r-',
    use_line_collection=True
)
plt.xlabel(r'frequency $\omega$ (rad/s)')
plt.xlim([0, omega_max])
plt.ylim([0, pulse_width/2])
plt.yticks([0, pulse_width/2], ['0', r'$\delta/2$'])
plt.show()  # show here
```

Figure 06.1: triangle pulse trains (left column) with longer periods descending, and their corresponding line spectra (right column), scaled for convenient comparison.

The line spectra are shown in the right-hand column of Figure 06.1. Note that, with our chosen scaling, as $T$ increases the line spectra reveal a distinct waveform.

Let $F$ be the continuous function of angular frequency $\omega$,

$$F(\omega) = \frac{\delta}{2} \cdot \frac{\sin^2(\omega\delta/4)}{(\omega\delta/4)^2}. \tag{trans2}$$

Figure 06.2: $F(\omega)$, our mysterious Fourier series amplitude analog.

First, we plot it.

```python
F = lambda w: pulse_width/2 * \
    np.sin(w*pulse_width/(2*2))**2/(w*pulse_width/(2*2))**2

N = 201  # number of points to plot
wpp = np.linspace(0.0001, omega_max, N)
Fpp = []
for i in range(0, N):
    Fpp.append(F(wpp[i]))  # build array of function values
axes = plt.figure(3)
plt.plot(wpp, Fpp, 'b-', linewidth=2)  # plot
plt.xlim([0, omega_max])
plt.yticks([0, pulse_width/2], ['0', r'$\delta/2$'])
plt.xlabel(r'frequency $\omega$ (rad/s)')
plt.ylabel(r'$F(\omega)$')
plt.show()
```

Let's consider the plot in Figure 06.2 of $F$. It's obviously the function emerging in Figure 06.1 from increasing the period of our pulse train.

Now we are ready to define the Fourier transform and its inverse.

Definition fourtrans.1: Fourier transforms, trigonometric form

Fourier transform (analysis):

$$A(\omega) = \int_{-\infty}^{\infty} y(t)\cos(\omega t)\, dt \tag{trans3}$$
$$B(\omega) = \int_{-\infty}^{\infty} y(t)\sin(\omega t)\, dt \tag{trans4}$$

Inverse Fourier transform (synthesis):

$$y(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} A(\omega)\cos(\omega t)\, d\omega + \frac{1}{2\pi}\int_{-\infty}^{\infty} B(\omega)\sin(\omega t)\, d\omega \tag{trans5}$$

Definition fourtrans.2: Fourier transforms, complex form

Fourier transform $\mathcal{F}$ (analysis):

$$\mathcal{F}(y(t)) = Y(\omega) = \int_{-\infty}^{\infty} y(t)e^{-j\omega t}\, dt \tag{trans6}$$

Inverse Fourier transform $\mathcal{F}^{-1}$ (synthesis):

$$\mathcal{F}^{-1}(Y(\omega)) = y(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} Y(\omega)e^{j\omega t}\, d\omega \tag{trans7}$$

So now we have defined the Fourier transform. There are many applications, including solving differential equations and frequency domain representations, called spectra, of time domain functions.

There is a striking similarity between the Fourier transform and the Laplace transform, with which you are already acquainted. In fact, the Fourier transform is a special case of a Laplace transform, with Laplace transform variable $s = j\omega$ instead of having some real component. Both transforms convert differential equations to algebraic equations, which can be solved and inversely transformed to find time-domain solutions. The Laplace transform is especially important to use when an input function to a differential equation is not absolutely integrable and the Fourier transform is undefined (for example, our definition will yield a transform for neither the unit step nor the unit ramp functions). The Laplace transform is also preferred for initial value problems, due to its convenient way of handling them. The two transforms are equally useful for solving steady state problems. Although the Laplace transform has many advantages, for spectral considerations the Fourier transform is the only game in town.

A table of Fourier transforms and their properties can be found in Appendix B.02.

Example fourtrans-1: a Fourier transform

Problem Statement: Consider the aperiodic signal $y(t) = u_s(t)e^{-at}$, with $u_s$ the unit step function and $a > 0$. The signal is plotted below. Derive the complex frequency spectrum and plot its magnitude and phase.

[Figure: $y(t)$, zero for $t < 0$, decaying from 1 at $t = 0$ toward 0, plotted over $t \in [-2, 5]$.]

Problem Solution: The signal is aperiodic, so the Fourier transform can be computed from Equation trans6:

$$\begin{aligned}
Y(\omega) &= \int_{-\infty}^{\infty} y(t)e^{-j\omega t}\, dt \\
&= \int_{-\infty}^{\infty} u_s(t)e^{-at}e^{-j\omega t}\, dt && \text{(def. of } y\text{)} \\
&= \int_{0}^{\infty} e^{-at}e^{-j\omega t}\, dt && (u_s \text{ effect}) \\
&= \int_{0}^{\infty} e^{(-a-j\omega)t}\, dt && \text{(multiply)} \\
&= \frac{1}{-a-j\omega} e^{(-a-j\omega)t}\bigg|_0^{\infty} && \text{(antiderivative)} \\
&= \frac{1}{-a-j\omega}\left(\lim_{t\to\infty} e^{(-a-j\omega)t} - e^0\right) && \text{(evaluate)} \\
&= \frac{1}{-a-j\omega}\left(\lim_{t\to\infty} e^{-at}e^{-j\omega t} - 1\right) && \text{(arrange)} \\
&= \frac{1}{-a-j\omega}\left((0)(\text{bounded complex factor}) - 1\right) && \text{(limit)} \\
&= \frac{-1}{-a-j\omega} && \text{(consequence)} \\
&= \frac{1}{a+j\omega} \\
&= \frac{a-j\omega}{a-j\omega}\cdot\frac{1}{a+j\omega} && \text{(rationalize)} \\
&= \frac{a-j\omega}{a^2+\omega^2}.
\end{aligned}$$

The magnitude and phase of this complex function are straightforward to compute:

$$\begin{aligned}
|Y(\omega)| &= \sqrt{\operatorname{Re}(Y(\omega))^2 + \operatorname{Im}(Y(\omega))^2} \\
&= \frac{1}{a^2+\omega^2}\sqrt{a^2+\omega^2} \\
&= \frac{1}{\sqrt{a^2+\omega^2}}, \\
\angle Y(\omega) &= -\arctan(\omega/a).
\end{aligned}$$

Now we can plot these functions of $\omega$. Setting $a = 1$ (arbitrarily), we obtain the plots of $|Y(\omega)|$ and $\angle Y(\omega)$ below, over $\omega \in [-10, 10]$.

[Figure: $|Y(\omega)|$ peaking at 1 at $\omega = 0$, and $\angle Y(\omega)$ sweeping from near $\pi/2$ to near $-\pi/2$.]
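The derived magnitude and phase can be checked numerically against the transform itself; a minimal sketch, with $a = 1$ as in the plots:

```python
import numpy as np

a = 1.0  # arbitrary positive constant, as in the example
omega = np.linspace(-10, 10, 401)
Y = 1/(a + 1j*omega)  # the derived transform Y(ω)

# closed-form magnitude and phase from the example
mag = np.abs(Y)      # should equal 1/sqrt(a² + ω²)
phase = np.angle(Y)  # should equal -arctan(ω/a)
```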

Generalized fourier series and orthogonality

2. A function $f$ is square-integrable if $\int_{-\infty}^{\infty} |f(x)|^2\, dx < \infty$.
3. This definition of the inner product can be extended to functions on $\mathbb{R}^2$ and $\mathbb{R}^3$ domains using double- and triple-integration. See Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, p. 261.

Let $f : \mathbb{R} \to \mathbb{C}$, $g : \mathbb{R} \to \mathbb{C}$, and $w : \mathbb{R} \to \mathbb{C}$ be complex functions. For square-integrable² $f$, $g$, and $w$, the inner product of $f$ and $g$ with weight function $w$ over the interval $[a, b] \subseteq \mathbb{R}$ is³

$$\langle f, g \rangle_w = \int_a^b f(x)\overline{g(x)}w(x)\, dx, \tag{general1}$$

where $\overline{g}$ denotes the complex conjugate of $g$. The inner product of functions can be considered analogous to the inner (or dot) product of vectors.

The fourier series components can be found by a special property of the sin and cos functions called orthogonality. In general, functions $f$ and $g$ from above are orthogonal over the interval $[a, b]$ iff

$$\langle f, g \rangle_w = 0 \tag{general2}$$

for weight function $w$. Similar to how a set of orthogonal vectors can be a basis for a vector space, a set of orthogonal functions can be a basis for a function space: a vector space of functions from one set to another (with certain caveats).

In addition to some sets of sinusoids, there are several other important sets of functions that are orthogonal. For instance, sets of legendre polynomials (Erwin Kreyszig, Advanced Engineering Mathematics, 10th ed., John Wiley & Sons, 2011, § 5.2) and bessel functions (Kreyszig, Advanced Engineering Mathematics, § 5.4) are orthogonal.

As with sinusoids, the orthogonality of some sets of functions allows us to compute their series components. Let functions $f_0, f_1, \cdots$ be orthogonal with respect to weight function $w$ on interval $[a, b]$, and let $\alpha_0, \alpha_1, \cdots$ be real constants. A generalized fourier series is (ibid., § 11.6)

$$f(x) = \sum_{m=0}^{\infty} \alpha_m f_m(x) \tag{general3}$$

and represents a function $f$ as a convergent series. It can be shown that the Fourier components $\alpha_m$ can be computed from

$$\alpha_m = \frac{\langle f, f_m \rangle_w}{\langle f_m, f_m \rangle_w}. \tag{general4}$$

In keeping with our previous terminology for fourier series, Equation general3 and Equation general4 are called general fourier synthesis and analysis, respectively.

For the aforementioned legendre and bessel functions, the generalized fourier series are called fourier-legendre and fourier-bessel series (ibid., § 11.6). These and the standard fourier series (Lecture fourseries) are of particular interest for the solution of partial differential equations (Chapter 07).
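As an illustrative sketch (the function $|x|$ is an assumption, not an example from the text), Equation general4 computes fourier-legendre components directly, since legendre polynomials are orthogonal on $[-1, 1]$ with weight $w(x) = 1$:

```python
import sympy as sp

x = sp.symbols('x', real=True)
f = sp.Abs(x)  # hypothetical function to expand on [-1, 1]

def inner(u, v):  # inner product (general1) with w = 1, real functions
    return sp.integrate(u*v, (x, -1, 1))

# components via the generalized fourier analysis (general4)
alphas = [sp.simplify(inner(f, sp.legendre(m, x))
                      / inner(sp.legendre(m, x), sp.legendre(m, x)))
          for m in range(4)]
```

For $|x|$, this yields $\alpha_0 = 1/2$, $\alpha_2 = 5/8$, and zero for the odd components, matching the known fourier-legendre expansion of $|x|$.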

Exercises for Chapter 06

3. Python code in this section was generated from a Jupyter notebook named random_signal_fft.ipynb with a python3 kernel.

Exercise 06.1 (secrets)

This exercise encodes a "secret word" into a sampled waveform for decoding via a discrete fourier transform (DFT). The nominal goal of the exercise is to decode the secret word. Along the way, plotting and interpreting the DFT will be important.

First, load relevant packages.

```python
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex
```

We define two functions: letter_to_number, to convert a letter into an integer index of the alphabet (a becomes 1, b becomes 2, etc.), and string_to_number_list, to convert a string to a list of ints, as shown in the example at the end.

```python
def letter_to_number(letter):
    return ord(letter) - 96

def string_to_number_list(string):
    out = []  # list
    for i in range(0, len(string)):
        out.append(letter_to_number(string[i]))
    return out  # list

print(f"aces = {string_to_number_list('aces')}")
```

aces = [1, 3, 5, 19]

Now we encode a code string code into a signal by beginning with "white noise," which is broadband (appears throughout the spectrum), and adding to it sin functions with amplitudes corresponding to the letter assignments of the code and harmonics corresponding to the position of each letter in the string. For instance, the string bad would be represented by noise plus the signal

$$2\sin 2\pi t + 1\sin 4\pi t + 4\sin 6\pi t. \tag{ex1}$$

Let's set this up for secret word chupcabra.

```python
N = 2000
Tm = 30
T = float(Tm)/float(N)
fs = 1/T
x = np.linspace(0, Tm, N)
noise = 4*np.random.normal(0, 1, N)
code = 'chupcabra'  # the secret word
code_number_array = np.array(string_to_number_list(code))
y = np.array(noise)
for i in range(0, len(code)):
    y = y + code_number_array[i]*np.sin(2*np.pi*(i+1)*x)
```

For proper decoding later, it is important to know the fundamental frequency of the generated data.

```python
print(f"fundamental frequency = {fs} Hz")
```

fundamental frequency = 66.66666666666667 Hz

Now we plot.

```python
fig, ax = plt.subplots()
plt.plot(x, y)
plt.xlim([0, Tm/4])
plt.xlabel('time (s)')
plt.ylabel('$y_n$')
plt.show()
```

Figure 06.3: the chupacabra signal.

Finally, we can save our data to a numpy file secrets.npy to distribute our message.

```python
np.save('secrets', y)
```

Now, I have done this (for a different secret word) and saved the data; download it here:

ricopic.one/mathematical_foundations/source/secrets.npy

In order to load the npy file into Python, we can use the following command.

```python
secret_array = np.load('secrets.npy')
```

Your job is to (a) perform a DFT, (b) plot the spectrum, and (c) decode the message. Here are a few hints:

1. Use from scipy import fft to do the DFT.
2. Use a hanning window to minimize the end-effects; see numpy.hanning, for instance. The fft call might then look like 2*fft(np.hanning(N)*secret_array)/N, where N = len(secret_array).
3. Use only the positive spectrum; you can lop off the negative side and double the positive side.

Exercise 06.2 (seesaw)

Consider the periodic function $f : \mathbb{R} \to \mathbb{R}$ with period $T$, defined for one period as

$$f(t) = at \quad\text{for } t \in (-T/2, T/2], \tag{ex2}$$

where $a, T \in \mathbb{R}$. Perform a fourier analysis on $f$. Letting $a = 5$ and $T = 1$, plot $f$ along with the partial sum of the fourier synthesis, the first 50 nonzero components, over $t \in [-T, T]$.

Exercise 06.3 (9)

Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined as

$$f(t) = \begin{cases} a - a|t|/T & \text{for } t \in [-T, T] \\ 0 & \text{otherwise,} \end{cases} \tag{ex3}$$

where $a, T \in \mathbb{R}$. Perform a fourier analysis on $f$, resulting in $F(\omega)$. Plot $F$ for various $a$ and $T$.

Exercise 06.4 (10)

Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined as

$$f(t) = ae^{-b|t-T|}, \tag{ex4}$$

where $a, b, T \in \mathbb{R}$. Perform a fourier analysis on $f$, resulting in $F(\omega)$. Plot $F$ for various $a$, $b$, and $T$.

Exercise 06.5 (11)

Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined as

$$f(t) = a\cos\omega_0 t + b\sin\omega_0 t, \tag{ex5}$$

where $a, b, \omega_0 \in \mathbb{R}$ are constants. Perform a fourier analysis on $f$, resulting in $F(\omega)$.

Exercise 06.6 (12)

Derive a fourier transform property for expressions including function $f : \mathbb{R} \to \mathbb{R}$, for

$$f(t)\cos(\omega_0 t + \psi),$$

where $\omega_0, \psi \in \mathbb{R}$.

Exercise 06.7 (13)

Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined as

$$f(t) = au_s(t)e^{-bt}\cos(\omega_0 t + \psi), \tag{ex6}$$

where $a, b, \omega_0, \psi \in \mathbb{R}$ and $u_s(t)$ is the unit step function. Perform a fourier analysis on $f$, resulting in $F(\omega)$. Plot $F$ for various $a$, $b$, $\omega_0$, and $\psi$.

Exercise 06.8 (14)

Consider the function $f : \mathbb{R} \to \mathbb{R}$ defined as

$$f(t) = g(t)\cos(\omega_0 t), \tag{ex7}$$

where $\omega_0 \in \mathbb{R}$ and $g : \mathbb{R} \to \mathbb{R}$ will be defined in each part below. Perform a fourier analysis on $f$ for each $g$ below, for $\omega_1 \in \mathbb{R}$ a constant, and consider how things change if $\omega_1 \to \omega_0$.

a. $g(t) = \cos(\omega_1 t)$
b. $g(t) = \sin(\omega_1 t)$

Exercise 06.9 (savage)

An instrument called a "lock-in amplifier" can measure a sinusoidal signal $A\cos(\omega_0 t + \psi) = a\cos(\omega_0 t) + b\sin(\omega_0 t)$ at a known frequency $\omega_0$ with exceptional accuracy, even in the presence of significant noise $N(t)$. The workings of these devices can be described in two operations. First are the following operations on the signal with its noise, $f_1(t) = a\cos(\omega_0 t) + b\sin(\omega_0 t) + N(t)$:

$$f_2(t) = f_1(t)\cos(\omega_1 t) \quad\text{and}\quad f_3(t) = f_1(t)\sin(\omega_1 t), \tag{ex8}$$

where $\omega_0, \omega_1, a, b \in \mathbb{R}$. Note the relation of this operation to the Fourier analysis of Exercise 06.8. The key is to know $\omega_0$ with some accuracy, such that the instrument can set $\omega_1 \approx \omega_0$. The second operation on the signal is an aggressive low-pass filter. The filtered $f_2$ and $f_3$ are called the in-phase and quadrature components of the signal and are typically given a complex representation:

(in-phase) + j (quadrature).

Explain, with fourier analyses on $f_2$ and $f_3$,

a. what $F_2 = \mathcal{F}(f_2)$ looks like,
b. what $F_3 = \mathcal{F}(f_3)$ looks like,
c. why we want $\omega_1 \approx \omega_0$,
d. why a low-pass filter is desirable, and
e. what the time-domain signal will look like.

Exercise 06.10 (strawman)

Consider again the lock-in amplifier explored in Exercise 06.9. Investigate the lock-in amplifier numerically with the following steps.

a. Generate a noisy sinusoidal signal at some frequency $\omega_0$. Include enough broadband white noise that the signal is invisible in a time-domain plot.
b. Generate $f_2$ and $f_3$ as described in Exercise 06.9.
c. Apply a time-domain discrete low-pass filter to each, $f_2 \mapsto \phi_2$ and $f_3 \mapsto \phi_3$, such as scipy's scipy.signal.sosfiltfilt, to complete the lock-in amplifier operation. Plot the results in time and as a complex (polar) plot.
d. Perform a discrete fourier transform on each, $f_2 \mapsto F_2$ and $f_3 \mapsto F_3$. Plot the spectra.
e. Construct a frequency domain low-pass filter $F$ and apply it (multiply) to each, $F_2 \mapsto F_2'$ and $F_3 \mapsto F_3'$. Plot the filtered spectra.
f. Perform an inverse discrete fourier transform on each, $F_2' \mapsto f_2'$ and $F_3' \mapsto f_3'$. Plot the results in time and as a complex (polar) plot.
g. Compare the two methods used, i.e., time-domain filtering versus frequency-domain filtering.

Partial differential equations

An ordinary differential equation is one with (ordinary) derivatives of functions of a single variable each: time, in many applications. These typically describe quantities in some sort of lumped-parameter way: mass as a "point particle," a spring's force as a function of time-varying displacement across it, a resistor's current as a function of time-varying voltage across it. Given the simplicity of such models in comparison to the wildness of nature, it is quite surprising how well they work for a great many phenomena. For instance, electronics, rigid body mechanics, population dynamics, bulk fluid mechanics, and bulk heat transfer can be lumped-parameter modeled.

However, as we saw, there are many phenomena for which we require more detailed models. These include

• detailed fluid mechanics,
• detailed heat transfer,
• solid mechanics,
• electromagnetism, and
• quantum mechanics.

1. There are some analytic techniques for gaining insight into PDEs for which there are no known solutions, such as considering the phase space. This is an active area of research; for more, see Bove, Colombini, and Del Santo (Antonio Bove, F. Colombini, and Daniele Del Santo, Phase Space Analysis of Partial Differential Equations, Progress in Nonlinear Differential Equations and Their Applications v. 69, Birkhäuser, 2006, ISBN 9780817645212).

In many cases, what is required to account for is the time-varying spatial distribution of a quantity. In fluid mechanics, we treat a fluid as having quantities, such as density and velocity, that vary continuously over space and time. Deriving the governing equations for such phenomena typically involves vector calculus: we observed that statements about quantities like the divergence (e.g., continuity) can be made about certain scalar and vector fields. Such statements are governing equations (e.g., the continuity equation), and they are partial differential equations (PDEs) because the quantities of interest, called dependent variables (e.g., density and velocity), are both temporally and spatially varying (temporal and spatial variables are therefore called independent variables).

In this chapter, we explore the analytic solution of PDEs. This is related to, but distinct from, the numeric solution (i.e., simulation) of PDEs, which is another important topic. Many PDEs have no known analytic solution, so numeric solution is the best available option.¹ However, it is important to note that the insight one can gain from an analytic solution is often much greater than that from a numeric solution. This is easily understood when one considers that a numeric solution is an approximation for a specific set

of initial and boundary conditions. Typically, very little can be said of what would happen in general, although this is often what we seek to know. So, despite the importance of numeric solution, one should always prefer an analytic solution.

Three good texts on PDEs for further study are Kreyszig², Strauss³, and Haberman⁴.

2. Kreyszig, Advanced Engineering Mathematics, Ch. 12.
3. W.A. Strauss, Partial Differential Equations: An Introduction, Wiley, 2007, ISBN 9780470054567. A thorough and yet relatively compact introduction.
4. R. Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), Pearson Modern Classics for Advanced Mathematics, Pearson Education Canada, 2018, ISBN 9780134995434.

Classifying PDEs

PDEs often have an infinite number of solutions; however, when applying them to physical systems, we usually assume that a deterministic, or at least a probabilistic, sequence of events will occur. Therefore, we impose additional constraints on a PDE, usually in the form of

1. initial conditions: values of the dependent variables over all space at an initial time, and
2. boundary conditions: values of the dependent variables (or their derivatives) over all time.

Ideally, imposing such conditions leaves us with a well-posed problem, which has three aspects (Bove, Colombini, and Del Santo, Phase Space Analysis of Partial Differential Equations, § 1.5):

existence: there exists at least one solution,
uniqueness: there exists at most one solution, and
stability: if the PDE, boundary conditions, or initial conditions are changed slightly, the solution changes only slightly.

As with ODEs, PDEs can be linear or nonlinear; that is, the dependent variables and their derivatives can appear in only linear combinations (linear PDE) or in one or more nonlinear combinations (nonlinear PDE). As with ODEs, there are more known analytic solutions to linear PDEs than nonlinear PDEs.

The order of a PDE is the order of its highest partial derivative. A great many physical models can be described by second-order PDEs, or systems thereof. Let $u$ be a dependent scalar variable, a function of $m$ temporal and spatial variables $x_i$. A second-order linear PDE has the form, for coefficients $\alpha, \gamma, \delta$ real functions of $x_i$ (Strauss, Partial Differential Equations: An Introduction, § 1.6),

$$\underbrace{\sum_{i=1}^{n}\sum_{j=1}^{n} \alpha_{ij}\, \partial^2_{x_i x_j} u}_{\text{second-order terms}} + \underbrace{\sum_{k=1}^{m}\left(\gamma_k\, \partial_{x_k} u + \delta_k u\right)}_{\text{first- and zeroth-order terms}} = \underbrace{f(x_1, \cdots, x_n)}_{\text{forcing}}, \tag{class1}$$

where $f$ is called a forcing function. When $f$ is zero, Equation class1 is called homogeneous.

We can consider the coefficients $\alpha_{ij}$ to be components of a matrix $A$ with rows indexed by $i$ and columns indexed by $j$. There are four prominent classes defined by the eigenvalues of $A$:

*elliptic:* the eigenvalues all have the same sign;
*parabolic:* the eigenvalues have the same sign except one that is zero;
*hyperbolic:* exactly one eigenvalue has the opposite sign of the others; and
*ultrahyperbolic:* at least two eigenvalues of each sign.

The first three of these have received extensive treatment. They are named after conic sections due to the similarity the equations have with polynomials when derivatives are considered analogous to powers of polynomial variables. For instance, here is a case of each of the first three classes:

$$\partial^2_{xx} u + \partial^2_{yy} u = 0 \qquad \text{(elliptic)}$$
$$\partial^2_{xx} u - \partial^2_{yy} u = 0 \qquad \text{(hyperbolic)}$$
$$\partial^2_{xx} u - \partial_t u = 0 \qquad \text{(parabolic)}$$
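The classification above reduces to counting eigenvalue signs of $A$, which is easy to automate. Below is a minimal sketch (the function name `classify` is our own, not from the text) that classifies the coefficient matrix of each of the three examples:

```python
import numpy as np

def classify(A):
    """Classify a second-order linear PDE from the eigenvalue signs of
    its (symmetric) coefficient matrix A."""
    ev = np.linalg.eigvalsh(np.asarray(A, dtype=float))
    pos = np.sum(ev > 0)
    neg = np.sum(ev < 0)
    zero = np.sum(np.isclose(ev, 0))
    n = len(ev)
    if zero == 0 and (pos == n or neg == n):
        return "elliptic"        # all eigenvalues share one sign
    if zero == 1 and (pos == n - 1 or neg == n - 1):
        return "parabolic"       # one zero, the rest share a sign
    if zero == 0 and (pos == 1 or neg == 1):
        return "hyperbolic"      # exactly one opposite-sign eigenvalue
    return "ultrahyperbolic"     # at least two of each sign

# Laplace equation u_xx + u_yy = 0: A = diag(1, 1)
print(classify([[1, 0], [0, 1]]))   # elliptic
# wave-like u_xx - u_yy = 0: A = diag(1, -1)
print(classify([[1, 0], [0, -1]]))  # hyperbolic
# heat equation u_xx - u_t = 0: only one second-order term
print(classify([[1, 0], [0, 0]]))   # parabolic
```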

When $A$ depends on the $x_i$, the PDE may have multiple classes across its domain. In general, this equation and its associated initial and boundary conditions do not comprise a well-posed problem; however, several special cases have been shown to be well-posed. Thus far, the most general statement of existence and uniqueness is the cauchy-kowalevski theorem for cauchy problems.

pde.sturm  Sturm-liouville problems

Before we introduce an important solution method for PDEs in Lecture pde.separation, we consider an ordinary differential equation that will arise in that method when dealing with a single spatial dimension $x$: the sturm-liouville (S-L) differential equation. Let $p, q, \sigma$ be functions of $x$ on open interval $(a, b)$. Let $X$ be the dependent variable and $\lambda$ a constant. The regular S-L problem is the S-L ODE⁵

$$\frac{d}{dx}\left(p X'\right) + q X + \lambda \sigma X = 0 \qquad \text{(sturm.1)}$$

with boundary conditions

$$\beta_1 X(a) + \beta_2 X'(a) = 0 \qquad \text{(sturm.2a)}$$
$$\beta_3 X(b) + \beta_4 X'(b) = 0 \qquad \text{(sturm.2b)}$$

with coefficients $\beta_i \in \mathbb{R}$. This is a type of boundary value problem.

5. For the S-L problem to be regular, it has the additional constraints that $p, q, \sigma$ are continuous and $p, \sigma > 0$ on $[a, b]$. This is also sometimes called the sturm-liouville eigenvalue problem. See Haberman (*Applied Partial Differential Equations with Fourier Series and Boundary Value Problems* (Classic Version), § 5.3) for the more general (non-regular) S-L problem and Haberman (ibid., § 7.4) for the multidimensional analog.
6. These eigenvalues are closely related to, but distinct from, the "eigenvalues" that arise in systems of linear ODEs.
7. Ibid., § 5.3.

This problem has nontrivial solutions, called eigenfunctions $X_n(x)$, with $n \in \mathbb{Z}_+$, corresponding to specific values of $\lambda = \lambda_n$ called eigenvalues.⁶ There are several important theorems proven about this (see Haberman⁷). Of greatest interest to us are that

1. there exist an infinite number of eigenfunctions $X_n$ (unique within a multiplicative constant);
2. there exists a unique corresponding real eigenvalue $\lambda_n$ for each eigenfunction $X_n$;
3. the eigenvalues can be ordered as $\lambda_1 < \lambda_2 < \cdots$;
4. eigenfunction $X_n$ has $n - 1$ zeros on open interval $(a, b)$; and
5. the eigenfunctions $X_n$ form an orthogonal basis with respect to weighting function $\sigma$, such that any piecewise continuous function $f : [a, b] \to \mathbb{R}$ can be represented by a generalized fourier series on $[a, b]$.

This last theorem will be of particular interest in Lecture pde.separation.
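The orthogonality in the fifth theorem can be checked numerically. A minimal sketch, assuming the concrete eigenfunctions $X_n(x) = \sin(n\pi x)$ on $[0, 1]$ with weight $\sigma = 1$ (these are derived in Example pde.sturm-1 below):

```python
import numpy as np

# Numerically verify orthogonality of X_n(x) = sin(n pi x) on [0, 1]:
# the integral of X_n X_m over [0, 1] should vanish for n != m.
x = np.linspace(0, 1, 2001)
for n in range(1, 4):
    for m in range(1, 4):
        inner = np.trapz(np.sin(n*np.pi*x)*np.sin(m*np.pi*x), x)
        print(n, m, round(inner, 6))
# off-diagonal inner products vanish; diagonal entries are 1/2
```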

Types of boundary conditions

Boundary conditions of the sturm-liouville kind (sturm.2) have four sub-types:

*dirichlet:* for just $\beta_2, \beta_4 = 0$;
*neumann:* for just $\beta_1, \beta_3 = 0$;
*robin:* for all $\beta_i \neq 0$; and
*mixed:* if $\beta_1 = 0$, $\beta_3 \neq 0$; if $\beta_2 = 0$, $\beta_4 \neq 0$.

There are many problems that are not regular sturm-liouville problems. For instance, the right-hand sides of Equation sturm.2 are zero, making them homogeneous boundary conditions; however, these can also be nonzero. Another case is periodic boundary conditions:

$$X(a) = X(b) \qquad \text{(sturm.3a)}$$
$$X'(a) = X'(b) \qquad \text{(sturm.3b)}$$

Example pde.sturm-1: a sturm-liouville problem with dirichlet boundary conditions

Problem Statement

Consider the differential equation

$$X'' + \lambda X = 0 \qquad \text{(sturm.4)}$$

with dirichlet boundary conditions on the boundary of the interval $[0, L]$:

$$X(0) = 0 \quad\text{and}\quad X(L) = 0. \qquad \text{(sturm.5)}$$

Solve for the eigenvalues and eigenfunctions.

Problem Solution

This is a sturm-liouville problem, so we know the eigenvalues are real. The well-known general solution to the ODE is

$$X(x) = \begin{cases} k_1 + k_2 x & \lambda = 0 \\ k_1 e^{j\sqrt{\lambda}x} + k_2 e^{-j\sqrt{\lambda}x} & \text{otherwise} \end{cases} \qquad \text{(sturm.6)}$$

with real constants $k_1, k_2$. The solution must also satisfy the boundary conditions. Let's apply them to the case of $\lambda = 0$ first:

$$X(0) = 0 \Rightarrow k_1 + k_2(0) = 0 \Rightarrow k_1 = 0 \qquad \text{(sturm.7)}$$
$$X(L) = 0 \Rightarrow k_1 + k_2 L = 0 \Rightarrow k_2 = -k_1/L. \qquad \text{(sturm.8)}$$

Together these imply $k_1 = k_2 = 0$, which gives the trivial solution $X(x) = 0$, in which we aren't interested. We say, then, that for nontrivial solutions $\lambda \neq 0$. Now let's check $\lambda < 0$. The solution becomes

$$X(x) = k_1 e^{-\sqrt{|\lambda|}x} + k_2 e^{\sqrt{|\lambda|}x} \qquad \text{(sturm.9)}$$
$$= k_3 \cosh\left(\sqrt{|\lambda|}x\right) + k_4 \sinh\left(\sqrt{|\lambda|}x\right) \qquad \text{(sturm.10)}$$

where $k_3$ and $k_4$ are real constants. Again applying the boundary conditions,

$$X(0) = 0 \Rightarrow k_3 \cosh(0) + k_4 \sinh(0) = 0 \Rightarrow k_3 = 0$$
$$X(L) = 0 \Rightarrow 0\cosh\left(\sqrt{|\lambda|}L\right) + k_4 \sinh\left(\sqrt{|\lambda|}L\right) = 0 \Rightarrow k_4 \sinh\left(\sqrt{|\lambda|}L\right) = 0.$$

However, $\sinh\left(\sqrt{|\lambda|}L\right) \neq 0$ for $L > 0$, so $k_4 = k_3 = 0$; again, the trivial solution.

Now let's try $\lambda > 0$. The solution can be written

$$X(x) = k_5 \cos\left(\sqrt{\lambda}x\right) + k_6 \sin\left(\sqrt{\lambda}x\right). \qquad \text{(sturm.11)}$$

Applying the boundary conditions for this case,

$$X(0) = 0 \Rightarrow k_5 \cos(0) + k_6 \sin(0) = 0 \Rightarrow k_5 = 0$$
$$X(L) = 0 \Rightarrow 0\cos\left(\sqrt{\lambda}L\right) + k_6 \sin\left(\sqrt{\lambda}L\right) = 0 \Rightarrow k_6 \sin\left(\sqrt{\lambda}L\right) = 0.$$

Now, $\sin\left(\sqrt{\lambda}L\right) = 0$ for

$$\sqrt{\lambda}L = n\pi \Rightarrow \lambda = \left(\frac{n\pi}{L}\right)^2 \quad (n \in \mathbb{Z}_+).$$

Therefore, the only nontrivial solutions that satisfy both the ODE and the boundary conditions are the eigenfunctions

$$X_n(x) = \sin\left(\sqrt{\lambda_n}\,x\right) \qquad \text{(sturm.12a)}$$
$$= \sin\left(\frac{n\pi}{L}x\right) \qquad \text{(sturm.12b)}$$

with corresponding eigenvalues

$$\lambda_n = \left(\frac{n\pi}{L}\right)^2. \qquad \text{(sturm.13)}$$

Note that because $\lambda > 0$, $\lambda_1$ is the lowest eigenvalue.

Plotting the eigenfunctions

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: eigenfunctions_example_plot.ipynb
notebook kernel: python3

First, load some Python packages:

```python
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex
```

Set $L = 1$ and compute values for the first four eigenvalues `lambda_n` and eigenfunctions `X_n`:

```python
L = 1
x = np.linspace(0, L, 100)
n = np.linspace(1, 4, 4, dtype=int)
lambda_n = (n*np.pi/L)**2
X_n = np.zeros([len(n), len(x)])
for i, n_i in enumerate(n):
    X_n[i] = np.sin(np.sqrt(lambda_n[i])*x)
```

Plot the eigenfunctions:

```python
for i, n_i in enumerate(n):
    plt.plot(
        x, X_n[i], linewidth=2, label='n = '+str(n_i)
    )
plt.legend()
plt.show()  # display the plot
```

(Plot: the first four eigenfunctions $X_n$ on $[0, 1]$, labeled $n = 1$ through $n = 4$.)

We see that the fourth of the S-L theorems appears true: $n - 1$ zeros of $X_n$ exist on the open interval $(0, 1)$.

pde.separation  PDE solution by separation of variables

We are now ready to learn one of the most important techniques for solving PDEs: separation of variables. It applies only to linear PDEs, since it requires the principle of superposition. Not all linear PDEs yield to this solution technique, but several important ones do.

The technique includes the following steps.

*Assume a product solution.* Assume the solution can be written as a product solution $u_p$: the product of functions of each independent variable.

*Separate the PDE.* Substitute $u_p$ into the PDE and rearrange such that at least one side of the equation has functions of a single independent variable. If this is possible, the PDE is called separable.

*Set equal to a constant.* Each side of the equation depends on different independent variables; therefore, they must each equal the same constant, often called $-\lambda$.

*Repeat separation as needed.* If there are more than two independent variables, there will be an ODE in the separated variable and a PDE (with one fewer variable) in the other independent variables. Attempt to separate the PDE until only ODEs remain.

*Solve each boundary value problem.* Solve each boundary value problem ODE, ignoring the initial conditions for now.

*Solve the time variable ODE.* Solve for the general solution of the time variable ODE, sans initial conditions.

*Construct the product solution.* Multiply the solution in each variable to construct the product solution $u_p$. If the boundary value problems were sturm-liouville, the product solution is a family of eigenfunctions from which any function can be constructed via a generalized fourier series.

*Apply the initial condition.* The product solutions individually usually do not meet the initial condition. However, a generalized fourier series of them nearly always does. Superposition tells us a linear combination of solutions to the PDE and boundary conditions is also a solution; the unique series that also satisfies the initial condition is the unique solution to the entire problem.

Example pde.separation-1: 1D diffusion equation

Problem Statement

Consider the one-dimensional diffusion equation PDEᵃ

$$\partial_t u(t, x) = k\,\partial^2_{xx} u(t, x) \qquad \text{(separation.1)}$$

with real constant $k$, with dirichlet boundary conditions on interval $x \in [0, L]$,

$$u(t, 0) = 0 \qquad \text{(separation.2a)}$$
$$u(t, L) = 0 \qquad \text{(separation.2b)}$$

and with initial condition

$$u(0, x) = f(x) \qquad \text{(separation.3)}$$

where $f$ is some piecewise continuous function on $[0, L]$.

a. For more on the diffusion or heat equation, see Haberman (*Applied Partial Differential Equations with Fourier Series and Boundary Value Problems* (Classic Version), § 2.3), Kreyszig (*Advanced Engineering Mathematics*, § 12.5), and Strauss (*Partial Differential Equations: An Introduction*, § 2.3).

Problem Solution

Assume a product solution

First, we assume a product solution of the form $u_p(t, x) = T(t)X(x)$, where $T$ and $X$ are unknown functions on $t > 0$ and $x \in [0, L]$.

Separate the PDE

Second, we substitute the product solution into Equation separation.1 and separate variables:

$$T'X = kTX'' \Rightarrow \qquad \text{(separation.4)}$$
$$\frac{T'}{kT} = \frac{X''}{X}. \qquad \text{(separation.5)}$$

So it is separable. Note that we chose to group $k$ with $T$, which was arbitrary but conventional.

Set equal to a constant

Since these two sides depend on different independent variables ($t$ and $x$), they must equal the same constant, which we call $-\lambda$, so we have two ODEs:

$$\frac{T'}{kT} = -\lambda \Rightarrow T' + \lambda k T = 0 \qquad \text{(separation.6)}$$
$$\frac{X''}{X} = -\lambda \Rightarrow X'' + \lambda X = 0. \qquad \text{(separation.7)}$$

Solve the boundary value problem

The latter of these equations, with the boundary conditions (separation.2), is precisely the same sturm-liouville boundary value problem from Example pde.sturm-1, which had eigenfunctions

$$X_n(x) = \sin\left(\sqrt{\lambda_n}\,x\right) \qquad \text{(separation.8a)}$$
$$= \sin\left(\frac{n\pi}{L}x\right) \qquad \text{(separation.8b)}$$

with corresponding (positive) eigenvalues

$$\lambda_n = \left(\frac{n\pi}{L}\right)^2. \qquad \text{(separation.9)}$$

Solve the time variable ODE

The time variable ODE is homogeneous and has the familiar general solution

$$T(t) = c\,e^{-k\lambda t} \qquad \text{(separation.10)}$$

with real constant $c$. However, the boundary value problem restricted values of $\lambda$ to $\lambda_n$, so

$$T_n(t) = c\,e^{-k(n\pi/L)^2 t}. \qquad \text{(separation.11)}$$

Construct the product solution

The product solution is

$$u_p(t, x) = T_n(t)X_n(x) = c\,e^{-k(n\pi/L)^2 t} \sin\left(\frac{n\pi}{L}x\right). \qquad \text{(separation.12)}$$

This is a family of solutions that each satisfy only exotically specific initial conditions.

Apply the initial condition

The initial condition is $u(0, x) = f(x)$. The eigenfunctions of the boundary value problem form a fourier series that satisfies the initial condition on the interval $[0, L]$ if we extend $f$ to be periodic and odd over $x$ (Kreyszig, *Advanced Engineering Mathematics*, p. 550); we call the extension $f^*$. The odd series synthesis can be written

$$f^*(x) = \sum_{n=1}^{\infty} b_n \sin\left(\frac{n\pi}{L}x\right) \qquad \text{(separation.13)}$$

where the fourier analysis gives

$$b_n = \frac{2}{L}\int_0^L f^*(\chi)\sin\left(\frac{n\pi}{L}\chi\right)d\chi. \qquad \text{(separation.14)}$$

So the complete solution is

$$u(t, x) = \sum_{n=1}^{\infty} b_n e^{-k(n\pi/L)^2 t} \sin\left(\frac{n\pi}{L}x\right). \qquad \text{(separation.15)}$$

Notice this satisfies the PDE, the boundary conditions, and the initial condition.

Plotting solutions

If we want to plot solutions, we need to specify an initial condition $u(0, x) = f^*(x)$ over $[0, L]$. We can choose anything piecewise continuous, but for simplicity let's let

$$f(x) = 1 \quad (x \in [0, L]).$$

The odd periodic extension is an odd square wave. The integral (separation.14) gives

$$b_n = \frac{2}{n\pi}\left(1 - \cos(n\pi)\right) = \begin{cases} 0 & n \text{ even} \\ \dfrac{4}{n\pi} & n \text{ odd.} \end{cases} \qquad \text{(separation.16)}$$

Now we can write the solution as

$$u(t, x) = \sum_{n=1,\ n \text{ odd}}^{\infty} \frac{4}{n\pi} e^{-k(n\pi/L)^2 t} \sin\left(\frac{n\pi}{L}x\right). \qquad \text{(separation.17)}$$
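The fourier coefficients for $f(x) = 1$ can be checked numerically before plotting. A minimal sketch using numerical quadrature, with $L = 1$ assumed:

```python
import numpy as np

# Fourier sine coefficients of f(x) = 1 on [0, L]:
# b_n = (2/L) * integral of sin(n pi x / L) over [0, L],
# which should be 4/(n pi) for odd n and 0 for even n.
L = 1
x = np.linspace(0, L, 4001)
for n in range(1, 6):
    b_n = (2/L)*np.trapz(np.sin(n*np.pi*x/L), x)
    expected = 4/(n*np.pi) if n % 2 else 0.0
    print(n, round(b_n, 4), round(expected, 4))
```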

Plotting in Python

First, load some Python packages:

```python
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex
```

Set $k = L = 1$ and sum values for the first $N$ terms of the solution:

```python
L = 1
k = 1
N = 100
x = np.linspace(0, L, 300)
t = np.linspace(0, 2*(L/np.pi)**2, 100)
u_n = np.zeros([len(t), len(x)])
for n in range(N):
    n = n+1  # because index starts at 0
    if n % 2 == 0:  # even
        pass  # already initialized to zeros
    else:  # odd
        u_n += 4/(n*np.pi)*np.outer(
            np.exp(-k*(n*np.pi/L)**2*t),
            np.sin(n*np.pi/L*x)
        )
```

Let's first plot the initial condition:

```python
p = plt.figure()
plt.plot(x, u_n[0])
plt.xlabel('space $x$')
plt.ylabel('$u(0,x)$')
plt.show()
```

(Plot: the initial condition $u(0, x)$ versus space $x$; the partial sum approximates $f(x) = 1$.)

Now we plot the entire response:

```python
p = plt.figure()
plt.contourf(t, x, u_n.T)
c = plt.colorbar()
c.set_label('$u(t,x)$')
plt.xlabel('time $t$')
plt.ylabel('space $x$')
plt.show()
```

(Plot: a contour plot of $u(t, x)$ over time $t$ and space $x$.)

We see the diffusive action proceeds as we expected.

Python code in this section was generated from a Jupyter notebook named pde_separation_example_01.ipynb with a python3 kernel.

pde.wave  The 1D wave equation

The one-dimensional wave equation is the linear PDE

$$\partial^2_{tt} u(t, x) = c^2\,\partial^2_{xx} u(t, x) \qquad \text{(wave.1)}$$

with real constant $c$. This equation models such phenomena as strings, fluids, sound, and light. It is subject to initial and boundary conditions and can be extended to multiple spatial dimensions. For 2D and 3D examples in rectangular and polar coordinates, see Kreyszig⁸ and Haberman.⁹

8. Kreyszig, *Advanced Engineering Mathematics*, § 12.9, 12.10.
9. Haberman, *Applied Partial Differential Equations with Fourier Series and Boundary Value Problems* (Classic Version), § 4.5, 7.3.

Example pde.wave-1: vibrating string, PDE solution by separation of variables

Problem Statement

Consider the one-dimensional wave equation PDE

$$\partial^2_{tt} u(t, x) = c^2\,\partial^2_{xx} u(t, x) \qquad \text{(wave.2)}$$

with real constant $c$ and with dirichlet boundary conditions on interval $x \in [0, L]$,

$$u(t, 0) = 0 \quad\text{and}\quad u(t, L) = 0 \qquad \text{(wave.3)}$$

and with initial conditions (we need two because of the second time-derivative)

$$u(0, x) = f(x) \quad\text{and}\quad \partial_t u(0, x) = g(x) \qquad \text{(wave.4)}$$

where $f$ and $g$ are some piecewise continuous functions on $[0, L]$.

Problem Solution

Assume a product solution

First, we assume a product solution of the form $u_p(t, x) = T(t)X(x)$, where $T$ and $X$ are unknown functions on $t > 0$ and $x \in [0, L]$.

Separate the PDE

Second, we substitute the product solution into Equation wave.2 and separate variables:

$$T''X = c^2 TX'' \Rightarrow \qquad \text{(wave.5)}$$
$$\frac{T''}{c^2 T} = \frac{X''}{X}. \qquad \text{(wave.6)}$$

So it is separable. Note that we chose to group $c$ with $T$, which was arbitrary but conventional.

Set equal to a constant

Since these two sides depend on different independent variables ($t$ and $x$), they must equal the same constant, which we call $-\lambda$, so we have two ODEs:

$$\frac{T''}{c^2 T} = -\lambda \Rightarrow T'' + \lambda c^2 T = 0 \qquad \text{(wave.7)}$$
$$\frac{X''}{X} = -\lambda \Rightarrow X'' + \lambda X = 0. \qquad \text{(wave.8)}$$

Solve the boundary value problem

The latter of these equations, with the boundary conditions (wave.3), is precisely the same sturm-liouville boundary value problem from Example pde.sturm-1, which had eigenfunctions

$$X_n(x) = \sin\left(\sqrt{\lambda_n}\,x\right) \qquad \text{(wave.9a)}$$
$$= \sin\left(\frac{n\pi}{L}x\right) \qquad \text{(wave.9b)}$$

with corresponding (positive) eigenvalues

$$\lambda_n = \left(\frac{n\pi}{L}\right)^2. \qquad \text{(wave.10)}$$

Solve the time variable ODE

The time variable ODE is homogeneous and, with $\lambda$ restricted to the positive reals by the boundary value problem, has the familiar general solution

$$T(t) = k_1 \cos\left(c\sqrt{\lambda}\,t\right) + k_2 \sin\left(c\sqrt{\lambda}\,t\right) \qquad \text{(wave.11)}$$

with real constants $k_1$ and $k_2$. However, the boundary value problem restricted values of $\lambda$ to $\lambda_n$, so

$$T_n(t) = k_1 \cos\left(\frac{cn\pi}{L}t\right) + k_2 \sin\left(\frac{cn\pi}{L}t\right). \qquad \text{(wave.12)}$$

Construct the product solution

The product solution is

$$u_p(t, x) = T_n(t)X_n(x) = k_1 \sin\left(\frac{n\pi}{L}x\right)\cos\left(\frac{cn\pi}{L}t\right) + k_2 \sin\left(\frac{n\pi}{L}x\right)\sin\left(\frac{cn\pi}{L}t\right).$$

This is a family of solutions that each satisfy only exotically specific initial conditions.

Apply the initial conditions

Recall that superposition tells us that any linear combination of the product solution is also a solution. Therefore,

$$u(t, x) = \sum_{n=1}^{\infty} a_n \sin\left(\frac{n\pi}{L}x\right)\cos\left(\frac{cn\pi}{L}t\right) + b_n \sin\left(\frac{n\pi}{L}x\right)\sin\left(\frac{cn\pi}{L}t\right) \qquad \text{(wave.13)}$$

is a solution. If $a_n$ and $b_n$ are properly selected to satisfy the initial conditions, Equation wave.13 will be the solution to the entire problem. Substituting $t = 0$ into our potential solution gives

$$u(0, x) = \sum_{n=1}^{\infty} a_n \sin\left(\frac{n\pi}{L}x\right) \qquad \text{(wave.14a)}$$
$$\partial_t u(t, x)\big|_{t=0} = \sum_{n=1}^{\infty} b_n \frac{cn\pi}{L} \sin\left(\frac{n\pi}{L}x\right). \qquad \text{(wave.14b)}$$

Let us extend $f$ and $g$ to be periodic and odd over $x$; we call the extensions $f^*$ and $g^*$. From Equation wave.14, the initial conditions are satisfied if

$$f^*(x) = \sum_{n=1}^{\infty} a_n \sin\left(\frac{n\pi}{L}x\right) \qquad \text{(wave.15a)}$$
$$g^*(x) = \sum_{n=1}^{\infty} b_n \frac{cn\pi}{L} \sin\left(\frac{n\pi}{L}x\right). \qquad \text{(wave.15b)}$$

We identify these as two odd fourier syntheses. The corresponding fourier analyses are

$$a_n = \frac{2}{L}\int_0^L f^*(\chi)\sin\left(\frac{n\pi}{L}\chi\right)d\chi \qquad \text{(wave.16a)}$$
$$b_n \frac{cn\pi}{L} = \frac{2}{L}\int_0^L g^*(\chi)\sin\left(\frac{n\pi}{L}\chi\right)d\chi. \qquad \text{(wave.16b)}$$

So the complete solution is Equation wave.13 with components given by Equation wave.16. Notice this satisfies the PDE, the boundary conditions, and the initial conditions.

Discussion

It can be shown that this series solution is equivalent to two traveling waves that are interfering (see Habermanᵃ and Kreyszigᵇ). This is convenient because computing the series solution exactly requires an infinite summation. We show in the following section that the approximation by partial summation is still quite good.

Choosing specific initial conditions

If we want to plot solutions, we need to specify initial conditions over $[0, L]$. Let's model a string being suddenly struck from rest as

$$f(x) = 0 \qquad \text{(wave.17)}$$
$$g(x) = \delta(x - \Delta L) \qquad \text{(wave.18)}$$

where $\delta$ is the dirac delta distribution and $\Delta \in [0, 1]$ is a fraction of $L$ representing the location of the string being struck. The odd periodic extension is an odd pulse train. The integrals of (wave.16) give

$$a_n = 0 \qquad \text{(wave.19a)}$$
$$b_n = \frac{2}{cn\pi}\int_0^L \delta(\chi - \Delta L)\sin\left(\frac{n\pi}{L}\chi\right)d\chi = \frac{2}{cn\pi}\sin(n\pi\Delta) \quad \text{(sifting property).}$$

Now we can write the solution as

$$u(t, x) = \sum_{n=1}^{\infty} \frac{2}{cn\pi}\sin(n\pi\Delta)\sin\left(\frac{n\pi}{L}x\right)\sin\left(\frac{cn\pi}{L}t\right). \qquad \text{(wave.20)}$$

Plotting in Python

First, load some Python packages:

```python
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex
```

Set $c = L = 1$ and sum values for the first $N$ terms of the solution for some striking location $\Delta$:

```python
Delta = 0.1  # 0 <= Delta <= 1
L = 1
c = 1
N = 150
t = np.linspace(0, 30*(L/np.pi)**2, 100)
x = np.linspace(0, L, 150)
t_b, x_b = np.meshgrid(t, x)
u_n = np.zeros([len(x), len(t)])
for n in range(N):
    n = n+1  # because index starts at 0
    u_n += 2/(c*n*np.pi) * \
        np.sin(n*np.pi*Delta) * \
        np.sin(c*n*np.pi/L*t_b) * \
        np.sin(n*np.pi/L*x_b)
```

Let's first plot some early snapshots of the response:

```python
import seaborn as sns
n_snaps = 7
sns.set_palette(sns.diverging_palette(
    240, 10, n=n_snaps, center='dark'
))
fig, ax = plt.subplots()
it = np.linspace(2, 77, n_snaps, dtype=int)
for i in range(len(it)):
    ax.plot(x, u_n[:, it[i]], label=f't = {t[it[i]]:.3f}')
lgd = ax.legend(
    bbox_to_anchor=(1.05, 1),
    loc='upper left'
)
plt.xlabel('space $x$')
plt.ylabel('$u(t,x)$')
plt.show()
```

(Plot: snapshots of $u(t, x)$ versus space $x$ at seven early times.)

Now we plot the entire response:

```python
fig, ax = plt.subplots()
p = ax.contourf(t, x, u_n)
c = fig.colorbar(p, ax=ax)
c.set_label('$u(t,x)$')
plt.xlabel('time $t$')
plt.ylabel('space $x$')
plt.show()
```

(Plot: a contour plot of $u(t, x)$ over time $t$ and space $x$.)

We see a wave develop and travel, reflecting and inverting off each boundary.

a. Haberman, *Applied Partial Differential Equations with Fourier Series and Boundary Value Problems* (Classic Version), § 4.4.
b. Kreyszig, *Advanced Engineering Mathematics*, § 12.2.
c. Python code in this section was generated from a Jupyter notebook named pde_separation_example_02.ipynb with a python3 kernel.

pde.ex  Exercises for Chapter 07

Exercise 07.1 (poltergeist)

The PDE of Example pde.separation-1 can be used to describe the conduction of heat along a long, thin rod insulated along its length, where $u(t, x)$ represents temperature. The initial and dirichlet boundary conditions in that example would be interpreted as an initial temperature distribution along the bar and fixed temperatures of the ends. Now consider the same PDE,

$$\partial_t u(t, x) = k\,\partial^2_{xx} u(t, x) \qquad \text{(ex.1)}$$

with real constant $k$, now with neumann boundary conditions on interval $x \in [0, L]$,

$$\partial_x u\big|_{x=0} = 0 \quad\text{and}\quad \partial_x u\big|_{x=L} = 0 \qquad \text{(ex.2)}$$

and with initial condition

$$u(0, x) = f(x) \qquad \text{(ex.3)}$$

where $f$ is some piecewise continuous function on $[0, L]$. This represents the complete insulation of the ends of the rod, such that no heat flows from the ends (or from anywhere else).

a. Assume a product solution, separate variables into $X(x)$ and $T(t)$, and set the separation constant to $-\lambda$.
b. Solve the boundary value problem for its eigenfunctions $X_n$ and eigenvalues $\lambda_n$.
c. Solve for the general solution of the time variable ODE.
d. Write the product solution and apply the initial condition $f(x)$ by constructing it from a generalized fourier series of the product solution.
e. Let $L = k = 1$ and $f(x) = 100 - \frac{200}{L}\left|x - \frac{L}{2}\right|$, as shown in Figure 07.1. Compute the solution series components. Plot the sum of the first 50 terms over $x$ and $t$.

(Figure 07.1: initial condition for Exercise 07.1, a triangular profile that is 0 at the ends and peaks at 100 at $x = L/2$.)

Exercise 07.2 (kathmandu)

Consider the free vibration of a uniform and relatively thin beam, with modulus of elasticity $E$, second moment of cross-sectional area $I$, and mass-per-length $\mu$, pinned at each end. The PDE describing this is a version of the euler-bernoulli beam equation for vertical motion $u$:

$$\partial^2_{tt} u(t, x) = -\alpha^2\,\partial^4_{xxxx} u(t, x) \qquad \text{(ex.4)}$$

with real constant $\alpha$ defined as

$$\alpha^2 = \frac{EI}{\mu}. \qquad \text{(ex.5)}$$

Pinned supports fix vertical motion, such that we have boundary conditions on interval $x \in [0, L]$

$$u(t, 0) = 0 \quad\text{and}\quad u(t, L) = 0. \qquad \text{(ex.6a)}$$

Additionally, pinned supports cannot provide a moment, so

$$\partial^2_{xx} u\big|_{x=0} = 0 \quad\text{and}\quad \partial^2_{xx} u\big|_{x=L} = 0. \qquad \text{(ex.6b)}$$

Furthermore, consider the initial conditions

$$u(0, x) = f(x) \quad\text{and}\quad \partial_t u\big|_{t=0} = 0 \qquad \text{(ex.7)}$$

where $f$ is some piecewise continuous function on $[0, L]$.

a. Assume a product solution, separate variables into $X(x)$ and $T(t)$, and set the separation constant to $-\lambda$.
b. Solve the boundary value problem for its eigenfunctions $X_n$ and eigenvalues $\lambda_n$. Assume real $\lambda > 0$ (it's true but tedious to show).
c. Solve for the general solution of the time variable ODE.
d. Write the product solution and apply the initial conditions by constructing it from a generalized fourier series of the product solution.
e. Let $L = \alpha = 1$ and $f(x) = \sin(10\pi x/L)$, as shown in Figure 07.2. Compute the solution series components. Plot the sum of the first 50 terms over $x$ and $t$.

(Figure 07.2: initial condition for Exercise 07.2, $\sin(10\pi x/L)$ on $[0, 1]$.)

opt  Optimization

opt.grad  Gradient descent

Consider a multivariate function $f : \mathbb{R}^n \to \mathbb{R}$ that represents some cost or value. This is called an objective function, and we often want to find an $X \in \mathbb{R}^n$ that yields $f$'s extremum: minimum or maximum, depending on whichever is desirable.

It is important to note, however, that some functions have no finite extremum. Other functions have multiple. Finding a global extremum is generally difficult; however, many good methods exist for finding a local extremum: an extremum for some region $R \subset \mathbb{R}^n$. The method explored here is called gradient descent. It will soon become apparent why it has this name.

Stationary points

Recall from basic calculus that a function $f$ of a single variable has potential local extrema where $df(x)/dx = 0$. The multivariate version of this for multivariate function $f$ is

$$\nabla f = 0. \qquad \text{(grad.1)}$$

A value $X$ for which Equation grad.1 holds is called a stationary point. However, as in the univariate case, a stationary point may not be a local extremum; in these cases it is called a saddle point.

Consider the hessian matrix $H$ with values, for independent variables $x_i$,

$$H_{ij} = \partial^2_{x_i x_j} f. \qquad \text{(grad.2)}$$

For a stationary point $X$, the second partial derivative test tells us if it is a local maximum, local minimum, or saddle point:

*minimum:* if $H(X)$ is positive definite (all its eigenvalues are positive), $X$ is a local minimum;
*maximum:* if $H(X)$ is negative definite (all its eigenvalues are negative), $X$ is a local maximum;
*saddle:* if $H(X)$ is indefinite (it has both positive and negative eigenvalues), $X$ is a saddle point.

These are sometimes called tests for concavity: minima occur where $f$ is convex and maxima where $f$ is concave (i.e., where $-f$ is convex).
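The second partial derivative test can be carried out numerically by checking the eigenvalue signs of the hessian. A minimal sketch, using the hypothetical example function $f(x, y) = x^2 - y^2$, whose stationary point is $(0, 0)$ and whose hessian is constant:

```python
import numpy as np

# For the hypothetical f(x, y) = x**2 - y**2, the hessian
# H_ij = d^2 f / (dx_i dx_j) evaluated anywhere is:
H = np.array([[2.0, 0.0],
              [0.0, -2.0]])

ev = np.linalg.eigvalsh(H)
if np.all(ev > 0):
    kind = "local minimum"  # H positive definite
elif np.all(ev < 0):
    kind = "local maximum"  # H negative definite
else:
    kind = "saddle point"   # H indefinite
print(kind)  # saddle point
```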

It turns out, however, that solving Equation grad.1 directly for stationary points is generally hard. Therefore, we will typically use an iterative technique for estimating them.

The gradient points the way

Although Equation grad.1 isn't usually directly useful for computing stationary points, it suggests iterative techniques that are. Several techniques rely on the insight that the gradient points toward stationary points. Recall from Lecture vecs.grad that $\nabla f$ is a vector field that points in the direction of greatest increase in $f$.

Consider starting at some point $x_0$ and wanting to move iteratively closer to a stationary point. If one is seeking a maximum of $f$, then choose $x_1$ to be in the direction of $\nabla f$. If one is seeking a minimum of $f$, then choose $x_1$ to be opposite the direction of $\nabla f$.

The question becomes: how far $\alpha$ should we go in (or opposite) the direction of the gradient? Surely too-small $\alpha$ will require more iteration, and too-large $\alpha$ will lead to poor convergence or missing minima altogether. This framing of the problem is called line search. There are a few common methods for choosing $\alpha$, called the step size, some more computationally efficient than others.

Two methods for choosing the step size are described below. Both are framed as minimization methods, but changing the sign of the step turns them into maximization methods.

The classical method

Let

$$g_k = \nabla f(x_k), \qquad \text{(grad.3)}$$

the gradient at the algorithm's current estimate $x_k$ of the minimum. The classical method of choosing $\alpha$ is to attempt to solve analytically for

$$\alpha_k = \operatorname*{argmin}_{\alpha} f(x_k - \alpha g_k). \qquad \text{(grad.4)}$$

This solution approximates the function $f$ as one varies $\alpha$. It is approximate because as $\alpha$ varies, so should $x$. But even with $\alpha$ as the only variable, Equation grad.4 may be difficult or impossible to solve. However, this is sometimes called the "optimal" choice for $\alpha$. Here "optimality" refers not to practicality but to ideality. This method is rarely used to solve practical problems.

The algorithm of the classical gradient descent method can be summarized in the pseudocode of Algorithm 1. It is described further in Kreyszig.¹

1. Kreyszig, *Advanced Engineering Mathematics*, § 22.1.
2. Jonathan Barzilai and Jonathan M. Borwein, "Two-Point Step Size Gradient Methods," *IMA Journal of Numerical Analysis* 8.1 (Jan. 1988), pp. 141-148, ISSN 0272-4979, DOI 10.1093/imanum/8.1.141. This includes an innovative line search method.

Algorithm 1: Classical gradient descent

1. procedure classical_minimizer(f, x0, T)
2.   while ‖δx‖ > T do  ▷ until threshold T is met
3.     g_k ← ∇f(x_k)
4.     α_k ← argmin_α f(x_k − α g_k)
5.     x_{k+1} ← x_k − α_k g_k
6.     δx ← x_{k+1} − x_k
7.     k ← k + 1
8.   end while
9.   return x_k  ▷ the threshold was reached
10. end procedure

The Barzilai and Borwein method

In practice, several non-classical methods are used for choosing step size $\alpha$. Most of these construct criteria for step sizes that are too small and too large and prescribe choosing some $\alpha$ that (at least in certain cases) must be in the sweet spot in between. Barzilai and Borwein² developed such a prescription, which we now present.

Let $\Delta x_k = x_k - x_{k-1}$ and $\Delta g_k = g_k - g_{k-1}$. This method minimizes $\|\Delta x - \alpha \Delta g\|^2$ by choosing

$$\alpha_k = \frac{\Delta x_k \cdot \Delta g_k}{\Delta g_k \cdot \Delta g_k}. \qquad \text{(grad.5)}$$
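The step-size rule (grad.5) is simple enough to sketch directly in plain numpy before the full class in Example opt.grad-1. A minimal sketch, assuming the quadratic objective $f_1(x) = (x_1 - 25)^2 + 13(x_2 + 10)^2$ from that example, whose gradient is $[2(x_1 - 25),\ 26(x_2 + 10)]$ and whose minimum is $(25, -10)$:

```python
import numpy as np

def g(x):
    """Gradient of f1(x) = (x1 - 25)**2 + 13*(x2 + 10)**2."""
    return np.array([2*(x[0] - 25), 26*(x[1] + 10)])

x_km1 = np.array([-50.0, 40.0])   # starting point
x_k = x_km1 - 1e-4*g(x_km1)       # one small first step to seed the history
for _ in range(50):
    dg = g(x_k) - g(x_km1)
    if not dg.any():              # gradient no longer changing: converged
        break
    dx = x_k - x_km1
    alpha = dx.dot(dg)/dg.dot(dg)  # Barzilai-Borwein step size (grad.5)
    x_km1, x_k = x_k, x_k - alpha*g(x_k)
print(np.round(x_k, 6))  # approaches the minimum (25, -10)
```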

The algorithm of this gradient descent method can be summarized in the pseudocode of Algorithm 2. It is described further in Barzilai and Borwein.³

3. Barzilai and Borwein, "Two-Point Step Size Gradient Methods."

Algorithm 2: Barzilai and Borwein gradient descent

1. procedure barzilai_minimizer(f, x0, T)
2.   while ‖δx‖ > T do  ▷ until threshold T is met
3.     g_k ← ∇f(x_k)
4.     Δg_k ← g_k − g_{k−1}
5.     Δx_k ← x_k − x_{k−1}
6.     α_k ← (Δx_k · Δg_k)/(Δg_k · Δg_k)
7.     x_{k+1} ← x_k − α_k g_k
8.     δx ← x_{k+1} − x_k
9.     k ← k + 1
10.  end while
11.  return x_k  ▷ the threshold was reached
12. end procedure

Example opt.grad-1: Barzilai and Borwein gradient descent

Problem Statement

Consider the functions (a) $f_1 : \mathbb{R}^2 \to \mathbb{R}$ and (b) $f_2 : \mathbb{R}^2 \to \mathbb{R}$ defined as

$$f_1(x) = (x_1 - 25)^2 + 13(x_2 + 10)^2 \qquad \text{(grad.6)}$$
$$f_2(x) = \frac{1}{2}x \cdot Ax - b \cdot x \qquad \text{(grad.7)}$$

where

$$A = \begin{bmatrix} 20 & 0 \\ 0 & 10 \end{bmatrix} \quad\text{and} \qquad \text{(grad.8a)}$$
$$b = \begin{bmatrix} 1 & 1 \end{bmatrix}^\top. \qquad \text{(grad.8b)}$$

Use the method of Barzilai and Borwein,ᵃ starting at some $x_0$, to find a minimum of each function.

a. Ibid.

Problem Solution

First load some Python packages

opt Optimization grad Gradient descent p 3

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex
from tabulate import tabulate

We begin by writing a class gradient_descent_min to perform the gradient descent. This is not optimized for speed.

class gradient_descent_min():
    """A Barzilai and Borwein gradient descent class

    Inputs:
      f: Python function of x variables
      x: list of symbolic variables (e.g., [x1, x2])
      x0: list, numeric initial guess of a min of f
      T: step size threshold for stopping the descent

    To execute the gradient descent, call the descend method.

    nb: This is only for gradients in cartesian
    coordinates. Further work would be to implement
    this in multiple or generalized coordinates.
    See the grad method below for implementation.
    """

    def __init__(self, f, x, x0, T):
        self.f = f
        self.x = Array(x)
        self.x0 = np.array(x0)
        self.T = T
        self.n = len(x0)  # size of x
        self.g = lambdify(x, self.grad(f, x), 'numpy')
        self.xk = np.array(x0)
        self.table = {}

    def descend(self):
        # unpack variables
        f, x, x0, T, g = self.f, self.x, self.x0, self.T, self.g
        # initialize variables
        N = 0
        x_k = x0
        dx = 2*T  # can't be zero
        x_km1 = .9*x0 - .1  # can't equal x0
        g_km1 = np.array(g(*x_km1))
        N_max = 100  # max iterations
        table_data = [[N, x0, np.array(g(*x0)), 0]]
        while (dx > T and N < N_max) or N < 1:
            N += 1  # increment index
            g_k = np.array(g(*x_k))
            dg_k = g_k - g_km1
            dx_k = x_k - x_km1
            alpha_k = dx_k.dot(dg_k)/dg_k.dot(dg_k)
            x_km1 = x_k  # store
            x_k = x_k - alpha_k*g_k
            table_data.append([N, x_k, g_k, alpha_k])  # save
            self.xk = np.vstack((self.xk, x_k))
            # store other variables
            g_km1 = g_k
            dx = np.linalg.norm(x_k - x_km1)  # check
        self.tabulater(table_data)

    def tabulater(self, table_data):
        np.set_printoptions(precision=2)
        tabulate.LATEX_ESCAPE_RULES = {}
        self.table['python'] = tabulate(
            table_data, headers=['N', 'x_k', 'g_k', 'alpha'])
        self.table['latex'] = tabulate(
            table_data,
            headers=['$N$', '$\\bm{x}_k$', '$\\bm{g}_k$', '$\\alpha$'],
            tablefmt='latex_raw')

    def grad(self, f, x):  # cartesian coords gradient
        return derive_by_array(f(x), x)

First consider f1

var('x1 x2')
x = Array([x1, x2])
f1 = lambda x: (x[0] - 25)**2 + 13*(x[1] + 10)**2
gd = gradient_descent_min(f=f1, x=x, x0=[-50, 40], T=1e-8)


Perform the gradient descent

gd.descend()

Print the interesting variables

print(gd.table['python'])

  N  x_k              g_k                    alpha
---  ---------------  ---------------------  ---------
  0  [-50  40]        [-150 1300]            0
  1  [-43.65 -15.03]  [-150 1300]            0.0423296
  2  [-38.36 -10.  ]  [-137.3  -130.74]      0.0384979
  3  [-33.11 -10.  ]  [-1.27e+02  1.24e-01]  0.041454
  4  [ 24.99 -10.  ]  [-1.16e+02 -9.62e-03]  0.499926
  5  [ 25.   -10.05]  [-0.02  0.12]          0.499999
  6  [ 25. -10.]      [-1.84e-08 -1.38e+00]  0.0385225
  7  [ 25. -10.]      [-1.70e-08  2.19e-03]  0.0384615
  8  [ 25. -10.]      [-1.57e-08  0.00e+00]  0.0384615

Now let's lambdify the function f1 so we can plot:

f1_lambda = lambdify((x1, x2), f1(x), 'numpy')

Now let's plot a contour plot with the gradient descent overlaid:


fig, ax = plt.subplots()
# contour plot
X1 = np.linspace(-100, 100, 100)
X2 = np.linspace(-50, 50, 100)
X1, X2 = np.meshgrid(X1, X2)
F1 = f1_lambda(X1, X2)
plt.contourf(X1, X2, F1)
# gradient descent plot
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.collections import LineCollection
xX1 = gd.xk[:, 0]
xX2 = gd.xk[:, 1]
points = np.array([xX1, xX2]).T.reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)
lc = LineCollection(segments, cmap=plt.get_cmap('Reds'))
lc.set_array(np.linspace(0, 1, len(xX1)))  # color segs
lc.set_linewidth(3)
ax.autoscale(False)  # avoid the scatter changing lims
ax.add_collection(lc)
ax.scatter(
    xX1, xX2, zorder=1, marker='o',
    color=plt.cm.Reds(np.linspace(0, 1, len(xX1))),
    edgecolor='none')
plt.show()

[Figure: contour plot of f1 over x1 ∈ [−100, 100] and x2 ∈ [−50, 50] with the gradient descent steps overlaid]


Now consider f2

A = Matrix([[10, 0], [0, 20]])
b = Matrix([[1, 1]])
def f2(x):
    X = Array([x]).tomatrix().T
    return 1/2*X.dot(A*X) - b.dot(X)

gd = gradient_descent_min(f=f2, x=x, x0=[50, -40], T=1e-8)

Perform the gradient descent

gd.descend()

Print the interesting variables

print(gd.table['python'])


  N  x_k            g_k                    alpha
---  -------------  ---------------------  ---------
  0  [ 50 -40]      [ 499 -801]            0
  1  [17.58 12.04]  [ 499 -801]            0.0649741
  2  [ 8.07 -1.01]  [174.78 239.88]        0.0544221
  3  [3.62 0.17]    [ 79.66 -21.22]        0.0558582
  4  [ 0.49 -0.05]  [35.16  2.49]          0.0889491
  5  [0.1  0.14]    [ 3.89 -1.94]          0.0990201
  6  [0.1 0.  ]     [0.04 1.9 ]            0.0750849
  7  [0.1  0.05]    [ 0.01 -0.95]          0.050005
  8  [0.1  0.05]    [4.74e-03 9.58e-05]    0.0500012
  9  [0.1  0.05]    [ 2.37e-03 -2.38e-09]  0.0999186
 10  [0.1  0.05]    [1.93e-06 2.37e-09]    0.1
 11  [0.1  0.05]    [ 0.00e+00 -2.37e-09]  0.0999997

Now let's lambdify the function f2 so we can plot:

f2_lambda = lambdify((x1, x2), f2(x), 'numpy')

Now let's plot a contour plot with the gradient descent overlaid:

fig, ax = plt.subplots()
# contour plot
X1 = np.linspace(-100, 100, 100)
X2 = np.linspace(-50, 50, 100)
X1, X2 = np.meshgrid(X1, X2)
F2 = f2_lambda(X1, X2)
plt.contourf(X1, X2, F2)
# gradient descent plot
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.collections import LineCollection
xX1 = gd.xk[:, 0]
xX2 = gd.xk[:, 1]
points = np.array([xX1, xX2]).T.reshape(-1, 1, 2)
segments = np.concatenate([points[:-1], points[1:]], axis=1)
lc = LineCollection(segments, cmap=plt.get_cmap('Reds'))
lc.set_array(np.linspace(0, 1, len(xX1)))  # color segs
lc.set_linewidth(3)
ax.autoscale(False)  # avoid the scatter changing lims
ax.add_collection(lc)
ax.scatter(
    xX1, xX2, zorder=1, marker='o',
    color=plt.cm.Reds(np.linspace(0, 1, len(xX1))),
    edgecolor='none')
plt.show()

[Figure: contour plot of f2 over x1 ∈ [−100, 100] and x2 ∈ [−50, 50] with the gradient descent steps overlaid]

Python code in this section was generated from a Jupyter notebook named gradient_descent.ipynb with a python3 kernel.



opt.lin Constrained linear optimization

Consider a linear objective function f : Rⁿ → R with variables xi in vector x and coefficients ci in vector c:

f(x) = c · x,  (lin.1)

subject to the linear constraints—restrictions on the xi—

Ax ≤ a,  (lin.2a)
Bx = b, and  (lin.2b)
l ≤ x ≤ u,  (lin.2c)

where A and B are constant matrices and a, b, l, u are n-vectors. This is one formulation of what is called a linear programming problem. Usually we want to maximize f over the constraints. Such problems frequently arise throughout engineering, for instance in manufacturing, transportation, operations, etc. They are called constrained because there are constraints on x; they are called linear because the objective function and the constraints are linear.
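The constraint set (lin.2) is straightforward to check numerically. A minimal sketch, assuming small hypothetical A, a, B, b, l, u of my own choosing (the function name and tolerance are also mine, not from the text):

```python
import numpy as np

def is_feasible(x, A, a, B, b, l, u, tol=1e-9):
    # x is feasible iff A x <= a, B x = b, and l <= x <= u
    return (np.all(A @ x <= a + tol)
            and np.allclose(B @ x, b, atol=tol)
            and np.all(l - tol <= x) and np.all(x <= u + tol))

# Hypothetical constraint data for illustration:
A = np.array([[1.0, 2.0]]); a = np.array([4.0])
B = np.array([[1.0, -1.0]]); b = np.array([0.0])
l = np.zeros(2); u = np.full(2, 3.0)
```

With this data, x = (1, 1) satisfies all three groups of constraints, while x = (3, 3) violates the inequality Ax ≤ a.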

We call a pair (x, f(x)) for which x satisfies Equation lin.2 a feasible solution. Of course, not every feasible solution is optimal: a feasible solution is optimal iff there exists no other feasible solution for which f is greater (assuming we're maximizing). We call the vector subspace of feasible solutions S ⊂ Rⁿ.

Feasible solutions form a polytope

Consider the effect of the constraints. Each of the equalities and inequalities defines a linear hyperplane in Rⁿ (i.e., a linear subspace of dimension n − 1), either as a boundary of S (inequality) or as a restriction of S to the hyperplane. When joined, these hyperplanes are the boundary of S (equalities restrict S to lower dimension). So we see that each of the boundaries of S is flat, which makes S a polytope (in R², a polygon). What makes this especially interesting is that polytopes have vertices where the hyperplanes intersect. Solutions at the vertices are called basic feasible solutions.

Only the vertices matter

Our objective function f is linear, so for some constant h, f(x) = h defines a level set that is itself a hyperplane H in Rⁿ. If this hyperplane intersects S at a point x, (x, f(x) = h) is the corresponding solution. There are three possibilities when H intersects S:

1. H ∩ S is a vertex of S,
2. H ∩ S is a boundary hyperplane of S, or
3. H ∩ S slices through the interior of S.

However, this third option implies that there exists a level set G corresponding to f(x) = g such that G intersects S and g > h, so solutions on H ∩ S are not optimal. (We have not proven this, but it may be clear from our progression.) We conclude that either the first or second case must be true for optimal solutions. And notice that in both cases a (potentially optimal) solution occurs at at least one vertex. The key insight, then, is that an optimal solution occurs at a vertex of S.


This means we don't need to search all of S, or even its boundary: we need only search the vertices. Helpful as this is, it restricts us down to "n-choose-constraints" potentially optimal solutions—usually still too many to search in a naïve way. In Lecture opt.simplex this is mitigated by introducing a powerful searching method.
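The vertex principle can be checked by brute force on a tiny problem: intersect every pair of boundary hyperplanes, keep the feasible intersections (the vertices), and take the best one. The LP below is a hypothetical instance of my own, not one from the text:

```python
import itertools
import numpy as np

# Maximize f = c . x subject to A x <= a and x >= 0 (hypothetical data):
c = np.array([3.0, 2.0])
A = np.array([[1.0, 1.0], [2.0, 1.0]])
a = np.array([4.0, 6.0])

# Stack all boundary hyperplanes, including x1 >= 0 and x2 >= 0:
H = np.vstack([A, -np.eye(2)])
h = np.concatenate([a, np.zeros(2)])

vertices = []
for i, j in itertools.combinations(range(len(h)), 2):
    M = H[[i, j]]
    if abs(np.linalg.det(M)) < 1e-12:
        continue  # parallel hyperplanes: no intersection
    v = np.linalg.solve(M, h[[i, j]])
    if np.all(H @ v <= h + 1e-9):  # keep only feasible intersections
        vertices.append(v)

best = max(vertices, key=lambda v: c.dot(v))
```

For this instance the four vertices are (0, 0), (3, 0), (0, 4), and (2, 2), and the maximum of f occurs at the vertex (2, 2), consistent with the key insight above.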


4. Another Python package, pulp (PuLP), is probably more popular for linear programming; however, we choose scipy.optimize because it has applications beyond linear programming.

opt.simplex The simplex algorithm

The simplex algorithm (or “method”) is an iterative technique for finding an optimal solution of the linear programming problem of Equations lin.1 and lin.2. The details of the algorithm are somewhat involved, but the basic idea is to start at a vertex of the feasible solution space S and traverse an edge of the polytope that leads to another vertex with a greater value of f. Then repeat this process until there is no neighboring vertex with a greater value of f, at which point the solution is guaranteed to be optimal.

Rather than present the details of the algorithm, we choose to show an example using Python. There have been some improvements on the original algorithm that have been implemented into many standard software packages, including Python's scipy package (Pauli Virtanen et al. “SciPy 1.0–Fundamental Algorithms for Scientific Computing in Python.” In: arXiv e-prints, arXiv:1907.10121 [July 2019]), module scipy.optimize.⁴

Example opt.simplex-1  re: simplex method using scipy.optimize

Problem Statement

Maximize the objective function

f(x) = c · x  (simplex.1a)

for x ∈ R² and

c = [5 2]ᵀ,  (simplex.1b)

subject to constraints

0 ≤ x1 ≤ 10  (simplex.2a)
−5 ≤ x2 ≤ 15  (simplex.2b)
4x1 + x2 ≤ 40  (simplex.2c)
x1 + 3x2 ≤ 35  (simplex.2d)
−8x1 − x2 ≥ −75.  (simplex.2e)

Problem Solution

First, load some Python packages:

from scipy.optimize import linprog
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Encoding the problem

Before we can use linprog, we must first encode Equations simplex.1 and simplex.2 into a form linprog will recognize. We begin with f, which we can write as c · x with the coefficients of c as follows:

c = [-5, -2]  # negative to find max

We've negated each constant because linprog minimizes f, and we want to maximize f. Now let's encode the inequality constraints. We will write the left-hand-side coefficients in the matrix A and the right-hand-side values in vector a such that

Ax ≤ a.  (simplex.3)

Notice that one of our constraint inequalities is ≥ instead of ≤. We can flip this by multiplying the inequality by −1. We use simple lists to encode A and a:

A = [
    [4, 1],
    [1, 3],
    [8, 1],
]
a = [40, 35, 75]

Now we need to define the lower l and upper u bounds of x. The function linprog expects these to be in a single list of lower and upper bounds of each xi:

lu = [
    (0, 10),
    (-5, 15),
]

We want to keep track of each step linprog takes. We can access these by defining a function callback to be passed to linprog:

x = []  # for storing the steps
def callback(res):  # called at each step
    global x
    print(f'nit = {res.nit}, x_k = {res.x}')
    x.append(res.x.copy())  # store

Now we need to call linprog. We don't have any equality constraints, so we need only use the keyword arguments A_ub=A and b_ub=a. For demonstration purposes, we tell it to use the simplex method, which is not as good as its other methods, which use better algorithms based on the simplex:

res = linprog(
    c, A_ub=A, b_ub=a, bounds=lu,
    method='simplex', callback=callback,
)
x = np.array(x)

nit = 0, x_k = [ 0. -5.]
nit = 1, x_k = [10. -5.]
nit = 2, x_k = [8.75 5.  ]
nit = 3, x_k = [7.72727273 9.09090909]
nit = 4, x_k = [7.72727273 9.09090909]
nit = 5, x_k = [7.72727273 9.09090909]
nit = 5, x_k = [7.72727273 9.09090909]

So the optimal solution (x, f(x)) is as follows:

print(f'optimum x: {res.x}')
print(f'optimum f(x): {-res.fun}')

optimum x: [7.72727273 9.09090909]
optimum f(x): 56.81818181818182

The last point was repeated:

1. once because there was no adjacent vertex with greater f(x), and
2. twice because the algorithm calls `callback` twice on the last step.
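As a spot check (my own verification, not part of the original solution), the reported optimum should be the basic feasible solution where the two active constraints, 4x1 + x2 = 40 (simplex.2c) and x1 + 3x2 = 35 (simplex.2d), intersect:

```python
import numpy as np

# Intersection of the two active constraint hyperplanes:
A_active = np.array([[4.0, 1.0], [1.0, 3.0]])
a_active = np.array([40.0, 35.0])
vertex = np.linalg.solve(A_active, a_active)   # basic feasible solution
f_at_vertex = np.array([5.0, 2.0]).dot(vertex)  # c . x there
```

The vertex is (85/11, 100/11) ≈ (7.727, 9.091) with f = 625/11 ≈ 56.82, matching linprog's output.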

Plotting

When the solution space is in R², it is helpful to graphically represent the solution space, constraints, and the progression of the algorithm. We begin by defining the inequality lines from A and a over the bounds of x1:

n = len(c)  # number of variables x
m = np.shape(A)[0]  # number of inequality constraints
x2 = np.empty([m, 2])
for i in range(0, m):
    x2[i] = -A[i][0]/A[i][1]*np.array(lu[0]) + a[i]/A[i][1]

Now we plot a contour plot of f over the bounds of x1 and x2 and overlay the inequality constraints and the steps of the algorithm stored in x:

import matplotlib as mpl  # needed for rcParams below
lu_array = np.array(lu)
fig, ax = plt.subplots()
mpl.rcParams['lines.linewidth'] = 3
# contour plot
X1 = np.linspace(*lu_array[0], 100)
X2 = np.linspace(*lu_array[1], 100)
X1, X2 = np.meshgrid(X1, X2)
F2 = -c[0]*X1 + -c[1]*X2  # negative because max hack
con = ax.contourf(X1, X2, F2)
cbar = fig.colorbar(con, ax=ax)
cbar.ax.set_ylabel('objective function')
# bounds on x
un = np.array([1, 1])
opts = {'c': 'w', 'label': None, 'linewidth': 6}
plt.plot(lu_array[0], lu_array[1, 0]*un, **opts)
plt.plot(lu_array[0], lu_array[1, 1]*un, **opts)
plt.plot(lu_array[0, 0]*un, lu_array[1], **opts)
plt.plot(lu_array[0, 1]*un, lu_array[1], **opts)
# inequality constraints
for i in range(0, m):
    p = plt.plot(lu[0], x2[i], c='w')
p[0].set_label('constraint')
# steps
plt.plot(
    x[:, 0], x[:, 1], '-o', c='r', clip_on=False,
    zorder=20, label='simplex',
)
plt.ylim(lu_array[1])
plt.xlabel('$x_1$')
plt.ylabel('$x_2$')
plt.legend()
plt.show()


[Figure: contour plot of the objective function over the bounds of x, with the white constraint lines and the red simplex steps overlaid; the colorbar is labeled “objective function”]

Python code in this section was generated from a Jupyter notebook named simplex_linear_programming.ipynb with a python3 kernel.


5. Barzilai and Borwein, “Two-Point Step Size Gradient Methods.”

opt.ex Exercises for Chapter 08

Exercise 08.1 (cummerbund)

Consider the functions (a) f1 : R² → R and (b) f2 : R² → R defined as

f1(x) = 4(x1 − 16)² + (x2 + 64)² + x1 sin² x1  (ex.1)
f2(x) = (1/2) x · Ax − b · x  (ex.2)

where

A = [5  0
     0 15]  and  (ex.3a)
b = [−2 1]ᵀ.  (ex.3b)

Use the method of Barzilai and Borwein,⁵ starting at some x0, to find a minimum of each function.

Exercise 08.2 (melty)

Maximize the objective function

f(x) = c · x  (ex.4a)

for x ∈ R³ and

c = [3 −8 1]ᵀ,  (ex.4b)

subject to constraints

0 ≤ x1 ≤ 20  (ex.5a)
−5 ≤ x2 ≤ 0  (ex.5b)
5 ≤ x3 ≤ 17  (ex.5c)
x1 + 4x2 ≤ 50  (ex.5d)
2x1 + x3 ≤ 43  (ex.5e)
−4x1 + x2 − 5x3 ≥ −99.  (ex.5f)

1. As is customary, we frequently say “system” when we mean “mathematical system model.” Recall that multiple models may be used for any given physical system, depending on what one wants to know.

nlin  Nonlinear analysis

1  The ubiquity of near-linear systems and the tools we have for analyses thereof can sometimes give the impression that nonlinear systems are exotic or even downright flamboyant. However, a great many systems¹ important for a mechanical engineer are frequently hopelessly nonlinear. Here are some examples of such systems:

• A robot arm
• Viscous fluid flow (usually modelled by the Navier-Stokes equations)
• _______________
• Anything that “fills up” or “saturates”
• Nonlinear optics
• Einstein's field equations (gravitation in general relativity)
• Heat radiation and nonlinear heat conduction
• Fracture mechanics
• _______________


2. S.H. Strogatz and M. Dichter. Nonlinear Dynamics and Chaos. Second. Studies in Nonlinearity. Avalon Publishing, 2016. isbn: 9780813350844.
3. W. Richard Kolk and Robert A. Lerman. Nonlinear System Dynamics. 1st ed. Springer US, 1993. isbn: 978-1-4684-6496-2.

2  Lest we think this is merely an inconvenience, we should keep in mind that it is actually the nonlinearity that makes many phenomena useful. For instance, the _______________ depends on the nonlinearity of its optics. Similarly, transistors and the digital circuits made thereby (including the microprocessor) wouldn't function if their physics were linear.

3  In this chapter, we will see some ways to formulate, characterize, and simulate nonlinear systems. Purely _______________ are few for nonlinear systems. Most are beyond the scope of this text, but we describe a few, mostly in Lecture nlin.char. Simulation via numerical integration of nonlinear dynamical equations is the most accessible technique, so it is introduced.

4  We skip a discussion of linearization; of course, if this is an option, it is preferable. Instead, we focus on the _______________.

5  For a good introduction to nonlinear dynamics, see Strogatz and Dichter.² A more engineer-oriented introduction is Kolk and Lerman.³


4. Strogatz and Dichter, Nonlinear Dynamics and Chaos.


nlin.ss Nonlinear state-space models

1  A state-space model has the general form

dx/dt = f(x, u, t)  (ss.1a)
y = g(x, u, t),  (ss.1b)

where f and g are vector-valued functions that depend on the system. Nonlinear state-space models are those for which f is a nonlinear functional of either x or u. For instance, a state variable x1 might appear as x1², or two state variables might combine as x1 x2, or an input u1 might enter the equations as log u1.

Autonomous and nonautonomous systems

2  An autonomous system is one for which dx/dt = f(x), with neither time nor input appearing explicitly. A nonautonomous system is one for which either t or u do appear explicitly in f. It turns out that we can always write nonautonomous systems as autonomous by substituting in u(t) and introducing an extra state variable for t.⁴

3  Therefore, without loss of generality, we will focus on ways of analyzing autonomous systems.
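A minimal numerical sketch of this substitution (the toy system x′ = −x + sin t is my own example, not from the text): append the extra state x2 = t with x2′ = 1 and check that the autonomous form reproduces the nonautonomous trajectory:

```python
import numpy as np

# Nonautonomous: x' = -x + sin(t)
def f_nonauto(t, x):
    return -x + np.sin(t)

# Autonomous form with x2 = t:  x1' = -x1 + sin(x2),  x2' = 1
def f_auto(z):
    x1, x2 = z
    return np.array([-x1 + np.sin(x2), 1.0])

# Forward-Euler integration of both forms from the same initial condition:
dt, x, z = 1e-3, 1.0, np.array([1.0, 0.0])
for k in range(1000):
    x = x + dt * f_nonauto(k * dt, x)
    z = z + dt * f_auto(z)
```

Since the second state merely accumulates time, the two trajectories agree and z[1] tracks t.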

Equilibrium

4  An equilibrium state (also called a stationary point) x̄ is one for which dx/dt = 0. In most cases, this occurs only when the input u is a constant ū and, for time-varying systems, at a given time t̄. For autonomous systems, equilibrium occurs when the following holds:

f(x̄) = 0.  (ss.2)

This is a system of nonlinear algebraic equations, which can be challenging to solve for x̄. However, frequently several solutions—that is, equilibrium states—do exist.
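Such equilibrium equations can often be handed to a computer algebra system. A sketch with sympy, using a hypothetical damped-pendulum-like system of my own choosing (not one from the text):

```python
import sympy as sp

# Hypothetical autonomous system:
#   x1' = x2,  x2' = -sin(x1) - x2
x1, x2 = sp.symbols('x1 x2', real=True)
f = [x2, -sp.sin(x1) - x2]
equilibria = sp.solve(f, [x1, x2], dict=True)  # solve f(x) = 0
```

sympy returns the equilibria with x2 = 0 and sin(x1) = 0, e.g. (0, 0) and (π, 0): multiple equilibrium states, as the text notes frequently occurs.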

nlin Nonlinear analysis char Nonlinear system characteristics p 1

system order

nlin.char Nonlinear system characteristics

1  Characterizing nonlinear systems can be challenging without the tools developed for linear system characterization. However, there are ways of characterizing nonlinear systems, and we'll here explore a few. See Equation ss.1a.

Those in-common with linear systems

2  As with linear systems, the system order is either the number of state variables required to describe the system or, equivalently, the highest-order derivative in a single scalar differential equation describing the system.

3  Similarly, nonlinear systems can have state variables that depend on time alone, or those that also depend on space (or some other independent variable). The former lead to ordinary differential equations (ODEs) and the latter to partial differential equations (PDEs).

4  Equilibrium was already considered in Lecture nlin.ss.

Stability

5  In terms of system performance, perhaps no other criterion is as important as stability.


5. William L. Brogan. Modern Control Theory. Third. Prentice Hall, 1991, Ch. 10.
6. A. Choukchou-Braham et al. Analysis and Control of Underactuated Mechanical Systems. SpringerLink: Bücher. Springer International Publishing, 2013. isbn: 9783319026367, App. A.

Definition nlin.char.1: Stability

If x is perturbed from an equilibrium state x̄, the response x(t) can

1. asymptotically return to x̄ (asymptotically stable),
2. diverge from x̄ (unstable), or
3. remain perturbed or oscillate about x̄ with a constant amplitude (marginally stable).

Notice that this definition is actually local: stability in the neighborhood of one equilibrium may not be the same as in the neighborhood of another.

6  Other than nonlinear systems' lack of linear systems' eigenvalues, poles, and roots of the characteristic equation from which to compute it, the primary difference between the stability of linear and nonlinear systems is that nonlinear system stability is often difficult to establish. Using a linear system's eigenvalues, it is straightforward to establish stable, unstable, and marginally stable subspaces of state-space (via transforming to an eigenvector basis). For nonlinear systems, no such method exists. However, we are not without tools to explore nonlinear system stability. One mathematical tool to consider is Lyapunov stability theory, which is beyond the scope of this course, but has good treatments in⁵ and.⁶

Qualities of equilibria

7  Equilibria (i.e., stationary points) come in a variety of qualities. It is instructive to consider the first-order differential equation in state variable x with real constant r:

x′ = rx − x³.  (char.1)

If we plot x′ versus x for different values of r, we obtain the plots of Figure 09.1.

[Figure 09.1: plots of x′ versus x for Equation char.1, for (a) r < 0, (b) r = 0, and (c) r > 0]

8  By definition, equilibria occur when x′ = 0, so the x-axis crossings of Figure 09.1 are equilibria. The blue arrows on the x-axis show the direction of state change x′ quantified by the plots. For both (a) and (b), only one equilibrium exists: x̄ = 0. Note that the blue arrows in both plots point toward the equilibrium. In such cases—that is, when a neighborhood exists around an equilibrium for which state changes point toward the equilibrium—the equilibrium is called an attractor or sink. Note that attractors are stable.

9  Now consider (c) of Figure 09.1. When r > 0, three equilibria emerge. This change of the number of equilibria with the changing of a parameter is called a bifurcation.


A plot of bifurcations versus the parameter is called a bifurcation diagram. The x̄ = 0 equilibrium now has arrows that point away from it. Such an equilibrium is called a repeller or source and is unstable. The other two equilibria here are (stable) attractors. Consider a very small initial condition x(0) = ε. If ε > 0, the repeller pushes away x and the positive attractor pulls x to itself. Conversely, if ε < 0, the repeller again pushes away x and the negative attractor pulls x to itself.
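These claims are easy to verify numerically. A sketch (r = 4 is my own illustrative choice): the equilibria of (char.1) are the real roots of rx − x³, and the sign of the slope r − 3x² at each root distinguishes repeller from attractor:

```python
import numpy as np

# Equilibria of x' = r*x - x**3 for r > 0 are x = 0 and x = ±sqrt(r).
r = 4.0
equilibria = np.sort(np.roots([-1.0, 0.0, r, 0.0]).real)  # roots of -x^3 + r*x

# d(x')/dx = r - 3*x**2: positive slope -> repeller, negative -> attractor
slope = lambda x: r - 3*x**2
```

Here the equilibria are −2, 0, and 2; the slope is positive at 0 (repeller) and negative at ±2 (attractors), matching the arrows of Figure 09.1(c).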

10  Another type of equilibrium is called the saddle, which acts as an attractor along some lines and as a repeller along others. We will see this type in the following example.

Example nlin.char-1  re: saddle bifurcation

Problem Statement

Consider the dynamical equation

x′ = x² + r  (char.2)

with r a real constant. Sketch x′ vs. x for negative, zero, and positive r. Identify and classify each of the equilibria.


nlin.sim Nonlinear system simulation


nlin.pysim Simulating nonlinear systems in Python

Example nlin.pysim-1  re: a nonlinear unicycle

Problem Statement

asdf

Problem Solution

First, load some Python packages:

import numpy as np
import sympy as sp
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

The state equation can be encoded via the following function f:

def f(t, x, u, c):
    dxdt = [
        x[3]*np.cos(x[2]),
        x[3]*np.sin(x[2]),
        x[4],
        1/c[0] * u(t)[0],
        1/c[1] * u(t)[1],
    ]
    return dxdt

The input function u must also be defined:

def u(t):
    return [
        1.5*(1 + np.cos(t)),
        2.5*np.sin(3*t),
    ]

# Define time spans, initial values, and constants
tspan = np.linspace(0, 50, 300)
xinit = [0, 0, 0, 0, 0]
mass = 10
inertia = 10
c = [mass, inertia]

# Solve differential equation
sol = solve_ivp(
    lambda t, x: f(t, x, u, c),
    [tspan[0], tspan[-1]],
    xinit,
    t_eval=tspan,
)

Let's first plot the trajectory and instantaneous velocity:

[Figure 09.2: the trajectory in the x-y plane, with the instantaneous velocity shown by arrows]

xp = sol.y[3]*np.cos(sol.y[2])
yp = sol.y[3]*np.sin(sol.y[2])
p = plt.figure()
plt.plot(sol.y[0], sol.y[1])
plt.quiver(sol.y[0], sol.y[1], xp, yp)
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.show()

Python code in this section was generated from a Jupyter notebook named horcrux.ipynb with a python3 kernel.


nlin.ex Exercises for Chapter 09

A  Distribution tables


A.01 Gaussian distribution table

Below are plots of the Gaussian probability density function f and cumulative distribution function Φ. Below them is Table A.1 of CDF values.

[Figure A.1: the Gaussian PDF and CDF for z-scores: (a) the Gaussian PDF f(z), with the area Φ(zb) shaded up to the interval bound zb; (b) the Gaussian CDF Φ(zb) versus interval bound zb]

Table A.1: z-score table, Φ(zb) = P(z ∈ (−∞, zb])

zb     .00     .01     .02     .03     .04     .05     .06     .07     .08     .09
−3.4   0.0003  0.0003  0.0003  0.0003  0.0003  0.0003  0.0003  0.0003  0.0003  0.0002
−3.3   0.0005  0.0005  0.0005  0.0004  0.0004  0.0004  0.0004  0.0004  0.0004  0.0003
−3.2   0.0007  0.0007  0.0006  0.0006  0.0006  0.0006  0.0006  0.0005  0.0005  0.0005
−3.1   0.0010  0.0009  0.0009  0.0009  0.0008  0.0008  0.0008  0.0008  0.0007  0.0007
−3.0   0.0013  0.0013  0.0013  0.0012  0.0012  0.0011  0.0011  0.0011  0.0010  0.0010
−2.9   0.0019  0.0018  0.0018  0.0017  0.0016  0.0016  0.0015  0.0015  0.0014  0.0014
−2.8   0.0026  0.0025  0.0024  0.0023  0.0023  0.0022  0.0021  0.0021  0.0020  0.0019
−2.7   0.0035  0.0034  0.0033  0.0032  0.0031  0.0030  0.0029  0.0028  0.0027  0.0026
−2.6   0.0047  0.0045  0.0044  0.0043  0.0041  0.0040  0.0039  0.0038  0.0037  0.0036
−2.5   0.0062  0.0060  0.0059  0.0057  0.0055  0.0054  0.0052  0.0051  0.0049  0.0048
−2.4   0.0082  0.0080  0.0078  0.0075  0.0073  0.0071  0.0069  0.0068  0.0066  0.0064
−2.3   0.0107  0.0104  0.0102  0.0099  0.0096  0.0094  0.0091  0.0089  0.0087  0.0084
−2.2   0.0139  0.0136  0.0132  0.0129  0.0125  0.0122  0.0119  0.0116  0.0113  0.0110
−2.1   0.0179  0.0174  0.0170  0.0166  0.0162  0.0158  0.0154  0.0150  0.0146  0.0143
−2.0   0.0228  0.0222  0.0217  0.0212  0.0207  0.0202  0.0197  0.0192  0.0188  0.0183
−1.9   0.0287  0.0281  0.0274  0.0268  0.0262  0.0256  0.0250  0.0244  0.0239  0.0233
−1.8   0.0359  0.0351  0.0344  0.0336  0.0329  0.0322  0.0314  0.0307  0.0301  0.0294
−1.7   0.0446  0.0436  0.0427  0.0418  0.0409  0.0401  0.0392  0.0384  0.0375  0.0367
−1.6   0.0548  0.0537  0.0526  0.0516  0.0505  0.0495  0.0485  0.0475  0.0465  0.0455
−1.5   0.0668  0.0655  0.0643  0.0630  0.0618  0.0606  0.0594  0.0582  0.0571  0.0559
−1.4   0.0808  0.0793  0.0778  0.0764  0.0749  0.0735  0.0721  0.0708  0.0694  0.0681
−1.3   0.0968  0.0951  0.0934  0.0918  0.0901  0.0885  0.0869  0.0853  0.0838  0.0823
−1.2   0.1151  0.1131  0.1112  0.1093  0.1075  0.1056  0.1038  0.1020  0.1003  0.0985
−1.1   0.1357  0.1335  0.1314  0.1292  0.1271  0.1251  0.1230  0.1210  0.1190  0.1170
−1.0   0.1587  0.1562  0.1539  0.1515  0.1492  0.1469  0.1446  0.1423  0.1401  0.1379
−0.9   0.1841  0.1814  0.1788  0.1762  0.1736  0.1711  0.1685  0.1660  0.1635  0.1611
−0.8   0.2119  0.2090  0.2061  0.2033  0.2005  0.1977  0.1949  0.1922  0.1894  0.1867
−0.7   0.2420  0.2389  0.2358  0.2327  0.2296  0.2266  0.2236  0.2206  0.2177  0.2148
−0.6   0.2743  0.2709  0.2676  0.2643  0.2611  0.2578  0.2546  0.2514  0.2483  0.2451
−0.5   0.3085  0.3050  0.3015  0.2981  0.2946  0.2912  0.2877  0.2843  0.2810  0.2776
−0.4   0.3446  0.3409  0.3372  0.3336  0.3300  0.3264  0.3228  0.3192  0.3156  0.3121
−0.3   0.3821  0.3783  0.3745  0.3707  0.3669  0.3632  0.3594  0.3557  0.3520  0.3483
−0.2   0.4207  0.4168  0.4129  0.4090  0.4052  0.4013  0.3974  0.3936  0.3897  0.3859
−0.1   0.4602  0.4562  0.4522  0.4483  0.4443  0.4404  0.4364  0.4325  0.4286  0.4247
−0.0   0.5000  0.4960  0.4920  0.4880  0.4840  0.4801  0.4761  0.4721  0.4681  0.4641
0.0    0.5000  0.5040  0.5080  0.5120  0.5160  0.5199  0.5239  0.5279  0.5319  0.5359
0.1    0.5398  0.5438  0.5478  0.5517  0.5557  0.5596  0.5636  0.5675  0.5714  0.5753
0.2    0.5793  0.5832  0.5871  0.5910  0.5948  0.5987  0.6026  0.6064  0.6103  0.6141
0.3    0.6179  0.6217  0.6255  0.6293  0.6331  0.6368  0.6406  0.6443  0.6480  0.6517
0.4    0.6554  0.6591  0.6628  0.6664  0.6700  0.6736  0.6772  0.6808  0.6844  0.6879
0.5    0.6915  0.6950  0.6985  0.7019  0.7054  0.7088  0.7123  0.7157  0.7190  0.7224
0.6    0.7257  0.7291  0.7324  0.7357  0.7389  0.7422  0.7454  0.7486  0.7517  0.7549
0.7    0.7580  0.7611  0.7642  0.7673  0.7704  0.7734  0.7764  0.7794  0.7823  0.7852
0.8    0.7881  0.7910  0.7939  0.7967  0.7995  0.8023  0.8051  0.8078  0.8106  0.8133
0.9    0.8159  0.8186  0.8212  0.8238  0.8264  0.8289  0.8315  0.8340  0.8365  0.8389
1.0    0.8413  0.8438  0.8461  0.8485  0.8508  0.8531  0.8554  0.8577  0.8599  0.8621
1.1    0.8643  0.8665  0.8686  0.8708  0.8729  0.8749  0.8770  0.8790  0.8810  0.8830
1.2    0.8849  0.8869  0.8888  0.8907  0.8925  0.8944  0.8962  0.8980  0.8997  0.9015
1.3    0.9032  0.9049  0.9066  0.9082  0.9099  0.9115  0.9131  0.9147  0.9162  0.9177
1.4    0.9192  0.9207  0.9222  0.9236  0.9251  0.9265  0.9279  0.9292  0.9306  0.9319
1.5    0.9332  0.9345  0.9357  0.9370  0.9382  0.9394  0.9406  0.9418  0.9429  0.9441
1.6    0.9452  0.9463  0.9474  0.9484  0.9495  0.9505  0.9515  0.9525  0.9535  0.9545
1.7    0.9554  0.9564  0.9573  0.9582  0.9591  0.9599  0.9608  0.9616  0.9625  0.9633
1.8    0.9641  0.9649  0.9656  0.9664  0.9671  0.9678  0.9686  0.9693  0.9699  0.9706
1.9    0.9713  0.9719  0.9726  0.9732  0.9738  0.9744  0.9750  0.9756  0.9761  0.9767
2.0    0.9772  0.9778  0.9783  0.9788  0.9793  0.9798  0.9803  0.9808  0.9812  0.9817
2.1    0.9821  0.9826  0.9830  0.9834  0.9838  0.9842  0.9846  0.9850  0.9854  0.9857
2.2    0.9861  0.9864  0.9868  0.9871  0.9875  0.9878  0.9881  0.9884  0.9887  0.9890
2.3    0.9893  0.9896  0.9898  0.9901  0.9904  0.9906  0.9909  0.9911  0.9913  0.9916
2.4    0.9918  0.9920  0.9922  0.9925  0.9927  0.9929  0.9931  0.9932  0.9934  0.9936
2.5    0.9938  0.9940  0.9941  0.9943  0.9945  0.9946  0.9948  0.9949  0.9951  0.9952
2.6    0.9953  0.9955  0.9956  0.9957  0.9959  0.9960  0.9961  0.9962  0.9963  0.9964
2.7    0.9965  0.9966  0.9967  0.9968  0.9969  0.9970  0.9971  0.9972  0.9973  0.9974
2.8    0.9974  0.9975  0.9976  0.9977  0.9977  0.9978  0.9979  0.9979  0.9980  0.9981
2.9    0.9981  0.9982  0.9982  0.9983  0.9984  0.9984  0.9985  0.9985  0.9986  0.9986
3.0    0.9987  0.9987  0.9987  0.9988  0.9988  0.9989  0.9989  0.9989  0.9990  0.9990
3.1    0.9990  0.9991  0.9991  0.9991  0.9992  0.9992  0.9992  0.9992  0.9993  0.9993
3.2    0.9993  0.9993  0.9994  0.9994  0.9994  0.9994  0.9994  0.9995  0.9995  0.9995
3.3    0.9995  0.9995  0.9995  0.9996  0.9996  0.9996  0.9996  0.9996  0.9996  0.9997
3.4    0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9997  0.9998
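Entries of Table A.1 can be spot-checked with the Python standard library, since Φ(zb) = (1 + erf(zb/√2))/2:

```python
from math import erf, sqrt

def Phi(zb):
    # Gaussian CDF via the error function
    return 0.5*(1.0 + erf(zb/sqrt(2.0)))
```

For example, Phi(0) = 0.5, Phi(1.0) ≈ 0.8413, and Phi(-1.96) ≈ 0.0250, matching the corresponding table entries.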


A.02 Student's t-distribution table

Table A.2: two-tail inverse Student's t-distribution table

percent probability

ν 600 667 750 800 875 900 950 975 990 995 999

1 0325 0577 1000 1376 2414 3078 6314 12706 31821 63657 31831

2 0289 0500 0816 1061 1604 1886 2920 4303 6965 9925 22327

3 0277 0476 0765 0978 1423 1638 2353 3182 4541 5841 10215

4 0271 0464 0741 0941 1344 1533 2132 2776 3747 4604 7173

5 0267 0457 0727 0920 1301 1476 2015 2571 3365 4032 5893

6 0265 0453 0718 0906 1273 1440 1943 2447 3143 3707 5208

7 0263 0449 0711 0896 1254 1415 1895 2365 2998 3499 4785

8 0262 0447 0706 0889 1240 1397 1860 2306 2896 3355 4501

9 0261 0445 0703 0883 1230 1383 1833 2262 2821 3250 4297

10 0260 0444 0700 0879 1221 1372 1812 2228 2764 3169 4144

11 0260 0443 0697 0876 1214 1363 1796 2201 2718 3106 4025

12 0259 0442 0695 0873 1209 1356 1782 2179 2681 3055 3930

13 0259 0441 0694 0870 1204 1350 1771 2160 2650 3012 3852

14 0258 0440 0692 0868 1200 1345 1761 2145 2624 2977 3787

15 0258 0439 0691 0866 1197 1341 1753 2131 2602 2947 3733

16 0258 0439 0690 0865 1194 1337 1746 2120 2583 2921 3686

17 0257 0438 0689 0863 1191 1333 1740 2110 2567 2898 3646

18 0257 0438 0688 0862 1189 1330 1734 2101 2552 2878 3610

19 0257 0438 0688 0861 1187 1328 1729 2093 2539 2861 3579

20 0257 0437 0687 0860 1185 1325 1725 2086 2528 2845 3552

21 0257 0437 0686 0859 1183 1323 1721 2080 2518 2831 3527

22 0256 0437 0686 0858 1182 1321 1717 2074 2508 2819 3505

23 0256 0436 0685 0858 1180 1319 1714 2069 2500 2807 3485

24 0256 0436 0685 0857 1179 1318 1711 2064 2492 2797 3467

25 0256 0436 0684 0856 1178 1316 1708 2060 2485 2787 3450

26 0256 0436 0684 0856 1177 1315 1706 2056 2479 2779 3435

27 0256 0435 0684 0855 1176 1314 1703 2052 2473 2771 3421

28 0256 0435 0683 0855 1175 1313 1701 2048 2467 2763 3408

29 0256 0435 0683 0854 1174 1311 1699 2045 2462 2756 3396

30 0256 0435 0683 0854 1173 1310 1697 2042 2457 2750 3385

35 0255 0434 0682 0852 1170 1306 1690 2030 2438 2724 3340

40 0255 0434 0681 0851 1167 1303 1684 2021 2423 2704 3307

45 0255 0434 0680 0850 1165 1301 1679 2014 2412 2690 3281

50 0255 0433 0679 0849 1164 1299 1676 2009 2403 2678 3261

55 0255 0433 0679 0848 1163 1297 1673 2004 2396 2668 3245

60 0254 0433 0679 0848 1162 1296 1671 2000 2390 2660 3232infin 0253 0431 0674 0842 1150 1282 1645 1960 2326 2576 3090
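The tabulated quantiles can be reproduced from the Student's t density alone. The following stdlib-only sketch (function names, the crude trapezoidal integration, and the bisection search are all my own additions, not the text's method) treats each tabulated percent as a cumulative probability p and recovers the p-quantile:

```python
import math

def t_pdf(x, nu):
    # Student's t probability density with nu degrees of freedom
    c = math.gamma((nu + 1) / 2) / (math.sqrt(nu * math.pi) * math.gamma(nu / 2))
    return c * (1.0 + x * x / nu) ** (-(nu + 1) / 2)

def t_cdf(x, nu, n=4000):
    # CDF for x >= 0: symmetry gives 0.5 plus a trapezoidal integral on [0, x]
    h = x / n
    s = 0.5 * (t_pdf(0.0, nu) + t_pdf(x, nu))
    for i in range(1, n):
        s += t_pdf(i * h, nu)
    return 0.5 + s * h

def t_quantile(p, nu):
    # bisect for the p-quantile (p > 0.5); 400 exceeds every table entry
    a, b = 0.0, 400.0
    for _ in range(40):
        m = 0.5 * (a + b)
        if t_cdf(m, nu) < p:
            a = m
        else:
            b = m
    return 0.5 * (a + b)

q10 = t_quantile(0.975, 10)  # table row ν = 10, 97.5 column: 2.228
q1 = t_quantile(0.975, 1)    # table row ν = 1, 97.5 column: 12.706
```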

B Fourier and Laplace tables

B01 Laplace transforms

Table B.1 lists functions of time f(t) on the left and the corresponding Laplace transforms F(s) on the right. Where applicable, s = σ + jω is the Laplace transform variable, T is the time-domain period, ω0 = 2π/T is the corresponding angular frequency, j = √(-1), a ∈ R+, and b, t0 ∈ R are constants.

Table B.1: Laplace transform identities

function of time t                                 function of Laplace s

a1 f1(t) + a2 f2(t)                                a1 F1(s) + a2 F2(s)
f(t - t0)                                          F(s) e^(-t0 s)
f'(t)                                              s F(s) - f(0)
d^n f(t)/dt^n                                      s^n F(s) - s^(n-1) f(0) - s^(n-2) f'(0) - ··· - f^(n-1)(0)
∫_0^t f(τ) dτ                                      (1/s) F(s)
t f(t)                                             -F'(s)
f1(t) * f2(t) = ∫_-∞^∞ f1(τ) f2(t - τ) dτ          F1(s) F2(s)
δ(t)                                               1
u_s(t)                                             1/s
u_r(t)                                             1/s^2
t^(n-1)/(n - 1)!                                   1/s^n
e^(-at)                                            1/(s + a)
t e^(-at)                                          1/(s + a)^2
t^(n-1) e^(-at)/(n - 1)!                           1/(s + a)^n
(e^(at) - e^(bt))/(a - b)                          1/((s - a)(s - b))   (a ≠ b)
(a e^(at) - b e^(bt))/(a - b)                      s/((s - a)(s - b))   (a ≠ b)
sin ωt                                             ω/(s^2 + ω^2)
cos ωt                                             s/(s^2 + ω^2)
e^(at) sin ωt                                      ω/((s - a)^2 + ω^2)
e^(at) cos ωt                                      (s - a)/((s - a)^2 + ω^2)

B02 Fourier transforms

Table B.2 lists functions of time f(t) on the left and the corresponding Fourier transforms F(ω) on the right. Where applicable, T is the time-domain period, ω0 = 2π/T is the corresponding angular frequency, j = √(-1), a ∈ R+, and b, t0 ∈ R are constants. Furthermore, fe and fo are even and odd functions of time, respectively, and it can be shown that any function f can be written as the sum f(t) = fe(t) + fo(t) (Hsu1967).

Table B.2: Fourier transform identities

function of time t                                 function of frequency ω

a1 f1(t) + a2 f2(t)                                a1 F1(ω) + a2 F2(ω)
f(at)                                              (1/|a|) F(ω/a)
f(-t)                                              F(-ω)
f(t - t0)                                          F(ω) e^(-jωt0)
f(t) cos ω0t                                       (1/2) F(ω - ω0) + (1/2) F(ω + ω0)
f(t) sin ω0t                                       (1/j2) F(ω - ω0) - (1/j2) F(ω + ω0)
fe(t)                                              Re F(ω)
fo(t)                                              j Im F(ω)
F(t)                                               2π f(-ω)
f'(t)                                              jω F(ω)
d^n f(t)/dt^n                                      (jω)^n F(ω)
∫_-∞^t f(τ) dτ                                     (1/jω) F(ω) + π F(0) δ(ω)
-jt f(t)                                           F'(ω)
(-jt)^n f(t)                                       d^n F(ω)/dω^n
f1(t) * f2(t) = ∫_-∞^∞ f1(τ) f2(t - τ) dτ          F1(ω) F2(ω)
f1(t) f2(t)                                        (1/2π) F1(ω) * F2(ω) = (1/2π) ∫_-∞^∞ F1(α) F2(ω - α) dα
e^(-at) u_s(t)                                     1/(jω + a)
e^(-a|t|)                                          2a/(a^2 + ω^2)
e^(-at^2)                                          √(π/a) e^(-ω^2/(4a))
1 for |t| < a/2, else 0                            a sin(aω/2)/(aω/2)
t e^(-at) u_s(t)                                   1/(a + jω)^2
t^(n-1) e^(-at) u_s(t)/(n - 1)!                    1/(a + jω)^n
1/(a^2 + t^2)                                      (π/a) e^(-a|ω|)
δ(t)                                               1
δ(t - t0)                                          e^(-jωt0)
u_s(t)                                             π δ(ω) + 1/(jω)
u_s(t - t0)                                        π δ(ω) + (1/jω) e^(-jωt0)
1                                                  2π δ(ω)
t                                                  2πj δ'(ω)
t^n                                                2π j^n d^n δ(ω)/dω^n
e^(jω0t)                                           2π δ(ω - ω0)
cos ω0t                                            π δ(ω - ω0) + π δ(ω + ω0)
sin ω0t                                            -jπ δ(ω - ω0) + jπ δ(ω + ω0)
u_s(t) cos ω0t                                     jω/(ω0^2 - ω^2) + (π/2) δ(ω - ω0) + (π/2) δ(ω + ω0)
u_s(t) sin ω0t                                     ω0/(ω0^2 - ω^2) + (π/2j) δ(ω - ω0) - (π/2j) δ(ω + ω0)
t u_s(t)                                           jπ δ'(ω) - 1/ω^2
1/t                                                πj - 2πj u_s(ω)
1/t^n                                              ((-jω)^(n-1)/(n - 1)!) (πj - 2πj u_s(ω))
sgn t                                              2/(jω)
Σ_{n=-∞}^{∞} δ(t - nT)                             ω0 Σ_{n=-∞}^{∞} δ(ω - nω0)
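Transform pairs like e^(-a|t|) ↔ 2a/(a^2 + ω^2) can be spot-checked by evaluating the defining integral numerically: for an even function the transform reduces to 2∫_0^∞ f(t) cos(ωt) dt. A stdlib-only sketch (the function name, truncation point, and step count are my own illustrative choices):

```python
import math

def fourier_even(f, w, T=60.0, n=200000):
    # For even f, F(w) = 2 * integral of f(t) cos(w t) over [0, inf);
    # truncate at T and apply the composite trapezoid rule.
    h = T / n
    s = 0.5 * (f(0.0) + f(T) * math.cos(w * T))
    for i in range(1, n):
        t = i * h
        s += f(t) * math.cos(w * t)
    return 2.0 * s * h

a, w = 1.0, 2.0
approx = fourier_even(lambda t: math.exp(-a * abs(t)), w)
exact = 2 * a / (a * a + w * w)  # table pair: e^(-a|t|) <-> 2a/(a^2 + w^2)
assert abs(approx - exact) < 1e-3
```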

C Mathematics reference

C01 Quadratic forms

The solutions to equations of the form ax^2 + bx + c = 0 are

x = (-b ± √(b^2 - 4ac))/(2a).    (C01.1)

Completing the square

Completing the square rewrites a quadratic expression in the form of the left-hand side (LHS) of the following equality, which describes factorization:

x^2 + 2xh + h^2 = (x + h)^2.    (C01.2)
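The quadratic formula (C01.1) translates directly to a few lines of Python; a minimal sketch (the function name `quadratic_roots` is mine), using `cmath` so the complex case with a negative discriminant works too:

```python
import cmath

def quadratic_roots(a, b, c):
    # roots of a x^2 + b x + c = 0 via the quadratic formula;
    # cmath.sqrt returns complex roots when b^2 - 4ac < 0
    d = cmath.sqrt(b * b - 4 * a * c)
    return (-b + d) / (2 * a), (-b - d) / (2 * a)

r1, r2 = quadratic_roots(1, -3, 2)  # x^2 - 3x + 2 = (x - 1)(x - 2)
```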

C02 Trigonometry

Triangle identities

With reference to the figure below, the law of sines is

sin α / a = sin β / b = sin γ / c    (C02.1)

and the law of cosines is

c^2 = a^2 + b^2 - 2ab cos γ    (C02.2a)
b^2 = a^2 + c^2 - 2ac cos β    (C02.2b)
a^2 = c^2 + b^2 - 2cb cos α    (C02.2c)

(figure: a triangle with sides a, b, c opposite angles α, β, γ, respectively)

Reciprocal identities

csc u = 1/sin u    (C02.3a)
sec u = 1/cos u    (C02.3b)
cot u = 1/tan u    (C02.3c)

Pythagorean identities

1 = sin^2 u + cos^2 u    (C02.4a)
sec^2 u = 1 + tan^2 u    (C02.4b)
csc^2 u = 1 + cot^2 u    (C02.4c)

Co-function identities

sin(π/2 - u) = cos u    (C02.5a)
cos(π/2 - u) = sin u    (C02.5b)
tan(π/2 - u) = cot u    (C02.5c)
csc(π/2 - u) = sec u    (C02.5d)
sec(π/2 - u) = csc u    (C02.5e)
cot(π/2 - u) = tan u    (C02.5f)

Even-odd identities

sin(-u) = -sin u    (C02.6a)
cos(-u) = cos u    (C02.6b)
tan(-u) = -tan u    (C02.6c)

Sum-difference formulas (AM or lock-in)

sin(u ± v) = sin u cos v ± cos u sin v    (C02.7a)
cos(u ± v) = cos u cos v ∓ sin u sin v    (C02.7b)
tan(u ± v) = (tan u ± tan v)/(1 ∓ tan u tan v)    (C02.7c)

Double angle formulas

sin(2u) = 2 sin u cos u    (C02.8a)
cos(2u) = cos^2 u - sin^2 u    (C02.8b)
        = 2 cos^2 u - 1    (C02.8c)
        = 1 - 2 sin^2 u    (C02.8d)
tan(2u) = 2 tan u/(1 - tan^2 u)    (C02.8e)

Power-reducing or half-angle formulas

sin^2 u = (1 - cos 2u)/2    (C02.9a)
cos^2 u = (1 + cos 2u)/2    (C02.9b)
tan^2 u = (1 - cos 2u)/(1 + cos 2u)    (C02.9c)

Sum-to-product formulas

sin u + sin v = 2 sin((u + v)/2) cos((u - v)/2)    (C02.10a)
sin u - sin v = 2 cos((u + v)/2) sin((u - v)/2)    (C02.10b)
cos u + cos v = 2 cos((u + v)/2) cos((u - v)/2)    (C02.10c)
cos u - cos v = -2 sin((u + v)/2) sin((u - v)/2)    (C02.10d)

Product-to-sum formulas

sin u sin v = (1/2)[cos(u - v) - cos(u + v)]    (C02.11a)
cos u cos v = (1/2)[cos(u - v) + cos(u + v)]    (C02.11b)
sin u cos v = (1/2)[sin(u + v) + sin(u - v)]    (C02.11c)
cos u sin v = (1/2)[sin(u + v) - sin(u - v)]    (C02.11d)

Two-to-one formulas

A sin u + B cos u = C sin(u + φ)    (C02.12a)
                  = C cos(u + ψ), where    (C02.12b)
C = √(A^2 + B^2)    (C02.12c)
φ = arctan(B/A)    (C02.12d)
ψ = -arctan(A/B)    (C02.12e)
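A quick numerical spot-check of the two-to-one formula (C02.12a) is straightforward; this sketch (the variable names and test angles are illustrative only, and arctan(B/A) is taken at face value, which is valid for A > 0):

```python
import math

# check A sin u + B cos u = C sin(u + phi) with
# C = sqrt(A^2 + B^2) and phi = arctan(B/A), assuming A > 0
A, B = 3.0, 4.0
C = math.hypot(A, B)       # 5.0
phi = math.atan(B / A)
for u in (0.0, 0.5, 1.3, 2.7):
    lhs = A * math.sin(u) + B * math.cos(u)
    rhs = C * math.sin(u + phi)
    assert abs(lhs - rhs) < 1e-12
```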

C03 Matrix inverses

This is a guide to inverting 1×1, 2×2, and n×n matrices.

Let A be the 1×1 matrix

A = [a].

The inverse is simply the reciprocal:

A^-1 = [1/a].

Let B be the 2×2 matrix

B = [b11  b12]
    [b21  b22].

It can be shown that the inverse follows a simple pattern:

B^-1 = (1/det B) [ b22  -b12]
                 [-b21   b11]

     = (1/(b11 b22 - b12 b21)) [ b22  -b12]
                               [-b21   b11].

Let C be an n×n matrix. It can be shown that its inverse is

C^-1 = (1/det C) adj C,

where adj C is the adjoint (also called the adjugate) of C.
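The 2×2 pattern is easy to encode directly; a minimal sketch (the function name `inv2` and the nested-list matrix representation are mine, not from the text):

```python
def inv2(B):
    # inverse of a 2x2 matrix [[b11, b12], [b21, b22]]:
    # swap the diagonal, negate the off-diagonal, divide by det B
    (b11, b12), (b21, b22) = B
    det = b11 * b22 - b12 * b21
    return [[ b22 / det, -b12 / det],
            [-b21 / det,  b11 / det]]

Binv = inv2([[4.0, 7.0], [2.0, 6.0]])  # det = 10
```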

C04 Laplace transforms

The definitions of the one-sided Laplace and inverse Laplace transforms follow.

Definition C04.1: Laplace transforms (one-sided)

Laplace transform L:

L(y(t)) = Y(s) = ∫_0^∞ y(t) e^(-st) dt    (C04.1)

Inverse Laplace transform L^-1:

L^-1(Y(s)) = y(t) = (1/2πj) ∫_(σ-j∞)^(σ+j∞) Y(s) e^(st) ds    (C04.2)

See Table B.1 for a list of properties and common transforms.
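The defining integral (C04.1) can be evaluated numerically to confirm a table entry such as e^(-at) ↔ 1/(s + a). A stdlib-only sketch (the function name, truncation point T, and step count are my own illustrative choices, not part of the text):

```python
import math

def laplace_num(f, s, T=60.0, n=200000):
    # truncate the defining integral of (C04.1) at T and apply the
    # composite trapezoid rule
    h = T / n
    total = 0.5 * (f(0.0) + f(T) * math.exp(-s * T))
    for i in range(1, n):
        t = i * h
        total += f(t) * math.exp(-s * t)
    return total * h

a, s = 2.0, 3.0
approx = laplace_num(lambda t: math.exp(-a * t), s)
exact = 1.0 / (s + a)  # Table B.1: e^(-at) <-> 1/(s + a)
assert abs(approx - exact) < 1e-3
```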

D Complex analysis

D01 Euler's formulas

Euler's formula is our bridge back and forth between trigonometric forms (cos θ and sin θ) and the complex exponential form (e^(jθ)):

e^(jθ) = cos θ + j sin θ.    (D01.1)

Here are a few useful identities implied by Euler's formula:

e^(-jθ) = cos θ - j sin θ    (D01.2a)
cos θ = Re(e^(jθ))    (D01.2b)
      = (1/2)(e^(jθ) + e^(-jθ))    (D01.2c)
sin θ = Im(e^(jθ))    (D01.2d)
      = (1/j2)(e^(jθ) - e^(-jθ))    (D01.2e)
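These identities are easy to confirm numerically with Python's complex arithmetic; a minimal sketch (the test angle is arbitrary):

```python
import cmath
import math

theta = 0.7  # arbitrary test angle

# e^(j theta) = cos(theta) + j sin(theta)
z = cmath.exp(1j * theta)
assert abs(z - complex(math.cos(theta), math.sin(theta))) < 1e-12

# cos and sin recovered from the complex exponentials
cos_theta = (cmath.exp(1j * theta) + cmath.exp(-1j * theta)) / 2
sin_theta = (cmath.exp(1j * theta) - cmath.exp(-1j * theta)) / 2j
assert abs(cos_theta.real - math.cos(theta)) < 1e-12
assert abs(sin_theta.real - math.sin(theta)) < 1e-12
```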

Bibliography

[1] Robert B. Ash. Basic Probability Theory. Dover Publications, Inc., 2008.

[2] Joan Bagaria. "Set Theory". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2019. Metaphysics Research Lab, Stanford University, 2019.

[3] Maria Baghramian and J. Adam Carter. "Relativism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2019. Metaphysics Research Lab, Stanford University, 2019.

[4] Stephen Barker and Mark Jago. "Being Positive About Negative Facts". In: Philosophy and Phenomenological Research 85.1 (2012), pp. 117-138. doi: 10.1111/j.1933-1592.2010.00479.x.

[5] Jonathan Barzilai and Jonathan M. Borwein. "Two-Point Step Size Gradient Methods". In: IMA Journal of Numerical Analysis 8.1 (Jan. 1988), pp. 141-148. issn: 0272-4979. doi: 10.1093/imanum/8.1.141. This includes an innovative line search method.

[6] Anat Biletzki and Anat Matar. "Ludwig Wittgenstein". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2018. Metaphysics Research Lab, Stanford University, 2018. An introduction to Wittgenstein and his thought.

[7] Antonio Bove, F. (Ferruccio) Colombini, and Daniele Del Santo. Phase space analysis of partial differential equations. Progress in nonlinear differential equations and their applications, v. 69. Boston; Berlin: Birkhäuser, 2006. isbn: 9780817645212.

[8] William L. Brogan. Modern Control Theory. Third. Prentice Hall, 1991.

[9] Francesco Bullo and Andrew D. Lewis. Geometric control of mechanical systems: modeling, analysis, and design for simple mechanical control systems. Ed. by J.E. Marsden, L. Sirovich, and M. Golubitsky. Springer, 2005.

[10] Francesco Bullo and Andrew D. Lewis. Supplementary Chapters for Geometric Control of Mechanical Systems. Jan. 2005.

[11] A. Choukchou-Braham et al. Analysis and Control of Underactuated Mechanical Systems. SpringerLink Bücher. Springer International Publishing, 2013. isbn: 9783319026367.

[12] K. Ciesielski. Set Theory for the Working Mathematician. London Mathematical Society Student Texts. Cambridge University Press, 1997. isbn: 9780521594653. A readable introduction to set theory.

[13] Marian David. "The Correspondence Theory of Truth". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2016. Metaphysics Research Lab, Stanford University, 2016. A detailed overview of the correspondence theory of truth.

[14] David Dolby. "Wittgenstein on Truth". In: A Companion to Wittgenstein. John Wiley & Sons, Ltd, 2016. Chap. 27, pp. 433-442. isbn: 9781118884607. doi: 10.1002/9781118884607.ch27.

[15] Peter J. Eccles. An Introduction to Mathematical Reasoning: Numbers, Sets and Functions. Cambridge University Press, 1997. doi: 10.1017/CBO9780511801136. A gentle introduction to mathematical reasoning. It includes introductory treatments of set theory and number systems.

[16] H.B. Enderton. Elements of Set Theory. Elsevier Science, 1977. isbn: 9780080570426. A gentle introduction to set theory and mathematical reasoning; a great place to start.

[17] Richard P. Feynman, Robert B. Leighton, and Matthew Sands. The Feynman Lectures on Physics. New Millennium. Perseus Basic Books, 2010.

[18] Michael Glanzberg. "Truth". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018.

[19] Hans Johann Glock. "Truth in the Tractatus". In: Synthese 148.2 (Jan. 2006), pp. 345-368. issn: 1573-0964. doi: 10.1007/s11229-004-6226-2.

[20] Mario Gómez-Torrente. "Alfred Tarski". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019.

[21] Paul Guyer and Rolf-Peter Horstmann. "Idealism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2018. Metaphysics Research Lab, Stanford University, 2018.

[22] R. Haberman. Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version). Pearson Modern Classics for Advanced Mathematics. Pearson Education Canada, 2018. isbn: 9780134995434.

[23] G.W.F. Hegel and A.V. Miller. Phenomenology of Spirit. Motilal Banarsidass, 1998. isbn: 9788120814738.

[24] Wilfrid Hodges. "Model Theory". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018.

[25] Wilfrid Hodges. "Tarski's Truth Definitions". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018.

[26] Peter Hylton and Gary Kemp. "Willard Van Orman Quine". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019.

[27] E.T. Jaynes et al. Probability Theory: The Logic of Science. Cambridge University Press, 2003. isbn: 9780521592710. An excellent and comprehensive introduction to probability theory.

[28] I. Kant, P. Guyer, and A.W. Wood. Critique of Pure Reason. The Cambridge Edition of the Works of Immanuel Kant. Cambridge University Press, 1999. isbn: 9781107268333.

[29] Juliette Kennedy. "Kurt Gödel". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2018. Metaphysics Research Lab, Stanford University, 2018.

[30] Drew Khlentzos. "Challenges to Metaphysical Realism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016.

[31] Peter Klein. "Skepticism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2015. Metaphysics Research Lab, Stanford University, 2015.

[32] M. Kline. Mathematics: The Loss of Certainty. A Galaxy book. Oxford University Press, 1982. isbn: 9780195030853. A detailed account of the "illogical" development of mathematics and an exposition of its therefore remarkable utility in describing the world.

[33] W. Richard Kolk and Robert A. Lerman. Nonlinear System Dynamics. 1st ed. Springer US, 1993. isbn: 978-1-4684-6496-2.

[34] Erwin Kreyszig. Advanced Engineering Mathematics. 10th. John Wiley & Sons, Limited, 2011. isbn: 9781119571094. The authoritative resource for engineering mathematics. It includes detailed accounts of probability, statistics, vector calculus, linear algebra, Fourier analysis, ordinary and partial differential equations, and complex analysis. It also includes several other topics, with varying degrees of depth. Overall, it is the best place to start when seeking mathematical guidance.

[35] John M. Lee. Introduction to Smooth Manifolds. Second. Vol. 218. Graduate Texts in Mathematics. Springer, 2012.

[36] Catherine Legg and Christopher Hookway. "Pragmatism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019. An introductory article on the philosophical movement "pragmatism". It includes an important clarification of the pragmatic slogan "truth is the end of inquiry".

[37] Daniel Liberzon. Calculus of Variations and Optimal Control Theory: A Concise Introduction. Princeton University Press, 2012. isbn: 9780691151878.

[38] Panu Raatikainen. "Gödel's Incompleteness Theorems". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018. A thorough and contemporary description of Gödel's incompleteness theorems, which have significant implications for the foundations and function of mathematics and mathematical truth.

[39] Paul Redding. "Georg Wilhelm Friedrich Hegel". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2018. Metaphysics Research Lab, Stanford University, 2018.

[40] H.M. Schey. Div, Grad, Curl, and All that: An Informal Text on Vector Calculus. W.W. Norton, 2005. isbn: 9780393925166.

[41] Christopher Shields. "Aristotle". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016.

[42] Steven S. Skiena. Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win. Outlooks. Cambridge University Press, 2001. doi: 10.1017/CBO9780511547089. This includes a lucid section on probability versus statistics, also available here: https://www3.cs.stonybrook.edu/~skiena/jaialai/excerpts/node12.html.

[43] George Smith. "Isaac Newton". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2008. Metaphysics Research Lab, Stanford University, 2008.

[44] Daniel Stoljar and Nic Damnjanovic. "The Deflationary Theory of Truth". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2014. Metaphysics Research Lab, Stanford University, 2014.

[45] W.A. Strauss. Partial Differential Equations: An Introduction. Wiley, 2007. isbn: 9780470054567. A thorough and yet relatively compact introduction.

[46] S.H. Strogatz and M. Dichter. Nonlinear Dynamics and Chaos. Second. Studies in Nonlinearity. Avalon Publishing, 2016. isbn: 9780813350844.

[47] Mark Textor. "States of Affairs". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016.

[48] Pauli Virtanen et al. "SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python". In: arXiv e-prints (July 2019). arXiv: 1907.10121.

[49] Wikipedia. Algebra — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Algebra&oldid=920573802. [Online; accessed 26-October-2019]. 2019.

[50] Wikipedia. Carl Friedrich Gauss — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Carl%20Friedrich%20Gauss&oldid=922692291. [Online; accessed 26-October-2019]. 2019.

[51] Wikipedia. Euclid — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Euclid&oldid=923031048. [Online; accessed 26-October-2019]. 2019.

[52] Wikipedia. First-order logic — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=First-order%20logic&oldid=921437906. [Online; accessed 29-October-2019]. 2019.

[53] Wikipedia. Fundamental interaction — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Fundamental%20interaction&oldid=925884124. [Online; accessed 16-November-2019]. 2019.

[54] Wikipedia. Leonhard Euler — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Leonhard%20Euler&oldid=921824700. [Online; accessed 26-October-2019]. 2019.

[55] Wikipedia. Linguistic turn — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Linguistic%20turn&oldid=922305269. [Online; accessed 23-October-2019]. 2019. Hey, we all do it.

[56] Wikipedia. Probability space — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Probability%20space&oldid=914939789. [Online; accessed 31-October-2019]. 2019.

[57] Wikipedia. Propositional calculus — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Propositional%20calculus&oldid=914757384. [Online; accessed 29-October-2019]. 2019.

[58] Wikipedia. Quaternion — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Quaternion&oldid=920710557. [Online; accessed 26-October-2019]. 2019.

[59] Wikipedia. Set-builder notation — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Set-builder%20notation&oldid=917328223. [Online; accessed 29-October-2019]. 2019.

[60] Wikipedia. William Rowan Hamilton — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=William%20Rowan%20Hamilton&oldid=923163451. [Online; accessed 26-October-2019]. 2019.

[61] L. Wittgenstein, P.M.S. Hacker, and J. Schulte. Philosophical Investigations. Wiley, 2010. isbn: 9781444307979.

[62] Ludwig Wittgenstein. Tractatus Logico-Philosophicus. Ed. by C. K. Ogden. Project Gutenberg. International Library of Psychology, Philosophy and Scientific Method. Kegan Paul, Trench, Trubner & Co., Ltd., 1922. A brilliant work on what is possible to express in language, and what is not. As Wittgenstein puts it: "What can be said at all can be said clearly; and whereof one cannot speak thereof one must be silent."

[63] Slavoj Žižek. Less Than Nothing: Hegel and the Shadow of Dialectical Materialism. Verso, 2012. isbn: 9781844678976. This is one of the most interesting presentations of Hegel and Lacan by one of the most exciting contemporary philosophers.

                                                                                                                                              • Fourier and Laplace tables
                                                                                                                                                • Laplace transforms
                                                                                                                                                • Fourier transforms
                                                                                                                                                  • Mathematics reference
                                                                                                                                                    • Quadratic forms
                                                                                                                                                      • Completing the square
                                                                                                                                                        • Trigonometry
                                                                                                                                                          • Triangle identities
                                                                                                                                                          • Reciprocal identities
                                                                                                                                                          • Pythagorean identities
                                                                                                                                                          • Co-function identities
                                                                                                                                                          • Even-odd identities
                                                                                                                                                          • Sum-difference formulas (AM or lock-in)
                                                                                                                                                          • Double angle formulas
                                                                                                                                                          • Power-reducing or half-angle formulas
                                                                                                                                                          • Sum-to-product formulas
                                                                                                                                                          • Product-to-sum formulas
                                                                                                                                                          • Two-to-one formulas
                                                                                                                                                            • Matrix inverses
                                                                                                                                                            • Laplace transforms
                                                                                                                                                              • Complex analysis
                                                                                                                                                                • Eulers formulas
                                                                                                                                                                  • Bibliography
Page 6: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview

Exe. 08.1 cummerbund
Exe. 08.2 melty

09 Nonlinear analysis
    Nonlinear state-space models
        Autonomous and nonautonomous systems
        Equilibrium
    Nonlinear system characteristics
        Those in-common with linear systems
        Stability
        Qualities of equilibria
    Nonlinear system simulation
    Simulating nonlinear systems in Python
    Exercises for Chapter 09

A Distribution tables
    A01 Gaussian distribution table
    A02 Student's t-distribution table

B Fourier and Laplace tables
    B01 Laplace transforms
    B02 Fourier transforms

C Mathematics reference
    C01 Quadratic forms
        Completing the square
    C02 Trigonometry
        Triangle identities
        Reciprocal identities
        Pythagorean identities
        Co-function identities
        Even-odd identities
        Sum-difference formulas (AM or lock-in)
        Double angle formulas
        Power-reducing or half-angle formulas
        Sum-to-product formulas
        Product-to-sum formulas
        Two-to-one formulas
    C03 Matrix inverses
    C04 Laplace transforms

D Complex analysis
    D01 Euler's formulas

Bibliography

01 Mathematics itself

Truth

[Footnote 1: For much of this lecture I rely on the thorough overview of Glanzberg (Michael Glanzberg. "Truth". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018).]

1. Before we can discuss mathematical truth, we should begin with a discussion of truth itself.[1] It is important to note that this is obviously extremely incomplete; my aim is to give a sense of the subject via brutal (mis)abbreviation.

2. Of course, the study of truth cannot but be entangled with the study of the world as such (metaphysics) and of knowledge (epistemology). Some of the following theories presuppose or imply a certain metaphysical and/or epistemological theory, but which these are is controversial.

Neo-classical theories of truth

3. The neo-classical theories of truth take for granted that there is truth and attempt to explain what its precise nature is (Glanzberg, "Truth"). What are provided here are modern understandings of theories developed primarily in the early 20th century.

The correspondence theory

4. A version of what is called the correspondence theory of truth is the following:

    A proposition is true iff there is an existing entity in the world that corresponds with it.

Such existing entities are called facts. Facts are relational in that their parts (e.g. subject, predicate, etc.) are related in a certain way.

5. Under this theory, then, if a proposition does not correspond to a fact, it is false.

6. This theory of truth is rather intuitive and consistently popular (Marian David. "The Correspondence Theory of Truth". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2016. Metaphysics Research Lab, Stanford University, 2016. A detailed overview of the correspondence theory of truth).

[Footnote 2: This is typically put in terms of "beliefs" or "judgments", but for brevity and parallelism I have cast it in terms of propositions. It is to this theory I have probably committed the most violence.]

The coherence theory

7. The coherence theory of truth is adamant that the truth of any given proposition is only as good as its holistic system of propositions.[2] This includes (but typically goes beyond) a requirement for consistency of a given proposition with the whole and the self-consistency of the whole itself, sometimes called coherence.

8. For parallelism, let's attempt a succinct formulation of this theory, cast in terms of propositions:

    A proposition is true iff it has coherence with a system of propositions.

9. Note that this has no reference to facts whatsoever. However, it need not necessarily preclude them.

[Footnote 3: Pragmatism was an American philosophical movement of the early 20th century that valued the success of "practical" application of theories. For an introduction, see Legg and Hookway (Catherine Legg and Christopher Hookway. "Pragmatism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019. An introductory article on the philosophical movement "pragmatism". It includes an important clarification of the pragmatic slogan "truth is the end of inquiry").]

[Footnote 4: This is especially congruent with the work of William James (ibid.).]

The pragmatic theory

10. Of the neo-classical theories of truth, this is probably the least agreed upon as having a single, clear statement (Glanzberg, "Truth"). However, as with pragmatism in general,[3] the pragmatic truth is oriented practically.

11. Perhaps the most important aspect of this theory is that it is thoroughly a correspondence theory, agreeing that true propositions are those that correspond to the world. However, there is a different focus here that differentiates it from correspondence theory proper: it values as more true that which has some sort of practical use in human life.

12. We'll try to summarize pragmatism in two slogans with slightly different emphases; here's the first, again cast in propositional parallel:

    A proposition is true iff it works.[4]

Now, there are two ways this can be understood: (a) the proposition "works" in that it empirically corresponds to the world, or (b) the proposition "works" in that it has an effect that some agent intends. The former is pretty standard correspondence theory. The latter is new and fairly obviously has ethical implications, especially today.

[Footnote 5: This is especially congruent with the work of Charles Sanders Peirce (Legg and Hookway, "Pragmatism").]

13. Let us turn to a second formulation:

    A proposition is true if it corresponds with a process of inquiry.[5]

This has two interesting facets: (a) an agent's active inquiry creates truth, and (b) it is a sort of correspondence theory that requires a correspondence of a proposition with a process of inquiry, not, as in the correspondence theory, with a fact about the world. The latter has shades of both correspondence theory and coherence theory.

The picture theory

Before we delve into this theory, we must take a moment to clarify some terminology.

States of affairs and facts

14. When discussing the correspondence theory, we have used the term fact to mean an actual state of things in the world. A problem arises in the correspondence theory here. It says that a proposition is true iff there is a fact that corresponds with it. What of a negative proposition like "there are no cows in Antarctica"? We would seem to need a corresponding "negative fact" in the world to make this true. If a fact is taken to be composed of a complex of actual objects and relations, it is hard to imagine such facts.[6]

[Footnote 6: But Barker and Jago (Stephen Barker and Mark Jago. "Being Positive About Negative Facts". In: Philosophy and Phenomenological Research 85.1 [2012], pp. 117-138. DOI 10.1111/j.1933-1592.2010.00479.x) have attempted just that.]

[Footnote 7: See Wittgenstein (Ludwig Wittgenstein. Tractatus Logico-Philosophicus. Ed. by C. K. Ogden. Project Gutenberg. International Library of Psychology, Philosophy and Scientific Method. Kegan Paul, Trench, Trubner & Co., Ltd., 1922. A brilliant work on what is possible to express in language, and what is not. As Wittgenstein puts it, "What can be said at all can be said clearly; and whereof one cannot speak thereof one must be silent"), Biletzki and Matar (Anat Biletzki and Anat Matar. "Ludwig Wittgenstein". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2018. Metaphysics Research Lab, Stanford University, 2018. An introduction to Wittgenstein and his thought), Glock ("Truth in the Tractatus". In: Synthese 148.2 [Jan. 2006]), and Dolby (David Dolby. "Wittgenstein on Truth". In: A Companion to Wittgenstein. John Wiley & Sons, Ltd, 2016. Chap. 27, pp. 433-442. ISBN 9781118884607. DOI 10.1002/9781118884607.ch27).]

15. Furthermore, if a proposition is true, it seems that it is the corresponding fact that makes it so; what, then, makes a proposition false, since there is no fact to support the falsity? (Mark Textor. "States of Affairs". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016)

16. And what of nonsense? There are some propositions, like "there is a round cube", that are neither true nor false. However, the preceding correspondence theory cannot differentiate between false and nonsensical propositions.

17. A state of affairs is something possible that may or may not be actual (ibid.). If a state of affairs is actual, it is said to obtain. The picture theory will make central this concept, instead of that of the fact.

The picture theory of meaning (and truth)

18. The picture theory of meaning uses the analogy of the model or picture to explain the meaningfulness of propositions:[7]

    A proposition names possible objects and arranges these names to correspond to a state of affairs.

[Figure 01.1: a representation of the picture theory.]

See Figure 01.1. This also allows for an easy account of truth, falsity, and nonsense:

nonsense: A sentence that appears to be a proposition is actually not if the arrangement of named objects is impossible. Such a sentence is simply nonsense.

truth: A proposition is true if the state of affairs it depicts obtains.

falsity: A proposition is false if the state of affairs it depicts does not obtain.

19. Now, some (Hans-Johann Glock. "Truth in the Tractatus". In: Synthese 148.2 [Jan. 2006], pp. 345-368. ISSN 1573-0964. DOI 10.1007/s11229-004-6226-2) argue this is a correspondence theory, and others (David Dolby. "Wittgenstein on Truth". In: A Companion to Wittgenstein. John Wiley & Sons, Ltd, 2016. Chap. 27, pp. 433-442. ISBN 9781118884607. DOI 10.1002/9781118884607.ch27) that it is not. In any case, it certainly solves some issues that have plagued the correspondence theory.

"What cannot be said must be shown"

20. Something the picture theory does is declare a limit on what can meaningfully be said. A proposition (as defined above) must be potentially true or false. Therefore, something that cannot be false (something necessarily true) cannot be a proposition (ibid.). And there are certain things that are necessarily true for language itself to be meaningful, paradigmatically the logical structure of the world. What a proposition does, then, is show, via its own logical structure, the necessary (for there to be meaningful propositions at all) logical structure of the world.[8]

[Footnote 8: See also Žižek (Slavoj Žižek. Less Than Nothing: Hegel and the Shadow of Dialectical Materialism. Verso, 2012. ISBN 9781844678976. This is one of the most interesting presentations of Hegel and Lacan by one of the most exciting contemporary philosophers, pp. 25-6), from whom I stole the section title.]

[Footnote 9: This was one of the contributions to the "linguistic turn" (Wikipedia. "Linguistic turn – Wikipedia, The Free Encyclopedia". http://en.wikipedia.org/w/index.php?title=Linguistic%20turn&oldid=922305269 [Online; accessed 23-October-2019]. 2019. Hey, we all do it) of philosophy in the early 20th century.]

21. An interesting feature of this perspective is that it opens up language itself to analysis and limitation.[9] And furthermore, it suggests that the set of what is is smaller than the set of what can be meaningfully spoken about.

The relativity of truth

22. Each subject (i.e. agent) in the world, with their propositions, has a perspective: a given moment, a given place, an historical-cultural-linguistic situation. At the very least, the truth of propositions must account for this. Just how a theory of truth should do so is a matter of significant debate (Maria Baghramian and J. Adam Carter. "Relativism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2019. Metaphysics Research Lab, Stanford University, 2019).

23. Some go so far as to be skeptical about truth (Peter Klein. "Skepticism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2015. Metaphysics Research Lab, Stanford University, 2015), regarding it to be entirely impossible. Others say that, while a proposition may or may not be true, we could never come to know this.

24. Often underlying this conversation is the question of there being a common world in which we all participate and, if so, whether or not we can properly represent this world in language such that multiple subjects could come to justifiably agree or disagree on the truth of a proposition. If every proposition is so relative that it is relevant to only the proposer, truth would seem of little value. On the other hand, if truth is understood to be "objective" (independent of subjective perspective), a number of objections can be made (Baghramian and Carter, "Relativism"), such as that there is no non-subjective perspective from which to judge truth.

[Footnote 10: Especially notable here is the work of Alfred Tarski in the mid-20th century.]

Other ideas about truth

25. There are too many credible ideas about truth to attempt a reasonable summary; however, I will attempt to highlight a few important ones.

Formal methods

26. A set of tools was developed for exploring theories of truth, especially correspondence theories.[10] Focus turned from beliefs to sentences, which are akin to propositions. (Recall that the above theories have already been recast in the more modern language of propositions.) Another aspect of these sentences under consideration is that they begin to be taken as interpreted sentences: they already have meaning.

27. Beyond this, several technical apparatus are introduced that formalize criteria for truth. For instance, a sentence is given a sign φ. A need arises to distinguish between the quotation of sentence φ and the unquoted sentence φ; the quotation is given the quasi-quotation notation ⌜φ⌝. For instance, let φ stand for snow is white; then φ → snow is white and ⌜φ⌝ → 'snow is white'. Tarski introduces Convention T, which states that, for a fixed language L with fully interpreted sentences (Glanzberg, "Truth"):

    An adequate theory of truth for L must imply, for each sentence φ of L: ⌜φ⌝ is true if and only if φ.

Using the same example, then:

    'snow is white' is true if and only if snow is white.

Convention T states a general rule for the adequacy of a theory of truth and is used in several contemporary theories.

[Footnote 11: Wilfrid Hodges. "Tarski's Truth Definitions". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018; Mario Gómez-Torrente. "Alfred Tarski". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019; Peter Hylton and Gary Kemp. "Willard Van Orman Quine". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019.]

28. We can see that these formal methods get quite technical and fun. For more, see Hodges, Gómez-Torrente, and Hylton and Kemp.[11]
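The disquotational pattern of Convention T can be illustrated with a small programmatic toy (an illustration only, not part of the original text; the names world, holds, and is_true are invented for this sketch): sentence quotations are strings, an interpretation supplies the facts, and a truth predicate is adequate when it agrees with the interpretation for every sentence of the tiny "language".

```python
# Toy illustration of Convention T for a tiny two-sentence "language" L.
# The "world": which states of affairs obtain.
world = {"snow is white": True, "grass is red": False}

def holds(sentence: str) -> bool:
    """Evaluate the (interpreted) sentence against the world."""
    return world[sentence]

def is_true(quoted_sentence: str) -> bool:
    """A truth predicate for L; Convention T demands that asserting
    'phi' is true come to the same thing as asserting phi."""
    return holds(quoted_sentence)

# Check the T-schema instance for every sentence of L:
for phi in world:
    assert is_true(phi) == holds(phi)  # ⌜φ⌝ is true iff φ
```

The truth predicate here is trivially disquotational, which is exactly the point of the redundancy-flavored reading discussed below: 'is true' adds nothing beyond the sentence itself.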

Deflationary theories

29. Deflationary theories of truth try to minimize, or eliminate altogether, the concept of or use of the term 'truth'. For instance, the redundancy theory claims that (Glanzberg, "Truth"):

    To assert that ⌜φ⌝ is true is just to assert that φ.

Therefore, we can eliminate the use of 'is true'.

[Footnote 12: Daniel Stoljar and Nic Damnjanovic. "The Deflationary Theory of Truth". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2014. Metaphysics Research Lab, Stanford University, 2014.]

30. For more of less, see Stoljar and Damnjanovic.[12]

Language

31. It is important to recognize that language mediates truth; that is, truth is embedded in language. The way language in general affects theories of truth has been studied extensively. For instance, whether the truth-bearer is a belief or a proposition or a sentence, or something else, has been much discussed. The importance of the meaning of truth-bearers like sentences has played another large role. Theories of meaning, like the picture theory presented above, are often closely intertwined with theories of truth.

32. One of the most popular theories of meaning is called the theory of use:

    For a large class of cases of the employment of the word "meaning" – though not for all – this word can be explained in this way: the meaning of a word is its use in the language. (L. Wittgenstein, P. M. S. Hacker, and J. Schulte. Philosophical Investigations. Wiley, 2010. ISBN 9781444307979)

This theory is accompanied by the concept of language-games, which are loosely defined as rule-based contexts within which sentences have uses. The idea is that the meaning of a given sentence is its use in a network of meaning that is constantly evolving. This view tends to be understood as deflationary or relativistic about truth.

[Footnote 13: Drew Khlentzos. "Challenges to Metaphysical Realism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016.]

Metaphysical and epistemological considerations

33. We began with the recognition that truth is intertwined with metaphysics and epistemology. Let's consider a few such topics.

34. The first is metaphysical realism, which claims that there is a world existing objectively, independently of how we think about or describe it. This "realism" tends to be closely tied to, yet distinct from, scientific realism, which goes further, claiming the world is "actually" as science describes, independently of the scientific descriptions (e.g. there are actual objects corresponding to the phenomena we call atoms, molecules, light particles, etc.).

35. There have been many challenges to the realist claim (for some recent versions, see Khlentzos[13]), put forth by what is broadly called anti-realism. These vary, but often challenge the ability of realists to properly link language to supposed objects in the world.

[Footnote 14: These definitions are explicated by Guyer and Horstmann (Paul Guyer and Rolf-Peter Horstmann. "Idealism". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2018. Metaphysics Research Lab, Stanford University, 2018).]

36. Metaphysical idealism has been characterized as claiming that "mind" or "subjectivity" generates or completely composes the world, which has no being outside mind. Epistemological idealism, on the other hand, while perhaps conceding that there is a world independent of mind, claims all knowledge of the world is created through mind and for mind, and therefore can never escape a sort of mind-world gap.[14] This epistemological idealism has been highly influential since the work of Immanuel Kant (I. Kant, P. Guyer, and A. W. Wood. Critique of Pure Reason. The Cambridge Edition of the Works of Immanuel Kant. Cambridge University Press, 1999. ISBN 9781107268333) in the late 18th century, which ushered in the idea of the noumenal world, the world in-itself, and the phenomenal world, which is how the noumenal world presents to us. Many have held that phenomena can be known through inquiry, whereas noumena are inaccessible. Furthermore, what can be known is restricted by the categories pre-existent in the knower.

37. Another approach, taken by Georg Wilhelm Friedrich Hegel (Paul Redding. "Georg Wilhelm Friedrich Hegel". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Summer 2018. Metaphysics Research Lab, Stanford University, 2018) and other German idealists following Kant, is to reframe reality as thoroughly integrating subjectivity (G. W. F. Hegel and A. V. Miller. Phenomenology of Spirit. Motilal Banarsidass, 1998. ISBN 9788120814738; Slavoj Žižek. Less Than Nothing: Hegel and the Shadow of Dialectical Materialism. Verso, 2012. ISBN 9781844678976. This is one of the most interesting presentations of Hegel and Lacan by one of the most exciting contemporary philosophers); that is, "everything turns on grasping and expressing the True, not only as Substance, but equally as Subject". A subject's proposition is true inasmuch as it corresponds with its Notion (approximately, the idea or meaning for the subject). Some hold that this idealism is compatible with a sort of metaphysical realism, at least as far as understanding is not independent of, but rather beholden to, reality (Žižek, Less Than Nothing, p. 906 ff.).

38. Clearly, all these ideas have many implications for theories of truth, and vice versa.

Where this leaves us

39. The truth is hard. What may at first appear to be a simple concept becomes complex upon analysis. It is important to recognize that we have only sampled some highlights of the theories of truth. I recommend further study of this fascinating topic.

40. Despite the difficulties of finding definitive grounds for understanding truth, we are faced with the task of provisionally forging ahead. Much of what follows in the study of mathematics makes certain implicit and explicit assumptions about truth. However, we have found that the foundations of these assumptions may themselves be problematic. It is my contention that, despite the lack of clear foundations, it is still worth studying engineering analysis, its mathematical foundations, and the foundations of truth itself. My justification for this claim is that I find the utility and the beauty of this study highly rewarding.

The foundations of mathematics

1. Mathematics has long been considered exemplary for establishing truth. Primarily, it uses a method that begins with axioms (unproven propositions that include undefined terms) and uses logical deduction to prove other propositions (theorems), showing that they are necessarily true if the axioms are.

[Footnote 15: Throughout this section, for the history of mathematics, I rely heavily on Kline (M. Kline. Mathematics: The Loss of Certainty. A Galaxy book. Oxford University Press, 1982. ISBN 9780195030853. A detailed account of the "illogical" development of mathematics and an exposition of its therefore remarkable utility in describing the world).]
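The axiomatic method described above can be sketched in a modern proof assistant (an illustrative toy, not from the original text; the undefined terms Point, Line, and lies_on and the single axiom are invented for the sketch). Undefined terms are posited, an axiom is assumed without proof, and a theorem's truth is deduced relative to that axiom:

```lean
-- Undefined terms, posited without definition:
axiom Point : Type
axiom Line : Type
axiom lies_on : Point → Line → Prop

-- An unproven proposition (axiom) about the undefined terms:
axiom line_through : ∀ p q : Point, ∃ l : Line, lies_on p l ∧ lies_on q l

-- A theorem: deduced, and so necessarily true *if the axiom is*.
theorem point_on_some_line (p : Point) : ∃ l : Line, lies_on p l :=
  match line_through p p with
  | ⟨l, hp, _⟩ => ⟨l, hp⟩
```

Nothing here says what points or lines "really are"; the theorem's truth is entirely relative to the assumed axiom, which is exactly the relativity the next paragraph argues history tended to forget.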

2. It may seem obvious that truth established in this way would always be relative to the truth of the axioms, but throughout history this footnote was often obscured by the "obvious" or "intuitive" universal truth of the axioms.[15] For instance, Euclid (Wikipedia. "Euclid – Wikipedia, The Free Encyclopedia". http://en.wikipedia.org/w/index.php?title=Euclid&oldid=923031048 [Online; accessed 26-October-2019]. 2019) founded geometry, the study of mathematical objects traditionally considered to represent physical space, like points, lines, etc., on axioms thought so solid that it was not until the early 19th century that Carl Friedrich Gauss (Wikipedia. "Carl Friedrich Gauss – Wikipedia, The Free Encyclopedia". http://en.wikipedia.org/w/index.php?title=Carl%20Friedrich%20Gauss&oldid=922692291 [Online; accessed 26-October-2019]. 2019) and others recognized this was only one among many possible geometries (Kline, Mathematics: The Loss of Certainty), resting on different axioms. Furthermore, Aristotle (Christopher Shields. "Aristotle". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2016. Metaphysics Research Lab, Stanford University, 2016) had acknowledged that reasoning must begin with undefined terms; however, even Euclid (presumably aware of Aristotle's work) seemed to forget this and provided definitions, obscuring the foundations of his work and starting mathematics on a path that, for over 2000 years, would forget its own relativity (Kline, Mathematics: The Loss of Certainty, pp. 101-2).

3. The foundations of Euclid were even shakier than its murky starting point: several unstated axioms were used in proofs, and some proofs were otherwise erroneous. However, for two millennia, mathematics was seen as the field wherein truth could be established beyond doubt.

Algebra ex nihilo

4. Although not much new geometry appeared during this period, the field of algebra (Wikipedia. Algebra — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Algebra&oldid=920573802. [Online; accessed 26-October-2019]. 2019)—the study of manipulations of symbols standing for numbers in general—began with no axiomatic foundation whatsoever. The Greeks had a notion of rational numbers, ratios of natural numbers (positive integers), and it was known that many solutions to algebraic equations were irrational (could not be expressed as a ratio of integers). But these irrational numbers, like virtually everything else in algebra, were


negative numbers

imaginary numbers

complex numbers

optics

astronomy

calculus

Newtonian mechanics

non-Euclidean geometry

gradually accepted because they were so useful in solving practical problems (they could be approximated by rational numbers, and this seemed good enough). The rules of basic arithmetic were accepted as applying to these and to other forms of new numbers that arose in algebraic solutions: negative, imaginary, and complex numbers.

The application of mathematics to science

5. During this time, mathematics was being applied to optics and astronomy. Sir Isaac Newton then built calculus upon algebra, applying it to what is now known as Newtonian mechanics, which was really more the product of Leonhard Euler (George Smith. "Isaac Newton". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2008. Metaphysics Research Lab, Stanford University, 2008; Wikipedia. Leonhard Euler — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Leonhard%20Euler&oldid=921824700. [Online; accessed 26-October-2019]. 2019). Calculus introduced its own dubious operations, but the success of mechanics in describing and predicting physical phenomena was astounding. Mathematics was hailed as the language of God (later, Nature).

The rigorization of mathematics

6. It was not until Gauss created non-Euclidean geometry, in which Euclid's axioms were shown to be only one of many possible sets of axioms compatible with the world, and William Rowan Hamilton (Wikipedia. William Rowan Hamilton — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=William%20Rowan%20Hamilton&oldid=923163451. [Online; accessed 26-October-


quaternions

rigorization

symbolic logic

mathematical model

16. The branch of mathematics called model theory concerns itself with general types of models that can be made from a given formal system, like an axiomatic mathematical system. For more on model theory, see Hodges (Wilfrid Hodges. "Model Theory". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018). It is noteworthy that the engineering/science use of the term "mathematical model" is only loosely a "model" in the sense of model theory.

implicit definition

2019]. 2019) created quaternions (Wikipedia. Quaternion — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Quaternion&oldid=920710557. [Online; accessed 26-October-2019]. 2019), a number system in which multiplication is noncommutative, that it became apparent something was fundamentally wrong with the way truth in mathematics had been understood. This started a period of rigorization in mathematics that set about axiomatizing and proving 19th-century mathematics. This included the development of symbolic logic, which aided in the process of deductive reasoning.

7. An aspect of this rigorization is that mathematicians came to terms with axioms that include undefined terms. For instance, a "point" might be such an undefined term in an axiom. A mathematical model is what we create when we attach these undefined terms to objects, which can be anything consistent with the axioms.16 The system that results from proving theorems would then apply to anything "properly" described by the axioms. So two masses might be assigned "points" in a Euclidean geometric space, from which we could be confident that, for instance, the "distance" between these masses is the Euclidean norm of the line drawn between the points. It could be said, then, that a "point" in Euclidean geometry is implicitly defined by its axioms and theorems and nothing else. That is, mathematical objects are not inherently tied to the physical objects to which we tend to apply them. Euclidean geometry is not the study of physical space, as it was long considered—it is the study of the objects implicitly defined by its axioms and theorems.

The foundations of mathematics are built


consistent

theorem

Set theory

foundation

sets

ZFC set theory

8. The building of the modern foundations of mathematics began with clear axioms, solid reasoning (with symbolic logic), and lofty yet seemingly attainable goals: prove theorems to support the already ubiquitous mathematical techniques in geometry, algebra, and calculus from axioms; furthermore, prove that these axioms (and the things they imply) do not contradict each other, i.e. are consistent, and that the axioms are not results of each other (one that can be derived from others is a theorem, not an axiom).

9. Set theory is a type of formal axiomatic system with which all modern mathematics is expressed, so set theory is often called the foundation of mathematics (Joan Bagaria. "Set Theory". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2019. Metaphysics Research Lab, Stanford University, 2019). We will study the basics in Chapter 02. The primary objects in set theory are sets: informally, collections of mathematical objects. There is not a single set of axioms that is used as the foundation of all mathematics, for reasons we will review in a moment. However, the most popular set theory is Zermelo-Fraenkel set theory with the axiom of choice (ZFC). The axioms of ZF (sans C) are as follows (ibid.):


extensionality: If two sets A and B have the same elements, then they are equal.

empty set: There exists a set, denoted by ∅ and called the empty set, which has no elements.

pair: Given any sets A and B, there exists a set,


denoted by {A, B}, which contains A and B as its only elements. In particular, there exists the set {A}, which has A as its only element.

power set: For every set A, there exists a set, denoted by P(A) and called the power set of A, whose elements are all the subsets of A.

union: For every set A, there exists a set, denoted by ⋃A and called the union of A, whose elements are all the elements of the elements of A.

infinity: There exists an infinite set. In particular, there exists a set Z that contains ∅ and such that if A ∈ Z, then A ∪ {A} ∈ Z.

separation: For every set A and every given property, there is a set containing exactly the elements of A that have that property. A property is given by a formula ϕ of the first-order language of set theory. Thus, separation is not a single axiom but an axiom schema, that is, an infinite list of axioms, one for each formula ϕ.

replacement: For every given definable function with domain a set A, there is a set whose elements are all the values of the function.

10. ZFC also has the axiom of choice (Bagaria, "Set Theory"):

choice: For every set A of pairwise-disjoint, non-empty sets, there exists a set that contains exactly one element from each set in A.


first incompleteness theorem

incomplete

17. Panu Raatikainen. "Gödel's Incompleteness Theorems". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018. A thorough and contemporary description of Gödel's incompleteness theorems, which have significant implications for the foundations and function of mathematics and mathematical truth.

undecidable

second incompleteness theorem

The foundations have cracks

11. The foundationalists' goal was to prove that some set of axioms, from which all of mathematics can be derived, is both consistent (contains no contradictions) and complete (every true statement is provable). The work of Kurt Gödel (Juliette Kennedy. "Kurt Gödel". In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Winter 2018. Metaphysics Research Lab, Stanford University, 2018) in the mid 20th century shattered this dream by proving, in his first incompleteness theorem, that any consistent formal system within which one can do some amount of basic arithmetic is incomplete. His argument is worth reviewing (see Raatikainen17), but at its heart is an undecidable statement like "This sentence is unprovable." Let U stand for this statement. If it is true, it is unprovable. If it is provable, it is false. Therefore, it is true iff it is unprovable. Then he shows that if a statement A that essentially says "arithmetic is consistent" is provable, then so is the undecidable statement U. But if the system is to be consistent, U cannot be provable, and therefore neither can A be provable.

12. This is problematic. It tells us virtually any conceivable axiomatic foundation of mathematics is incomplete. If one is complete, it is inconsistent (and therefore worthless). One problem this introduces is that a true theorem may be impossible to prove, but it turns out we can never know, in advance of its proof, whether it is provable.

13. But it gets worse. Gödel's second


material implication

Banach-Tarski paradox

empirical

18. Kline. Mathematics: The Loss of Certainty.

incompleteness theorem shows that such systems cannot even be shown to be consistent. This means at any moment someone could find an inconsistency in mathematics, and not only would we lose some of the theorems, we would lose them all. This is because, by what is called the material implication (Kline. Mathematics: The Loss of Certainty, pp. 187-8, 264), if one contradiction can be found, every proposition can be proven from it. And if this is the case, all (even proven) theorems in the system would be suspect.

14. Even though no contradiction has yet appeared in ZFC, its axiom of choice, which is required for the proof of most of what has thus far been proven, generates the Banach-Tarski paradox, which says a sphere of diameter x can be partitioned into a finite number of pieces and recombined to form two spheres of diameter x. Troubling, to say the least. Attempts were made for a while to eliminate the use of the axiom of choice, but our buddy Gödel later proved that if ZF is consistent, so is ZFC (ibid., p. 267).

Mathematics is considered empirical

15. Since its inception, mathematics has been applied extensively to the modeling of the world. Despite its cracked foundations, it has striking utility. Many recent leading minds of mathematics, philosophy, and science suggest we treat mathematics as empirical: like any science, subject to its success in describing and predicting events in the world. As Kline18 summarizes,


The upshot [. . . ] is that sound mathematics must be determined not by any one foundation which may some day prove to be right. The "correctness" of mathematics must be judged by its application to the physical world. Mathematics is an empirical science much as Newtonian mechanics. It is correct only to the extent that it works and, when it does not, it must be modified. It is not a priori knowledge, even though it was so regarded for two thousand years. It is not absolute or unchangeable.


itselfreason Mathematical

reasoning


itselfoverview Mathematical

topics in

overview


itselfengmath What is

mathematics for

engineering

language


itselfex Exercises for Chapter

01

formal languages

logic

reasoning

propositions

theorems

proof

propositional calculus

logical connectives

first-order logic

sets

Mathematical reasoning, logic, and set theory

In order to communicate mathematical ideas effectively, formal languages have been developed within which logic, i.e. deductive (mathematical) reasoning, can proceed. Propositions are statements that can be either true ⊤ or false ⊥. Axiomatic systems begin with statements (axioms) assumed true. Theorems are proven by deduction. In many forms of logic, like the propositional calculus (Wikipedia. Propositional calculus — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Propositional%20calculus&oldid=914757384. [Online; accessed 29-October-2019]. 2019), compound propositions are constructed via logical connectives, like "and" and "or," from atomic propositions (see Lecture setslogic). In others, like first-order logic (Wikipedia. First-order logic — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=First-order%20logic&oldid=921437906. [Online; accessed 29-October-2019]. 2019), there are also


quantifiers

set theory

logical quantifiers like "for every" and "there exists."

The mathematical objects and operations about which most propositions are made are expressed in terms of set theory, which was introduced in Lecture itselffound and will be expanded upon in Lecture setssetintro. We can say that mathematical reasoning comprises mathematical objects and operations expressed in set theory, and logic allows us to reason therewith.


set theory

1. K. Ciesielski. Set Theory for the Working Mathematician. London Mathematical Society Student Texts. Cambridge University Press, 1997. ISBN 9780521594653. A readable introduction to set theory.

2. H.B. Enderton. Elements of Set Theory. Elsevier Science, 1977. ISBN 9780080570426. A gentle introduction to set theory and mathematical reasoning—a great place to start.

set

field

addition

multiplication

subtraction

division

setssetintro Introduction to

set theory

Set theory is the language of the modern foundation of mathematics, as discussed in Lecture itselffound. It is unsurprising, then, that it arises throughout the study of mathematics. We will use set theory extensively in our study of probability theory.

The axioms of ZFC set theory were introduced in Lecture itselffound. Instead of proceeding in the pure-mathematics way of introducing axioms and proving theorems, we will opt for a more applied approach in which we begin with some simple definitions and include basic operations. A more thorough and still readable treatment is given by Ciesielski,1 and a very gentle version by Enderton.2

A set is a collection of objects. Set theory gives us a way to describe these collections. Often the objects in a set are numbers or sets of numbers. However, a set could represent collections of zebras and trees and hairballs. For instance, here are some sets:
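Since the original's handwritten examples don't reproduce here, a sketch in Python (whose built-in `set` type behaves much like these mathematical collections) can serve as an illustration; the particular sets below are my own choices:

```python
# Some example sets: elements can be numbers or arbitrary (hashable) objects.
A = {1, 2, 3}                       # a set of integers
B = {3.14, 2.718}                   # a set of reals
C = {"zebra", "tree", "hairball"}   # a set of miscellaneous objects

# A set ignores repetition and order.
assert {1, 2, 2, 3} == {3, 2, 1}
print(A, B, C)
```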

A field is a set with special structure. This structure is provided by the addition (+) and multiplication (×) operators and their inverses, subtraction (−) and division (÷).


real numbers

3. When the natural numbers include zero, we write N0.

set membership

set operations

union

The quintessential example of a field is the set of real numbers R, which admits these operators, making it a field. The reals R, the complex numbers C, the integers Z, and the natural numbers3 N are the number systems we typically consider (strictly, of these only R and C are fields; Z and N lack multiplicative inverses).

Set membership is the belonging of an object to a set. It is denoted with the symbol ∈, which can be read "is an element of," for element x and set X: x ∈ X. For instance, we might say 7 ∈ {1, 7, 2} or 4 ∉ {1, 7, 2}. Or we might declare that x is a real number by stating x ∈ R.

Set operations can be used to construct new sets from established sets. We consider a few common set operations now.

The union ∪ of sets is the set containing all the elements of the original sets (no repetition allowed). The union of sets A and B is denoted A ∪ B. For instance, let A = {1, 2, 3} and B = {−1, 3}; then A ∪ B = {−1, 1, 2, 3}.


intersection

empty set

set difference

subset

proper subset

The intersection ∩ of sets is a set containing the elements common to all the original sets. The intersection of sets A and B is denoted A ∩ B. For instance, let A = {1, 2, 3} and B = {2, 3, 4}; then A ∩ B = {2, 3}.

If two sets have no elements in common, the intersection is the empty set ∅ = {}, the unique set with no elements.

The set difference of two sets A and B is the set of elements in A that aren't also in B. It is denoted A \ B. For instance, let A = {1, 2, 3} and B = {2, 3, 4}; then A \ B = {1}.

A subset ⊆ of a set is a set the elements of which are contained in the original set. If two sets are equal, each is still considered a subset of the other. We call a subset that is not equal to the other set a proper subset ⊂.


complement

cartesian product

set-builder notation

For instance, let A = {1, 2, 3} and B = {1, 2}; then B ⊂ A.

The complement of a subset is the set of elements of the original set that aren't in the subset. For instance, if B ⊆ A, then the complement of B, denoted B̄, is B̄ = A \ B = {3}.

The cartesian product of two sets A and B is denoted A × B and is the set of all ordered pairs (a, b), where a ∈ A and b ∈ B. It's worthwhile considering the following notation for this definition:

A × B = {(a, b) | a ∈ A and b ∈ B},

which means "the cartesian product of A and B is the set of ordered pairs (a, b) such that a ∈ A and b ∈ B," in set-builder notation (Wikipedia. Set-builder notation — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Set-builder%20notation&oldid=917328223. [Online; accessed 29-October-2019]. 2019).

map

function

value

domain

codomain

image
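The set operations above (union, intersection, difference, subset, complement, and cartesian product) can be sketched with Python's `set` type; this is an illustrative aside, not part of the original text:

```python
from itertools import product

A = {1, 2, 3}
B = {2, 3, 4}

assert A | B == {1, 2, 3, 4}     # union
assert A & B == {2, 3}           # intersection
assert A - B == {1}              # set difference A \ B
assert {1, 2} < A                # proper subset
assert A - {1, 2} == {3}         # complement of {1, 2} relative to A

# cartesian product as a set of ordered pairs
assert set(product({1, 2}, {"a"})) == {(1, "a"), (2, "a")}
```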

Let A and B be sets. A map or function f from A to B is an assignment, to each element a ∈ A, of some element b ∈ B. The function is denoted f : A → B, and we say that f maps each element a ∈ A to an element f(a) ∈ B, called the value of a under f, or a ↦ f(a). We say that f has domain A and codomain B. The image of f is the subset of its codomain B that contains the values of all elements mapped by f from its domain A.
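As a sketch (the dict-based representation is an assumption of this aside, not the text's), a finite function f : A → B and its image can be modeled directly:

```python
A = {1, 2, 3}                  # domain
B = {"a", "b", "c"}            # codomain
f = {1: "a", 2: "b", 3: "a"}   # assign an element of B to each element of A

assert set(f) == A             # f is defined on every element of the domain
assert set(f.values()) <= B    # every value lands in the codomain
image = {f[a] for a in A}      # image: the values actually attained
assert image == {"a", "b"}     # here, a proper subset of the codomain
```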


first-order logic

atomistic proposition

compound proposition

logical connectives

not

and, or

truth table

universal quantifier symbol

existential quantifier

setslogic Logical connectives

and quantifiers

In order to make compound propositions, we need to define logical connectives. In order to specify quantities of variables, we need to define logical quantifiers. The following is a form of first-order logic (Wikipedia. First-order logic — Wikipedia, The Free Encyclopedia).

Logical connectives

A proposition can be either true ⊤ or false ⊥. When it does not contain a logical connective, it is called an atomistic proposition.

To combine propositions into a compound proposition, we require logical connectives. They are not (¬), and (∧), and or (∨). Table 02.1 is a truth table for a number of connectives.

Quantifiers

Logical quantifiers allow us to indicate the quantity of a variable. The universal quantifier symbol ∀ means "for all." For instance, let A be a set; then ∀a ∈ A means "for all elements in A" and gives this quantity to variable a. The existential quantifier ∃ means "there exists at least one" or "for some." For instance, let A be a set; then ∃a ∈ A

Table 02.1: a truth table for logical connectives. The first two columns are the truth values of propositions p and q; the rest are outputs.

         not  and  or   nand nor  xor  xnor
p  q  |  ¬p   p∧q  p∨q  p↑q  p↓q  p⊻q  p⇔q
⊥  ⊥  |  ⊤    ⊥    ⊥    ⊤    ⊤    ⊥    ⊤
⊥  ⊤  |  ⊤    ⊥    ⊤    ⊤    ⊥    ⊤    ⊥
⊤  ⊥  |  ⊥    ⊥    ⊤    ⊤    ⊥    ⊤    ⊥
⊤  ⊤  |  ⊥    ⊤    ⊤    ⊥    ⊥    ⊥    ⊤


means "there exists at least one element a in A . . . ."
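Over a finite set, Python's `all` and `any` give a loose sketch of ∀ and ∃ (an analogy assumed here for illustration, not a claim from the text):

```python
A = {1, 2, 3, 4}

p = 3 in A                # an atomistic proposition
q = 7 in A
assert p and not q        # a compound proposition via connectives

# universal quantifier: "for all a in A, a > 0"
assert all(a > 0 for a in A)
# existential quantifier: "there exists a in A such that a is even"
assert any(a % 2 == 0 for a in A)
```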


setsex Exercises for Chapter

02

Exercise 02.1 (hardhat)

For the following, write the set described in set-builder notation:

a. A = {2, 3, 5, 9, 17, 33, · · · }
b. B is the set of integers divisible by 11
c. C = {13, 14, 15, · · · }
d. D is the set of reals between −3 and 42

Exercise 02.2 (2)

Let x, y ∈ Rⁿ. Prove the Cauchy-Schwarz Inequality:

|x · y| ≤ ‖x‖‖y‖.    (ex.1)

Hint: you may find the geometric definition of the dot product helpful.

Exercise 02.3 (3)

Let x ∈ Rⁿ. Prove that

x · x = ‖x‖².    (ex.2)

Hint: you may find the geometric definition of the dot product helpful.

Exercise 02.4 (4)

Let x, y ∈ Rⁿ. Prove the Triangle Inequality:

‖x + y‖ ≤ ‖x‖ + ‖y‖.    (ex.3)

Hint: you may find the Cauchy-Schwarz Inequality helpful.

prob

Probability


Probability theory

1. For a good introduction to probability theory, see Ash (Robert B. Ash. Basic Probability Theory. Dover Publications, Inc., 2008) or Jaynes et al. (E.T. Jaynes et al. Probability Theory: The Logic of Science. Cambridge University Press, 2003. ISBN 9780521592710. An excellent and comprehensive introduction to probability theory).

interpretation of probability

event

probmeas Probability and

measurement

Probability theory is a well-defined branch of mathematics. Andrey Kolmogorov described a set of axioms in 1933 that are still in use today as the foundation of probability theory.1 We will implicitly use these axioms in our analysis.

The interpretation of probability is a contentious matter. Some believe probability quantifies the frequency of the occurrence of some event that is repeated in a large number of trials. Others believe it quantifies the state of our knowledge, or belief, that some event will occur.

In experiments, our measurements are tightly coupled to probability. This is apparent in the questions we ask. Here are some examples:

1. How common is a given event?
2. What is the probability we will reject a good theory based on experimental results?
3. How repeatable are the results?
4. How confident are we in the results?
5. What is the character of the fluctuations and drift in the data?
6. How much data do we need?


probability space

sample space

outcomes

probprob Basic probability

theory

The mathematical model for a class of measurements is called the probability space and is composed of a mathematical triple of a sample space Ω, σ-algebra F, and probability measure P, typically denoted (Ω, F, P), each of which we will consider in turn (Wikipedia. Probability space — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Probability%20space&oldid=914939789. [Online; accessed 31-October-2019]. 2019).

The sample space Ω of an experiment is the set representing all possible outcomes of the experiment. If a coin is flipped, the sample space is Ω = {H, T}, where H is heads and T is tails. If a coin is flipped twice, the sample space could be Ω = {HH, HT, TH, TT}.

However, the same experiment can have different sample spaces. For instance, for two coin flips we could also choose


event

We base our choice of Ω on the problem at hand.

An event is a subset of the sample space. That is, an event corresponds to a yes-or-no question about the experiment. For instance, event A (remember, A ⊆ Ω) in the coin-flipping experiment (two flips) might be A = {HT, TH}. A is an event that corresponds to the question, "Is the second flip different than the first?" A is the event for which the answer is "yes."

Algebra of events

Because events are sets, we can perform the usual set operations with them.

Example prob.prob-1: set operations with events

Problem Statement

Consider a toss of a single die. We choose the sample space to be Ω = {1, 2, 3, 4, 5, 6}. Let the following define events:

A ≡ the result is even = {2, 4, 6}
B ≡ the result is greater than 2 = {3, 4, 5, 6}

Find the following event combinations: A ∪ B, A ∩ B, A \ B, B \ A, Ā, B̄.
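One way to check the combinations by hand is with Python's set operations (an aside; the worked answers themselves are left to the original):

```python
Omega = {1, 2, 3, 4, 5, 6}   # sample space for one die
A = {2, 4, 6}                # result is even
B = {3, 4, 5, 6}             # result is greater than 2

assert A | B == {2, 3, 4, 5, 6}   # union
assert A & B == {4, 6}            # intersection
assert A - B == {2}               # A \ B
assert B - A == {3, 5}            # B \ A
assert Omega - A == {1, 3, 5}     # complement of A
assert Omega - B == {1, 2}        # complement of B
```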


σshyalgebra

probability measure

The σ-algebra F is the collection of events of interest. Often F is the set of all possible events given a sample space Ω, which is just the power set of Ω (Wikipedia. Probability space — Wikipedia, The Free Encyclopedia). When referring to an event, we often state that it is an element of F. For instance, we might say an event A ∈ F.

We're finally ready to assign probabilities to events. We define the probability measure P : F → [0, 1] to be a function satisfying the following conditions:

1. For every event A ∈ F, the probability measure of A is greater than or equal to zero—i.e., P(A) ⩾ 0.
2. If an event is the entire sample space, its probability measure is unity—i.e., if A = Ω, P(A) = 1.
3. If events A1, A2, · · · are disjoint sets (no elements in common), then P(A1 ∪ A2 ∪ · · · ) = P(A1) + P(A2) + · · · .

We conclude the basics by observing four facts that can be proven from the definitions above:

1

2

3

4


independent

probcondition Independence

and conditional

probability

Two events A and B are independent if and only if

P(A ∩ B) = P(A)P(B).

If an experimenter must make a judgment, without data, about the independence of events, they base it on their knowledge of the events, as discussed in the following example.

Example prob.condition-1: independence

Problem Statement

Answer the following questions and imperatives:

1. Consider a single fair die rolled twice. What is the probability that both rolls are 6?
2. What changes if the die is biased by a weight such that P(6) = 1/7?
3. What changes if the die is biased by a magnet, rolled on a magnetic dice-rolling tray, such that P(6) = 1/7?
4. What changes if there are two dice, biased by weights such that for each P(6) = 1/7, rolled once, both resulting in 6?
5. What changes if there are two dice, biased by magnets, rolled together?
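For the weight-biased cases, independence lets us multiply probabilities; a quick numeric check (a sketch: the magnet cases appear only as a comment, since their plausible dependence breaks the product rule):

```python
from fractions import Fraction

# Independent rolls: P(both 6) = P(6) * P(6).
assert Fraction(1, 6) ** 2 == Fraction(1, 36)   # fair die, rolled twice
assert Fraction(1, 7) ** 2 == Fraction(1, 49)   # weight-biased die, P(6) = 1/7

# Magnet-biased dice rolled together can interact, so the rolls are plausibly
# NOT independent, and P(6 and 6) != P(6) * P(6) in general.
```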


dependent

conditional probability

Conditional probability

If events A and B are somehow dependent, we need a way to compute the probability of B occurring given that A occurs. This is called the conditional probability of B given A and is denoted P(B | A). For P(A) > 0, it is defined as

P(B | A) = P(A ∩ B) / P(A).    (condition.1)

We can interpret this as a restriction of the sample space Ω to A, i.e. the new sample space Ω′ = A ⊆ Ω. Note that if A and B are independent, we obtain the obvious result P(B | A) = P(B).


Example prob.condition-2: dependence

Problem Statement

Consider two unbiased dice rolled once. Let events A = {sum of faces = 8} and B = {faces are equal}. What is the probability the faces are equal, given that their sum is 8?
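A brute-force check of this example by enumerating the 36 equally likely outcomes (an illustrative sketch, not the text's worked solution):

```python
from fractions import Fraction
from itertools import product

Omega = list(product(range(1, 7), repeat=2))  # 36 equally likely outcomes
A = {w for w in Omega if sum(w) == 8}         # sum of faces is 8
B = {w for w in Omega if w[0] == w[1]}        # faces are equal

def P(E):
    return Fraction(len(E), len(Omega))

P_B_given_A = P(A & B) / P(A)                 # Equation (condition.1)
assert P(A) == Fraction(5, 36)                # A = {(2,6),(3,5),(4,4),(5,3),(6,2)}
assert P_B_given_A == Fraction(1, 5)
```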


Bayes' theorem

probbay Bayes' theorem

Given two events A and B, Bayes' theorem (aka Bayes' rule) states that

P(A | B) = P(B | A)P(A) / P(B).    (bay.1)

Sometimes this is written

P(A | B) = P(B | A)P(A) / (P(B | A)P(A) + P(B | ¬A)P(¬A))    (bay.2)
         = 1 / (1 + (P(B | ¬A)/P(B | A)) · (P(¬A)/P(A))).    (bay.3)

This is a useful theorem for determining a test's effectiveness. If a test is performed to determine whether an event has occurred, we might ask questions like, "If the test indicates that the event has occurred, what is the probability it has actually occurred?" Bayes' theorem can help compute an answer.

Testing outcomes

The test can be either positive or negative, meaning it can either indicate or not indicate that A has occurred. Furthermore, this result can be either true or false.

Table 03.1: test outcome B for event A.

              A      ¬A
positive (B)  true   false
negative (¬B) false  true

There are four options, then. Consider an event A and an event that is a test result B indicating that event A has occurred. Table 03.1 shows these four possible test outcomes. The event A occurring can lead to a true positive or a


sensitivity

detection rate

specificity

posterior probability

prior probability

false negative, whereas ¬A can lead to a true negative or a false positive.

Terminology is important here:

• P(true positive) = P(B | A), aka sensitivity or detection rate
• P(true negative) = P(¬B | ¬A), aka specificity
• P(false positive) = P(B | ¬A)
• P(false negative) = P(¬B | A)

Clearly, the desirable result for any test is that it is true. However, no test is true 100 percent of the time. So sometimes it is desirable to err on the side of the false positive, as in the case of a medical diagnostic. Other times it is more desirable to err on the side of a false negative, as in the case of testing for defects in manufactured balloons (when a false negative isn't a big deal).

Posterior probabilities

Returning to Bayes' theorem, we can evaluate the posterior probability P(A | B) of the event A having occurred, given that the test B is positive, from information that includes the prior probability P(A) of A. The form in Equation (bay.2) or (bay.3) is typically useful because it uses commonly known test probabilities of the true positive P(B | A) and of the false positive P(B | ¬A). We calculate P(A | B) when we want to interpret test results.


Figure 03.1: for different high sensitivities P(B|A) = 0.9000, 0.9900, 0.9990, 0.9999, the probability P(A|B) that an event A occurred given that a test for it B is positive, versus the probability P(A) that the event A occurs (plotted from 0 to 10⁻²), under the assumption that the specificity equals the sensitivity.

Some interesting results can be found from this. For instance, if we let P(B | A) = P(¬B | ¬A) (sensitivity equals specificity) and realize that P(B | ¬A) + P(¬B | ¬A) = 1 (given ¬A, either B or ¬B occurs), we can derive the expression

P(B | ¬A) = 1 − P(B | A).    (bay.4)

Using this and P(¬A) = 1 − P(A) in Equation (bay.3) gives (recall we've assumed sensitivity equals specificity)

P(A | B) = 1 / (1 + ((1 − P(B | A))/P(B | A)) · ((1 − P(A))/P(A)))    (bay.5)
         = 1 / (1 + (1/P(B | A) − 1)(1/P(A) − 1)).    (bay.6)

This expression is plotted in Figure 03.1. See that a positive result for a rare event (small P(A)) is hard to trust unless the sensitivity P(B | A) and specificity P(¬B | ¬A) are very high indeed.
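A numeric illustration of Equation (bay.6) (the particular numbers are mine, chosen to echo the figure): even a 99%-sensitive test for a rare event yields a surprisingly small posterior.

```python
def posterior(p_A, sens):
    """P(A|B) from Equation (bay.6), assuming specificity equals sensitivity."""
    return 1 / (1 + (1 / sens - 1) * (1 / p_A - 1))

# Rare event, P(A) = 0.001, with sensitivity P(B|A) = 0.99:
p = posterior(0.001, 0.99)
assert abs(p - 0.0902) < 5e-4   # only about a 9% chance A actually occurred
```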


Example probbayshy1 re Bayesrsquo theorem

Problem StatementSuppose 01 of springs manufactured at a

given plant are defective Suppose you need

to design a test that when it indicates

a deffective part the part is actually

defective 99 of the time What sensitivity

should your test have assuming it can be made

equal to its specificity

Problem Solution

The following was generated from a Jupyter notebook with the following filename and kernel:

    notebook filename: bayes_theorem_example_01.ipynb
    notebook kernel: python3

```python
from sympy import *                # for symbolics
import numpy as np                 # for numerics
import matplotlib.pyplot as plt    # for plots
from IPython.display import display, Markdown, Latex
```

Define symbolic variables:

```python
var('p_A, p_nA, p_B, p_nB, p_B_A, p_B_nA, p_A_B', real=True)
```

    (p_A, p_nA, p_B, p_nB, p_B_A, p_B_nA, p_A_B)

Beginning with Bayes' theorem and assuming the sensitivity and specificity are equal by Equation (bay4), we can derive the following expression for the posterior probability P(A | B):

```python
p_A_B_e1 = Eq(p_A_B, p_B_A*p_A/p_B).subs({
    p_B: p_B_A*p_A + p_B_nA*p_nA,  # conditional probability
    p_B_nA: 1 - p_B_A,             # Eq. (bay4)
    p_nA: 1 - p_A,
})
display(p_A_B_e1)
```

    p_{A|B} = p_A p_{B|A} / (p_A p_{B|A} + (1 − p_A)(1 − p_{B|A}))

Solve this for P(B | A), the quantity we seek:

```python
p_B_A_sol = solve(p_A_B_e1, p_B_A, dict=True)
p_B_A_eq1 = Eq(p_B_A, p_B_A_sol[0][p_B_A])
display(p_B_A_eq1)
```

    p_{B|A} = p_{A|B} (1 − p_A) / (−2 p_A p_{A|B} + p_A + p_{A|B})

Now let's substitute the given probabilities:

```python
p_B_A_spec = p_B_A_eq1.subs({
    p_A: 0.001,
    p_A_B: 0.99,
})
display(p_B_A_spec)
```

    p_{B|A} = 0.999989888981011

That's a tall order!


probrando Random variables

Probabilities are useful even when they do not deal strictly with events. It often occurs that we measure something that has randomness associated with it. We use random variables to represent these measurements.

A random variable X : Ω → ℝ is a function that maps an outcome ω from the sample space Ω to a real number x ∈ ℝ, as shown in Figure 032. A random variable will be denoted with a capital letter (e.g., X and K), and a specific value that it maps to will be denoted with a lowercase letter (e.g., x and k).

A discrete random variable K is one that takes on discrete values. A continuous random variable X is one that takes on continuous values.

Example probrando-1: dice again

Problem Statement: Roll two unbiased dice. Let K be a random variable representing the sum of the two, and let P(k) be the probability of the result k ∈ K. Plot and interpret P(k).

Figure 032: a random variable X maps an outcome ω ∈ Ω to an x ∈ ℝ.
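One way such a PMF can be tabulated (a sketch by enumeration; the variable names are ours) is to count the 36 equally likely outcomes:

```python
from fractions import Fraction
from collections import Counter

# count each sum k over the 36 equally likely outcomes of two dice
counts = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))
P = {k: Fraction(c, 36) for k, c in counts.items()}  # the PMF P(k)

# P(7) = 6/36 is the largest value; the PMF is triangular and sums to 1
```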

Example probrando-2: Johnson-Nyquist noise

Problem Statement: A resistor at nonzero temperature, without any applied voltage, exhibits an interesting phenomenon: its voltage randomly fluctuates. This is called Johnson-Nyquist noise and is a result of thermal excitation of charge carriers (electrons, typically). For a given resistor and measurement system, let the probability density function f_V of the voltage V across an unrealistically hot resistor be

    f_V(V) = (1/√π) e^(−V²).

Plot and interpret the meaning of this function.
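A quick numerical sanity check (our own, not part of the example): this density should integrate to unity over all voltages, since some voltage always occurs.

```python
import math

# the PDF given in the example
f_V = lambda v: math.exp(-v**2) / math.sqrt(math.pi)

# simple Riemann sum over [-10, 10], far beyond where f_V is appreciable
dv = 1e-3
total = sum(f_V(-10 + i * dv) * dv for i in range(20_000))
# total is very close to 1
```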

Figure 033: plot of a probability mass function

probpxf Probability density and mass functions

Consider an experiment that measures a random variable. We can plot the relative frequency of the measurand landing in different "bins" (ranges of values). This is called a frequency distribution or a probability mass function (PMF).

Consider, for instance, a probability mass function as plotted in Figure 033, where a frequency a_i can be interpreted as an estimate of the probability of the measurand being in the ith interval. The sum of the frequencies must be unity:

    Σ_{i=1}^{k} a_i = 1,

with k being the number of bins.

The frequency density distribution is similar to the frequency distribution, but with a_i ↦ a_i/Δx, where Δx is the bin width. If we let the bin width approach zero, we derive the probability density function (PDF)

    f(x) = lim_{k→∞, Δx→0} Σ_{j=1}^{k} a_j/Δx.    (pxf1)

Figure 034: plot of a probability density function

We typically think of a probability density function f, like the one in Figure 034, as a function that can be integrated over to find the probability of the random variable (measurand) being in an interval [a, b]:

    P(x ∈ [a, b]) = ∫_a^b f(χ) dχ.    (pxf2)

Of course, integrating over all possible values must yield unity: P(x ∈ (−∞, ∞)) = 1.

We now consider a common PMF and a common PDF.

Binomial PMF

Consider a random binary sequence of length n, such that each element is a random 0 or 1, generated independently, like

    (1, 0, 1, 1, 0, · · · , 1, 1).    (pxf3)

Let events 1 and 0 be mutually exclusive and exhaustive, and let P(1) = p. The probability of the sequence above occurring, if it contains k ones, is p^k (1 − p)^(n−k). There are n choose k,

    (n k) = n! / (k! (n − k)!),    (pxf4)

possible combinations of k ones for n bits. Therefore, the probability of any combination of k ones in a series is

    f(k) = (n k) p^k (1 − p)^(n−k).    (pxf5)

We call Equation (pxf5) the binomial distribution PMF.
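Equation (pxf5) translates directly to code (a sketch using only the standard library; the function name is ours):

```python
from math import comb

def binomial_pmf(n, k, p):
    """Probability of exactly k ones in n independent bits with P(1) = p,
    per Equation (pxf5)."""
    return comb(n, k) * p**k * (1 - p)**(n - k)

# the PMF sums to one over k = 0, ..., n
total = sum(binomial_pmf(10, k, 0.3) for k in range(11))
```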

Example probpxf-1: binomial PMF

Problem Statement: Consider a field sensor that fails for a given measurement with probability p. Given n measurements, plot the binomial PMF as a function of k failed measurements, for a few different probabilities of failure p ∈ {0.04, 0.25, 0.5, 0.75, 0.96}.

Problem Solution: Figure 036 shows Matlab code for constructing the PMFs plotted in Figure 035. Note that the symmetry is due to the fact that events 1 and 0 are mutually exclusive and exhaustive.


Figure 035: binomial PMF for n = 100 measurements and different values of P(1) = p, the probability of a measurement error. The plot is generated by the Matlab code of Figure 036.

```matlab
% parameters
n = 100;
k_a = linspace(1,n,n);
p_a = [.04,.25,.5,.75,.96];

% binomial function
f = @(n,k,p) nchoosek(n,k)*p^k*(1-p)^(n-k);

% loop through to construct an array
f_a = NaN*ones(length(k_a),length(p_a));
for i = 1:length(k_a)
  for j = 1:length(p_a)
    f_a(i,j) = f(n,k_a(i),p_a(j));
  end
end

% plot
figure
colors = jet(length(p_a));
for j = 1:length(p_a)
  bar(k_a,f_a(:,j), ...
    'facecolor',colors(j,:), ...
    'facealpha',0.5, ...
    'displayname',['$p = ',num2str(p_a(j)),'$'] ...
  ); hold on
end
leg = legend('show','location','north');
set(leg,'interpreter','latex');
hold off
xlabel('number of ones in sequence $k$')
ylabel('probability')
xlim([0,100])
```

Figure 036: a Matlab script for generating binomial PMFs



Gaussian PDF

The Gaussian or normal random variable x has PDF

    f(x) = (1 / (σ√(2π))) exp(−(x − µ)² / (2σ²)).    (pxf6)

Although we're not quite ready to understand these quantities in detail, it can be shown that the parameters µ and σ have the following meanings:

- µ is the mean of x,
- σ is the standard deviation of x, and
- σ² is the variance of x.

Consider the "bell-shaped" Gaussian PDF in Figure 037. It is always symmetric. The mean µ is its central value, and the standard deviation σ is directly related to its width. We will continue to explore the Gaussian distribution in the following lectures, especially in Lecture statsconfidence.

Figure 037: PDF for Gaussian random variable x, mean µ = 0 and standard deviation σ = 1/√2
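The curve of Figure 037 can be reproduced by evaluating Equation (pxf6) with µ = 0 and σ = 1/√2 (a sketch; standard library only, function name ours):

```python
import math

def gaussian_pdf(x, mu, sigma):
    """Gaussian PDF of Equation (pxf6)."""
    return math.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

sigma = 1 / math.sqrt(2)
peak = gaussian_pdf(0.0, 0.0, sigma)  # peak value is 1/sqrt(pi)
```

Note that with σ = 1/√2, this is exactly the f_V of Example probrando-2.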


probE Expectation

Recall that a random variable is a function X : Ω → ℝ that maps from the sample space to the reals. Random variables are the arguments of probability mass functions (PMFs) and probability density functions (PDFs).

The expected value (or expectation) of a random variable is akin to its "average value" and depends on its PMF or PDF. The expected value of a random variable X is denoted ⟨X⟩ or E[X]. There are two definitions of the expectation: one for a discrete random variable, the other for a continuous random variable. Before we define them, however, it is useful to predefine the most fundamental property of a random variable: its mean.

Definition probE1: mean. The mean of a random variable X is defined as

    m_X = E[X].

Let's begin with a discrete random variable.

Definition probE2: expectation of a discrete random variable. Let K be a discrete random variable and f its PMF. The expected value of K is defined as

    E[K] = Σ_{∀k} k f(k).

Example probE-1: expectation of a discrete random variable

Problem Statement: Given a discrete random variable K with PMF shown below, what is its mean m_K?
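As a concrete instance of Definition probE2 (a sketch of our own, using the two-dice sum PMF of Example probrando-1), the expectation is a probability-weighted sum:

```python
from fractions import Fraction
from collections import Counter

# PMF of the sum of two fair dice
counts = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))
pmf = {k: Fraction(c, 36) for k, c in counts.items()}

# E[K] = sum over all k of k * f(k)
E_K = sum(k * f for k, f in pmf.items())  # the mean m_K
```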

Let us now turn to the expectation of a continuous random variable.

Definition probE3: expectation of a continuous random variable. Let X be a continuous random variable and f its PDF. The expected value of X is defined as

    E[X] = ∫_{−∞}^{∞} x f(x) dx.

Example probE-2: expectation of a continuous random variable

Problem Statement: Given a continuous random variable X with Gaussian PDF f, what is the expected value of X?

Due to its sum or integral form, the expected value E[·] has some familiar properties, for random variables X and Y and reals a and b:

    E[a] = a    (E1a)
    E[X + a] = E[X] + a    (E1b)
    E[aX] = a E[X]    (E1c)
    E[E[X]] = E[X]    (E1d)
    E[aX + bY] = a E[X] + b E[Y].    (E1e)
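These properties can be spot-checked numerically (our own sketch, using the two-dice sum PMF as a test distribution):

```python
from fractions import Fraction
from collections import Counter

# PMF of the sum of two fair dice
counts = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))
pmf = {k: Fraction(c, 36) for k, c in counts.items()}
E = lambda g: sum(g(k) * f for k, f in pmf.items())  # expectation operator

a, b = 3, 5
lhs = E(lambda k: a * k + b)   # E[aK + b]
rhs = a * E(lambda k: k) + b   # a E[K] + b, combining (E1b) and (E1c)
```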


probmoments Central moments

Given a probability mass function (PMF) or probability density function (PDF) of a random variable, several useful parameters of the random variable can be computed. These are called central moments, which quantify parameters relative to its mean.

Definition probmoments1: central moments. The nth central moment of random variable X with PDF f is defined as

    E[(X − µ_X)^n] = ∫_{−∞}^{∞} (x − µ_X)^n f(x) dx.

For discrete random variable K with PMF f,

    E[(K − µ_K)^n] = Σ_{∀k} (k − µ_K)^n f(k).

Example probmoments-1: first moment

Problem Statement: Prove that the first central moment of continuous random variable X is zero.
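A symbolic sketch of this result for the Gaussian PDF specifically (using sympy, as in the earlier notebook; the general proof follows from the linearity of E):

```python
import sympy as sp

x, mu = sp.symbols('x mu', real=True)
sigma = sp.symbols('sigma', positive=True)

# Gaussian PDF of Equation (pxf6)
f = sp.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * sp.sqrt(2 * sp.pi))

# first central moment: integral of (x - mu) f(x) over the reals
m1 = sp.integrate((x - mu) * f, (x, -sp.oo, sp.oo))
```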

The second central moment of random variable X is called the variance and is denoted

    σ_X², Var[X], or E[(X − µ_X)²].    (moments1)

The variance is a measure of the width or spread of the PMF or PDF. We usually compute the variance with the formula

    Var[X] = E[X²] − µ_X².

Other properties of variance include, for real constant c,

    Var[c] = 0    (moments2)
    Var[X + c] = Var[X]    (moments3)
    Var[cX] = c² Var[X].    (moments4)

The standard deviation is defined as

    σ_X = √(Var[X]).

Although the variance is mathematically more convenient, the standard deviation has the same physical units as X, so it is often the more physically meaningful quantity. Due to its meaning as the width or spread of the probability distribution and its sharing of physical units, it is a convenient choice for error bars on plots of a random variable.

The skewness Skew[X] is a normalized third central moment,

    Skew[X] = E[(X − µ_X)³] / σ_X³.    (moments5)

Skewness is a measure of asymmetry of a random variable's PDF or PMF. For a symmetric PMF or PDF, such as the Gaussian PDF, Skew[X] = 0.

The kurtosis Kurt[X] is a normalized fourth central moment,

    Kurt[X] = E[(X − µ_X)⁴] / σ_X⁴.    (moments6)

Kurtosis is a measure of the tailedness of a random variable's PDF or PMF. "Heavier" tails yield higher kurtosis.

A Gaussian random variable has PDF with kurtosis 3. Given that for Gaussians both skewness and kurtosis have nice values (0 and 3), we can think of skewness and kurtosis as measures of similarity to the Gaussian PDF.
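These definitions translate directly to code (our own sketch, computing the normalized moments of the symmetric two-dice PMF of Example probrando-1):

```python
from collections import Counter

# PMF of the sum of two fair dice
counts = Counter(d1 + d2 for d1 in range(1, 7) for d2 in range(1, 7))
pmf = {k: c / 36 for k, c in counts.items()}

mu = sum(k * f for k, f in pmf.items())                          # mean
moment = lambda n: sum((k - mu)**n * f for k, f in pmf.items())  # nth central moment
sigma = moment(2) ** 0.5                                         # standard deviation
skew = moment(3) / sigma**3  # zero, since the PMF is symmetric
kurt = moment(4) / sigma**4
```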

probex Exercises for Chapter 03

Exercise 031 (5). Several physical processes can be modeled with a random walk: a process of iteratively changing a quantity by some random amount. Infinitely many variations are possible, but common factors of variation include probability distribution, step size, dimensionality (e.g., one-dimensional, two-dimensional, etc.), and coordinate system. Graphical representations of these walks can be beautiful. Develop a computer program that generates random walks and corresponding graphics. Do it well and call it art, because it is.
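One minimal starting point for this exercise (a sketch, not a full solution; the walk here is one-dimensional with unit steps, and the function name is ours):

```python
import random

def random_walk(n_steps, seed=None):
    """Return the positions of a 1-D random walk with unit steps."""
    rng = random.Random(seed)
    position, path = 0, [0]
    for _ in range(n_steps):
        position += rng.choice((-1, 1))  # step up or down with equal probability
        path.append(position)
    return path

walk = random_walk(1000, seed=42)
# plot `walk` against its index, vary the step distribution and
# dimensionality, and the art follows
```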

stats Statistics

Whereas probability theory is primarily focused on the relations among mathematical objects, statistics is concerned with making sense of the outcomes of observation (Steven S. Skiena, Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win, Outlooks, Cambridge University Press, 2001, DOI 10.1017/CBO9780511547089. This includes a lucid section on probability versus statistics, also available here: https://www3.cs.stonybrook.edu/~skiena/jaialai/excerpts/node12.html). However, we frequently use statistical methods to estimate probabilistic models. For instance, we will learn how to estimate the standard deviation of a random process we have some reason to expect has a Gaussian probability distribution.

Statistics has applications in nearly every applied science and engineering discipline. Any time measurements are made, statistical analysis is how one makes sense of the results.

analysis is how one makes sense of the results

For instance determining a reasonable level

of confidence in a measured parameter

stats Statistics terms p 3

machine learning

1 Ash Basic Probability Theory

2 Jaynes et al Probability Theory The Logic of Science

3 Erwin Kreyszig Advanced Engineering Mathematics 10th John Wiley amp

Sons Limited 2011 ISBN 9781119571094 The authoritative resource for

engineering mathematics It includes detailed accounts of probability

statistics vector calculus linear algebra fourier analysis ordinary

and partial differential equations and complex analysis It also

includes several other topics with varying degrees of depth Overall

it is the best place to start when seeking mathematical guidance

requires statistics

A particularly hot topic nowadays is machine

learning which seems to be a field with

applications that continue to expand This field

is fundamentally built on statistics

A good introduction to statistics appears at

the end of Ash1 A more involved introduction

is given by Jaynes et al2 The treatment by

Kreyszig3 is rather incomplete as will be our

own

statsterms Populations, samples, and machine learning

An experiment's population is a complete collection of objects that we would like to study. These objects can be people, machines, processes, or anything else we would like to understand experimentally.

Of course, we typically can't measure all of the population. Instead, we take a subset of the population, called a sample, and infer the characteristics of the entire population from this sample.

However, this inference, that the sample is somehow representative of the population, assumes the sample size is sufficiently large and that the sampling is random. This means selection of the sample should be such that no one group within the population is systematically over- or under-represented in the sample.

Machine learning is a field that makes extensive use of measurements and statistical inference. In it, an algorithm is trained by exposure to sample data, which is called a training set. The variables measured are called features. Typically, a predictive model is developed that can be used to extrapolate from the data to a new situation. The methods of statistical analysis we introduce in this chapter are the foundation of most machine learning methods.

Example statsterms-1: combat boots

Problem Statement: Consider a robot, Pierre, with a particular gravitas and sense of style. He seeks just the right-looking pair of combat boots for wearing in the autumn rains. Pierre is to purchase the boots online via image recognition and decides to gather data by visiting a hipster hangout one evening to train his style. For contrast, he also watches footage of a White Nationalist rally, focusing special attention on the boots of wearers of khakis and polos. Comment on Pierre's methods.

statssample Estimation of sample mean and variance

Estimation and sample statistics

The mean and variance definitions of Lecture probE and Lecture probmoments apply only to a random variable for which we have a theoretical probability distribution. Typically, it is not until after having performed many measurements of a random variable that we can assign a good distribution model. Until then, measurements can help us estimate aspects of the data. We usually start by estimating basic parameters such as mean and variance before estimating a probability distribution.

There are two key aspects to randomness in the measurement of a random variable. First, of course, there is the underlying randomness, with its probability distribution, mean, standard deviation, etc., which we call the population statistics. Second, there is the statistical variability that is due to the fact that we are estimating the random variable's statistics, called its sample statistics, from some sample. Statistical variability is decreased with greater sample size and number of samples, whereas the underlying randomness of the random variable does not decrease. Instead, our estimates of its probability distribution and statistics improve.

Sample mean, variance, and standard deviation

The arithmetic mean or sample mean of a measurand with sample size N, represented by random variable X, is defined as

    x̄ = (1/N) Σ_{i=1}^{N} x_i.    (sample1)

If the sample size is large, x̄ → m_X (the sample mean approaches the mean). The population mean is another name for the mean m_X, which is equal to

    m_X = lim_{N→∞} (1/N) Σ_{i=1}^{N} x_i.    (sample2)

Recall that the definition of the mean is m_X = E[X].

The sample variance of a measurand represented by random variable X is defined as

    S_X² = (1/(N − 1)) Σ_{i=1}^{N} (x_i − x̄)².    (sample3)

If the sample size is large, S_X² → σ_X² (the sample variance approaches the variance). The population variance is another term for the variance σ_X² and can be expressed as

    σ_X² = lim_{N→∞} (1/(N − 1)) Σ_{i=1}^{N} (x_i − x̄)².    (sample4)

Recall that the definition of the variance is σ_X² = E[(X − m_X)²].

The sample standard deviation of a measurand represented by random variable X is defined as

    S_X = √(S_X²).    (sample5)

If the sample size is large, S_X → σ_X (the sample standard deviation approaches the standard deviation). The population standard deviation is another term for the standard deviation σ_X and can be expressed as

    σ_X = lim_{N→∞} √(S_X²).    (sample6)

Recall that the definition of the standard deviation is σ_X = √(σ_X²).
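The three estimators above can be sketched directly in code (standard library only; in practice numpy's `mean`, `var`, and `std` with `ddof=1` would be used instead):

```python
import math

def sample_mean(xs):
    return sum(xs) / len(xs)                            # Equation (sample1)

def sample_var(xs):
    m = sample_mean(xs)
    return sum((x - m)**2 for x in xs) / (len(xs) - 1)  # Equation (sample3)

def sample_std(xs):
    return math.sqrt(sample_var(xs))                    # Equation (sample5)

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # an illustrative sample
m, s = sample_mean(data), sample_std(data)
```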

Sample statistics as random variables

There is an ambiguity in our usage of the term "sample". It can mean just one measurement, or it can mean a collection of measurements gathered together. Hopefully it is clear from context.

In the latter sense, often we collect multiple samples, each of which has its own sample mean X̄_i and standard deviation S_Xi. In this situation, X̄_i and S_Xi are themselves random variables (meta af, I know). This means they have their own sample means X̄̄_i and S̄_Xi and standard deviations S_X̄i and S_SXi.

The mean of means X̄̄_i is equivalent to a mean with a larger sample size and is therefore our best estimate of the mean of the underlying random process. The mean of standard deviations S̄_Xi is our best estimate of the standard deviation of the underlying random process. The standard deviation of means S_X̄i is a measure of the spread in our estimates of the mean. It is our best estimate of the standard deviation of the statistical variation and should therefore tend to zero as sample size and number of samples increase. The standard deviation of standard deviations S_SXi is a measure of the spread in our estimates of the standard deviation of the underlying process. It should also tend to zero as sample size and number of samples increase.

Let N be the size of each sample. It can be shown that the standard deviation of the means S_X̄i can be estimated from a single sample standard deviation:

    S_X̄i ≈ S_Xi / √N.    (sample7)

This shows that as the sample size N increases, the statistical variability of the mean decreases (and in the limit approaches zero).
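A quick simulation (our own sketch, seeded for reproducibility) illustrates Equation (sample7): the spread of sample means shrinks like 1/√N.

```python
import random
import statistics

rng = random.Random(0)
N = 100          # size of each sample
n_samples = 2000  # number of samples collected

# draw many samples from a unit-variance Gaussian and record each sample mean
means = [statistics.fmean(rng.gauss(0, 1) for _ in range(N))
         for _ in range(n_samples)]

# the spread of the means should be near sigma/sqrt(N) = 1/10
spread = statistics.stdev(means)
```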

Nonstationary signal statistics

The sample mean, variance, and standard deviation definitions above assume the random process is stationary, that is, its population mean does not vary with time. However, a great many measurement signals have populations that do vary with time, i.e. they are nonstationary. Sometimes the nonstationarity arises from a "drift" in the dc value of a signal or some other slowly changing variable. But dynamic signals can also change in a recognizable and predictable manner, as when, say, the temperature of a room changes when a window is opened, or when a water level changes with the tide.

Typically, we would like to minimize the effect of nonstationarity on the signal statistics. In certain cases, such as drift, the variation is a nuisance only, but other times it is the point of the measurement.

Two common techniques are used, depending on the overall type of nonstationarity. If it is periodic with some known or estimated period, the measurement data series can be "folded" or "reshaped" such that the ith measurement of each period corresponds to the ith measurement of all other periods. In this case, somewhat counterintuitively, we can consider the ith measurements to correspond to a sample of size N, where N is the number of periods over which measurements are made. When the signal is aperiodic, we often simply divide it into "small" (relative to its overall trend) intervals, over which statistics are computed separately.

Note that in this discussion we have assumed that the nonstationarity of the signal is due to a variable that is deterministic (not random).

Example statssample-1: measurement of Gaussian noise on a nonstationary signal

Problem Statement: Consider the measurement of the temperature inside a desktop computer chassis via an inexpensive thermistor, a resistor that changes resistance with temperature. The processor and power supply heat the chassis in a manner that depends on processing demand. For the test protocol, the processors are cycled sinusoidally through processing power levels at a frequency of 50 mHz for nT = 12 periods and sampled at 1 Hz. Assume a temperature fluctuation between about 20 and 50 C and Gaussian noise with standard deviation 4 C. Consider a sample to be the multiple measurements of a certain instant in the period.

1. Generate and plot simulated temperature data as a time series and as a histogram or frequency distribution. Comment on why the frequency distribution sucks.
2. Compute the sample mean and standard deviation for each sample in the cycle.
3. Subtract the mean from each sample in the period, such that each sample distribution is centered at zero. Plot the composite frequency distribution of all samples together. This represents our best estimate of the frequency distribution of the underlying process.
4. Plot a comparison of the theoretical mean, which is 35, and the sample mean of means, with an error bar. Vary the number of samples nT and comment on its effect on the estimate.
5. Plot a comparison of the theoretical standard deviation and the sample mean of sample standard deviations, with an error bar. Vary the number of samples nT and comment on its effect on the estimate.
6. Plot the sample means over a single period with error bars of plus or minus one sample standard deviation of the means. This represents our best estimate of the sinusoidal heating temperature. Vary the number of samples nT and comment on the estimate.

Problem Solution

```matlab
clear; close all;  % clear kernel
```

Generate the temperature data

The temperature data can be generated by constructing an array that is passed to a sinusoid, then "randomized" by Gaussian random numbers. Note that we add 1 to np and n to avoid the sneaky fencepost error.

```matlab
f = 50e-3;    % Hz ... sinusoid frequency
a = 15;       % C ... amplitude of oscillation
dc = 35;      % C ... dc offset of oscillation
fs = 1;       % Hz ... sampling frequency
nT = 12;      % number of sinusoid periods
s = 4;        % C ... standard deviation
np = fs/f+1;  % number of samples per period
n = nT*np+1;  % total number of samples

t_a = linspace(0,nT/f,n);        % time array
sin_a = dc + a*sin(2*pi*f*t_a);  % sinusoidal array
rng(43);                         % seed the random number generator
noise_a = s*randn(size(t_a));    % gaussian noise
signal_a = sin_a + noise_a;      % sinusoid + noise
```

Now that we have an array of "data", we're ready to plot.

```matlab
h = figure;
p = plot(t_a,signal_a,'o-', ...
    'Color',[.8,.8,.8], ...
    'MarkerFaceColor','b', ...
    'MarkerEdgeColor','none', ...
    'MarkerSize',3);
xlabel('time (s)')
ylabel('temperature (C)')
hgsave(h,'figures/temp');
```

Figure 041: temperature over time

This is something like what we might see for continuous measurement data. Now the histogram:

```matlab
h = figure;
histogram(signal_a, ...
    30, ...                            % number of bins
    'normalization','probability' ...  % for PMF
);
xlabel('temperature (C)')
ylabel('probability')
hgsave(h,'figures/temp');
```

Figure 042: a poor histogram due to nonstationarity of the signal

This sucks because we plot a frequency distribution to tell us about the random variation, but this data includes the sinusoid.

Sample mean, variance, and standard deviation

To compute the sample mean µ and standard deviation s for each sample in the period, we must "pick out" the nT data points that correspond to each other. Currently, they're in one long 1 × n array, signal_a. It is helpful to reshape the data so it is in an nT × np array, with each row corresponding to a new period. This leaves the corresponding points aligned in columns. It is important to note that we can do this "folding" operation only when we know rather precisely the period of the underlying sinusoid. It is given in the problem that it is a controlled experiment variable. If we did not know it, we would have to estimate it, too, from the data.

```matlab
signal_ar = reshape(signal_a(1:end-1),[np,nT])';
size(signal_ar)      % check size
signal_ar(1:3,1:4)   % print three rows, four columns
```

    ans =
        12  21

    ans =
        30.2718  40.0946  40.8341  44.7662
        40.1836  37.2245  49.4076  46.1137
        40.0571  40.9718  46.1627  41.9145

Define the mean, variance, and standard deviation functions as "anonymous functions". We "roll our own". These are not as efficient or flexible as the built-in Matlab functions mean, var, and std, which should typically be used.

```matlab
my_mean = @(vec) sum(vec)/length(vec);
my_var = @(vec) sum((vec-my_mean(vec)).^2)/(length(vec)-1);
my_std = @(vec) sqrt(my_var(vec));
```

Now the sample mean, variance, and standard deviations can be computed. We proceed by looping through each column of the reshaped signal array.

```matlab
mu_a = NaN*ones(1,np);   % initialize mean array
var_a = NaN*ones(1,np);  % initialize var array
s_a = NaN*ones(1,np);    % initialize std array

for i = 1:np  % for each column
    mu_a(i) = my_mean(signal_ar(:,i));
    var_a(i) = my_var(signal_ar(:,i));
    s_a(i) = sqrt(var_a(i));  % touch of speed
end
```

Composite frequency distribution

The columns represent samples. We want to subtract the mean from each column. We use repmat to reproduce mu_a in nT rows so it can be easily subtracted.

```matlab
signal_arz = signal_ar - repmat(mu_a,[nT,1]);
size(signal_arz)      % check size
signal_arz(1:3,1:4)   % print three rows, four columns
```

    ans =
        12  21

    ans =
        -5.0881   0.9525  -0.2909  -1.5700
         4.8237  -1.9176   8.2826  -0.2225
         4.6972   1.8297   5.0377  -4.4216

Now that all samples have the same mean, we can lump them into one big bin for the frequency distribution. There are some nice built-in functions to do a quick reshape and fit.

```matlab
% resize
signal_arzr = reshape(signal_arz,[1,nT*np]);
size(signal_arzr)  % check size
% fit
pdfit_model = fitdist(signal_arzr','normal');  % fit
x_a = linspace(-15,15,100);
pdfit_a = pdf(pdfit_model,x_a);
pdf_a = normpdf(x_a,0,s);  % theoretical pdf
```

    ans =
         1  252

Plot:

```matlab
h = figure;
histogram(signal_arzr, ...
    round(s*sqrt(nT)), ...             % number of bins
    'normalization','probability' ...  % for PMF
);
hold on
plot(x_a,pdfit_a,'b-','linewidth',2); hold on
plot(x_a,pdf_a,'g--','linewidth',2)
legend('pmf','pdf est','pdf')
xlabel('zero-mean temperature (C)')
ylabel('probability mass/density')
hgsave(h,'figures/temp');
```

Figure 043: PMF and estimated and theoretical PDFs

Means comparison

The sample mean of means is simply the following.

```matlab
mu_mu = my_mean(mu_a)
```

    mu_mu =
        35.1175

The standard deviation that works as an error bar, which should reflect how well we can estimate the point plotted, is the standard deviation of the means. It is difficult to compute this directly for a nonstationary process. We use the estimate given above and improve upon it by using the mean of standard deviations instead of a single sample's standard deviation.

```matlab
s_mu = mean(s_a)/sqrt(nT)
```

    s_mu =
        1.1580

Now for the simple plot.

```matlab
h = figure;
bar(mu_mu); hold on                      % gives bar
errorbar(mu_mu,s_mu,'r','linewidth',2)   % error bar
ax = gca;                                % current axis
ax.XTickLabels = {'$\overline{\overline{X}}$'};
ax.TickLabelInterpreter = 'latex';
hgsave(h,'figures/temp');
```

Standard deviations comparison

The sample mean of standard deviations is simply the following:

mu_s = my_mean(s_a)


mu_s =

    4.0114

The standard deviation that works as an error bar, which should reflect how well we can estimate the plotted point, is the standard deviation of the standard deviations. We can compute this directly:

s_s = my_std(s_a)

s_s =

    0.8495

Now for the simple plot

h = figure;
bar(mu_s); hold on % gives bar
errorbar(mu_s,s_s,'r','linewidth',2); % error bars
ax = gca; % current axis
ax.XTickLabels = {'$\overline{S_X}$'};
ax.TickLabelInterpreter = 'latex';
hgsave(h,'figures/temp');

Figure 04.4: sample mean of sample means

Figure 04.5: sample standard deviation of sample means

Plot a period with error bars

Plotting the data with error bars is fairly straightforward with the built-in errorbar function. The main question is "which standard deviation?" Since we're plotting the means, it makes sense to plot the error bars as a single sample standard deviation of the means.

h = figure;
e1 = errorbar(t_a(1:np),mu_a,s_mu*ones(1,np),'b');
hold on
t_a2 = linspace(0,1/f,101);
e2 = plot(t_a2,dc + a*sin(2*pi*f*t_a2),'r-');
xlim([t_a(1),t_a(np)]);
grid on
xlabel('folded time (s)');
ylabel('temperature (C)');
legend([e1,e2],'sample mean','population mean', ...
  'Location','NorthEast');
hgsave(h,'figures/temp');

Figure 04.6: sample means over a period


confidence

central limit theorem

stats.confidence Confidence

One really ought to have it to give a lecture named it, but we'll give it a try anyway. Confidence is used in the common sense, although we do endow it with a mathematical definition, to scare business majors, who aren't actually impressed but indifferent. Approximately: if, under some reasonable assumptions (a probabilistic model), we estimate the probability of some event to be P, we say we have P confidence in it. I mean, business majors are all "Supply and demand? Let's call that a 'law,'" so I think we're even.

So we're back to computing probability from distributions: probability density functions (PDFs) and probability mass functions (PMFs). Usually we care most about estimating the mean of our distribution. Recall from the previous lecture that when several samples are taken, each with its own mean, the mean is itself a random variable (with a mean of its own, of course). Meanception.

But the mean has a probability distribution of its own. The central limit theorem has, as one of its implications, that as the sample size N gets large, regardless of the samples' distributions, this distribution of means approaches the Gaussian distribution.

But I sometimes worry I'm being lied to, so let's check.


clear; close all % clear kernel

Generate some data to test the central

limit theorem

Data can be generated by constructing an array using a (seeded, for consistency) random number generator. Let's use a uniformly distributed PDF between 0 and 1.

N = 150; % sample size (number of measurements per sample)
M = 120; % number of samples
n = N*M; % total number of measurements
mu_pop = 0.5; % because it's a uniform PDF between 0 and 1

rng(11); % seed the random number generator
signal_a = rand(N,M); % uniform PDF
size(signal_a) % check the size

ans =

150 120

Let's take a look at the data by plotting the first ten samples (columns), as shown in Figure 04.7. This is something like what we might see for continuous measurement data. Now consider the histogram shown in Figure 04.8.

samples_to_plot = 10;
h = figure;


Figure 04.7: raw data with colors corresponding to samples

Figure 04.8: a histogram showing the approximately uniform distribution of each sample (color)

c = jet(samples_to_plot); % color array
for j = 1:samples_to_plot
  histogram(signal_a(:,j), ...
    30, ... % number of bins
    'facecolor',c(j,:), ...
    'facealpha',.3, ...
    'normalization','probability' ... % for PMF
  );
  hold on
end
hold off
xlim([-.05,1.05]);
xlabel('measurement');
ylabel('probability');
hgsave(h,'figures/temp');

This isn't a great plot, but it shows roughly that each sample is fairly uniformly distributed.


Sample statistics

Now let's check out the sample statistics. We want the sample mean and standard deviation of each column. Let's use the built-in functions mean and std.

mu_a = mean(signal_a,1); % mean of each column
s_a = std(signal_a,0,1); % std of each column

Now we can compute the mean statistics: both the mean of the means and the standard deviation of the means, which we don't strictly need for this part, but we're curious. We choose to use the direct estimate instead of the $S_X/\sqrt{N}$ formula, but they should be close.

mu_mu = mean(mu_a)
s_mu = std(mu_a)

mu_mu =

    0.4987

s_mu =

    0.0236

The truth about sample means

It's the moment of truth. Let's look at the distribution shown in Figure 04.9.


Figure 04.9: a histogram showing the approximately normal distribution of the means

h = figure;
histogram(mu_a, ...
  'normalization','probability' ... % for PMF
);
hold off
xlabel('measurement');
ylabel('probability');
hgsave(h,'figures/temp');

This looks like a Gaussian distribution about the mean of means, so I guess the central limit theorem is legit.
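For readers following along without Matlab, the same claim can be checked numerically with a quick Python sketch (illustrative only, not part of the lecture's code): the standard deviation of the sample means should approach σ/√N, where σ = 1/√12 for the uniform parent distribution.

```python
import random
from statistics import mean, pstdev

random.seed(11)  # seed for reproducibility
N, M = 150, 120  # sample size and number of samples, as above

# M sample means, each computed from N uniform measurements on [0, 1)
means = [mean(random.random() for _ in range(N)) for _ in range(M)]

# CLT prediction: std of the means ~ sigma/sqrt(N), with sigma = 1/sqrt(12)
predicted = (1 / 12) ** 0.5 / N ** 0.5  # ~ 0.0236
observed = pstdev(means)
```

The observed spread of the means lands close to the predicted value, consistent with the Matlab result above.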

Gaussian and probability

We already know how to compute the probability P that a value of a random variable X lies in a certain interval from a PMF or PDF (the sum or the integral, respectively). This means that, for sufficiently large sample size N, such that we can assume from the central limit theorem that the sample means are normally distributed, the probability a sample mean value is in a certain interval is given by integrating the Gaussian PDF. The Gaussian PDF for a random variable Y representing the sample means is

f(y) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left( \frac{-(y - \mu)^2}{2\sigma^2} \right)  (confidence.1)


Gaussian cumulative distribution function

where μ is the population mean and σ is the population standard deviation.

The integral of f over some interval is the probability a value will be in that interval. Unfortunately, that integral is uncool. It gives rise to the definition of the error function, which for the Gaussian random variable Y is

\operatorname{erf}(y_b) = \frac{1}{\sqrt{\pi}} \int_{-y_b}^{y_b} e^{-t^2}\, dt  (confidence.2)

This expresses the probability of a sample mean being in the interval [-y_b, y_b] if Y has mean 0 and variance 1/2.

Matlab has this built-in as erf, shown in Figure 04.10.

y_a = linspace(0,3,100);
h = figure;
p1 = plot(y_a,erf(y_a));
p1.LineWidth = 2;
grid on
xlabel('interval bound $y_b$','interpreter','latex');
ylabel('error function $\textrm{erf}(y_b)$', ...
  'interpreter','latex');
hgsave(h,'figures/temp');

We could deal directly with the error function, but most people don't, and we're weird enough as it is. Instead, people use the Gaussian cumulative distribution function


Figure 04.10: the error function

(CDF) \Phi: \mathbb{R} \to \mathbb{R}, which is defined as

\Phi(z) = \frac{1}{2}\left(1 + \operatorname{erf}\left(\frac{z}{\sqrt{2}}\right)\right)  (confidence.3)

and which expresses the probability of a Gaussian random variable Z, with mean 0 and standard deviation 1, taking on a value in the interval (-∞, z]. The Gaussian CDF and PDF are shown in Figure 04.11. Values can be taken directly from the graph, but it's more accurate to use the table of values in Appendix A01.

That's great and all, but occasionally (always) we have Gaussian random variables with nonzero means and nonunity standard deviations. It turns out we can shift any Gaussian random variable by its mean and scale it by its standard deviation to make it have zero mean and unity standard deviation. We can then use Φ and interpret the results as being relative to the mean and standard deviation, using phrases like "the probability it is within two standard deviations of its mean." The transformed random variable Z and its values z are sometimes called the z-score. For a particular value x of a random variable X, we can compute its z-score (or value z of


Figure 04.11: the Gaussian PDF and CDF for z-scores: (a) the Gaussian PDF f(z), with Φ(z_b) the shaded area up to the interval bound z_b; (b) the Gaussian CDF Φ(z_b)

random variable Z) with the formula

z = \frac{x - \mu_X}{\sigma_X}  (confidence.4)

and compute the probability of X taking on a value within an interval, say x ∈ [x_{b-}, x_{b+}], from the table. (Sample statistics $\overline{X}$ and $S_X$ are appropriate when population statistics are unknown.)

For instance, compute the probability a Gaussian random variable X with μ_X = 5 and σ_X = 2.34 takes on a value within the interval x ∈ [3, 6]:

1. Compute the z-score of each endpoint of the interval:


z_3 = \frac{3 - \mu_X}{\sigma_X} \approx -0.85  (confidence.5)

z_6 = \frac{6 - \mu_X}{\sigma_X} \approx 0.43  (confidence.6)

2. Look up the CDF values for z_3 and z_6, which are Φ(z_3) = 0.1977 and Φ(z_6) = 0.6664.
3. The CDF values correspond to the probabilities x < 3 and x < 6. Therefore, to find the probability x lies in that interval, we subtract the lower-bound probability:

P(x ∈ [3, 6]) = P(x < 6) - P(x < 3)  (confidence.7)
= \Phi(z_6) - \Phi(z_3)  (confidence.8)
≈ 0.6664 - 0.1977  (confidence.9)
≈ 0.4687  (confidence.10)

So there is a 46.87% probability, and therefore we have 46.87% confidence, that x ∈ [3, 6].
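The same computation can be checked numerically. Here is an illustrative Python sketch (not the lecture's Matlab) using the error-function identity Φ(z) = (1 + erf(z/√2))/2 instead of a table:

```python
from math import erf, sqrt

def phi(z):
    """Standard Gaussian CDF, via the error function."""
    return 0.5 * (1 + erf(z / sqrt(2)))

mu_X, sigma_X = 5, 2.34
z3 = (3 - mu_X) / sigma_X  # ~ -0.85
z6 = (6 - mu_X) / sigma_X  # ~ 0.43
p = phi(z6) - phi(z3)      # ~ 0.469, matching the table-based result
```

The small difference from the table-based value comes only from the table's two-decimal z-scores.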

Often we want to go the other way: estimating the symmetric interval [x_{b-}, x_{b+}] for which there is a given probability. In this case, we first look up the z-score corresponding to a certain probability. For concreteness, given the same population statistics above, let's find the symmetric interval [x_{b-}, x_{b+}] over which we have 90% confidence. From the table, we want two symmetric z-scores that have CDF-value difference 0.9. Or, in maths,

\Phi(z_{b+}) - \Phi(z_{b-}) = 0.9 \quad \text{and} \quad z_{b+} = -z_{b-}  (confidence.11)


Due to the latter relation, we have the additional fact that the Gaussian CDF has the antisymmetry

\Phi(z_{b+}) + \Phi(z_{b-}) = 1.  (confidence.12)

Adding the two Φ equations,

\Phi(z_{b+}) = 1.9/2  (confidence.13)
= 0.95  (confidence.14)

and Φ(z_{b-}) = 0.05. From the table, these correspond (with a linear interpolation) to z_b = z_{b+} = -z_{b-} ≈ 1.645. All that remains is to solve the z-score formula for x:

x = \mu_X + z \sigma_X.  (confidence.15)

From this,

x_{b+} = \mu_X + z_{b+} \sigma_X \approx 8.849  (confidence.16)
x_{b-} = \mu_X + z_{b-} \sigma_X \approx 1.151  (confidence.17)

and X has a 90% confidence interval [1.151, 8.849].
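Going the other way is an inverse-CDF lookup. In Python, the standard library's NormalDist does the interpolation the table requires (again an illustrative sketch, not the lecture's Matlab):

```python
from statistics import NormalDist

mu_X, sigma_X = 5, 2.34
z_b = NormalDist().inv_cdf(0.95)  # ~ 1.645, since Phi(z_b+) = 0.95
x_hi = mu_X + z_b * sigma_X       # ~ 8.849
x_lo = mu_X - z_b * sigma_X       # ~ 1.151
```

This reproduces the interval [1.151, 8.849] without a table lookup.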


Example stats.confidence-1: Gaussian confidence for a mean

Problem Statement: Consider the data set generated above. What is our 95% confidence interval in our estimate of the mean?

Problem Solution: Assuming we have a sufficiently large data set, the distribution of means is approximately Gaussian. Following the same logic as above, we need the z-score that gives an upper CDF value of (1 + 0.95)/2 = 0.975. From the table, we obtain the z_b = z_{b+} = -z_{b-} below:

z_b = 1.96;

Now we can estimate the mean using our sample and mean statistics:

\mu_X \approx \overline{\overline{X}} \pm z_b S_{\overline{X}}  (confidence.18)

mu_x_95 = mu_mu + [-z_b,z_b]*s_mu

mu_x_95 =

    0.4526    0.5449


This is our 95% confidence interval in our estimate of the mean.
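The whole pipeline above (simulate, estimate the mean statistics, form the 95% interval) can be sketched in a few lines of Python. This is illustrative only; its seed and resulting interval differ slightly from the Matlab run.

```python
import random
from statistics import NormalDist, mean, stdev

random.seed(11)
N, M = 150, 120
samples = [[random.random() for _ in range(N)] for _ in range(M)]

mu_a = [mean(s) for s in samples]  # sample means
mu_mu = mean(mu_a)                 # mean of means
s_mu = stdev(mu_a)                 # standard deviation of means

z_b = NormalDist().inv_cdf(0.975)  # ~ 1.96
ci = (mu_mu - z_b * s_mu, mu_mu + z_b * s_mu)  # 95% confidence interval
```

As expected, the interval straddles the population mean 0.5 of the uniform distribution.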


Student's t-distribution
degrees of freedom

stats.student Student confidence

The central limit theorem tells us that, for large sample size N, the distribution of the means is Gaussian. However, for small sample size, the Gaussian isn't as good of an estimate. Student's t-distribution is superior for lower sample size and equivalent at higher sample size. Technically, if the population standard deviation σ_X is known, even for low sample size we should use the Gaussian distribution. However, this rarely arises in practice, so we can usually get away with an "always t" approach.

A way that the t-distribution accounts for low N is by having an entirely different distribution for each N (seems a bit of a cheat to me). Actually, instead of N, it uses the degrees of freedom ν, which is N minus the number of parameters required to compute the statistic. Since the standard deviation requires only the mean, for most of our cases ν = N - 1.

As with the Gaussian distribution, the t-distribution's integral is difficult to calculate. Typically we will use a t-table, such as the one given in Appendix A02. There are three points of note:

1. Since we are primarily concerned with going from probability/confidence values to intervals, typically there is a column for each probability.

2. The extra parameter ν takes over one of the dimensions of the table, because three-dimensional tables are illegal.

3. Many of these tables are "two-sided," meaning their t-scores and probabilities assume you want the symmetric probability about the mean over the interval [-t_b, t_b], where t_b is your t-score bound.

Consider the following example

Example stats.student-1: Student confidence interval

Problem Statement: Write a Matlab script to generate a data set with 200 samples and sample sizes N ∈ {10, 20, 100}, using any old distribution. Compare the distribution of the means for the different N. Use the sample distributions and a t-table to compute 99% confidence intervals.

Problem Solution

Generate the data set

confidence = 0.99; % requirement

M = 200; % number of samples
N_a = [10,20,100]; % sample sizes

mu = 27; % population mean
sigma = 9; % population std

rng(1); % seed random number generator
data_a = mu + sigma*randn(N_a(end),M); % normal


size(data_a) % check size
data_a(1:10,1:5) % check 10 rows and five columns

ans =

100 200

ans =

   21.1589   30.2894   27.8705   30.7835   28.3662
   37.6305   17.1264   28.2973   24.0811   34.3486
   20.1739   44.3719   43.7059   39.0699   32.2002
   17.0135   32.6064   36.9030   37.9230   36.5747
   19.3900   32.9156   23.7230   22.4749   19.7709
   21.8460   13.8295   31.2479   16.9527   34.1876
   21.9719   34.6854   19.4480   18.7014   24.1642
   28.6054   32.2244   22.2873   26.9906   37.6746
   25.2282   18.7326   14.5011   28.3814   27.7645
   32.2780   34.1538   27.0382   18.8643   14.1752

Compute the means for different sample sizes

mu_a = NaN*ones(length(N_a),M);
for i = 1:length(N_a)
  mu_a(i,:) = mean(data_a(1:N_a(i),1:M),1);
end

Plotting the distribution of the means yields Figure 04.12.

Figure 04.12: a histogram showing the distribution of the means for each sample size, with intervals marked for N = 10, 20, and 100

It makes sense that the larger the sample size, the smaller the spread. A quantitative metric for the spread is, of course, the standard deviation of the means for each sample size:

S_mu = std(mu_a,0,2)

S_mu =

S_mu =

    2.8365
    2.0918
    1.0097

Look up t-table values (or use Matlab's tinv) for the different sample sizes and 99% confidence. Use these, the mean of means, and the standard deviation of means to compute the 99% confidence interval for each N:

t_a = tinv(confidence,N_a-1)
for i = 1:length(N_a)
  interval = mean(mu_a(i,:)) + [-1,1]*t_a(i)*S_mu(i);
  disp(sprintf('interval for N = %i:',N_a(i)))
  disp(interval)
end


t_a =

    2.8214    2.5395    2.3646

interval for N = 10:
   19.0942   35.1000
interval for N = 20:
   21.6292   32.2535
interval for N = 100:
   24.7036   29.4787

As expected, the larger the sample size, the smaller the interval over which we have 99% confidence in the estimate.
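As a cross-check of the convergence claim (the t critical value approaches the Gaussian one as N grows), here is a small Python sketch comparing the t-scores printed by tinv above against the Gaussian inverse CDF. The hard-coded t values are taken from the output above, not recomputed.

```python
from statistics import NormalDist

# 99% one-sided t critical values from the tinv output above,
# keyed by sample size N (degrees of freedom nu = N - 1)
t_crit = {10: 2.8214, 20: 2.5395, 100: 2.3646}

z_crit = NormalDist().inv_cdf(0.99)  # Gaussian limit, ~ 2.326
# t_crit decreases toward z_crit as N increases
```

The t critical values are strictly larger than the Gaussian one, and close in on it with growing N.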


joint PDF

joint PMF

stats.multivar Multivariate probability and correlation

Thus far we have considered probability density and mass functions (PDFs and PMFs) of only one random variable. But, of course, often we measure multiple random variables X_1, X_2, ..., X_n during a single event, meaning a measurement consists of determining values x_1, x_2, ..., x_n of these random variables.

We can consider an n-tuple of continuous random variables to form a sample space Ω = R^n, on which a multivariate function f: R^n → R, called the joint PDF, assigns a probability density to each outcome x ∈ R^n. The joint PDF must be greater than or equal to zero for all x ∈ R^n, the multiple integral over Ω must be unity, and the multiple integral over a subset of the sample space A ⊂ Ω is the probability of the event A.

We can consider an n-tuple of discrete random variables to form a sample space N_0^n, on which a multivariate function f: N_0^n → R, called the joint PMF, assigns a probability to each outcome x ∈ N_0^n. The joint PMF must be greater than or equal to zero for all x ∈ N_0^n, the multiple sum over Ω must be unity, and the multiple sum over a subset of the sample space A ⊂ Ω is the probability of the event A.


Example stats.multivar-1: bivariate Gaussian PDF

Problem Statement: Let's visualize multivariate PDFs by plotting a bivariate Gaussian using Matlab's mvnpdf function.

Problem Solution:

mu = [10,20]; % means
Sigma = [1,0;0,2]; % covariance matrix (we'll talk about this)
x1_a = linspace( ...
  mu(1)-5*sqrt(Sigma(1,1)), ...
  mu(1)+5*sqrt(Sigma(1,1)),30);
x2_a = linspace( ...
  mu(2)-5*sqrt(Sigma(2,2)), ...
  mu(2)+5*sqrt(Sigma(2,2)),30);
[X1,X2] = meshgrid(x1_a,x2_a);
f = mvnpdf([X1(:) X2(:)],mu,Sigma);
f = reshape(f,length(x2_a),length(x1_a));

h = figure;
p = surf(x1_a,x2_a,f);
xlabel('$x_1$','interpreter','latex');
ylabel('$x_2$','interpreter','latex');
zlabel('$f(x_1,x_2)$','interpreter','latex');
shading interp
hgsave(h,'figures/temp');

The result is Figure 04.13. Note how the means and standard deviations affect the distribution.

Figure 04.13: two-variable Gaussian PDF

marginal PDF

Marginal probability

The marginal PDF of a multivariate PDF is the PDF of some subspace of Ω after one or more variables have been "integrated out," such that fewer random variables remain. Of course, these marginal PDFs must have the same properties as any PDF, such as integrating to unity.

Example stats.multivar-2: bivariate Gaussian marginal probability

Problem Statement: Let's demonstrate this by numerically integrating over x_2 in the bivariate Gaussian above.

Problem Solution: Continuing from where we left off, let's integrate:

f1 = trapz(x2_a,f,1); % trapezoidal integration over x_2

Plotting this yields Figure 04.14.

h = figure;
p = plot(x1_a,f1);
p.LineWidth = 2;
xlabel('$x_1$','interpreter','latex');
ylabel('$g(x_1)=\int_{-\infty}^{\infty} f(x_1,x_2)\, d x_2$', ...
  'interpreter','latex');
hgsave(h,'figures/temp');

Figure 04.14: marginal Gaussian PDF g(x_1)

We should probably verify that this integrates to one:

disp(['integral over x_1 = ',sprintf('%0.7f',trapz(x1_a,f1))])

integral over x_1 = 0.9999986

Not bad
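The same marginalization can be reproduced without Matlab. This Python sketch (illustrative, with the example's means and variances hard-coded) integrates the bivariate Gaussian over x_2 with the trapezoid rule and checks the peak of the marginal against the 1-D Gaussian value 1/√(2π):

```python
from math import exp, pi, sqrt

MU1, MU2, VAR1, VAR2 = 10.0, 20.0, 1.0, 2.0  # the example's parameters

def f(x1, x2):
    """Bivariate Gaussian with independent components (diagonal Sigma)."""
    g1 = exp(-(x1 - MU1) ** 2 / (2 * VAR1)) / sqrt(2 * pi * VAR1)
    g2 = exp(-(x2 - MU2) ** 2 / (2 * VAR2)) / sqrt(2 * pi * VAR2)
    return g1 * g2

def trapz(ys, xs):
    """Trapezoidal rule, like Matlab's trapz."""
    return sum((ys[i] + ys[i + 1]) * (xs[i + 1] - xs[i]) / 2
               for i in range(len(xs) - 1))

# integrate out x2 on a grid spanning about +/- 5 standard deviations
x2s = [MU2 - 7 + 14 * i / 200 for i in range(201)]
g_peak = trapz([f(MU1, x2) for x2 in x2s], x2s)  # marginal at x1 = MU1
```

The marginal at the mean matches the peak of a unit-variance 1-D Gaussian, as it should when the components are independent.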

Covariance

artificial intelligence
covariance
correlation
sample covariance
covariance matrix

Very often, especially in machine learning or artificial intelligence applications, the question about two random variables X and Y is: how do they co-vary? That is, what is their covariance, defined as

\operatorname{Cov}[X,Y] \equiv \operatorname{E}\left((X - \mu_X)(Y - \mu_Y)\right) = \operatorname{E}(XY) - \mu_X \mu_Y.

Note that when X = Y, the covariance is just the variance. When a covariance is large and positive, it is an indication that the random variables are strongly correlated; when it is large and negative, they are strongly anti-correlated. Zero covariance means the variables are uncorrelated. In fact, correlation is defined as

\operatorname{Cor}[X,Y] = \frac{\operatorname{Cov}[X,Y]}{\sqrt{\operatorname{Var}[X]\,\operatorname{Var}[Y]}}.

This is essentially the covariance "normalized" to the interval [-1, 1].
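A quick numerical illustration in Python (a sketch, not the lecture's code): for an exactly linear relationship, the correlation lands at a boundary of that interval.

```python
from math import sqrt

def cov(x, y):
    """Sample covariance with the 1/(N-1) normalization."""
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    return sum((a - xb) * (b - yb) for a, b in zip(x, y)) / (n - 1)

def cor(x, y):
    """Correlation: covariance normalized to [-1, 1]."""
    return cov(x, y) / sqrt(cov(x, x) * cov(y, y))

x = [1.0, 2.0, 3.0, 4.0]
r_pos = cor(x, [2 * v for v in x])   # exactly linear: +1
r_neg = cor(x, [-2 * v for v in x])  # anti-correlated: -1
```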

Sample covariance

As with the other statistics we've considered, covariance can be estimated from measurement. The estimate, called the sample covariance q_{XY} of random variables X and Y, with sample size N, is given by

q_{XY} = \frac{1}{N - 1} \sum_{i=1}^{N} (x_i - \overline{X})(y_i - \overline{Y}).

Multivariate covariance

sample covariance matrix

With n random variables X_i, one can compute the covariance of each pair. It is common practice to define an n × n matrix of covariances, called the covariance matrix Σ, such that each pair's covariance

\operatorname{Cov}[X_i, X_j]  (multivar.1)

appears in its row-column combination (making the matrix symmetric), as shown below:

\Sigma = \begin{bmatrix}
\operatorname{Cov}[X_1,X_1] & \operatorname{Cov}[X_1,X_2] & \cdots & \operatorname{Cov}[X_1,X_n] \\
\operatorname{Cov}[X_2,X_1] & \operatorname{Cov}[X_2,X_2] & \cdots & \operatorname{Cov}[X_2,X_n] \\
\vdots & \vdots & & \vdots \\
\operatorname{Cov}[X_n,X_1] & \operatorname{Cov}[X_n,X_2] & \cdots & \operatorname{Cov}[X_n,X_n]
\end{bmatrix}

The multivariate sample covariance matrix Q is the same as above, but with sample covariances q_{X_i X_j}. Both covariance matrices have correlation analogs.
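A minimal Python sketch of building Q from columns of data (illustrative; in practice a library routine such as numpy's cov would be used):

```python
def sample_cov(x, y):
    """Sample covariance q_XY with the 1/(N-1) normalization."""
    n = len(x)
    xb, yb = sum(x) / n, sum(y) / n
    return sum((a - xb) * (b - yb) for a, b in zip(x, y)) / (n - 1)

def cov_matrix(cols):
    """n-by-n sample covariance matrix Q for a list of data columns."""
    return [[sample_cov(ci, cj) for cj in cols] for ci in cols]

cols = [[1.0, 2.0, 3.0], [2.0, 4.0, 6.0]]
Q = cov_matrix(cols)  # symmetric, with variances on the diagonal
```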

Example stats.multivar-3: car data sample covariance and correlation

Problem Statement: Let's use a built-in multivariate data set that describes different features of cars, listed below:

d = load('carsmall.mat'); % this is a struct

Let's compute the sample covariance and correlation matrices.

Problem Solution:

variables = {'MPG','Cylinders','Displacement', ...
  'Horsepower','Weight','Acceleration','Model_Year'};
n = length(variables);
m = length(d.MPG);
data = NaN*ones(m,n); % preallocate
for i = 1:n
  data(:,i) = d.(variables{i});
end
cov_d = nancov(data) % sample covariance
cor_d = corrcov(cov_d) % sample correlation

This is highly correlated/anticorrelated data. Let's plot each variable versus each other variable to see the correlations of each. We use a red-to-blue colormap to contrast anticorrelation and correlation; purple, then, is uncorrelated.

The following builds the red-to-blue colormap:

n_colors = 10;
cmap_rb = NaN*ones(n_colors,3);
for i = 1:n_colors
  a = i/n_colors;
  cmap_rb(i,:) = (1-a)*[1,0,0]+a*[0,0,1];
end

h = figure;
for i = 1:n
  for j = 1:n
    subplot(n,n,sub2ind([n,n],j,i));
    p = plot(d.(variables{i}),d.(variables{j}),'.');
    hold on
    this_color = cmap_rb( ...
      round((cor_d(i,j)+1)*(n_colors-1)/2)+1,:);
    p.MarkerFaceColor = this_color;
    p.MarkerEdgeColor = this_color;
  end
end
hgsave(h,'figures/temp');

Figure 04.15: car data correlation

Conditional probability and dependence

Independent variables are uncorrelated. However, uncorrelated variables may or may not be independent. Therefore, we cannot use correlation alone as a test for independence. For instance, for random variables X and Y, where X has some even distribution and Y = X², clearly the variables are dependent. However, they are also uncorrelated (due to symmetry).


Example stats.multivar-4: uncorrelated but dependent variables

Problem Statement: Using a uniform distribution U(-1, 1), show that X and Y are uncorrelated (but dependent), with Y = X², with some sampling. We compute the correlation for different sample sizes.

Problem Solution:

N_a = round(linspace(10,500,100));
qc_a = NaN*ones(size(N_a));
rng(6)
x_a = -1 + 2*rand(max(N_a),1);
y_a = x_a.^2;
for i = 1:length(N_a)
  % should write incremental algorithm, but lazy
  q = cov(x_a(1:N_a(i)),y_a(1:N_a(i)));
  qc = corrcov(q);
  qc_a(i) = qc(2,1); % cross correlation
end

The absolute values of the correlations are shown in Figure 04.16. Note that we should probably average several such curves to estimate how the correlation would drop off with N, but the single curve describes our understanding that the correlation in fact approaches zero in the large-sample limit.

h = figure;
p = plot(N_a,abs(qc_a));
p.LineWidth = 2;
xlabel('sample size $N$','interpreter','latex');
ylabel('absolute sample correlation', ...
  'interpreter','latex');
hgsave(h,'figures/temp');

Figure 04.16: absolute value of the sample correlation between X ~ U(-1, 1) and Y = X² for different sample size N. In the limit, the population correlation should approach zero, and yet X and Y are not independent.


stats.regression Regression

Suppose we have a sample with two measurands: (1) the force F through a spring and (2) its displacement X (not from equilibrium). We would like to determine an analytic function that relates the variables, perhaps for prediction of the force given another displacement. There is some variation in the measurement. Let's say the following is the sample:

X_a = 1e-3*[10,21,30,41,49,50,61,71,80,92,100]; % m
F_a = [50.1,50.4,53.2,55.9,57.2,59.9, ...
  61.0,63.9,67.0,67.9,70.3]; % N

Let's take a look at the data. The result is Figure 04.17:

h = figure;
p = plot(X_a*1e3,F_a,'.b','MarkerSize',15);
xlabel('$X$ (mm)','interpreter','latex');
ylabel('$F$ (N)','interpreter','latex');
xlim([0,max(X_a*1e3)]);
grid on
hgsave(h,'figures/temp');

Figure 04.17: force-displacement data

How might we find an analytic function that agrees with the data? Broadly, our strategy will be to assume a general form of a function and use the data to set the parameters in the function such that the difference between the data and the function is minimal.

Let y be the analytic function that we would like to fit to the data. Let y_i denote the value of y(x_i), where x_i is the ith value of the random variable X from the sample. Then we want to minimize the differences between the force measurements F_i and y_i.

From calculus, recall that we can minimize a function by differentiating it and solving for the zero-crossings (which correspond to local maxima or minima).

First, we need such a function to minimize. Perhaps the simplest effective function D is constructed by squaring and summing the differences we want to minimize; for sample size N,

D(x_i) = \sum_{i=1}^{N} (F_i - y_i)^2  (regression.1)

(recall that y_i = y(x_i), which makes D a function of x).

Now the form of y must be chosen. We consider only mth-order polynomial functions y, but others can be used in a similar manner:

y(x) = a_0 + a_1 x + a_2 x^2 + \cdots + a_m x^m  (regression.2)

If we treat D as a function of the polynomial coefficients a_j, i.e.

D(a_0, a_1, \cdots, a_m),  (regression.3)

and minimize D for each value of x_i, we must take the partial derivatives of D with respect to each a_j and set each equal to zero:

\frac{\partial D}{\partial a_0} = 0, \quad \frac{\partial D}{\partial a_1} = 0, \quad \cdots, \quad \frac{\partial D}{\partial a_m} = 0.

This gives us N equations and m + 1 unknowns a_j.

Writing the system in matrix form,

\underbrace{\begin{bmatrix}
1 & x_1 & x_1^2 & \cdots & x_1^m \\
1 & x_2 & x_2^2 & \cdots & x_2^m \\
1 & x_3 & x_3^2 & \cdots & x_3^m \\
\vdots & \vdots & \vdots & & \vdots \\
1 & x_N & x_N^2 & \cdots & x_N^m
\end{bmatrix}}_{A \;\; N \times (m+1)}
\underbrace{\begin{bmatrix} a_0 \\ a_1 \\ a_2 \\ \vdots \\ a_m \end{bmatrix}}_{a \;\; (m+1) \times 1}
=
\underbrace{\begin{bmatrix} F_1 \\ F_2 \\ F_3 \\ \vdots \\ F_N \end{bmatrix}}_{b \;\; N \times 1}  (regression.4)

Typically N > m, and this is an overdetermined system. Therefore, we usually can't solve by taking A^{-1}, because A doesn't have an inverse. Instead, we either find the Moore-Penrose pseudo-inverse A† and have a = A†b as the solution, which is inefficient, or we can approximate b with an algorithm such as that used by Matlab's \ operator. In the latter case,

a_a = A\b_a;
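The first-order case of (regression.4) can also be solved by hand via the normal equations A^T A a = A^T b. This Python sketch (illustrative, using the sample values from the text with X in mm) fits a line to the spring data:

```python
# least-squares line fit F ~ a0 + a1*x to the spring sample (x in mm)
X = [10, 21, 30, 41, 49, 50, 61, 71, 80, 92, 100]
F = [50.1, 50.4, 53.2, 55.9, 57.2, 59.9, 61.0, 63.9, 67.0, 67.9, 70.3]

n = len(X)
Sx, Sf = sum(X), sum(F)
Sxx = sum(x * x for x in X)
Sxf = sum(x * f for x, f in zip(X, F))

# solve the 2x2 normal equations for [a0, a1]
det = n * Sxx - Sx * Sx
a1 = (n * Sxf - Sx * Sf) / det    # slope, N/mm (spring stiffness)
a0 = (Sf * Sxx - Sx * Sxf) / det  # intercept, N
```

The slope estimates the spring stiffness, and the intercept accounts for X not being measured from equilibrium.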

Example stats.regression-1: regression

Problem Statement: Use Matlab's \ operator to find a good polynomial fit for the sample. There's the sometimes-difficult question: "what order should we fit?" Let's try out several and see what the squared-differences function D gives.

Problem Solution:

N = length(X_a); % sample size
m_a = 2:N; % all the orders up to N
A = NaN*ones(length(m_a),max(m_a),N);
for k = 1:length(m_a) % each order
  for j = 1:N % each measurement
    for i = 1:( m_a(k) + 1 ) % each coef
      A(k,j,i) = X_a(j)^(i-1);
    end
  end
end
disp(squeeze(A(2,:,1:5)))

    1.0000    0.0100    0.0001    0.0000       NaN
    1.0000    0.0210    0.0004    0.0000       NaN
    1.0000    0.0300    0.0009    0.0000       NaN
    1.0000    0.0410    0.0017    0.0001       NaN
    1.0000    0.0490    0.0024    0.0001       NaN
    1.0000    0.0500    0.0025    0.0001       NaN
    1.0000    0.0610    0.0037    0.0002       NaN
    1.0000    0.0710    0.0050    0.0004       NaN
    1.0000    0.0800    0.0064    0.0005       NaN
    1.0000    0.0920    0.0085    0.0008       NaN
    1.0000    0.1000    0.0100    0.0010       NaN

We've printed the first five columns of the third-order matrix, which only has four columns, so NaNs fill in the rest. Now we can use the \ operator to solve for the coefficients:

a = NaN*ones(length(m_a),max(m_a));

warning('off','all')
for i = 1:length(m_a)
  A_now = squeeze(A(i,:,1:m_a(i)));
  a(i,1:m_a(i)) = (A_now(:,1:m_a(i))\F_a')';
end
warning('on','all')

n_plot = 100;
x_plot = linspace(min(X_a),max(X_a),n_plot);
y = NaN*ones(n_plot,length(m_a)); % preallocate
for i = 1:length(m_a)
  y(:,i) = polyval(fliplr(a(i,1:m_a(i))),x_plot);
end

h = figure;
for i = 1:2:length(m_a)-1
  p = plot(x_plot*1e3,y(:,i),'linewidth',1.5);
  hold on
  p.DisplayName = sprintf( ...
    'order %i',(m_a(i)-1) ...
  );
end
p = plot(X_a*1e3,F_a,'.b','MarkerSize',15);
xlabel('$X$ (mm)','interpreter','latex');
ylabel('$F$ (N)','interpreter','latex');
p.DisplayName = 'sample';
legend('show','location','southeast');
grid on
hgsave(h,'figures/temp');

Figure 04.18: force-displacement data with curve fits (orders 1, 3, 5, 7, 9)


stats.ex Exercises for Chapter 04

calculus

limit

series

derivative

integral

vector calculus

vecs

Vector calculus

A great many physical situations of interest to engineers can be described by calculus. It can describe how quantities continuously change over (say) time and gives tools for computing other quantities. We assume familiarity with the fundamentals of calculus: limit, series, derivative, and integral. From these and a basic grasp of vectors, we will outline some of the highlights of vector calculus. Vector calculus is particularly useful for describing the physics of, for instance, the following:

mechanics of particles, wherein is studied the motion of particles and the forcing causes thereof;

rigid-body mechanics, wherein is studied the motion, rotational and translational, and its forcing causes, of bodies considered rigid (undeformable);

solid mechanics, wherein is studied the motion and deformation, and their forcing causes, of continuous solid bodies (those that retain a specific resting shape);


complex analysis

1. For an introduction to complex analysis, see Kreyszig (Kreyszig, Advanced Engineering Mathematics, Part D).
2. Ibid., Chapters 9, 10.
3. H.M. Schey. Div, Grad, Curl, and All That: An Informal Text on Vector Calculus. W.W. Norton, 2005. ISBN: 9780393925166.

euclidean vector space

bases

components

basis vectors

invariant

coordinate system

Cartesian coordinates

fluid mechanics, wherein is studied the motion, and its forcing causes, of fluids (liquids, gases, plasmas);

heat transfer, wherein is studied the movement of thermal energy through and among bodies; and

electromagnetism, wherein is studied the motion, and its forcing causes, of electrically charged particles.

This last example was, in fact, very influential in the original development of both vector calculus and complex analysis.¹ It is not an exaggeration to say that the topics above comprise the majority of physical topics of interest in engineering.

A good introduction to vector calculus is given by Kreyszig.² Perhaps the most famous and enjoyable treatment is given by Schey,³ in the adorably titled Div, Grad, Curl, and All That.

It is important to note that, in much of what follows, we will describe space (typically the three-dimensional space of our lived experience) as a euclidean vector space: an n-dimensional vector space isomorphic to R^n. As we know from linear algebra, any vector v ∈ R^n can be expressed in any number of bases. That is, the vector v is a basis-free object with multiple basis representations. The components and basis vectors of a vector change with basis changes, but the vector itself is invariant. A coordinate system is, in fact, just a basis. We are most familiar, of course, with Cartesian

4. John M. Lee, Introduction to Smooth Manifolds, second ed. Vol. 218. Graduate Texts in Mathematics. Springer, 2012.
5. Francesco Bullo and Andrew D. Lewis, Geometric Control of Mechanical Systems: Modeling, Analysis, and Design for Simple Mechanical Control Systems. Ed. by J. E. Marsden, L. Sirovich, and M. Golubitsky. Springer, 2005.

coordinates, which is the specific orthonormal basis b for Rⁿ:

b1 = [1, 0, …, 0]ᵀ, b2 = [0, 1, …, 0]ᵀ, ⋯, bn = [0, 0, …, 1]ᵀ.   (vecs.1)

Manifolds are spaces that appear locally as Rⁿ but can be globally rather different, and can describe non-euclidean geometry, wherein euclidean geometry's parallel postulate is invalid. Calculus on manifolds is the focus of differential geometry, a subset of which we can consider our current study. A motivation for further study of differential geometry is that it is very convenient when dealing with advanced applications of mechanics, such as the rigid-body mechanics of robots and vehicles. A very nice mathematical introduction is given by Lee,4 and Bullo and Lewis5 give a compact presentation in the context of robotics.

Vector fields have several important properties of interest we'll explore in this chapter. Our goal is to gain an intuition of these properties and be able to perform basic calculations.

6. Schey, Div, Grad, Curl, and All That, pp. 21-30.
7. Kreyszig, Advanced Engineering Mathematics, § 10.6.

vecs.div Divergence, surface integrals, and flux

Flux and surface integrals

Consider a surface S. Let r(u, v) = [x(u, v), y(u, v), z(u, v)] be a parametric position vector on a Euclidean vector space R³. Furthermore, let F : R³ → R³ be a vector-valued function of r, and let n be a unit-normal vector on the surface S. The surface integral is

∬_S F · n dS,   (div.1)

which integrates the normal component of F over the surface. We call this quantity the flux of F out of the surface S. This terminology comes from fluid flow, for which the flux is the mass flow rate out of S. In general, the flux is a measure of a quantity (or field) passing through a surface. For more on computing surface integrals, see Schey6 and Kreyszig.7
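As a concrete sketch (an example of our own choosing, using sympy, not one from the references): the flux of the field F(x, y, z) = [x, y, z] out of the unit sphere can be computed by parameterizing the surface and integrating F · n dS.

```python
import sympy as sp

u, v = sp.symbols('u v')  # surface parameters: polar angle u, azimuth v
# parametric position vector r(u, v) on the unit sphere
r = sp.Matrix([sp.sin(u)*sp.cos(v), sp.sin(u)*sp.sin(v), sp.cos(u)])
# n dS comes from the cross product of the tangent vectors
ndS = r.diff(u).cross(r.diff(v))  # outward-pointing for this parameter order
F = r  # the field F(x, y, z) = [x, y, z], evaluated on the surface
flux = sp.integrate(sp.simplify(F.dot(ndS)), (u, 0, sp.pi), (v, 0, 2*sp.pi))
print(flux)  # 4*pi, since F·n = 1 everywhere on the unit sphere
```

Since F · n = 1 on the sphere, the flux equals the sphere's surface area, 4π.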

Continuity

Consider the flux out of a surface S that encloses a volume ΔV, divided by that volume:

(1/ΔV) ∬_S F · n dS.   (div.2)

This gives a measure of flux per unit volume for a volume of space. Consider its physical meaning when we interpret this as fluid flow: all fluid that enters the volume is negative flux, and all that leaves is positive. If physical conditions are such that we expect no fluid to enter or exit the volume via what is called a source or a sink, and if we assume the density of the fluid is uniform (this is called an incompressible fluid), then all the fluid that

enters the volume must exit, and we get

(1/ΔV) ∬_S F · n dS = 0.   (div.3)

This is called a continuity equation, although typically this name is given to equations of the form in the next section. This equation is one of the governing equations of continuum mechanics.

Divergence

Let's take the flux-per-volume as the volume ΔV → 0; we obtain the following.

Equation div.4: divergence, integral form

lim_{ΔV→0} (1/ΔV) ∬_S F · n dS

This is called the divergence of F and is defined at each point in R³ by taking the volume to zero about it. It is given the shorthand div F.

What interpretation can we give this quantity? It is a measure of the vector field's flux outward through a surface containing an infinitesimal volume. When we consider a fluid, a positive divergence is a local decrease in density, and a negative divergence is a density increase. If the fluid is incompressible and has no sources or sinks, we can write the continuity

equation

div F = 0.   (div.5)

In the Cartesian basis, it can be shown that the divergence is easily computed from the field

F = Fx i + Fy j + Fz k   (div.6)

as follows.

Equation div.7: divergence, differential form

div F = ∂xFx + ∂yFy + ∂zFz
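As a quick check of this formula (a sketch with sympy; the sample field is our own choice):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
# an arbitrary sample field F = [x**2, x*y, y*z]
Fx, Fy, Fz = x**2, x*y, y*z
# divergence via the differential form
div_F = sp.diff(Fx, x) + sp.diff(Fy, y) + sp.diff(Fz, z)
print(div_F)  # 3*x + y
```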

Exploring divergence

Divergence is perhaps best explored by considering it for a vector field in R². Such a field F = Fx i + Fy j can be represented as a "quiver" plot. If we overlay the quiver plot over a "color density" plot representing div F, we can increase our intuition about the divergence.

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: div_surface_integrals_flux.ipynb
notebook kernel: python3

First, load some Python packages:

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and functions:

var('x, y')
F_x = Function('F_x')(x, y)
F_y = Function('F_y')(x, y)

Rather than repeat code, let's write a single function, quiver_plotter_2D, to make several of these plots:

def quiver_plotter_2D(
    field={F_x: x*y, F_y: x*y},
    grid_width=3,
    grid_decimate_x=8,
    grid_decimate_y=8,
    norm=Normalize(),
    density_operation='div',
    print_density=True,
):
    # define symbolics
    var('x, y')
    F_x = Function('F_x')(x, y)
    F_y = Function('F_y')(x, y)
    field_sub = field
    # compute density
    if density_operation == 'div':
        den = F_x.diff(x) + F_y.diff(y)
    elif density_operation == 'curl':  # in the k direction
        den = F_y.diff(x) - F_x.diff(y)
    else:
        raise ValueError('div and curl are the only density operators')
    den_simp = den.subs(field_sub).doit().simplify()
    if den_simp.is_constant():
        print('Warning: density operator is constant (no density plot)')
    if print_density:
        print(f'The {density_operation} is:')
        display(den_simp)
    # lambdify for numerics
    F_x_sub = F_x.subs(field_sub)
    F_y_sub = F_y.subs(field_sub)
    F_x_fun = lambdify((x, y), F_x.subs(field_sub), 'numpy')
    F_y_fun = lambdify((x, y), F_y.subs(field_sub), 'numpy')
    if F_x_sub.is_constant():
        F_x_fun1 = F_x_fun  # dummy
        F_x_fun = lambda x, y: F_x_fun1(x, y)*np.ones(x.shape)
    if F_y_sub.is_constant():
        F_y_fun1 = F_y_fun  # dummy
        F_y_fun = lambda x, y: F_y_fun1(x, y)*np.ones(x.shape)
    if not den_simp.is_constant():
        den_fun = lambdify((x, y), den_simp, 'numpy')
    # create grid
    w = grid_width
    Y, X = np.mgrid[-w:w:100j, -w:w:100j]
    # evaluate numerically
    F_x_num = F_x_fun(X, Y)
    F_y_num = F_y_fun(X, Y)
    if not den_simp.is_constant():
        den_num = den_fun(X, Y)
    # plot
    p = plt.figure()
    # colormesh
    if not den_simp.is_constant():
        cmap = plt.get_cmap('PiYG')
        im = plt.pcolormesh(X, Y, den_num, cmap=cmap, norm=norm)
        plt.colorbar()
    # quiver
    dx = grid_decimate_y
    dy = grid_decimate_x
    plt.quiver(
        X[::dx, ::dy], Y[::dx, ::dy],
        F_x_num[::dx, ::dy], F_y_num[::dx, ::dy],
        units='xy', scale=10,
    )
    plt.title(f'F(x,y) = [{F_x.subs(field_sub)}, {F_y.subs(field_sub)}]')
    return p

Note that, while we're at it, we included a hook for density plots of the curl of F; we'll return to this in a later lecture.

Let's inspect several cases.

p = quiver_plotter_2D(field={F_x: x**2, F_y: y**2})

The div is

2x + 2y

p = quiver_plotter_2D(field={F_x: x*y, F_y: x*y})

The div is

Figure 05.1: quiver plot of F(x, y) = [x², y²] over a color density plot of div F.

x + y

p = quiver_plotter_2D(field={F_x: x**2 + y**2, F_y: x**2 + y**2})

The div is

2x + 2y

p = quiver_plotter_2D(
    field={F_x: x**2/sqrt(x**2 + y**2), F_y: y**2/sqrt(x**2 + y**2)},
    norm=SymLogNorm(linthresh=3, linscale=3)
)

Figure 05.2: quiver plot of F(x, y) = [xy, xy] over a color density plot of div F.

Figure 05.3: quiver plot of F(x, y) = [x² + y², x² + y²] over a color density plot of div F.

Figure 05.4: quiver plot of F(x, y) = [x²/√(x² + y²), y²/√(x² + y²)] over a color density plot of div F.

The div is

(–x³ – y³ + 2(x + y)(x² + y²)) / (x² + y²)^(3/2)

8. Schey, Div, Grad, Curl, and All That, pp. 63-74.
9. Kreyszig, Advanced Engineering Mathematics, § 10.1, 10.2.

vecs.curl Curl, line integrals, and circulation

Line integrals

Consider a curve C in a Euclidean vector space R³. Let r(t) = [x(t), y(t), z(t)] be a parametric representation of C. Furthermore, let F : R³ → R³ be a vector-valued function of r, and let r′(t) be the tangent vector. The line integral is

∫_C F(r(t)) · r′(t) dt,   (curl.1)

which integrates F along the curve. For more on computing line integrals, see Schey8 and Kreyszig.9

If F is a force being applied to an object moving along the curve C, the line integral is the work done by the force. More generally, the line integral integrates F along the tangent of C.
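For instance (an illustrative sympy sketch of our own, not an example from the text), the work done by the tangential field F = [-y, x, 0] on a particle traversing a quarter of the unit circle is the line integral:

```python
import sympy as sp

t = sp.symbols('t')
r = sp.Matrix([sp.cos(t), sp.sin(t), 0])  # quarter unit circle for t in [0, pi/2]
F = sp.Matrix([-r[1], r[0], 0])           # F = [-y, x, 0] evaluated on the curve
work = sp.integrate(F.dot(r.diff(t)), (t, 0, sp.pi/2))
print(work)  # pi/2: F is unit-magnitude and tangent to a curve of length pi/2
```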

Circulation

Consider the line integral over a closed curve C, denoted by

∮_C F(r(t)) · r′(t) dt.   (curl.2)

We call this quantity the circulation of F around C.

For certain vector-valued functions F, the circulation is zero for every curve. In these cases (static electric fields, for instance), this is sometimes called the law of circulation.
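To make this concrete (a sympy sketch with fields of our own choosing): around the unit circle, the rotational field [-y, x] has nonzero circulation, while the gradient field [x, y] has zero circulation.

```python
import sympy as sp

t = sp.symbols('t')
r = sp.Matrix([sp.cos(t), sp.sin(t)])  # the unit circle, traversed once
rp = r.diff(t)
F_rot = sp.Matrix([-r[1], r[0]])   # [-y, x]: swirls about the origin
F_grad = sp.Matrix([r[0], r[1]])   # [x, y] = grad((x**2 + y**2)/2)
circ_rot = sp.integrate(F_rot.dot(rp), (t, 0, 2*sp.pi))
circ_grad = sp.integrate(F_grad.dot(rp), (t, 0, 2*sp.pi))
print(circ_rot, circ_grad)  # 2*pi 0
```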


Curl

Consider the division of the circulation around a curve in R³ by the surface area it encloses, ΔS:

(1/ΔS) ∮_C F(r(t)) · r′(t) dt.   (curl.3)

In a manner analogous to the operation that gave us the divergence, let's consider shrinking this curve to a point and the surface area to zero:

lim_{ΔS→0} (1/ΔS) ∮_C F(r(t)) · r′(t) dt.   (curl.4)

We call this quantity the "scalar" curl of F at each point in R³, in the direction normal to ΔS as it shrinks to zero. Taking three (or n for Rⁿ) "scalar" curls in independent normal directions (enough to span the vector space), we obtain the curl proper, which is a vector-valued function curl : R³ → R³.

The curl is coordinate-independent. In Cartesian coordinates, it can be shown to be equivalent to the following.

Equation curl.5: curl, differential form, Cartesian coordinates

curl F = [∂yFz – ∂zFy, ∂zFx – ∂xFz, ∂xFy – ∂yFx]ᵀ
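This differential form is easy to check symbolically (a sketch; the sample field is our own choice). For the rotational field [-y, x, 0]:

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
Fx, Fy, Fz = -y, x, sp.Integer(0)  # the rotational field [-y, x, 0]
curl_F = sp.Matrix([
    sp.diff(Fz, y) - sp.diff(Fy, z),  # x-component
    sp.diff(Fx, z) - sp.diff(Fz, x),  # y-component
    sp.diff(Fy, x) - sp.diff(Fx, y),  # z-component
])
print(list(curl_F))  # [0, 0, 2]: a uniform rotation about k
```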

But what does the curl of F represent? It quantifies the local rotation of F about each point. If F represents a fluid's velocity, curl F

10. A region is simply connected if every curve in it can shrink to a point without leaving the region. An example of a region that is not simply connected is the surface of a toroid.

is the local rotation of the fluid about each point, and it is called the vorticity.

Zero curl, circulation, and path independence

Circulation

It can be shown that if the circulation of F on all curves is zero, then in each direction n and at every point, curl F = 0 (i.e., n · curl F = 0). Conversely, for curl F = 0 in a simply connected region,10 F has zero circulation.

Succinctly, informally, and without the requisite qualifiers above:

zero circulation ⇒ zero curl   (curl.6)
zero curl + simply connected region ⇒ zero circulation   (curl.7)

Path independence

It can be shown that if the path integral of F on all curves between any two points is path-independent, then in each direction n and at every point, curl F = 0 (i.e., n · curl F = 0). Conversely, for curl F = 0 in a simply connected region, all line integrals are independent of path.

Succinctly, informally, and without the requisite qualifiers above:

path independence ⇒ zero curl   (curl.8)
zero curl + simply connected region ⇒ path independence   (curl.9)

Figure 05.5: an implication graph relating zero curl, zero circulation, path independence, and connectedness. Green edges represent implication (a implies b) and black edges represent logical conjunctions.

… and how they relate

It is also true that

path independence ⇔ zero circulation.   (curl.10)

So, putting it all together, we get Figure 05.5.

Exploring curl

Curl is perhaps best explored by considering it for a vector field in R². Such a field in Cartesian coordinates, F = Fx i + Fy j, has curl

curl F = [∂y0 – ∂zFy, ∂zFx – ∂x0, ∂xFy – ∂yFx]ᵀ
       = [0 – 0, 0 – 0, ∂xFy – ∂yFx]ᵀ
       = [0, 0, ∂xFy – ∂yFx]ᵀ.   (curl.11)

That is, curl F = (∂xFy – ∂yFx) k, and the only nonzero component is normal to the xy-plane. If we overlay a quiver plot of F over a "color density" plot representing the k-component of curl F, we can increase our intuition about the curl.

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: curl-and-line-integrals.ipynb
notebook kernel: python3

First, load some Python packages:

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and functions:

var('x, y')
F_x = Function('F_x')(x, y)
F_y = Function('F_y')(x, y)

We use the same function defined in Lecture vecs.div, quiver_plotter_2D, to make several of these plots.

Let's inspect several cases.

Figure 05.6: quiver plot of F(x, y) = [0, cos(2πx)] over a color density plot of the k-component of curl F.

p = quiver_plotter_2D(
    field={F_x: 0, F_y: cos(2*pi*x)},
    density_operation='curl',
    grid_decimate_x=2,
    grid_decimate_y=10,
    grid_width=1
)

The curl is

–2π sin(2πx)

p = quiver_plotter_2D(
    field={F_x: 0, F_y: x**2},
    density_operation='curl',
    grid_decimate_x=2,
    grid_decimate_y=20
)

Figure 05.7: quiver plot of F(x, y) = [0, x²] over a color density plot of the k-component of curl F.

The curl is

2x

p = quiver_plotter_2D(
    field={F_x: y**2, F_y: x**2},
    density_operation='curl',
    grid_decimate_x=2,
    grid_decimate_y=20
)

Figure 05.8: quiver plot of F(x, y) = [y², x²] over a color density plot of the k-component of curl F.

The curl is

2x – 2y

p = quiver_plotter_2D(
    field={F_x: -y, F_y: x},
    density_operation='curl',
    grid_decimate_x=6,
    grid_decimate_y=6
)

Warning: density operator is constant (no density plot)

The curl is

Figure 05.9: quiver plot of F(x, y) = [–y, x] (constant curl; no density plot).

2


vecs.grad Gradient

Gradient

The gradient grad f of a scalar-valued function f : R³ → R is a vector field F : R³ → R³; that is, grad f is a vector-valued function on R³. The gradient's local direction and magnitude are those of the local maximum rate of increase of f. This makes it useful in optimization (e.g., in the method of gradient descent).

In classical mechanics, quantum mechanics, relativity, string theory, thermodynamics, and continuum mechanics (and elsewhere), the principle of least action is taken as fundamental (Richard P. Feynman, Robert B. Leighton, and Matthew Sands, The Feynman Lectures on Physics. New Millennium. Perseus Basic Books, 2010). This principle tells us that nature's laws quite frequently seem to be derivable by assuming a certain quantity, called action, is minimized. Considering, then, that the gradient supplies us with a tool for optimizing functions, it is unsurprising that the gradient enters into the equations of motion of many physical quantities.

The gradient is coordinate-independent, but its coordinate-free definitions don't add much to our intuition. In Cartesian coordinates, it can be shown to be equivalent to the following.

11. This is nonstandard terminology, but we're bold.

Equation grad.1: gradient, Cartesian coordinates

grad f = [∂xf, ∂yf, ∂zf]ᵀ
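Symbolically (a sketch; the scalar field here is an arbitrary choice of ours):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = x**2 + 3*x*y + z  # an arbitrary scalar field
# gradient via the Cartesian form: one partial derivative per coordinate
grad_f = sp.Matrix([sp.diff(f, s) for s in (x, y, z)])
print(list(grad_f))  # [2*x + 3*y, 3*x, 1]
```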

Vector fields from gradients are special

Although all gradients are vector fields, not all vector fields are gradients. That is, given a vector field F, it may or may not be equal to the gradient of any scalar-valued function f. Let's say of a vector field that is a gradient that it has gradience.11 Those vector fields that are gradients have special properties. Surprisingly, those properties are connected to path independence and curl. It can be shown that iff a field is a gradient, line integrals of the field are path independent. That is, for a vector field,

gradience ⇔ path independence.   (grad.2)

Considering what we know from Lecture vecs.curl about path independence, we can expand Figure 05.5 to obtain Figure 05.10. One implication is that gradients have zero curl. Many important fields that describe physical interactions (e.g., static electric fields, Newtonian gravitational fields) are gradients of scalar fields called potentials.
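The zero-curl implication can be verified symbolically for an arbitrary smooth scalar field (a sketch; equality of mixed partial derivatives does the work):

```python
import sympy as sp

x, y, z = sp.symbols('x y z')
f = sp.Function('f')(x, y, z)  # an arbitrary (smooth) scalar field
Fx, Fy, Fz = sp.diff(f, x), sp.diff(f, y), sp.diff(f, z)  # F = grad f
curl_F = sp.Matrix([
    sp.diff(Fz, y) - sp.diff(Fy, z),
    sp.diff(Fx, z) - sp.diff(Fz, x),
    sp.diff(Fy, x) - sp.diff(Fx, y),
])
# mixed partials commute, so every component cancels
print(curl_F.applyfunc(sp.simplify).T)  # Matrix([[0, 0, 0]])
```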

Exploring gradient

Gradient is perhaps best explored by considering it for a scalar field on R². Such a field in Cartesian coordinates, f(x, y), has

Figure 05.10: an implication graph relating gradience, zero curl, zero circulation, path independence, and connectedness. Green edges represent implication (a implies b) and black edges represent logical conjunctions.

gradient

grad f = [∂xf, ∂yf]ᵀ.   (grad.3)

That is, grad f = F = ∂xf i + ∂yf j. If we overlay a quiver plot of F over a "color density" plot representing f, we can increase our intuition about the gradient.

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: grad.ipynb
notebook kernel: python3

First, load some Python packages:

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables:

var('x, y')

(x, y)

Rather than repeat code, let's write a single function, grad_plotter_2D, to make several of these plots.

Let's inspect several cases. While considering the following plots, remember that they all have zero curl.

p = grad_plotter_2D(field=x)

Figure 05.11: quiver plot of grad f over a color density plot of f(x, y) = x.

The gradient is

[1, 0]ᵀ

p = grad_plotter_2D(field=x+y)

The gradient is

[1, 1]ᵀ

Figure 05.12: quiver plot of grad f over a color density plot of f(x, y) = x + y.

p = grad_plotter_2D(field=1)

Warning: field is constant (no plot)

The gradient is

[0, 0]ᵀ

Gravitational potential

Gravitational potentials have the form 1/distance. Let's check out the gradient.

Figure 05.13: quiver plot of grad f over a color density plot of f(x, y) = 1/√(x² + y²).

p = grad_plotter_2D(
    field=1/sqrt(x**2 + y**2),
    norm=SymLogNorm(linthresh=3, linscale=3),
    mask=True
)

The gradient is

[–x/(x² + y²)^(3/2), –y/(x² + y²)^(3/2)]ᵀ

Conic section fields

Gradients of conic section fields can be

explored

Figure 05.14: quiver plot of grad f over a color density plot of f(x, y) = x².

The following is called a parabolic field:

p = grad_plotter_2D(field=x**2)

The gradient is

[2x, 0]ᵀ

The following are called elliptic fields:

Figure 05.15: quiver plot of grad f over a color density plot of f(x, y) = x² + y².

p = grad_plotter_2D(field=x**2+y**2)

The gradient is

[2x, 2y]ᵀ

p = grad_plotter_2D(field=-x**2-y**2)

Figure 05.16: quiver plot of grad f over a color density plot of f(x, y) = –x² – y².

The gradient is

[–2x, –2y]ᵀ

The following is called a hyperbolic field:

p = grad_plotter_2D(field=x**2-y**2)

The gradient is

Figure 05.17: quiver plot of grad f over a color density plot of f(x, y) = x² – y².

[2x, –2y]ᵀ

12. A surface is orientable if a consistent normal direction can be defined. Most surfaces are orientable, but some, notably the Möbius strip, cannot be. See Kreyszig (Advanced Engineering Mathematics, § 10.6) for more.
13. Ibid., § 10.7.
14. Schey, Div, Grad, Curl, and All That, pp. 45-52.
15. Ibid., pp. 49-52.

vecs.stoked Stokes and divergence theorems

Two theorems allow us to exchange certain integrals in R³ for others that are easier to evaluate.

The divergence theorem

The divergence theorem asserts the equality of the surface integral of a vector field F and the triple integral of div F over the volume V enclosed by the surface S in R³. That is,

∬_S F · n dS = ∭_V div F dV.   (stoked.1)

Caveats are that V is a closed region bounded by the orientable12 surface S, and that F is continuous and continuously differentiable over a region containing V. This theorem makes some intuitive sense: we can think of the divergence inside the volume "accumulating" via the triple integration and equaling the corresponding surface integral. For more on the divergence theorem, see Kreyszig13 and Schey.14
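We can verify the theorem symbolically for a simple case (an illustrative sketch of our own): take F = [x, y, z], so div F = 3, with S the unit sphere. On the surface, F · n = 1, so the flux is the sphere's area; the triple integral of 3 over the unit ball gives the same value.

```python
import sympy as sp

rho, u, v = sp.symbols('rho u v', nonnegative=True)
# left-hand side: flux of F = [x, y, z] through the unit sphere (F·n = 1 there)
lhs = sp.integrate(sp.sin(u), (u, 0, sp.pi), (v, 0, 2*sp.pi))  # dS = sin(u) du dv
# right-hand side: div F = 3 over the unit ball, in spherical coordinates
rhs = sp.integrate(3*rho**2*sp.sin(u), (rho, 0, 1), (u, 0, sp.pi), (v, 0, 2*sp.pi))
print(lhs, rhs)  # 4*pi 4*pi
```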

A lovely application of the divergence theorem is that, for any quantity of conserved stuff (mass, charge, spin, etc.) distributed in a spatial R³ with time-dependent density ρ : R⁴ → R and velocity field v : R⁴ → R³, the divergence theorem can be applied to find that

∂tρ = – div(ρv),   (stoked.2)

which is a more general form of a continuity equation, one of the governing equations of many physical phenomena. For a derivation of this equation, see Schey.15

16. A surface is smooth if its normal is continuous everywhere. It is piecewise smooth if it is composed of a finite number of smooth surfaces.
17. Kreyszig, Advanced Engineering Mathematics, § 10.9.
18. Schey, Div, Grad, Curl, and All That, pp. 93-102.
19. Kreyszig, Advanced Engineering Mathematics, § 10.9.
20. Lee, Introduction to Smooth Manifolds, Ch. 16.

The Kelvin-Stokes theorem

The Kelvin-Stokes theorem asserts the equality of the circulation of a vector field F over a closed curve C and the surface integral of curl F over a surface S that has boundary C. That is, for r(t) a parameterization of C and surface normal n,

∮_C F(r(t)) · r′(t) dt = ∬_S n · curl F dS.   (stoked.3)

Caveats are that S is piecewise smooth,16 its boundary C is a piecewise smooth, simple, closed curve, and F is continuous and continuously differentiable over a region containing S. This theorem is also somewhat intuitive: we can think of the curl over the surface "accumulating" via the surface integration and equaling the corresponding circulation. For more on the Kelvin-Stokes theorem, see Kreyszig17 and Schey.18
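A simple symbolic check (our own example): let F = [-y, x, 0], C the unit circle, and S the unit disk with n = k. Then curl F = 2k, and both sides of (stoked.3) give 2π.

```python
import sympy as sp

t, rr, th = sp.symbols('t r theta', nonnegative=True)
# circulation of F = [-y, x, 0] around the unit circle: F(r(t))·r'(t) = 1
circ = sp.integrate(sp.sin(t)**2 + sp.cos(t)**2, (t, 0, 2*sp.pi))
# surface integral of n·curl F = 2 over the unit disk, in polar coordinates
surf = sp.integrate(2*rr, (rr, 0, 1), (th, 0, 2*sp.pi))
print(circ, surf)  # 2*pi 2*pi
```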

Related theorems

Green's theorem is a two-dimensional special case of the Kelvin-Stokes theorem. It is described by Kreyszig.19

It turns out that all of the above theorems (and the fundamental theorem of calculus, which relates the derivative and integral) are special cases of the generalized Stokes theorem defined by differential geometry. We would need a deeper understanding of differential geometry to understand this theorem. For more, see Lee.20

vecs.ex Exercises for Chapter 05

Exercise 05.1 (6)

The PDE of Example pde.separation-1 can be used to describe the conduction of heat along a long, thin rod insulated along its length, where u(t, x) represents temperature. The initial and Dirichlet boundary conditions in that example would be interpreted as an initial temperature distribution along the bar and fixed temperatures of the ends. Now consider the same PDE,

∂tu(t, x) = k ∂²xx u(t, x),   (ex.1)

with real constant k, now with Neumann boundary conditions on interval x ∈ [0, L],

∂xu|x=0 = 0 and ∂xu|x=L = 0,   (ex.2)

and with initial condition

u(0, x) = f(x),   (ex.3)

where f is some piecewise continuous function on [0, L]. This represents the complete insulation of the ends of the rod, such that no heat flows from the ends (or from anywhere else).

a. Assume a product solution, separate variables into X(x) and T(t), and set the separation constant to –λ.
b. Solve the boundary value problem for its eigenfunctions Xn and eigenvalues λn.
c. Solve for the general solution of the time-variable ODE.
d. Write the product solution and apply the initial condition f(x) by constructing it from a generalized Fourier series of the product solution.

e. Let L = 1 and f(x) = 100 – (200/L)|x – L/2|, as shown in Figure 05.18. Compute the solution series components. Plot the sum of the first 50 terms over x and t.

Figure 05.18: initial condition for Exercise 05.1.

four

Fourier and orthogonality

1. It's important to note that the symbol ωn in this context is not the natural frequency but a frequency indexed by integer n.

four.series Fourier series

Fourier series are mathematical series that can represent a periodic signal as a sum of sinusoids at different amplitudes and frequencies. They are useful for solving for the response of a system to periodic inputs. However, they are probably most important conceptually: they are our gateway to thinking of signals in the frequency domain, that is, as functions of frequency (not time). To represent a function as a Fourier series is to analyze it as a sum of sinusoids at different frequencies1 ωn and amplitudes an. Its frequency spectrum is the functional representation of amplitudes an versus frequency ωn.

Let's begin with the definition.

Definition four.series.1: Fourier series, trigonometric form

The Fourier analysis of a periodic function y(t) is, for n ∈ N₀, period T, and angular frequency ωn = 2πn/T,

an = (2/T) ∫_{–T/2}^{T/2} y(t) cos(ωn t) dt   (series.1)
bn = (2/T) ∫_{–T/2}^{T/2} y(t) sin(ωn t) dt   (series.2)

The Fourier synthesis of a periodic function y(t) with analysis components an and bn corresponding to ωn is

y(t) = a0/2 + Σ_{n=1}^{∞} [an cos(ωn t) + bn sin(ωn t)].   (series.3)
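For a concrete instance (a sympy sketch of our own, with period T = 2π chosen so that ωn = n): the odd square wave has an = 0 for all n and bn = 4/(πn) for odd n.

```python
import sympy as sp

t = sp.symbols('t')
T = 2*sp.pi  # choose a period of 2*pi so that omega_n = n
y = sp.Piecewise((1, t >= 0), (-1, True))  # odd square wave on (-T/2, T/2)
# trigonometric-form analysis for harmonics n = 1..5
a = [sp.integrate(2/T*y*sp.cos(n*t), (t, -T/2, T/2)) for n in range(1, 6)]
b = [sp.integrate(2/T*y*sp.sin(n*t), (t, -T/2, T/2)) for n in range(1, 6)]
print(a)  # all zero: y is odd, so the cosine components vanish
print(b)  # 4/pi, 0, 4/(3*pi), 0, 4/(5*pi)
```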

Let's consider the complex form of the Fourier series, which is analogous to Definition four.series.1. It may be helpful to review Euler's formula(s); see Appendix D.01.

Definition four.series.2: Fourier series, complex form

The Fourier analysis of a periodic function y(t) is, for n ∈ N₀, period T, and angular frequency ωn = 2πn/T,

c±n = (1/T) ∫_{–T/2}^{T/2} y(t) e^{–jωn t} dt   (series.4)

The Fourier synthesis of a periodic function y(t) with analysis components cn corresponding to ωn is

y(t) = Σ_{n=–∞}^{∞} cn e^{jωn t}   (series.5)

We call the integer n a harmonic, and the frequency associated with it,

ωn = 2πn/T,   (series.6)

the harmonic frequency. There is a special name for the first harmonic (n = 1): the fundamental frequency. It is called this because all other frequency components are integer multiples of it.

It is also possible to convert between the two representations above.


Definition four.series.3: Fourier series, converting between forms

The complex Fourier analysis of a periodic function y(t) is, for n ∈ N₀ and an and bn as defined above,

c±n = (1/2)(a|n| ∓ jb|n|)   (series.7)

The sinusoidal Fourier analysis of a periodic function y(t) is, for n ∈ N₀ and cn as defined above,

an = cn + c–n and   (series.8)
bn = j(cn – c–n).   (series.9)

The harmonic amplitude Cn is

Cn = √(an² + bn²)   (series.10)
   = 2√(cn c–n).   (series.11)

A magnitude line spectrum is a graph of the harmonic amplitudes as a function of the harmonic frequencies. The harmonic phase is

θn = –arctan2(bn, an)   (see Appendix C.02)
   = arctan2(Im(cn), Re(cn)).
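A numerical spot-check of these conversions (the sample component values are our own choice):

```python
import cmath

# sample sinusoidal components (arbitrary values chosen for the check)
a_n, b_n = 3.0, 4.0
c_pos = (a_n - 1j*b_n)/2   # c_{+n} = (a_n - j b_n)/2
c_neg = (a_n + 1j*b_n)/2   # c_{-n} = (a_n + j b_n)/2
a_back = (c_pos + c_neg).real        # should recover a_n
b_back = (1j*(c_pos - c_neg)).real   # should recover b_n
C_n = (a_n**2 + b_n**2)**0.5              # harmonic amplitude, sinusoidal form
C_n_alt = 2*cmath.sqrt(c_pos*c_neg).real  # harmonic amplitude, complex form
print(a_back, b_back, C_n, C_n_alt)  # 3.0 4.0 5.0 5.0
```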

The following illustration demonstrates how sinusoidal components sum to represent a square wave; a line spectrum is also shown.

(figure: sinusoidal components over time, with spectral amplitude versus frequency)

Let us compute the associated spectral components in the following example.

Example four.series-1: Fourier series analysis, line spectrum

Problem statement: Compute the first five harmonic amplitudes that represent the line spectrum for a square wave in the figure above.

four.trans Fourier transform

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: fourier_series_to_transform.ipynb
notebook kernel: python3

We begin with the usual loading of modules:

import numpy as np  # for numerics
import sympy as sp  # for symbolics
import matplotlib.pyplot as plt  # for plots
from IPython.display import display, Markdown, Latex

Let's consider a periodic function f with period T (T). Each period, the function has a triangular pulse of width δ (pulse_width) and height δ/2.

period = 15  # period
pulse_width = 2  # pulse width

First, we plot the function f in the time domain. Let's begin by defining f:

def pulse_train(t, T, pulse_width):
    f = lambda x: pulse_width/2 - abs(x)  # pulse
    tm = np.mod(t, T)
    if tm <= pulse_width/2:
        return f(tm)
    elif tm >= T - pulse_width/2:
        return f(-(tm - T))
    else:
        return 0

Now we develop a numerical array in time to plot f:

N = 201  # number of points to plot
tpp = np.linspace(-period/2, 5*period/2, N)  # time values
fpp = np.array(np.zeros(tpp.shape))
for i, t_now in enumerate(tpp):
    fpp[i] = pulse_train(t_now, period, pulse_width)

axes = plt.figure(1)
plt.plot(tpp, fpp, 'b-', linewidth=2)  # plot
plt.xlabel('time (s)')
plt.xlim([-period/2, 3*period/2])
plt.xticks(
    [0, pulse_width/2, period],
    ['0', '$\\delta/2$', '$T=' + str(period) + '$ s']
)
plt.yticks([0, pulse_width/2], ['0', '$\\delta/2$'])
plt.show()  # display here

For δ = 2 and T ∈ {5, 15, 25}, the left-hand column of Figure 06.1 shows two triangle pulses for each period T.

Consider the following argument. Just as a Fourier series is a frequency domain representation of a periodic signal, a Fourier transform is a frequency domain representation of an aperiodic signal (we will rigorously define it in a moment). The Fourier series components will have an analog, then, in the Fourier transform. Recall that they can be computed by integrating over a period of the signal. If we increase that period infinitely, the function is effectively aperiodic. The result (within a scaling factor) will be the Fourier transform analog of the Fourier series components.

Let us approach this understanding by actually computing the Fourier series components for increasing period T. We'll use sympy to compute the Fourier series cosine and sine components a_n and b_n for component n (n) and period T (T):

sp.var('x, a_0, a_n, b_n', real=True)
sp.var('delta, T', positive=True)
sp.var('n', nonnegative=True)
a0 = 2/T*sp.integrate(
    (delta/2 - sp.Abs(x)),
    (x, -delta/2, delta/2)  # otherwise zero
).simplify()
an = sp.integrate(
    2/T*(delta/2 - sp.Abs(x))*sp.cos(2*sp.pi*n/T*x),
    (x, -delta/2, delta/2)  # otherwise zero
).simplify()
bn = 2/T*sp.integrate(
    (delta/2 - sp.Abs(x))*sp.sin(2*sp.pi*n/T*x),
    (x, -delta/2, delta/2)  # otherwise zero
).simplify()
display(sp.Eq(a_n, an), sp.Eq(b_n, bn))


$$a_n = \begin{cases} \dfrac{T\left(1 - \cos\left(\frac{\pi\delta n}{T}\right)\right)}{\pi^2 n^2} & \text{for } n \neq 0 \\[2ex] \dfrac{\delta^2}{2T} & \text{otherwise} \end{cases}$$

$$b_n = 0$$

Furthermore, let us compute the harmonic amplitude (f_harmonic_amplitude)

$$C_n = \sqrt{a_n^2 + b_n^2}, \quad\text{(trans.1)}$$

which we have also scaled by a factor T/δ in order to plot it with a convenient scale.

sp.var('C_n', positive=True)
cn = sp.sqrt(an**2 + bn**2)
display(sp.Eq(C_n, cn))

$$C_n = \begin{cases} \dfrac{T\left|\cos\left(\frac{\pi\delta n}{T}\right) - 1\right|}{\pi^2 n^2} & \text{for } n \neq 0 \\[2ex] \dfrac{\delta^2}{2T} & \text{otherwise} \end{cases}$$

Now we lambdify the symbolic expression into a numpy function:

cn_f = sp.lambdify((n, T, delta), cn)

Now we can plot

Figure 06.1: triangle pulse trains (left column), with longer periods descending, and their corresponding line spectra (right column), scaled for convenient comparison

omega_max = 12  # rad/s, max frequency in line spectrum
n_max = round(omega_max*period/(2*np.pi))  # max harmonic
n_a = np.linspace(0, n_max, n_max+1)
omega = 2*np.pi*n_a/period
axes = plt.figure(2)
markerline, stemlines, baseline = plt.stem(
    omega, period/pulse_width*cn_f(n_a, period, pulse_width),
    linefmt='b-', markerfmt='bo', basefmt='r-',
    use_line_collection=True
)
plt.xlabel('frequency $\\omega$ (rad/s)')
plt.xlim([0, omega_max])
plt.ylim([0, pulse_width/2])
plt.yticks([0, pulse_width/2], ['0', '$\\delta/2$'])
plt.show()  # show here

The line spectra are shown in the right-hand column of Figure 06.1. Note that, with our chosen scaling, as T increases, the line spectra reveal a distinct waveform.


Figure 06.2: F(ω), our mysterious Fourier series amplitude analog

Let F be the continuous function of angular frequency ω

$$F(\omega) = \frac{\delta}{2} \cdot \frac{\sin^2(\omega\delta/4)}{(\omega\delta/4)^2}. \quad\text{(trans.2)}$$

First we plot it

F = lambda w: pulse_width/2 * \
    np.sin(w*pulse_width/(2*2))**2 / (w*pulse_width/(2*2))**2

N = 201  # number of points to plot
wpp = np.linspace(0.0001, omega_max, N)
Fpp = []
for i in range(0, N):
    Fpp.append(F(wpp[i]))  # build array of function values
axes = plt.figure(3)
plt.plot(wpp, Fpp, 'b-', linewidth=2)  # plot
plt.xlim([0, omega_max])
plt.yticks([0, pulse_width/2], ['0', '$\\delta/2$'])
plt.xlabel('frequency $\\omega$ (rad/s)')
plt.ylabel('$F(\\omega)$')
plt.show()

Let's consider the plot in Figure 06.2 of F. It's obviously the function emerging in Figure 06.1 from increasing the period of our pulse train.


Now we are ready to define the Fourier

transform and its inverse

Definition fourtrans.1: Fourier transforms, trigonometric form

Fourier transform (analysis):

$$A(\omega) = \int_{-\infty}^{\infty} y(t)\cos(\omega t)\, dt \quad\text{(trans.3)}$$

$$B(\omega) = \int_{-\infty}^{\infty} y(t)\sin(\omega t)\, dt \quad\text{(trans.4)}$$

Inverse Fourier transform (synthesis):

$$y(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} A(\omega)\cos(\omega t)\, d\omega + \frac{1}{2\pi}\int_{-\infty}^{\infty} B(\omega)\sin(\omega t)\, d\omega \quad\text{(trans.5)}$$

Definition fourtrans.2: Fourier transforms, complex form

Fourier transform $\mathcal{F}$ (analysis):

$$\mathcal{F}(y(t)) = Y(\omega) = \int_{-\infty}^{\infty} y(t)e^{-j\omega t}\, dt \quad\text{(trans.6)}$$

Inverse Fourier transform $\mathcal{F}^{-1}$ (synthesis):

$$\mathcal{F}^{-1}(Y(\omega)) = y(t) = \frac{1}{2\pi}\int_{-\infty}^{\infty} Y(\omega)e^{j\omega t}\, d\omega \quad\text{(trans.7)}$$

So now we have defined the Fourier transform. There are many applications, including solving differential equations and frequency domain representations, called spectra, of time domain functions.

There is a striking similarity between the Fourier transform and the Laplace transform, with which you are already acquainted. In fact, the Fourier transform is a special case of a Laplace transform, with Laplace transform variable s = jω instead of having some real component. Both transforms convert differential equations to algebraic equations, which can be solved and inversely transformed to find time-domain solutions. The Laplace transform is especially important to use when an input function to a differential equation is not absolutely integrable and the Fourier transform is undefined (for example, our definition will yield a transform for neither the unit step nor the unit ramp functions). However, the Laplace transform is also preferred for initial value problems, due to its convenient way of handling them. The two transforms are equally useful for solving steady-state problems. Although the Laplace transform has many advantages, for spectral considerations the Fourier transform is the only game in town.
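This relationship can be checked symbolically. The following sketch (an illustration, not from the text) assumes a > 0 and compares sympy's laplace_transform of e^(−at) against a direct evaluation of the trans.6 integral for u_s(t)e^(−at), confirming they agree at s = jω:

```python
import sympy as sp

# check: for y(t) = u_s(t) e^(-a t) with a > 0, the fourier transform
# equals the laplace transform evaluated at s = j*omega
t = sp.symbols('t', positive=True)
a, omega, s = sp.symbols('a omega s', positive=True)

# laplace transform of e^(-a t) (the unit step is implicit, t > 0)
Y_laplace = sp.laplace_transform(sp.exp(-a*t), t, s, noconds=True)

# the trans.6 integral; the unit step reduces it to t in (0, oo)
Y_fourier = sp.integrate(sp.exp(-a*t)*sp.exp(-sp.I*omega*t),
                         (t, 0, sp.oo), conds='none')

print(sp.simplify(Y_fourier - Y_laplace.subs(s, sp.I*omega)))  # 0
```

The difference simplifies to zero, which is the "special case with s = jω" claim in symbols.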

A table of Fourier transforms and their

properties can be found in Appendix B02

Example fourtrans-1: a Fourier transform

Problem Statement

Consider the aperiodic signal y(t) = u_s(t)e^{−at}, with u_s the unit step function and a > 0. The signal is plotted below. Derive the complex frequency spectrum and plot its magnitude and phase.

[figure: y(t) plotted over t]


Problem Solution

The signal is aperiodic, so the Fourier transform can be computed from Equation trans.6:

$$\begin{aligned}
Y(\omega) &= \int_{-\infty}^{\infty} y(t)e^{-j\omega t}\, dt \\
&= \int_{-\infty}^{\infty} u_s(t)e^{-at}e^{-j\omega t}\, dt && \text{(def. of } y\text{)} \\
&= \int_{0}^{\infty} e^{-at}e^{-j\omega t}\, dt && \text{(}u_s\text{ effect)} \\
&= \int_{0}^{\infty} e^{(-a-j\omega)t}\, dt && \text{(multiply)} \\
&= \frac{1}{-a-j\omega}e^{(-a-j\omega)t}\,\Big|_0^\infty && \text{(antiderivative)} \\
&= \frac{1}{-a-j\omega}\left(\lim_{t\to\infty} e^{(-a-j\omega)t} - e^0\right) && \text{(evaluate)} \\
&= \frac{1}{-a-j\omega}\left(\lim_{t\to\infty} e^{-at}e^{-j\omega t} - 1\right) && \text{(arrange)} \\
&= \frac{1}{-a-j\omega}\Big((0)(\text{complex with mag } 1) - 1\Big) && \text{(limit)} \\
&= \frac{-1}{-a-j\omega} && \text{(consequence)} \\
&= \frac{1}{a+j\omega} \\
&= \frac{a-j\omega}{a-j\omega}\cdot\frac{1}{a+j\omega} && \text{(rationalize)} \\
&= \frac{a-j\omega}{a^2+\omega^2}.
\end{aligned}$$

The magnitude and phase of this complex function are straightforward to compute:

$$\begin{aligned}
|Y(\omega)| &= \sqrt{\operatorname{Re}(Y(\omega))^2 + \operatorname{Im}(Y(\omega))^2} \\
&= \frac{1}{a^2+\omega^2}\sqrt{a^2+\omega^2} \\
&= \frac{1}{\sqrt{a^2+\omega^2}}, \\
\angle Y(\omega) &= -\arctan(\omega/a).
\end{aligned}$$


Now we can plot these functions of ω. Setting a = 1 (arbitrarily), we obtain the following plots.

[plots of |Y(ω)| and ∠Y(ω) versus ω for ω ∈ [−10, 10]]


2. A function f is square-integrable if $\int_{-\infty}^{\infty} |f(x)|^2\, dx < \infty$.

inner product

weight function

3. This definition of the inner product can be extended to functions on R² and R³ domains using double- and triple-integration. See Schey (Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, p. 261).

orthogonality

basis

function space

legendre polynomials

fourgeneral Generalized fourier series and orthogonality

Let f: R → C, g: R → C, and w: R → C be complex functions. For square-integrable² f, g, and w, the inner product of f and g with weight function w over the interval [a, b] ⊆ R is³

$$\langle f, g\rangle_w = \int_a^b f(x)\overline{g}(x)w(x)\, dx \quad\text{(general.1)}$$

where $\overline{g}$ denotes the complex conjugate of g. The inner product of functions can be considered analogous to the inner (or dot) product of vectors.

The fourier series components can be found by a special property of the sin and cos functions called orthogonality. In general, functions f and g from above are orthogonal over the interval [a, b] iff

$$\langle f, g\rangle_w = 0 \quad\text{(general.2)}$$

for weight function w. Similar to how a set of orthogonal vectors can be a basis for a vector space, a set of orthogonal functions can be a basis for a function space: a vector space of functions from one set to another (with certain caveats).
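As a concrete check of Equation general.2, here is a short numeric sketch (an illustration with an assumed weight w = 1 and interval [0, 2π], not from the text): two distinct sine harmonics have a vanishing inner product, while a function's inner product with itself does not vanish.

```python
import numpy as np
from scipy.integrate import quad

# inner product <f, g>_w over [a, b] for real-valued f, g
# (the complex conjugate is trivial here), default weight w = 1
def inner(f, g, a, b, w=lambda x: 1):
    return quad(lambda x: f(x)*g(x)*w(x), a, b)[0]

T = 2*np.pi
f1 = lambda x: np.sin(3*x)
f2 = lambda x: np.sin(5*x)

print(inner(f1, f2, 0, T))  # ~0: sin(3x) and sin(5x) are orthogonal
print(inner(f1, f1, 0, T))  # ~pi: nonzero "length squared" of sin(3x)
```

The nonzero self-inner-product is exactly the normalization that appears in the denominator of the generalized fourier analysis formula later in this section.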

bessel functions

generalized fourier series

Fourier components

synthesis

analysis

fourier-legendre series

fourier-bessel series

In addition to some sets of sinusoids, there are several other important sets of functions that are orthogonal. For instance, sets of legendre polynomials (Erwin Kreyszig, Advanced Engineering Mathematics, 10th ed., John Wiley & Sons Limited, 2011, ISBN 9781119571094, § 5.2; the authoritative resource for engineering mathematics, with detailed accounts of probability, statistics, vector calculus, linear algebra, fourier analysis, ordinary and partial differential equations, and complex analysis, plus several other topics with varying degrees of depth; overall, the best place to start when seeking mathematical guidance) and bessel functions (Kreyszig, Advanced Engineering Mathematics, § 5.4) are orthogonal.

As with sinusoids, the orthogonality of some sets of functions allows us to compute their series components. Let functions f₀, f₁, ··· be orthogonal with respect to weight function w on interval [a, b], and let α₀, α₁, ··· be real constants. A generalized fourier series is (ibid., § 11.6)

$$f(x) = \sum_{m=0}^{\infty} \alpha_m f_m(x) \quad\text{(general.3)}$$

and represents a function f as a convergent series. It can be shown that the Fourier components α_m can be computed from

$$\alpha_m = \frac{\langle f, f_m\rangle_w}{\langle f_m, f_m\rangle_w}. \quad\text{(general.4)}$$

In keeping with our previous terminology for fourier series, Equation general.3 and Equation general.4 are called general fourier synthesis and analysis, respectively.

For the aforementioned legendre and bessel functions, the generalized fourier series are called fourier-legendre and fourier-bessel series (ibid., § 11.6). These and the standard fourier series (Lecture fourseries) are of particular interest for the solution of partial differential equations (Chapter 07).
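A minimal numeric sketch of Equations general.3 and general.4 for a fourier-legendre expansion (assumptions: the example function f(x) = |x| on [−1, 1] and weight w = 1 are chosen here for illustration and are not from the text):

```python
import numpy as np
from numpy.polynomial import legendre as L
from scipy.integrate import quad

f = lambda x: np.abs(x)  # assumed example function to expand

# general.4: alpha_m = <f, P_m> / <P_m, P_m> on [-1, 1], w = 1,
# where P_m is the mth legendre polynomial
def alpha(m):
    Pm = L.Legendre.basis(m)
    num = quad(lambda x: f(x)*Pm(x), -1, 1)[0]
    den = quad(lambda x: Pm(x)**2, -1, 1)[0]  # equals 2/(2m + 1)
    return num/den

# general.3: partial sum of the fourier-legendre synthesis
def f_approx(x, M=8):
    return sum(alpha(m)*L.Legendre.basis(m)(x) for m in range(M))

print(alpha(0))        # 0.5, the mean of |x| on [-1, 1]
print(f_approx(0.5))   # close to |0.5| = 0.5
```

Only even-m components are nonzero here, since |x| is even; the partial sum converges toward f as M grows.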

3. Python code in this section was generated from a Jupyter notebook named random_signal_fft.ipynb with a python3 kernel.

fourex Exercises for Chapter 06

Exercise 06.1 (secrets)

This exercise encodes a "secret word" into a sampled waveform for decoding via a discrete fourier transform (DFT). The nominal goal of the exercise is to decode the secret word. Along the way, plotting and interpreting the DFT will be important.

First load relevant packages

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

We define two functions: letter_to_number, to convert a letter into an integer index of the alphabet (a becomes 1, b becomes 2, etc.), and string_to_number_list, to convert a string to a list of ints, as shown in the example at the end.

def letter_to_number(letter):
    return ord(letter) - 96

def string_to_number_list(string):
    out = []  # list
    for i in range(0, len(string)):
        out.append(letter_to_number(string[i]))
    return out  # list

print(f"aces = {string_to_number_list('aces')}")

four Fourier and orthogonality ex Exercises for Chapter 06 p 2

aces = [1, 3, 5, 19]

Now we encode a code string code into a signal by beginning with "white noise," which is broadband (appears throughout the spectrum), and adding to it sin functions with amplitudes corresponding to the letter assignments of the code and harmonic corresponding to the position of the letter in the string. For instance, the string bad would be represented by noise plus the signal

$$2\sin 2\pi t + 1\sin 4\pi t + 4\sin 6\pi t. \quad\text{(ex.1)}$$

Let's set this up for secret word chupcabra:

N = 2000
Tm = 30
T = float(Tm)/float(N)
fs = 1/T
x = np.linspace(0, Tm, N)
noise = 4*np.random.normal(0, 1, N)
code = 'chupcabra'  # the secret word
code_number_array = np.array(string_to_number_list(code))
y = np.array(noise)
for i in range(0, len(code)):
    y = y + code_number_array[i]*np.sin(2*np.pi*(i+1)*x)

For proper decoding later, it is important to know the sample rate of the generated data:

print(f"fundamental frequency = {fs} Hz")

Figure 06.3: the chupacabra signal

fundamental frequency = 66.66666666666667 Hz

Now we plot:

fig, ax = plt.subplots()
plt.plot(x, y)
plt.xlim([0, Tm/4])
plt.xlabel('time (s)')
plt.ylabel('$y_n$')
plt.show()

Finally, we can save our data to a numpy file secrets.npy to distribute our message:

np.save('secrets', y)


Now, I have done this (for a different secret word) and saved the data; download it here:

ricopic.one/mathematical_foundations/source/secrets.npy

In order to load the npy file into Python, we can use the following command:

secret_array = np.load('secrets.npy')

Your job is to (a) perform a DFT, (b) plot the spectrum, and (c) decode the message. Here are a few hints:

1. Use from scipy import fft to do the DFT.
2. Use a hanning window to minimize the end-effects. See numpy.hanning, for instance. The fft call might then look like

   2*fft(np.hanning(N)*secret_array)/N

   where N = len(secret_array).
3. Use only the positive spectrum; you can lop off the negative side and double the positive side.
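The hints can be rehearsed end-to-end on a locally generated signal before tackling the downloaded one. In the sketch below (an illustration, not the actual secret data), the stand-in word [3, 1, 2] ("cab") is encoded with the scheme described above and then recovered from the windowed DFT:

```python
import numpy as np
from scipy import fft

np.random.seed(0)
N, Tm = 2000, 30
fs = N/Tm                       # sample rate (Hz)
x = np.arange(N)/fs             # exactly Tm s, so harmonics land on bins
word = [3, 1, 2]                # stand-in letter indices ("cab")
y = sum(a*np.sin(2*np.pi*(i+1)*x) for i, a in enumerate(word))
y += 0.5*np.random.normal(0, 1, N)      # broadband noise

Y = 2*fft.fft(np.hanning(N)*y)/N        # windowed DFT (hint 2)
mag = np.abs(Y[:N//2])                  # positive spectrum only (hint 3)
# bin spacing is 1/Tm Hz, so harmonic m Hz sits at bin m*Tm; the hanning
# window halves on-bin amplitudes, so double mag to recover them
decoded = [round(2*mag[(i+1)*Tm]) for i in range(len(word))]
print(decoded)  # recovers [3, 1, 2]
```

The same recipe applied to secret_array (with its own N and record length) yields the letter indices of the secret word.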

Exercise 06.2 (seesaw)

Consider the periodic function f: R → R with period T, defined for one period as

$$f(t) = at \text{ for } t \in (-T/2, T/2], \quad\text{(ex.2)}$$

where a, T ∈ R. Perform a fourier analysis on f. Letting a = 5 and T = 1, plot f along with the partial sum of the fourier synthesis, the first 50 nonzero components, over t ∈ [−T, T].


Exercise 06.3 (9)

Consider the function f: R → R defined as

$$f(t) = \begin{cases} a - a|t|/T & \text{for } t \in [-T, T] \\ 0 & \text{otherwise,} \end{cases} \quad\text{(ex.3)}$$

where a, T ∈ R. Perform a fourier analysis on f, resulting in F(ω). Plot F for various a and T.

Exercise 06.4 (10)

Consider the function f: R → R defined as

$$f(t) = ae^{-b|t-T|}, \quad\text{(ex.4)}$$

where a, b, T ∈ R. Perform a fourier analysis on f, resulting in F(ω). Plot F for various a, b, and T.

Exercise 06.5 (11)

Consider the function f: R → R defined as

$$f(t) = a\cos\omega_0 t + b\sin\omega_0 t, \quad\text{(ex.5)}$$

where a, b, ω₀ ∈ R are constants. Perform a fourier analysis on f, resulting in F(ω).

Exercise 06.6 (12)

Derive a fourier transform property for expressions including function f: R → R for

$$f(t)\cos(\omega_0 t + \psi),$$

where ω₀, ψ ∈ R.

Exercise 06.7 (13)

Consider the function f: R → R defined as

$$f(t) = a u_s(t)e^{-bt}\cos(\omega_0 t + \psi), \quad\text{(ex.6)}$$

where a, b, ω₀, ψ ∈ R and u_s(t) is the unit step function. Perform a fourier analysis on f, resulting in F(ω). Plot F for various a, b, ω₀, and ψ.


Exercise 06.8 (14)

Consider the function f: R → R defined as

$$f(t) = g(t)\cos(\omega_0 t), \quad\text{(ex.7)}$$

where ω₀ ∈ R and g: R → R will be defined in each part below. Perform a fourier analysis on f for each g below, for ω₁ ∈ R a constant, and consider how things change if ω₁ → ω₀.

a. g(t) = cos(ω₁t)
b. g(t) = sin(ω₁t)

Exercise 06.9 (savage)

An instrument called a "lock-in amplifier" can measure a sinusoidal signal A cos(ω₀t + ψ) = a cos(ω₀t) + b sin(ω₀t) at a known frequency ω₀ with exceptional accuracy, even in the presence of significant noise N(t). The workings of these devices can be described in two operations: first, the following operations on the signal with its noise, f₁(t) = a cos(ω₀t) + b sin(ω₀t) + N(t):

$$f_2(t) = f_1(t)\cos(\omega_1 t) \text{ and } f_3(t) = f_1(t)\sin(\omega_1 t), \quad\text{(ex.8)}$$

where ω₀, ω₁, a, b ∈ R. Note the relation of this operation to the Fourier analysis of Exercise 06.8. The key is to know with some accuracy ω₀, such that the instrument can set ω₁ ≈ ω₀. The second operation on the signal is an aggressive low-pass filter. The filtered f₂ and f₃ are called the in-phase and quadrature components of the signal, and are typically given a complex representation:

(in-phase) + j (quadrature).

Explain, with fourier analyses on f₂ and f₃,


a. what F₂ = F(f₂) looks like,
b. what F₃ = F(f₃) looks like,
c. why we want ω₁ ≈ ω₀,
d. why a low-pass filter is desirable, and
e. what the time-domain signal will look like.

Exercise 06.10 (strawman)

Consider again the lock-in amplifier explored in Exercise 06.9. Investigate the lock-in amplifier numerically with the following steps.

a. Generate a noisy sinusoidal signal at some frequency ω₀. Include enough broadband white noise that the signal is invisible in a time-domain plot.
b. Generate f₂ and f₃ as described in Exercise 06.9.
c. Apply a time-domain discrete low-pass filter to each, f₂ ↦ φ₂ and f₃ ↦ φ₃, such as scipy's scipy.signal.sosfiltfilt, to complete the lock-in amplifier operation. Plot the results in time and as a complex (polar) plot.
d. Perform a discrete fourier transform on each, f₂ ↦ F₂ and f₃ ↦ F₃. Plot the spectra.
e. Construct a frequency-domain low-pass filter F and apply it (multiply) to each, F₂ ↦ F′₂ and F₃ ↦ F′₃. Plot the filtered spectra.
f. Perform an inverse discrete fourier transform on each, F′₂ ↦ f′₂ and F′₃ ↦ f′₃. Plot the results in time and as a complex (polar) plot.
g. Compare the two methods used, i.e., time-domain filtering versus frequency-domain filtering.
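A minimal sketch of steps (a) through (c) follows (an illustration, not a full solution; the values of ω₀, a, b, the noise level, and the 1 Hz filter cutoff are all assumptions chosen for this demonstration):

```python
import numpy as np
from scipy import signal

np.random.seed(1)
fs, dur = 1000.0, 10.0                  # sample rate (Hz), duration (s)
t = np.arange(0, dur, 1/fs)
w0 = 2*np.pi*25                         # assumed signal frequency (rad/s)
a, b = 0.8, 0.6                         # in-phase and quadrature amplitudes
f1 = a*np.cos(w0*t) + b*np.sin(w0*t) + 5*np.random.normal(size=t.size)

# step (b): mixing shifts a/2 and b/2 to DC (plus terms near 2*w0)
f2 = f1*np.cos(w0*t)                    # here omega_1 = omega_0 exactly
f3 = f1*np.sin(w0*t)

# step (c): aggressive time-domain low-pass filter
sos = signal.butter(4, 1.0, 'lowpass', fs=fs, output='sos')
phi2 = signal.sosfiltfilt(sos, f2)
phi3 = signal.sosfiltfilt(sos, f3)

z = 2*(phi2.mean() + 1j*phi3.mean())    # (in-phase) + j (quadrature)
print(z)                                # approximately 0.8 + 0.6j
```

Even with noise several times larger than the signal, the lock-in recovers a and b, which is the point of Exercise 06.9's analysis.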

ordinary differential equations

lumped-parameter modeling

pde Partial differential equations

An ordinary differential equation is one with (ordinary) derivatives of functions of a single variable each, time in many applications. These typically describe quantities in some sort of lumped-parameter way: mass as a "point particle," a spring's force as a function of time-varying displacement across it, a resistor's current as a function of time-varying voltage across it. Given the simplicity of such models, in comparison to the wildness of nature, it is quite surprising how well they work for a great many phenomena. For instance, electronics, rigid body mechanics, population dynamics, bulk fluid mechanics, and bulk heat transfer can be lumped-parameter modeled. However, as we have seen, there are many phenomena of which we require more detailed models. These include


time-varying spatial distribution

partial differential equations

dependent variables

independent variables

analytic solution

numeric solution

1. There are some analytic techniques for gaining insight into PDEs for which there are no known solutions, such as considering the phase space. This is an active area of research; for more, see Bove, Colombini, and Del Santo (Antonio Bove, F. (Ferruccio) Colombini, and Daniele Del Santo, Phase Space Analysis of Partial Differential Equations, Progress in Nonlinear Differential Equations and Their Applications, v. 69, Boston/Berlin: Birkhäuser, 2006, ISBN 9780817645212).

• detailed fluid mechanics,
• detailed heat transfer,
• solid mechanics,
• electromagnetism, and
• quantum mechanics.

In many cases, what is required to account for is the time-varying spatial distribution of a quantity. In fluid mechanics, we treat a fluid as having quantities, such as density and velocity, that vary continuously over space and time. Deriving the governing equations for such phenomena typically involves vector calculus: we observed that statements about quantities like the divergence (e.g., continuity) can be made about certain scalar and vector fields. Such statements are governing equations (e.g., the continuity equation), and they are partial differential equations (PDEs) because the quantities of interest, called dependent variables (e.g., density and velocity), are both temporally and spatially varying (temporal and spatial variables are therefore called independent variables).

In this chapter, we explore the analytic solution of PDEs. This is related to, but distinct from, the numeric solution (i.e., simulation) of PDEs, which is another important topic. Many PDEs have no known analytic solution, so numeric solution is the best available option.¹ However, it is important to note that the insight one can gain from an analytic solution is often much greater than that from a numeric solution. This is easily understood when one considers that a numeric solution is an approximation for a specific set

2. Kreyszig, Advanced Engineering Mathematics, Ch. 12.

3. W.A. Strauss, Partial Differential Equations: An Introduction, Wiley, 2007, ISBN 9780470054567. A thorough and yet relatively compact introduction.

4. R. Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), Pearson Modern Classics for Advanced Mathematics, Pearson Education Canada, 2018, ISBN 9780134995434.

of initial and boundary conditions. Typically, very little can be said of what would happen in general, although this is often what we seek to know. So, despite the importance of numeric solution, one should always prefer an analytic solution.

Three good texts on PDEs for further study are Kreyszig,² Strauss,³ and Haberman.⁴


initial conditions

boundary conditions

well-posed problem

linear

nonlinear

pdeclass Classifying PDEs

PDEs often have an infinite number of solutions; however, when applying them to physical systems, we usually assume that a deterministic, or at least probabilistic, sequence of events will occur. Therefore, we impose additional constraints on a PDE, usually in the form of

1. initial conditions: values of the dependent variables over all space at an initial time, and
2. boundary conditions: values of the dependent variables (or their derivatives) over all time.

Ideally, imposing such conditions leaves us with a well-posed problem, which has three aspects (Antonio Bove, F. (Ferruccio) Colombini, and Daniele Del Santo, Phase Space Analysis of Partial Differential Equations, Progress in Nonlinear Differential Equations and Their Applications, v. 69, Boston/Berlin: Birkhäuser, 2006, ISBN 9780817645212, § 1.5):

existence: There exists at least one solution.
uniqueness: There exists at most one solution.
stability: If the PDE, boundary conditions, or initial conditions are changed slightly, the solution changes only slightly.

As with ODEs PDEs can be linear or nonlinear


order

second-order PDEs

forcing function

homogeneous

that is, the dependent variables and their derivatives can appear in only linear combinations (linear PDE) or in one or more nonlinear combinations (nonlinear PDE). As with ODEs, there are more known analytic solutions to linear PDEs than to nonlinear PDEs.

The order of a PDE is the order of its highest partial derivative. A great many physical models can be described by second-order PDEs, or systems thereof. Let u be a dependent scalar variable, a function of the n temporal and spatial variables x_i. A second-order, linear PDE has the form, for coefficients α, γ, δ real functions of the x_i (W.A. Strauss, Partial Differential Equations: An Introduction, Wiley, 2007, ISBN 9780470054567, a thorough and yet relatively compact introduction, § 1.6),

$$\underbrace{\sum_{i=1}^n \sum_{j=1}^n \alpha_{ij}\,\partial^2_{x_i x_j} u}_{\text{second-order terms}} + \underbrace{\sum_{k=1}^n \left(\gamma_k\,\partial_{x_k} u + \delta_k u\right)}_{\text{first- and zeroth-order terms}} = \underbrace{f(x_1, \cdots, x_n)}_{\text{forcing}} \quad\text{(class.1)}$$

where f is called a forcing function. When f is zero, Equation class.1 is called homogeneous.

We can consider the coefficients α_ij to be components of a matrix A, with rows indexed by i and columns indexed by j. There are four prominent classes, defined by the eigenvalues of A:

elliptic: the eigenvalues all have the same sign;
parabolic: the eigenvalues have the same sign, except one that is zero;
hyperbolic: exactly one eigenvalue has the opposite sign of the others; and
ultrahyperbolic: at least two eigenvalues of each sign.

cauchy-kowalevski theorem

cauchy problems

The first three of these have received extensive treatment. They are named after conic sections, due to the similarity the equations have with polynomials when derivatives are considered analogous to powers of polynomial variables. For instance, here is a case of each of the first three classes:

$$\partial^2_{xx} u + \partial^2_{yy} u = 0 \quad\text{(elliptic)}$$
$$\partial^2_{xx} u - \partial^2_{yy} u = 0 \quad\text{(hyperbolic)}$$
$$\partial^2_{xx} u - \partial_t u = 0 \quad\text{(parabolic)}$$

When A depends on the x_i, it may have multiple classes across its domain. In general, this equation and its associated initial and boundary conditions do not comprise a well-posed problem; however, several special cases have been shown to be well-posed. Thus far, the most general statement of existence and uniqueness is the cauchy-kowalevski theorem for cauchy problems.
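The eigenvalue test behind this classification is mechanical enough to sketch in code (a helper written for this discussion, not from the text; it assumes a real symmetric coefficient matrix A):

```python
import numpy as np

# classify a second-order PDE by the signs of the eigenvalues of its
# (assumed symmetric) coefficient matrix A
def classify(A):
    lam = np.linalg.eigvalsh(np.asarray(A, dtype=float))
    pos = int((lam > 1e-12).sum())
    neg = int((lam < -1e-12).sum())
    zero = int((np.abs(lam) <= 1e-12).sum())
    n = len(lam)
    if zero == 0 and (pos == n or neg == n):
        return 'elliptic'          # all eigenvalues share one sign
    if zero == 1 and (pos == n - 1 or neg == n - 1):
        return 'parabolic'         # one zero, the rest share a sign
    if zero == 0 and (pos == 1 or neg == 1):
        return 'hyperbolic'        # exactly one eigenvalue opposes
    return 'ultrahyperbolic'       # at least two of each sign

print(classify([[1, 0], [0, 1]]))   # laplace operator: elliptic
print(classify([[1, 0], [0, -1]]))  # wave operator: hyperbolic
print(classify([[1, 0], [0, 0]]))   # diffusion operator: parabolic
```

For the three example equations above (with variables ordered to match each A), the helper reproduces the stated classes.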


sturm-liouville (SshyL) differential equation

regular SshyL problem

5. For the S-L problem to be regular, it has the additional constraints that p, q, σ are continuous and p, σ > 0 on [a, b]. This is also sometimes called the sturm-liouville eigenvalue problem. See Haberman (Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 5.3) for the more general (non-regular) S-L problem and Haberman (ibid., § 7.4) for the multi-dimensional analog.

boundary value problems

eigenfunctions

eigenvalues

6. These eigenvalues are closely related to, but distinct from, the "eigenvalues" that arise in systems of linear ODEs.

7. Ibid., § 5.3.

pdesturm Sturm-liouville problems

Before we introduce an important solution method for PDEs in Lecture pdeseparation, we consider an ordinary differential equation that will arise in that method when dealing with a single spatial dimension x: the sturm-liouville (S-L) differential equation. Let p, q, σ be functions of x on open interval (a, b). Let X be the dependent variable and λ constant. The regular S-L problem is the S-L ODE⁵

$$\frac{d}{dx}\left(pX'\right) + qX + \lambda\sigma X = 0 \quad\text{(sturm.1a)}$$

with boundary conditions

$$\beta_1 X(a) + \beta_2 X'(a) = 0 \quad\text{(sturm.2a)}$$
$$\beta_3 X(b) + \beta_4 X'(b) = 0 \quad\text{(sturm.2b)}$$

with coefficients β_i ∈ R. This is a type of boundary value problem.

This problem has nontrivial solutions, called eigenfunctions, X_n(x), with n ∈ Z₊, corresponding to specific values of λ = λ_n, called eigenvalues.⁶ There are several important theorems proven about this (see Haberman⁷). Of greatest interest to us are that

1. there exist an infinite number of eigenfunctions X_n (unique within a multiplicative constant);
2. there exists a unique corresponding real eigenvalue λ_n for each eigenfunction X_n;
3. the eigenvalues can be ordered as λ₁ < λ₂ < ···;
4. eigenfunction X_n has n − 1 zeros on open interval (a, b); and
5. the eigenfunctions X_n form an orthogonal basis with respect to weighting function σ, such that any piecewise continuous function f: [a, b] → R can be represented by a generalized fourier series on [a, b].

This last theorem will be of particular interest in Lecture pdeseparation.

homogeneous boundary conditions

periodic boundary conditions

Types of boundary conditions

Boundary conditions of the sturm-liouville kind (sturm.2) have four sub-types:

dirichlet: for just β₂, β₄ = 0;
neumann: for just β₁, β₃ = 0;
robin: for all β_i ≠ 0; and
mixed: if β₁ = 0, β₃ ≠ 0; if β₂ = 0, β₄ ≠ 0.

There are many problems that are not regular sturm-liouville problems. For instance, the right-hand sides of Equation sturm.2 are zero, making them homogeneous boundary conditions; however, these can also be nonzero. Another case is periodic boundary conditions:

$$X(a) = X(b) \quad\text{(sturm.3a)}$$
$$X'(a) = X'(b) \quad\text{(sturm.3b)}$$


Example pdesturm-1: a sturm-liouville problem with dirichlet boundary conditions

Problem Statement

Consider the differential equation

$$X'' + \lambda X = 0 \quad\text{(sturm.4)}$$

with dirichlet boundary conditions on the boundary of the interval [0, L]:

$$X(0) = 0 \text{ and } X(L) = 0. \quad\text{(sturm.5)}$$

Solve for the eigenvalues and eigenfunctions.

Problem Solution

This is a sturm-liouville problem, so we know the eigenvalues are real. The well-known general solution to the ODE is

$$X(x) = \begin{cases} k_1 + k_2 x & \lambda = 0 \\ k_1 e^{j\sqrt{\lambda}x} + k_2 e^{-j\sqrt{\lambda}x} & \text{otherwise} \end{cases} \quad\text{(sturm.6)}$$

with constants k₁, k₂. The solution must also satisfy the boundary conditions. Let's apply them to the case of λ = 0 first:

$$X(0) = 0 \Rightarrow k_1 + k_2(0) = 0 \Rightarrow k_1 = 0 \quad\text{(sturm.7)}$$
$$X(L) = 0 \Rightarrow k_1 + k_2 L = 0 \Rightarrow k_2 = -k_1/L \quad\text{(sturm.8)}$$

Together, these imply k₁ = k₂ = 0, which gives the trivial solution X(x) = 0, in which we aren't interested. We say, then, for nontrivial solutions, λ ≠ 0. Now let's check λ < 0. The solution becomes

$$X(x) = k_1 e^{-\sqrt{|\lambda|}x} + k_2 e^{\sqrt{|\lambda|}x} \quad\text{(sturm.9)}$$
$$= k_3\cosh\left(\sqrt{|\lambda|}x\right) + k_4\sinh\left(\sqrt{|\lambda|}x\right) \quad\text{(sturm.10)}$$

where k₃ and k₄ are real constants. Again applying the boundary conditions,

$$X(0) = 0 \Rightarrow k_3\cosh(0) + k_4\sinh(0) = 0 \Rightarrow k_3 = 0$$
$$X(L) = 0 \Rightarrow 0\cdot\cosh\left(\sqrt{|\lambda|}L\right) + k_4\sinh\left(\sqrt{|\lambda|}L\right) = 0 \Rightarrow k_4\sinh\left(\sqrt{|\lambda|}L\right) = 0.$$

However, sinh(√|λ| L) ≠ 0 for L > 0, so k₄ = k₃ = 0: again, the trivial solution. Now let's try λ > 0. The solution can be written

$$X(x) = k_5\cos\left(\sqrt{\lambda}x\right) + k_6\sin\left(\sqrt{\lambda}x\right). \quad\text{(sturm.11)}$$

Applying the boundary conditions for this case,

$$X(0) = 0 \Rightarrow k_5\cos(0) + k_6\sin(0) = 0 \Rightarrow k_5 = 0$$
$$X(L) = 0 \Rightarrow 0\cdot\cos\left(\sqrt{\lambda}L\right) + k_6\sin\left(\sqrt{\lambda}L\right) = 0 \Rightarrow k_6\sin\left(\sqrt{\lambda}L\right) = 0.$$

Now, sin(√λ L) = 0 for

$$\sqrt{\lambda}L = n\pi \Rightarrow \lambda = \left(\frac{n\pi}{L}\right)^2, \quad (n \in \mathbb{Z}_+).$$

Therefore, the only nontrivial solutions that satisfy both the ODE and the boundary conditions are the eigenfunctions

$$X_n(x) = \sin\left(\sqrt{\lambda_n}\,x\right) \quad\text{(sturm.12a)}$$
$$= \sin\left(\frac{n\pi}{L}x\right) \quad\text{(sturm.12b)}$$

with corresponding eigenvalues

$$\lambda_n = \left(\frac{n\pi}{L}\right)^2. \quad\text{(sturm.13)}$$

Note that, because λ > 0, λ₁ is the lowest eigenvalue.

Plotting the eigenfunctions

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: eigenfunctions_example_plot.ipynb
notebook kernel: python3

First load some Python packages


import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Set L = 1 and compute values for the first four eigenvalues lambda_n and eigenfunctions X_n:

L = 1
x = np.linspace(0, L, 100)
n = np.linspace(1, 4, 4, dtype=int)
lambda_n = (n*np.pi/L)**2
X_n = np.zeros([len(n), len(x)])
for i, n_i in enumerate(n):
    X_n[i] = np.sin(np.sqrt(lambda_n[i])*x)

Plot the eigenfunctions

for in_i in enumerate(n)pltplot(

xX_n[i]linewidth=2label=n = +str(n_i)

)pltlegend()pltshow() display the plot

[plot of the eigenfunctions X_n(x) for n = 1, 2, 3, 4 on [0, 1]]


We see that the fourth of the S-L theorems appears true: n − 1 zeros of X_n exist on the open interval (0, 1).


separation of variables

linear

product solution

separable PDEs

pdeseparation PDE solution by separation of variables

We are now ready to learn one of the most important techniques for solving PDEs: separation of variables. It applies only to linear PDEs, since it will require the principle of superposition. Not all linear PDEs yield to this solution technique, but several that are important do.

The technique includes the following steps.

assume a product solution: Assume the solution can be written as a product solution u_p: the product of functions of each independent variable.
separate PDE: Substitute u_p into the PDE and rearrange such that at least one side of the equation has functions of a single independent variable. If this is possible, the PDE is called separable.
set equal to a constant: Each side of the equation depends on different independent variables; therefore, they must each equal the same constant, often called −λ.
repeat separation as needed: If there are more than two independent variables, there will be an ODE in the separated variable and a PDE (with one fewer variables) in the other independent variables. Attempt to separate the PDE until only ODEs remain.
solve each boundary value problem: Solve each boundary value problem ODE,

pde Partial differential equations separation PDE solution by separation of variables p 1

eigenfunctions

superposition

ignoring the initial conditions for now

solve the time variable ODE Solve for the

general solution of the time variable ODE

sans initial conditions

construct the product solution Multiply the

solution in each variable to construct

the product solution up If the boundary

value problems were sturm-liouville

the product solution is a family of

eigenfunctions from which any function

can be constructed via a generalized

fourier series

apply the initial condition The product

solutions individually usually do not

meet the initial condition However a

generalized fourier series of them nearly

always does Superposition tells us a

linear combination of solutions to the

PDE and boundary conditions is also

a solution the unique series that also

satisfies the initial condition is the

unique solution to the entire problem
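The first two steps can be sketched symbolically for the 1D heat equation (an illustration assuming sympy; the worked example below does this by hand). After substituting u_p = T(t)X(x) and dividing by kTX, each side depends on only one independent variable:

```python
import sympy as sp

t, x, k = sp.symbols('t x k', positive=True)
T = sp.Function('T')(t)
X = sp.Function('X')(x)
u = T*X  # assumed product solution u_p = T(t) X(x)

# substitute into u_t = k u_xx and divide both sides by k T X
lhs = sp.diff(u, t)/(k*T*X)       # -> T'/(k T), depends on t alone
rhs = k*sp.diff(u, x, 2)/(k*T*X)  # -> X''/X, depends on x alone
```

Since lhs is a function of t only and rhs a function of x only, equality forces both to equal a constant, conventionally –λ.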

Example pde.separation-1: 1D diffusion equation

Problem Statement

Consider the one-dimensional diffusion equation PDE^a

∂_t u(t, x) = k ∂²_xx u(t, x)  (separation1)

with real constant k, with dirichlet boundary conditions on interval x ∈ [0, L]

u(t, 0) = 0  (separation2a)
u(t, L) = 0  (separation2b)

and with initial condition

u(0, x) = f(x)  (separation3)

where f is some piecewise continuous function on [0, L].

a. For more on the diffusion or heat equation, see Haberman (Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 2.3), Kreyszig (Kreyszig, Advanced Engineering Mathematics, § 12.5), and Strauss (Strauss, Partial Differential Equations: An Introduction, § 2.3).

Problem Solution

Assume a product solution

First, we assume a product solution of the form u_p(t, x) = T(t)X(x), where T and X are unknown functions on t > 0 and x ∈ [0, L].

Separate PDE

Second, we substitute the product solution into Equation separation1 and separate variables:

T′X = kTX″ ⇒  (separation4)
T′/(kT) = X″/X.  (separation5)

So it is separable. Note that we chose to group k with T, which was arbitrary but conventional.

Set equal to a constant

Since these two sides depend on different independent variables (t and x), they must equal the same constant we call –λ, so we have two ODEs:

T′/(kT) = –λ ⇒ T′ + λkT = 0  (separation6)
X″/X = –λ ⇒ X″ + λX = 0.  (separation7)

Solve the boundary value problem

The latter of these equations, with the boundary conditions (separation2), is precisely the same sturm-liouville boundary value problem from Example pde.sturm-1, which had eigenfunctions

Xn(x) = sin(√λn x)  (separation8a)
      = sin(nπx/L)  (separation8b)

with corresponding (positive) eigenvalues

λn = (nπ/L)².  (separation9)

Solve the time variable ODE

The time variable ODE is homogeneous and has the familiar general solution

T(t) = c e^(–kλt)  (separation10)

with real constant c. However, the boundary value problem restricted values of λ to λn, so

Tn(t) = c e^(–k(nπ/L)²t).  (separation11)

Construct the product solution

The product solution is

u_p(t, x) = Tn(t)Xn(x)
          = c e^(–k(nπ/L)²t) sin(nπx/L).  (separation12)

This is a family of solutions that each satisfy only exotically specific initial conditions.

Apply the initial condition

The initial condition is u(0, x) = f(x). The eigenfunctions of the boundary value problem form a fourier series that satisfies the initial condition on the interval [0, L] if we extend f to be periodic and odd over x (Kreyszig, Advanced Engineering Mathematics, p. 550); we call the extension f*. The odd series synthesis can be written

f*(x) = Σ_{n=1}^∞ bn sin(nπx/L)  (separation13)

where the fourier analysis gives

bn = (2/L) ∫_0^L f*(χ) sin(nπχ/L) dχ.  (separation14)

So the complete solution is

u(t, x) = Σ_{n=1}^∞ bn e^(–k(nπ/L)²t) sin(nπx/L).  (separation15)

Notice this satisfies the PDE, the boundary conditions, and the initial condition.

Plotting solutions

If we want to plot solutions, we need to specify an initial condition u(0, x) = f*(x) over [0, L]. We can choose anything piecewise continuous, but for simplicity let's let

f(x) = 1 (x ∈ [0, L]).

The odd periodic extension is an odd square wave. The integral (separation14) gives

bn = (2/(nπ))(1 – cos(nπ))
   = { 0       n even
     { 4/(nπ)  n odd.  (separation16)
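This analysis integral can be checked symbolically (a sketch, assuming sympy):

```python
import sympy as sp

chi, L = sp.symbols('chi L', positive=True)
n = sp.symbols('n', positive=True, integer=True)

# fourier analysis (separation14) for f*(x) = 1 on [0, L]
b_n = sp.integrate(2/L*sp.sin(n*sp.pi*chi/L), (chi, 0, L))
b_n = sp.simplify(b_n)  # -> 2*(1 - (-1)**n)/(n*pi)
```

Substituting even and odd n recovers the two cases of (separation16).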

Now we can write the solution as

u(t, x) = Σ_{n=1, n odd}^∞ (4/(nπ)) e^(–k(nπ/L)²t) sin(nπx/L).  (separation17)

Plotting in Python

First, load some Python packages:

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Set k = L = 1 and sum values for the first N terms of the solution:

L = 1
k = 1
N = 100
x = np.linspace(0, L, 300)
t = np.linspace(0, 2*(L/np.pi)**2, 100)
u_n = np.zeros([len(t), len(x)])
for n in range(N):
    n = n+1  # because index starts at 0
    if n % 2 == 0:  # even
        pass  # already initialized to zeros
    else:  # odd
        u_n += 4/(n*np.pi)*np.outer(
            np.exp(-k*(n*np.pi/L)**2*t),
            np.sin(n*np.pi/L*x)
        )

Let's first plot the initial condition:

p = plt.figure()
plt.plot(x, u_n[0])
plt.xlabel('space $x$')
plt.ylabel('$u(0,x)$')
plt.show()

[Figure: u(0, x) versus x ∈ [0, 1], the truncated-series approximation of the square initial condition f(x) = 1.]

Now we plot the entire response.

p = plt.figure()
plt.contourf(t, x, u_n.T)
c = plt.colorbar()
c.set_label('$u(t,x)$')
plt.xlabel('time $t$')
plt.ylabel('space $x$')
plt.show()

[Figure: contour plot of u(t, x) over time t and space x, with colorbar for u(t, x).]

We see the diffusive action proceeds as we expected.

Python code in this section was generated from a Jupyter notebook named pde_separation_example_01.ipynb with a python3 kernel.
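The claim that (separation15) satisfies the PDE can also be spot-checked numerically; a sketch (assuming numpy) comparing finite-difference estimates of u_t and k u_xx at an interior point:

```python
import numpy as np

# series solution (separation17) with k = L = 1, truncated at N odd terms
def u(t, x, k=1.0, L=1.0, N=200):
    n = np.arange(1, N+1, 2)  # odd n only
    return np.sum(
        4/(n*np.pi) * np.exp(-k*(n*np.pi/L)**2*t) * np.sin(n*np.pi*x/L)
    )

# check u_t = k u_xx at an interior point by central differences
t0, x0, h = 0.05, 0.3, 1e-4
u_t = (u(t0+h, x0) - u(t0-h, x0)) / (2*h)
u_xx = (u(t0, x0+h) - 2*u(t0, x0) + u(t0, x0-h)) / h**2
```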

8. Kreyszig, Advanced Engineering Mathematics, §§ 12.9, 12.10.
9. Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), §§ 4.5, 7.3.

pde.wave The 1D wave equation

The one-dimensional wave equation is the linear PDE

∂²_tt u(t, x) = c² ∂²_xx u(t, x)  (wave1)

with real constant c. This equation models such phenomena as strings, fluids, sound, and light. It is subject to initial and boundary conditions and can be extended to multiple spatial dimensions. For 2D and 3D examples in rectangular and polar coordinates, see Kreyszig⁸ and Haberman⁹.

Example pde.wave-1: vibrating string PDE solution by separation of variables

Problem Statement

Consider the one-dimensional wave equation PDE

∂²_tt u(t, x) = c² ∂²_xx u(t, x)  (wave2)

with real constant c, and with dirichlet boundary conditions on interval x ∈ [0, L]

u(t, 0) = 0 and u(t, L) = 0  (wave3a)

and with initial conditions (we need two because of the second time-derivative)

u(0, x) = f(x) and ∂_t u(0, x) = g(x)  (wave4)

where f and g are some piecewise continuous functions on [0, L].

Problem Solution

Assume a product solution

First, we assume a product solution of the form u_p(t, x) = T(t)X(x), where T and X are unknown functions on t > 0 and x ∈ [0, L].

Separate PDE

Second, we substitute the product solution into Equation wave2 and separate variables:

T″X = c²TX″ ⇒  (wave5)
T″/(c²T) = X″/X.  (wave6)

So it is separable. Note that we chose to group c with T, which was arbitrary but conventional.

Set equal to a constant

Since these two sides depend on different independent variables (t and x), they must equal the same constant we call –λ, so we have two ODEs:

T″/(c²T) = –λ ⇒ T″ + λc²T = 0  (wave7)
X″/X = –λ ⇒ X″ + λX = 0.  (wave8)

Solve the boundary value problem

The latter of these equations, with the boundary conditions (wave3), is precisely the same sturm-liouville boundary value problem from Example pde.sturm-1, which had eigenfunctions

Xn(x) = sin(√λn x)  (wave9a)
      = sin(nπx/L)  (wave9b)

with corresponding (positive) eigenvalues

λn = (nπ/L)².  (wave10)

Solve the time variable ODE

The time variable ODE is homogeneous and, with λ restricted to the (positive) reals by the boundary value problem, has the familiar general solution

T(t) = k1 cos(c√λ t) + k2 sin(c√λ t)  (wave11)

with real constants k1 and k2. However, the boundary value problem restricted values of λ to λn, so

Tn(t) = k1 cos(cnπt/L) + k2 sin(cnπt/L).  (wave12)

Construct the product solution

The product solution is

u_p(t, x) = Tn(t)Xn(x)
          = k1 sin(nπx/L) cos(cnπt/L) + k2 sin(nπx/L) sin(cnπt/L).

This is a family of solutions that each satisfy only exotically specific initial conditions.

Apply the initial conditions

Recall that superposition tells us that any linear combination of the product solution is also a solution. Therefore,

u(t, x) = Σ_{n=1}^∞ an sin(nπx/L) cos(cnπt/L) + bn sin(nπx/L) sin(cnπt/L)  (wave13)

is a solution. If an and bn are properly selected to satisfy the initial conditions, Equation wave13 will be the solution to the entire problem. Substituting t = 0 into our potential solution gives

u(0, x) = Σ_{n=1}^∞ an sin(nπx/L)  (wave14a)
∂_t u(t, x)|_{t=0} = Σ_{n=1}^∞ bn (cnπ/L) sin(nπx/L).  (wave14b)

Let us extend f and g to be periodic and odd over x; we call the extensions f* and g*. From Equation wave14, the initial conditions are satisfied if

f*(x) = Σ_{n=1}^∞ an sin(nπx/L)  (wave15a)
g*(x) = Σ_{n=1}^∞ bn (cnπ/L) sin(nπx/L).  (wave15b)

We identify these as two odd fourier syntheses. The corresponding fourier analyses are

an = (2/L) ∫_0^L f*(χ) sin(nπχ/L) dχ  (wave16a)
bn (cnπ/L) = (2/L) ∫_0^L g*(χ) sin(nπχ/L) dχ.  (wave16b)

So the complete solution is Equation wave13 with components given by Equation wave16. Notice this satisfies the PDE, the boundary conditions, and the initial conditions.

Discussion

It can be shown that this series solution is equivalent to two traveling waves that are interfering (see Haberman^a and Kreyszig^b). This is convenient because computing the series solution exactly requires an infinite summation. We show in the following section that the approximation by partial summation is still quite good.

Choosing specific initial conditions

If we want to plot solutions, we need to specify initial conditions over [0, L]. Let's model a string being suddenly struck from rest as

f(x) = 0  (wave17)
g(x) = δ(x – ∆L)  (wave18)

where δ is the dirac delta distribution and ∆ ∈ [0, 1] is a fraction of L representing the location of the string being struck. The odd periodic extension is an odd pulse train. The integrals of (wave16) give

an = 0  (wave19a)
bn = (2/(cnπ)) ∫_0^L δ(χ – ∆L) sin(nπχ/L) dχ
   = (2/(cnπ)) sin(nπ∆).  (sifting property)

Now we can write the solution as

u(t, x) = Σ_{n=1}^∞ (2/(cnπ)) sin(nπ∆) sin(nπx/L) sin(cnπt/L).  (wave20)
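The sifting-property integral can be checked with sympy (a sketch, using an example strike location ∆ = 1/10 and L = 1, since symbolic limits make the delta integral awkward):

```python
import sympy as sp

chi = sp.symbols('chi', positive=True)
n = sp.symbols('n', positive=True, integer=True)
L = 1
Delta = sp.Rational(1, 10)  # example strike-location fraction

# integrate delta(chi - Delta*L) * sin(n pi chi / L) over [0, L]
I = sp.integrate(sp.DiracDelta(chi - Delta*L)*sp.sin(n*sp.pi*chi/L),
                 (chi, 0, L))
# sifting: the integral picks out sin(n pi Delta)
```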

Plotting in Python

First, load some Python packages:

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Set c = L = 1 and sum values for the first N terms of the solution for some striking location ∆:

Delta = 0.1  # 0 <= Delta <= 1
L = 1
c = 1
N = 150
t = np.linspace(0, 30*(L/np.pi)**2, 100)
x = np.linspace(0, L, 150)
t_b, x_b = np.meshgrid(t, x)
u_n = np.zeros([len(x), len(t)])
for n in range(N):
    n = n+1  # because index starts at 0
    u_n += 4/(c*n*np.pi) * \
        np.sin(n*np.pi*Delta) * \
        np.sin(c*n*np.pi/L*t_b) * \
        np.sin(n*np.pi/L*x_b)

Let's first plot some early snapshots of the response.

import seaborn as sns
n_snaps = 7
sns.set_palette(sns.diverging_palette(
    240, 10, n=n_snaps, center='dark')
)

fig, ax = plt.subplots()
it = np.linspace(2, 77, n_snaps, dtype=int)
for i in range(len(it)):
    ax.plot(x, u_n[:, it[i]], label=f't = {t[it[i]]:.3f}')
lgd = ax.legend(
    bbox_to_anchor=(1.05, 1), loc='upper left'
)
plt.xlabel('space $x$')
plt.ylabel('$u(t,x)$')
plt.show()

[Figure: seven early snapshots of u(t, x) versus x ∈ [0, 1]; a pulse forms near x = ∆ and travels along the string.]

Now we plot the entire response.

fig, ax = plt.subplots()
p = ax.contourf(t, x, u_n)
c = fig.colorbar(p, ax=ax)
c.set_label('$u(t,x)$')
plt.xlabel('time $t$')
plt.ylabel('space $x$')
plt.show()

[Figure: contour plot of u(t, x) over time t ∈ [0, 3] and space x ∈ [0, 1], with colorbar for u(t, x).]

We see a wave develop and travel, reflecting and inverting off each boundary.

a. Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 4.4.
b. Kreyszig, Advanced Engineering Mathematics, § 12.2.

Python code in this section was generated from a Jupyter notebook named pde_separation_example_02.ipynb with a python3 kernel.

pde.ex Exercises for Chapter 07

Exercise 07.1 (poltergeist)

The PDE of Example pde.separation-1 can be used to describe the conduction of heat along a long, thin rod insulated along its length, where u(t, x) represents temperature. The initial and dirichlet boundary conditions in that example would be interpreted as an initial temperature distribution along the bar and fixed temperatures of the ends. Now consider the same PDE

∂_t u(t, x) = k ∂²_xx u(t, x)  (ex1)

with real constant k, now with neumann boundary conditions on interval x ∈ [0, L]

∂_x u|_{x=0} = 0 and ∂_x u|_{x=L} = 0  (ex2a)

and with initial condition

u(0, x) = f(x)  (ex3)

where f is some piecewise continuous function on [0, L]. This represents the complete insulation of the ends of the rod, such that no heat flows from the ends (or from anywhere else).

a. Assume a product solution, separate variables into X(x) and T(t), and set the separation constant to –λ.
b. Solve the boundary value problem for its eigenfunctions Xn and eigenvalues λn.
c. Solve for the general solution of the time variable ODE.
d. Write the product solution and apply the initial condition f(x) by constructing it from a generalized fourier series of the product solution.
e. Let L = k = 1 and f(x) = 100 – (200/L)|x – L/2|, as shown in Figure 07.1. Compute the solution series components. Plot the sum of the first 50 terms over x and t.

[Figure 07.1: initial condition for Exercise 07.1 — a triangular profile f(x) rising from 0 at x = 0 to 100 at x = L/2 and back to 0 at x = L.]

Exercise 07.2 (kathmandu)

Consider the free vibration of a uniform and relatively thin beam, with modulus of elasticity E, second moment of cross-sectional area I, and mass-per-length µ, pinned at each end. The PDE describing this is a version of the euler-bernoulli beam equation for vertical motion u:

∂²_tt u(t, x) = –α² ∂⁴_xxxx u(t, x)  (ex4)

with real constant α defined as

α² = EI/µ.  (ex5)

Pinned supports fix vertical motion such that we have boundary conditions on interval x ∈ [0, L]

u(t, 0) = 0 and u(t, L) = 0.  (ex6a)

Additionally, pinned supports cannot provide a moment, so

∂²_xx u|_{x=0} = 0 and ∂²_xx u|_{x=L} = 0.  (ex6b)

Furthermore, consider the initial conditions

u(0, x) = f(x) and ∂_t u|_{t=0} = 0  (ex7a)

where f is some piecewise continuous function on [0, L].

a. Assume a product solution, separate variables into X(x) and T(t), and set the separation constant to –λ.
b. Solve the boundary value problem for its eigenfunctions Xn and eigenvalues λn. Assume real λ > 0 (it's true but tedious to show).
c. Solve for the general solution of the time variable ODE.
d. Write the product solution and apply the initial conditions by constructing it from a generalized fourier series of the product solution.
e. Let L = α = 1 and f(x) = sin(10πx/L), as shown in Figure 07.2. Compute the solution series components. Plot the sum of the first 50 terms over x and t.

[Figure 07.2: initial condition for Exercise 07.2 — f(x) = sin(10πx/L), oscillating between –1 and 1 over x ∈ [0, 1].]

opt Optimization

opt.grad Gradient descent

Consider a multivariate function f: Rⁿ → R that represents some cost or value. This is called an objective function, and we often want to find an X ∈ Rⁿ that yields f's extremum: minimum or maximum, depending on whichever is desirable.

It is important to note, however, that some functions have no finite extremum. Other functions have multiple. Finding a global extremum is generally difficult; however, many good methods exist for finding a local extremum: an extremum for some region R ⊂ Rⁿ.

The method explored here is called gradient descent. It will soon become apparent why it has this name.

Stationary points

Recall from basic calculus that a function f of a single variable has potential local extrema where df(x)/dx = 0. The multivariate version of this for multivariate function f is

∇f = 0.  (grad1)

A value X for which Equation grad1 holds is called a stationary point. However, as in the univariate case, a stationary point may not be a local extremum; in these cases it is called a saddle point.

Consider the hessian matrix H with values, for independent variables xi,

Hij = ∂²_{xi xj} f.  (grad2)

For a stationary point X, the second partial derivative test tells us if it is a local maximum, local minimum, or saddle point:

minimum  If H(X) is positive definite (all its eigenvalues are positive), X is a local minimum.
maximum  If H(X) is negative definite (all its eigenvalues are negative), X is a local maximum.
saddle  If H(X) is indefinite (it has both positive and negative eigenvalues), X is a saddle point.

These are sometimes called tests for concavity: minima occur where f is convex and maxima where f is concave (i.e., where –f is convex).

It turns out, however, that solving Equation grad1 directly for stationary points is generally hard. Therefore, we will typically use an iterative technique for estimating them.
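A small numeric sketch of the second partial derivative test (assuming numpy), classifying two illustrative Hessians by their eigenvalues:

```python
import numpy as np

def classify(H):
    """Second partial derivative test via Hessian eigenvalues."""
    eig = np.linalg.eigvalsh(H)  # eigenvalues of the symmetric Hessian
    if np.all(eig > 0):
        return 'minimum'
    if np.all(eig < 0):
        return 'maximum'
    return 'saddle'

# f(x) = x1^2 + x2^2 has Hessian 2I: a local minimum at the origin
H_min = np.array([[2.0, 0.0], [0.0, 2.0]])
# f(x) = x1^2 - x2^2 has Hessian diag(2, -2): a saddle at the origin
H_saddle = np.array([[2.0, 0.0], [0.0, -2.0]])
```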

The gradient points the way

Although Equation grad1 isn't usually directly useful for computing stationary points, it suggests iterative techniques that are. Several techniques rely on the insight that the gradient points toward stationary points. Recall from Lecture vecs.grad that ∇f is a vector field that points in the direction of greatest increase in f.

Consider starting at some point x0 and wanting to move iteratively closer to a stationary point. So if one is seeking a maximum of f, then choose x1 to be in the direction of ∇f. If one is seeking a minimum of f, then choose x1 to be opposite the direction of ∇f.

The question becomes: how far α should we go in (or opposite) the direction of the gradient? Surely too-small α will require more iteration, and too-large α will lead to poor convergence or missing minima altogether. This framing of the problem is called line search. There are a few common methods for choosing α, called the step size, some more computationally efficient than others.

Two methods for choosing the step size are described below. Both are framed as minimization methods, but changing the sign of the step turns them into maximization methods.

The classical method

Let

g_k = ∇f(x_k),  (grad3)

the gradient at the algorithm's current estimate x_k of the minimum. The classical method of choosing α is to attempt to solve analytically for

α_k = argmin_α f(x_k – αg_k).  (grad4)

This solution approximates the function f as one varies α. It is approximate because, as α varies, so should x. But even with α as the only variable, Equation grad4 may be difficult or impossible to solve. However, this is sometimes called the "optimal" choice for α. Here "optimality" refers not to practicality but to ideality. This method is rarely used to solve practical problems.

The algorithm of the classical gradient descent method can be summarized in the pseudocode of Algorithm 1. It is described further in Kreyszig.¹

1. Kreyszig, Advanced Engineering Mathematics, § 22.1.
2. Jonathan Barzilai and Jonathan M. Borwein. "Two-Point Step Size Gradient Methods". In: IMA Journal of Numerical Analysis 8.1 (Jan. 1988), pp. 141–148. ISSN: 0272-4979. DOI: 10.1093/imanum/8.1.141. This includes an innovative line search method.

Algorithm 1 Classical gradient descent

1: procedure classical_minimizer(f, x0, T)
2:   while δx > T do   (until threshold T is met)
3:     g_k ← ∇f(x_k)
4:     α_k ← argmin_α f(x_k – α g_k)
5:     x_{k+1} ← x_k – α_k g_k
6:     δx ← ‖x_{k+1} – x_k‖
7:     k ← k + 1
8:   end while
9:   return x_k   (the threshold was reached)
10: end procedure

The Barzilai and Borwein method

In practice, several non-classical methods are used for choosing step size α. Most of these construct criteria for step sizes that are too small and too large, and prescribe choosing some α that (at least in certain cases) must be in the sweet-spot in between. Barzilai and Borwein² developed such a prescription, which we now present.

Let ∆x_k = x_k – x_{k–1} and ∆g_k = g_k – g_{k–1}. This method minimizes ‖∆x – α∆g‖² by choosing

α_k = (∆x_k · ∆g_k)/(∆g_k · ∆g_k).  (grad5)

3. Barzilai and Borwein, "Two-Point Step Size Gradient Methods".

The algorithm of this gradient descent method can be summarized in the pseudocode of Algorithm 2. It is described further in Barzilai and Borwein.³

Algorithm 2 Barzilai and Borwein gradient descent

1: procedure barzilai_minimizer(f, x0, T)
2:   while δx > T do   (until threshold T is met)
3:     g_k ← ∇f(x_k)
4:     ∆g_k ← g_k – g_{k–1}
5:     ∆x_k ← x_k – x_{k–1}
6:     α_k ← (∆x_k · ∆g_k)/(∆g_k · ∆g_k)
7:     x_{k+1} ← x_k – α_k g_k
8:     δx ← ‖x_{k+1} – x_k‖
9:     k ← k + 1
10:  end while
11:  return x_k   (the threshold was reached)
12: end procedure
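Algorithm 2 can also be sketched in plain numpy (an illustration, not the symbolic implementation used in the following example; the seed step and tolerances here are assumptions):

```python
import numpy as np

def bb_descent(grad, x0, T=1e-10, max_iter=500):
    """Barzilai-Borwein gradient descent (Algorithm 2 sketch)."""
    x_prev = np.asarray(x0, dtype=float)
    g_prev = grad(x_prev)
    x = x_prev - 1e-4*g_prev  # small first step to seed the differences
    for _ in range(max_iter):
        g = grad(x)
        dx, dg = x - x_prev, g - g_prev
        if dg @ dg == 0:
            break  # gradient unchanged: converged
        alpha = (dx @ dg) / (dg @ dg)  # step size (grad5)
        x_prev, g_prev = x, g
        x = x - alpha*g
        if np.linalg.norm(x - x_prev) < T:
            break
    return x

# f1 from the example below has its minimum at (25, -10)
grad_f1 = lambda x: np.array([2*(x[0]-25), 26*(x[1]+10)])
x_min = bb_descent(grad_f1, [-50.0, 40.0])
```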

Example opt.grad-1: Barzilai and Borwein gradient descent

Problem Statement

Consider the functions (a) f1: R² → R and (b) f2: R² → R defined as

f1(x) = (x1 – 25)² + 13(x2 + 10)²  (grad6)
f2(x) = (1/2) x · Ax – b · x  (grad7)

where

A = [10  0
      0 20]  and  (grad8a)
b = [1 1]ᵀ.  (grad8b)

Use the method of Barzilai and Borwein,^a starting at some x0, to find a minimum of each function.

a. Ibid.

Problem Solution

First, load some Python packages:

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex
from tabulate import tabulate

We begin by writing a class gradient_descent_min to perform the gradient descent. This is not optimized for speed.

class gradient_descent_min():
    """A Barzilai and Borwein gradient descent class

    Inputs:
      f: Python function of x variables
      x: list of symbolic variables (e.g. [x1, x2])
      x0: list, numeric initial guess of a min of f
      T: step size threshold for stopping the descent

    To execute the gradient descent, call descend method.

    nb: This is only for gradients in cartesian
    coordinates. Further work would be to implement
    this in multiple or generalized coordinates.
    See the grad method below for implementation.
    """

    def __init__(self, f, x, x0, T):
        self.f = f
        self.x = Array(x)
        self.x0 = np.array(x0)
        self.T = T
        self.n = len(x0)  # size of x
        self.g = lambdify([x], self.grad(f, x), 'numpy')
        self.xk = np.array(x0)
        self.table = {}

    def descend(self):
        # unpack variables
        f = self.f
        x = self.x
        x0 = self.x0
        T = self.T
        g = self.g
        # initialize variables
        N = 0
        x_k = x0
        dx = 2*T  # can't be zero
        x_km1 = .9*x0 - .1  # can't equal x0
        g_km1 = np.array(g(x_km1))
        N_max = 100  # max iterations
        table_data = [[N, x0, np.array(g(x0)), 0]]
        while (dx > T and N < N_max) or N < 1:
            N += 1  # increment index
            g_k = np.array(g(x_k))
            dg_k = g_k - g_km1
            dx_k = x_k - x_km1
            alpha_k = dx_k.dot(dg_k)/dg_k.dot(dg_k)
            x_km1 = x_k  # store
            x_k = x_k - alpha_k*g_k
            # save
            table_data.append([N, x_k, g_k, alpha_k])
            self.xk = np.vstack((self.xk, x_k))
            # store other variables
            g_km1 = g_k
            dx = np.linalg.norm(x_k - x_km1)  # check
        self.tabulater(table_data)

    def tabulater(self, table_data):
        np.set_printoptions(precision=2)
        tabulate.LATEX_ESCAPE_RULES = {}
        self.table['python'] = tabulate(
            table_data, headers=['N', 'x_k', 'g_k', 'alpha']
        )
        self.table['latex'] = tabulate(
            table_data,
            headers=['$N$', '$\\bm{x}_k$', '$\\bm{g}_k$', '$\\alpha$'],
            tablefmt='latex_raw'
        )

    def grad(self, f, x):  # cartesian coords gradient
        return derive_by_array(f(x), x)

First, consider f1:

var('x1 x2')
x = Array([x1, x2])
f1 = lambda x: (x[0]-25)**2 + 13*(x[1]+10)**2
gd = gradient_descent_min(f=f1, x=x, x0=[-50, 40], T=1e-8)

Perform the gradient descent:

gd.descend()

Print the interesting variables:

print(gd.table['python'])

  N  x_k              g_k                    alpha
---  ---------------  ---------------------  ---------
  0  [-50  40]        [-150 1300]            0
  1  [-43.65 -15.03]  [-150 1300]            0.0423296
  2  [-38.36 -10.  ]  [-13.73 -130.74]       0.0384979
  3  [-33.11 -10.  ]  [-1.27e+02  1.24e-01]  0.041454
  4  [ 24.99 -10.  ]  [-1.16e+02 -9.62e-03]  0.499926
  5  [ 25.   -10.05]  [-0.02  0.12]          0.499999
  6  [ 25. -10.]      [-1.84e-08 -1.38e+00]  0.0385225
  7  [ 25. -10.]      [-1.70e-08  2.19e-03]  0.0384615
  8  [ 25. -10.]      [-1.57e-08  0.00e+00]  0.0384615

Now let's lambdify the function f1 so we can plot:

f1_lambda = lambdify((x1, x2), f1(x), 'numpy')

Now let's plot a contour plot with the gradient descent overlaid:

fig, ax = plt.subplots()
# contour plot
X1 = np.linspace(-100, 100, 100)
X2 = np.linspace(-50, 50, 100)
X1, X2 = np.meshgrid(X1, X2)
F1 = f1_lambda(X1, X2)
plt.contourf(X1, X2, F1)
# gradient descent plot
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.collections import LineCollection
xX1 = gd.xk[:, 0]
xX2 = gd.xk[:, 1]
points = np.array([xX1, xX2]).T.reshape(-1, 1, 2)
segments = np.concatenate(
    [points[:-1], points[1:]], axis=1
)
lc = LineCollection(
    segments, cmap=plt.get_cmap('Reds')
)
lc.set_array(np.linspace(0, 1, len(xX1)))  # color segs
lc.set_linewidth(3)
ax.autoscale(False)  # avoid the scatter changing lims
ax.add_collection(lc)
ax.scatter(
    xX1, xX2, zorder=1, marker='o',
    color=plt.cm.Reds(np.linspace(0, 1, len(xX1))),
    edgecolor='none'
)
plt.show()

[Figure: contour plot of f1 with the gradient descent iterates overlaid, converging to the minimum at (25, –10).]

Now consider f2:

A = Matrix([[10, 0], [0, 20]])
b = Matrix([[1, 1]])
def f2(x):
    X = Array([x]).tomatrix().T
    return (1/2*X.dot(A*X) - b.dot(X))

gd = gradient_descent_min(f=f2, x=x, x0=[50, -40], T=1e-8)

Perform the gradient descent:

gd.descend()

Print the interesting variables:

print(gd.table['python'])

  N  x_k            g_k                    alpha
---  -------------  ---------------------  ---------
  0  [ 50 -40]      [ 499 -801]            0
  1  [17.58 12.04]  [ 499 -801]            0.0649741
  2  [ 8.07 -1.01]  [174.78 239.88]        0.0544221
  3  [3.62 0.17]    [ 79.66 -21.22]        0.0558582
  4  [ 0.49 -0.05]  [35.16  2.49]          0.0889491
  5  [0.1  0.14]    [ 3.89 -1.94]          0.0990201
  6  [0.1 0.  ]     [0.04 1.9 ]            0.0750849
  7  [0.1  0.05]    [ 0.01 -0.95]          0.050005
  8  [0.1  0.05]    [4.74e-03 9.58e-05]    0.0500012
  9  [0.1  0.05]    [ 2.37e-03 -2.38e-09]  0.0999186
 10  [0.1  0.05]    [1.93e-06 2.37e-09]    0.1
 11  [0.1  0.05]    [ 0.00e+00 -2.37e-09]  0.0999997

Now let's lambdify the function f2 so we can plot:

f2_lambda = lambdify((x1, x2), f2(x), 'numpy')

Now let's plot a contour plot with the gradient descent overlaid:

fig, ax = plt.subplots()
# contour plot
X1 = np.linspace(-100, 100, 100)
X2 = np.linspace(-50, 50, 100)
X1, X2 = np.meshgrid(X1, X2)
F2 = f2_lambda(X1, X2)
plt.contourf(X1, X2, F2)
# gradient descent plot
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.collections import LineCollection
xX1 = gd.xk[:, 0]
xX2 = gd.xk[:, 1]
points = np.array([xX1, xX2]).T.reshape(-1, 1, 2)

segments = np.concatenate(
    [points[:-1], points[1:]], axis=1
)
lc = LineCollection(
    segments, cmap=plt.get_cmap('Reds')
)
lc.set_array(np.linspace(0, 1, len(xX1)))  # color segs
lc.set_linewidth(3)
ax.autoscale(False)  # avoid the scatter changing lims
ax.add_collection(lc)
ax.scatter(
    xX1, xX2, zorder=1, marker='o',
    color=plt.cm.Reds(np.linspace(0, 1, len(xX1))),
    edgecolor='none'
)
plt.show()

[Figure: contour plot of f2 with the gradient descent iterates overlaid, converging to the minimum at (0.1, 0.05).]

Python code in this section was generated from a Jupyter notebook named gradient_descent.ipynb with a python3 kernel.

opt.lin Constrained linear optimization

Consider a linear objective function f: Rⁿ → R with variables xi in vector x and coefficients ci in vector c:

f(x) = c · x  (lin1)

subject to the linear constraints (restrictions on xi)

Ax ⩽ a,  (lin2a)
Bx = b, and  (lin2b)
l ⩽ x ⩽ u  (lin2c)

where A and B are constant matrices and a, b, l, u are n-vectors. This is one formulation of what is called a linear programming problem. Usually, we want to maximize f over the constraints. Such problems frequently arise throughout engineering, for instance, in manufacturing, transportation, operations, etc. They are called constrained because there are constraints on x; they are called linear because the objective function and the constraints are linear.

We call a pair (x, f(x)) for which x satisfies Equation lin2 a feasible solution. Of course, not every feasible solution is optimal: a feasible solution is optimal iff there exists no other feasible solution for which f is greater (assuming we're maximizing). We call the vector subspace of feasible solutions S ⊂ Rⁿ.
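Such problems can be solved numerically; a sketch (assuming scipy, with illustrative values of c, A, and a; note scipy.optimize.linprog minimizes, so we negate c to maximize):

```python
import numpy as np
from scipy.optimize import linprog

# maximize f(x) = c . x subject to A x <= a and l <= x <= u
# (illustrative problem: linprog minimizes, so pass -c)
c = np.array([3.0, 2.0])
A = np.array([[1.0, 1.0],
              [2.0, 1.0]])
a = np.array([4.0, 6.0])
bounds = [(0, None), (0, None)]  # l = 0, u = +infinity

res = linprog(-c, A_ub=A, b_ub=a, bounds=bounds, method='highs')
x_opt = res.x      # optimal x
f_opt = -res.fun   # optimal (maximal) f
```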

Feasible solutions form a polytope

Consider the effect of the constraints. Each of the equalities and inequalities defines a linear hyperplane in Rⁿ (i.e., a linear subspace of dimension n – 1), either as a boundary of S (inequality) or as a restriction of S to the hyperplane. When joined, these hyperplanes are the boundary of S (equalities restrict S to lower dimension). So we see that each of the boundaries of S is flat, which makes S a polytope (in R², a polygon). What makes this especially interesting is that polytopes have vertices where the hyperplanes intersect. Solutions at the vertices are called basic feasible solutions.

Only the vertices matter

Our objective function f is linear, so for some constant h, f(x) = h defines a level set that is itself a hyperplane H in Rⁿ. If this hyperplane intersects S at a point x, (x, f(x) = h) is the corresponding solution. There are three possibilities when H intersects S:

1. H ∩ S is a vertex of S,
2. H ∩ S is a boundary hyperplane of S, or
3. H ∩ S slices through the interior of S.

However, this third option implies that there exists a level set G corresponding to f(x) = g such that G intersects S and g > h, so solutions on H ∩ S are not optimal. (We have not proven this, but it may be clear from our progression.) We conclude that either the first or second case must be true for optimal solutions. And notice that in both cases a (potentially optimal) solution occurs at at least one vertex. The key insight, then, is that an optimal solution occurs at a vertex of S.

This means we don't need to search all of S, or even its boundary: we need only search the vertices. Helpful as this is, it still leaves a combinatorially large number of potentially optimal solutions, one per vertex, usually still too many to search in a naïve way. In Lecture opt.simplex, this is mitigated by introducing a powerful searching method.
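To make the "only the vertices matter" insight concrete, here is a minimal brute-force sketch (not the simplex algorithm, and using a small hypothetical LP, not one from the text): it intersects every pair of constraint lines in R^2, keeps the feasible intersections (the candidate vertices), and evaluates f at each.

```python
import itertools
import numpy as np

# Hypothetical small LP: maximize f(x) = 3 x1 + 2 x2 subject to A x <= a,
# with the bounds written as ordinary inequality rows.
A = np.array([[-1.0, 0.0],   # -x1 <= 0  (x1 >= 0)
              [0.0, -1.0],   # -x2 <= 0  (x2 >= 0)
              [1.0, 1.0],    # x1 + x2 <= 4
              [1.0, 0.0]])   # x1 <= 3
a = np.array([0.0, 0.0, 4.0, 3.0])
c = np.array([3.0, 2.0])

def vertices(A, a):
    """Candidate vertices: intersections of each pair of constraint
    lines that satisfy every constraint (within a small tolerance)."""
    verts = []
    for i, j in itertools.combinations(range(len(a)), 2):
        Apair = A[[i, j]]
        if abs(np.linalg.det(Apair)) < 1e-12:
            continue  # parallel constraints: no intersection
        x = np.linalg.solve(Apair, a[[i, j]])
        if np.all(A @ x <= a + 1e-9):
            verts.append(x)
    return verts

best = max(vertices(A, a), key=lambda x: c @ x)
print(best, c @ best)  # optimal vertex (3, 1) with f = 11
```

This enumeration costs one linear solve per pair of constraints, which is exactly the combinatorial blow-up the simplex algorithm avoids by walking only between adjacent vertices.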

opt.simplex The simplex algorithm

4. Another Python package, pulp (PuLP), is probably more popular for linear programming; however, we choose scipy.optimize because it has applications beyond linear programming.

The simplex algorithm (or "method") is an iterative technique for finding an optimal solution of the linear programming problem of Equations lin.1 and lin.2. The details of the algorithm are somewhat involved, but the basic idea is to start at a vertex of the feasible solution space S and traverse an edge of the polytope that leads to another vertex with a greater value of f. Then repeat this process until there is no neighboring vertex with a greater value of f, at which point the solution is guaranteed to be optimal.

Rather than present the details of the algorithm, we choose to show an example using Python. There have been some improvements on the original algorithm that have been implemented in many standard software packages, including Python's scipy package (Pauli Virtanen et al. "SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python". In: arXiv e-prints, arXiv:1907.10121 [July 2019]), module scipy.optimize.4

Example opt.simplex-1: simplex method using scipy.optimize

Problem Statement

Maximize the objective function
    f(x) = c · x    (simplex.1a)
for x ∈ R^2 and
    c = [5 2]^T    (simplex.1b)
subject to constraints
    0 ≤ x1 ≤ 10    (simplex.2a)
    −5 ≤ x2 ≤ 15    (simplex.2b)
    4 x1 + x2 ≤ 40    (simplex.2c)
    x1 + 3 x2 ≤ 35    (simplex.2d)
    −8 x1 − x2 ≥ −75    (simplex.2e)

Problem Solution

First, load some Python packages:

from scipy.optimize import linprog
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Encoding the problem

Before we can use linprog, we must first encode Equations simplex.1 and simplex.2 into a form linprog will recognize. We begin with f, which we can write as c · x with the coefficients of c as follows:

c = [-5, -2]  # negative to find max

We've negated each constant because linprog minimizes f and we want to maximize f. Now let's encode the inequality constraints. We will write the left-hand-side coefficients in the matrix A and the right-hand-side values in vector a such that
    A x ≤ a.    (simplex.3)

Notice that one of our constraint inequalities is ≥ instead of ≤. We can flip this by multiplying the inequality by −1. We use simple lists to encode A and a:

A = [
    [4, 1],
    [1, 3],
    [8, 1],
]
a = [40, 35, 75]

Now we need to define the lower l and upper u bounds of x. The function linprog expects these to be in a single list of (lower, upper) bounds for each xi:

lu = [
    (0, 10),
    (-5, 15),
]

We want to keep track of each step linprog takes. We can access these by defining a function callback to be passed to linprog:

x = []  # for storing the steps
def callback(res):  # called at each step
    global x
    print(f"nit = {res.nit} x_k = {res.x}")
    x.append(res.x.copy())  # store

Now we need to call linprog. We don't have any equality constraints, so we need only use the keyword arguments A_ub=A and b_ub=a. For demonstration purposes, we tell it to use the simplex method, which is not as good as its other methods, which use improved algorithms based on the simplex:

res = linprog(
    c, A_ub=A, b_ub=a, bounds=lu,
    method='simplex', callback=callback,
)
x = np.array(x)

nit = 0 x_k = [ 0. -5.]
nit = 1 x_k = [10. -5.]
nit = 2 x_k = [8.75 5.  ]
nit = 3 x_k = [7.72727273 9.09090909]
nit = 4 x_k = [7.72727273 9.09090909]
nit = 5 x_k = [7.72727273 9.09090909]
nit = 5 x_k = [7.72727273 9.09090909]

So the optimal solution (x, f(x)) is as follows:

print(f"optimum x: {res.x}")
print(f"optimum f(x): {-res.fun}")

optimum x: [7.72727273 9.09090909]
optimum f(x): 56.81818181818182

The last point was repeated
1. once because there was no adjacent vertex with greater f(x), and
2. twice because the algorithm calls callback twice on the last step.
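We can sanity-check this result by hand: at the reported optimum, the constraints simplex.2c and simplex.2d are active, so the optimal vertex is the intersection of the lines 4 x1 + x2 = 40 and x1 + 3 x2 = 35. A quick check of this (a sketch assuming numpy is available):

```python
import numpy as np

# Intersection of the two active constraint lines from the example:
# 4 x1 + x2 = 40 and x1 + 3 x2 = 35.
A_active = np.array([[4.0, 1.0], [1.0, 3.0]])
b_active = np.array([40.0, 35.0])
x_star = np.linalg.solve(A_active, b_active)
f_star = np.array([5.0, 2.0]) @ x_star  # original (un-negated) c
print(x_star, f_star)  # approximately [7.7273 9.0909] and 56.8182
```

The exact values are x1 = 85/11, x2 = 100/11, and f = 625/11, matching the linprog output above.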

Plotting

When the solution space is in R^2, it is helpful to graphically represent the solution space, constraints, and the progression of the algorithm. We begin by defining the inequality lines from A and a over the bounds of x1:

n = len(c)  # number of variables x
m = np.shape(A)[0]  # number of inequality constraints
x2 = np.empty([m, 2])
for i in range(0, m):
    x2[i] = -A[i][0]/A[i][1]*np.array(lu[0]) + a[i]/A[i][1]

Now we plot a contour plot of f over the bounds of x1 and x2 and overlay the inequality constraints and the steps of the algorithm, stored in x:

import matplotlib as mpl  # for rcParams
lu_array = np.array(lu)
fig, ax = plt.subplots()
mpl.rcParams['lines.linewidth'] = 3
# contour plot
X1 = np.linspace(*lu_array[0], 100)
X2 = np.linspace(*lu_array[1], 100)
X1, X2 = np.meshgrid(X1, X2)
F2 = -c[0]*X1 + -c[1]*X2  # negative because max hack
con = ax.contourf(X1, X2, F2)
cbar = fig.colorbar(con, ax=ax)
cbar.ax.set_ylabel('objective function')
# bounds on x
un = np.array([1, 1])
opts = {'c': 'w', 'label': None, 'linewidth': 6}
plt.plot(lu_array[0], lu_array[1, 0]*un, **opts)
plt.plot(lu_array[0], lu_array[1, 1]*un, **opts)
plt.plot(lu_array[0, 0]*un, lu_array[1], **opts)
plt.plot(lu_array[0, 1]*un, lu_array[1], **opts)
# inequality constraints
for i in range(0, m):
    p = plt.plot(lu[0], x2[i], c='w')
p[0].set_label('constraint')
# steps
plt.plot(x[:, 0], x[:, 1], '-o', c='r', clip_on=False,
         zorder=20, label='simplex')
plt.ylim(lu_array[1])
plt.xlabel('$x_1$')
plt.ylabel('$x_2$')
plt.legend()
plt.show()

[Figure: filled contour plot of the objective function over x1 and x2, with the bound and constraint lines and the simplex steps overlaid; the colorbar shows the objective function value.]

Python code in this section was generated from a Jupyter notebook named simplex_linear_programming.ipynb with a python3 kernel.

opt.ex Exercises for Chapter 08

5. Barzilai and Borwein, "Two-Point Step Size Gradient Methods".

Exercise 08.1 (cummerbund)

Consider the functions (a) f1: R^2 → R and (b) f2: R^2 → R defined as
    f1(x) = 4 (x1 − 16)^2 + (x2 + 64)^2 + x1 sin^2 x1    (ex.1)
    f2(x) = (1/2) x · A x − b · x    (ex.2)
where
    A = [5 0; 0 15] and    (ex.3a)
    b = [−2 1]^T.    (ex.3b)
Use the method of Barzilai and Borwein,5 starting at some x0, to find a minimum of each function.

Exercise 08.2 (melty)

Maximize the objective function
    f(x) = c · x    (ex.4a)
for x ∈ R^3 and
    c = [3 −8 1]^T    (ex.4b)
subject to constraints
    0 ≤ x1 ≤ 20    (ex.5a)
    −5 ≤ x2 ≤ 0    (ex.5b)
    5 ≤ x3 ≤ 17    (ex.5c)
    x1 + 4 x2 ≤ 50    (ex.5d)
    2 x1 + x3 ≤ 43    (ex.5e)
    −4 x1 + x2 − 5 x3 ≥ −99    (ex.5f)

nlin Nonlinear analysis

1. As is customary, we frequently say "system" when we mean "mathematical system model". Recall that multiple models may be used for any given physical system, depending on what one wants to know.

1. The ubiquity of near-linear systems and the tools we have for analyses thereof can sometimes give the impression that nonlinear systems are exotic or even downright flamboyant. However, a great many systems1 important for a mechanical engineer are frequently hopelessly nonlinear. Here are some examples of such systems:

• a robot arm,
• viscous fluid flow (usually modelled by the Navier-Stokes equations),
• ________,
• anything that "fills up" or "saturates",
• nonlinear optics,
• Einstein's field equations (gravitation in general relativity),
• heat radiation and nonlinear heat conduction,
• fracture mechanics, and
• ________.

2. S.H. Strogatz and M. Dichter. Nonlinear Dynamics and Chaos. Second. Studies in Nonlinearity. Avalon Publishing, 2016. isbn: 9780813350844.
3. W. Richard Kolk and Robert A. Lerman. Nonlinear System Dynamics. 1st ed. Springer US, 1993. isbn: 978-1-4684-6496-2.

2. Lest we think this is merely an inconvenience, we should keep in mind that it is actually the nonlinearity that makes many phenomena useful. For instance, the ________ depends on the nonlinearity of its optics. Similarly, transistors and the digital circuits made thereby (including the microprocessor) wouldn't function if their physics were linear.

3. In this chapter, we will see some ways to formulate, characterize, and simulate nonlinear systems. Purely ________ are few for nonlinear systems. Most are beyond the scope of this text, but we describe a few, mostly in Lecture nlin.char. Simulation via numerical integration of nonlinear dynamical equations is the most accessible technique, so it is introduced.

4. We skip a discussion of linearization; of course, if this is an option, it is preferable. Instead, we focus on the ________.

5. For a good introduction to nonlinear dynamics, see Strogatz and Dichter.2 A more engineer-oriented introduction is Kolk and Lerman.3

nlin.ss Nonlinear state-space models

4. Strogatz and Dichter, Nonlinear Dynamics and Chaos.

1. A state-space model has the general form
    dx/dt = f(x, u, t)    (ss.1a)
    y = g(x, u, t)    (ss.1b)
where f and g are vector-valued functions that depend on the system. Nonlinear state-space models are those for which f is a nonlinear functional of either x or u. For instance, a state variable x1 might appear as x1^2, or two state variables might combine as x1 x2, or an input u1 might enter the equations as log u1.
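As a concrete illustration (a standard example, not one from the text), consider a damped pendulum, whose state equation is nonlinear through the sin x1 term. The parameter values g, l, m, b below are assumed for illustration. A minimal encoding in the style used later in this chapter:

```python
import numpy as np

# Damped pendulum: x1 = angle, x2 = angular velocity, input u = torque.
# Assumed illustrative parameters: gravity g, length l, mass m, damping b.
g, l, m, b = 9.81, 1.0, 1.0, 0.1

def f(t, x, u):
    """Nonlinear state equation dx/dt = f(x, u, t)."""
    return [
        x[1],
        -(g / l) * np.sin(x[0]) - (b / (m * l**2)) * x[1] + u(t) / (m * l**2),
    ]

u = lambda t: 0.0  # unforced
print(f(0.0, [np.pi / 2, 0.0], u))  # at 90 degrees and rest: [0.0, -9.81]
```

The nonlinearity here is mild (a single sine), yet it already rules out superposition of solutions.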

Autonomous and nonautonomous systems

2. An autonomous system is one for which dx/dt = f(x), with neither time nor input appearing explicitly. A nonautonomous system is one for which either t or u do appear explicitly in f. It turns out that we can always write nonautonomous systems as autonomous by substituting in u(t) and introducing an extra state variable for t.4

3. Therefore, without loss of generality, we will focus on ways of analyzing autonomous systems.

Equilibrium

4. An equilibrium state (also called a stationary point) x̄ is one for which dx/dt = 0. In most cases, this occurs only when the input u is a constant ū and, for time-varying systems, at a given time t̄. For autonomous systems, equilibrium occurs when the following holds:
    f(x̄) = 0.    (ss.2)
This is a system of nonlinear algebraic equations, which can be challenging to solve for x̄. However, frequently several solutions, that is, equilibrium states, do exist.
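For a case where the equilibrium equation is tractable, consider the scalar system x′ = x − x^3 (a simple illustrative example, not from the text): its equilibria solve the cubic x − x^3 = 0, and there are several. A quick sketch assuming numpy:

```python
import numpy as np

# Scalar example: x' = f(x) = x - x**3. Equilibria solve x - x**3 = 0,
# i.e., the polynomial -x**3 + 0*x**2 + 1*x + 0 = 0.
equilibria = np.sort(np.roots([-1.0, 0.0, 1.0, 0.0]))
print(equilibria)  # three equilibrium states: -1, 0, 1
```

For vector-valued f, one would typically resort to a numerical root finder instead of polynomial root extraction.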

nlin.char Nonlinear system characteristics

1. Characterizing nonlinear systems can be challenging without the tools developed for linear system characterization. However, there are ways of characterizing nonlinear systems, and we'll here explore a few. Throughout, we consider systems in the state-space form of Equation ss.1a.

Those in common with linear systems

2. As with linear systems, the system order is either the number of state variables required to describe the system or, equivalently, the highest-order derivative in a single scalar differential equation describing the system.

3. Similarly, nonlinear systems can have state variables that depend on time alone or those that also depend on space (or some other independent variable). The former lead to ordinary differential equations (ODEs) and the latter to partial differential equations (PDEs).

4. Equilibrium was already considered in Lecture nlin.ss.

Stability

5. In terms of system performance, perhaps no other criterion is as important as stability.

5. William L. Brogan. Modern Control Theory. Third. Prentice Hall, 1991, Ch. 10.
6. A. Choukchou-Braham et al. Analysis and Control of Underactuated Mechanical Systems. SpringerLink: Bücher. Springer International Publishing, 2013. isbn: 9783319026367, App. A.

Definition nlin.char.1: stability

If x is perturbed from an equilibrium state x̄, the response x(t) can
1. asymptotically return to x̄ (asymptotically stable),
2. diverge from x̄ (unstable), or
3. remain perturbed or oscillate about x̄ with a constant amplitude (marginally stable).

Notice that this definition is actually local: stability in the neighborhood of one equilibrium may not be the same as in the neighborhood of another.

6. Other than nonlinear systems' lack of linear systems' eigenvalues, poles, and roots of the characteristic equation from which to compute it, the primary difference between the stability of linear and nonlinear systems is that nonlinear system stability is often difficult to establish. Using a linear system's eigenvalues, it is straightforward to establish stable, unstable, and marginally stable subspaces of state-space (via transforming to an eigenvector basis). For nonlinear systems, no such method exists. However, we are not without tools to explore nonlinear system stability. One mathematical tool to consider is Lyapunov stability theory, which is beyond the scope of this course but has good treatments in Brogan5 and Choukchou-Braham et al.6

Qualities of equilibria

7. Equilibria (i.e., stationary points) come in a variety of qualities. It is instructive to consider the first-order differential equation in state variable x with real constant r:
    x′ = r x − x^3.    (char.1)
If we plot x′ versus x for different values of r, we obtain the plots of Figure 09.1.

[Figure 09.1: plots of x′ versus x for Equation char.1, for (a) r < 0, (b) r = 0, and (c) r > 0.]

8. By definition, equilibria occur when x′ = 0, so the x-axis crossings of Figure 09.1 are equilibria. The blue arrows on the x-axis show the direction of state change x′ quantified by the plots. For both (a) and (b), only one equilibrium exists: x̄ = 0. Note that the blue arrows in both plots point toward the equilibrium. In such cases, that is, when a neighborhood exists around an equilibrium for which state changes point toward the equilibrium, the equilibrium is called an attractor or sink. Note that attractors are stable.

9. Now consider (c) of Figure 09.1. When r > 0, three equilibria emerge. This change of the number of equilibria with the changing of a parameter is called a bifurcation.

A plot of bifurcations versus the parameter is called a bifurcation diagram. The x̄ = 0 equilibrium now has arrows that point away from it. Such an equilibrium is called a repeller or source and is unstable. The other two equilibria here are (stable) attractors. Consider a very small initial condition x(0) = ε. If ε > 0, the repeller pushes away x and the positive attractor pulls x to itself. Conversely, if ε < 0, the repeller again pushes away x and the negative attractor pulls x to itself.

10. Another type of equilibrium is called the saddle, one which acts as an attractor along some lines and as a repeller along others. We will see this type in the following example.
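The attractor and repeller behavior described above can be checked numerically. A crude forward-Euler integration of x′ = r x − x^3 with r = 1 (a minimal sketch in pure Python; step size and horizon are assumptions):

```python
def simulate(x0, r=1.0, dt=1e-3, steps=20000):
    """Forward-Euler integration of x' = r*x - x**3 from x(0) = x0."""
    x = x0
    for _ in range(steps):
        x += dt * (r * x - x**3)
    return x

# Small perturbations from the repeller at x = 0 are pushed away and
# captured by the attractors at +sqrt(r) and -sqrt(r):
print(simulate(0.1))   # approaches +1
print(simulate(-0.1))  # approaches -1
```

Whatever the sign of the small perturbation, the trajectory ends up at the attractor on that side, which is exactly the behavior the arrows in Figure 09.1(c) depict.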

Example nlin.char-1: saddle bifurcation

Problem Statement

Consider the dynamical equation
    x′ = x^2 + r    (char.2)
with r a real constant. Sketch x′ vs. x for negative, zero, and positive r. Identify and classify each of the equilibria.
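One way to check part of such a sketch numerically (a hypothetical helper, not the worked solution): for a scalar system, the sign of the slope d(x′)/dx at an equilibrium indicates attractor (negative slope) or repeller (positive slope). For Equation char.2 with r = −1, the equilibria sit at ±1:

```python
import math

def xprime(x, r):
    return x**2 + r

def classify(xbar, r, h=1e-6):
    """Classify a scalar equilibrium by the sign of d(x')/dx at xbar,
    estimated with a central difference."""
    slope = (xprime(xbar + h, r) - xprime(xbar - h, r)) / (2 * h)
    return "attractor" if slope < 0 else "repeller"

r = -1.0
for xbar in (-math.sqrt(-r), math.sqrt(-r)):  # x' = 0 at x = ±sqrt(-r)
    print(xbar, classify(xbar, r))
```

Note this slope test is inconclusive when the slope is zero, as happens at r = 0, where the two equilibria merge.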

nlin.sim Nonlinear system simulation

nlin.pysim Simulating nonlinear systems in Python

Example nlin.pysim-1: a nonlinear unicycle

Problem Statement

asdf

Problem Solution

First, load some Python packages:

import numpy as np
import sympy as sp
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

The state equation can be encoded via the following function f:

def f(t, x, u, c):
    dxdt = [
        x[3]*np.cos(x[2]),
        x[3]*np.sin(x[2]),
        x[4],
        1/c[0] * u(t)[0],
        1/c[1] * u(t)[1],
    ]
    return dxdt

The input function u must also be defined:

def u(t):
    return [
        1.5*(1 + np.cos(t)),
        2.5*np.sin(3*t),
    ]

# Define time spans, initial values, and constants
tspan = np.linspace(0, 50, 300)
xinit = [0, 0, 0, 0, 0]
mass = 10
inertia = 10
c = [mass, inertia]

# Solve differential equation
sol = solve_ivp(
    lambda t, x: f(t, x, u, c),
    [tspan[0], tspan[-1]],
    xinit,
    t_eval=tspan,
)

Let's first plot the trajectory and instantaneous velocity:

xp = sol.y[3]*np.cos(sol.y[2])
yp = sol.y[3]*np.sin(sol.y[2])
p = plt.figure()
plt.plot(sol.y[0], sol.y[1])
plt.quiver(sol.y[0], sol.y[1], xp, yp)
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.show()

[Figure 09.2: the unicycle trajectory in the x-y plane with instantaneous velocity vectors.]

Python code in this section was generated from a Jupyter notebook named horcrux.ipynb with a python3 kernel.

nlin.ex Exercises for Chapter 09

A Distribution tables

A.01 Gaussian distribution table

Below are plots of the Gaussian probability density function f and cumulative distribution function Φ. Below them is Table A.1 of CDF values.
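The values in Table A.1 can be reproduced to table precision from the standard relation Φ(z) = (1 + erf(z/√2))/2; a quick sketch using only the Python standard library:

```python
import math

def Phi(z):
    """Standard-normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Spot-check against Table A.1 (rounded to four decimal places):
print(round(Phi(-3.4), 4))  # 0.0003
print(round(Phi(0.0), 4))   # 0.5
print(round(Phi(1.96), 4))  # 0.975
```

This is handy when a z-score falls between table entries and interpolation would otherwise be needed.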

[Figure A.1: (a) the Gaussian PDF f(z) with interval bound zb and shaded area Φ(zb); (b) the Gaussian CDF Φ(zb), both for z-scores.]

Table A.1: z-score table, Φ(zb) = P(z ∈ (−∞, zb])

zb      0      1      2      3      4      5      6      7      8      9
−3.4  0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
−3.3  0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
−3.2  0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
−3.1  0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
−3.0  0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
−2.9  0.0019 0.0018 0.0018 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
−2.8  0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
−2.7  0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
−2.6  0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
−2.5  0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
−2.4  0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
−2.3  0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
−2.2  0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
−2.1  0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
−2.0  0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
−1.9  0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
−1.8  0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
−1.7  0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
−1.6  0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455

−1.5  0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
−1.4  0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
−1.3  0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
−1.2  0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
−1.1  0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
−1.0  0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
−0.9  0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
−0.8  0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
−0.7  0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
−0.6  0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
−0.5  0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
−0.4  0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121
−0.3  0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.3483
−0.2  0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859
−0.1  0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
−0.0  0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641
 0.0  0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
 0.1  0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
 0.2  0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
 0.3  0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
 0.4  0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
 0.5  0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
 0.6  0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
 0.7  0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
 0.8  0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
 0.9  0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
 1.0  0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
 1.1  0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
 1.2  0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
 1.3  0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
 1.4  0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
 1.5  0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
 1.6  0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
 1.7  0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
 1.8  0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
 1.9  0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
 2.0  0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
 2.1  0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
 2.2  0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
 2.3  0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
 2.4  0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
 2.5  0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
 2.6  0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
 2.7  0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
 2.8  0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
 2.9  0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
 3.0  0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
 3.1  0.9990 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993
 3.2  0.9993 0.9993 0.9994 0.9994 0.9994 0.9994 0.9994 0.9995 0.9995 0.9995
 3.3  0.9995 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
 3.4  0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998

A.02 Student's t-distribution table

Table A.2: two-tail inverse Student's t-distribution table

           percent probability
ν     60.0  66.7  75.0  80.0  87.5  90.0  95.0  97.5   99.0   99.5   99.9
1    0.325 0.577 1.000 1.376 2.414 3.078 6.314 12.706 31.821 63.657 318.31
2    0.289 0.500 0.816 1.061 1.604 1.886 2.920  4.303  6.965  9.925 22.327
3    0.277 0.476 0.765 0.978 1.423 1.638 2.353  3.182  4.541  5.841 10.215
4    0.271 0.464 0.741 0.941 1.344 1.533 2.132  2.776  3.747  4.604  7.173
5    0.267 0.457 0.727 0.920 1.301 1.476 2.015  2.571  3.365  4.032  5.893
6    0.265 0.453 0.718 0.906 1.273 1.440 1.943  2.447  3.143  3.707  5.208
7    0.263 0.449 0.711 0.896 1.254 1.415 1.895  2.365  2.998  3.499  4.785
8    0.262 0.447 0.706 0.889 1.240 1.397 1.860  2.306  2.896  3.355  4.501
9    0.261 0.445 0.703 0.883 1.230 1.383 1.833  2.262  2.821  3.250  4.297
10   0.260 0.444 0.700 0.879 1.221 1.372 1.812  2.228  2.764  3.169  4.144
11   0.260 0.443 0.697 0.876 1.214 1.363 1.796  2.201  2.718  3.106  4.025
12   0.259 0.442 0.695 0.873 1.209 1.356 1.782  2.179  2.681  3.055  3.930
13   0.259 0.441 0.694 0.870 1.204 1.350 1.771  2.160  2.650  3.012  3.852
14   0.258 0.440 0.692 0.868 1.200 1.345 1.761  2.145  2.624  2.977  3.787
15   0.258 0.439 0.691 0.866 1.197 1.341 1.753  2.131  2.602  2.947  3.733
16   0.258 0.439 0.690 0.865 1.194 1.337 1.746  2.120  2.583  2.921  3.686
17   0.257 0.438 0.689 0.863 1.191 1.333 1.740  2.110  2.567  2.898  3.646
18   0.257 0.438 0.688 0.862 1.189 1.330 1.734  2.101  2.552  2.878  3.610
19   0.257 0.438 0.688 0.861 1.187 1.328 1.729  2.093  2.539  2.861  3.579
20   0.257 0.437 0.687 0.860 1.185 1.325 1.725  2.086  2.528  2.845  3.552
21   0.257 0.437 0.686 0.859 1.183 1.323 1.721  2.080  2.518  2.831  3.527
22   0.256 0.437 0.686 0.858 1.182 1.321 1.717  2.074  2.508  2.819  3.505
23   0.256 0.436 0.685 0.858 1.180 1.319 1.714  2.069  2.500  2.807  3.485
24   0.256 0.436 0.685 0.857 1.179 1.318 1.711  2.064  2.492  2.797  3.467
25   0.256 0.436 0.684 0.856 1.178 1.316 1.708  2.060  2.485  2.787  3.450
26   0.256 0.436 0.684 0.856 1.177 1.315 1.706  2.056  2.479  2.779  3.435
27   0.256 0.435 0.684 0.855 1.176 1.314 1.703  2.052  2.473  2.771  3.421
28   0.256 0.435 0.683 0.855 1.175 1.313 1.701  2.048  2.467  2.763  3.408
29   0.256 0.435 0.683 0.854 1.174 1.311 1.699  2.045  2.462  2.756  3.396
30   0.256 0.435 0.683 0.854 1.173 1.310 1.697  2.042  2.457  2.750  3.385
35   0.255 0.434 0.682 0.852 1.170 1.306 1.690  2.030  2.438  2.724  3.340
40   0.255 0.434 0.681 0.851 1.167 1.303 1.684  2.021  2.423  2.704  3.307
45   0.255 0.434 0.680 0.850 1.165 1.301 1.679  2.014  2.412  2.690  3.281
50   0.255 0.433 0.679 0.849 1.164 1.299 1.676  2.009  2.403  2.678  3.261
55   0.255 0.433 0.679 0.848 1.163 1.297 1.673  2.004  2.396  2.668  3.245
60   0.254 0.433 0.679 0.848 1.162 1.296 1.671  2.000  2.390  2.660  3.232
∞    0.253 0.431 0.674 0.842 1.150 1.282 1.645  1.960  2.326  2.576  3.090

B Fourier and Laplace tables

B.01 Laplace transforms

Table B.1 is a table with functions of time f(t) on the left and corresponding Laplace transforms F(s) on the right. Where applicable, s = σ + jω is the Laplace transform variable, T is the time-domain period, ω0 = 2π/T is the corresponding angular frequency, j = √−1, a ∈ R+, and b, t0 ∈ R are constants.

Table B.1: Laplace transform identities

function of time t → function of Laplace s
a1 f1(t) + a2 f2(t) → a1 F1(s) + a2 F2(s)
f(t − t0) → F(s) e^(−t0 s)
f′(t) → s F(s) − f(0)
d^n f(t)/dt^n → s^n F(s) − s^(n−1) f(0) − s^(n−2) f′(0) − ··· − f^(n−1)(0)
∫_0^t f(τ) dτ → F(s)/s
t f(t) → −F′(s)
f1(t) ∗ f2(t) = ∫_(−∞)^(∞) f1(τ) f2(t − τ) dτ → F1(s) F2(s)
δ(t) → 1
u_s(t) → 1/s
u_r(t) → 1/s^2
t^(n−1)/(n−1)! → 1/s^n
e^(−at) → 1/(s + a)
t e^(−at) → 1/(s + a)^2
(1/(n−1)!) t^(n−1) e^(−at) → 1/(s + a)^n
(1/(a − b)) (e^(at) − e^(bt)) → 1/((s − a)(s − b)), a ≠ b
(1/(a − b)) (a e^(at) − b e^(bt)) → s/((s − a)(s − b)), a ≠ b
sin ωt → ω/(s^2 + ω^2)
cos ωt → s/(s^2 + ω^2)
e^(at) sin ωt → ω/((s − a)^2 + ω^2)
e^(at) cos ωt → (s − a)/((s − a)^2 + ω^2)
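Entries like e^(−at) → 1/(s + a) can be spot-checked by evaluating the defining integral ∫_0^∞ f(t) e^(−st) dt numerically. A crude trapezoidal sketch using only the Python standard library (the truncation horizon T and step count n are assumptions chosen for the decaying integrand):

```python
import math

def laplace_numeric(f, s, T=20.0, n=200_000):
    """Crude trapezoidal approximation of the truncated integral
    from 0 to T of f(t) * exp(-s*t) dt."""
    dt = T / n
    total = 0.5 * (f(0.0) + f(T) * math.exp(-s * T))
    for k in range(1, n):
        t = k * dt
        total += f(t) * math.exp(-s * t)
    return total * dt

a, s = 1.0, 2.0
approx = laplace_numeric(lambda t: math.exp(-a * t), s)
print(approx, 1.0 / (s + a))  # both approximately 0.3333
```

This only works for real s inside the region of convergence and integrands that decay fast enough for the truncation to be negligible.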

B.02 Fourier transforms

Table B.2 is a table with functions of time f(t) on the left and corresponding Fourier transforms F(ω) on the right. Where applicable, T is the time-domain period, ω0 = 2π/T is the corresponding angular frequency, j = √−1, a ∈ R+, and b, t0 ∈ R are constants. Furthermore, fe and fo are even and odd functions of time, respectively, and it can be shown that any function f can be written as the sum f(t) = fe(t) + fo(t) (Hsu, 1967).

Table B.2: Fourier transform identities

function of time t → function of frequency ω
a1 f1(t) + a2 f2(t) → a1 F1(ω) + a2 F2(ω)
f(at) → (1/|a|) F(ω/a)
f(−t) → F(−ω)
f(t − t0) → F(ω) e^(−jωt0)
f(t) cos ω0t → (1/2) F(ω − ω0) + (1/2) F(ω + ω0)
f(t) sin ω0t → (1/j2) F(ω − ω0) − (1/j2) F(ω + ω0)
fe(t) → Re F(ω)
fo(t) → j Im F(ω)
F(t) → 2π f(−ω)
f′(t) → jω F(ω)
d^n f(t)/dt^n → (jω)^n F(ω)
∫_(−∞)^t f(τ) dτ → (1/(jω)) F(ω) + π F(0) δ(ω)
−jt f(t) → F′(ω)
(−jt)^n f(t) → d^n F(ω)/dω^n
f1(t) ∗ f2(t) = ∫_(−∞)^(∞) f1(τ) f2(t − τ) dτ → F1(ω) F2(ω)
f1(t) f2(t) → (1/2π) F1(ω) ∗ F2(ω) = (1/2π) ∫_(−∞)^(∞) F1(α) F2(ω − α) dα
e^(−at) u_s(t) → 1/(jω + a)
e^(−a|t|) → 2a/(a^2 + ω^2)
e^(−at^2) → √(π/a) e^(−ω^2/(4a))
1 for |t| < a/2, else 0 → a sin(aω/2)/(aω/2)
t e^(−at) u_s(t) → 1/(a + jω)^2
(t^(n−1)/(n−1)!) e^(−at) u_s(t) → 1/(a + jω)^n
1/(a^2 + t^2) → (π/a) e^(−a|ω|)
δ(t) → 1
δ(t − t0) → e^(−jωt0)
u_s(t) → π δ(ω) + 1/(jω)
u_s(t − t0) → π δ(ω) + (1/(jω)) e^(−jωt0)
1 → 2π δ(ω)
t → 2πj δ′(ω)
t^n → 2π j^n d^n δ(ω)/dω^n
e^(jω0t) → 2π δ(ω − ω0)
cos ω0t → π δ(ω − ω0) + π δ(ω + ω0)
sin ω0t → −jπ δ(ω − ω0) + jπ δ(ω + ω0)
u_s(t) cos ω0t → jω/(ω0^2 − ω^2) + (π/2) δ(ω − ω0) + (π/2) δ(ω + ω0)
u_s(t) sin ω0t → ω0/(ω0^2 − ω^2) + (π/2j) δ(ω − ω0) − (π/2j) δ(ω + ω0)
t u_s(t) → jπ δ′(ω) − 1/ω^2
1/t → πj − 2πj u_s(ω)
1/t^n → ((−jω)^(n−1)/(n−1)!) (πj − 2πj u_s(ω))
sgn t → 2/(jω)
Σ_(n=−∞)^(∞) δ(t − nT) → ω0 Σ_(n=−∞)^(∞) δ(ω − nω0)

C Mathematics reference

C.01 Quadratic forms

The solution to equations of the form a x^2 + b x + c = 0 is
    x = (−b ± √(b^2 − 4ac))/(2a).    (C.01.1)

Completing the square

This is accomplished by re-writing the quadratic in the form of the left-hand side (LHS) of this equality, which describes factorization:
    x^2 + 2xh + h^2 = (x + h)^2.    (C.01.2)
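Equation C.01.1 translates directly into code; a small sketch using complex arithmetic so the sign of the discriminant needs no special-casing:

```python
import cmath

def quadratic_roots(a, b, c):
    """Both roots of a*x**2 + b*x + c = 0 via Equation C.01.1."""
    disc = cmath.sqrt(b**2 - 4 * a * c)
    return (-b + disc) / (2 * a), (-b - disc) / (2 * a)

print(quadratic_roots(1, -3, 2))  # roots 2 and 1 of x^2 - 3x + 2
```

With a negative discriminant, the two roots come back as a complex-conjugate pair.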

C.02 Trigonometry

Triangle identities

With reference to the figure below, the law of sines is
    (sin α)/a = (sin β)/b = (sin γ)/c    (C.02.1)
and the law of cosines is
    c^2 = a^2 + b^2 − 2ab cos γ    (C.02.2a)
    b^2 = a^2 + c^2 − 2ac cos β    (C.02.2b)
    a^2 = c^2 + b^2 − 2cb cos α.    (C.02.2c)

[Figure: a triangle with sides a, b, c opposite angles α, β, γ.]

Reciprocal identities
    csc u = 1/sin u    (C.02.3a)
    sec u = 1/cos u    (C.02.3b)
    cot u = 1/tan u    (C.02.3c)

Pythagorean identities
    1 = sin^2 u + cos^2 u    (C.02.4a)
    sec^2 u = 1 + tan^2 u    (C.02.4b)
    csc^2 u = 1 + cot^2 u    (C.02.4c)

Co-function identities
    sin(π/2 − u) = cos u    (C.02.5a)
    cos(π/2 − u) = sin u    (C.02.5b)
    tan(π/2 − u) = cot u    (C.02.5c)
    csc(π/2 − u) = sec u    (C.02.5d)
    sec(π/2 − u) = csc u    (C.02.5e)
    cot(π/2 − u) = tan u    (C.02.5f)

Even-odd identities
    sin(−u) = −sin u    (C.02.6a)
    cos(−u) = cos u    (C.02.6b)
    tan(−u) = −tan u    (C.02.6c)

Sum-difference formulas (AM or lock-in)
    sin(u ± v) = sin u cos v ± cos u sin v    (C.02.7a)
    cos(u ± v) = cos u cos v ∓ sin u sin v    (C.02.7b)
    tan(u ± v) = (tan u ± tan v)/(1 ∓ tan u tan v)    (C.02.7c)

Double angle formulas
    sin(2u) = 2 sin u cos u    (C.02.8a)
    cos(2u) = cos^2 u − sin^2 u    (C.02.8b)
            = 2 cos^2 u − 1    (C.02.8c)
            = 1 − 2 sin^2 u    (C.02.8d)
    tan(2u) = 2 tan u/(1 − tan^2 u)    (C.02.8e)

Power-reducing or half-angle formulas
    sin^2 u = (1 − cos(2u))/2    (C.02.9a)
    cos^2 u = (1 + cos(2u))/2    (C.02.9b)
    tan^2 u = (1 − cos(2u))/(1 + cos(2u))    (C.02.9c)

Sum-to-product formulas
    sin u + sin v = 2 sin((u + v)/2) cos((u − v)/2)    (C.02.10a)
    sin u − sin v = 2 cos((u + v)/2) sin((u − v)/2)    (C.02.10b)
    cos u + cos v = 2 cos((u + v)/2) cos((u − v)/2)    (C.02.10c)
    cos u − cos v = −2 sin((u + v)/2) sin((u − v)/2)    (C.02.10d)

Product-to-sum formulas
    sin u sin v = (1/2)[cos(u − v) − cos(u + v)]    (C.02.11a)
    cos u cos v = (1/2)[cos(u − v) + cos(u + v)]    (C.02.11b)
    sin u cos v = (1/2)[sin(u + v) + sin(u − v)]    (C.02.11c)
    cos u sin v = (1/2)[sin(u + v) − sin(u − v)]    (C.02.11d)

Two-to-one formulas
    A sin u + B cos u = C sin(u + φ)    (C.02.12a)
                      = C cos(u + ψ), where    (C.02.12b)
    C = √(A^2 + B^2)    (C.02.12c)
    φ = arctan(B/A)    (C.02.12d)
    ψ = −arctan(A/B)    (C.02.12e)
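The two-to-one formulas (C.02.12) are easy to verify numerically at a few sample points; a quick sketch (using atan2, which agrees with arctan(B/A) when A > 0):

```python
import math

A, B = 3.0, 4.0
C = math.sqrt(A**2 + B**2)       # (C.02.12c)
phi = math.atan2(B, A)           # arctan(B/A), quadrant-aware
for u in [0.0, 0.7, 2.0, -1.3]:
    lhs = A * math.sin(u) + B * math.cos(u)
    rhs = C * math.sin(u + phi)  # (C.02.12a)
    assert abs(lhs - rhs) < 1e-12
print("A sin u + B cos u = C sin(u + phi) verified at sample points")
```

This identity is what lets a sine and cosine at the same frequency be treated as a single sinusoid with amplitude C and phase φ.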

C.03 Matrix inverses

This is a guide to inverting 1×1, 2×2, and n×n matrices.

Let A be the 1×1 matrix
    A = [a].
The inverse is simply the reciprocal:
    A^−1 = [1/a].

Let B be the 2×2 matrix
    B = [b11 b12; b21 b22].
It can be shown that the inverse follows a simple pattern:
    B^−1 = (1/det B) [b22 −b12; −b21 b11]
         = (1/(b11 b22 − b12 b21)) [b22 −b12; −b21 b11].

Let C be an n×n matrix. It can be shown that its inverse is
    C^−1 = (1/det C) adj C,
where adj is the adjoint of C.
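The 2×2 pattern can be coded directly; a minimal sketch in pure Python:

```python
def inv2(B):
    """Inverse of a 2x2 matrix [[b11, b12], [b21, b22]],
    following the pattern above."""
    (b11, b12), (b21, b22) = B
    det = b11 * b22 - b12 * b21
    if det == 0:
        raise ValueError("matrix is singular")
    return [[b22 / det, -b12 / det],
            [-b21 / det, b11 / det]]

B = [[4.0, 7.0], [2.0, 6.0]]
print(inv2(B))  # [[0.6, -0.7], [-0.2, 0.4]]
```

Swapping the diagonal, negating the off-diagonal, and dividing by the determinant is worth memorizing; for larger matrices one uses the adjoint formula or, in practice, a numerical solver.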

C.04 Laplace transforms

The definitions of the one-sided Laplace and inverse Laplace transforms follow.

Definition C.04.1: Laplace transforms (one-sided)

Laplace transform L:
    L(y(t)) = Y(s) = ∫_0^∞ y(t) e^(−st) dt    (C.04.1)
Inverse Laplace transform L^−1:
    L^−1(Y(s)) = y(t) = (1/(2πj)) ∫_(σ−j∞)^(σ+j∞) Y(s) e^(st) ds    (C.04.2)

See Table B.1 for a list of properties and common transforms.

D

Complex analysis

D Complex analysis Eulerrsquos formulas p 1

Eulerrsquos formulaD01 Eulerrsquos formulas

Euler's formula is our bridge back and forth between trigonometric forms ($\cos\theta$ and $\sin\theta$) and complex exponential form ($e^{j\theta}$):

$$e^{j\theta} = \cos\theta + j\sin\theta \tag{D01.1}$$

Here are a few useful identities implied by Euler's formula:

\begin{align}
e^{-j\theta} &= \cos\theta - j\sin\theta \tag{D01.2a} \\
\cos\theta &= \operatorname{Re}\big(e^{j\theta}\big) \tag{D01.2b} \\
&= \frac{1}{2}\big(e^{j\theta} + e^{-j\theta}\big) \tag{D01.2c} \\
\sin\theta &= \operatorname{Im}\big(e^{j\theta}\big) \tag{D01.2d} \\
&= \frac{1}{j2}\big(e^{j\theta} - e^{-j\theta}\big) \tag{D01.2e}
\end{align}

Bibliography

[1] Robert B Ash Basic Probability Theory

Dover Publications Inc 2008

[2] Joan Bagaria Set Theory In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2019

Metaphysics Research Lab Stanford

University 2019

[3] Maria Baghramian and J Adam

Carter Relativism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Winter 2019

Metaphysics Research Lab Stanford

University 2019

[4] Stephen Barker and Mark Jago Being

Positive About Negative Facts In

Philosophy and Phenomenological

Research 85.1 (2012), pp. 117–138. doi: 10.1111/j.1933-1592.2010.00479.x

[5] Jonathan Barzilai and Jonathan M

Borwein Two-Point Step Size Gradient

Methods In IMA Journal of Numerical

Analysis 8.1 (Jan. 1988), pp. 141–148. issn: 0272-4979. doi: 10.1093/imanum/8.1.141

This includes an innovative line search

method


[6] Anat Biletzki and Anat Matar Ludwig

Wittgenstein In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Summer 2018

Metaphysics Research Lab Stanford

University 2018 An introduction to

Wittgenstein and his thought

[7] Antonio Bove F (Ferruccio) Colombini

and Daniele Del Santo Phase space

analysis of partial differential

equations eng Progress in nonlinear

differential equations and their

applications v. 69. Boston; Berlin:
Birkhäuser 2006 isbn 9780817645212

[8] William L Brogan Modern Control Theory

Third Prentice Hall 1991

[9] Francesco Bullo and Andrew D Lewis

Geometric control of mechanical systems

modeling analysis and design for simple

mechanical control systems Ed by JE

Marsden L Sirovich and M Golubitsky

Springer 2005

[10] Francesco Bullo and Andrew D Lewis

Supplementary Chapters for Geometric

Control of Mechanical Systems Jan

2005

[11] A Choukchou-Braham et al Analysis and

Control of Underactuated Mechanical

Systems SpringerLink Bücher Springer

International Publishing 2013 isbn

9783319026367

[12] K Ciesielski Set Theory for the

Working Mathematician London

Mathematical Society Student Texts

Cambridge University Press 1997 isbn

9780521594653 A readable introduction

to set theory

D Complex analysis Bibliography p 4

[13] Marian David The Correspondence

Theory of Truth In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Fall 2016 Metaphysics

Research Lab Stanford University

2016 A detailed overview of the

correspondence theory of truth

[14] David Dolby Wittgenstein on Truth

In A Companion to Wittgenstein John

Wiley & Sons Ltd 2016 Chap 27 pp 433–442 isbn 9781118884607 doi 10.1002/9781118884607.ch27

[15] Peter J Eccles An Introduction to

Mathematical Reasoning Numbers

Sets and Functions Cambridge

University Press 1997 doi 10.1017/CBO9780511801136 A gentle

introduction to mathematical reasoning

It includes introductory treatments of

set theory and number systems

[16] HB Enderton Elements of Set

Theory Elsevier Science 1977 isbn

9780080570426 A gentle introduction to

set theory and mathematical reasoningmdasha

great place to start

[17] Richard P Feynman Robert B Leighton

and Matthew Sands The Feynman

Lectures on Physics New Millennium

Perseus Basic Books 2010

[18] Michael Glanzberg Truth In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2018

Metaphysics Research Lab Stanford

University 2018

[19] Hans Johann Glock Truth in the

Tractatus In Synthese 148.2 (Jan. 2006), pp. 345–368. issn: 1573-0964. doi: 10.1007/s11229-004-6226-2


[20] Mario Gómez-Torrente Alfred Tarski

In The Stanford Encyclopedia of

Philosophy Ed by Edward N Zalta

Spring 2019 Metaphysics Research Lab

Stanford University 2019

[21] Paul Guyer and Rolf-Peter Horstmann

Idealism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Winter 2018

Metaphysics Research Lab Stanford

University 2018

[22] R Haberman Applied Partial Differential

Equations with Fourier Series and

Boundary Value Problems (Classic

Version) Pearson Modern Classics for

Advanced Mathematics Pearson Education

Canada 2018 isbn 9780134995434

[23] GWF Hegel and AV Miller

Phenomenology of Spirit Motilal

Banarsidass 1998 isbn 9788120814738

[24] Wilfrid Hodges Model Theory In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2018

Metaphysics Research Lab Stanford

University 2018

[25] Wilfrid Hodges Tarski's Truth

Definitions In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Fall 2018 Metaphysics

Research Lab Stanford University 2018

[26] Peter Hylton and Gary Kemp Willard

Van Orman Quine In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Spring 2019 Metaphysics

Research Lab Stanford University 2019

[27] ET Jaynes et al Probability Theory The

Logic of Science Cambridge University

Press 2003 isbn 9780521592710


An excellent and comprehensive

introduction to probability theory

[28] I Kant P Guyer and AW Wood Critique

of Pure Reason The Cambridge Edition

of the Works of Immanuel Kant

Cambridge University Press 1999 isbn

9781107268333

[29] Juliette Kennedy Kurt Gödel In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Winter 2018

Metaphysics Research Lab Stanford

University 2018

[30] Drew Khlentzos Challenges to

Metaphysical Realism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Winter 2016

Metaphysics Research Lab Stanford

University 2016

[31] Peter Klein Skepticism In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Summer 2015

Metaphysics Research Lab Stanford

University 2015

[32] M Kline Mathematics The Loss

of Certainty A Galaxy book

Oxford University Press 1982 isbn

9780195030853 A detailed account

of the "illogical" development of

mathematics and an exposition of

its therefore remarkable utility in

describing the world

[33] W Richard Kolk and Robert A Lerman

Nonlinear System Dynamics 1st ed

Springer US 1993 isbn 978-1-4684-6496-2

[34] Erwin Kreyszig Advanced Engineering

Mathematics 10th John Wiley & Sons

Limited 2011 isbn 9781119571094 The

authoritative resource for engineering

mathematics It includes detailed


accounts of probability statistics

vector calculus linear algebra

fourier analysis ordinary and partial

differential equations and complex

analysis It also includes several other

topics with varying degrees of depth

Overall it is the best place to start

when seeking mathematical guidance

[35] John M Lee Introduction to Smooth

Manifolds second Vol 218 Graduate

Texts in Mathematics Springer 2012

[36] Catherine Legg and Christopher

Hookway Pragmatism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Spring 2019 Metaphysics

Research Lab Stanford University

2019 An introductory article on the

philosophical movement "pragmatism" It
includes an important clarification of
the pragmatic slogan "truth is the end of
inquiry"

[37] Daniel Liberzon Calculus of Variations

and Optimal Control Theory A Concise

Introduction Princeton University Press

2012 isbn 9780691151878

[38] Panu Raatikainen Gödel's Incompleteness

Theorems In The Stanford Encyclopedia

of Philosophy Ed by Edward N Zalta

Fall 2018 Metaphysics Research Lab

Stanford University 2018 A thorough

and contemporary description of

Gödel's incompleteness theorems which

have significant implications for the

foundations and function of mathematics

and mathematical truth

[39] Paul Redding Georg Wilhelm Friedrich

Hegel In The Stanford Encyclopedia

of Philosophy Ed by Edward N Zalta

Summer 2018 Metaphysics Research Lab

Stanford University 2018


[40] HM Schey Div Grad Curl and All that An

Informal Text on Vector Calculus WW

Norton 2005 isbn 9780393925166

[41] Christopher Shields Aristotle In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Winter 2016

Metaphysics Research Lab Stanford

University 2016

[42] Steven S Skiena Calculated Bets

Computers Gambling and Mathematical

Modeling to Win Outlooks Cambridge

University Press 2001 doi 10.1017/CBO9780511547089 This includes a lucid section on probability versus statistics also available here: https://www3.cs.stonybrook.edu/~skiena/jaialai/excerpts/node12.html

[43] George Smith Isaac Newton In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2008

Metaphysics Research Lab Stanford

University 2008

[44] Daniel Stoljar and Nic Damnjanovic

The Deflationary Theory of Truth

In The Stanford Encyclopedia of

Philosophy Ed by Edward N Zalta Fall

2014 Metaphysics Research Lab Stanford

University 2014

[45] WA Strauss Partial Differential

Equations An Introduction Wiley 2007

isbn 9780470054567 A thorough and yet

relatively compact introduction

[46] SH Strogatz and M Dichter Nonlinear

Dynamics and Chaos Second Studies

in Nonlinearity Avalon Publishing 2016

isbn 9780813350844

[47] Mark Textor States of Affairs In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Winter 2016


Metaphysics Research Lab Stanford

University 2016

[48] Pauli Virtanen et al SciPy

1.0 – Fundamental Algorithms for Scientific Computing in Python In arXiv e-prints arXiv:1907.10121 (July 2019) arXiv: 1907.10121

[49] Wikipedia Algebra — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Algebra&oldid=920573802 [Online; accessed 26-October-2019] 2019

[50] Wikipedia Carl Friedrich Gauss — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Carl%20Friedrich%20Gauss&oldid=922692291 [Online; accessed 26-October-2019] 2019

[51] Wikipedia Euclid — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Euclid&oldid=923031048 [Online; accessed 26-October-2019] 2019

[52] Wikipedia First-order logic — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=First-order%20logic&oldid=921437906 [Online; accessed 29-October-2019] 2019

[53] Wikipedia Fundamental interaction — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Fundamental%20interaction&oldid=925884124 [Online; accessed 16-November-2019] 2019

[54] Wikipedia Leonhard Euler — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Leonhard%20Euler&oldid=921824700 [Online; accessed 26-October-2019] 2019

[55] Wikipedia Linguistic turn — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Linguistic%20turn&oldid=922305269 [Online; accessed 23-October-2019] 2019 Hey, we all do it

[56] Wikipedia Probability space — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Probability%20space&oldid=914939789 [Online; accessed 31-October-2019] 2019

[57] Wikipedia Propositional calculus — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Propositional%20calculus&oldid=914757384 [Online; accessed 29-October-2019] 2019

[58] Wikipedia Quaternion — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Quaternion&oldid=920710557 [Online; accessed 26-October-2019] 2019

[59] Wikipedia Set-builder notation — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=Set-builder%20notation&oldid=917328223 [Online; accessed 29-October-2019] 2019

[60] Wikipedia William Rowan Hamilton — Wikipedia The Free Encyclopedia http://en.wikipedia.org/w/index.php?title=William%20Rowan%20Hamilton&oldid=923163451 [Online; accessed 26-October-2019] 2019

[61] L Wittgenstein PMS Hacker and J

Schulte Philosophical Investigations

Wiley 2010 isbn 9781444307979

[62] Ludwig Wittgenstein Tractatus Logico-Philosophicus Ed by C K Ogden Project Gutenberg International Library of Psychology Philosophy and Scientific Method Kegan Paul Trench Trubner & Co Ltd 1922 A brilliant work on what is possible to express in language, and what is not As Wittgenstein puts it, "What can be said at all can be said clearly, and whereof one cannot speak, thereof one must be silent"

[63] Slavoj Žižek Less Than Nothing

Hegel and the Shadow of Dialectical

Materialism Verso 2012 isbn

9781844678976 This is one of the most

interesting presentations of Hegel and

Lacan by one of the most exciting

contemporary philosophers


01 Mathematics itself

Truth

1. For much of this lecture I rely on the thorough overview of Glanzberg (Michael Glanzberg. Truth. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018).

1. Before we can discuss mathematical truth, we should begin with a discussion of truth itself.¹ It is important to note that this is obviously extremely incomplete. My aim is to give a sense of the subject via brutal (mis)abbreviation.

2. Of course, the study of truth cannot but be entangled with the study of the world as such (metaphysics) and of knowledge (epistemology). Some of the following theories presuppose or imply a certain metaphysical and/or epistemological theory, but which these are is controversial.

Neo-classical theories of truth

3. The neo-classical theories of truth take for granted that there is truth and attempt to explain what its precise nature is (Michael Glanzberg. Truth. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Fall 2018. Metaphysics Research Lab, Stanford University, 2018). What are provided here are modern understandings of theories developed primarily in the early 20th century.

The correspondence theory

4. A version of what is called the correspondence theory of truth is the following:

A proposition is true iff there is an existing entity in the world that corresponds with it.

2. This is typically put in terms of "beliefs" or "judgments", but for brevity and parallelism I have cast it in terms of propositions. It is to this theory I have probably committed the most violence.

Such existing entities are called facts. Facts are relational in that their parts (e.g. subject, predicate, etc.) are related in a certain way.

5. Under this theory, then, if a proposition does not correspond to a fact, it is false.

6. This theory of truth is rather intuitive and consistently popular (Marian David, "The Correspondence Theory of Truth", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Fall 2016, Metaphysics Research Lab, Stanford University, 2016; a detailed overview of the correspondence theory of truth).

The coherence theory

7. The coherence theory of truth is adamant that the truth of any given proposition is only as good as its holistic system of propositions.2 This includes (but typically goes beyond) a requirement for consistency of a given proposition with the whole and the self-consistency of the whole itself, sometimes called coherence.

8. For parallelism, let's attempt a succinct formulation of this theory, cast in terms of propositions:

A proposition is true iff it has coherence with a system of propositions.

3. Pragmatism was an American philosophical movement of the early 20th century that valued the success of "practical" application of theories. For an introduction, see Legg and Hookway (Catherine Legg and Christopher Hookway, "Pragmatism", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Spring 2019, Metaphysics Research Lab, Stanford University, 2019; an introductory article on the philosophical movement "pragmatism"; it includes an important clarification of the pragmatic slogan "truth is the end of inquiry").

4. This is especially congruent with the work of William James (ibid.).

9. Note that this has no reference to facts whatsoever. However, it need not necessarily preclude them.

The pragmatic theory

10. Of the neo-classical theories of truth, this is probably the least agreed upon as having a single, clear statement (Glanzberg, "Truth"). However, as with pragmatism in general,3 the pragmatic truth is oriented practically.

11. Perhaps the most important aspect of this theory is that it is thoroughly a correspondence theory, agreeing that true propositions are those that correspond to the world. However, there is a different focus here that differentiates it from correspondence theory proper: it values as more true that which has some sort of practical use in human life.

12. We'll try to summarize pragmatism in two slogans with slightly different emphases; here's the first, again cast in propositional parallel:

A proposition is true iff it works.4

Now there are two ways this can be understood: (a) the proposition "works" in that it empirically corresponds to the world, or (b) the proposition "works" in that it has an effect that some agent intends. The former is pretty standard correspondence theory. The latter is new and fairly obviously has ethical implications, especially today.

5. This is especially congruent with the work of Charles Sanders Peirce (Legg and Hookway, "Pragmatism").

13. Let us turn to a second formulation:

A proposition is true if it corresponds with a process of inquiry.5

This has two interesting facets: (a) an agent's active inquiry creates truth, and (b) it is a sort of correspondence theory that requires a correspondence of a proposition with a process of inquiry, not, as in the correspondence theory, with a fact about the world. The latter has shades of both correspondence theory and coherence theory.

The picture theory

Before we delve into this theory, we must take a moment to clarify some terminology.

States of affairs and facts

14. When discussing the correspondence theory, we have used the term fact to mean an actual state of things in the world. A problem arises in the correspondence theory here. It says that a proposition is true iff there is a fact that corresponds with it. What of a negative proposition like "there are no cows in Antarctica"? We would seem to need a corresponding "negative fact" in the world to make this true. If a fact is taken to be composed of a complex of actual objects and relations, it is hard to imagine such facts.6

6. But Barker and Jago (Stephen Barker and Mark Jago, "Being Positive About Negative Facts", Philosophy and Phenomenological Research 85:1 (2012), pp. 117-138, DOI 10.1111/j.1933-1592.2010.00479.x) have attempted just that.

7. See Wittgenstein (Ludwig Wittgenstein, Tractatus Logico-Philosophicus, ed. C. K. Ogden, Project Gutenberg, International Library of Psychology, Philosophy and Scientific Method, Kegan Paul, Trench, Trubner & Co., Ltd., 1922; a brilliant work on what is possible to express in language, and what is not; as Wittgenstein puts it, "What can be said at all can be said clearly; and whereof one cannot speak thereof one must be silent"), Biletzki and Matar (Anat Biletzki and Anat Matar, "Ludwig Wittgenstein", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Summer 2018, Metaphysics Research Lab, Stanford University, 2018; an introduction to Wittgenstein and his thought), Glock (Hans-Johann Glock, "Truth in the Tractatus", Synthese 148:2 (2006)), and Dolby (David Dolby, "Wittgenstein on Truth", in A Companion to Wittgenstein, John Wiley & Sons, Ltd, 2016, chap. 27, pp. 433-442, ISBN 9781118884607, DOI 10.1002/9781118884607.ch27).

15. Furthermore, if a proposition is true, it seems that it is the corresponding fact that makes it so; what, then, makes a proposition false, since there is no fact to support the falsity (Mark Textor, "States of Affairs", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Winter 2016, Metaphysics Research Lab, Stanford University, 2016)?

16. And what of nonsense? There are some propositions, like "there is a round cube", that are neither true nor false. However, the preceding correspondence theory cannot differentiate between false and nonsensical propositions.

17. A state of affairs is something possible that may or may not be actual (ibid.). If a state of affairs is actual, it is said to obtain. The picture theory will make this concept central, instead of that of the fact.

The picture theory of meaning (and truth)

18. The picture theory of meaning uses the analogy of the model or picture to explain the meaningfulness of propositions:7

A proposition names possible objects and arranges these names to correspond to a state of affairs.

Figure 011: a representation of the picture theory.

See Figure 011. This also allows for an easy account of truth, falsity, and nonsense:

nonsense: A sentence that appears to be a proposition is actually not if the arrangement of named objects is impossible. Such a sentence is simply nonsense.

truth: A proposition is true if the state of affairs it depicts obtains.

falsity: A proposition is false if the state of affairs it depicts does not obtain.

19. Now, some (Hans-Johann Glock, "Truth in the Tractatus", Synthese 148:2 (Jan. 2006), pp. 345-368, ISSN 1573-0964, DOI 10.1007/s11229-004-6226-2) argue this is a correspondence theory, and others (David Dolby, "Wittgenstein on Truth", in A Companion to Wittgenstein, John Wiley & Sons, Ltd, 2016, chap. 27, pp. 433-442, ISBN 9781118884607, DOI 10.1002/9781118884607.ch27) that it is not. In any case, it certainly solves some issues that have plagued the correspondence theory.

"What cannot be said must be shown"

20. Something the picture theory does is declare a limit on what can meaningfully be said. A proposition (as defined above) must be potentially true or false. Therefore, something that cannot be false (something necessarily true) cannot be a proposition (ibid.). And there are certain things that are necessarily true for language itself to be meaningful, paradigmatically the logical structure of the world. What a proposition does, then, is show via its own logical structure the necessary (for there to be meaningful propositions at all) logical structure of the world.8

8. See also Žižek (Slavoj Žižek, Less Than Nothing: Hegel and the Shadow of Dialectical Materialism, Verso, 2012, ISBN 9781844678976; this is one of the most interesting presentations of Hegel and Lacan by one of the most exciting contemporary philosophers; pp. 25-6), from whom I stole the section title.

9. This was one of the contributions to the "linguistic turn" of philosophy in the early 20th century (Wikipedia, "Linguistic turn", Wikipedia, The Free Encyclopedia, en.wikipedia.org/w/index.php?title=Linguistic%20turn&oldid=922305269 [Online; accessed 23-October-2019], 2019; hey, we all do it).

21. An interesting feature of this perspective is that it opens up language itself to analysis and limitation.9 And, furthermore, it suggests that the set of what is is smaller than the set of what can be meaningfully spoken about.

The relativity of truth

22. Each subject (i.e. agent) in the world, with their propositions, has a perspective: a given moment, a given place, an historical-cultural-linguistic situation. At the very least, the truth of propositions must account for this. Just how a theory of truth should do so is a matter of significant debate (Maria Baghramian and J. Adam Carter, "Relativism", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Winter 2019, Metaphysics Research Lab, Stanford University, 2019).

23. Some go so far as to be skeptical about truth (Peter Klein, "Skepticism", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Summer 2015, Metaphysics Research Lab, Stanford University, 2015), regarding it to be entirely impossible. Others say that, while a proposition may or may not be true, we could never come to know this.

24. Often underlying this conversation is the question of there being a common world in which we all participate and, if so, whether or not we can properly represent this world in language such that multiple subjects could come to justifiably agree or disagree on the truth of a proposition. If every proposition is so relative that it is relevant to only the proposer, truth would seem of little value. On the other hand, if truth is understood to be "objective" (independent of subjective perspective), a number of objections can be made (Baghramian and Carter, "Relativism"), such as that there is no non-subjective perspective from which to judge truth.

10. Especially notable here is the work of Alfred Tarski in the mid-20th century.

Other ideas about truth

25. There are too many credible ideas about truth to attempt a reasonable summary; however, I will attempt to highlight a few important ones.

Formal methods

26. A set of tools was developed for exploring theories of truth, especially correspondence theories.10 Focus turned from beliefs to sentences, which are akin to propositions. (Recall that the above theories have already been recast in the more modern language of propositions.) Another aspect of these sentences under consideration is that they begin to be taken as interpreted sentences: they already have meaning.

27. Beyond this, several technical apparatus are introduced that formalize criteria for truth. For instance, a sentence is given a sign φ. A need arises to distinguish between the quotation of sentence φ and the unquoted sentence φ, which is then given the quasi-quotation notation ⌜φ⌝. For instance, let φ stand for snow is white; then φ → snow is white, and ⌜φ⌝ → "snow is white". Tarski introduces Convention T, which states that, for a fixed language L with fully interpreted sentences (Glanzberg, "Truth"):

An adequate theory of truth for L must imply, for each sentence φ of L,

⌜φ⌝ is true if and only if φ.

Using the same example, then,

"snow is white" is true if and only if snow is white.

Convention T states a general rule for the adequacy of a theory of truth and is used in several contemporary theories.

28. We can see that these formal methods get quite technical and fun. For more, see Hodges, Gómez-Torrente, and Hylton and Kemp.11

11. Wilfrid Hodges, "Tarski's Truth Definitions", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Fall 2018, Metaphysics Research Lab, Stanford University, 2018; Mario Gómez-Torrente, "Alfred Tarski", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Spring 2019, Metaphysics Research Lab, Stanford University, 2019; Peter Hylton and Gary Kemp, "Willard Van Orman Quine", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Spring 2019, Metaphysics Research Lab, Stanford University, 2019.
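Convention T's biconditionals can be illustrated with a toy fixed language of fully interpreted sentences. The two-sentence language, the "world", and the truth predicate below are my hypothetical constructions, not Tarski's formalism:

```python
# A toy "fixed language L": each sentence is paired with its truth
# condition, i.e., the sentences are fully interpreted.
world = {"snow": "white", "grass": "green"}  # hypothetical facts

interpretation = {
    "snow is white": lambda w: w["snow"] == "white",
    "grass is red": lambda w: w["grass"] == "red",
}

def is_true(sentence):
    """A truth predicate over L: returns whether the sentence holds."""
    return interpretation[sentence](world)

# Instances of Convention T: ⌜φ⌝ is true if and only if φ.
assert is_true("snow is white") == (world["snow"] == "white")  # both hold
assert is_true("grass is red") == (world["grass"] == "red")    # neither holds
```

Each assertion is one instance of the T-schema for this toy language; an adequate truth predicate must satisfy every such instance.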

Deflationary theories

29. Deflationary theories of truth try to minimize, or eliminate altogether, the concept of, or use of the term, "truth". For instance, the redundancy theory claims that (Glanzberg, "Truth"):

To assert that ⌜φ⌝ is true is just to assert that φ.

Therefore, we can eliminate the use of "is true".

30. For more of less, see Stoljar and Damnjanovic.12

12. Daniel Stoljar and Nic Damnjanovic, "The Deflationary Theory of Truth", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Fall 2014, Metaphysics Research Lab, Stanford University, 2014.

Language

31. It is important to recognize that language mediates truth; that is, truth is embedded in language. The way language in general affects theories of truth has been studied extensively. For instance, whether the truth-bearer is a belief or a proposition or a sentence, or something else, has been much discussed. The importance of the meaning of truth-bearers like sentences has played another large role. Theories of meaning, like the picture theory presented above, are often closely intertwined with theories of truth.

32. One of the most popular theories of meaning is called the theory of use:

For a large class of cases of the employment of the word "meaning" – though not for all – this word can be explained in this way: the meaning of a word is its use in the language. (L. Wittgenstein, P. M. S. Hacker, and J. Schulte, Philosophical Investigations, Wiley, 2010, ISBN 9781444307979)

13. Drew Khlentzos, "Challenges to Metaphysical Realism", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Winter 2016, Metaphysics Research Lab, Stanford University, 2016.

This theory is accompanied by the concept of language-games, which are loosely defined as rule-based contexts within which sentences have uses. The idea is that the meaning of a given sentence is its use in a network of meaning that is constantly evolving. This view tends to be understood as deflationary or relativistic about truth.

Metaphysical and epistemological considerations

33. We began with the recognition that truth is intertwined with metaphysics and epistemology. Let's consider a few such topics.

34. The first is metaphysical realism, which claims that there is a world existing objectively, independently of how we think about or describe it. This "realism" tends to be closely tied to, yet distinct from, scientific realism, which goes further, claiming the world is "actually" as science describes it, independently of the scientific descriptions (e.g. there are actual objects corresponding to the phenomena we call atoms, molecules, light particles, etc.).

35. There have been many challenges to the realist claim (for some recent versions, see Khlentzos13), put forth by what is broadly called anti-realism. These vary, but often challenge the ability of realists to properly link language to supposed objects in the world.

14. These definitions are explicated by Guyer and Horstmann (Paul Guyer and Rolf-Peter Horstmann, "Idealism", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Winter 2018, Metaphysics Research Lab, Stanford University, 2018).

36. Metaphysical idealism has been characterized as claiming that "mind" or "subjectivity" generates or completely composes the world, which has no being outside mind. Epistemological idealism, on the other hand, while perhaps conceding that there is a world independent of mind, claims all knowledge of the world is created through mind and for mind, and therefore can never escape a sort of mind-world gap.14 This epistemological idealism has been highly influential since the work of Immanuel Kant (I. Kant, P. Guyer, and A. W. Wood, Critique of Pure Reason, The Cambridge Edition of the Works of Immanuel Kant, Cambridge University Press, 1999, ISBN 9781107268333) in the late 18th century, which ushered in the idea of the noumenal world in-itself and the phenomenal world, which is how the noumenal world presents to us. Many have held that phenomena can be known through inquiry, whereas noumena are inaccessible. Furthermore, what can be known is restricted by the categories pre-existent in the knower.

37. Another approach, taken by Georg Wilhelm Friedrich Hegel (Paul Redding, "Georg Wilhelm Friedrich Hegel", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Summer 2018, Metaphysics Research Lab, Stanford University, 2018) and other German idealists following Kant, is to reframe reality as thoroughly integrating subjectivity (G. W. F. Hegel and A. V. Miller, Phenomenology of Spirit, Motilal Banarsidass, 1998, ISBN 9788120814738; Slavoj Žižek, Less Than Nothing: Hegel and the Shadow of Dialectical Materialism, Verso, 2012, ISBN 9781844678976; this is one of the most interesting presentations of Hegel and Lacan by one of the most exciting contemporary philosophers); that is, "everything turns on grasping and expressing the True, not only as Substance, but equally as Subject." A subject's proposition is true inasmuch as it corresponds with its Notion (approximately, the idea or meaning for the subject). Some hold that this idealism is compatible with a sort of metaphysical realism, at least as far as understanding is not independent of, but rather beholden to, reality (Žižek, Less Than Nothing, p. 906 ff.).

38. Clearly, all these ideas have many implications for theories of truth, and vice versa.

Where this leaves us

39. The truth is hard. What may at first appear to be a simple concept becomes complex upon analysis. It is important to recognize that we have only sampled some highlights of the theories of truth. I recommend further study of this fascinating topic.

40. Despite the difficulties of finding definitive grounds for understanding truth, we are faced with the task of provisionally forging ahead. Much of what follows in the study of mathematics makes certain implicit and explicit assumptions about truth. However, we have found that the foundations of these assumptions may themselves be problematic. It is my contention that, despite the lack of clear foundations, it is still worth studying engineering analysis, its mathematical foundations, and the foundations of truth itself. My justification for this claim is that I find the utility and the beauty of this study highly rewarding.

The foundations of mathematics

15. Throughout this section, for the history of mathematics, I rely heavily on Kline (M. Kline, Mathematics: The Loss of Certainty, A Galaxy Book, Oxford University Press, 1982, ISBN 9780195030853; a detailed account of the "illogical" development of mathematics and an exposition of its therefore remarkable utility in describing the world).

1. Mathematics has long been considered exemplary for establishing truth. Primarily, it uses a method that begins with axioms (unproven propositions that include undefined terms) and uses logical deduction to prove other propositions (theorems), showing that they are necessarily true if the axioms are.

2. It may seem obvious that truth established in this way would always be relative to the truth of the axioms, but, throughout history, this footnote was often obscured by the "obvious" or "intuitive" universal truth of the axioms.15 For instance, Euclid (Wikipedia, "Euclid", Wikipedia, The Free Encyclopedia, en.wikipedia.org/w/index.php?title=Euclid&oldid=923031048 [Online; accessed 26-October-2019], 2019) founded geometry (the study of mathematical objects traditionally considered to represent physical space, like points, lines, etc.) on axioms thought so solid that it was not until the early 19th century that Carl Friedrich Gauss (Wikipedia, "Carl Friedrich Gauss", Wikipedia, The Free Encyclopedia, en.wikipedia.org/w/index.php?title=Carl%20Friedrich%20Gauss&oldid=922692291 [Online; accessed 26-October-2019], 2019) and others recognized this was only one among many possible geometries resting on different axioms (M. Kline, Mathematics: The Loss of Certainty, A Galaxy Book, Oxford University Press, 1982, ISBN 9780195030853; a detailed account of the "illogical" development of mathematics and an exposition of its therefore remarkable utility in describing the world). Furthermore, Aristotle (Christopher Shields, "Aristotle", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Winter 2016, Metaphysics Research Lab, Stanford University, 2016) had acknowledged that reasoning must begin with undefined terms; however, even Euclid (presumably aware of Aristotle's work) seemed to forget this and provided definitions, obscuring the foundations of his work and starting mathematics on a path that, for over 2000 years, would forget its own relativity (Kline, Mathematics: The Loss of Certainty, pp. 101-2).

3. The foundations of Euclid were even shakier than its murky starting point: several unstated axioms were used in proofs, and some proofs were otherwise erroneous. However, for two millennia, mathematics was seen as the field wherein truth could be established beyond doubt.

Algebra ex nihilo

4. Although not much new geometry appeared during this period, the field of algebra (Wikipedia, "Algebra", Wikipedia, The Free Encyclopedia, en.wikipedia.org/w/index.php?title=Algebra&oldid=920573802 [Online; accessed 26-October-2019], 2019), the study of manipulations of symbols standing for numbers in general, began with no axiomatic foundation whatsoever. The Greeks had a notion of rational numbers, ratios of natural numbers (positive integers), and it was known that many solutions to algebraic equations were irrational (could not be expressed as a ratio of integers). But these irrational numbers, like virtually everything else in algebra, were gradually accepted because they were so useful in solving practical problems (they could be approximated by rational numbers, and this seemed good enough). The rules of basic arithmetic were accepted as applying to these and other forms of new numbers that arose in algebraic solutions: negative, imaginary, and complex numbers.

The application of mathematics to science

5. During this time, mathematics was being applied to optics and astronomy. Sir Isaac Newton then built calculus upon algebra, applying it to what is now known as Newtonian mechanics, which was really more the product of Leonhard Euler (George Smith, "Isaac Newton", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Fall 2008, Metaphysics Research Lab, Stanford University, 2008; Wikipedia, "Leonhard Euler", Wikipedia, The Free Encyclopedia, en.wikipedia.org/w/index.php?title=Leonhard%20Euler&oldid=921824700 [Online; accessed 26-October-2019], 2019). Calculus introduced its own dubious operations, but the success of mechanics in describing and predicting physical phenomena was astounding. Mathematics was hailed as the language of God (later, Nature).

The rigorization of mathematics

6. It was not until Gauss created non-Euclidean geometry, in which Euclid's axioms were shown to be only one of many possible sets of axioms compatible with the world, and William Rowan Hamilton (Wikipedia, "William Rowan Hamilton", Wikipedia, The Free Encyclopedia, en.wikipedia.org/w/index.php?title=William%20Rowan%20Hamilton&oldid=923163451 [Online; accessed 26-October-2019], 2019) created quaternions (Wikipedia, "Quaternion", Wikipedia, The Free Encyclopedia, en.wikipedia.org/w/index.php?title=Quaternion&oldid=920710557 [Online; accessed 26-October-2019], 2019), a number system in which multiplication is noncommutative, that it became apparent something was fundamentally wrong with the way truth in mathematics had been understood. This started a period of rigorization in mathematics that set about axiomatizing and proving 19th-century mathematics. This included the development of symbolic logic, which aided in the process of deductive reasoning.

16. The branch of mathematics called model theory concerns itself with general types of models that can be made from a given formal system, like an axiomatic mathematical system. For more on model theory, see Hodges (Wilfrid Hodges, "Model Theory", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Fall 2018, Metaphysics Research Lab, Stanford University, 2018). It is noteworthy that the engineering/science use of the term "mathematical model" is only loosely a "model" in the sense of model theory.
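Hamilton's defining relations i² = j² = k² = ijk = −1 make the noncommutativity of quaternion multiplication easy to check numerically. The following sketch (mine, not from the sources above) implements the Hamilton product on (w, x, y, z) coefficient tuples:

```python
def qmul(a, b):
    """Hamilton product of quaternions given as (w, x, y, z) tuples."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

assert qmul(i, i) == (-1, 0, 0, 0)   # i^2 = -1
assert qmul(i, j) == k               # ij = k ...
assert qmul(j, i) == (0, 0, 0, -1)   # ... but ji = -k: order matters
```

The last two assertions exhibit exactly the failure of commutativity that made quaternions so unsettling to the received view of mathematical truth.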

7. An aspect of this rigorization is that mathematicians came to terms with the axioms that include undefined terms. For instance, a "point" might be such an undefined term in an axiom. A mathematical model is what we create when we attach these undefined terms to objects, which can be anything consistent with the axioms.16 The system that results from proving theorems would then apply to anything "properly" described by the axioms. So two masses might be assigned "points" in a Euclidean geometric space, from which we could be confident that, for instance, the "distance" between these masses is the Euclidean norm of the line drawn between the points. It could be said, then, that a "point" in Euclidean geometry is implicitly defined by its axioms and theorems, and nothing else. That is, mathematical objects are not inherently tied to the physical objects to which we tend to apply them. Euclidean geometry is not the study of physical space, as it was long considered; it is the study of the objects implicitly defined by its axioms and theorems.
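The two-masses example can be made concrete. In the sketch below (my illustration, with made-up coordinates), each mass is attached to a "point", here modeled as a coordinate triple, and the "distance" between the masses is the Euclidean norm of the segment between the points:

```python
import math

def euclidean_distance(p, q):
    """Euclidean norm of the line segment between points p and q."""
    return math.sqrt(sum((pi - qi) ** 2 for pi, qi in zip(p, q)))

# Hypothetical masses assigned "points" in a Euclidean model of space:
mass_a = (0.0, 0.0, 0.0)
mass_b = (3.0, 4.0, 0.0)
assert euclidean_distance(mass_a, mass_b) == 5.0
```

Nothing in the function cares that the tuples represent masses; it applies to anything "properly" described by the axioms, which is the point of the passage above.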

The foundations of mathematics are built


8. The building of the modern foundations of mathematics began with clear axioms, solid reasoning (with symbolic logic), and lofty yet seemingly attainable goals: prove theorems to support the already ubiquitous mathematical techniques in geometry, algebra, and calculus from axioms; furthermore, prove that these axioms (and the things they imply) do not contradict each other, i.e. are consistent, and that the axioms are not results of each other (one that can be derived from others is a theorem, not an axiom).

9. Set theory is a type of formal axiomatic system with which all modern mathematics is expressed, so set theory is often called the foundation of mathematics (Joan Bagaria, "Set Theory", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Fall 2019, Metaphysics Research Lab, Stanford University, 2019). We will study the basics in Chapter 02. The primary objects in set theory are sets: informally, collections of mathematical objects. There is not just one set of axioms that is used as the foundation of all mathematics, for reasons we will review in a moment. However, the most popular set theory is Zermelo-Fraenkel set theory with the axiom of choice (ZFC). The axioms of ZF, sans C, are as follows (ibid.):

extensionality: If two sets A and B have the same elements, then they are equal.

empty set: There exists a set, denoted by ∅ and called the empty set, which has no elements.

pair: Given any sets A and B, there exists a set, denoted by {A, B}, which contains A and B as its only elements. In particular, there exists the set {A}, which has A as its only element.

power set: For every set A, there exists a set, denoted by P(A) and called the power set of A, whose elements are all the subsets of A.

union: For every set A, there exists a set, denoted by ⋃A and called the union of A, whose elements are all the elements of the elements of A.

infinity: There exists an infinite set. In particular, there exists a set Z that contains ∅ and such that if A ∈ Z, then ⋃{A, {A}} ∈ Z.

separation: For every set A and every given property, there is a set containing exactly the elements of A that have that property. A property is given by a formula φ of the first-order language of set theory. Thus, separation is not a single axiom but an axiom schema, that is, an infinite list of axioms, one for each formula φ.

replacement: For every given definable function with domain a set A, there is a set whose elements are all the values of the function.

10. ZFC also has the axiom of choice (Bagaria, "Set Theory"):

choice: For every set A of pairwise-disjoint, non-empty sets, there exists a set that contains exactly one element from each set in A.
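As a loose illustration only (the analogy is mine; Python frozensets are not ZFC sets), the pair and union axioms suffice to build the successor step ⋃{A, {A}} named in the infinity axiom, yielding the first few von Neumann naturals:

```python
# Hereditarily finite "sets" modeled as Python frozensets.
empty = frozenset()                # empty set axiom: ∅ exists

def pair(a, b):
    """Pair axiom: the set {A, B}."""
    return frozenset({a, b})

def union(a):
    """Union axiom: ⋃A, the elements of the elements of A."""
    return frozenset(x for s in a for x in s)

def successor(a):
    """The infinity axiom's step: ⋃{A, {A}}, i.e., A ∪ {A}."""
    return union(pair(a, pair(a, a)))   # pair(a, a) is the singleton {A}

zero = empty              # 0 = ∅
one = successor(zero)     # 1 = {∅}
two = successor(one)      # 2 = {∅, {∅}}

assert one == frozenset({zero})
assert two == frozenset({zero, one})
```

Note how everything here is built from ∅ and the axioms' operations alone, echoing the foundationalist ambition described above.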

17. Panu Raatikainen, "Gödel's Incompleteness Theorems", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Fall 2018, Metaphysics Research Lab, Stanford University, 2018; a thorough and contemporary description of Gödel's incompleteness theorems, which have significant implications for the foundations and function of mathematics and mathematical truth.

The foundations have cracks

11. The foundationalists' goal was to prove that some set of axioms from which all of mathematics can be derived is both consistent (contains no contradictions) and complete (every true statement is provable). The work of Kurt Gödel (Juliette Kennedy, "Kurt Gödel", in The Stanford Encyclopedia of Philosophy, ed. Edward N. Zalta, Winter 2018, Metaphysics Research Lab, Stanford University, 2018) in the mid-20th century shattered this dream by proving, in his first incompleteness theorem, that any consistent formal system within which one can do some amount of basic arithmetic is incomplete. His argument is worth reviewing (see Raatikainen17), but at its heart is an undecidable statement like "this sentence is unprovable." Let U stand for this statement. If it is true, it is unprovable. If it is provable, it is false. Therefore, it is true iff it is unprovable. Then he shows that if a statement A that essentially says "arithmetic is consistent" is provable, then so is the undecidable statement U. But if the system is to be consistent, U cannot be provable, and therefore neither can A be.

12. This is problematic. It tells us that virtually any conceivable axiomatic foundation of mathematics is incomplete. If one is complete, it is inconsistent (and therefore worthless). One problem this introduces is that a true theorem may be impossible to prove, and it turns out we can never know, in advance of its proof, whether it is provable.

13. But it gets worse. Gödel's second incompleteness theorem shows that such systems cannot even be shown to be consistent. This means at any moment someone could find an inconsistency in mathematics, and not only would we lose some of the theorems, we would lose them all. This is because, by what is called the material implication (Kline, Mathematics: The Loss of Certainty, pp. 187-8, 264), if one contradiction can be found, every proposition can be proven from it. And if this is the case, all (even proven) theorems in the system would be suspect.

18. Kline, Mathematics: The Loss of Certainty.

14. Even though no contradiction has yet appeared in ZFC, its axiom of choice, which is required for the proof of most of what has thus far been proven, generates the Banach-Tarski paradox, which says a sphere of diameter x can be partitioned into a finite number of pieces and recombined to form two spheres of diameter x. Troubling, to say the least. Attempts were made for a while to eliminate the use of the axiom of choice, but our buddy Gödel later proved that if ZF is consistent, so is ZFC (ibid., p. 267).

Mathematics is considered empirical

15. Since its inception, mathematics has been applied extensively to the modeling of the world. Despite its cracked foundations, it has striking utility. Many recent leading minds of mathematics, philosophy, and science suggest we treat mathematics as empirical: like any science, subject to its success in describing and predicting events in the world. As Kline18 summarizes,

    The upshot [...] is that sound mathematics must be determined not by any one foundation, which may some day prove to be right. The "correctness" of mathematics must be judged by its application to the physical world. Mathematics is an empirical science much as Newtonian mechanics. It is correct only to the extent that it works and, when it does not, it must be modified. It is not a priori knowledge even though it was so regarded for two thousand years. It is not absolute or unchangeable.

itselfreason Mathematical reasoning

itselfoverview Mathematical topics in overview

itselfengmath What is mathematics for engineering

itselfex Exercises for Chapter 01


Mathematical reasoning, logic, and set theory

In order to communicate mathematical ideas effectively, formal languages have been developed within which logic, i.e. deductive (mathematical) reasoning, can proceed. Propositions are statements that can be either true ⊤ or false ⊥. Axiomatic systems begin with statements (axioms) assumed true. Theorems are proven by deduction. In many forms of logic, like propositional calculus (Wikipedia, Propositional calculus — Wikipedia, The Free Encyclopedia, http://en.wikipedia.org/w/index.php?title=Propositional%20calculus&oldid=914757384 [Online; accessed 29-October-2019], 2019), compound propositions are constructed via logical connectives, like "and" and "or," of atomic propositions (see Lecture setslogic). In others, like first-order logic (Wikipedia, First-order logic — Wikipedia, The Free Encyclopedia, http://en.wikipedia.org/w/index.php?title=First-order%20logic&oldid=921437906 [Online; accessed 29-October-2019], 2019), there are also


logical quantifiers like "for every" and "there exists."

The mathematical objects and operations about which most propositions are made are expressed in terms of set theory, which was introduced in Lecture itselffound and will be expanded upon in Lecture setssetintro. We can say that mathematical reasoning is comprised of mathematical objects and operations expressed in set theory, and logic allows us to reason therewith.

1. K. Ciesielski, Set Theory for the Working Mathematician, London Mathematical Society Student Texts, Cambridge University Press, 1997, ISBN 9780521594653. A readable introduction to set theory.

2. H.B. Enderton, Elements of Set Theory, Elsevier Science, 1977, ISBN 9780080570426. A gentle introduction to set theory and mathematical reasoning, and a great place to start.


setssetintro Introduction to set theory

Set theory is the language of the modern foundation of mathematics, as discussed in Lecture itselffound. It is unsurprising, then, that it arises throughout the study of mathematics. We will use set theory extensively in our study of probability theory.

The axioms of ZFC set theory were introduced in Lecture itselffound. Instead of proceeding in the pure-mathematics way of introducing and proving theorems, we will opt for a more applied approach, in which we begin with some simple definitions and include basic operations. A more thorough and still readable treatment is given by Ciesielski1 and a very gentle version by Enderton2.

A set is a collection of objects. Set theory gives us a way to describe these collections. Often the objects in a set are numbers or sets of numbers. However, a set could represent collections of zebras and trees and hairballs. For instance, here are some sets:

A field is a set with special structure. This structure is provided by the addition (+) and multiplication (×) operators and their inverses, subtraction (−) and division (÷).

3. When the natural numbers include zero, we write N0.

The quintessential example of a field is the set of real numbers R, which admits these operators, making it a field. The reals R, the complex numbers C, the integers Z, and the natural numbers3 N are the sets of numbers we typically consider; strictly, though, only R and C are fields, since Z and N lack multiplicative inverses.

Set membership is the belonging of an object to a set. It is denoted with the symbol ∈, which can be read "is an element of," for element x and set X. For instance, we might say 7 ∈ {1, 7, 2} or 4 ∉ {1, 7, 2}. Or we might declare that x is a real number by stating x ∈ R.

Set operations can be used to construct new sets from established sets. We consider a few common set operations now.

The union ∪ of sets is the set containing all the elements of the original sets (no repetition allowed). The union of sets A and B is denoted A ∪ B. For instance, let A = {1, 2, 3} and B = {−1, 3}; then

A ∪ B = {−1, 1, 2, 3}.


The intersection ∩ of sets is a set containing the elements common to all the original sets. The intersection of sets A and B is denoted A ∩ B. For instance, let A = {1, 2, 3} and B = {2, 3, 4}; then

A ∩ B = {2, 3}.

If two sets have no elements in common, the intersection is the empty set ∅ = {}, the unique set with no elements.

The set difference of two sets A and B is the set of elements in A that aren't also in B. It is denoted A \ B. For instance, let A = {1, 2, 3} and B = {2, 3, 4}; then

A \ B = {1}.

A subset ⊆ of a set is a set the elements of which are contained in the original set. If the two sets are equal, one is still considered a subset of the other. We call a subset that is not equal to the other set a proper subset ⊂.


For instance, let A = {1, 2, 3} and B = {1, 2}; then B ⊂ A.

The complement of a subset is the set of elements of the original set that aren't in the subset. For instance, if B ⊆ A, then the complement of B, denoted B̄, is

B̄ = A \ B.

The cartesian product of two sets A and B is denoted A × B and is the set of all ordered pairs (a, b), where a ∈ A and b ∈ B. It's worthwhile considering the following notation for this definition:

A × B = {(a, b) : a ∈ A, b ∈ B},

which means "the cartesian product of A and B is the set of ordered pairs (a, b) such that a ∈ A and b ∈ B" in set-builder notation (Wikipedia, Set-builder notation — Wikipedia, The Free Encyclopedia, http://en.wikipedia.org/w/index.php?title=Set-builder%20notation&oldid=917328223 [Online; accessed 29-October-2019], 2019).
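These operations map directly onto Python's built-in set type, which makes a convenient sandbox for checking small examples (a sketch; the particular sets here are illustrative, not from the text):

```python
from itertools import product  # for the cartesian product

A = {1, 2, 3}
B = {2, 3, 4}

print(A | B)              # union A ∪ B: {1, 2, 3, 4}
print(A & B)              # intersection A ∩ B: {2, 3}
print(A - B)              # set difference A \ B: {1}
print({1, 2} < A)         # proper subset test {1, 2} ⊂ A: True
AxB = set(product(A, B))  # cartesian product A × B
print(len(AxB))           # 3 · 3 = 9 ordered pairs
```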

Let A and B be sets. A map or function f from A to B is an assignment of an element of B to each element a ∈ A. The function is denoted f : A → B, and we say that f maps each element a ∈ A to an element f(a) ∈ B, called the value of a under f, or a ↦ f(a). We say that f has domain A and codomain B. The image of f is the subset of its codomain B that contains the values of all elements mapped by f from its domain A.
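The image can be computed by collecting f(a) for every a in the domain; a short Python sketch (the domain and the squaring map here are illustrative assumptions, not from the text):

```python
A = {-2, -1, 0, 1, 2}      # domain
f = lambda a: a * a        # the map a ↦ a², codomain taken to be the reals
image = {f(a) for a in A}  # the image: all values f(a) for a ∈ A
print(image)               # {0, 1, 4}, a subset of the codomain
```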


setslogic Logical connectives and quantifiers

In order to make compound propositions, we need to define logical connectives. In order to specify quantities of variables, we need to define logical quantifiers. The following is a form of first-order logic (Wikipedia, First-order logic — Wikipedia, The Free Encyclopedia).

Logical connectives

A proposition can be either true ⊤ or false ⊥. When it does not contain a logical connective, it is called an atomic proposition. To combine propositions into a compound proposition, we require logical connectives. They are not (¬), and (∧), and or (∨). Table 02.1 is a truth table for a number of connectives.

Quantifiers

Logical quantifiers allow us to indicate the quantity of a variable. The universal quantifier ∀ means "for all." For instance, let A be a set; then ∀a ∈ A means "for all elements in A" and gives this quantity to variable a. The existential quantifier ∃ means "there exists at least one" or "for some." For instance, let A be a set; then ∃a ∈ A

Table 02.1: a truth table for logical connectives. The first two columns are the truth values of propositions p and q; the rest are outputs.

          not   and    or    nand   nor    xor   xnor
p   q  |  ¬p   p∧q   p∨q   p↑q   p↓q   p⊻q   p⇔q
⊥   ⊥  |  ⊤     ⊥     ⊥     ⊤     ⊤     ⊥     ⊤
⊥   ⊤  |  ⊤     ⊥     ⊤     ⊤     ⊥     ⊤     ⊥
⊤   ⊥  |  ⊥     ⊥     ⊤     ⊤     ⊥     ⊤     ⊥
⊤   ⊤  |  ⊥     ⊤     ⊤     ⊥     ⊥     ⊥     ⊤

means "there exists at least one element a in A ..."
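Both the connectives of Table 02.1 and the quantifiers can be emulated over finite sets with Python's boolean operators and its all/any built-ins (a sketch; ⊤ and ⊥ correspond to True and False, and the set A is illustrative):

```python
from itertools import product

# truth table rows for the connectives of Table 02.1
for p, q in product([False, True], repeat=2):
    print(p, q,
          not p,          # ¬p
          p and q,        # p ∧ q
          p or q,         # p ∨ q
          not (p and q),  # p ↑ q (nand)
          not (p or q),   # p ↓ q (nor)
          p != q,         # p ⊻ q (xor)
          p == q)         # p ⇔ q (xnor)

# quantifiers over a finite set A
A = {2, 4, 6, 8}
print(all(a % 2 == 0 for a in A))  # ∀a ∈ A, a is even: True
print(any(a > 7 for a in A))       # ∃a ∈ A, a > 7: True
```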

setsex Exercises for Chapter 02

Exercise 02.1 (hardhat)

For the following, write the set described in set-builder notation:
a. A = {2, 3, 5, 9, 17, 33, · · · }
b. B is the set of integers divisible by 11
c. C = {13, 14, 15, · · · }
d. D is the set of reals between −3 and 42

Exercise 02.2 (2)

Let x, y ∈ Rⁿ. Prove the Cauchy-Schwarz Inequality

|x · y| ≤ ‖x‖ ‖y‖. (ex.1)

Hint: you may find the geometric definition of the dot product helpful.

Exercise 02.3 (3)

Let x ∈ Rⁿ. Prove that

x · x = ‖x‖². (ex.2)

Hint: you may find the geometric definition of the dot product helpful.

Exercise 02.4 (4)

Let x, y ∈ Rⁿ. Prove the Triangle Inequality

‖x + y‖ ≤ ‖x‖ + ‖y‖. (ex.3)

Hint: you may find the Cauchy-Schwarz Inequality helpful.

Probability

1. For a good introduction to probability theory, see Ash (Robert B. Ash, Basic Probability Theory, Dover Publications, Inc., 2008) or Jaynes et al. (E.T. Jaynes et al., Probability Theory: The Logic of Science, Cambridge University Press, 2003, ISBN 9780521592710; an excellent and comprehensive introduction to probability theory).

probmeas Probability and measurement

Probability theory is a well-defined branch of mathematics. Andrey Kolmogorov described a set of axioms in 1933 that are still in use today as the foundation of probability theory.1

We will implicitly use these axioms in our analysis. The interpretation of probability is a contentious matter. Some believe probability quantifies the frequency of the occurrence of some event that is repeated in a large number of trials. Others believe it quantifies the state of our knowledge or belief that some event will occur.

In experiments, our measurements are tightly coupled to probability. This is apparent in the questions we ask. Here are some examples:

1. How common is a given event?
2. What is the probability we will reject a good theory based on experimental results?
3. How repeatable are the results?
4. How confident are we in the results?
5. What is the character of the fluctuations and drift in the data?
6. How much data do we need?


probprob Basic probability theory

The mathematical model for a class of measurements is called the probability space and is composed of a mathematical triple of a sample space Ω, σ-algebra F, and probability measure P, typically denoted (Ω, F, P), each of which we will consider in turn (Wikipedia, Probability space — Wikipedia, The Free Encyclopedia, http://en.wikipedia.org/w/index.php?title=Probability%20space&oldid=914939789 [Online; accessed 31-October-2019], 2019).

The sample space Ω of an experiment is the set representing all possible outcomes of the experiment. If a coin is flipped, the sample space is Ω = {H, T}, where H is heads and T is tails. If a coin is flipped twice, the sample space could be

Ω = {HH, HT, TH, TT}.

However, the same experiment can have different sample spaces. For instance, for two coin flips, we could also choose


We base our choice of Ω on the problem at hand.

An event is a subset of the sample space. That is, an event corresponds to a yes-or-no question about the experiment. For instance, event A (remember A ⊆ Ω) in the coin-flipping experiment (two flips) might be A = {HT, TH}. A is an event that corresponds to the question "Is the second flip different than the first?" A is the event for which the answer is "yes."

Algebra of events

Because events are sets, we can perform the usual set operations with them.

Example probprob-1 re set operations with events
Problem Statement
Consider a toss of a single die. We choose the sample space to be Ω = {1, 2, 3, 4, 5, 6}. Let the following define events:

A ≡ the result is even = {2, 4, 6}
B ≡ the result is greater than 2 = {3, 4, 5, 6}.

Find the following event combinations: A ∪ B, A ∩ B, A \ B, B \ A, Ā, B̄.
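As a numerical check for event-algebra manipulations like these, Python sets can represent the events directly (a sketch using this example's Ω, A, and B; complements are taken relative to Ω):

```python
Omega = {1, 2, 3, 4, 5, 6}   # sample space for one die
A = {2, 4, 6}                # the result is even
B = {3, 4, 5, 6}             # the result is greater than 2

print(A | B)                 # union A ∪ B
print(A & B)                 # intersection A ∩ B
print(A - B, B - A)          # differences A \ B and B \ A
print(Omega - A, Omega - B)  # complements Ā and B̄
```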


The σ-algebra F is the collection of events of interest. Often F is the set of all possible events given a sample space Ω, which is just the power set of Ω (Wikipedia, Probability space — Wikipedia, The Free Encyclopedia). When referring to an event, we often state that it is an element of F. For instance, we might say an event A ∈ F.

We're finally ready to assign probabilities to events. We define the probability measure P : F → [0, 1] to be a function satisfying the following conditions:

1. For every event A ∈ F, the probability measure of A is greater than or equal to zero, i.e. P(A) ≥ 0.
2. If an event is the entire sample space, its probability measure is unity, i.e. if A = Ω, P(A) = 1.
3. If events A1, A2, · · · are disjoint sets (no elements in common), then P(A1 ∪ A2 ∪ · · · ) = P(A1) + P(A2) + · · · .

We conclude the basics by observing four facts that can be proven from the definitions above:

1.
2.
3.
4.


probcondition Independence and conditional probability

Two events A and B are independent if and only if

P(A ∩ B) = P(A) P(B).

If an experimenter must make a judgment without data about the independence of events, they base it on their knowledge of the events, as discussed in the following example.

Example probcondition-1 re independence
Problem Statement
Answer the following questions and imperatives:

1. Consider a single fair die rolled twice. What is the probability that both rolls are 6?
2. What changes if the die is biased by a weight such that P(6) = 1/7?
3. What changes if the die is biased by a magnet, rolled on a magnetic dice-rolling tray, such that P(6) = 1/7?
4. What changes if there are two dice, biased by weights such that for each P(6) = 1/7, rolled once, both resulting in 6?
5. What changes if there are two dice, biased by magnets, rolled together?


Conditional probability

If events A and B are somehow dependent, we need a way to compute the probability of B occurring given that A occurs. This is called the conditional probability of B given A and is denoted P(B | A). For P(A) > 0, it is defined as

P(B | A) = P(A ∩ B) / P(A). (condition.1)

We can interpret this as a restriction of the sample space Ω to A, i.e. the new sample space Ω′ = A ⊆ Ω. Note that if A and B are independent, we obtain the obvious result P(B | A) = P(B).
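The definition (condition.1) can be checked by enumerating a small, equally likely sample space; this sketch uses two fair coin flips (an illustrative choice, not from the text):

```python
from fractions import Fraction
from itertools import product

Omega = set(product("HT", repeat=2))        # two coin flips, equally likely
P = lambda E: Fraction(len(E), len(Omega))  # probability of an event E ⊆ Ω

A = {w for w in Omega if w[0] == "H"}       # first flip is heads
B = {w for w in Omega if w[0] != w[1]}      # the flips differ
print(P(A & B) / P(A))                      # P(B | A) = 1/2 = P(B): A and B are independent
```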

Example probcondition-2 re dependence
Problem Statement
Consider two unbiased dice rolled once. Let events A ≡ {sum of faces = 8} and B ≡ {faces are equal}. What is the probability the faces are equal, given that their sum is 8?


probbay Bayes' theorem

Given two events A and B, Bayes' theorem (aka Bayes' rule) states that

P(A | B) = P(B | A) P(A) / P(B). (bay.1)

Sometimes this is written

P(A | B) = P(B | A) P(A) / (P(B | A) P(A) + P(B | ¬A) P(¬A)) (bay.2)
         = 1 / (1 + (P(B | ¬A) / P(B | A)) · (P(¬A) / P(A))). (bay.3)

This is a useful theorem for determining a test's effectiveness. If a test is performed to determine whether an event has occurred, we might ask questions like "if the test indicates that the event has occurred, what is the probability it has actually occurred?" Bayes' theorem can help compute an answer.

Testing outcomes

The test can be either positive or negative, meaning it can either indicate or not indicate that A has occurred. Furthermore, this result can be either true or false. There are four options, then. Consider an event A and an event that is a test result B indicating that event A has occurred. Table 03.1 shows these four possible test outcomes.

Table 03.1: test outcome B for event A.

               A       ¬A
positive (B)   true    false
negative (¬B)  false   true

The event A occurring can lead to a true positive or a


false negative, whereas ¬A can lead to a true negative or a false positive.

Terminology is important here:

• P(true positive) = P(B | A), aka sensitivity or detection rate
• P(true negative) = P(¬B | ¬A), aka specificity
• P(false positive) = P(B | ¬A)
• P(false negative) = P(¬B | A)

Clearly, the desirable result for any test is that it is true. However, no test is true 100 percent of the time. So sometimes it is desirable to err on the side of the false positive, as in the case of a medical diagnostic. Other times it is more desirable to err on the side of a false negative, as in the case of testing for defects in manufactured balloons (when a false negative isn't a big deal).

Posterior probabilities

Returning to Bayes' theorem, we can evaluate the posterior probability P(A | B) of the event A having occurred, given that the test B is positive, from information that includes the prior probability P(A) of A. The form in Equation (bay.2) or (bay.3) is typically useful because it uses the commonly known test probabilities of the true positive, P(B | A), and of the false positive, P(B | ¬A). We calculate P(A | B) when we want to interpret test results.

Figure 03.1: for different high sensitivities P(B | A) ∈ {0.9000, 0.9900, 0.9990, 0.9999}, the probability P(A | B) that an event A occurred given that a test B for it is positive, versus the probability P(A) that the event A occurs, under the assumption that the specificity equals the sensitivity.

Some interesting results can be found from this. For instance, if we let P(B | A) = P(¬B | ¬A) (sensitivity equal to specificity) and realize that P(B | ¬A) + P(¬B | ¬A) = 1 (given ¬A, either B or ¬B occurs), we can derive the expression

P(B | ¬A) = 1 − P(B | A). (bay.4)

Using this and P(¬A) = 1 − P(A) in Equation (bay.3) gives (recall we've assumed sensitivity equals specificity)

P(A | B) = 1 / (1 + ((1 − P(B | A)) / P(B | A)) · ((1 − P(A)) / P(A))) (bay.5)
         = 1 / (1 + (1/P(B | A) − 1)(1/P(A) − 1)). (bay.6)

This expression is plotted in Figure 03.1. See that a positive result for a rare event (small P(A)) is hard to trust unless the sensitivity P(B | A) and specificity P(¬B | ¬A) are very high, indeed.
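A numeric sketch of Equation (bay.6) (assuming, as above, that sensitivity equals specificity): even a 99%-sensitive test for an event with prior probability 1% yields a posterior of only about 0.5.

```python
# posterior P(A|B) from Equation (bay.6), with sensitivity = specificity = s
def posterior(prior, s):
    return 1.0 / (1.0 + (1.0 / s - 1.0) * (1.0 / prior - 1.0))

print(posterior(0.01, 0.99))  # ≈ 0.5: a positive test for a rare event is hard to trust
print(posterior(0.5, 0.99))   # ≈ 0.99: for a common event, the test is convincing
```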

Example probbay-1 re Bayes' theorem
Problem Statement
Suppose 0.1% of springs manufactured at a given plant are defective. Suppose you need to design a test such that, when it indicates a defective part, the part is actually defective 99% of the time. What sensitivity should your test have, assuming it can be made equal to its specificity?

Problem Solution
The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: bayes_theorem_example_01.ipynb
notebook kernel: python3

    from sympy import *                     # for symbolics
    import numpy as np                      # for numerics
    import matplotlib.pyplot as plt         # for plots
    from IPython.display import display, Markdown, Latex

Define symbolic variables:

    var('p_A, p_nA, p_B, p_nB, p_B_A, p_B_nA, p_A_B', real=True)

    (p_A, p_nA, p_B, p_nB, p_B_A, p_B_nA, p_A_B)

Beginning with Bayes' theorem and assuming the sensitivity and specificity are equal by Equation (bay.4), we can derive the following expression for the posterior probability P(A | B):

    p_A_B_e1 = Eq(p_A_B, p_B_A*p_A/p_B).subs({
        p_B: p_B_A*p_A + p_B_nA*p_nA,  # conditional probability
        p_B_nA: 1 - p_B_A,             # Eq. (bay.4)
        p_nA: 1 - p_A,
    })
    display(p_A_B_e1)

    p_A|B = p_A p_B|A / (p_A p_B|A + (1 − p_A)(1 − p_B|A))

Solve this for P(B | A), the quantity we seek:

    p_B_A_sol = solve(p_A_B_e1, p_B_A, dict=True)
    p_B_A_eq1 = Eq(p_B_A, p_B_A_sol[0][p_B_A])
    display(p_B_A_eq1)

    p_B|A = p_A|B (1 − p_A) / (−2 p_A p_A|B + p_A + p_A|B)

Now let's substitute the given probabilities:

    p_B_A_spec = p_B_A_eq1.subs({
        p_A: 0.001,
        p_A_B: 0.99,
    })
    display(p_B_A_spec)

    p_B|A = 0.999989888981011

That's a tall order.


probrando Random variables

Probabilities are useful even when they do not deal strictly with events. It often occurs that we measure something that has randomness associated with it. We use random variables to represent these measurements.

A random variable X : Ω → R is a function that maps an outcome ω from the sample space Ω to a real number x ∈ R, as shown in Figure 03.2. A random variable will be denoted with a capital letter (e.g. X and K), and a specific value that it maps to (the value) will be denoted with a lowercase letter (e.g. x and k).

A discrete random variable K is one that takes on discrete values. A continuous random variable X is one that takes on continuous values.
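Since a random variable is just a function on Ω, it can be sketched directly in Python (two fair dice and their sum, an illustrative choice):

```python
from itertools import product

Omega = set(product(range(1, 7), repeat=2))  # outcomes ω of rolling two dice
X = lambda w: w[0] + w[1]                    # random variable: the sum of the faces
values = {X(w) for w in Omega}               # the values X can take
print(sorted(values))                        # [2, 3, ..., 12]
```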

Figure 03.2: a random variable X maps an outcome ω ∈ Ω to an x ∈ R.

Example probrando-1 re dice again
Problem Statement
Roll two unbiased dice. Let K be a random variable representing the sum of the two. Let P(k) be the probability of the result k. Plot and interpret P(k).

Example probrando-2 re Johnson-Nyquist noise
Problem Statement
A resistor at nonzero temperature, without any applied voltage, exhibits an interesting phenomenon: its voltage randomly fluctuates. This is called Johnson-Nyquist noise and is a result of thermal excitation of charge carriers (electrons, typically). For a given resistor and measurement system, let the probability density function fV of the voltage V across an unrealistically hot resistor be

f_V(V) = (1/√π) e^(−V²).

Plot and interpret the meaning of this function.

Figure 03.3: plot of a probability mass function.

probpxf Probability density and mass functions

Consider an experiment that measures a random variable. We can plot the relative frequency of the measurand landing in different "bins" (ranges of values). This is called a frequency distribution or a probability mass function (PMF).

Consider, for instance, a probability mass function as plotted in Figure 03.3, where a frequency a_i can be interpreted as an estimate of the probability of the measurand being in the ith interval. The sum of the frequencies must be unity:

Σ_{i=1}^{k} a_i = 1,

with k being the number of bins.

The frequency density distribution is similar to the frequency distribution, but with a_i ↦ a_i/Δx, where Δx is the bin width.

If we let the bin width approach zero, we derive the probability density function (PDF)

f(x) = lim_{k→∞, Δx→0} a_j/Δx, (pxf.1)

where the jth of the k bins contains x.

Figure 03.4: plot of a probability density function.

We typically think of a probability density function f, like the one in Figure 03.4, as a function that can be integrated over to find the probability of the random variable (measurand) being in an interval [a, b]:

P(x ∈ [a, b]) = ∫_a^b f(χ) dχ. (pxf.2)

Of course,

∫_{−∞}^{∞} f(χ) dχ = 1.

We now consider a common PMF and a common PDF.

Binomial PMF

Consider a random binary sequence of length n such that each element is a random 0 or 1, generated independently, like

(1, 0, 1, 1, 0, · · · , 1, 1). (pxf.3)

Let events 1 and 0 be mutually exclusive and exhaustive, and let P(1) = p. The probability of the specific sequence above occurring is p^k (1 − p)^(n−k), where k is the number of ones.


There are "n choose k,"

C(n, k) = n! / (k! (n − k)!), (pxf.4)

possible combinations of k ones for n bits. Therefore, the probability of any combination of k ones in a series is

f(k) = C(n, k) p^k (1 − p)^(n−k). (pxf.5)

We call Equation (pxf.5) the binomial distribution PMF.

Example probpxf-1 re binomial PMF
Problem Statement
Consider a field sensor that fails for a given measurement with probability p. Given n measurements, plot the binomial PMF as a function of k failed measurements for a few different probabilities of failure, p ∈ {0.04, 0.25, 0.5, 0.75, 0.96}.

Problem Solution
Figure 03.6 shows Matlab code for constructing the PMFs plotted in Figure 03.5. Note that the symmetry is due to the fact that events 1 and 0 are mutually exclusive and exhaustive.


Figure 03.5: binomial PMF for n = 100 measurements and different values of P(1) = p, the probability of a measurement error. The plot is generated by the Matlab code of Figure 03.6.

    % parameters
    n = 100;
    k_a = linspace(1,n,n);
    p_a = [.04,.25,.5,.75,.96];

    % binomial function
    f = @(n,k,p) nchoosek(n,k)*p^k*(1-p)^(n-k);

    % loop through to construct an array
    f_a = NaN*ones(length(k_a),length(p_a));
    for i = 1:length(k_a)
        for j = 1:length(p_a)
            f_a(i,j) = f(n,k_a(i),p_a(j));
        end
    end

    % plot
    figure
    colors = jet(length(p_a));
    for j = 1:length(p_a)
        bar(k_a,f_a(:,j), ...
            'facecolor',colors(j,:),'facealpha',0.5, ...
            'displayname',['$p = ',num2str(p_a(j)),'$'] ...
        ); hold on
    end
    leg = legend('show','location','north');
    set(leg,'interpreter','latex');
    hold off
    xlabel('number of ones in sequence k')
    ylabel('probability')
    xlim([0,100])

Figure 03.6: a Matlab script for generating binomial PMFs.


The Gaussian or normal random variable x has

PDF

f(x) =1

σradic2π

expndash(x ndash micro)2

2σ2 (pxf6)

Although we're not quite ready to understand these quantities in detail, it can be shown that the parameters µ and σ have the following meanings:

• µ is the mean of x,
• σ is the standard deviation of x, and
• σ² is the variance of x.

Consider the "bell-shaped" Gaussian PDF in Figure 03.7. It is always symmetric. The mean µ is its central value, and the standard deviation σ is directly related to its width. We will continue to explore the Gaussian distribution in the following lectures, especially in Lecture statsconfidence.
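As a numerical sanity check (a sketch, not part of the text), the PDF of Equation (pxf.6) integrates to unity; a crude Riemann sum over [−8, 8] confirms this for µ = 0, σ = 1:

```python
import math

def gaussian_pdf(x, mu=0.0, sigma=1.0):
    # Equation (pxf.6)
    return math.exp(-((x - mu) ** 2) / (2 * sigma ** 2)) / (sigma * math.sqrt(2 * math.pi))

dx = 0.001  # bin width for the Riemann sum
total = sum(gaussian_pdf(-8 + i * dx) * dx for i in range(int(round(16 / dx))))
print(total)  # ≈ 1
```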

Figure 03.7: PDF for Gaussian random variable x with mean µ = 0 and standard deviation σ = 1/√2.


probE Expectation

Recall that a random variable is a function

X Ω rarr R that maps from the sample

space to the reals Random variables are the

arguments of probability mass functions (PMFs)

and probability density functions (PDFs)

The expected value (or expectation) of a random variable is akin to its "average value" and depends on its PMF or PDF. The expected value of a random variable X is denoted ⟨X⟩ or E[X]. There are two definitions of the expectation: one for a discrete random variable, the other for a continuous random variable. Before we define them, however, it is useful to predefine the most fundamental property of a random variable: its mean.

Definition probE1 mean

The mean of a random variable X is defined

as

mX = E[X]

Let's begin with a discrete random variable.

Definition probE2: expectation of a discrete random variable

Let K be a discrete random variable and f its PMF. The expected value of K is defined as

E[K] = ∑_{∀k} k f(k).
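The sum in this definition is easy to sketch in code (Python here rather than the chapter's Matlab; the fair-die PMF is an assumed illustration, not from the text):

```python
def expect_discrete(pmf):
    """E[K] = sum over all k of k*f(k), for a PMF given as {k: f(k)}."""
    return sum(k * p for k, p in pmf.items())

# Assumed example: a fair six-sided die, f(k) = 1/6 for k = 1..6
die = {k: 1/6 for k in range(1, 7)}
print(round(expect_discrete(die), 10))  # 3.5
```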


Example probE-1 re expectation of a discrete random variable

Problem Statement

Given a discrete random variable K with PMF shown below, what is its mean mK?

Let us now turn to the expectation of a continuous random variable.

Definition probE3: expectation of a continuous random variable

Let X be a continuous random variable and f its PDF. The expected value of X is defined as

E[X] = ∫_{−∞}^{∞} x f(x) dx.

Example probE-2 re expectation of a continuous random variable

Problem Statement

Given a continuous random variable X with Gaussian PDF f, what is the expected value of X?
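The integral can be spot-checked numerically (a Python sketch with a midpoint Riemann sum; the parameters µ = 5, σ = 2.34 are assumed for illustration). For a Gaussian PDF the result is the mean µ:

```python
import math

def gauss_pdf(x, mu, sigma):
    # Gaussian PDF from the previous lecture
    return math.exp(-(x - mu)**2 / (2 * sigma**2)) / (sigma * math.sqrt(2 * math.pi))

def expect_continuous(f, lo=-50.0, hi=50.0, n=200_000):
    """Approximate E[X] = integral of x f(x) dx with a midpoint Riemann sum."""
    dx = (hi - lo) / n
    total = 0.0
    for i in range(n):
        x = lo + (i + 0.5) * dx
        total += x * f(x)
    return total * dx

mu, sigma = 5.0, 2.34  # assumed example parameters
approx = expect_continuous(lambda x: gauss_pdf(x, mu, sigma))
print(round(approx, 4))  # 5.0
```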



Due to its sum or integral form, the expected value E[·] has some familiar properties, for random variables X and Y and reals a and b:

E[a] = a   (E.1a)
E[X + a] = E[X] + a   (E.1b)
E[aX] = a E[X]   (E.1c)
E[E[X]] = E[X]   (E.1d)
E[aX + bY] = a E[X] + b E[Y]   (E.1e)
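Because these properties follow from the linearity of sums and integrals, they can be spot-checked with an empirical average standing in for E[·] (a Python sketch; the toy data are assumed):

```python
import random

random.seed(0)
# Assumed toy data: the empirical average E plays the role of E[.]
xs = [random.uniform(0, 1) for _ in range(10)]
ys = [random.uniform(0, 1) for _ in range(10)]
E = lambda vals: sum(vals) / len(vals)

a, b = 3.0, -2.0
lhs = E([a * x + b * y for x, y in zip(xs, ys)])  # E[aX + bY]
rhs = a * E(xs) + b * E(ys)                       # a E[X] + b E[Y]
print(abs(lhs - rhs) < 1e-12)  # True
```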


central moments

variance

probmoments Central moments

Given a probability mass function (PMF) or probability density function (PDF) of a random variable, several useful parameters of the random variable can be computed. These are called central moments, which quantify aspects of the distribution relative to its mean.

Definition probmoments1: central moments

The nth central moment of random variable X with PDF f is defined as

E[(X − µX)ⁿ] = ∫_{−∞}^{∞} (x − µX)ⁿ f(x) dx.

For discrete random variable K with PMF f,

E[(K − µK)ⁿ] = ∑_{∀k} (k − µK)ⁿ f(k).

Example probmoments-1 re first central moment

Problem Statement

Prove that the first central moment of continuous random variable X is zero.

standard deviation
skewness

The second central moment of random variable X is called the variance and is denoted

σ²X,  Var[X],  or  E[(X − µX)²].   (moments.1)

The variance is a measure of the width, or spread, of the PMF or PDF. We usually compute the variance with the formula Var[X] = E[X²] − µ²X.

Other properties of the variance include, for real constant c,

Var[c] = 0   (moments.2)
Var[X + c] = Var[X]   (moments.3)
Var[cX] = c² Var[X]   (moments.4)

The standard deviation is defined as σX = √(σ²X).

Although the variance is mathematically more convenient, the standard deviation has the same physical units as X, so it is often the more physically meaningful quantity. Due to its meaning as the width or spread of the probability distribution, and its sharing of physical units, it is a convenient choice for error bars on plots of a random variable.

asymmetry
kurtosis
tailedness

The skewness Skew[X] is a normalized third central moment:

Skew[X] = E[(X − µX)³] / σ³X.   (moments.5)

Skewness is a measure of the asymmetry of a random variable's PDF or PMF. For a symmetric PMF or PDF, such as the Gaussian PDF, Skew[X] = 0.

The kurtosis Kurt[X] is a normalized fourth central moment:

Kurt[X] = E[(X − µX)⁴] / σ⁴X.   (moments.6)

Kurtosis is a measure of the tailedness of a random variable's PDF or PMF. "Heavier" tails yield higher kurtosis.

A Gaussian random variable has PDF with kurtosis 3. Given that, for Gaussians, both skewness and kurtosis have nice values (0 and 3), we can think of skewness and kurtosis as measures of similarity to the Gaussian PDF.
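The definitions (moments.5) and (moments.6) can be sketched directly for sample data (Python, with equal-weight central moments; the symmetric toy sample is assumed, and its zero third moment makes the skewness vanish):

```python
def central_moment(xs, n):
    """n-th central moment of a sample: average of (x - mu)^n."""
    mu = sum(xs) / len(xs)
    return sum((x - mu)**n for x in xs) / len(xs)

def skew(xs):
    # normalized third central moment, (moments.5)
    return central_moment(xs, 3) / central_moment(xs, 2)**1.5

def kurt(xs):
    # normalized fourth central moment, (moments.6)
    return central_moment(xs, 4) / central_moment(xs, 2)**2

symmetric = [-2, -1, 0, 1, 2]  # symmetric about 0, so skewness is zero
print(skew(symmetric))  # 0.0
print(kurt(symmetric))  # 1.7
```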


probex Exercises for Chapter 03

Exercise 03.1 (5)

Several physical processes can be modeled with a random walk: a process of iteratively changing a quantity by some random amount. Infinitely many variations are possible, but common factors of variation include probability distribution, step size, dimensionality (e.g. one-dimensional, two-dimensional, etc.), and coordinate system. Graphical representations of these walks can be beautiful. Develop a computer program that generates random walks and corresponding graphics. Do it well and call it art, because it is.
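A minimal starting point for the walk itself might look like the following (a Python sketch; all names are illustrative, and the graphics, and the art, are left to the reader):

```python
import random

def random_walk_1d(steps, step_size=1.0, seed=None):
    """One-dimensional random walk: each step moves +/- step_size
    with equal probability; returns the full path of positions."""
    rng = random.Random(seed)
    position, path = 0.0, [0.0]
    for _ in range(steps):
        position += rng.choice((-step_size, step_size))
        path.append(position)
    return path

walk = random_walk_1d(1000, seed=42)
print(len(walk))  # 1001
```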

estimation

stats Statistics

Whereas probability theory is primarily focused on the relations among mathematical objects, statistics is concerned with making sense of the outcomes of observation. (Steven S. Skiena, Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win. Outlooks. Cambridge University Press, 2001. DOI: 10.1017/CBO9780511547089. This includes a lucid section on probability versus statistics, also available here: https://www3.cs.stonybrook.edu/~skiena/jaialai/excerpts/node12.html)

However, we frequently use statistical methods to estimate probabilistic models. For instance, we will learn how to estimate the standard deviation of a random process we have some reason to expect has a Gaussian probability distribution.

Statistics has applications in nearly every applied science and engineering discipline. Any time measurements are made, statistical analysis is how one makes sense of the results.

For instance determining a reasonable level

of confidence in a measured parameter

stats Statistics terms p 3

machine learning

1 Ash Basic Probability Theory

2 Jaynes et al Probability Theory The Logic of Science

3 Erwin Kreyszig Advanced Engineering Mathematics 10th John Wiley amp

Sons Limited 2011 ISBN 9781119571094 The authoritative resource for

engineering mathematics It includes detailed accounts of probability

statistics vector calculus linear algebra fourier analysis ordinary

and partial differential equations and complex analysis It also

includes several other topics with varying degrees of depth Overall

it is the best place to start when seeking mathematical guidance

requires statistics

A particularly hot topic nowadays is machine learning, which seems to be a field with applications that continue to expand. This field is fundamentally built on statistics.

A good introduction to statistics appears at the end of Ash.¹ A more involved introduction is given by Jaynes et al.² The treatment by Kreyszig³ is rather incomplete, as will be our own.


population

sample

random

machine learning

training

training set

features

predictive model

statsterms Populations, samples, and machine learning

An experiment's population is a complete collection of objects that we would like to study. These objects can be people, machines, processes, or anything else we would like to understand experimentally.

Of course, we typically can't measure all of the population. Instead, we take a subset of the population, called a sample, and infer the characteristics of the entire population from this sample.

However, this inference, that the sample is somehow representative of the population, assumes the sample size is sufficiently large and that the sampling is random. This means selection of the sample should be such that no group within the population is systematically over- or under-represented in the sample.

Machine learning is a field that makes extensive use of measurements and statistical inference. In it, an algorithm is trained by exposure to sample data, which is called a training set. The variables measured are called features. Typically, a predictive model is developed that can be used to extrapolate from the data to a new situation. The methods of statistical analysis we introduce in this chapter are the foundation of most machine learning methods.


Example statsterms-1 re combat boots

Problem Statement

Consider a robot, Pierre, with a particular gravitas and sense of style. He seeks just the right-looking pair of combat boots for wearing in the autumn rains. Pierre is to purchase the boots online via image recognition and decides to gather data by visiting a hipster hangout one evening to train his style. For contrast, he also watches footage of a White Nationalist rally, focusing special attention on the boots of wearers of khakis and polos. Comment on Pierre's methods.


sample mean

statssample Estimation of sample mean and variance

Estimation and sample statistics

The mean and variance definitions of Lecture probE and Lecture probmoments apply only to a random variable for which we have a theoretical probability distribution. Typically, it is not until after having performed many measurements of a random variable that we can assign a good distribution model. Until then, measurements can help us estimate aspects of the data. We usually start by estimating basic parameters such as mean and variance before estimating a probability distribution.

There are two key aspects to randomness in the measurement of a random variable. First, of course, there is the underlying randomness, with its probability distribution, mean, standard deviation, etc., which we call the population statistics. Second, there is the statistical variability that is due to the fact that we are estimating the random variable's statistics, called its sample statistics, from some sample. Statistical variability is decreased with greater sample size and number of samples, whereas the underlying randomness of the random variable does not decrease. Instead, our estimates of its probability distribution and statistics improve.

Sample mean, variance, and standard deviation

population mean
sample variance
population variance

The arithmetic mean or sample mean of a measurand with sample size N, represented by random variable X, is defined as

x̄ = (1/N) ∑_{i=1}^{N} x_i.   (sample.1)

If the sample size is large, x̄ → mX (the sample mean approaches the mean). The population mean is another name for the mean mX, which is equal to

mX = lim_{N→∞} (1/N) ∑_{i=1}^{N} x_i.   (sample.2)

Recall that the definition of the mean is mX = E[X].

The sample variance of a measurand represented by random variable X is defined as

S²X = (1/(N − 1)) ∑_{i=1}^{N} (x_i − x̄)².   (sample.3)

If the sample size is large, S²X → σ²X (the sample variance approaches the variance). The population variance is another term for the variance σ²X and can be expressed as

σ²X = lim_{N→∞} (1/(N − 1)) ∑_{i=1}^{N} (x_i − x̄)².   (sample.4)

Recall that the definition of the variance is σ²X = E[(X − mX)²].

The sample standard deviation of a measurand represented by random variable X is defined as

SX = √(S²X).   (sample.5)

If the sample size is large, SX → σX (the sample standard deviation approaches the standard

mean of means Xi

mean of standard deviations SXi

standard deviation of means SXi

standard deviation of standard deviations SSXi

deviation). The population standard deviation is another term for the standard deviation σX and can be expressed as

σX = lim_{N→∞} √(S²X).   (sample.6)

Recall that the definition of the standard deviation is σX = √(σ²X).
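The three estimators above translate directly to code (a Python sketch rather than the chapter's Matlab; note the N − 1 divisor of the sample variance, and the toy data are assumed):

```python
import math

def sample_mean(xs):
    # (sample.1)
    return sum(xs) / len(xs)

def sample_var(xs):
    # (sample.3), with the N - 1 divisor
    xbar = sample_mean(xs)
    return sum((x - xbar)**2 for x in xs) / (len(xs) - 1)

def sample_std(xs):
    # (sample.5)
    return math.sqrt(sample_var(xs))

data = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]  # assumed toy sample
print(sample_mean(data), round(sample_var(data), 4))  # 5.0 4.5714
```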

Sample statistics as random variables

There is an ambiguity in our usage of the term "sample": it can mean just one measurement, or it can mean a collection of measurements gathered together. Hopefully it is clear from context.

In the latter sense, often we collect multiple samples, each of which has its own sample mean X̄ᵢ and standard deviation S_Xᵢ. In this situation, X̄ᵢ and S_Xᵢ are themselves random variables (meta af, I know). This means they have their own sample means X̿ and S̄_X and standard deviations S_X̄ and S_{S_X}.

The mean of means X̿ is equivalent to a mean with a larger sample size and is therefore our best estimate of the mean of the underlying random process. The mean of standard deviations S̄_X is our best estimate of the standard deviation of the underlying random process. The standard deviation of means S_X̄ is a measure of the spread in our estimates of the mean. It is our best estimate of the standard deviation of the statistical variation, and should therefore tend to zero as sample size and number of samples increase. The standard deviation of standard deviations S_{S_X} is a measure of the spread in our estimates of the standard deviation of the underlying process. It should also tend to zero as sample size and number of samples increase.

Let N be the size of each sample. It can be shown that the standard deviation of the means S_X̄ can be estimated from a single sample standard deviation:

S_X̄ ≈ S_X/√N.   (sample.7)

This shows that, as the sample size N increases, the statistical variability of the mean decreases (and in the limit approaches zero).
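A Monte Carlo check of (sample.7) (a Python sketch; the sample size, number of samples, and standard-normal population are all assumed): the spread of the sample means should come out near σ/√N.

```python
import math
import random

random.seed(7)
N, M = 100, 2000     # assumed sample size and number of samples
sigma = 1.0          # population standard deviation (standard normal draws)

# M samples, each of size N; record each sample's mean
means = [sum(random.gauss(0.0, sigma) for _ in range(N)) / N
         for _ in range(M)]

mean_of_means = sum(means) / M
s_of_means = math.sqrt(sum((m - mean_of_means)**2 for m in means) / (M - 1))

print(round(sigma / math.sqrt(N), 3))  # predicted: 0.1
print(round(s_of_means, 2))            # observed: close to 0.1
```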

Nonstationary signal statistics

The sample mean, variance, and standard deviation definitions above assume the random process is stationary; that is, its population mean does not vary with time. However, a great many measurement signals have populations that do vary with time, i.e. they are nonstationary. Sometimes the nonstationarity arises from a "drift" in the dc value of a signal or some other slowly changing variable. But dynamic signals can also change in a recognizable and predictable manner, as when, say, the temperature of a room changes when a window is opened, or when a water level changes with the tide.

Typically, we would like to minimize the effect of nonstationarity on the signal statistics. In certain cases, such as drift, the variation is a nuisance only, but other times it is the point of the measurement.


Two common techniques are used, depending on the overall type of nonstationarity. If it is periodic with some known or estimated period, the measurement data series can be "folded" or "reshaped" such that the ith measurement of each period corresponds to the ith measurement of all other periods. In this case, somewhat counterintuitively, we can consider the ith measurements to correspond to a sample of size N, where N is the number of periods over which measurements are made. When the signal is aperiodic, we often simply divide it into "small" (relative to its overall trend) intervals over which statistics are computed separately.

Note that in this discussion we have assumed that the nonstationarity of the signal is due to a variable that is deterministic (not random).
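The "folding" operation is just a reshape; here is a minimal sketch (Python, with an assumed noise-free toy series of three periods of length four), where each resulting sample collects the ith measurement across periods:

```python
def fold(series, period_len):
    """Reshape a 1-D series into one row per period; sample i collects
    the i-th measurement of every period. Length must divide evenly."""
    assert len(series) % period_len == 0
    rows = [series[r * period_len:(r + 1) * period_len]
            for r in range(len(series) // period_len)]
    # transpose: one list per phase index, gathered across periods
    return [list(col) for col in zip(*rows)]

# Assumed toy data: three periods of the pattern 0, 1, 0, -1
series = [0, 1, 0, -1,  0, 1, 0, -1,  0, 1, 0, -1]
samples = fold(series, 4)
print(samples[1])  # [1, 1, 1]
```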

Example statssample-1 re measurement of gaussian noise on a nonstationary signal

Problem Statement

Consider the measurement of the temperature inside a desktop computer chassis via an inexpensive thermistor, a resistor that changes resistance with temperature. The processor and power supply heat the chassis in a manner that depends on processing demand. For the test protocol, the processors are cycled sinusoidally through processing power levels at a frequency of 50 mHz for nT = 12 periods and sampled at 1 Hz. Assume a temperature fluctuation between about 20 and 50 C and gaussian noise with standard deviation 4 C. Consider a sample to be the multiple measurements of a certain instant in the period.

1. Generate and plot simulated temperature data as a time series and as a histogram or frequency distribution. Comment on why the frequency distribution sucks.
2. Compute the sample mean and standard deviation for each sample in the cycle.
3. Subtract the mean from each sample in the period, such that each sample distribution is centered at zero. Plot the composite frequency distribution of all samples together. This represents our best estimate of the frequency distribution of the underlying process.
4. Plot a comparison of the theoretical mean, which is 35, and the sample mean of means, with an error bar. Vary the number of samples nT and comment on its effect on the estimate.
5. Plot a comparison of the theoretical standard deviation and the sample mean of sample standard deviations, with an error bar. Vary the number of samples nT and comment on its effect on the estimate.
6. Plot the sample means over a single period with error bars of ± one sample standard deviation of the means. This represents our best estimate of the sinusoidal heating temperature. Vary the number of samples nT and comment on the estimate.

Problem Solution


clear; close all; % clear kernel

Generate the temperature data

The temperature data can be generated by constructing an array that is passed to a sinusoid, then "randomized" by gaussian random numbers. Note that we add 1 to np and n to avoid the sneaky fencepost error.

f = 50e-3; % Hz, sinusoid frequency
a = 15; % C, amplitude of oscillation
dc = 35; % C, dc offset of oscillation
fs = 1; % Hz, sampling frequency
nT = 12; % number of sinusoid periods
s = 4; % C, standard deviation
np = fs/f+1; % number of samples per period
n = nT*np+1; % total number of samples

t_a = linspace(0,nT/f,n); % time array
sin_a = dc + a*sin(2*pi*f*t_a); % sinusoidal array
rng(43); % seed the random number generator
noise_a = s*randn(size(t_a)); % gaussian noise
signal_a = sin_a + noise_a; % sinusoid + noise

Now that we have an array of "data", we're ready to plot.

h = figure;
p = plot(t_a,signal_a,'o-',...
    'Color',[.8,.8,.8],...
    'MarkerFaceColor','b',...
    'MarkerEdgeColor','none',...
    'MarkerSize',3);
xlabel('time (s)');
ylabel('temperature (C)');
hgsave(h,'figures/temp');


Figure 04.1: temperature over time.

This is something like what we might see for continuous measurement data. Now, the histogram.

h = figure;
histogram(signal_a,...
    30,... % number of bins
    'normalization','probability'); % for PMF
xlabel('temperature (C)');
ylabel('probability');
hgsave(h,'figures/temp');

Figure 04.2: a poor histogram due to nonstationarity of the signal.

stats Statistics sample Estimation of sample mean and variance p 5

This sucks because we plot a frequency distribution to tell us about the random variation, but this data includes the sinusoid.

Sample mean, variance, and standard deviation

To compute the sample mean µ and standard deviation s for each sample in the period, we must "pick out" the nT data points that correspond to each other. Currently, they're in one long 1 × n array, signal_a. It is helpful to reshape the data so it is in an nT × np array, with each row corresponding to a new period. This leaves the correct points aligned in columns. It is important to note that we can do this "folding" operation only when we know rather precisely the period of the underlying sinusoid. It is given in the problem that it is a controlled experiment variable. If we did not know it, we would have to estimate it, too, from the data.

signal_ar = reshape(signal_a(1:end-1),[np,nT])'; % fold into periods
size(signal_ar) % check size
signal_ar(1:3,1:4) % print three rows, four columns

ans =
    12    21

ans =
   30.2718   40.0946   40.8341   44.7662
   40.1836   37.2245   49.4076   46.1137
   40.0571   40.9718   46.1627   41.9145

stats Statistics sample Estimation of sample mean and variance p 6

Define the mean, variance, and standard deviation functions as "anonymous functions". We "roll our own". These are not as efficient or flexible as the built-in Matlab functions mean, var, and std, which should typically be used.

my_mean = @(vec) sum(vec)/length(vec);
my_var = @(vec) sum((vec-my_mean(vec)).^2)/(length(vec)-1);
my_std = @(vec) sqrt(my_var(vec));

Now the sample mean, variance, and standard deviations can be computed. We proceed by looping through each column of the reshaped signal array.

mu_a = NaN*ones(1,np); % initialize mean array
var_a = NaN*ones(1,np); % initialize var array
s_a = NaN*ones(1,np); % initialize std array

for i = 1:np % for each column
    mu_a(i) = my_mean(signal_ar(:,i));
    var_a(i) = my_var(signal_ar(:,i));
    s_a(i) = sqrt(var_a(i)); % touch of speed
end

Composite frequency distribution

The columns represent samples. We want to subtract the mean from each column. We use repmat to reproduce mu_a in nT rows so it can be easily subtracted.

stats Statistics sample Estimation of sample mean and variance p 7

signal_arz = signal_ar - repmat(mu_a,[nT,1]);
size(signal_arz) % check size
signal_arz(1:3,1:4) % print three rows, four columns

ans =
    12    21

ans =
   -5.0881    0.9525   -0.2909   -1.5700
    4.8237   -1.9176    8.2826   -0.2225
    4.6972    1.8297    5.0377   -4.4216

Now that all samples have the same mean, we can lump them into one big bin for the frequency distribution. There are some nice built-in functions to do a quick reshape and fit.

signal_arzr = reshape(signal_arz,[1,nT*np]); % resize
size(signal_arzr) % check size
pdfit_model = fitdist(signal_arzr','normal'); % fit
x_a = linspace(-15,15,100);
pdfit_a = pdf(pdfit_model,x_a);
pdf_a = normpdf(x_a,0,s); % theoretical pdf

ans =

1 252

Plot


h = figure;
histogram(signal_arzr,...
    round(s*sqrt(nT)),... % number of bins
    'normalization','probability'); % for PMF
hold on
plot(x_a,pdfit_a,'b-','linewidth',2); hold on
plot(x_a,pdf_a,'g--','linewidth',2);
legend('pmf','pdf est','pdf');
xlabel('zero-mean temperature (C)');
ylabel('probability mass/density');
hgsave(h,'figures/temp');

Figure 04.3: PMF and estimated and theoretical PDFs.

Means comparison

The sample mean of means is simply the following.

mu_mu = my_mean(mu_a)

mu_mu =
   35.1175


The standard deviation that works as an error bar, which should reflect how well we can estimate the point plotted, is the standard deviation of the means. It is difficult to compute this directly for a nonstationary process. We use the estimate given above and improve upon it by using the mean of standard deviations instead of a single sample's standard deviation.

s_mu = mean(s_a)/sqrt(nT)

s_mu =
    1.1580

Now for the simple plot.

h = figure;
bar(mu_mu); hold on % gives bar
errorbar(mu_mu,s_mu,'r','linewidth',2); % error bar
ax = gca; % current axis
ax.XTickLabels = {'$\overline{\overline{X}}$'};
ax.TickLabelInterpreter = 'latex';
hgsave(h,'figures/temp');

Standard deviations comparison

The sample mean of standard deviations is simply the following.

mu_s = my_mean(s_a)

mu_s =
    4.0114

The standard deviation that works as an error bar, which should reflect how well we can estimate the point plotted, is the standard deviation of the standard deviations. We can compute this directly.

s_s = my_std(s_a)

s_s =
    0.8495

Now for the simple plot.

h = figure;
bar(mu_s); hold on % gives bar
errorbar(mu_s,s_s,'r','linewidth',2); % error bars
ax = gca; % current axis
ax.XTickLabels = {'$\overline{S_X}$'};
ax.TickLabelInterpreter = 'latex';
hgsave(h,'figures/temp');

Figure 04.4: sample mean of sample means.

Figure 04.5: sample standard deviation of sample means.

Plot a period with error bars

Plotting the data with error bars is fairly straightforward with the built-in errorbar function. The main question is: "which standard deviation?" Since we're plotting the means, it makes sense to plot the error bars as a single sample standard deviation of the means.

h = figure;
e1 = errorbar(t_a(1:np),mu_a,s_mu*ones(1,np),'b');
hold on
t_a2 = linspace(0,1/f,101);
e2 = plot(t_a2,dc + a*sin(2*pi*f*t_a2),'r-');
xlim([t_a(1),t_a(np)]);
grid on
xlabel('folded time (s)');
ylabel('temperature (C)');
legend([e1,e2],'sample mean','population mean',...
    'Location','NorthEast');
hgsave(h,'figures/temp');

Figure 04.6: sample means over a period.


confidence

central limit theorem

statsconfidence Confidence

One really ought to have it to give a lecture named it, but we'll give it a try anyway. Confidence is used in the common sense, although we do endow it with a mathematical definition, to scare business majors, who aren't actually impressed but indifferent. Approximately: if, under some reasonable assumptions (probabilistic model), we estimate the probability of some event to be P, we say we have P confidence in it. I mean, business majors are all "Supply and demand? Let's call that a 'law'," so I think we're even.

So we're back to computing probability from distributions: probability density functions (PDFs) and probability mass functions (PMFs). Usually we care most about estimating the mean of our distribution. Recall from the previous lecture that when several samples are taken, each with its own mean, the mean is itself a random variable, with a mean of its own, of course. Meanception.

But the mean has a probability distribution of its own. The central limit theorem has, as one of its implications, that as the sample size N gets large, regardless of the sample distributions, this distribution of means approaches the Gaussian distribution.

But sometimes I always worry I'm being lied to, so let's check.


clear; close all; % clear kernel

Generate some data to test the central limit theorem

Data can be generated by constructing an array using a (seeded, for consistency) random number generator. Let's use a uniformly distributed PDF between 0 and 1.

N = 150; % sample size (number of measurements per sample)
M = 120; % number of samples
n = N*M; % total number of measurements
mu_pop = 0.5; % because it's a uniform PDF between 0 and 1

rng(11); % seed the random number generator
signal_a = rand(N,M); % uniform PDF
size(signal_a) % check the size

ans =

150 120

Let's take a look at the data by plotting the first ten samples (columns), as shown in Figure 04.7.

This is something like what we might see for continuous measurement data. Now, the histogram, shown in Figure 04.8.

samples_to_plot = 10;
h = figure;


Figure 04.7: raw data with colors corresponding to samples.

Figure 04.8: a histogram showing the approximately uniform distribution of each sample (color).

c = jet(samples_to_plot); % color array
for j = 1:samples_to_plot
    histogram(signal_a(:,j),...
        30,... % number of bins
        'facecolor',c(j,:),...
        'facealpha',.3,...
        'normalization','probability'); % for PMF
    hold on
end
hold off
xlim([-.05,1.05]);
xlabel('measurement');
ylabel('probability');
hgsave(h,'figures/temp');

This isn't a great plot, but it shows roughly that each sample is fairly uniformly distributed.


Sample statistics

Now let's check out the sample statistics. We want the sample mean and standard deviation of each column. Let's use the built-in functions mean and std.

mu_a = mean(signal_a,1); % mean of each column
s_a = std(signal_a,0,1); % std of each column

Now we can compute the mean statistics: both the mean of the mean X̿ and the standard deviation of the mean S_X̄, which we don't strictly need for this part, but we're curious. We choose to use the direct estimate instead of the S_X/√N formula, but they should be close.

mu_mu = mean(mu_a)
s_mu = std(mu_a)

mu_mu =
    0.4987

s_mu =
    0.0236

The truth about sample means

It's the moment of truth. Let's look at the distribution shown in Figure 04.9.


Figure 04.9: a histogram showing the approximately normal distribution of the means.

h = figure;
histogram(mu_a,...
    'normalization','probability'); % for PMF
hold off
xlabel('measurement');
ylabel('probability');
hgsave(h,'figures/temp');

This looks like a Gaussian distribution about the mean of means, so I guess the central limit theorem is legit.

Gaussian and probability

We already know how to compute the probability P that a value of a random variable X lies in a certain interval from a PMF or PDF (the sum or the integral, respectively). This means that, for sufficiently large sample size N, such that we can assume from the central limit theorem that the sample means x̄ᵢ are normally distributed, the probability a sample mean value x̄ᵢ is in a certain interval is given by integrating the Gaussian PDF. The Gaussian PDF for random variable Y representing the sample means is

f(y) = 1/(σ√(2π)) · exp(−(y − µ)²/(2σ²))   (confidence.1)


Gaussian cumulative distribution function

where µ is the population mean and σ is the population standard deviation.

The integral of f over some interval is the probability a value will be in that interval. Unfortunately, that integral is uncool. It gives rise to the definition of the error function, which for the Gaussian random variable Y is

erf(y_b) = (1/√π) ∫_{−y_b}^{y_b} e^{−t²} dt.   (confidence.2)

This expresses the probability of a sample mean being in the interval [−y_b, y_b] if Y has mean 0 and variance 1/2.

Matlab has this built-in as erf, shown in Figure 04.10.

y_a = linspace(0,3,100);
h = figure;
p1 = plot(y_a,erf(y_a));
p1.LineWidth = 2;
grid on
xlabel('interval bound $y_b$','interpreter','latex');
ylabel('error function $\textrm{erf}(y_b)$',...
    'interpreter','latex');
hgsave(h,'figures/temp');

We could deal directly with the error function, but most people don't, and we're weird enough as it is. Instead, people use the Gaussian cumulative distribution function

Figure 04.10: the error function.

(CDF) Φ : R → R, which is defined as

Φ(z) = (1/2)(1 + erf(z/√2))   (confidence.3)

and which expresses the probability of a Gaussian random variable Z, with mean 0 and standard deviation 1, taking on a value in the interval (−∞, z]. The Gaussian CDF and PDF are shown in Figure 04.11. Values can be taken directly from the graph, but it's more accurate to use the table of values in Appendix A01.

That's great and all, but occasionally (always) we have Gaussian random variables with nonzero means and nonunity standard deviations. It turns out we can shift any Gaussian random variable by its mean and scale it by its standard deviation to make it have zero mean and unity standard deviation. We can then use Φ and interpret the results as being relative to the mean and standard deviation, using phrases like "the probability it is within two standard deviations of its mean". The transformed random variable Z and its values z are sometimes called the z-score. For a particular value x of a random variable X, we can compute its z-score (or value z of

stats Statistics confidence Confidence p 4

zb

ndash3 ndash2 ndash1 0 1 2 30

01

02

03

04

z

Gaussia

nPDFf(z)

Φ(zb)

f(z)

(a) the Gaussian PDF

ndash3 ndash2 ndash1 0 1 2 30

02

04

06

08

1

interval bound zb

Gaussia

nCDFΦ(zb)

(b) the Gaussian CDF

Figure 0411 the Gaussian PDF and CDF for z-scores

random variable Z) with the formula

z = (x − µX)/σX   (confidence.4)

and compute the probability of X taking on a value within the interval, say, x ∈ [x_b−, x_b+], from the table. (Sample statistics X̄ and S_X are appropriate when population statistics are unknown.)

For instance, compute the probability a Gaussian random variable X with µX = 5 and σX = 2.34 takes on a value within the interval x ∈ [3, 6].

1 Compute the z-score of each endpoint of

the interval

stats Statistics confidence Confidence p 5

z3 = (3 – μX)/σX ≈ –0.85  (confidence5)
z6 = (6 – μX)/σX ≈ 0.43  (confidence6)

2. Look up the CDF values for z3 and z6, which are Φ(z3) = 0.1977 and Φ(z6) = 0.6664.
3. The CDF values correspond to the probabilities x < 3 and x < 6. Therefore, to find the probability x lies in that interval, we subtract the lower bound probability:

P(x ∈ [3, 6]) = P(x < 6) – P(x < 3)  (confidence7)
             = Φ(z6) – Φ(z3)  (confidence8)
             ≈ 0.6664 – 0.1977  (confidence9)
             ≈ 0.4689.  (confidence10)

So there is a 46.89% probability, and therefore we have 46.89% confidence, that x ∈ [3, 6].
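The steps above are easy to check numerically. A minimal sketch in Python (an aside; the text's examples use Matlab), building Φ from the standard library's erf exactly as in the definition above:

```python
from math import erf, sqrt

def Phi(z):
    """Standard Gaussian CDF: Phi(z) = (1/2)(1 + erf(z/sqrt(2)))."""
    return 0.5*(1 + erf(z/sqrt(2)))

mu_X, sigma_X = 5, 2.34
z3 = (3 - mu_X)/sigma_X  # ~ -0.85
z6 = (6 - mu_X)/sigma_X  # ~ 0.43
p = Phi(z6) - Phi(z3)    # P(x in [3, 6]) ~ 0.469
print(round(p, 4))
```

The small difference from 0.4689 comes from the table lookups using z-scores rounded to two decimals.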

Often we want to go the other way, estimating the symmetric interval [xb–, xb+] for which there is a given probability. In this case, we first look up the z-score corresponding to a certain probability. For concreteness, given the same population statistics above, let's find the symmetric interval [xb–, xb+] over which we have 90% confidence. From the table, we want two symmetric z-scores that have CDF-value difference 0.9. Or, in maths,

Φ(zb+) – Φ(zb–) = 0.9 and zb+ = –zb–.  (confidence11)

Due to the latter relation and the additional fact that the Gaussian CDF has antisymmetry,

Φ(zb+) + Φ(zb–) = 1.  (confidence12)

Adding the two Φ equations,

Φ(zb+) = 1.9/2  (confidence13)
       = 0.95  (confidence14)

and Φ(zb–) = 0.05. From the table, these correspond (with a linear interpolation) to zb = zb+ = –zb– ≈ 1.645. All that remains is to solve the z-score formula for x:

x = μX + zσX.  (confidence15)

From this,

xb+ = μX + zb+σX ≈ 8.849  (confidence16)
xb– = μX + zb–σX ≈ 1.151  (confidence17)

and X has a 90% confidence interval [1.151, 8.849].
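Inverting Φ has no closed form; a table lookup works, as does a numeric root-finder. A hedged Python sketch (bisection is a simple choice, not the text's method; the numbers should match the table-based result above):

```python
from math import erf, sqrt

def Phi(z):
    """Standard Gaussian CDF."""
    return 0.5*(1 + erf(z/sqrt(2)))

def Phi_inv(p):
    """Invert the Gaussian CDF by bisection (Phi is monotonic)."""
    lo, hi = -10.0, 10.0
    for _ in range(100):
        mid = (lo + hi)/2
        if Phi(mid) < p:
            lo = mid
        else:
            hi = mid
    return (lo + hi)/2

mu_X, sigma_X = 5, 2.34
zb = Phi_inv(0.95)         # ~ 1.645
xb_hi = mu_X + zb*sigma_X  # ~ 8.849
xb_lo = mu_X - zb*sigma_X  # ~ 1.151
print(round(xb_lo, 3), round(xb_hi, 3))
```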

Example statsconfidence-1 re gaussian confidence for a mean

Problem Statement: Consider the data set generated above. What is our 95% confidence interval in our estimate of the mean?

Problem Solution

Assuming we have a sufficiently large data set, the distribution of means is approximately Gaussian. Following the same logic as above, we need the z-score that gives an upper CDF value of (1 + 0.95)/2 = 0.975. From the table, we obtain the zb = zb+ = –zb– below:

    z_b = 1.96

Now we can estimate the mean using our sample and mean statistics:

μX ≈ X̄ ± zb SX̄.  (confidence18)

    mu_x_95 = mu_mu + [-z_b,z_b]*s_mu

mu_x_95 =

    0.4526    0.5449

This is our 95% confidence interval in our estimate of the mean.

stats Statistics student Student confidence p 1

statsstudent Student confidence

The central limit theorem tells us that for large sample size N, the distribution of the means is Gaussian. However, for small sample size, the Gaussian isn't as good an estimate. Student's t-distribution is superior for lower sample size and equivalent at higher sample size. Technically, if the population standard deviation σX is known, even for low sample size we should use the Gaussian distribution. However, this rarely arises in practice, so we can usually get away with an "always t" approach.

A way that the t-distribution accounts for low N is by having an entirely different distribution for each N (seems a bit of a cheat to me). Actually, instead of N, it uses the degrees of freedom ν, which is N minus the number of parameters required to compute the statistic. Since the standard deviation requires only the mean, for most of our cases ν = N – 1.

As with the Gaussian distribution, the t-distribution's integral is difficult to calculate. Typically we will use a t-table such as the one given in Appendix A02. There are three points of note:

1. Since we are primarily concerned with going from probability/confidence values (e.g., P = probability/confidence) to intervals, typically there is a column for each probability.

2. The extra parameter ν takes over one of the dimensions of the table, because three-dimensional tables are illegal.
3. Many of these tables are "two-sided," meaning their t-scores and probabilities assume you want the symmetric probability about the mean over the interval [–tb, tb], where tb is your t-score bound.

Consider the following example.

Example statsstudent-1 re student confidence interval

Problem Statement: Write a Matlab script to generate a data set with 200 samples and sample sizes N ∈ {10, 20, 100} using any old distribution. Compare the distribution of the means for the different N. Use the sample distributions and a t-table to compute 99% confidence intervals.

Problem Solution

Generate the data set:

    confidence = 0.99; % requirement
    M = 200; % # of samples
    N_a = [10,20,100]; % sample sizes
    mu = 27; % population mean
    sigma = 9; % population std
    rng(1); % seed random number generator
    data_a = mu + sigma*randn(N_a(end),M); % normal

    size(data_a) % check size
    data_a(1:10,1:5) % check 10 rows and five columns

ans =

   100   200

ans =

   21.1589   30.2894   27.8705   30.7835   28.3662
   37.6305   17.1264   28.2973   24.0811   34.3486
   20.1739   44.3719   43.7059   39.0699   32.2002
   17.0135   32.6064   36.9030   37.9230   36.5747
   19.3900   32.9156   23.7230   22.4749   19.7709
   21.8460   13.8295   31.2479   16.9527   34.1876
   21.9719   34.6854   19.4480   18.7014   24.1642
   28.6054   32.2244   22.2873   26.9906   37.6746
   25.2282   18.7326   14.5011   28.3814   27.7645
   32.2780   34.1538   27.0382   18.8643   14.1752

Compute the means for different sample sizes:

    mu_a = NaN*ones(length(N_a),M);
    for i = 1:length(N_a)
      mu_a(i,:) = mean(data_a(1:N_a(i),1:M),1);
    end

Plotting the distribution of the means yields Figure 0412.

Figure 0412: a histogram showing the distribution of the means for each sample size (intervals for N = 10, 20, 100)

It makes sense that the larger the sample size, the smaller the spread. A quantitative metric for the spread is, of course, the standard deviation of the means for each sample size:

    S_mu = std(mu_a,0,2)

S_mu =

    2.8365
    2.0918
    1.0097

Look up t-table values (or use Matlab's tinv) for the different sample sizes and 99% confidence. Use these, the mean of means, and the standard deviation of means to compute the 99% confidence interval for each N:

    t_a = tinv(confidence,N_a-1);
    for i = 1:length(N_a)
      interval = mean(mu_a(i,:)) + [-1,1]*t_a(i)*S_mu(i);
      disp(sprintf('interval for N = %i:',N_a(i)))
      disp(interval)
    end

t_a =

    2.8214    2.5395    2.3646

interval for N = 10:
   19.0942   35.1000

interval for N = 20:
   21.6292   32.2535

interval for N = 100:
   24.7036   29.4787

As expected, the larger the sample size, the smaller the interval over which we have 99% confidence in the estimate.
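The shrinking spread of the means can also be seen with a quick Monte Carlo. A Python sketch (not the Matlab above; standard library only, same population parameters μ = 27, σ = 9): the spread of the sample means falls off roughly as σ/√N.

```python
import random
import statistics
from math import sqrt

random.seed(1)
mu, sigma, M = 27, 9, 200  # population mean, population std, # of samples
spreads = {}
for N in (10, 20, 100):  # sample sizes
    # M samples, each of size N; record the standard deviation of the means
    means = [statistics.fmean(random.gauss(mu, sigma) for _ in range(N))
             for _ in range(M)]
    spreads[N] = statistics.stdev(means)
    print(N, round(spreads[N], 2), round(sigma/sqrt(N), 2))  # observed vs sigma/sqrt(N)
```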

stats Statistics multivar Multivariate probability and correlation p 1

statsmultivar Multivariate probability and correlation

Thus far we have considered probability density and mass functions (PDFs and PMFs) of only one random variable. But of course, often we measure multiple random variables X1, X2, ..., Xn during a single event, meaning a measurement consists of determining values x1, x2, ..., xn of these random variables.

We can consider an n-tuple of continuous random variables to form a sample space Ω = R^n on which a multivariate function f : R^n → R, called the joint PDF, assigns a probability density to each outcome x ∈ R^n. The joint PDF must be greater than or equal to zero for all x ∈ R^n, the multiple integral over Ω must be unity, and the multiple integral over a subset of the sample space A ⊂ Ω is the probability of the event A.

We can consider an n-tuple of discrete random variables to form a sample space N0^n on which a multivariate function f : N0^n → R, called the joint PMF, assigns a probability to each outcome x ∈ N0^n. The joint PMF must be greater than or equal to zero for all x ∈ N0^n, the multiple sum over Ω must be unity, and the multiple sum over a subset of the sample space A ⊂ Ω is the probability of the event A.

Example statsmultivar-1 re bivariate gaussian pdf

Problem Statement: Let's visualize multivariate PDFs by plotting a bivariate gaussian using Matlab's mvnpdf function.

Problem Solution

    mu = [10,20]; % means
    Sigma = [1,0;0,2]; % cov; we'll talk about this
    x1_a = linspace(mu(1)-5*sqrt(Sigma(1,1)),mu(1)+5*sqrt(Sigma(1,1)),30);
    x2_a = linspace(mu(2)-5*sqrt(Sigma(2,2)),mu(2)+5*sqrt(Sigma(2,2)),30);
    [X1,X2] = meshgrid(x1_a,x2_a);
    f = mvnpdf([X1(:) X2(:)],mu,Sigma);
    f = reshape(f,length(x2_a),length(x1_a));

    h = figure;
    p = surf(x1_a,x2_a,f);
    xlabel('$x_1$','interpreter','latex');
    ylabel('$x_2$','interpreter','latex');
    zlabel('$f(x_1,x_2)$','interpreter','latex');
    shading interp
    hgsave(h,'figures/temp');

The result is Figure 0413. Note how the means and standard deviations affect the distribution.

Figure 0413: two-variable gaussian PDF f(x1, x2)

Marginal probability

The marginal PDF of a multivariate PDF is the PDF of some subspace of Ω after one or more variables have been "integrated out," such that a smaller number of random variables remains. Of course, these marginal PDFs must have the same properties as any PDF, such as integrating to unity.

Example statsmultivar-2 re bivariate gaussian marginal probability

Problem Statement: Let's demonstrate this by numerically integrating over x2 in the bivariate Gaussian above.

Problem Solution

Continuing from where we left off, let's integrate:

    f1 = trapz(x2_a,f); % trapezoidal integration over x2

Plotting this yields Figure 0414.

    h = figure;
    p = plot(x1_a,f1);
    p.LineWidth = 2;
    xlabel('$x_1$','interpreter','latex');
    ylabel('$g(x_1)=\int_{-\infty}^{\infty} f(x_1,x_2) d x_2$','interpreter','latex');
    hgsave(h,'figures/temp');

Figure 0414: marginal Gaussian PDF g(x1) = ∫ f(x1, x2) dx2

We should probably verify that this integrates to one:

    disp(['integral over x_1 = ',sprintf('%0.7f',trapz(x1_a,f1))])

integral over x_1 = 0.9999986

Not bad.

Covariance

Very often, especially in machine learning or artificial intelligence applications, the question about two random variables X and Y is: how do they co-vary? That is, what is their covariance, defined as

Cov[X, Y] ≡ E((X – μX)(Y – μY))
          = E(XY) – μX μY.

Note that when X = Y, the covariance is just the variance. When a covariance is large and positive, it is an indication that the random variables are strongly correlated. When it is large and negative, they are strongly anti-correlated. Zero covariance means the variables are uncorrelated. In fact, correlation is defined as

Cor[X, Y] = Cov[X, Y] / √(Var[X] Var[Y]).

This is essentially the covariance "normalized" to the interval [–1, 1].
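These definitions translate directly into code. A small Python sketch (standard library only; the construction Y = 2X + noise is a made-up example, not from the text) estimating the covariance and correlation from samples:

```python
import random
from math import sqrt

random.seed(0)
N = 100_000
x = [random.gauss(0, 1) for _ in range(N)]
y = [2*xi + random.gauss(0, 0.5) for xi in x]  # strongly correlated with x

mean = lambda v: sum(v)/len(v)
mx, my = mean(x), mean(y)
# sample covariance and variances (N - 1 in the denominator)
cov = sum((a - mx)*(b - my) for a, b in zip(x, y))/(N - 1)
var_x = sum((a - mx)**2 for a in x)/(N - 1)
var_y = sum((b - my)**2 for b in y)/(N - 1)
cor = cov/sqrt(var_x*var_y)  # theory: 2/sqrt(4.25) ~ 0.970
print(round(cov, 2), round(cor, 2))
```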

Sample covariance

As with the other statistics we've considered, covariance can be estimated from measurement. The estimate, called the sample covariance qXY of random variables X and Y with sample size N, is given by

qXY = (1/(N – 1)) Σ_{i=1}^{N} (xi – X̄)(yi – Ȳ).

Multivariate covariance

With n random variables Xi, one can compute the covariance of each pair. It is common practice to define an n × n matrix of covariances, called the covariance matrix Σ, such that each pair's covariance

Cov[Xi, Xj]  (multivar1)

appears in its row-column combination (making it symmetric), as shown below:

Σ = [ Cov[X1, X1]  Cov[X1, X2]  ...  Cov[X1, Xn]
      Cov[X2, X1]  Cov[X2, X2]  ...  Cov[X2, Xn]
        ...          ...               ...
      Cov[Xn, X1]  Cov[Xn, X2]  ...  Cov[Xn, Xn] ]

The multivariate sample covariance matrix Q is the same as above, but with sample covariances qXiXj.

Both covariance matrices have correlation analogs.

Example statsmultivar-3 re car data sample covariance and correlation

Problem Statement: Let's use a built-in multivariate data set that describes different features of cars, listed below.

    d = load('carsmall.mat'); % this is a struct

Let's compute the sample covariance and correlation matrices.

Problem Solution

    variables = {'MPG','Cylinders','Displacement','Horsepower',...
      'Weight','Acceleration','Model_Year'};
    n = length(variables);
    m = length(d.MPG);
    data = NaN*ones(m,n); % preallocate
    for i = 1:n
      data(:,i) = d.(variables{i});
    end
    cov_d = nancov(data); % sample covariance
    cor_d = corrcov(cov_d); % sample correlation

This is highly correlated/anticorrelated data. Let's plot each variable versus each other variable to see the correlations of each. We use a red-to-blue colormap to contrast anticorrelation and correlation. Purple, then, is uncorrelated.

The following builds the red-to-blue colormap:

    n_colors = 10;
    cmap_rb = NaN*ones(n_colors,3);
    for i = 1:n_colors
      a = i/n_colors;
      cmap_rb(i,:) = (1-a)*[1,0,0]+a*[0,0,1];
    end

    h = figure;
    for i = 1:n
      for j = 1:n
        subplot(n,n,sub2ind([n,n],j,i));
        p = plot(d.(variables{i}),d.(variables{j}),'.'); hold on;
        this_color = cmap_rb(round((cor_d(i,j)+1)*(n_colors-1)/2)+1,:);
        p.MarkerFaceColor = this_color;
        p.MarkerEdgeColor = this_color;
      end
    end
    hgsave(h,'figures/temp');

Figure 0415: car data correlation

Conditional probability and dependence

Independent variables are uncorrelated. However, uncorrelated variables may or may not be independent. Therefore, we cannot use correlation alone as a test for independence. For instance, for random variables X and Y where X has some even (symmetric) distribution and Y = X², clearly the variables are dependent. However, they are also uncorrelated (due to symmetry).

Example statsmultivar-4 re uncorrelated but dependent variables

Problem Statement: Using a uniform distribution U(–1, 1), show with some sampling that X and Y are uncorrelated (but dependent), with Y = X². We compute the correlation for different sample sizes.

Problem Solution

    N_a = round(linspace(10,500,100));
    qc_a = NaN*ones(size(N_a));
    rng(6);
    x_a = -1 + 2*rand(max(N_a),1);
    y_a = x_a.^2;
    for i = 1:length(N_a)
      % should write incremental algorithm, but lazy
      q = cov(x_a(1:N_a(i)),y_a(1:N_a(i)));
      qc = corrcov(q);
      qc_a(i) = qc(2,1); % cross correlation
    end

The absolute values of the correlations are shown in Figure 0416. Note that we should probably average several such curves to estimate how the correlation would drop off with N, but the single curve describes our understanding that the correlation in fact approaches zero in the large-sample limit.

    h = figure;
    p = plot(N_a,abs(qc_a));
    p.LineWidth = 2;
    xlabel('sample size $N$','interpreter','latex');
    ylabel('absolute sample correlation','interpreter','latex');
    hgsave(h,'figures/temp');

Figure 0416: absolute value of the sample correlation between X ∼ U(–1, 1) and Y = X² for different sample size N. In the limit, the population correlation should approach zero, and yet X and Y are not independent.

stats Statistics regression Regression p 1

statsregression Regression

Suppose we have a sample with two measurands: (1) the force F through a spring and (2) its displacement X (not from equilibrium). We would like to determine an analytic function that relates the variables, perhaps for prediction of the force given another displacement.

There is some variation in the measurement. Let's say the following is the sample:

    X_a = 1e-3*[10,21,30,41,49,50,61,71,80,92,100]; % m
    F_a = [50.1,50.4,53.2,55.9,57.2,59.9,...
      61.0,63.9,67.0,67.9,70.3]; % N

Let's take a look at the data. The result is Figure 0417.

    h = figure;
    p = plot(X_a*1e3,F_a,'.b','MarkerSize',15);
    xlabel('$X$ (mm)','interpreter','latex');
    ylabel('$F$ (N)','interpreter','latex');
    xlim([0,max(X_a*1e3)]);
    grid on;
    hgsave(h,'figures/temp');

Figure 0417: force-displacement data

How might we find an analytic function that agrees with the data? Broadly, our strategy will be to assume a general form of a function and use the data to set the parameters in the function such that the difference between the data and the function is minimal.

Let y be the analytic function that we would like to fit to the data. Let yi denote the value of y(xi), where xi is the ith value of the random variable X from the sample. Then we want to minimize the differences between the force measurements Fi and yi.

From calculus, recall that we can minimize a function by differentiating it and solving for the zero-crossings (which correspond to local maxima or minima).

First, we need such a function to minimize. Perhaps the simplest effective function D is constructed by squaring and summing the differences we want to minimize, for sample size N:

D(xi) = Σ_{i=1}^{N} (Fi – yi)²  (regression1)

(recall that yi = y(xi), which makes D a function of x).

Now the form of y must be chosen. We consider only mth-order polynomial functions y, but others can be used in a similar manner:

y(x) = a0 + a1 x + a2 x² + · · · + am x^m.  (regression2)

If we treat D as a function of the polynomial coefficients aj, i.e.

D(a0, a1, · · ·, am),  (regression3)

and minimize D for each value of xi, we must take the partial derivatives of D with respect to each aj and set each equal to zero:

∂D/∂a0 = 0, ∂D/∂a1 = 0, · · ·, ∂D/∂am = 0.

This gives us N equations and m + 1 unknowns aj.

Writing the system in matrix form,

[ 1  x1  x1^2  ...  x1^m ] [ a0 ]   [ F1 ]
[ 1  x2  x2^2  ...  x2^m ] [ a1 ]   [ F2 ]
[ 1  x3  x3^2  ...  x3^m ] [ a2 ] = [ F3 ]  (regression4)
[ :   :    :         :   ] [ :  ]   [ :  ]
[ 1  xN  xN^2  ...  xN^m ] [ am ]   [ FN ]
      A, N×(m+1)         a, (m+1)×1  b, N×1

Typically N > m, and this is an overdetermined system. Therefore, we usually can't solve by taking A⁻¹, because A doesn't have an inverse. Instead, we either find the Moore-Penrose pseudo-inverse A† and have a = A†b as the solution, which is inefficient, or we can approximate b with an algorithm such as that used by Matlab's \ (mldivide) operator. In the latter case,

    a_a = A\b_a;
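For a first-order (straight-line) fit, the overdetermined system can even be solved by hand from the normal equations. A hedged Python sketch (standard library only; Matlab's \ operator generalizes this to higher orders) applied to the spring sample, with X left in mm:

```python
# spring sample from the text (X in mm, F in N)
X = [10, 21, 30, 41, 49, 50, 61, 71, 80, 92, 100]
F = [50.1, 50.4, 53.2, 55.9, 57.2, 59.9, 61.0, 63.9, 67.0, 67.9, 70.3]

# least-squares line y = a0 + a1*x from the normal equations
N = len(X)
sx, sf = sum(X), sum(F)
sxx = sum(x*x for x in X)
sxf = sum(x*f for x, f in zip(X, F))
a1 = (N*sxf - sx*sf)/(N*sxx - sx*sx)  # slope (N/mm), ~0.24
a0 = (sf - a1*sx)/N                   # intercept (N), ~46.5
print(round(a0, 1), round(a1, 3))
```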

Example statsregression-1 re regression

Problem Statement: Use Matlab's \ operator to find a good polynomial fit for the sample. There's the sometimes-difficult question, "what order should we fit?" Let's try out several and see what the squared-differences function D gives.

Problem Solution

    N = length(X_a); % sample size
    m_a = 2:N; % all the orders up to N

    A = NaN*ones(length(m_a),max(m_a),N);
    for k = 1:length(m_a) % each order
      for j = 1:N % each measurement
        for i = 1:( m_a(k) + 1 ) % each coef
          A(k,j,i) = X_a(j)^(i-1);
        end
      end
    end
    disp(squeeze(A(2,:,1:5)))

    1.0000    0.0100    0.0001    0.0000       NaN
    1.0000    0.0210    0.0004    0.0000       NaN
    1.0000    0.0300    0.0009    0.0000       NaN
    1.0000    0.0410    0.0017    0.0001       NaN
    1.0000    0.0490    0.0024    0.0001       NaN
    1.0000    0.0500    0.0025    0.0001       NaN
    1.0000    0.0610    0.0037    0.0002       NaN
    1.0000    0.0710    0.0050    0.0004       NaN
    1.0000    0.0800    0.0064    0.0005       NaN
    1.0000    0.0920    0.0085    0.0008       NaN
    1.0000    0.1000    0.0100    0.0010       NaN

We've printed the first five columns of the third-order matrix, which only has four columns, so NaNs fill in the rest.

Now we can use the \ operator to solve for the coefficients:

    a = NaN*ones(length(m_a),max(m_a));
    warning('off','all');
    for i = 1:length(m_a)
      A_now = squeeze(A(i,:,:));
      a(i,1:m_a(i)) = (A_now(:,1:m_a(i))\F_a')';
    end
    warning('on','all');

    n_plot = 100;
    x_plot = linspace(min(X_a),max(X_a),n_plot);
    y = NaN*ones(n_plot,length(m_a)); % preallocate
    for i = 1:length(m_a)
      y(:,i) = polyval(fliplr(a(i,1:m_a(i))),x_plot);
    end

    h = figure;
    for i = 1:2:length(m_a)-1
      p = plot(x_plot*1e3,y(:,i),'linewidth',1.5);
      hold on;
      p.DisplayName = sprintf('order %i',(m_a(i)-1));
    end
    p = plot(X_a*1e3,F_a,'.b','MarkerSize',15);
    xlabel('$X$ (mm)','interpreter','latex');
    ylabel('$F$ (N)','interpreter','latex');
    p.DisplayName = 'sample';
    legend('show','location','southeast');
    grid on;
    hgsave(h,'figures/temp');

Figure 0418: force-displacement data with curve fits (orders 1, 3, 5, 7, 9)

vecs Statistics Exercises for Chapter 04 p 1

statsex Exercises for Chapter 04

vecs Vector calculus

A great many physical situations of interest to engineers can be described by calculus. It can describe how quantities continuously change over (say) time and gives tools for computing other quantities. We assume familiarity with the fundamentals of calculus: limit, series, derivative, and integral. From these and a basic grasp of vectors, we will outline some of the highlights of vector calculus. Vector calculus is particularly useful for describing the physics of, for instance, the following.

mechanics of particles, wherein is studied the motion of particles and the forcing causes thereof;

rigid-body mechanics, wherein is studied the motion, rotational and translational, and its forcing causes, of bodies considered rigid (undeformable);

solid mechanics, wherein is studied the motion and deformation, and their forcing causes, of continuous solid bodies (those that retain a specific resting shape);

vecs Vector calculus p 3

1. For an introduction to complex analysis, see Kreyszig (Kreyszig, Advanced Engineering Mathematics, Part D).
2. Ibid., Chapters 9, 10.
3. H.M. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus. W.W. Norton, 2005. ISBN: 9780393925166.


fluid mechanics, wherein is studied the motion, and its forcing causes, of fluids (liquids, gases, plasmas);

heat transfer, wherein is studied the movement of thermal energy through and among bodies; and

electromagnetism, wherein is studied the motion, and its forcing causes, of electrically charged particles.

This last example was, in fact, very influential in the original development of both vector calculus and complex analysis.¹ It is not an exaggeration to say that the topics above comprise the majority of physical topics of interest in engineering.

A good introduction to vector calculus is given by Kreyszig.² Perhaps the most famous and enjoyable treatment is given by Schey³ in the adorably titled Div, Grad, Curl, and All That.

It is important to note that in much of what follows, we will describe (typically the three-dimensional space of our lived experience) a euclidean vector space: an n-dimensional vector space isomorphic to R^n. As we know from linear algebra, any vector v ∈ R^n can be expressed in any number of bases. That is, the vector v is a basis-free object with multiple basis representations. The components and basis vectors of a vector change with basis changes, but the vector itself is invariant. A coordinate system is, in fact, just a basis. We are most familiar, of course, with Cartesian

4. John M. Lee, Introduction to Smooth Manifolds, second ed. Vol. 218. Graduate Texts in Mathematics. Springer, 2012.
5. Francesco Bullo and Andrew D. Lewis, Geometric Control of Mechanical Systems: Modeling, Analysis, and Design for Simple Mechanical Control Systems. Ed. by J.E. Marsden, L. Sirovich, and M. Golubitsky. Springer, 2005.

coordinates, which is the specific orthonormal basis b for R^n:

b1 = [1, 0, ..., 0]ᵀ, b2 = [0, 1, ..., 0]ᵀ, ..., bn = [0, 0, ..., 1]ᵀ.  (vecs1)
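The invariance claim is easy to check numerically. A small Python sketch (the 45° rotated basis is an arbitrary choice for illustration): the components change with the basis, but reconstructing the vector from either representation gives the same object.

```python
import math

v = (1.0, 2.0)  # components in the Cartesian basis {b1, b2}

# an alternative orthonormal basis: the Cartesian basis rotated by 45 degrees
c, s = math.cos(math.pi/4), math.sin(math.pi/4)
b1p = (c, s)
b2p = (-s, c)

# components in the rotated basis (dot products, since it is orthonormal)
v1p = v[0]*b1p[0] + v[1]*b1p[1]
v2p = v[0]*b2p[0] + v[1]*b2p[1]

# reconstruct the vector from its rotated-basis representation
vx = v1p*b1p[0] + v2p*b2p[0]
vy = v1p*b1p[1] + v2p*b2p[1]
print((round(vx, 12), round(vy, 12)))  # the same vector: (1.0, 2.0)
```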

Manifolds are spaces that appear locally as R^n but can be globally rather different, and can describe non-euclidean geometry, wherein euclidean geometry's parallel postulate is invalid. Calculus on manifolds is the focus of differential geometry, a subset of which we can consider our current study. A motivation for further study of differential geometry is that it is very convenient when dealing with advanced applications of mechanics, such as rigid-body mechanics of robots and vehicles. A very nice mathematical introduction is given by Lee,⁴ and Bullo and Lewis⁵ give a compact presentation in the context of robotics.

Vector fields have several important properties of interest we'll explore in this chapter. Our goal is to gain an intuition of these properties and be able to perform basic calculation.

vecs Vector calculus div Divergence surface integrals and flux p 1

6. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 21-30.
7. Kreyszig, Advanced Engineering Mathematics, § 10.6.

vecsdiv Divergence, surface integrals, and flux

Flux and surface integrals

Consider a surface S. Let r(u, v) = [x(u, v), y(u, v), z(u, v)] be a parametric position vector on a Euclidean vector space R³. Furthermore, let F : R³ → R³ be a vector-valued function of r, and let n be a unit-normal vector on a surface S. The surface integral is

∬_S F · n dS,  (div1)

which integrates the normal component of F over the surface. We call this quantity the flux of F out of the surface S. This terminology comes from fluid flow, for which the flux is the mass flow rate out of S. In general, the flux is a measure of a quantity (or field) passing through a surface. For more on computing surface integrals, see Schey⁶ and Kreyszig.⁷

Continuity

Consider the flux out of a surface S that encloses a volume ΔV, divided by that volume:

(1/ΔV) ∬_S F · n dS.  (div2)

This gives a measure of flux per unit volume for a volume of space. Consider its physical meaning when we interpret this as fluid flow: all fluid that enters the volume is negative flux and all that leaves is positive. If physical conditions are such that we expect no fluid to enter or exit the volume via what is called a source or a sink, and if we assume the density of the fluid is uniform (this is called an incompressible fluid), then all the fluid that

enters the volume must exit, and we get

(1/ΔV) ∬_S F · n dS = 0.  (div3)

This is called a continuity equation, although typically this name is given to equations of the form in the next section. This equation is one of the governing equations in continuum mechanics.

Divergence

Let's take the flux-per-volume as the volume ΔV → 0; we obtain the following.

Equation div4: divergence, integral form:

lim_{ΔV→0} (1/ΔV) ∬_S F · n dS

This is called the divergence of F and is defined at each point in R³ by taking the volume to zero about it. It is given the shorthand div F.

What interpretation can we give this quantity? It is a measure of the vector field's flux outward through a surface containing an infinitesimal volume. When we consider a fluid, a positive divergence is a local decrease in density, and a negative divergence is a density increase. If the fluid is incompressible and has no sources or sinks, we can write the continuity equation

div F = 0.  (div5)

In the Cartesian basis, it can be shown that the divergence is easily computed from the field

F = Fx i + Fy j + Fz k  (div6)

as follows.

Equation div7: divergence, differential form:

div F = ∂x Fx + ∂y Fy + ∂z Fz
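The integral and differential forms can be reconciled numerically. A Python sketch of the two-dimensional analogue (a made-up check using the field F = (x² + y², x² + y²), whose divergence is 2x + 2y): compute the outward flux through a small square and divide by its area.

```python
def div_estimate(x0, y0, h, n=1000):
    """Flux of F = (x^2 + y^2, x^2 + y^2) out of a square of side h
    centered at (x0, y0), divided by its area h^2. As h -> 0 this
    approaches div F = 2*x0 + 2*y0."""
    Fx = Fy = lambda x, y: x**2 + y**2
    flux = 0.0
    for k in range(n):  # midpoint rule along each side
        t = (k + 0.5)/n
        y = y0 - h/2 + h*t
        flux += (Fx(x0 + h/2, y) - Fx(x0 - h/2, y))*(h/n)  # right, left sides
        x = x0 - h/2 + h*t
        flux += (Fy(x, y0 + h/2) - Fy(x, y0 - h/2))*(h/n)  # top, bottom sides
    return flux/h**2

print(round(div_estimate(1.0, 2.0, 1e-3), 6))  # div F at (1, 2) is 6
```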

Exploring divergence

Divergence is perhaps best explored by considering it for a vector field in R². Such a field F = Fx i + Fy j can be represented as a "quiver" plot. If we overlay the quiver plot over a "color density" plot representing div F, we can increase our intuition about the divergence.

The following was generated from a Jupyter notebook with the following filename and kernel:

    notebook filename: div_surface_integrals_flux.ipynb
    notebook kernel: python3

First, load some Python packages:

    from sympy import *
    import numpy as np
    import matplotlib.pyplot as plt
    from matplotlib.ticker import LogLocator
    from matplotlib.colors import *
    from sympy.utilities.lambdify import lambdify
    from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and functions:

    var('x,y')
    F_x = Function('F_x')(x,y)
    F_y = Function('F_y')(x,y)

Rather than repeat code, let's write a single function quiver_plotter_2D to make several of these plots:

    def quiver_plotter_2D(
            field={F_x: x*y, F_y: x*y},
            grid_width=3,
            grid_decimate_x=8,
            grid_decimate_y=8,
            norm=Normalize(),
            density_operation='div',
            print_density=True,
        ):
        # define symbolics
        var('x,y')
        F_x = Function('F_x')(x, y)
        F_y = Function('F_y')(x, y)
        field_sub = field
        # compute density
        if density_operation == 'div':
            den = F_x.diff(x) + F_y.diff(y)
        elif density_operation == 'curl':  # in the k direction
            den = F_y.diff(x) - F_x.diff(y)
        else:
            raise ValueError('div and curl are the only density operators')
        den_simp = den.subs(field_sub).doit().simplify()
        if den_simp.is_constant():
            print('Warning: density operator is constant (no density plot)')
        if print_density:
            print(f'The {density_operation} is:')
            display(den_simp)
        # lambdify for numerics
        F_x_sub = F_x.subs(field_sub)
        F_y_sub = F_y.subs(field_sub)
        F_x_fun = lambdify((x, y), F_x.subs(field_sub), 'numpy')
        F_y_fun = lambdify((x, y), F_y.subs(field_sub), 'numpy')
        if F_x_sub.is_constant():
            F_x_fun1 = F_x_fun  # dummy
            F_x_fun = lambda x, y: F_x_fun1(x, y)*np.ones(x.shape)
        if F_y_sub.is_constant():
            F_y_fun1 = F_y_fun  # dummy
            F_y_fun = lambda x, y: F_y_fun1(x, y)*np.ones(x.shape)
        if not den_simp.is_constant():
            den_fun = lambdify((x, y), den_simp, 'numpy')
        # create grid
        w = grid_width
        Y, X = np.mgrid[-w:w:100j, -w:w:100j]
        # evaluate numerically
        F_x_num = F_x_fun(X, Y)
        F_y_num = F_y_fun(X, Y)
        if not den_simp.is_constant():
            den_num = den_fun(X, Y)
        # plot
        p = plt.figure()
        # colormesh
        if not den_simp.is_constant():
            cmap = plt.get_cmap('PiYG')
            im = plt.pcolormesh(X, Y, den_num, cmap=cmap, norm=norm)
            plt.colorbar()
        # quiver
        dx = grid_decimate_y
        dy = grid_decimate_x
        plt.quiver(X[::dx, ::dy], Y[::dx, ::dy],
                   F_x_num[::dx, ::dy], F_y_num[::dx, ::dy],
                   units='xy', scale=10)
        plt.title(f'F(x,y) = [{F_x.subs(field_sub)},{F_y.subs(field_sub)}]')
        return p

Note that, while we're at it, we included a hook for density plots of the curl of F; we'll return to this in a later lecture.

Let's inspect several cases.

    p = quiver_plotter_2D(field={F_x: x**2, F_y: y**2})

The div is

2x + 2y

    p = quiver_plotter_2D(field={F_x: x*y, F_y: x*y})

The div is

Figure 051: quiver and divergence-density plot for F(x,y) = [x**2, y**2]

x + y

    p = quiver_plotter_2D(field={F_x: x**2+y**2, F_y: x**2+y**2})

The div is

2x + 2y

    p = quiver_plotter_2D(
        field={F_x: x**2/sqrt(x**2+y**2), F_y: y**2/sqrt(x**2+y**2)},

Figure 052: quiver and divergence-density plot for F(x,y) = [x*y, x*y]

Figure 053: quiver and divergence-density plot for F(x,y) = [x**2 + y**2, x**2 + y**2]

Figure 054: quiver and divergence-density plot for F(x,y) = [x**2/sqrt(x**2 + y**2), y**2/sqrt(x**2 + y**2)]

        norm=SymLogNorm(linthresh=3, linscale=3))

The div is

(–x³ – y³ + 2(x + y)(x² + y²)) / (x² + y²)^(3/2)

vecs Vector calculus curl Curl line integrals and circulation p 1

8. Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 63-74.
9. Kreyszig, Advanced Engineering Mathematics, § 10.1, 10.2.

vecscurl Curl, line integrals, and circulation

Line integrals

Consider a curve C in a Euclidean vector space R³. Let r(t) = [x(t), y(t), z(t)] be a parametric representation of C. Furthermore, let F : R³ → R³ be a vector-valued function of r, and let r′(t) be the tangent vector. The line integral is

    ∫_C F(r(t)) · r′(t) dt    (curl1)

which integrates F along the curve. For more on computing line integrals, see Schey⁸ and Kreyszig.⁹

If F is a force being applied to an object moving along the curve C, the line integral is the work done by the force. More generally, the line integral integrates F along the tangent of C.
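As a quick illustration (a sketch, not from the text), Equation curl1 can be evaluated with sympy for the assumed field F(x, y) = [y, x] along the parabola r(t) = [t, t²], t ∈ [0, 1]:

```python
import sympy as sp

t = sp.symbols("t")
# parameterize the curve C: r(t) = [t, t**2] for t in [0, 1]
r = sp.Matrix([t, t**2])
# the field F(x, y) = [y, x], evaluated along the curve
F = sp.Matrix([r[1], r[0]])
# line integral: integrate F(r(t)) . r'(t) over t
work = sp.integrate(F.dot(r.diff(t)), (t, 0, 1))
print(work)  # 1
```

If F is a force, this value is the work done along C.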

Circulation

Consider the line integral over a closed curve C, denoted by

    ∮_C F(r(t)) · r′(t) dt    (curl2)

We call this quantity the circulation of F around C.

For certain vector-valued functions F, the circulation is zero for every curve. In these cases (static electric fields, for instance), this is sometimes called the law of circulation.
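For instance (an illustrative sketch; the field F = [2x, 2y] is an assumption), the circulation around the unit circle vanishes:

```python
import sympy as sp

t = sp.symbols("t")
# C: the unit circle, r(t) = [cos t, sin t], t in [0, 2*pi]
xt, yt = sp.cos(t), sp.sin(t)
F_on_C = sp.Matrix([2*xt, 2*yt])  # F = [2x, 2y] evaluated on C
r_prime = sp.Matrix([sp.diff(xt, t), sp.diff(yt, t)])
circulation = sp.integrate(F_on_C.dot(r_prime), (t, 0, 2*sp.pi))
print(circulation)  # 0
```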


Curl

Consider the division of the circulation around a curve in R³ by the surface area ∆S it encloses:

    (1/∆S) ∮_C F(r(t)) · r′(t) dt    (curl3)

In a manner analogous to the operation that gave us the divergence, let's consider shrinking this curve to a point and the surface area to zero:

    lim_{∆S→0} (1/∆S) ∮_C F(r(t)) · r′(t) dt    (curl4)

We call this quantity the "scalar" curl of F at each point in R³ in the direction normal to ∆S as it shrinks to zero. Taking three (or n for Rⁿ) "scalar" curls in independent normal directions (enough to span the vector space), we obtain the curl proper, which is a vector-valued function curl : R³ → R³.

The curl is coordinate-independent. In cartesian coordinates, it can be shown to be equivalent to the following.

Equation curl5: curl, differential form, cartesian coordinates

    curl F = [∂yFz − ∂zFy, ∂zFx − ∂xFz, ∂xFy − ∂yFx]ᵀ

But what does the curl of F represent? It quantifies the local rotation of F about each point. If F represents a fluid's velocity, curl F is the local rotation of the fluid about each point, and it is called the vorticity.

¹⁰ A region is simply connected if every curve in it can shrink to a point without leaving the region. An example of a region that is not simply connected is the surface of a toroid.
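A direct check of Equation curl5 with sympy (a sketch; the rigid-rotation field F = [−y, x, 0] is an assumed example):

```python
import sympy as sp

x, y, z = sp.symbols("x y z")
Fx, Fy, Fz = -y, x, sp.Integer(0)  # rigid rotation about the z-axis
curl_F = sp.Matrix([
    sp.diff(Fz, y) - sp.diff(Fy, z),
    sp.diff(Fx, z) - sp.diff(Fz, x),
    sp.diff(Fy, x) - sp.diff(Fx, y),
])
print(list(curl_F))  # [0, 0, 2]
```

The constant k-component 2 is twice the field's angular velocity, i.e., its vorticity.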

Zero curl, circulation, and path independence

Circulation

It can be shown that if the circulation of F on all curves is zero, then in each direction n and at every point, curl F = 0 (i.e., n · curl F = 0). Conversely, for curl F = 0 in a simply connected region,¹⁰ F has zero circulation.

Succinctly, informally, and without the requisite qualifiers above:

    zero circulation ⇒ zero curl    (curl6)
    zero curl + simply connected region ⇒ zero circulation    (curl7)

Path independence

It can be shown that if the path integral of F on all curves between any two points is path-independent, then in each direction n and at every point, curl F = 0 (i.e., n · curl F = 0). Conversely, for curl F = 0 in a simply connected region, all line integrals are independent of path.

Succinctly, informally, and without the requisite qualifiers above:

    path independence ⇒ zero curl    (curl8)
    zero curl + simply connected region ⇒ path independence    (curl9)

Figure 055: an implication graph relating zero curl, zero circulation, path independence, and connectedness. Green edges represent implication (a implies b) and black edges represent logical conjunctions.

… and how they relate

It is also true that

    path independence ⇔ zero circulation    (curl10)

So, putting it all together, we get Figure 055.

Exploring curl

Curl is perhaps best explored by considering it for a vector field in R². Such a field, in cartesian coordinates, F = Fx i + Fy j, has curl

    curl F = [∂y0 − ∂zFy, ∂zFx − ∂x0, ∂xFy − ∂yFx]ᵀ
           = [0 − 0, 0 − 0, ∂xFy − ∂yFx]ᵀ
           = [0, 0, ∂xFy − ∂yFx]ᵀ    (curl11)

That is, curl F = (∂xFy − ∂yFx) k, and the only nonzero component is normal to the xy-plane.

If we overlay a quiver plot of F over a "color density" plot representing the k-component of curl F, we can increase our intuition about the curl.

The following was generated from a Jupyter notebook with the following filename and kernel.

    notebook filename: curl-and-line-integrals.ipynb
    notebook kernel: python3

First, load some Python packages.

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and functions.

var('x, y')
F_x = Function('F_x')(x, y)
F_y = Function('F_y')(x, y)

We use the same function defined in Lecture vecsdiv, quiver_plotter_2D, to make several of these plots.

Let's inspect several cases.

Figure 056: quiver plot of F(x,y) = [0, cos(2πx)] over a color density plot of the curl's k-component.

p = quiver_plotter_2D(
    field={F_x: 0, F_y: cos(2*pi*x)},
    density_operation='curl',
    grid_decimate_x=2, grid_decimate_y=10,
    grid_width=1,
)

The curl is

    −2π sin(2πx)

p = quiver_plotter_2D(
    field={F_x: 0, F_y: x**2},
    density_operation='curl',
    grid_decimate_x=2, grid_decimate_y=20,
)

The curl is

    2x

Figure 057: quiver plot of F(x,y) = [0, x²] over a color density plot of the curl's k-component.

p = quiver_plotter_2D(
    field={F_x: y**2, F_y: x**2},
    density_operation='curl',
    grid_decimate_x=2, grid_decimate_y=20,
)

Figure 058: quiver plot of F(x,y) = [y², x²] over a color density plot of the curl's k-component.

The curl is

    2x − 2y

p = quiver_plotter_2D(
    field={F_x: -y, F_y: x},
    density_operation='curl',
    grid_decimate_x=6, grid_decimate_y=6,
)

Warning: density operator is constant (no density plot)

The curl is

    2

Figure 059: quiver plot of F(x,y) = [−y, x].


vecsgrad Gradient

Gradient

The gradient, grad, of a scalar-valued function f : R³ → R is a vector field F : R³ → R³; that is, grad f is a vector-valued function on R³. The gradient's local direction and magnitude are those of the local maximum rate of increase of f. This makes it useful in optimization (e.g., in the method of gradient descent).
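A minimal sketch of gradient descent (illustrative only; the objective f(x, y) = x² + y² and the step size are assumptions): each step moves against the gradient [2x, 2y].

```python
import numpy as np

def gradient_descent(start, rate=0.1, steps=100):
    """Minimize f(x, y) = x**2 + y**2 by stepping against grad f = [2x, 2y]."""
    p = np.asarray(start, dtype=float)
    for _ in range(steps):
        p = p - rate * 2 * p  # grad f evaluated at p
    return p

print(gradient_descent([3.0, -2.0]))  # converges toward [0, 0], the minimum
```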

In classical mechanics, quantum mechanics, relativity, string theory, thermodynamics, and continuum mechanics (and elsewhere), the principle of least action is taken as fundamental (Richard P. Feynman, Robert B. Leighton, and Matthew Sands. The Feynman Lectures on Physics. New Millennium. Perseus Basic Books, 2010). This principle tells us that nature's laws quite frequently seem to be derivable by assuming a certain quantity, called action, is minimized. Considering, then, that the gradient supplies us with a tool for optimizing functions, it is unsurprising that the gradient enters into the equations of motion of many physical quantities.

The gradient is coordinate-independent, but its coordinate-free definitions don't add much to our intuition. In cartesian coordinates, it can be shown to be equivalent to the following.

Equation grad1: gradient, cartesian coordinates

    grad f = [∂xf, ∂yf, ∂zf]ᵀ

¹¹ This is nonstandard terminology, but we're bold.

Vector fields from gradients are special

Although all gradients are vector fields, not all vector fields are gradients. That is, given a vector field F, it may or may not be equal to the gradient of any scalar-valued function f. Let's say of a vector field that is a gradient that it has gradience.¹¹ Those vector fields that are gradients have special properties. Surprisingly, those properties are connected to path independence and curl. It can be shown that iff a field is a gradient, line integrals of the field are path independent. That is, for a vector field,

    gradience ⇔ path independence    (grad2)

Considering what we know from Lecture vecscurl about path independence, we can expand Figure 055 to obtain Figure 0510. One implication is that gradients have zero curl. Many important fields that describe physical interactions (e.g., static electric fields, Newtonian gravitational fields) are gradients of scalar fields called potentials.

Figure 0510: an implication graph relating gradience, zero curl, zero circulation, path independence, and connectedness. Green edges represent implication (a implies b) and black edges represent logical conjunctions.

Exploring gradient

Gradient is perhaps best explored by considering it for a scalar field on R². Such a field, in cartesian coordinates, f(x, y), has gradient

    grad f = [∂xf, ∂yf]ᵀ    (grad3)

That is, grad f = F = ∂xf i + ∂yf j. If we overlay a quiver plot of F over a "color density" plot representing f, we can increase our intuition about the gradient.

The following was generated from a Jupyter notebook with the following filename and kernel.

    notebook filename: grad.ipynb
    notebook kernel: python3

First, load some Python packages.

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.ticker import LogLocator
from matplotlib.colors import *
from sympy.utilities.lambdify import lambdify
from IPython.display import display, Markdown, Latex

Now we define some symbolic variables and functions.

var('x, y')

(x, y)

Rather than repeat code, let's write a single function, grad_plotter_2D, to make several of these plots.

Let's inspect several cases. While considering the following plots, remember that they all have zero curl.

p = grad_plotter_2D(field=x)

Figure 0511: quiver plot of grad f over a color density plot of f(x,y) = x.

The gradient is

    [1, 0]

p = grad_plotter_2D(field=x+y)

The gradient is

    [1, 1]

vecs Vector calculus grad Gradient p 1

3 2 1 0 1 2 33

2

1

0

1

2

3f(xy) = x + y

6

4

2

0

2

4

Figure 0512 png

p = grad_plotter_2D(field=1)

Warning: field is constant (no plot)

The gradient is

    [0, 0]

Gravitational potential

Gravitational potentials have the form of 1/distance. Let's check out the gradient.

Figure 0513: quiver plot of grad f over a color density plot of f(x,y) = 1/√(x² + y²).

p = grad_plotter_2D(
    field=1/sqrt(x**2 + y**2),
    norm=SymLogNorm(linthresh=3, linscale=3),
    mask=True,
)

The gradient is

    [−x/(x² + y²)^(3/2), −y/(x² + y²)^(3/2)]

Conic section fields

Gradients of conic section fields can be explored.

Figure 0514: quiver plot of grad f over a color density plot of f(x,y) = x².

The following is called a parabolic field.

p = grad_plotter_2D(field=x**2)

The gradient is

    [2x, 0]

The following are called elliptic fields.

Figure 0515: quiver plot of grad f over a color density plot of f(x,y) = x² + y².

p = grad_plotter_2D(field=x**2+y**2)

The gradient is

    [2x, 2y]

p = grad_plotter_2D(field=-x**2-y**2)

Figure 0516: quiver plot of grad f over a color density plot of f(x,y) = −x² − y².

The gradient is

    [−2x, −2y]

The following is called a hyperbolic field.

p = grad_plotter_2D(field=x**2-y**2)

The gradient is

Figure 0517: quiver plot of grad f over a color density plot of f(x,y) = x² − y².

    [2x, −2y]

¹² A surface is orientable if a consistent normal direction can be defined. Most surfaces are orientable, but some, notably the Möbius strip, cannot be. See Kreyszig (Advanced Engineering Mathematics, § 10.6) for more.
¹³ Ibid., § 10.7.
¹⁴ Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 45–52.
¹⁵ Ibid., pp. 49–52.

vecsstoked Stokes and divergence theorems

Two theorems allow us to exchange certain integrals in R³ for others that are easier to evaluate.

The divergence theorem

The divergence theorem asserts the equality of the surface integral of a vector field F and the triple integral of div F over the volume V enclosed by the surface S in R³. That is,

    ∯_S F · n dS = ∭_V div F dV    (stoked1)

Caveats are that V is a closed region bounded by the orientable¹² surface S and that F is continuous and continuously differentiable over a region containing V. This theorem makes some intuitive sense: we can think of the divergence inside the volume "accumulating" via the triple integration and equaling the corresponding surface integral. For more on the divergence theorem, see Kreyszig¹³ and Schey.¹⁴

A lovely application of the divergence theorem is that, for any quantity of conserved stuff (mass, charge, spin, etc.) distributed in a spatial R³ with time-dependent density ρ : R⁴ → R and velocity field v : R⁴ → R³, the divergence theorem can be applied to find that

    ∂tρ = −div(ρv)    (stoked2)

which is a more general form of a continuity equation, one of the governing equations of many physical phenomena. For a derivation of this equation, see Schey.¹⁵
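Both sides of the divergence theorem can be verified symbolically for a simple case (a sketch; the field F = [x, y, z] and the unit cube are assumptions, not from the text):

```python
import sympy as sp

x, y, z = sp.symbols("x y z")
F = sp.Matrix([x, y, z])  # a simple field on the unit cube [0, 1]^3

# right-hand side: triple integral of div F over the volume
divF = sum(sp.diff(F[i], v) for i, v in enumerate((x, y, z)))
volume_side = sp.integrate(divF, (x, 0, 1), (y, 0, 1), (z, 0, 1))

# left-hand side: flux through the six faces, with outward normals
flux = 0
for i, v in enumerate((x, y, z)):
    others = [u for u in (x, y, z) if u != v]
    flux += sp.integrate(F[i].subs(v, 1), (others[0], 0, 1), (others[1], 0, 1))
    flux -= sp.integrate(F[i].subs(v, 0), (others[0], 0, 1), (others[1], 0, 1))

print(volume_side, flux)  # 3 3
```

Both sides agree, as the theorem requires.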

¹⁶ A surface is smooth if its normal is continuous everywhere. It is piecewise smooth if it is composed of a finite number of smooth surfaces.
¹⁷ Kreyszig, Advanced Engineering Mathematics, § 10.9.
¹⁸ Schey, Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, pp. 93–102.
¹⁹ Kreyszig, Advanced Engineering Mathematics, § 10.9.
²⁰ Lee, Introduction to Smooth Manifolds, Ch. 16.

The Kelvin-Stokes theorem

The Kelvin-Stokes theorem asserts the equality of the circulation of a vector field F over a closed curve C and the surface integral of curl F over a surface S that has boundary C. That is, for r(t) a parameterization of C and surface normal n,

    ∮_C F(r(t)) · r′(t) dt = ∬_S n · curl F dS    (stoked3)

Caveats are that S is piecewise smooth,¹⁶ its boundary C is a piecewise smooth simple closed curve, and F is continuous and continuously differentiable over a region containing S. This theorem is also somewhat intuitive: we can think of the curl over the surface "accumulating" via the surface integration and equaling the corresponding circulation. For more on the Kelvin-Stokes theorem, see Kreyszig¹⁷ and Schey.¹⁸
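The theorem can be checked symbolically for a simple case (a sketch; the field F = [−y, x, 0] on the unit disk is an assumed example). Its curl is [0, 0, 2], so the surface integral is taken in polar coordinates:

```python
import sympy as sp

t, r, th = sp.symbols("t r theta")
# boundary C: the unit circle, r(t) = [cos t, sin t, 0]
xt, yt = sp.cos(t), sp.sin(t)
F_on_C = sp.Matrix([-yt, xt, 0])  # F = [-y, x, 0] on C
r_prime = sp.Matrix([sp.diff(xt, t), sp.diff(yt, t), 0])
circulation = sp.integrate(F_on_C.dot(r_prime), (t, 0, 2*sp.pi))

# surface side: n . curl F = 2 over the unit disk, dS = r dr dtheta
surface_side = sp.integrate(2*r, (r, 0, 1), (th, 0, 2*sp.pi))

print(circulation, surface_side)  # 2*pi 2*pi
```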

Related theorems

Green's theorem is a two-dimensional special case of the Kelvin-Stokes theorem. It is described by Kreyszig.¹⁹

It turns out that all of the above theorems (and the fundamental theorem of calculus, which relates the derivative and integral) are special cases of the generalized Stokes theorem, defined by differential geometry. We would need a deeper understanding of differential geometry to understand this theorem. For more, see Lee.²⁰

vecsex Exercises for Chapter 05

Exercise 051 (6)

The PDE of Example pdeseparation-1 can be used to describe the conduction of heat along a long, thin rod insulated along its length, where u(t, x) represents temperature. The initial and dirichlet boundary conditions in that example would be interpreted as an initial temperature distribution along the bar and fixed temperatures of the ends. Now consider the same PDE,

    ∂tu(t, x) = k ∂²xxu(t, x)    (ex1)

with real constant k, now with neumann boundary conditions on interval x ∈ [0, L],

    ∂xu|x=0 = 0 and ∂xu|x=L = 0    (ex2a)

and with initial condition

    u(0, x) = f(x)    (ex3)

where f is some piecewise continuous function on [0, L]. This represents the complete insulation of the ends of the rod, such that no heat flows from the ends (or from anywhere else).

a. Assume a product solution, separate variables into X(x) and T(t), and set the separation constant to −λ.
b. Solve the boundary value problem for its eigenfunctions Xn and eigenvalues λn.
c. Solve for the general solution of the time variable ODE.
d. Write the product solution and apply the initial condition f(x) by constructing it from a generalized fourier series of the product solution.
e. Let L = 1 and f(x) = 100 − (200/L)|x − L/2|, as shown in Figure 0518. Compute the solution series components. Plot the sum of the first 50 terms over x and t.

Figure 0518: initial condition for Exercise 051.

four Fourier and orthogonality

¹ It's important to note that the symbol ωn in this context is not the natural frequency but a frequency indexed by integer n.

fourseries Fourier series

Fourier series are mathematical series that can represent a periodic signal as a sum of sinusoids at different amplitudes and frequencies. They are useful for solving for the response of a system to periodic inputs. However, they are probably most important conceptually: they are our gateway to thinking of signals in the frequency domain, that is, as functions of frequency (not time). To represent a function as a Fourier series is to analyze it as a sum of sinusoids at different frequencies¹ ωn and amplitudes an. Its frequency spectrum is the functional representation of amplitudes an versus frequency ωn.

Let's begin with the definition.

Definition fourseries1: Fourier series, trigonometric form

The Fourier analysis of a periodic function y(t) is, for n ∈ N0, period T, and angular frequency ωn = 2πn/T,

    an = (2/T) ∫_−T/2^T/2 y(t) cos(ωn t) dt    (series1)
    bn = (2/T) ∫_−T/2^T/2 y(t) sin(ωn t) dt    (series2)

The Fourier synthesis of a periodic function y(t) with analysis components an and bn corresponding to ωn is

    y(t) = a0/2 + Σ_{n=1}^∞ (an cos(ωn t) + bn sin(ωn t))    (series3)


Let's consider the complex form of the Fourier series, which is analogous to Definition fourseries1. It may be helpful to review Euler's formula(s); see Appendix D01.

Definition fourseries2: Fourier series, complex form

The Fourier analysis of a periodic function y(t) is, for n ∈ N0, period T, and angular frequency ωn = 2πn/T,

    c±n = (1/T) ∫_−T/2^T/2 y(t) e^(−jωn t) dt    (series4)

The Fourier synthesis of a periodic function y(t) with analysis components cn corresponding to ωn is

    y(t) = Σ_{n=−∞}^∞ cn e^(jωn t)    (series5)

We call the integer n a harmonic, and the frequency associated with it,

    ωn = 2πn/T,    (series6)

the harmonic frequency. There is a special name for the first harmonic (n = 1): the fundamental frequency. It is called this because all other frequency components are integer multiples of it.

It is also possible to convert between the two representations above.


Definition fourseries3: Fourier series, converting between forms

The complex Fourier analysis of a periodic function y(t) is, for n ∈ N0 and an and bn as defined above,

    c±n = (1/2)(a|n| ∓ jb|n|)    (series7)

The sinusoidal Fourier analysis of a periodic function y(t) is, for n ∈ N0 and cn as defined above,

    an = cn + c−n and    (series8)
    bn = j(cn − c−n)    (series9)

The harmonic amplitude Cn is

    Cn = √(an² + bn²)    (series10)
       = 2√(cn c−n)    (series11)

A magnitude line spectrum is a graph of the harmonic amplitudes as a function of the harmonic frequencies. The harmonic phase is

    θn = −arctan2(bn, an) (see Appendix C02)
       = arctan2(Im(cn), Re(cn))

The following illustration demonstrates how sinusoidal components sum to represent a square wave. A line spectrum is also shown.

(Illustration: sinusoidal components in time summing to a square wave, with the corresponding line spectrum of spectral amplitude versus frequency.)

Let us compute the associated spectral components in the following example.

Example fourseries-1: Fourier series analysis, line spectrum

Problem Statement

Compute the first five harmonic amplitudes that represent the line spectrum for a square wave in the figure above.
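A symbolic sketch of the computation (assuming a unit-amplitude odd square wave: −1 on (−T/2, 0), +1 on (0, T/2), so an = 0 and Cn = |bn|):

```python
import sympy as sp

t, T = sp.symbols("t T", positive=True)
n = sp.symbols("n", integer=True, positive=True)
w_n = 2*sp.pi*n/T  # harmonic frequency

# odd square wave: y = -1 on (-T/2, 0), +1 on (0, T/2)
b_n = sp.simplify(
    2/T*(sp.integrate(-sp.sin(w_n*t), (t, -T/2, 0))
         + sp.integrate(sp.sin(w_n*t), (t, 0, T/2)))
)
C = [sp.simplify(b_n.subs(n, k)) for k in range(1, 6)]
print(C)  # [4/pi, 0, 4/(3*pi), 0, 4/(5*pi)]
```

Only odd harmonics are present, falling off as 1/n, which matches the square wave's line spectrum.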


fourtrans Fourier transform

The following was generated from a Jupyter notebook with the following filename and kernel.

    notebook filename: fourier_series_to_transform.ipynb
    notebook kernel: python3

We begin with the usual loading of modules.

import numpy as np               # for numerics
import sympy as sp               # for symbolics
import matplotlib.pyplot as plt  # for plots
from IPython.display import display, Markdown, Latex

Let's consider a periodic function f with period T (T). Each period, the function has a triangular pulse of width δ (pulse_width) and height δ/2.

period = 15      # period
pulse_width = 2  # pulse width

First, we plot the function f in the time domain. Let's begin by defining f.


def pulse_train(t, T, pulse_width):
    f = lambda x: pulse_width/2 - abs(x)  # pulse
    tm = np.mod(t, T)
    if tm <= pulse_width/2:
        return f(tm)
    elif tm >= T - pulse_width/2:
        return f(-(tm - T))
    else:
        return 0

Now we develop a numerical array in time to plot f.

N = 201  # number of points to plot
tpp = np.linspace(-period/2, 5*period/2, N)  # time values
fpp = np.array(np.zeros(tpp.shape))
for i, t_now in enumerate(tpp):
    fpp[i] = pulse_train(t_now, period, pulse_width)

axes = plt.figure(1)
plt.plot(tpp, fpp, 'b-', linewidth=2)  # plot
plt.xlabel('time (s)')
plt.xlim([-period/2, 3*period/2])
plt.xticks(
    [0, pulse_width/2, period],
    ['0', '$\delta/2$', '$T=' + str(period) + '$ s'],
)
plt.yticks([0, pulse_width/2], ['0', '$\delta/2$'])
plt.show()  # display here

For δ = 2 and T ∈ [5, 15, 25], the left-hand column of Figure 061 shows two triangle pulses for each period T.


Consider the following argument. Just as a Fourier series is a frequency domain representation of a periodic signal, a Fourier transform is a frequency domain representation of an aperiodic signal (we will rigorously define it in a moment). The Fourier series components will have an analog, then, in the Fourier transform. Recall that they can be computed by integrating over a period of the signal. If we increase that period infinitely, the function is effectively aperiodic. The result (within a scaling factor) will be the Fourier transform analog of the Fourier series components.

Let us approach this understanding by actually computing the Fourier series components for increasing period T. We'll use sympy to compute the Fourier series cosine and sine components an and bn for component n (n) and period T (T).

sp.var('x, a_0, a_n, b_n', real=True)
sp.var('delta, T', positive=True)
sp.var('n', nonnegative=True)
a0 = (2/T*sp.integrate(
    (delta/2 - sp.Abs(x)), (x, -delta/2, delta/2)  # otherwise zero
)).simplify()
an = sp.integrate(
    2/T*(delta/2 - sp.Abs(x))*sp.cos(2*sp.pi*n/T*x),
    (x, -delta/2, delta/2)  # otherwise zero
).simplify()
bn = (2/T*sp.integrate(
    (delta/2 - sp.Abs(x))*sp.sin(2*sp.pi*n/T*x),
    (x, -delta/2, delta/2)  # otherwise zero
)).simplify()
display(sp.Eq(a_n, an), sp.Eq(b_n, bn))

    an = { T(1 − cos(πδn/T))/(π²n²)  for n ≠ 0
         { δ²/(2T)                   otherwise

    bn = 0

Furthermore, let us compute the harmonic amplitude (f_harmonic_amplitude)

    Cn = √(an² + bn²)    (trans1)

which we have also scaled by a factor T/δ in order to plot it with a convenient scale.

sp.var('C_n', positive=True)
cn = sp.sqrt(an**2 + bn**2)
display(sp.Eq(C_n, cn))

    Cn = { T|cos(πδn/T) − 1|/(π²n²)  for n ≠ 0
         { δ²/(2T)                   otherwise

Now we lambdify the symbolic expression for a numpy function.

cn_f = sp.lambdify((n, T, delta), cn)

Now we can plot

Figure 061: triangle pulse trains (left column) with longer periods descending, and their corresponding line spectra (right column), scaled for convenient comparison.

omega_max = 12  # rad/s, max frequency in line spectrum
n_max = round(omega_max*period/(2*np.pi))  # max harmonic
n_a = np.linspace(0, n_max, n_max + 1)
omega = 2*np.pi*n_a/period
axes = plt.figure(2)
markerline, stemlines, baseline = plt.stem(
    omega, period/pulse_width*cn_f(n_a, period, pulse_width),
    linefmt='b-', markerfmt='bo', basefmt='r-',
    use_line_collection=True,
)
plt.xlabel('frequency $\omega$ (rad/s)')
plt.xlim([0, omega_max])
plt.ylim([0, pulse_width/2])
plt.yticks([0, pulse_width/2], ['0', '$\delta/2$'])
plt.show()  # show here

The line spectra are shown in the right-hand column of Figure 061. Note that, with our chosen scaling, as T increases the line spectra reveal a distinct waveform.

Figure 062: F(ω), our mysterious Fourier series amplitude analog.

Let F be the continuous function of angular frequency ω

    F(ω) = (δ/2) · sin²(ωδ/4)/(ωδ/4)²    (trans2)

First, we plot it.

F = lambda w: pulse_width/2 * np.sin(w*pulse_width/(2*2))**2 / (w*pulse_width/(2*2))**2

N = 201  # number of points to plot
wpp = np.linspace(0.0001, omega_max, N)
Fpp = []
for i in range(0, N):
    Fpp.append(F(wpp[i]))  # build array of function values
axes = plt.figure(3)
plt.plot(wpp, Fpp, 'b-', linewidth=2)  # plot
plt.xlim([0, omega_max])
plt.yticks([0, pulse_width/2], ['0', '$\delta/2$'])
plt.xlabel('frequency $\omega$ (rad/s)')
plt.ylabel('$F(\omega)$')
plt.show()

Let's consider the plot in Figure 062 of F. It's obviously the function emerging in Figure 061 from increasing the period of our pulse train.


Now we are ready to define the Fourier transform and its inverse.

Definition fourtrans1: Fourier transforms, trigonometric form

Fourier transform (analysis):

    A(ω) = ∫_−∞^∞ y(t) cos(ωt) dt    (trans3)
    B(ω) = ∫_−∞^∞ y(t) sin(ωt) dt    (trans4)

Inverse Fourier transform (synthesis):

    y(t) = (1/2π) ∫_−∞^∞ A(ω) cos(ωt) dω + (1/2π) ∫_−∞^∞ B(ω) sin(ωt) dω    (trans5)

Definition fourtrans2: Fourier transforms, complex form

Fourier transform F (analysis):

    F(y(t)) = Y(ω) = ∫_−∞^∞ y(t) e^(−jωt) dt    (trans6)

Inverse Fourier transform F⁻¹ (synthesis):

    F⁻¹(Y(ω)) = y(t) = (1/2π) ∫_−∞^∞ Y(ω) e^(jωt) dω    (trans7)

So now we have defined the Fourier transform. There are many applications, including solving differential equations and frequency domain representations, called spectra, of time domain functions.

There is a striking similarity between the Fourier transform and the Laplace transform, with which you are already acquainted. In fact, the Fourier transform is a special case of a Laplace transform, with Laplace transform variable s = jω instead of having some real component. Both transforms convert differential equations to algebraic equations, which can be solved and inversely transformed to find time-domain solutions. The Laplace transform is especially important to use when an input function to a differential equation is not absolutely integrable and the Fourier transform is undefined (for example, our definition will yield a transform for neither the unit step nor the unit ramp functions). However, the Laplace transform is also preferred for initial value problems, due to its convenient way of handling them. The two transforms are equally useful for solving steady state problems. Although the Laplace transform has many advantages, for spectral considerations the Fourier transform is the only game in town.

A table of Fourier transforms and their properties can be found in Appendix B02.

Example fourtrans-1: a Fourier transform

Problem Statement

Consider the aperiodic signal y(t) = us(t)e^(−at), with us the unit step function and a > 0. The signal is plotted below. Derive the complex frequency spectrum and plot its magnitude and phase.

(Plot: y(t) = us(t)e^(−at), decaying from 1 at t = 0.)

Problem Solution

The signal is aperiodic, so the Fourier transform can be computed from Equation trans6:

    Y(ω) = ∫_−∞^∞ y(t)e^(−jωt) dt
         = ∫_−∞^∞ us(t)e^(−at)e^(−jωt) dt    (def. of y)
         = ∫_0^∞ e^(−at)e^(−jωt) dt    (us effect)
         = ∫_0^∞ e^((−a−jω)t) dt    (multiply)
         = 1/(−a − jω) · e^((−a−jω)t) |_0^∞    (antiderivative)
         = 1/(−a − jω) · (lim_{t→∞} e^((−a−jω)t) − e⁰)    (evaluate)
         = 1/(−a − jω) · (lim_{t→∞} e^(−at)e^(−jωt) − 1)    (arrange)
         = 1/(−a − jω) · ((0)(complex with mag ⩽ 1) − 1)    (limit)
         = −1/(−a − jω)    (consequence)
         = 1/(a + jω)
         = 1/(a + jω) · (a − jω)/(a − jω)    (rationalize)
         = (a − jω)/(a² + ω²)

The magnitude and phase of this complex function are straightforward to compute:

    |Y(ω)| = √(Re(Y(ω))² + Im(Y(ω))²)
           = (1/(a² + ω²)) √(a² + ω²)
           = 1/√(a² + ω²)

    ∠Y(ω) = −arctan(ω/a)

Now we can plot these functions of ω. Setting a = 1 (arbitrarily), we obtain the following plots.

(Plots: |Y(ω)|, peaking at 1 near ω = 0, and ∠Y(ω) versus ω ∈ [−10, 10].)
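The derivation can also be checked symbolically (a sketch using sympy's integrate, not part of the original example; a is assumed positive):

```python
import sympy as sp

t, w = sp.symbols("t omega", real=True)
a = sp.symbols("a", positive=True)
# Fourier transform of u_s(t) e^{-a t}: the step restricts the integral to t >= 0
Y = sp.integrate(sp.exp(-a*t)*sp.exp(-sp.I*w*t), (t, 0, sp.oo), conds='none')
print(sp.simplify(Y))  # equivalent to 1/(a + I*omega)
```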

² A function f is square-integrable if ∫_−∞^∞ |f(x)|² dx < ∞.
³ This definition of the inner product can be extended to functions on R² and R³ domains using double- and triple-integration. See Schey (Div, Grad, Curl, and All That: An Informal Text on Vector Calculus, p. 261).


fourgeneral Generalized fourier series and orthogonality

Let f : R → C, g : R → C, and w : R → C be complex functions. For square-integrable² f, g, and w, the inner product of f and g with weight function w over the interval [a, b] ⊆ R is³

    ⟨f, g⟩w = ∫_a^b f(x) ḡ(x) w(x) dx    (general1)

where ḡ denotes the complex conjugate of g. The inner product of functions can be considered analogous to the inner (or dot) product of vectors.

The fourier series components can be found by a special property of the sin and cos functions called orthogonality. In general, functions f and g from above are orthogonal over the interval [a, b] iff

    ⟨f, g⟩w = 0    (general2)

for weight function w. Similar to how a set of orthogonal vectors can be a basis for a vector space, a set of orthogonal functions can be a basis for a function space: a vector space of functions from one set to another (with certain caveats).
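For example (a sketch; the choice of sinusoids is an assumed illustration), Equation general2 can be checked for sinusoids with weight w(x) = 1 on [−π, π]:

```python
import sympy as sp

x = sp.symbols("x", real=True)

def inner(f, g):  # inner product with weight w(x) = 1 on [-pi, pi]
    return sp.integrate(f*sp.conjugate(g), (x, -sp.pi, sp.pi))

print(inner(sp.sin(2*x), sp.sin(3*x)))  # 0: sin(2x) and sin(3x) are orthogonal
print(inner(sp.sin(2*x), sp.sin(2*x)))  # pi: nonzero "length" of sin(2x)
```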

In addition to some sets of sinusoids there

are several other important sets of functions

that are orthogonal For instance sets of

legendre polynomials (Erwin Kreyszig

Advanced Engineering Mathematics 10th John

Wiley amp Sons Limited 2011 ISBN 9781119571094

The authoritative resource for engineering

four Fourier and orthogonality ex Generalized fourier series and orthogonality p 2

bessel functions

generalized fourier series

Fourier components

synthesis

analysis

fourier-legendre series

fourier-bessel series

mathematics. It includes detailed accounts of probability, statistics, vector calculus, linear algebra, fourier analysis, ordinary and partial differential equations, and complex analysis. It also includes several other topics with varying degrees of depth. Overall, it is the best place to start when seeking mathematical guidance. § 5.2) and bessel functions (Kreyszig, Advanced Engineering Mathematics, § 5.4) are orthogonal.

As with sinusoids, the orthogonality of some sets of functions allows us to compute their series components. Let functions f_0, f_1, ··· be orthogonal with respect to weight function w on interval [a, b], and let α_0, α_1, ··· be real constants. A generalized fourier series is (ibid., § 11.6)

f(x) = ∑_{m=0}^∞ α_m f_m(x)    (general.3)

and represents a function f as a convergent series. It can be shown that the Fourier components α_m can be computed from

α_m = ⟨f, f_m⟩_w / ⟨f_m, f_m⟩_w.    (general.4)

In keeping with our previous terminology for fourier series, Equation general.3 and Equation general.4 are called general fourier synthesis and analysis, respectively.

For the aforementioned legendre and bessel functions, the generalized fourier series are called fourier-legendre and fourier-bessel series (ibid., § 11.6). These and the standard fourier series (Lecture fourseries) are of particular interest for the solution of partial differential equations (Chapter 07).
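As an illustration (mine, not from the text), here is a sketch of a fourier-legendre analysis and synthesis of f(x) = |x| on [−1, 1], using the fact that the legendre polynomials P_m are orthogonal there with unit weight and ⟨P_m, P_m⟩ = 2/(2m + 1):

```python
import numpy as np
from numpy.polynomial.legendre import Legendre
from scipy.integrate import quad

f = lambda x: np.abs(x)  # function to expand (chosen for illustration)
M = 8                    # number of components retained

# analysis (general.4): alpha_m = <f, P_m> / <P_m, P_m>, with <P_m, P_m> = 2/(2m+1)
alpha = []
for m in range(M):
    P_m = Legendre.basis(m)
    num, _ = quad(lambda x: f(x) * P_m(x), -1, 1)
    alpha.append(num / (2 / (2 * m + 1)))

# synthesis (general.3): partial sum of the fourier-legendre series
x = np.linspace(-1, 1, 5)
f_hat = sum(a * Legendre.basis(m)(x) for m, a in enumerate(alpha))
print(np.round(f_hat, 2))  # approximates |x| at the sample points
```

Note the odd-order components vanish, since |x| is even.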


3. Python code in this section was generated from a Jupyter notebook named random_signal_fft.ipynb with a python3 kernel.

fourex Exercises for Chapter 06

Exercise 06.1 (secrets)

This exercise encodes a "secret word" into a sampled waveform for decoding via a discrete fourier transform (DFT). The nominal goal of the exercise is to decode the secret word. Along the way, plotting and interpreting the DFT will be important.

First, load relevant packages:

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

We define two functions: letter_to_number, to convert a letter into an integer index of the alphabet (a becomes 1, b becomes 2, etc.), and string_to_number_list, to convert a string to a list of ints, as shown in the example at the end.

def letter_to_number(letter):
    return ord(letter) - 96

def string_to_number_list(string):
    out = []  # list
    for i in range(0, len(string)):
        out.append(letter_to_number(string[i]))
    return out  # list

print(f"aces = {string_to_number_list('aces')}")


aces = [1, 3, 5, 19]

Now we encode a code string code into a signal by beginning with "white noise," which is broadband (appears throughout the spectrum), and adding to it sin functions with amplitudes corresponding to the letter assignments of the code and harmonics corresponding to the position of the letter in the string. For instance, the string bad would be represented by noise plus the signal

2 sin 2πt + 1 sin 4πt + 4 sin 6πt.    (ex.1)

Let's set this up for secret word chupcabra:

N = 2000
Tm = 30
T = float(Tm)/float(N)
fs = 1/T
x = np.linspace(0, Tm, N)
noise = 4*np.random.normal(0, 1, N)
code = 'chupcabra'  # the secret word
code_number_array = np.array(string_to_number_list(code))
y = np.array(noise)
for i in range(0, len(code)):
    y = y + code_number_array[i]*np.sin(2*np.pi*(i+1)*x)

For proper decoding later, it is important to know the fundamental frequency of the generated data.

print(f"fundamental frequency = {fs} Hz")


fundamental frequency = 66.66666666666667 Hz

Figure 06.3: the chupacabra signal

Now we plot.

fig, ax = plt.subplots()
plt.plot(x, y)
plt.xlim([0, Tm/4])
plt.xlabel('time (s)')
plt.ylabel('$y_n$')
plt.show()

Finally, we can save our data to a numpy file secrets.npy to distribute our message.

np.save('secrets', y)


Now I have done this (for a different secret word) and saved the data; download it here:

ricopic.one/mathematical_foundations/source/secrets.npy

In order to load the npy file into Python, we can use the following command:

secret_array = np.load('secrets.npy')

Your job is to (a) perform a DFT, (b) plot the spectrum, and (c) decode the message. Here are a few hints:

1. Use from scipy import fft to do the DFT.
2. Use a hanning window to minimize the end-effects. See numpy.hanning, for instance. The fft call might then look like

2*fft(np.hanning(N)*secret_array)/N

where N = len(secret_array).
3. Use only the positive spectrum; you can lop off the negative side and double the positive side.
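Following those hints, here is one possible decoding sketch (mine, not the text's solution; the helper decode_secret and the rule of rounding spectral peak amplitudes to alphabet indices are assumptions that mirror the encoding above):

```python
import numpy as np
from scipy import fft

def decode_secret(y, Tm, max_letters=26):
    # y: sampled signal of duration Tm seconds; letter k of the word was
    # encoded as amplitude a_k (its alphabet index) at harmonic k, i.e. k Hz
    N = len(y)
    window = np.hanning(N)  # reduce end-effects
    amp = 2 * np.abs(fft.fft(window * y)) / window.sum()  # amplitude spectrum
    letters = []
    for k in range(1, max_letters + 1):
        a = round(amp[round(k * Tm)])  # the bin of k Hz is k*Tm
        if a < 1 or a > 26:
            break  # past the last encoded letter
        letters.append(chr(a + 96))
    return ''.join(letters)

# quick check with a noiseless "aces" signal
Tm, N = 30, 2000
t = np.arange(N) * Tm / N
y = sum(a * np.sin(2 * np.pi * (i + 1) * t)
        for i, a in enumerate([1, 3, 5, 19]))  # 'aces'
print(decode_secret(y, Tm))  # → aces
```

Dividing by window.sum() (rather than N) compensates for the amplitude lost to the Hann window.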

Exercise 06.2 (seesaw)

Consider the periodic function f : R → R with period T, defined for one period as

f(t) = at for t ∈ (−T/2, T/2],    (ex.2)

where a, T ∈ R. Perform a fourier analysis on f. Letting a = 5 and T = 1, plot f along with the partial sum of the fourier synthesis, the first 50 nonzero components, over t ∈ [−T, T].


Exercise 06.3 (9)

Consider the function f : R → R defined as

f(t) = { a − a|t|/T for t ∈ [−T, T]; 0 otherwise,    (ex.3)

where a, T ∈ R. Perform a fourier analysis on f, resulting in F(ω). Plot F for various a and T.

Exercise 06.4 (10)

Consider the function f : R → R defined as

f(t) = a e^(−b|t−T|),    (ex.4)

where a, b, T ∈ R. Perform a fourier analysis on f, resulting in F(ω). Plot F for various a, b, and T.

Exercise 06.5 (11)

Consider the function f : R → R defined as

f(t) = a cos ω₀t + b sin ω₀t,    (ex.5)

where a, b, ω₀ ∈ R are constants. Perform a fourier analysis on f, resulting in F(ω).

Exercise 06.6 (12)

Derive a fourier transform property for expressions including function f : R → R, for

f(t) cos(ω₀t + ψ),

where ω₀, ψ ∈ R.

Exercise 06.7 (13)

Consider the function f : R → R defined as

f(t) = a u_s(t) e^(−bt) cos(ω₀t + ψ),    (ex.6)

where a, b, ω₀, ψ ∈ R and u_s(t) is the unit step function. Perform a fourier analysis on f, resulting in F(ω). Plot F for various a, b, ω₀, ψ, and T.


Exercise 06.8 (14)

Consider the function f : R → R defined as

f(t) = g(t) cos(ω₀t),    (ex.7)

where ω₀ ∈ R and g : R → R will be defined in each part below. Perform a fourier analysis on f for each g below, for ω₁ ∈ R a constant, and consider how things change if ω₁ → ω₀.

a. g(t) = cos(ω₁t)
b. g(t) = sin(ω₁t)

Exercise 06.9 (savage)

An instrument called a "lock-in amplifier" can measure a sinusoidal signal A cos(ω₀t + ψ) = a cos(ω₀t) + b sin(ω₀t) at a known frequency ω₀ with exceptional accuracy, even in the presence of significant noise N(t). The workings of these devices can be described in two operations: first, the following operations on the signal with its noise f₁(t) = a cos(ω₀t) + b sin(ω₀t) + N(t):

f₂(t) = f₁(t) cos(ω₁t) and f₃(t) = f₁(t) sin(ω₁t),    (ex.8)

where ω₀, ω₁, a, b ∈ R. Note the relation of this operation to the Fourier analysis of Exercise 06.8. The key is to know with some accuracy ω₀, such that the instrument can set ω₁ ≈ ω₀. The second operation on the signal is an aggressive low-pass filter. The filtered f₂ and f₃ are called the in-phase and quadrature components of the signal and are typically given a complex representation:

(in-phase) + j (quadrature).

Explain with fourier analyses on f₂ and f₃:


a. what F₂ = F(f₂) looks like,
b. what F₃ = F(f₃) looks like,
c. why we want ω₁ ≈ ω₀,
d. why a low-pass filter is desirable, and
e. what the time-domain signal will look like.

Exercise 06.10 (strawman)

Consider again the lock-in amplifier explored in Exercise 06.9. Investigate the lock-in amplifier numerically with the following steps.

a. Generate a noisy sinusoidal signal at some frequency ω₀. Include enough broadband white noise that the signal is invisible in a time-domain plot.
b. Generate f₂ and f₃ as described in Exercise 06.9.
c. Apply a time-domain discrete low-pass filter to each, f₂ ↦ φ₂ and f₃ ↦ φ₃, such as scipy's scipy.signal.sosfiltfilt, to complete the lock-in amplifier operation. Plot the results in time and as a complex (polar) plot.
d. Perform a discrete fourier transform on each, f₂ ↦ F₂ and f₃ ↦ F₃. Plot the spectra.
e. Construct a frequency-domain low-pass filter F and apply it (multiply) to each, F₂ ↦ F′₂ and F₃ ↦ F′₃. Plot the filtered spectra.
f. Perform an inverse discrete fourier transform on each, F′₂ ↦ f′₂ and F′₃ ↦ f′₃. Plot the results in time and as a complex (polar) plot.
g. Compare the two methods used, i.e., time-domain filtering versus frequency-domain filtering.

ordinary differential equations

lumped-parameter modeling

pde Partial differential equations

An ordinary differential equation is one with (ordinary) derivatives of functions of a single variable each (time, in many applications). These typically describe quantities in some sort of lumped-parameter way: mass as a "point particle," a spring's force as a function of time-varying displacement across it, a resistor's current as a function of time-varying voltage across it. Given the simplicity of such models in comparison to the wildness of nature, it is quite surprising how well they work for a great many phenomena. For instance, electronics, rigid body mechanics, population dynamics, bulk fluid mechanics, and bulk heat transfer can be lumped-parameter modeled.

However, as we saw, there are many phenomena for which we require more detailed models. These include


time-varying spatial distribution

partial differential equations

dependent variables

independent variables

analytic solution

numeric solution

1. There are some analytic techniques for gaining insight into PDEs for which there are no known solutions, such as considering the phase space. This is an active area of research; for more, see Bove, Colombini, and Del Santo (Antonio Bove, F. (Ferruccio) Colombini, and Daniele Del Santo. Phase Space Analysis of Partial Differential Equations. Progress in Nonlinear Differential Equations and Their Applications, v. 69. Boston/Berlin: Birkhäuser, 2006. ISBN 9780817645212).

• detailed fluid mechanics,
• detailed heat transfer,
• solid mechanics,
• electromagnetism, and
• quantum mechanics.

In many cases, what is required to account for is the time-varying spatial distribution of a quantity. In fluid mechanics, we treat a fluid as having quantities such as density and velocity that vary continuously over space and time. Deriving the governing equations for such phenomena typically involves vector calculus: we observed that statements about quantities like the divergence (e.g., continuity) can be made about certain scalar and vector fields. Such statements are governing equations (e.g., the continuity equation), and they are partial differential equations (PDEs) because the quantities of interest, called dependent variables (e.g., density and velocity), are both temporally and spatially varying (temporal and spatial variables are therefore called independent variables).

In this chapter, we explore the analytic solution of PDEs. This is related to, but distinct from, the numeric solution (i.e., simulation) of PDEs, which is another important topic. Many PDEs have no known analytic solution, so numeric solution is the best available option.¹ However, it is important to note that the insight one can gain from an analytic solution is often much greater than that from a numeric solution. This is easily understood when one considers that a numeric solution is an approximation for a specific set


2. Kreyszig, Advanced Engineering Mathematics, Ch. 12.

3. W.A. Strauss. Partial Differential Equations: An Introduction. Wiley, 2007. ISBN 9780470054567. A thorough and yet relatively compact introduction.

4. R. Haberman. Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version). Pearson Modern Classics for Advanced Mathematics. Pearson Education Canada, 2018. ISBN 9780134995434.

of initial and boundary conditions. Typically, very little can be said of what would happen in general, although this is often what we seek to know. So despite the importance of numeric solution, one should always prefer an analytic solution.

Three good texts on PDEs for further study are Kreyszig², Strauss³, and Haberman⁴.


initial conditions

boundary conditions

well-posed problem

linear

nonlinear

pdeclass Classifying PDEs

PDEs often have an infinite number of solutions; however, when applying them to physical systems, we usually assume that a deterministic, or at least a probabilistic, sequence of events will occur. Therefore we impose additional constraints on a PDE, usually in the form of

1. initial conditions: values of the dependent variables over all space at an initial time, and
2. boundary conditions: values of the dependent variables (or their derivatives) on the boundary, over all time.

Ideally, imposing such conditions leaves us with a well-posed problem, which has three aspects (Bove, Colombini, and Del Santo, Phase Space Analysis of Partial Differential Equations, § 1.5):

existence: there exists at least one solution;
uniqueness: there exists at most one solution; and
stability: if the PDE, boundary conditions, or initial conditions are changed slightly, the solution changes only slightly.

As with ODEs, PDEs can be linear or nonlinear;


order

second-order PDEs

forcing function

homogeneous

that is, the dependent variables and their derivatives can appear in only linear combinations (linear PDE) or in one or more nonlinear combinations (nonlinear PDE). As with ODEs, there are more known analytic solutions to linear PDEs than to nonlinear PDEs.

The order of a PDE is the order of its highest partial derivative. A great many physical models can be described by second-order PDEs or systems thereof. Let u be a dependent scalar variable, a function of the m temporal and spatial variables x_i. A second-order linear PDE has the form, for coefficients α, γ, δ real functions of x_i (Strauss, Partial Differential Equations: An Introduction, § 1.6),

∑_{i=1}^n ∑_{j=1}^n α_ij ∂²_{x_i x_j} u + ∑_{k=1}^m (γ_k ∂_{x_k} u + δ_k u) = f(x_1, ···, x_n),    (class.1)

where the double sum comprises the second-order terms and the single sum the first- and zeroth-order terms.

The function f is called a forcing function; when f is zero, Equation class.1 is called homogeneous. We can consider the coefficients α_ij to be components of a matrix A, with rows indexed by i and columns indexed by j. There are four prominent classes, defined by the eigenvalues of A:

elliptic: the eigenvalues all have the same sign;
parabolic: the eigenvalues have the same sign, except one that is zero;
hyperbolic: exactly one eigenvalue has the opposite sign of the others; and


cauchyshykowalevski theorem

cauchy problems

ultrahyperbolic: at least two eigenvalues of each sign.

The first three of these have received extensive treatment. They are named after conic sections due to the similarity the equations have with polynomials when derivatives are considered analogous to powers of polynomial variables. For instance, here is a case of each of the first three classes:

∂²_xx u + ∂²_yy u = 0,    (elliptic)
∂²_xx u − ∂²_yy u = 0,    (hyperbolic)
∂²_xx u − ∂_t u = 0.    (parabolic)
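This classification rule can be sketched in code (my illustration, not from the text), applied to the coefficient matrices of the three examples above:

```python
import numpy as np

def classify_second_order(A, tol=1e-12):
    """Classify a second-order PDE by the eigenvalues of its (symmetric,
    real) coefficient matrix A, per the rule above."""
    lam = np.linalg.eigvalsh(np.asarray(A, dtype=float))
    n_zero = np.sum(np.abs(lam) < tol)
    n_pos = np.sum(lam > tol)
    n_neg = np.sum(lam < -tol)
    if n_zero == 0 and (n_pos == 0 or n_neg == 0):
        return 'elliptic'        # all eigenvalues the same sign
    if n_zero == 1 and (n_pos == 0 or n_neg == 0):
        return 'parabolic'       # same sign except one zero
    if n_zero == 0 and min(n_pos, n_neg) == 1:
        return 'hyperbolic'      # exactly one of opposite sign
    if min(n_pos, n_neg) >= 2:
        return 'ultrahyperbolic' # at least two of each sign
    return 'unclassified'

# coefficient matrices of the three example equations, in (x, y) or (x, t)
print(classify_second_order([[1, 0], [0, 1]]))   # elliptic
print(classify_second_order([[1, 0], [0, -1]]))  # hyperbolic
print(classify_second_order([[1, 0], [0, 0]]))   # parabolic
```

Note the parabolic example's first-order term ∂_t u does not enter A; only the second-order coefficients do.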

When A depends on x_i, it may have multiple classes across its domain. In general, this equation and its associated initial and boundary conditions do not comprise a well-posed problem; however, several special cases have been shown to be well-posed. Thus far, the most general statement of existence and uniqueness is the cauchy-kowalevski theorem for cauchy problems.


sturm-liouville (S-L) differential equation

regular S-L problem

5. For the S-L problem to be regular, it has the additional constraints that p, q, σ are continuous and p, σ > 0 on [a, b]. This is also sometimes called the sturm-liouville eigenvalue problem. See Haberman (Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 5.3) for the more general (non-regular) S-L problem and Haberman (ibid., § 7.4) for the multi-dimensional analog.

boundary value problems

eigenfunctions

eigenvalues

6. These eigenvalues are closely related to, but distinct from, the "eigenvalues" that arise in systems of linear ODEs.

7. Ibid., § 5.3.

pdesturm Sturm-liouville problems

Before we introduce an important solution method for PDEs in Lecture pdeseparation, we consider an ordinary differential equation that will arise in that method when dealing with a single spatial dimension x: the sturm-liouville (S-L) differential equation. Let p, q, σ be functions of x on open interval (a, b). Let X be the dependent variable and λ a constant. The regular S-L problem is the S-L ODE⁵

d/dx (pX′) + qX + λσX = 0,    (sturm.1a)

with boundary conditions

β₁X(a) + β₂X′(a) = 0,    (sturm.2a)
β₃X(b) + β₄X′(b) = 0,    (sturm.2b)

with coefficients β_i ∈ R. This is a type of boundary value problem.

This problem has nontrivial solutions called eigenfunctions X_n(x), with n ∈ Z⁺, corresponding to specific values of λ = λ_n, called eigenvalues.⁶ There are several important theorems proven about this (see Haberman⁷). Of greatest interest to us are that

1. there exist an infinite number of eigenfunctions X_n (unique within a multiplicative constant),
2. there exists a unique corresponding real eigenvalue λ_n for each eigenfunction X_n,
3. the eigenvalues can be ordered as λ₁ < λ₂ < ···,


homogeneous boundary conditions

periodic boundary conditions

4. eigenfunction X_n has n − 1 zeros on open interval (a, b), and
5. the eigenfunctions X_n form an orthogonal basis with respect to weighting function σ, such that any piecewise continuous function f : [a, b] → R can be represented by a generalized fourier series on [a, b].

This last theorem will be of particular interest in Lecture pdeseparation.
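The orthogonality theorem can be checked numerically. A sketch (mine, not the text's), using the dirichlet eigenfunctions X_n(x) = sin(nπx/L) found in the example below, with weight σ(x) = 1 assumed:

```python
import numpy as np

# Gram matrix of inner products <X_m, X_n>_sigma with sigma(x) = 1 (assumed),
# for eigenfunctions X_n(x) = sin(n pi x / L) on [0, L]
L = 1.0
x = np.linspace(0, L, 2001)
dx = x[1] - x[0]
G = np.zeros((4, 4))
for m in range(1, 5):
    for n in range(1, 5):
        # Riemann-sum approximation of the integral over [0, L]
        G[m-1, n-1] = np.sum(np.sin(m*np.pi*x/L) * np.sin(n*np.pi*x/L)) * dx
print(np.round(G, 3))  # ~ (L/2) * identity: off-diagonal inner products vanish
```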

Types of boundary conditions

Boundary conditions of the sturm-liouville kind (sturm.2) have four sub-types:

dirichlet: for just β₂, β₄ = 0;
neumann: for just β₁, β₃ = 0;
robin: for all β_i ≠ 0; and
mixed: if β₁ = 0, β₃ ≠ 0; if β₂ = 0, β₄ ≠ 0.

There are many problems that are not regular sturm-liouville problems. For instance, the right-hand sides of Equation sturm.2 are zero, making them homogeneous boundary conditions; however, these can also be nonzero. Another case is periodic boundary conditions:

X(a) = X(b),    (sturm.3a)
X′(a) = X′(b).    (sturm.3b)


Example pdesturm-1: a sturm-liouville problem with dirichlet boundary conditions

Problem Statement

Consider the differential equation

X″ + λX = 0,    (sturm.4)

with dirichlet boundary conditions on the boundary of the interval [0, L]:

X(0) = 0 and X(L) = 0.    (sturm.5)

Solve for the eigenvalues and eigenfunctions.

Problem Solution

This is a sturm-liouville problem, so we know the eigenvalues are real. The well-known general solution to the ODE is

X(x) = { k₁ + k₂x, λ = 0; k₁e^(j√λ x) + k₂e^(−j√λ x), otherwise,    (sturm.6)

with real constants k₁, k₂. The solution must also satisfy the boundary conditions. Let's apply them to the case of λ = 0 first:

X(0) = 0 ⇒ k₁ + k₂(0) = 0 ⇒ k₁ = 0,    (sturm.7)
X(L) = 0 ⇒ k₁ + k₂(L) = 0 ⇒ k₂ = −k₁/L.    (sturm.8)

Together these imply k₁ = k₂ = 0, which gives the trivial solution X(x) = 0, in which we aren't interested. We say, then, that for nontrivial solutions, λ ≠ 0. Now let's check λ < 0. The solution becomes

X(x) = k₁e^(−√|λ| x) + k₂e^(√|λ| x)    (sturm.9)
     = k₃ cosh(√|λ| x) + k₄ sinh(√|λ| x),    (sturm.10)

where k₃ and k₄ are real constants. Again applying the boundary conditions,

X(0) = 0 ⇒ k₃ cosh(0) + k₄ sinh(0) = 0 ⇒ k₃ + 0 = 0 ⇒ k₃ = 0,
X(L) = 0 ⇒ 0·cosh(√|λ| L) + k₄ sinh(√|λ| L) = 0 ⇒ k₄ sinh(√|λ| L) = 0.

However, sinh(√|λ| L) ≠ 0 for L > 0, so k₄ = k₃ = 0: again, the trivial solution. Now


let's try λ > 0. The solution can be written

X(x) = k₅ cos(√λ x) + k₆ sin(√λ x).    (sturm.11)

Applying the boundary conditions for this case,

X(0) = 0 ⇒ k₅ cos(0) + k₆ sin(0) = 0 ⇒ k₅ + 0 = 0 ⇒ k₅ = 0,
X(L) = 0 ⇒ 0·cos(√λ L) + k₆ sin(√λ L) = 0 ⇒ k₆ sin(√λ L) = 0.

Now, sin(√λ L) = 0 for

√λ L = nπ ⇒ λ = (nπ/L)²,  (n ∈ Z⁺).

Therefore the only nontrivial solutions that satisfy both the ODE and the boundary conditions are the eigenfunctions

X_n(x) = sin(√λ_n x)    (sturm.12a)
       = sin(nπx/L),    (sturm.12b)

with corresponding eigenvalues

λ_n = (nπ/L)².    (sturm.13)

Note that because λ > 0, λ₁ is the lowest eigenvalue.

Plotting the eigenfunctions

The following was generated from a Jupyter notebook with the following filename and kernel:

notebook filename: eigenfunctions_example_plot.ipynb
notebook kernel: python3

First, load some Python packages:


import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Set L = 1 and compute values for the first four eigenvalues lambda_n and eigenfunctions X_n:

L = 1
x = np.linspace(0, L, 100)
n = np.linspace(1, 4, 4, dtype=int)
lambda_n = (n*np.pi/L)**2
X_n = np.zeros([len(n), len(x)])
for i, n_i in enumerate(n):
    X_n[i] = np.sin(np.sqrt(lambda_n[i])*x)

Plot the eigenfunctions:

for i, n_i in enumerate(n):
    plt.plot(
        x, X_n[i],
        linewidth=2,
        label='n = '+str(n_i)
    )
plt.legend()
plt.show()  # display the plot

[Plot: the eigenfunctions X_n(x) for n = 1, 2, 3, 4 on x ∈ [0, 1]]


We see that the fourth of the S-L theorems appears true: n − 1 zeros of X_n exist on the open interval (0, 1).


separation of variables

linear

product solution

separable PDEs

pdeseparation PDE solution by separation of variables

We are now ready to learn one of the most important techniques for solving PDEs: separation of variables. It applies only to linear PDEs, since it will require the principle of superposition. Not all linear PDEs yield to this solution technique, but several that are important do.

The technique includes the following steps.

assume a product solution: Assume the solution can be written as a product solution u_p: the product of functions of each independent variable.

separate PDE: Substitute u_p into the PDE and rearrange such that at least one side of the equation has functions of a single independent variable. If this is possible, the PDE is called separable.

set equal to a constant: Each side of the equation depends on different independent variables; therefore, they must each equal the same constant, often called −λ.

repeat separation as needed: If there are more than two independent variables, there will be an ODE in the separated variable and a PDE (with one fewer variables) in the other independent variables. Attempt to separate the PDE until only ODEs remain.

solve each boundary value problem: Solve each boundary value problem ODE,


eigenfunctions

superposition

ignoring the initial conditions for now.

solve the time variable ODE: Solve for the general solution of the time variable ODE, sans initial conditions.

construct the product solution: Multiply the solution in each variable to construct the product solution u_p. If the boundary value problems were sturm-liouville, the product solution is a family of eigenfunctions, from which any function can be constructed via a generalized fourier series.

apply the initial condition: The product solutions individually usually do not meet the initial condition. However, a generalized fourier series of them nearly always does. Superposition tells us a linear combination of solutions to the PDE and boundary conditions is also a solution; the unique series that also satisfies the initial condition is the unique solution to the entire problem.

Example pdeseparation-1: 1D diffusion equation

Problem Statement

Consider the one-dimensional diffusion equation PDEᵃ

∂_t u(t, x) = k ∂²_xx u(t, x),    (separation.1)

with real constant k, with dirichlet boundary conditions on interval x ∈ [0, L],

u(t, 0) = 0,    (separation.2a)
u(t, L) = 0,    (separation.2b)

and with initial condition

u(0, x) = f(x),    (separation.3)

where f is some piecewise continuous function on [0, L].


a. For more on the diffusion or heat equation, see Haberman (Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 2.3), Kreyszig (Kreyszig, Advanced Engineering Mathematics, § 12.5), and Strauss (Strauss, Partial Differential Equations: An Introduction, § 2.3).

Problem Solution

Assume a product solution

First, we assume a product solution of the form u_p(t, x) = T(t)X(x), where T and X are unknown functions on t > 0 and x ∈ [0, L].

Separate PDE

Second, we substitute the product solution into Equation separation.1 and separate variables:

T′X = kTX″ ⇒    (separation.4)
T′/(kT) = X″/X.    (separation.5)

So it is separable. Note that we chose to group k with T, which was arbitrary but conventional.

Set equal to a constant

Since these two sides depend on different independent variables (t and x), they must equal the same constant, which we call −λ, so we have two ODEs:

T′/(kT) = −λ ⇒ T′ + λkT = 0,    (separation.6)
X″/X = −λ ⇒ X″ + λX = 0.    (separation.7)

Solve the boundary value problem

The latter of these equations with the boundary conditions (separation.2) is precisely the same sturm-liouville boundary value problem from Example pdesturm-1, which had


eigenfunctions

X_n(x) = sin(√λ_n x)    (separation.8a)
       = sin(nπx/L),    (separation.8b)

with corresponding (positive) eigenvalues

λ_n = (nπ/L)².    (separation.9)

Solve the time variable ODE

The time variable ODE is homogeneous and has the familiar general solution

T(t) = c e^(−kλt),    (separation.10)

with real constant c. However, the boundary value problem restricted values of λ to λ_n, so

T_n(t) = c e^(−k(nπ/L)²t).    (separation.11)

Construct the product solution

The product solution is

u_p(t, x) = T_n(t)X_n(x)
          = c e^(−k(nπ/L)²t) sin(nπx/L).    (separation.12)

This is a family of solutions that each satisfy only exotically specific initial conditions.

Apply the initial condition

The initial condition is u(0, x) = f(x). The eigenfunctions of the boundary value problem form a fourier series that satisfies the initial condition on the interval [0, L] if we extend f to be periodic and odd over x (Kreyszig, Advanced Engineering Mathematics, p. 550); we call the extension f*. The odd series synthesis can be written

f*(x) = ∑_{n=1}^∞ b_n sin(nπx/L),    (separation.13)


where the fourier analysis gives

b_n = (2/L) ∫_0^L f*(χ) sin(nπχ/L) dχ.    (separation.14)

So the complete solution is

u(t, x) = ∑_{n=1}^∞ b_n e^(−k(nπ/L)²t) sin(nπx/L).    (separation.15)

Notice this satisfies the PDE, the boundary conditions, and the initial condition.

Plotting solutions

If we want to plot solutions, we need to specify an initial condition u(0, x) = f*(x) over [0, L]. We can choose anything piecewise continuous, but for simplicity let's let

f(x) = 1,  (x ∈ [0, L]).

The odd periodic extension is an odd square wave. The integral (separation.14) gives

b_n = (2/(nπ))(1 − cos(nπ))
    = { 0, n even; 4/(nπ), n odd.    (separation.16)

Now we can write the solution as

u(t, x) = ∑_{n=1, n odd}^∞ (4/(nπ)) e^(−k(nπ/L)²t) sin(nπx/L).    (separation.17)

Plotting in Python

First, load some Python packages:

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex


Set k = L = 1 and sum values for the first N terms of the solution:

L = 1
k = 1
N = 100
x = np.linspace(0, L, 300)
t = np.linspace(0, 2*(L/np.pi)**2, 100)
u_n = np.zeros([len(t), len(x)])
for n in range(N):
    n = n + 1  # because index starts at 0
    if n % 2 == 0:  # even
        pass  # already initialized to zeros
    else:  # odd
        u_n += 4/(n*np.pi)*np.outer(
            np.exp(-k*(n*np.pi/L)**2*t),
            np.sin(n*np.pi/L*x)
        )

Let's first plot the initial condition:

p = plt.figure()
plt.plot(x, u_n[0])
plt.xlabel('space $x$')
plt.ylabel('$u(0,x)$')
plt.show()

[Plot: the partial-sum initial condition u(0, x) versus space x]


Now we plot the entire response:

p = plt.figure()
plt.contourf(t, x, u_n.T)
c = plt.colorbar()
c.set_label('$u(t,x)$')
plt.xlabel('time $t$')
plt.ylabel('space $x$')
plt.show()

[Contour plot: u(t, x) over time t and space x, with colorbar for u(t, x)]

We see the diffusive action proceeds as we expected.

Python code in this section was generated from a Jupyter notebook named pde_separation_example_01.ipynb with a python3 kernel.


wave equation

8. Kreyszig, Advanced Engineering Mathematics, § 12.9, 12.10.

9. Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 4.5, 7.3.

pdewave The 1D wave equation

The one-dimensional wave equation is the linear PDE

∂²_tt u(t, x) = c² ∂²_xx u(t, x),    (wave.1)

with real constant c. This equation models such phenomena as strings, fluids, sound, and light. It is subject to initial and boundary conditions and can be extended to multiple spatial dimensions. For 2D and 3D examples in rectangular and polar coordinates, see Kreyszig⁸ and Haberman⁹.

Example pdewave-1: vibrating string PDE solution by separation of variables

Problem Statement

Consider the one-dimensional wave equation PDE

∂²_tt u(t, x) = c² ∂²_xx u(t, x),    (wave.2)

with real constant c and with dirichlet boundary conditions on interval x ∈ [0, L],

u(t, 0) = 0 and u(t, L) = 0,    (wave.3a)

and with initial conditions (we need two because of the second time-derivative)

u(0, x) = f(x) and ∂_t u(0, x) = g(x),    (wave.4)

where f and g are some piecewise continuous functions on [0, L].

Problem Solution

Assume a product solution

First, we assume a product solution of the form u_p(t, x) = T(t)X(x), where T and X are unknown functions on t > 0 and x ∈ [0, L].


Separate PDE

Second, we substitute the product solution into Equation wave.2 and separate variables:

T″X = c²TX″ ⇒    (wave.5)
T″/(c²T) = X″/X.    (wave.6)

So it is separable. Note that we chose to group c with T, which was arbitrary but conventional.

Set equal to a constant

Since these two sides depend on different independent variables (t and x), they must equal the same constant, which we call −λ, so we have two ODEs:

T″/(c²T) = −λ ⇒ T″ + λc²T = 0,    (wave.7)
X″/X = −λ ⇒ X″ + λX = 0.    (wave.8)

Solve the boundary value problem

The latter of these equations with the boundary conditions (wave.3) is precisely the same sturm-liouville boundary value problem from Example pdesturm-1, which had eigenfunctions

X_n(x) = sin(√λ_n x)    (wave.9a)
       = sin(nπx/L),    (wave.9b)

with corresponding (positive) eigenvalues

λ_n = (nπ/L)².    (wave.10)

Solve the time variable ODE

The time variable ODE is homogeneous and

with λ restricted by the reals by the

boundary value problem has the familiar

general solution

T(t) = k1 cos(cradicλt) + k2 sin(c

radicλt) (wave11)


with real constants k1 and k2. However, the boundary value problem restricted the values of λ to λ_n, so

    T_n(t) = k1 cos(cnπt/L) + k2 sin(cnπt/L).    (wave12)

Construct the product solution

The product solution is

    u_p(t, x) = T_n(t)X_n(x)
              = k1 sin(nπx/L) cos(cnπt/L) + k2 sin(nπx/L) sin(cnπt/L).

This is a family of solutions that each satisfy only exotically specific initial conditions.
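As a quick symbolic check (ours, not part of the original development), sympy can verify that a member of this family satisfies the PDE and the dirichlet boundary conditions:

```python
import sympy as sp

t, x, c, L = sp.symbols('t x c L', positive=True)
n = sp.symbols('n', integer=True, positive=True)
k1, k2 = sp.symbols('k1 k2', real=True)

# one member of the product-solution family
u = (k1*sp.sin(n*sp.pi*x/L)*sp.cos(c*n*sp.pi*t/L)
     + k2*sp.sin(n*sp.pi*x/L)*sp.sin(c*n*sp.pi*t/L))

# residual of the PDE: u_tt - c^2 u_xx should vanish identically
residual = sp.diff(u, t, 2) - c**2*sp.diff(u, x, 2)
assert sp.simplify(residual) == 0

# dirichlet boundary conditions at x = 0 and x = L (sin(n pi) = 0 for integer n)
assert u.subs(x, 0) == 0
assert sp.simplify(u.subs(x, L)) == 0
```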

Apply the initial conditions

Recall that superposition tells us that any linear combination of the product solution is also a solution. Therefore,

    u(t, x) = Σ_{n=1}^∞ [a_n sin(nπx/L) cos(cnπt/L) + b_n sin(nπx/L) sin(cnπt/L)]    (wave13)

is a solution. If a_n and b_n are properly selected to satisfy the initial conditions, Equation wave13 will be the solution to the entire problem. Substituting t = 0 into our potential solution gives

    u(0, x) = Σ_{n=1}^∞ a_n sin(nπx/L)    (wave14a)
    ∂u/∂t|_{t=0} = Σ_{n=1}^∞ b_n (cnπ/L) sin(nπx/L).    (wave14b)

Let us extend f and g to be periodic and odd over x; we call the extensions f* and g*. From Equation wave14, the initial conditions are satisfied if

    f*(x) = Σ_{n=1}^∞ a_n sin(nπx/L)    (wave15a)
    g*(x) = Σ_{n=1}^∞ b_n (cnπ/L) sin(nπx/L).    (wave15b)


We identify these as two odd fourier syntheses. The corresponding fourier analyses are

    a_n = (2/L) ∫₀ᴸ f*(χ) sin(nπχ/L) dχ    (wave16a)
    b_n (cnπ/L) = (2/L) ∫₀ᴸ g*(χ) sin(nπχ/L) dχ.    (wave16b)

So the complete solution is Equation wave13 with components given by Equation wave16. Notice this satisfies the PDE, the boundary conditions, and the initial conditions.
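When the integrals of Equation wave16 lack closed forms, they can be approximated numerically. The following sketch (ours; the function name is illustrative) computes sine-series coefficients by the trapezoidal rule and checks them against a series with known coefficients:

```python
import numpy as np

def sine_coefficient(f, n, L=1.0, num=10001):
    """a_n = (2/L) * integral_0^L f(x) sin(n pi x/L) dx,
    approximated by the trapezoidal rule."""
    x = np.linspace(0.0, L, num)
    y = f(x)*np.sin(n*np.pi*x/L)
    return (2.0/L)*np.sum((y[:-1] + y[1:])*np.diff(x))/2.0

# check: f(x) = x(L - x) has a_n = 8L^2/(n^3 pi^3) for odd n, 0 for even n
L = 1.0
f = lambda x: x*(L - x)
a1 = sine_coefficient(f, 1, L)
print(a1)  # ≈ 8/pi^3 ≈ 0.2580
```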

Discussion

It can be shown that this series solution is equivalent to two traveling waves that are interfering (see Habermanᵃ and Kreyszigᵇ). This is convenient because computing the series solution exactly requires an infinite summation. We show in the following section that the approximation by partial summation is still quite good.

Choosing specific initial conditions

If we want to plot solutions, we need to specify initial conditions over [0, L]. Let's model a string being suddenly struck from rest as

    f(x) = 0    (wave17)
    g(x) = δ(x - ΔL)    (wave18)

where δ is the dirac delta distribution and Δ ∈ [0, 1] is a fraction of L such that ΔL is the location at which the string is struck. The odd periodic extension of g is an odd pulse train. The integrals of (wave16) give

    a_n = 0    (wave19a)
    b_n = (2/(cnπ)) ∫₀ᴸ δ(χ - ΔL) sin(nπχ/L) dχ
        = (2/(cnπ)) sin(nπΔ).  (sifting property)


Now we can write the solution as

    u(t, x) = Σ_{n=1}^∞ (2/(cnπ)) sin(nπΔ) sin(nπx/L) sin(cnπt/L).    (wave20)

Plotting in Python

First, load some Python packages:

import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Set c = L = 1 and sum values for the first N terms of the solution for some striking location Δ:

Delta = 0.1  # 0 <= Delta <= 1
L = 1
c = 1
N = 150
t = np.linspace(0, 30*(L/np.pi)**2, 100)
x = np.linspace(0, L, 150)
t_b, x_b = np.meshgrid(t, x)
u_n = np.zeros([len(x), len(t)])
for n in range(N):
    n = n + 1  # because the index starts at 0
    u_n += 4/(c*n*np.pi) * np.sin(n*np.pi*Delta) * \
        np.sin(c*n*np.pi/L*t_b) * np.sin(n*np.pi/L*x_b)

Let's first plot some early snapshots of the response:

import seaborn as sns
n_snaps = 7
sns.set_palette(
    sns.diverging_palette(240, 10, n=n_snaps, center='dark')
)
fig, ax = plt.subplots()
it = np.linspace(2, 77, n_snaps, dtype=int)
for i in range(len(it)):
    ax.plot(x, u_n[:, it[i]], label=f't = {t[it[i]]:.3f}')
lgd = ax.legend(
    bbox_to_anchor=(1.05, 1),
    loc='upper left',
)
plt.xlabel('space $x$')
plt.ylabel('$u(t,x)$')
plt.show()

[Figure: seven early snapshots of u(t, x) versus x ∈ [0, 1]]

Now we plot the entire response:

fig, ax = plt.subplots()
p = ax.contourf(t, x, u_n)
c = fig.colorbar(p, ax=ax)
c.set_label('$u(t,x)$')
plt.xlabel('time $t$')
plt.ylabel('space $x$')
plt.show()

[Figure: contour plot of u(t, x) over time t ∈ [0, 3] and space x ∈ [0, 1], with colorbar u(t, x)]

We see a wave develop and travel, reflecting and inverting off each boundary.

a. Haberman, Applied Partial Differential Equations with Fourier Series and Boundary Value Problems (Classic Version), § 4.4.
b. Kreyszig, Advanced Engineering Mathematics, § 12.2.
c. Python code in this section was generated from a Jupyter notebook named pde_separation_example_02.ipynb with a python3 kernel.


pdeex Exercises for Chapter 07

Exercise 071 (poltergeist)

The PDE of Example pdeseparation-1 can be used to describe the conduction of heat along a long, thin rod insulated along its length, where u(t, x) represents temperature. The initial and dirichlet boundary conditions in that example would be interpreted as an initial temperature distribution along the bar and fixed temperatures of the ends. Now consider the same PDE

    ∂u/∂t = k ∂²u/∂x²    (ex1)

with real constant k, now with neumann boundary conditions on interval x ∈ [0, L]

    ∂u/∂x|_{x=0} = 0 and ∂u/∂x|_{x=L} = 0    (ex2)

and with initial condition

    u(0, x) = f(x)    (ex3)

where f is some piecewise continuous function on [0, L]. This represents the complete insulation of the ends of the rod, such that no heat flows from the ends (or from anywhere else).

a. Assume a product solution, separate variables into X(x) and T(t), and set the separation constant to -λ.
b. Solve the boundary value problem for its eigenfunctions X_n and eigenvalues λ_n.
c. Solve for the general solution of the time variable ODE.
d. Write the product solution and apply the initial condition f(x) by constructing it from a generalized fourier series of the product solution.


e. Let L = k = 1 and f(x) = 100 - (200/L)|x - L/2|, as shown in Figure 071. Compute the solution series components. Plot the sum of the first 50 terms over x and t.

[Figure 071: initial condition for Exercise 071; f(x) rises linearly from 0 at x = 0 to 100 at x = L/2, then falls linearly back to 0 at x = L]

Exercise 072 (kathmandu)

Consider the free vibration of a uniform and relatively thin beam, pinned at each end, with modulus of elasticity E, second moment of cross-sectional area I, and mass-per-length µ. The PDE describing this is a version of the euler-bernoulli beam equation for vertical motion u:

    ∂²u/∂t² = -α² ∂⁴u/∂x⁴    (ex4)

with real constant α defined as

    α² = EI/µ.    (ex5)

Pinned supports fix vertical motion, such that we have boundary conditions on interval x ∈ [0, L]

    u(t, 0) = 0 and u(t, L) = 0.    (ex6a)

Additionally, pinned supports cannot provide a moment, so

    ∂²u/∂x²|_{x=0} = 0 and ∂²u/∂x²|_{x=L} = 0.    (ex6b)

Furthermore, consider the initial conditions

    u(0, x) = f(x) and ∂u/∂t|_{t=0} = 0    (ex7)

where f is some piecewise continuous function on [0, L].


a. Assume a product solution, separate variables into X(x) and T(t), and set the separation constant to -λ.
b. Solve the boundary value problem for its eigenfunctions X_n and eigenvalues λ_n. Assume real λ > 0 (it's true but tedious to show).
c. Solve for the general solution of the time variable ODE.
d. Write the product solution and apply the initial conditions by constructing it from a generalized fourier series of the product solution.
e. Let L = α = 1 and f(x) = sin(10πx/L), as shown in Figure 072. Compute the solution series components. Plot the sum of the first 50 terms over x and t.

[Figure 072: initial condition for Exercise 072; f(x) = sin(10πx/L), oscillating between -1 and 1 over [0, L]]

opt

Optimization


optgrad Gradient descent

Consider a multivariate function f: Rⁿ → R that represents some cost or value. This is called an objective function, and we often want to find an X ∈ Rⁿ that yields f's extremum: minimum or maximum, depending on whichever is desirable.

It is important to note, however, that some functions have no finite extremum, and other functions have multiple. Finding a global extremum is generally difficult; however, many good methods exist for finding a local extremum, an extremum for some region R ⊂ Rⁿ. The method explored here is called gradient descent. It will soon become apparent why it has this name.

Stationary points

Recall from basic calculus that a function f of a single variable has potential local extrema where df(x)/dx = 0. The multivariate version of this for multivariate function f is

    ∇f = 0.    (grad1)

A value X for which Equation grad1 holds is called a stationary point. However, as in the univariate case, a stationary point may not be a local extremum; in these cases, it is called a saddle point.

Consider the hessian matrix H with values, for independent variables x_i,

    H_ij = ∂²f/∂x_i∂x_j.    (grad2)


For a stationary point X, the second partial derivative test tells us if it is a local maximum, a local minimum, or a saddle point:

minimum: if H(X) is positive definite (all its eigenvalues are positive), X is a local minimum;
maximum: if H(X) is negative definite (all its eigenvalues are negative), X is a local maximum;
saddle: if H(X) is indefinite (it has both positive and negative eigenvalues), X is a saddle point.

These are sometimes called tests for concavity: minima occur where f is convex and maxima where f is concave (i.e., where -f is convex).
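This test is straightforward to automate. A minimal sketch (ours; the function name is illustrative) classifies a stationary point from the eigenvalues of its hessian; zero eigenvalues, for which the test is inconclusive, are not handled:

```python
import numpy as np

def classify_stationary_point(H):
    """Second partial derivative test via the eigenvalues of the
    (symmetric) hessian H; assumes no eigenvalue is zero."""
    w = np.linalg.eigvalsh(H)
    if np.all(w > 0):
        return 'local minimum'  # H positive definite
    if np.all(w < 0):
        return 'local maximum'  # H negative definite
    return 'saddle point'       # H indefinite

# f(x) = x1^2 + x2^2 has hessian 2I: a minimum at the origin
print(classify_stationary_point(np.diag([2.0, 2.0])))   # local minimum
# f(x) = x1^2 - x2^2 has hessian diag(2, -2): a saddle at the origin
print(classify_stationary_point(np.diag([2.0, -2.0])))  # saddle point
```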

It turns out, however, that solving Equation grad1 directly for stationary points is generally hard. Therefore, we will typically use an iterative technique for estimating them.

The gradient points the way

Although Equation grad1 isn't usually directly useful for computing stationary points, it suggests iterative techniques that are. Several techniques rely on the insight that the gradient points toward stationary points. Recall from Lecture vecsgrad that ∇f is a vector field that points in the direction of greatest increase in f.


Consider starting at some point x0 and wanting to move iteratively closer to a stationary point. If one is seeking a maximum of f, then choose x1 to be in the direction of ∇f; if one is seeking a minimum of f, then choose x1 to be opposite the direction of ∇f.

The question becomes: how far, α, should we go in (or opposite to) the direction of the gradient? Surely a too-small α will require more iterations, and a too-large α will lead to poor convergence or to missing minima altogether. This framing of the problem is called line search. There are a few common methods for choosing α, called the step size, some more computationally efficient than others.

Two methods for choosing the step size are described below. Both are framed as minimization methods, but changing the sign of the step turns them into maximization methods.

The classical method

Let

    g_k = ∇f(x_k),    (grad3)

the gradient at the algorithm's current estimate x_k of the minimum. The classical method of choosing α is to attempt to solve analytically for

    α_k = argmin_α f(x_k - α g_k).    (grad4)

This solution approximates the function f as one varies α alone. It is approximate because, as α varies, so should x. But even with α as the only variable, Equation grad4 may be


1. Kreyszig, Advanced Engineering Mathematics, § 22.1.
2. Jonathan Barzilai and Jonathan M. Borwein, "Two-Point Step Size Gradient Methods," IMA Journal of Numerical Analysis 8.1 (Jan. 1988), pp. 141-148, issn 0272-4979, doi 10.1093/imanum/8.1.141. This includes an innovative line search method.

difficult or impossible to solve. Nonetheless, this is sometimes called the "optimal" choice of α; here "optimality" refers not to practicality but to ideality. This method is rarely used to solve practical problems.

The algorithm of the classical gradient descent method can be summarized in the pseudocode of Algorithm 1. It is described further in Kreyszig.1

Algorithm 1: Classical gradient descent

1. procedure classical_minimizer(f, x0, T)
2.   while ‖δx‖ > T do    ▷ until threshold T is met
3.     g_k ← ∇f(x_k)
4.     α_k ← argmin_α f(x_k - α g_k)
5.     x_{k+1} ← x_k - α_k g_k
6.     δx ← x_{k+1} - x_k
7.     k ← k + 1
8.   end while
9.   return x_k    ▷ the threshold was reached
10. end procedure

The Barzilai and Borwein method

In practice, several non-classical methods are used for choosing step size α. Most of these construct criteria for step sizes that are too small and too large, and prescribe choosing some α that (at least in certain cases) must be in the sweet spot in between. Barzilai and Borwein2 developed such a prescription, which we now present.

Let ∆x_k = x_k - x_{k-1} and ∆g_k = g_k - g_{k-1}. This method minimizes ‖∆x - α∆g‖² by choosing

    α_k = (∆x_k · ∆g_k)/(∆g_k · ∆g_k).    (grad5)
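A minimal sketch (ours; the function name is illustrative) of Equation grad5, checked on a quadratic objective for which the step size has a known value:

```python
import numpy as np

def bb_step(x_k, x_km1, g_k, g_km1):
    """Barzilai and Borwein step size: (dx . dg)/(dg . dg)."""
    dx = x_k - x_km1
    dg = g_k - g_km1
    return dx.dot(dg)/dg.dot(dg)

# for f(x) = (1/2) x . A x the gradient is g = A x, so dg = A dx;
# when dx lies along an eigenvector of A with eigenvalue a, the
# formula gives alpha = 1/a, the exact minimizing step in that direction
A = np.diag([2.0, 10.0])
x_km1 = np.array([1.0, 0.0])
x_k = np.array([2.0, 0.0])
alpha = bb_step(x_k, x_km1, A @ x_k, A @ x_km1)
print(alpha)  # 0.5, i.e. the reciprocal eigenvalue 1/2
```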

3. Barzilai and Borwein, "Two-Point Step Size Gradient Methods."

The algorithm of this gradient descent method can be summarized in the pseudocode of Algorithm 2. It is described further in Barzilai and Borwein.3

Algorithm 2: Barzilai and Borwein gradient descent

1. procedure barzilai_minimizer(f, x0, T)
2.   while ‖δx‖ > T do    ▷ until threshold T is met
3.     g_k ← ∇f(x_k)
4.     ∆g_k ← g_k - g_{k-1}
5.     ∆x_k ← x_k - x_{k-1}
6.     α_k ← (∆x_k · ∆g_k)/(∆g_k · ∆g_k)
7.     x_{k+1} ← x_k - α_k g_k
8.     δx ← x_{k+1} - x_k
9.     k ← k + 1
10.  end while
11.  return x_k    ▷ the threshold was reached
12. end procedure

Example optgrad-1: Barzilai and Borwein gradient descent

Problem Statement

Consider the functions (a) f1: R² → R and (b) f2: R² → R defined as

    f1(x) = (x1 - 25)² + 13(x2 + 10)²    (grad6)
    f2(x) = (1/2) x · Ax - b · x    (grad7)

where

    A = [10  0
          0 20]  and    (grad8a)
    b = [1 1]ᵀ.    (grad8b)

Use the method of Barzilai and Borweinᵃ, starting at some x0, to find a minimum of each function.

a. Ibid.

Problem Solution

First, load some Python packages:

from sympy import *
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex
from tabulate import tabulate

We begin by writing a class gradient_descent_minto perform the gradient descent This is not

optimized for speed

class gradient_descent_min() A Barzilai and Borwein gradient descent class

Inputs f Python function of x variables x list of symbolic variables (eg [x1 x2]) x0 list of numeric initial guess of a min of f T step size threshold for stopping the descent

To execute the gradient descent call descend method

nb This is only for gradients in cartesiancoordinates Further work would be to implementthis in multiple or generalized coordinatesSee the grad method below for implementation

def __init__(selffxx0T)selff = fselfx = Array(x)selfx0 = nparray(x0)selfT = Tselfn = len(x0) size of xselfg = lambdify(xselfgrad(fx)numpy)selfxk = nparray(x0)selftable =

def descend(self) unpack variablesf = selffx = selfxx0 = selfx0T = selfTg = selfg initialize variablesN = 0

opt Optimization grad Gradient descent p 4

x_k = x0dx = 2T cant be zerox_km1 = 9x0-1 cant equal x0g_km1 = nparray(g(x_km1))N_max = 100 max iterationstable_data = [[Nx0nparray(g(x0))0]]while (dx gt T and N lt N_max) or N lt 1N += 1 increment indexg_k = nparray(g(x_k))dg_k = g_k - g_km1dx_k = x_k - x_km1alpha_k = dx_kdot(dg_k)dg_kdot(dg_k)x_km1 = x_k storex_k = x_k - alpha_kg_k savetable_dataappend([Nx_kg_kalpha_k])selfxk = npvstack((selfxkx_k)) store other variablesg_km1 = g_kdx = nplinalgnorm(x_k - x_km1) check

selftabulater(table_data)

def tabulater(selftable_data)npset_printoptions(precision=2)tabulateLATEX_ESCAPE_RULES=selftable[python] = tabulate(table_dataheaders=[Nx_kg_kalpha]

)selftable[latex] = tabulate(table_dataheaders=[

$N$$bmx_k$$bmg_k$$alpha$]tablefmt=latex_raw

)

def grad(selffx) cartesian coords gradientreturn derive_by_array(f(x)x)

First, consider f1:

var('x1 x2')
x = Array([x1, x2])
f1 = lambda x: (x[0]-25)**2 + 13*(x[1]+10)**2
gd = gradient_descent_min(f=f1, x=x, x0=[-50, 40], T=1e-8)

Perform the gradient descent:

gd.descend()

Print the interesting variables:

print(gd.table['python'])

  N  x_k              g_k                    alpha
---  ---------------  ---------------------  ---------
  0  [-50  40]        [-150 1300]            0
  1  [-43.65 -15.03]  [-150 1300]            0.0423296
  2  [-38.36 -10.  ]  [ -13.73 -130.74]      0.0384979
  3  [-33.11 -10.  ]  [-1.27e+02  1.24e-01]  0.041454
  4  [ 24.99 -10.  ]  [-1.16e+02 -9.62e-03]  0.499926
  5  [ 25.   -10.05]  [-0.02  0.12]          0.499999
  6  [ 25. -10.]      [-1.84e-08 -1.38e+00]  0.0385225
  7  [ 25. -10.]      [-1.70e-08  2.19e-03]  0.0384615
  8  [ 25. -10.]      [-1.57e-08  0.00e+00]  0.0384615

Now let's lambdify the function f1 so we can plot:

f1_lambda = lambdify((x1, x2), f1(x), 'numpy')

Now let's plot a contour plot with the gradient descent overlaid:

fig, ax = plt.subplots()
# contour plot
X1 = np.linspace(-100, 100, 100)
X2 = np.linspace(-50, 50, 100)
X1, X2 = np.meshgrid(X1, X2)
F1 = f1_lambda(X1, X2)
plt.contourf(X1, X2, F1)
# gradient descent plot
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.collections import LineCollection
xX1 = gd.xk[:, 0]
xX2 = gd.xk[:, 1]
points = np.array([xX1, xX2]).T.reshape(-1, 1, 2)
segments = np.concatenate(
    [points[:-1], points[1:]], axis=1
)
lc = LineCollection(
    segments, cmap=plt.get_cmap('Reds')
)
lc.set_array(np.linspace(0, 1, len(xX1)))  # color segs
lc.set_linewidth(3)
ax.autoscale(False)  # avoid the scatter changing lims
ax.add_collection(lc)
ax.scatter(
    xX1, xX2, zorder=1, marker='o',
    color=plt.cm.Reds(np.linspace(0, 1, len(xX1))),
    edgecolor='none'
)
plt.show()

[Figure: contour plot of f1 over x1 ∈ [-100, 100] and x2 ∈ [-50, 50] with the gradient descent steps overlaid, converging to the minimum at (25, -10)]

Now consider f2:

A = Matrix([[10, 0], [0, 20]])
b = Matrix([[1, 1]])
def f2(x):
    X = Array([x]).tomatrix().T
    return 1/2*X.dot(A*X) - b.dot(X)

gd = gradient_descent_min(f=f2, x=x, x0=[50, -40], T=1e-8)

Perform the gradient descent:

gd.descend()

Print the interesting variables:

print(gd.table['python'])

  N  x_k             g_k                    alpha
---  --------------  ---------------------  ---------
  0  [ 50 -40]       [ 499 -801]            0
  1  [17.58  12.04]  [ 499 -801]            0.0649741
  2  [ 8.07 -1.01]   [174.78 239.88]        0.0544221
  3  [3.62 0.17]     [ 79.66 -21.22]        0.0558582
  4  [ 0.49 -0.05]   [35.16  2.49]          0.0889491
  5  [0.1  0.14]     [ 3.89 -1.94]          0.0990201
  6  [0.1 0.  ]      [0.04 1.9 ]            0.0750849
  7  [0.1  0.05]     [ 0.01 -0.95]          0.050005
  8  [0.1  0.05]     [4.74e-03 9.58e-05]    0.0500012
  9  [0.1  0.05]     [ 2.37e-03 -2.38e-09]  0.0999186
 10  [0.1  0.05]     [1.93e-06 2.37e-09]    0.1
 11  [0.1  0.05]     [ 0.00e+00 -2.37e-09]  0.0999997

Now let's lambdify the function f2 so we can plot:

f2_lambda = lambdify((x1, x2), f2(x), 'numpy')

Now let's plot a contour plot with the gradient descent overlaid:

fig, ax = plt.subplots()
# contour plot
X1 = np.linspace(-100, 100, 100)
X2 = np.linspace(-50, 50, 100)
X1, X2 = np.meshgrid(X1, X2)
F2 = f2_lambda(X1, X2)
plt.contourf(X1, X2, F2)
# gradient descent plot
from mpl_toolkits.mplot3d import Axes3D
from matplotlib.collections import LineCollection
xX1 = gd.xk[:, 0]
xX2 = gd.xk[:, 1]
points = np.array([xX1, xX2]).T.reshape(-1, 1, 2)
segments = np.concatenate(
    [points[:-1], points[1:]], axis=1
)
lc = LineCollection(
    segments, cmap=plt.get_cmap('Reds')
)
lc.set_array(np.linspace(0, 1, len(xX1)))  # color segs
lc.set_linewidth(3)
ax.autoscale(False)  # avoid the scatter changing lims
ax.add_collection(lc)
ax.scatter(
    xX1, xX2, zorder=1, marker='o',
    color=plt.cm.Reds(np.linspace(0, 1, len(xX1))),
    edgecolor='none'
)
plt.show()

[Figure: contour plot of f2 over x1 ∈ [-100, 100] and x2 ∈ [-50, 50] with the gradient descent steps overlaid, converging to the minimum at (0.1, 0.05)]

Python code in this section was generated from a Jupyter notebook named gradient_descent.ipynb with a python3 kernel.


optlin Constrained linear optimization

Consider a linear objective function f: Rⁿ → R with variables x_i in vector x and coefficients c_i in vector c:

    f(x) = c · x    (lin1)

subject to the linear constraints (restrictions on the x_i)

    Ax ≤ a,    (lin2a)
    Bx = b, and    (lin2b)
    l ≤ x ≤ u,    (lin2c)

where A and B are constant matrices and a, b, l, u are constant vectors. This is one formulation of what is called a linear programming problem. Usually we want to maximize f over the constraints. Such problems frequently arise throughout engineering, for instance in manufacturing, transportation, operations, etc. They are called constrained because there are constraints on x; they are called linear because the objective function and the constraints are linear.

We call a pair (x, f(x)) for which x satisfies Equation lin2 a feasible solution. Of course, not every feasible solution is optimal: a feasible solution is optimal iff there exists no other feasible solution for which f is greater (assuming we're maximizing). We call the vector subspace of feasible solutions S ⊂ Rⁿ.

Feasible solutions form a polytope

Consider the effect of the constraints. Each of the equalities and inequalities defines a linear hyperplane in Rⁿ (i.e., a linear subspace of dimension n - 1), either as a boundary of S (inequality) or as a restriction of S to the hyperplane (equality). When joined, these hyperplanes are the boundary of S (equalities restrict S to lower dimension). So we see that each of the boundaries of S is flat, which makes S a polytope (in R², a polygon). What makes this especially interesting is that polytopes have vertices where the hyperplanes intersect. Solutions at the vertices are called basic feasible solutions.

Only the vertices matter

Our objective function f is linear, so for some constant h, f(x) = h defines a level set that is itself a hyperplane H in Rⁿ. If this hyperplane intersects S at a point x, (x, f(x) = h) is the corresponding solution. There are three possibilities when H intersects S:

1. H ∩ S is a vertex of S,
2. H ∩ S is a boundary hyperplane of S, or
3. H ∩ S slices through the interior of S.

However, this third option implies that there exists a level set G, corresponding to f(x) = g, such that G intersects S and g > h, so solutions on H ∩ S are not optimal. (We have not proven this, but it may be clear from our progression.) We conclude that either the first or second case must hold for optimal solutions, and notice that in both cases a (potentially optimal) solution occurs at at least one vertex. The key insight, then, is that an optimal solution occurs at a vertex of S.
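This insight can be verified by brute force on a small problem (our own toy example, not from the text): enumerate the intersections of every pair of constraint hyperplanes, keep the feasible intersections (the vertices), and evaluate f at each.

```python
import itertools
import numpy as np

# toy LP: maximize f(x) = c . x subject to A x <= a (the unit square)
c = np.array([1.0, 1.0])
A = np.array([[ 1.0,  0.0],   # x1 <= 1
              [ 0.0,  1.0],   # x2 <= 1
              [-1.0,  0.0],   # -x1 <= 0
              [ 0.0, -1.0]])  # -x2 <= 0
a = np.array([1.0, 1.0, 0.0, 0.0])

# in R^2, a vertex is the intersection of n = 2 boundary hyperplanes A_i x = a_i
vertices = []
for i, j in itertools.combinations(range(len(a)), 2):
    Aij = A[[i, j]]
    if abs(np.linalg.det(Aij)) < 1e-12:
        continue  # parallel hyperplanes: no intersection
    v = np.linalg.solve(Aij, a[[i, j]])
    if np.all(A @ v <= a + 1e-9):  # keep only feasible intersections
        vertices.append(v)

best = max(vertices, key=lambda v: c @ v)
print(best, c @ best)  # [1. 1.] 2.0
```

The optimum lands on the vertex (1, 1), as the argument above predicts.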

This means we don't need to search all of S, or even its boundary: we need only search the vertices. Helpful as this is, it still leaves a number of potentially optimal solutions that grows combinatorially with the number of constraints, usually still too many to search in a naive way. In Lecture optsimplex, this is mitigated by introducing a powerful searching method.

4. Another Python package, pulp (PuLP), is probably more popular for linear programming; however, we choose scipy.optimize because it has applications beyond linear programming.

optsimplex The simplex algorithm

The simplex algorithm (or "method") is an iterative technique for finding an optimal solution of the linear programming problem of Equations lin1 and lin2. The details of the algorithm are somewhat involved, but the basic idea is to start at a vertex of the feasible solution space S and traverse an edge of the polytope that leads to another vertex with a greater value of f. Then repeat this process until there is no neighboring vertex with a greater value of f, at which point the solution is guaranteed to be optimal.

Rather than present the details of the algorithm, we choose to show an example using Python. There have been some improvements on the original algorithm that have been implemented in many standard software packages, including Python's scipy package (Pauli Virtanen et al., "SciPy 1.0: Fundamental Algorithms for Scientific Computing in Python," arXiv e-prints, arXiv:1907.10121, July 2019), module scipy.optimize.4

Example optsimplex-1: simplex method using scipy.optimize

Problem Statement

Maximize the objective function

    f(x) = c · x    (simplex1a)

for x ∈ R² and

    c = [5 2]ᵀ    (simplex1b)

subject to constraints

    0 ≤ x1 ≤ 10    (simplex2a)
    -5 ≤ x2 ≤ 15    (simplex2b)
    4x1 + x2 ≤ 40    (simplex2c)
    x1 + 3x2 ≤ 35    (simplex2d)
    -8x1 - x2 ≥ -75.    (simplex2e)

Problem Solution

First, load some Python packages:

from scipy.optimize import linprog
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

Encoding the problem

Before we can use linprog, we must first encode Equations simplex1 and simplex2 into a form linprog will recognize. We begin with f, which we can write as c · x with the coefficients of c as follows:

c = [-5, -2]  # negative to find max

We've negated each constant because linprog minimizes f and we want to maximize f. Now let's encode the inequality constraints. We will write the left-hand-side coefficients in the matrix A and the right-hand-side values in vector a such that

    Ax ≤ a.    (simplex3)

Notice that one of our constraint inequalities is ≥ instead of ≤. We can flip this by multiplying the inequality by -1. We use simple lists to encode A and a:

A = [
    [4, 1],
    [1, 3],
    [8, 1],
]
a = [40, 35, 75]

Now we need to define the lower l and upper u bounds of x. The function linprog expects these to be in a single list of the lower and upper bounds of each x_i:

lu = [
    (0, 10),
    (-5, 15),
]

We want to keep track of each step linprog takes. We can access these by defining a function callback to be passed to linprog:

x = []  # for storing the steps
def callback(res):  # called at each step
    global x
    print(f'nit = {res.nit}, x_k = {res.x}')
    x.append(res.x.copy())  # store

Now we need to call linprog. We don't have any equality constraints, so we need only use the keyword arguments A_ub=A and b_ub=a. For demonstration purposes, we tell it to use the simplex method, which is not as good as its other methods, which use better algorithms based on the simplex:

res = linprog(
    c, A_ub=A, b_ub=a, bounds=lu,
    method='simplex', callback=callback,
)
x = np.array(x)

nit = 0, x_k = [ 0. -5.]
nit = 1, x_k = [10. -5.]
nit = 2, x_k = [8.75 5.  ]
nit = 3, x_k = [7.72727273 9.09090909]
nit = 4, x_k = [7.72727273 9.09090909]
nit = 5, x_k = [7.72727273 9.09090909]
nit = 5, x_k = [7.72727273 9.09090909]

So the optimal solution (x, f(x)) is as follows:

print(f'optimum x: {res.x}')
print(f'optimum f(x): {-res.fun}')

optimum x: [7.72727273 9.09090909]
optimum f(x): 56.81818181818182

The last point was repeated:

1. once because there was no adjacent vertex with greater f(x), and
2. twice because the algorithm calls callback twice on the last step.

Plotting

When the solution space is in R², it is helpful to graphically represent the solution space, constraints, and the progression of the algorithm. We begin by defining the inequality lines from A and a over the bounds of x1:

n = len(c)  # number of variables x
m = np.shape(A)[0]  # number of inequality constraints
x2 = np.empty([m, 2])
for i in range(0, m):
    x2[i] = -A[i][0]/A[i][1]*np.array(lu[0]) + a[i]/A[i][1]

Now we plot a contour plot of f over the bounds of x1 and x2 and overlay the inequality constraints and the steps of the algorithm stored in x:

lu_array = np.array(lu)
fig, ax = plt.subplots()
plt.rcParams['lines.linewidth'] = 3
# contour plot
X1 = np.linspace(*lu_array[0], 100)
X2 = np.linspace(*lu_array[1], 100)
X1, X2 = np.meshgrid(X1, X2)
F2 = -c[0]*X1 + -c[1]*X2  # negative because max hack
con = ax.contourf(X1, X2, F2)
cbar = fig.colorbar(con, ax=ax)
cbar.ax.set_ylabel('objective function')
# bounds on x
un = np.array([1, 1])
opts = {'c': 'w', 'label': None, 'linewidth': 6}
plt.plot(lu_array[0], lu_array[1, 0]*un, **opts)
plt.plot(lu_array[0], lu_array[1, 1]*un, **opts)
plt.plot(lu_array[0, 0]*un, lu_array[1], **opts)
plt.plot(lu_array[0, 1]*un, lu_array[1], **opts)
# inequality constraints
for i in range(0, m):
    p = plt.plot(lu[0], x2[i], c='w')
p[0].set_label('constraint')
# steps
plt.plot(
    x[:, 0], x[:, 1], '-o', c='r', clip_on=False,
    zorder=20, label='simplex',
)
plt.ylim(lu_array[1])
plt.xlabel('$x_1$')
plt.ylabel('$x_2$')
plt.legend()
plt.show()

[Figure: contour plot of the objective function over x1 ∈ [0, 10] and x2 ∈ [-5, 15], with the constraint lines (white) and the simplex steps (red) overlaid]

Python code in this section was generated from a Jupyter notebook named simplex_linear_programming.ipynb with a python3 kernel.

5. Barzilai and Borwein, "Two-Point Step Size Gradient Methods."

optex Exercises for Chapter 08

Exercise 081 (cummerbund)

Consider the functions (a) f1: R² → R and (b) f2: R² → R defined as

    f1(x) = 4(x1 - 16)² + (x2 + 64)² + x1 sin²(x1)    (ex1)
    f2(x) = (1/2) x · Ax - b · x    (ex2)

where

    A = [5  0
         0 15]  and    (ex3a)
    b = [-2 1]ᵀ.    (ex3b)

Use the method of Barzilai and Borwein,5 starting at some x0, to find a minimum of each function.

Exercise 082 (melty)

Maximize the objective function

    f(x) = c · x    (ex4a)

for x ∈ R³ and

    c = [3 -8 1]ᵀ    (ex4b)

subject to constraints

    0 ≤ x1 ≤ 20    (ex5a)
    -5 ≤ x2 ≤ 0    (ex5b)
    5 ≤ x3 ≤ 17    (ex5c)
    x1 + 4x2 ≤ 50    (ex5d)
    2x1 + x3 ≤ 43    (ex5e)
    -4x1 + x2 - 5x3 ≥ -99.    (ex5f)

1. As is customary, we frequently say "system" when we mean "mathematical system model." Recall that multiple models may be used for any given physical system, depending on what one wants to know.

nlin

Nonlinear analysis

1. The ubiquity of near-linear systems, and the tools we have for analyses thereof, can sometimes give the impression that nonlinear systems are exotic or even downright flamboyant. However, a great many systems1 important for a mechanical engineer are frequently hopelessly nonlinear. Here are some examples of such systems:

• A robot arm
• Viscous fluid flow (usually modelled by the navier-stokes equations)
• ______________________
• Anything that "fills up" or "saturates"
• Nonlinear optics
• Einstein's field equations (gravitation in general relativity)
• Heat radiation and nonlinear heat conduction
• Fracture mechanics
• ______________________

2. S.H. Strogatz and M. Dichter, Nonlinear Dynamics and Chaos, Second, Studies in Nonlinearity, Avalon Publishing, 2016, isbn 9780813350844.
3. W. Richard Kolk and Robert A. Lerman, Nonlinear System Dynamics, 1st ed., Springer US, 1993, isbn 978-1-4684-6496-2.

2 Lest we think this is merely an

inconvenience we should keep in mind that

it is actually the nonlinearity that makes

many phenomena useful For instance the

depends on the nonlinearity

of its optics Similarly transistors and the

digital circuits made thereby (including the

microprocessor) wouldnrsquot function if their

physics were linear

3 In this chapter we will see some ways

to formulate characterize and simulate

nonlinear systems Purely

are few for nonlinear systems

Most are beyond the scope of this text but we

describe a few mostly in Lecture nlinchar

Simulation via numerical integration of

nonlinear dynamical equations is the most

accessible technique so it is introduced

4. We skip a discussion of linearization; of course, if this is an option, it is preferable. Instead, we focus on the ________.

5. For a good introduction to nonlinear dynamics, see Strogatz and Dichter.2 A more engineer-oriented introduction is Kolk and Lerman.3


4. Strogatz and Dichter, Nonlinear Dynamics and Chaos.


nlinss Nonlinear state-space models

1. A state-space model has the general form

dx/dt = f(x, u, t) (ss1a)
y = g(x, u, t) (ss1b)

where f and g are vector-valued functions that depend on the system. Nonlinear state-space models are those for which f is a nonlinear function of either x or u. For instance, a state variable x1 might appear as x1², or two state variables might combine as x1 x2, or an input u1 might enter the equations as log u1.
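As a concrete sketch (a hypothetical example, not one from this text), a damped pendulum with applied torque u has a state equation whose sin term makes f a nonlinear function of the state; the parameter values below are assumptions for illustration:

```python
import numpy as np

def f_pendulum(t, x, u):
    """State equation for a damped pendulum with applied torque u.
    x[0] is the angle, x[1] the angular rate; sin(x[0]) makes it nonlinear."""
    g_over_l = 9.81  # assumed: g/l for a 1 m pendulum, in 1/s^2
    b = 0.1          # assumed damping coefficient
    return np.array([
        x[1],                                 # d(angle)/dt
        -g_over_l*np.sin(x[0]) - b*x[1] + u,  # d(rate)/dt
    ])

print(f_pendulum(0.0, [0.1, 0.0], 0.0))
```

Near the downward equilibrium, sin(x[0]) ≈ x[0] and the familiar linear model is recovered.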

Autonomous and nonautonomous systems

2. An autonomous system is one for which dx/dt = f(x), with neither time nor input appearing explicitly. A nonautonomous system is one for which either t or u does appear explicitly in f. It turns out that we can always write nonautonomous systems as autonomous by substituting in u(t) and introducing an extra state variable for t.4

3. Therefore, without loss of generality, we will focus on ways of analyzing autonomous systems.
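This augmentation can be sketched explicitly. Assuming time starts at t = 0, define an extra state x_{n+1} = t, so that its derivative is one:

```latex
\frac{\mathrm{d}}{\mathrm{d}t}
\begin{bmatrix} x \\ x_{n+1} \end{bmatrix}
=
\begin{bmatrix}
  f\big(x,\ u(x_{n+1}),\ x_{n+1}\big) \\
  1
\end{bmatrix},
\qquad x_{n+1}(0) = 0,
```

which has no explicit t or u and is therefore autonomous.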

Equilibrium

4. An equilibrium state (also called a stationary point) x̄ is one for which dx/dt = 0. In most cases, this occurs only when the input u is a constant ū and, for time-varying systems, at a given time t̄. For autonomous systems, equilibrium occurs when the following holds:

f(x̄) = 0. (ss2)

This is a system of nonlinear algebraic equations, which can be challenging to solve for x̄. However, frequently several solutions (that is, equilibrium states) do exist.
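Since no general closed-form method exists, such equations are often solved numerically. A minimal sketch, using a hand-rolled Newton iteration on an illustrative cubic (the function and the starting guesses are assumptions for the example):

```python
def newton(f, fprime, x0, tol=1e-12, max_iter=100):
    """Basic scalar Newton iteration for f(x) = 0."""
    x = x0
    for _ in range(max_iter):
        step = f(x) / fprime(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# equilibria of the illustrative cubic f(x) = x - x**3 (three real roots)
f = lambda x: x - x**3
fp = lambda x: 1 - 3*x**2
equilibria = sorted(newton(f, fp, x0) for x0 in (-2.0, 0.1, 2.0))
print(equilibria)  # three equilibria, near -1, 0, and 1
```

Each starting guess converges to a different equilibrium, which is typical: a root-finder must be seeded near each solution sought.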


nlinchar Nonlinear system characteristics

1. Characterizing nonlinear systems can be challenging without the tools developed for linear system characterization. However, there are ways of characterizing nonlinear systems, and we'll here explore a few. (See Equation ss1a.)

Those in common with linear systems

2. As with linear systems, the system order is either the number of state variables required to describe the system or, equivalently, the highest-order derivative in a single scalar differential equation describing the system.

3. Similarly, nonlinear systems can have state variables that depend on time alone or those that also depend on space (or some other independent variable). The former lead to ordinary differential equations (ODEs) and the latter to partial differential equations (PDEs).

4. Equilibrium was already considered in Lecture nlinss.

Stability

5. In terms of system performance, perhaps no other criterion is as important as stability.

5. William L. Brogan. Modern Control Theory. Third ed. Prentice Hall, 1991. Ch. 10.

6. A. Choukchou-Braham et al. Analysis and Control of Underactuated Mechanical Systems. SpringerLink Bücher. Springer International Publishing, 2013. isbn: 9783319026367. App. A.

Definition nlinchar1: Stability

If x is perturbed from an equilibrium state x̄, the response x(t) can

1. asymptotically return to x̄ (asymptotically stable),
2. diverge from x̄ (unstable), or
3. remain perturbed or oscillate about x̄ with a constant amplitude (marginally stable).

Notice that this definition is actually local: stability in the neighborhood of one equilibrium may not be the same as in the neighborhood of another.

6. Other than nonlinear systems' lack of linear systems' eigenvalues, poles, and roots of the characteristic equation from which to compute it, the primary difference between the stability of linear and nonlinear systems is that nonlinear system stability is often difficult to establish. Using a linear system's eigenvalues, it is straightforward to establish stable, unstable, and marginally stable subspaces of state-space (via transforming to an eigenvector basis). For nonlinear systems, no such method exists. However, we are not without tools to explore nonlinear system stability. One mathematical tool to consider is Lyapunov stability theory, which is beyond the scope of this course but has good treatments in Brogan5 and Choukchou-Braham et al.6

Qualities of equilibria

7. Equilibria (i.e., stationary points) come in a variety of qualities. It is instructive to consider the first-order differential equation in state variable x with real constant r:

x′ = rx - x³. (char1)

If we plot x′ versus x for different values of r, we obtain the plots of Figure 09.1.

Figure 09.1: plots of x′ versus x for Equation char1, for (a) r < 0, (b) r = 0, (c) r > 0.
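The equilibria of Equation char1 can also be found symbolically; a sketch with SymPy (assuming r > 0 so all three roots are real):

```python
import sympy as sp

x = sp.symbols('x', real=True)
r = sp.symbols('r', positive=True)

# equilibria of x' = r*x - x**3 are the roots of r*x - x**3 = 0
equilibria = sp.solve(r*x - x**3, x)
print(equilibria)  # 0 and ±sqrt(r)
```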

8. By definition, equilibria occur when x′ = 0, so the x-axis crossings of Figure 09.1 are equilibria. The blue arrows on the x-axis show the direction of state change x′ quantified by the plots. For both (a) and (b), only one equilibrium exists: x = 0. Note that the blue arrows in both plots point toward the equilibrium. In such cases, that is, when a neighborhood exists around an equilibrium for which state changes point toward the equilibrium, the equilibrium is called an attractor or sink. Note that attractors are stable.

9. Now consider (c) of Figure 09.1. When r > 0, three equilibria emerge. This change of the number of equilibria with the changing of a parameter is called a bifurcation.


A plot of bifurcations versus the parameter is called a bifurcation diagram. The x = 0 equilibrium now has arrows that point away from it. Such an equilibrium is called a repeller or source and is unstable. The other two equilibria here are (stable) attractors. Consider a very small initial condition x(0) = ε. If ε > 0, the repeller pushes away x and the positive attractor pulls x to itself. Conversely, if ε < 0, the repeller again pushes away x and the negative attractor pulls x to itself.

10. Another type of equilibrium is called the saddle, one which acts as an attractor along some lines and as a repeller along others. We will see this type in the following example.

Example nlinchar-1: Saddle bifurcation

Problem Statement

Consider the dynamical equation

x′ = x² + r (char2)

with r a real constant. Sketch x′ vs. x for negative, zero, and positive r. Identify and classify each of the equilibria.
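One way to check such a classification numerically is to examine the sign of the slope d(x′)/dx at each equilibrium; a sketch (the classify helper is hypothetical, introduced here for illustration):

```python
import math

def classify(r):
    """Classify equilibria of x' = x**2 + r by the sign of the slope
    d(x')/dx = 2*x there: negative slope means an attractor, positive
    a repeller, and zero a degenerate (half-stable) point."""
    if r > 0:
        return []  # x**2 + r > 0 everywhere: no equilibria
    s = math.sqrt(-r)
    roots = [0.0] if s == 0 else [-s, s]
    labels = []
    for xbar in roots:
        slope = 2*xbar
        if slope < 0:
            labels.append((xbar, 'attractor'))
        elif slope > 0:
            labels.append((xbar, 'repeller'))
        else:
            labels.append((xbar, 'half-stable'))
    return labels

print(classify(-1.0))  # [(-1.0, 'attractor'), (1.0, 'repeller')]
print(classify(0.0))   # [(0.0, 'half-stable')]
print(classify(1.0))   # []
```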


nlinsim Nonlinear system simulation


nlinpysim Simulating nonlinear systems in Python

Example nlinpysim-1: a nonlinear unicycle

Problem Statement

Problem Solution

First, load some Python packages:

import numpy as np
import sympy as sp
from scipy.integrate import solve_ivp
import matplotlib.pyplot as plt
from IPython.display import display, Markdown, Latex

The state equation can be encoded via the following function f:

def f(t, x, u, c):
    dxdt = [
        x[3]*np.cos(x[2]),
        x[3]*np.sin(x[2]),
        x[4],
        1/c[0] * u(t)[0],
        1/c[1] * u(t)[1],
    ]
    return dxdt

The input function u must also be defined:

def u(t):
    return [
        1.5*(1 + np.cos(t)),
        2.5*np.sin(3*t),
    ]

Define the time span, initial values, and constants:

tspan = np.linspace(0, 50, 300)
xinit = [0, 0, 0, 0, 0]
mass = 1.0
inertia = 1.0
c = [mass, inertia]

Solve the differential equation:

sol = solve_ivp(
    lambda t, x: f(t, x, u, c),
    [tspan[0], tspan[-1]],
    xinit,
    t_eval=tspan,
)

Let's first plot the trajectory and instantaneous velocity.

Figure 09.2: trajectory (x, y) with instantaneous velocity vectors.

xp = sol.y[3]*np.cos(sol.y[2])
yp = sol.y[3]*np.sin(sol.y[2])
p = plt.figure()
plt.plot(sol.y[0], sol.y[1])
plt.quiver(sol.y[0], sol.y[1], xp, yp)
plt.xlabel('$x$')
plt.ylabel('$y$')
plt.show()

Python code in this section was generated from a Jupyter notebook named horcrux.ipynb with a python3 kernel.


nlinex Exercises for Chapter 09

A

Distribution tables


A01 Gaussian distribution table

Below are plots of the Gaussian probability density function f and cumulative distribution function Φ. Below them is Table A1 of CDF values.
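Table values like these can be spot-checked with the error function from the standard library, since Φ(z) = (1 + erf(z/√2))/2:

```python
import math

def Phi(z):
    """Standard normal CDF: Phi(z) = (1 + erf(z/sqrt(2)))/2."""
    return 0.5*(1.0 + math.erf(z/math.sqrt(2.0)))

# spot-checks against Table A1
print(round(Phi(0.0), 4))    # 0.5
print(round(Phi(1.96), 4))   # 0.975
print(round(Phi(-3.49), 4))  # 0.0002
```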

Figure A1: (a) the Gaussian PDF f(z); (b) the Gaussian CDF Φ(zb), for z-scores.

Table A1: z-score table, Φ(zb) = P(z ∈ (-∞, zb])

zb    .00    .01    .02    .03    .04    .05    .06    .07    .08    .09

-3.4  0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0003 0.0002
-3.3  0.0005 0.0005 0.0005 0.0004 0.0004 0.0004 0.0004 0.0004 0.0004 0.0003
-3.2  0.0007 0.0007 0.0006 0.0006 0.0006 0.0006 0.0006 0.0005 0.0005 0.0005
-3.1  0.0010 0.0009 0.0009 0.0009 0.0008 0.0008 0.0008 0.0008 0.0007 0.0007
-3.0  0.0013 0.0013 0.0013 0.0012 0.0012 0.0011 0.0011 0.0011 0.0010 0.0010
-2.9  0.0019 0.0018 0.0018 0.0017 0.0016 0.0016 0.0015 0.0015 0.0014 0.0014
-2.8  0.0026 0.0025 0.0024 0.0023 0.0023 0.0022 0.0021 0.0021 0.0020 0.0019
-2.7  0.0035 0.0034 0.0033 0.0032 0.0031 0.0030 0.0029 0.0028 0.0027 0.0026
-2.6  0.0047 0.0045 0.0044 0.0043 0.0041 0.0040 0.0039 0.0038 0.0037 0.0036
-2.5  0.0062 0.0060 0.0059 0.0057 0.0055 0.0054 0.0052 0.0051 0.0049 0.0048
-2.4  0.0082 0.0080 0.0078 0.0075 0.0073 0.0071 0.0069 0.0068 0.0066 0.0064
-2.3  0.0107 0.0104 0.0102 0.0099 0.0096 0.0094 0.0091 0.0089 0.0087 0.0084
-2.2  0.0139 0.0136 0.0132 0.0129 0.0125 0.0122 0.0119 0.0116 0.0113 0.0110
-2.1  0.0179 0.0174 0.0170 0.0166 0.0162 0.0158 0.0154 0.0150 0.0146 0.0143
-2.0  0.0228 0.0222 0.0217 0.0212 0.0207 0.0202 0.0197 0.0192 0.0188 0.0183
-1.9  0.0287 0.0281 0.0274 0.0268 0.0262 0.0256 0.0250 0.0244 0.0239 0.0233
-1.8  0.0359 0.0351 0.0344 0.0336 0.0329 0.0322 0.0314 0.0307 0.0301 0.0294
-1.7  0.0446 0.0436 0.0427 0.0418 0.0409 0.0401 0.0392 0.0384 0.0375 0.0367
-1.6  0.0548 0.0537 0.0526 0.0516 0.0505 0.0495 0.0485 0.0475 0.0465 0.0455


-1.5  0.0668 0.0655 0.0643 0.0630 0.0618 0.0606 0.0594 0.0582 0.0571 0.0559
-1.4  0.0808 0.0793 0.0778 0.0764 0.0749 0.0735 0.0721 0.0708 0.0694 0.0681
-1.3  0.0968 0.0951 0.0934 0.0918 0.0901 0.0885 0.0869 0.0853 0.0838 0.0823
-1.2  0.1151 0.1131 0.1112 0.1093 0.1075 0.1056 0.1038 0.1020 0.1003 0.0985
-1.1  0.1357 0.1335 0.1314 0.1292 0.1271 0.1251 0.1230 0.1210 0.1190 0.1170
-1.0  0.1587 0.1562 0.1539 0.1515 0.1492 0.1469 0.1446 0.1423 0.1401 0.1379
-0.9  0.1841 0.1814 0.1788 0.1762 0.1736 0.1711 0.1685 0.1660 0.1635 0.1611
-0.8  0.2119 0.2090 0.2061 0.2033 0.2005 0.1977 0.1949 0.1922 0.1894 0.1867
-0.7  0.2420 0.2389 0.2358 0.2327 0.2296 0.2266 0.2236 0.2206 0.2177 0.2148
-0.6  0.2743 0.2709 0.2676 0.2643 0.2611 0.2578 0.2546 0.2514 0.2483 0.2451
-0.5  0.3085 0.3050 0.3015 0.2981 0.2946 0.2912 0.2877 0.2843 0.2810 0.2776
-0.4  0.3446 0.3409 0.3372 0.3336 0.3300 0.3264 0.3228 0.3192 0.3156 0.3121
-0.3  0.3821 0.3783 0.3745 0.3707 0.3669 0.3632 0.3594 0.3557 0.3520 0.3483
-0.2  0.4207 0.4168 0.4129 0.4090 0.4052 0.4013 0.3974 0.3936 0.3897 0.3859
-0.1  0.4602 0.4562 0.4522 0.4483 0.4443 0.4404 0.4364 0.4325 0.4286 0.4247
-0.0  0.5000 0.4960 0.4920 0.4880 0.4840 0.4801 0.4761 0.4721 0.4681 0.4641
 0.0  0.5000 0.5040 0.5080 0.5120 0.5160 0.5199 0.5239 0.5279 0.5319 0.5359
 0.1  0.5398 0.5438 0.5478 0.5517 0.5557 0.5596 0.5636 0.5675 0.5714 0.5753
 0.2  0.5793 0.5832 0.5871 0.5910 0.5948 0.5987 0.6026 0.6064 0.6103 0.6141
 0.3  0.6179 0.6217 0.6255 0.6293 0.6331 0.6368 0.6406 0.6443 0.6480 0.6517
 0.4  0.6554 0.6591 0.6628 0.6664 0.6700 0.6736 0.6772 0.6808 0.6844 0.6879
 0.5  0.6915 0.6950 0.6985 0.7019 0.7054 0.7088 0.7123 0.7157 0.7190 0.7224
 0.6  0.7257 0.7291 0.7324 0.7357 0.7389 0.7422 0.7454 0.7486 0.7517 0.7549
 0.7  0.7580 0.7611 0.7642 0.7673 0.7704 0.7734 0.7764 0.7794 0.7823 0.7852
 0.8  0.7881 0.7910 0.7939 0.7967 0.7995 0.8023 0.8051 0.8078 0.8106 0.8133
 0.9  0.8159 0.8186 0.8212 0.8238 0.8264 0.8289 0.8315 0.8340 0.8365 0.8389
 1.0  0.8413 0.8438 0.8461 0.8485 0.8508 0.8531 0.8554 0.8577 0.8599 0.8621
 1.1  0.8643 0.8665 0.8686 0.8708 0.8729 0.8749 0.8770 0.8790 0.8810 0.8830
 1.2  0.8849 0.8869 0.8888 0.8907 0.8925 0.8944 0.8962 0.8980 0.8997 0.9015
 1.3  0.9032 0.9049 0.9066 0.9082 0.9099 0.9115 0.9131 0.9147 0.9162 0.9177
 1.4  0.9192 0.9207 0.9222 0.9236 0.9251 0.9265 0.9279 0.9292 0.9306 0.9319
 1.5  0.9332 0.9345 0.9357 0.9370 0.9382 0.9394 0.9406 0.9418 0.9429 0.9441
 1.6  0.9452 0.9463 0.9474 0.9484 0.9495 0.9505 0.9515 0.9525 0.9535 0.9545
 1.7  0.9554 0.9564 0.9573 0.9582 0.9591 0.9599 0.9608 0.9616 0.9625 0.9633
 1.8  0.9641 0.9649 0.9656 0.9664 0.9671 0.9678 0.9686 0.9693 0.9699 0.9706
 1.9  0.9713 0.9719 0.9726 0.9732 0.9738 0.9744 0.9750 0.9756 0.9761 0.9767
 2.0  0.9772 0.9778 0.9783 0.9788 0.9793 0.9798 0.9803 0.9808 0.9812 0.9817
 2.1  0.9821 0.9826 0.9830 0.9834 0.9838 0.9842 0.9846 0.9850 0.9854 0.9857
 2.2  0.9861 0.9864 0.9868 0.9871 0.9875 0.9878 0.9881 0.9884 0.9887 0.9890
 2.3  0.9893 0.9896 0.9898 0.9901 0.9904 0.9906 0.9909 0.9911 0.9913 0.9916
 2.4  0.9918 0.9920 0.9922 0.9925 0.9927 0.9929 0.9931 0.9932 0.9934 0.9936
 2.5  0.9938 0.9940 0.9941 0.9943 0.9945 0.9946 0.9948 0.9949 0.9951 0.9952
 2.6  0.9953 0.9955 0.9956 0.9957 0.9959 0.9960 0.9961 0.9962 0.9963 0.9964
 2.7  0.9965 0.9966 0.9967 0.9968 0.9969 0.9970 0.9971 0.9972 0.9973 0.9974
 2.8  0.9974 0.9975 0.9976 0.9977 0.9977 0.9978 0.9979 0.9979 0.9980 0.9981
 2.9  0.9981 0.9982 0.9982 0.9983 0.9984 0.9984 0.9985 0.9985 0.9986 0.9986
 3.0  0.9987 0.9987 0.9987 0.9988 0.9988 0.9989 0.9989 0.9989 0.9990 0.9990
 3.1  0.9990 0.9991 0.9991 0.9991 0.9992 0.9992 0.9992 0.9992 0.9993 0.9993
 3.2  0.9993 0.9993 0.9994 0.9994 0.9994 0.9994 0.9994 0.9995 0.9995 0.9995
 3.3  0.9995 0.9995 0.9995 0.9996 0.9996 0.9996 0.9996 0.9996 0.9996 0.9997
 3.4  0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9997 0.9998


A02 Student's t-distribution table

Table A2: two-tail inverse Student's t-distribution table

     percent probability
ν     60.0  66.7  75.0  80.0  87.5  90.0  95.0  97.5   99.0   99.5   99.9

1     0.325 0.577 1.000 1.376 2.414 3.078 6.314 12.706 31.821 63.657 318.31
2     0.289 0.500 0.816 1.061 1.604 1.886 2.920 4.303  6.965  9.925  22.327
3     0.277 0.476 0.765 0.978 1.423 1.638 2.353 3.182  4.541  5.841  10.215
4     0.271 0.464 0.741 0.941 1.344 1.533 2.132 2.776  3.747  4.604  7.173
5     0.267 0.457 0.727 0.920 1.301 1.476 2.015 2.571  3.365  4.032  5.893
6     0.265 0.453 0.718 0.906 1.273 1.440 1.943 2.447  3.143  3.707  5.208
7     0.263 0.449 0.711 0.896 1.254 1.415 1.895 2.365  2.998  3.499  4.785
8     0.262 0.447 0.706 0.889 1.240 1.397 1.860 2.306  2.896  3.355  4.501
9     0.261 0.445 0.703 0.883 1.230 1.383 1.833 2.262  2.821  3.250  4.297
10    0.260 0.444 0.700 0.879 1.221 1.372 1.812 2.228  2.764  3.169  4.144
11    0.260 0.443 0.697 0.876 1.214 1.363 1.796 2.201  2.718  3.106  4.025
12    0.259 0.442 0.695 0.873 1.209 1.356 1.782 2.179  2.681  3.055  3.930
13    0.259 0.441 0.694 0.870 1.204 1.350 1.771 2.160  2.650  3.012  3.852
14    0.258 0.440 0.692 0.868 1.200 1.345 1.761 2.145  2.624  2.977  3.787
15    0.258 0.439 0.691 0.866 1.197 1.341 1.753 2.131  2.602  2.947  3.733
16    0.258 0.439 0.690 0.865 1.194 1.337 1.746 2.120  2.583  2.921  3.686
17    0.257 0.438 0.689 0.863 1.191 1.333 1.740 2.110  2.567  2.898  3.646
18    0.257 0.438 0.688 0.862 1.189 1.330 1.734 2.101  2.552  2.878  3.610
19    0.257 0.438 0.688 0.861 1.187 1.328 1.729 2.093  2.539  2.861  3.579
20    0.257 0.437 0.687 0.860 1.185 1.325 1.725 2.086  2.528  2.845  3.552
21    0.257 0.437 0.686 0.859 1.183 1.323 1.721 2.080  2.518  2.831  3.527
22    0.256 0.437 0.686 0.858 1.182 1.321 1.717 2.074  2.508  2.819  3.505
23    0.256 0.436 0.685 0.858 1.180 1.319 1.714 2.069  2.500  2.807  3.485
24    0.256 0.436 0.685 0.857 1.179 1.318 1.711 2.064  2.492  2.797  3.467
25    0.256 0.436 0.684 0.856 1.178 1.316 1.708 2.060  2.485  2.787  3.450
26    0.256 0.436 0.684 0.856 1.177 1.315 1.706 2.056  2.479  2.779  3.435
27    0.256 0.435 0.684 0.855 1.176 1.314 1.703 2.052  2.473  2.771  3.421
28    0.256 0.435 0.683 0.855 1.175 1.313 1.701 2.048  2.467  2.763  3.408
29    0.256 0.435 0.683 0.854 1.174 1.311 1.699 2.045  2.462  2.756  3.396
30    0.256 0.435 0.683 0.854 1.173 1.310 1.697 2.042  2.457  2.750  3.385
35    0.255 0.434 0.682 0.852 1.170 1.306 1.690 2.030  2.438  2.724  3.340
40    0.255 0.434 0.681 0.851 1.167 1.303 1.684 2.021  2.423  2.704  3.307
45    0.255 0.434 0.680 0.850 1.165 1.301 1.679 2.014  2.412  2.690  3.281
50    0.255 0.433 0.679 0.849 1.164 1.299 1.676 2.009  2.403  2.678  3.261
55    0.255 0.433 0.679 0.848 1.163 1.297 1.673 2.004  2.396  2.668  3.245
60    0.254 0.433 0.679 0.848 1.162 1.296 1.671 2.000  2.390  2.660  3.232
∞     0.253 0.431 0.674 0.842 1.150 1.282 1.645 1.960  2.326  2.576  3.090

B

Fourier and Laplace tables


B01 Laplace transforms

Table B1 is a table with functions of time f(t) on the left and corresponding Laplace transforms F(s) on the right. Where applicable, s = σ + jω is the Laplace transform variable, T is the time-domain period, ω0 = 2π/T is the corresponding angular frequency, j = √-1, a ∈ R+, and b, t0 ∈ R are constants.

Table B1: Laplace transform identities

function of time t                                function of Laplace s
a1 f1(t) + a2 f2(t)                               a1 F1(s) + a2 F2(s)
f(t - t0)                                         F(s) e^(-t0 s)
f′(t)                                             s F(s) - f(0)
d^n f(t)/dt^n                                     s^n F(s) - s^(n-1) f(0) - s^(n-2) f′(0) - ··· - f^(n-1)(0)
∫_0^t f(τ) dτ                                     (1/s) F(s)
t f(t)                                            -F′(s)
f1(t) ∗ f2(t) = ∫_{-∞}^{∞} f1(τ) f2(t - τ) dτ     F1(s) F2(s)
δ(t)                                              1
us(t)                                             1/s
ur(t)                                             1/s²
t^(n-1)/(n - 1)!                                  1/s^n
e^(-at)                                           1/(s + a)
t e^(-at)                                         1/(s + a)²
t^(n-1) e^(-at)/(n - 1)!                          1/(s + a)^n
(1/(a - b))(e^(at) - e^(bt))                      1/((s - a)(s - b))   (a ≠ b)
(1/(a - b))(a e^(at) - b e^(bt))                  s/((s - a)(s - b))   (a ≠ b)
sin ωt                                            ω/(s² + ω²)
cos ωt                                            s/(s² + ω²)
e^(at) sin ωt                                     ω/((s - a)² + ω²)
e^(at) cos ωt                                     (s - a)/((s - a)² + ω²)


B02 Fourier transforms

Table B2 is a table with functions of time f(t) on the left and corresponding Fourier transforms F(ω) on the right. Where applicable, T is the time-domain period, ω0 = 2π/T is the corresponding angular frequency, j = √-1, a ∈ R+, and b, t0 ∈ R are constants. Furthermore, fe and fo are even and odd functions of time, respectively, and it can be shown that any function f can be written as the sum f(t) = fe(t) + fo(t) (Hsu1967).

Table B2: Fourier transform identities

function of time t                                function of frequency ω
a1 f1(t) + a2 f2(t)                               a1 F1(ω) + a2 F2(ω)
f(at)                                             (1/|a|) F(ω/a)
f(-t)                                             F(-ω)
f(t - t0)                                         F(ω) e^(-jωt0)
f(t) cos ω0t                                      (1/2) F(ω - ω0) + (1/2) F(ω + ω0)
f(t) sin ω0t                                      (1/j2) F(ω - ω0) - (1/j2) F(ω + ω0)
fe(t)                                             Re F(ω)
fo(t)                                             j Im F(ω)
F(t)                                              2π f(-ω)
f′(t)                                             jω F(ω)
d^n f(t)/dt^n                                     (jω)^n F(ω)
∫_{-∞}^t f(τ) dτ                                  (1/jω) F(ω) + π F(0) δ(ω)
-jt f(t)                                          F′(ω)
(-jt)^n f(t)                                      d^n F(ω)/dω^n
f1(t) ∗ f2(t) = ∫_{-∞}^{∞} f1(τ) f2(t - τ) dτ     F1(ω) F2(ω)
f1(t) f2(t)                                       (1/2π) F1(ω) ∗ F2(ω) = (1/2π) ∫_{-∞}^{∞} F1(α) F2(ω - α) dα
e^(-at) us(t)                                     1/(jω + a)
e^(-a|t|)                                         2a/(a² + ω²)
e^(-at²)                                          √(π/a) e^(-ω²/(4a))
1 for |t| < a/2, else 0                           a sin(aω/2)/(aω/2)
t e^(-at) us(t)                                   1/(a + jω)²
(t^(n-1)/(n - 1)!) e^(-at) us(t)                  1/(a + jω)^n
1/(a² + t²)                                       (π/a) e^(-a|ω|)
δ(t)                                              1
δ(t - t0)                                         e^(-jωt0)
us(t)                                             π δ(ω) + 1/(jω)
us(t - t0)                                        π δ(ω) + (1/jω) e^(-jωt0)
1                                                 2π δ(ω)
t                                                 2πj δ′(ω)
t^n                                               2π j^n d^n δ(ω)/dω^n
e^(jω0t)                                          2π δ(ω - ω0)
cos ω0t                                           π δ(ω - ω0) + π δ(ω + ω0)
sin ω0t                                           -jπ δ(ω - ω0) + jπ δ(ω + ω0)
us(t) cos ω0t                                     jω/(ω0² - ω²) + (π/2) δ(ω - ω0) + (π/2) δ(ω + ω0)
us(t) sin ω0t                                     ω0/(ω0² - ω²) + (π/2j) δ(ω - ω0) - (π/2j) δ(ω + ω0)
t us(t)                                           jπ δ′(ω) - 1/ω²
1/t                                               πj - 2πj us(ω)
1/t^n                                             ((-jω)^(n-1)/(n - 1)!)(πj - 2πj us(ω))
sgn t                                             2/(jω)
Σ_{n=-∞}^{∞} δ(t - nT)                            ω0 Σ_{n=-∞}^{∞} δ(ω - nω0)

C

Mathematics reference


C01 Quadratic forms

The solution to equations of the form ax² + bx + c = 0 is

x = (-b ± √(b² - 4ac)) / (2a). (C01.1)

Completing the square

This is accomplished by re-writing the quadratic in the form of the left-hand side (LHS) of this equality, which describes factorization:

x² + 2xh + h² = (x + h)². (C01.2)
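A quick numeric check of Equation C01.1 (the quadroots helper is hypothetical, introduced for illustration; real-discriminant case only):

```python
import math

def quadroots(a, b, c):
    """Roots of a*x**2 + b*x + c = 0 via the quadratic formula."""
    d = math.sqrt(b**2 - 4*a*c)  # assumes b**2 - 4*a*c >= 0
    return ((-b + d) / (2*a), (-b - d) / (2*a))

# x^2 - 5x + 6 = (x - 2)(x - 3)
print(quadroots(1, -5, 6))  # (3.0, 2.0)
```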


C02 Trigonometry

Triangle identities

With reference to the below figure, the law of sines is

sin α / a = sin β / b = sin γ / c (C02.1)

and the law of cosines is

c² = a² + b² - 2ab cos γ (C02.2a)
b² = a² + c² - 2ac cos β (C02.2b)
a² = c² + b² - 2cb cos α (C02.2c)

(figure: a triangle with sides a, b, c opposite angles α, β, γ)

Reciprocal identities

csc u = 1/sin u (C02.3a)
sec u = 1/cos u (C02.3b)
cot u = 1/tan u (C02.3c)

Pythagorean identities

1 = sin² u + cos² u (C02.4a)
sec² u = 1 + tan² u (C02.4b)
csc² u = 1 + cot² u (C02.4c)


Co-function identities

sin(π/2 - u) = cos u (C02.5a)
cos(π/2 - u) = sin u (C02.5b)
tan(π/2 - u) = cot u (C02.5c)
csc(π/2 - u) = sec u (C02.5d)
sec(π/2 - u) = csc u (C02.5e)
cot(π/2 - u) = tan u (C02.5f)

Even-odd identities

sin(-u) = -sin u (C02.6a)
cos(-u) = cos u (C02.6b)
tan(-u) = -tan u (C02.6c)

Sum-difference formulas (AM or lock-in)

sin(u ± v) = sin u cos v ± cos u sin v (C02.7a)
cos(u ± v) = cos u cos v ∓ sin u sin v (C02.7b)
tan(u ± v) = (tan u ± tan v)/(1 ∓ tan u tan v) (C02.7c)

Double angle formulas

sin(2u) = 2 sin u cos u (C02.8a)
cos(2u) = cos² u - sin² u (C02.8b)
        = 2 cos² u - 1 (C02.8c)
        = 1 - 2 sin² u (C02.8d)
tan(2u) = 2 tan u/(1 - tan² u) (C02.8e)

Power-reducing or half-angle formulas

sin² u = (1 - cos(2u))/2 (C02.9a)
cos² u = (1 + cos(2u))/2 (C02.9b)
tan² u = (1 - cos(2u))/(1 + cos(2u)) (C02.9c)


Sum-to-product formulas

sin u + sin v = 2 sin((u + v)/2) cos((u - v)/2) (C02.10a)
sin u - sin v = 2 cos((u + v)/2) sin((u - v)/2) (C02.10b)
cos u + cos v = 2 cos((u + v)/2) cos((u - v)/2) (C02.10c)
cos u - cos v = -2 sin((u + v)/2) sin((u - v)/2) (C02.10d)

Product-to-sum formulas

sin u sin v = (1/2)[cos(u - v) - cos(u + v)] (C02.11a)
cos u cos v = (1/2)[cos(u - v) + cos(u + v)] (C02.11b)
sin u cos v = (1/2)[sin(u + v) + sin(u - v)] (C02.11c)
cos u sin v = (1/2)[sin(u + v) - sin(u - v)] (C02.11d)

Two-to-one formulas

A sin u + B cos u = C sin(u + φ) (C02.12a)
                  = C cos(u + ψ), where (C02.12b)
C = √(A² + B²) (C02.12c)
φ = arctan(B/A) (C02.12d)
ψ = -arctan(A/B) (C02.12e)


C03 Matrix inverses

This is a guide to inverting 1×1, 2×2, and n×n matrices.

Let A be the 1×1 matrix

A = [a].

The inverse is simply the reciprocal:

A⁻¹ = [1/a].

Let B be the 2×2 matrix

B = [b11 b12; b21 b22].

It can be shown that the inverse follows a simple pattern:

B⁻¹ = (1/det B) [b22 -b12; -b21 b11]
    = (1/(b11 b22 - b12 b21)) [b22 -b12; -b21 b11].

Let C be an n×n matrix. It can be shown that its inverse is

C⁻¹ = (1/det C) adj C,

where adj C is the adjoint of C.
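The 2×2 pattern can be verified numerically; a sketch (the inv2 helper is hypothetical, introduced for illustration):

```python
import numpy as np

def inv2(B):
    """2x2 inverse by the cofactor pattern: swap the diagonal,
    negate the off-diagonal, divide by the determinant."""
    (b11, b12), (b21, b22) = B
    det = b11*b22 - b12*b21
    return np.array([[b22, -b12], [-b21, b11]]) / det

B = np.array([[4.0, 7.0], [2.0, 6.0]])
print(inv2(B) @ B)  # approximately the 2x2 identity
```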


C04 Laplace transforms

The definitions of the one-sided Laplace and inverse Laplace transforms follow.

Definition C04.1: Laplace transforms (one-sided)

Laplace transform L:

L(y(t)) = Y(s) = ∫_0^∞ y(t) e^(-st) dt. (C04.1)

Inverse Laplace transform L⁻¹:

L⁻¹(Y(s)) = y(t) = (1/2πj) ∫_{σ-j∞}^{σ+j∞} Y(s) e^(st) ds. (C04.2)

See Table B1 for a list of properties and common transforms.
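Definition C04.1 can be checked symbolically; a sketch with SymPy, verifying the e^(-at) entry of Table B1 (assuming a > 0):

```python
import sympy as sp

t, s, a = sp.symbols('t s a', positive=True)

# one-sided Laplace transform of e^{-a t}; should match 1/(s + a)
F = sp.laplace_transform(sp.exp(-a*t), t, s, noconds=True)
print(F)
```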

D

Complex analysis

D01 Euler's formulas

Euler's formula is our bridge back and forth between trigonometric forms (cos θ and sin θ) and complex exponential form (e^jθ):

e^jθ = cos θ + j sin θ. (D01.1)

Here are a few useful identities implied by Euler's formula:

e^(-jθ) = cos θ - j sin θ (D01.2a)
cos θ = Re(e^jθ) (D01.2b)
      = (1/2)(e^jθ + e^(-jθ)) (D01.2c)
sin θ = Im(e^jθ) (D01.2d)
      = (1/j2)(e^jθ - e^(-jθ)) (D01.2e)
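Euler's formula is easy to verify numerically with the standard library; a sketch at an arbitrary angle:

```python
import cmath
import math

theta = 0.7
# e^{j theta} = cos(theta) + j sin(theta)
lhs = cmath.exp(1j*theta)
rhs = complex(math.cos(theta), math.sin(theta))
assert abs(lhs - rhs) < 1e-15

# cos(theta) = (e^{j theta} + e^{-j theta}) / 2
cos_from_exp = (cmath.exp(1j*theta) + cmath.exp(-1j*theta)) / 2
assert abs(cos_from_exp.real - math.cos(theta)) < 1e-15
print("Euler's formula checks out numerically")
```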

Bibliography

[1] Robert B Ash Basic Probability Theory

Dover Publications Inc 2008

[2] Joan Bagaria Set Theory In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2019

Metaphysics Research Lab Stanford

University 2019

[3] Maria Baghramian and J Adam

Carter Relativism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Winter 2019

Metaphysics Research Lab Stanford

University 2019

[4] Stephen Barker and Mark Jago. Being Positive About Negative Facts. In: Philosophy and Phenomenological Research 85.1 (2012), pp. 117–138. doi: 10.1111/j.1933-1592.2010.00479.x.

[5] Jonathan Barzilai and Jonathan M. Borwein. Two-Point Step Size Gradient Methods. In: IMA Journal of Numerical Analysis 8.1 (Jan. 1988), pp. 141–148. issn: 0272-4979. doi: 10.1093/imanum/8.1.141. This includes an innovative line search method.


[6] Anat Biletzki and Anat Matar Ludwig

Wittgenstein In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Summer 2018

Metaphysics Research Lab Stanford

University 2018 An introduction to

Wittgenstein and his thought

[7] Antonio Bove F (Ferruccio) Colombini

and Daniele Del Santo Phase space

analysis of partial differential

equations eng Progress in nonlinear

differential equations and their

applications v 69 Boston Berlin

Birkhäuser, 2006. isbn: 9780817645212.

[8] William L Brogan Modern Control Theory

Third Prentice Hall 1991

[9] Francesco Bullo and Andrew D Lewis

Geometric control of mechanical systems

modeling analysis and design for simple

mechanical control systems Ed by JE

Marsden L Sirovich and M Golubitsky

Springer 2005

[10] Francesco Bullo and Andrew D Lewis

Supplementary Chapters for Geometric

Control of Mechanical Systems1 Jan

2005

[11] A Choukchou-Braham et al Analysis and

Control of Underactuated Mechanical

Systems. SpringerLink Bücher. Springer

International Publishing 2013 isbn

9783319026367

[12] K Ciesielski Set Theory for the

Working Mathematician London

Mathematical Society Student Texts

Cambridge University Press 1997 isbn

9780521594653 A readable introduction

to set theory


[13] Marian David The Correspondence

Theory of Truth In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Fall 2016 Metaphysics

Research Lab Stanford University

2016 A detailed overview of the

correspondence theory of truth

[14] David Dolby. Wittgenstein on Truth. In: A Companion to Wittgenstein. John Wiley & Sons, Ltd., 2016. Chap. 27, pp. 433–442. isbn: 9781118884607. doi: 10.1002/9781118884607.ch27.

[15] Peter J. Eccles. An Introduction to Mathematical Reasoning: Numbers, Sets and Functions. Cambridge University Press, 1997. doi: 10.1017/CBO9780511801136. A gentle introduction to mathematical reasoning. It includes introductory treatments of set theory and number systems.

[16] HB Enderton Elements of Set

Theory Elsevier Science 1977 isbn

9780080570426 A gentle introduction to

set theory and mathematical reasoning; a

great place to start

[17] Richard P Feynman Robert B Leighton

and Matthew Sands The Feynman

Lectures on Physics New Millennium

Perseus Basic Books 2010

[18] Michael Glanzberg Truth In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2018

Metaphysics Research Lab Stanford

University 2018

[19] Hans Johann Glock. Truth in the Tractatus. In: Synthese 148.2 (Jan. 2006), pp. 345–368. issn: 1573-0964. doi: 10.1007/s11229-004-6226-2.


[20] Mario Gómez-Torrente. Alfred Tarski. In: The Stanford Encyclopedia of Philosophy. Ed. by Edward N. Zalta. Spring 2019. Metaphysics Research Lab, Stanford University, 2019.

[21] Paul Guyer and Rolf-Peter Horstmann

Idealism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Winter 2018

Metaphysics Research Lab Stanford

University 2018

[22] R Haberman Applied Partial Differential

Equations with Fourier Series and

Boundary Value Problems (Classic

Version) Pearson Modern Classics for

Advanced Mathematics Pearson Education

Canada 2018 isbn 9780134995434

[23] GWF Hegel and AV Miller

Phenomenology of Spirit Motilal

Banarsidass 1998 isbn 9788120814738

[24] Wilfrid Hodges Model Theory In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2018

Metaphysics Research Lab Stanford

University 2018

[25] Wilfrid Hodges Tarskirsquos Truth

Definitions In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Fall 2018 Metaphysics

Research Lab Stanford University 2018

[26] Peter Hylton and Gary Kemp Willard

Van Orman Quine In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Spring 2019 Metaphysics

Research Lab Stanford University 2019

[27] ET Jaynes et al Probability Theory The

Logic of Science Cambridge University

Press 2003 isbn 9780521592710


An excellent and comprehensive

introduction to probability theory

[28] I Kant P Guyer and AW Wood Critique

of Pure Reason The Cambridge Edition

of the Works of Immanuel Kant

Cambridge University Press 1999 isbn

9781107268333

[29] Juliette Kennedy Kurt Goumldel In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Winter 2018

Metaphysics Research Lab Stanford

University 2018

[30] Drew Khlentzos Challenges to

Metaphysical Realism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Winter 2016

Metaphysics Research Lab Stanford

University 2016

[31] Peter Klein Skepticism In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Summer 2015

Metaphysics Research Lab Stanford

University 2015

[32] M Kline Mathematics The Loss

of Certainty A Galaxy book

Oxford University Press 1982 isbn

9780195030853 A detailed account

of the "illogical" development of

mathematics and an exposition of

its therefore remarkable utility in

describing the world

[33] W Richard Kolk and Robert A Lerman

Nonlinear System Dynamics 1st ed

Springer US, 1993. isbn: 978-1-4684-6496-2.

[34] Erwin Kreyszig Advanced Engineering

Mathematics 10th John Wiley amp Sons

Limited 2011 isbn 9781119571094 The

authoritative resource for engineering

mathematics It includes detailed


accounts of probability statistics

vector calculus linear algebra

fourier analysis ordinary and partial

differential equations and complex

analysis It also includes several other

topics with varying degrees of depth

Overall it is the best place to start

when seeking mathematical guidance

[35] John M Lee Introduction to Smooth

Manifolds second Vol 218 Graduate

Texts in Mathematics Springer 2012

[36] Catherine Legg and Christopher

Hookway Pragmatism In The Stanford

Encyclopedia of Philosophy Ed by

Edward N Zalta Spring 2019 Metaphysics

Research Lab Stanford University

2019 An introductory article on the

philosophical movement "pragmatism." It

includes an important clarification of

the pragmatic slogan "truth is the end of inquiry."

[37] Daniel Liberzon Calculus of Variations

and Optimal Control Theory A Concise

Introduction Princeton University Press

2012 isbn 9780691151878

[38] Panu Raatikainen Goumldelrsquos Incompleteness

Theorems In The Stanford Encyclopedia

of Philosophy Ed by Edward N Zalta

Fall 2018 Metaphysics Research Lab

Stanford University 2018 A through

and contemporary description of

Goumldelrsquos incompleteness theorems which

have significant implications for the

foundations and function of mathematics

and mathematical truth

[39] Paul Redding Georg Wilhelm Friedrich

Hegel In The Stanford Encyclopedia

of Philosophy Ed by Edward N Zalta

Summer 2018 Metaphysics Research Lab

Stanford University 2018


[40] HM Schey Div Grad Curl and All that An

Informal Text on Vector Calculus WW

Norton 2005 isbn 9780393925166

[41] Christopher Shields Aristotle In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Winter 2016

Metaphysics Research Lab Stanford

University 2016

[42] Steven S. Skiena. Calculated Bets: Computers, Gambling, and Mathematical Modeling to Win. Outlooks. Cambridge University Press, 2001. doi: 10.1017/CBO9780511547089. This includes a lucid section on probability versus statistics, also available here: https://www3.cs.stonybrook.edu/~skiena/jaialai/excerpts/node12.html.

[43] George Smith Isaac Newton In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Fall 2008

Metaphysics Research Lab Stanford

University 2008

[44] Daniel Stoljar and Nic Damnjanovic

The Deflationary Theory of Truth

In The Stanford Encyclopedia of

Philosophy Ed by Edward N Zalta Fall

2014 Metaphysics Research Lab Stanford

University 2014

[45] WA Strauss Partial Differential

Equations An Introduction Wiley 2007

isbn 9780470054567 A thorough and yet

relatively compact introduction

[46] SH Strogatz and M Dichter Nonlinear

Dynamics and Chaos Second Studies

in Nonlinearity Avalon Publishing 2016

isbn 9780813350844

[47] Mark Textor States of Affairs In The

Stanford Encyclopedia of Philosophy

Ed by Edward N Zalta Winter 2016

D Complex analysis Bibliography p 9

Metaphysics Research Lab Stanford

University 2016

[48] Pauli Virtanen et al. "SciPy 1.0–Fundamental Algorithms for Scientific Computing in Python". In: arXiv e-prints, arXiv:1907.10121 (July 2019). arXiv: 1907.10121.

[49] Wikipedia. Algebra — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Algebra&oldid=920573802. [Online; accessed 26-October-2019]. 2019.

[50] Wikipedia. Carl Friedrich Gauss — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Carl%20Friedrich%20Gauss&oldid=922692291. [Online; accessed 26-October-2019]. 2019.

[51] Wikipedia. Euclid — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Euclid&oldid=923031048. [Online; accessed 26-October-2019]. 2019.

[52] Wikipedia. First-order logic — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=First-order%20logic&oldid=921437906. [Online; accessed 29-October-2019]. 2019.

[53] Wikipedia. Fundamental interaction — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Fundamental%20interaction&oldid=925884124. [Online; accessed 16-November-2019]. 2019.

[54] Wikipedia. Leonhard Euler — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Leonhard%20Euler&oldid=921824700. [Online; accessed 26-October-2019]. 2019.

[55] Wikipedia. Linguistic turn — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Linguistic%20turn&oldid=922305269. [Online; accessed 23-October-2019]. 2019. Hey, we all do it.

[56] Wikipedia. Probability space — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Probability%20space&oldid=914939789. [Online; accessed 31-October-2019]. 2019.

[57] Wikipedia. Propositional calculus — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Propositional%20calculus&oldid=914757384. [Online; accessed 29-October-2019]. 2019.

[58] Wikipedia. Quaternion — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Quaternion&oldid=920710557. [Online; accessed 26-October-2019]. 2019.

[59] Wikipedia. Set-builder notation — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=Set-builder%20notation&oldid=917328223. [Online; accessed 29-October-2019]. 2019.

[60] Wikipedia. William Rowan Hamilton — Wikipedia, The Free Encyclopedia. http://en.wikipedia.org/w/index.php?title=William%20Rowan%20Hamilton&oldid=923163451. [Online; accessed 26-October-2019]. 2019.

[61] L. Wittgenstein, P. M. S. Hacker, and J. Schulte. Philosophical Investigations. Wiley, 2010. ISBN: 9781444307979.

[62] Ludwig Wittgenstein. Tractatus Logico-Philosophicus. Ed. by C. K. Ogden. Project Gutenberg. International Library of Psychology, Philosophy and Scientific Method. Kegan Paul, Trench, Trubner & Co., Ltd., 1922. A brilliant work on what is possible to express in language — and what is not. As Wittgenstein puts it, "What can be said at all can be said clearly; and whereof one cannot speak thereof one must be silent."

[63] Slavoj Žižek. Less Than Nothing: Hegel and the Shadow of Dialectical Materialism. Verso, 2012. ISBN: 9781844678976. This is one of the most interesting presentations of Hegel and Lacan by one of the most exciting contemporary philosophers.

Page 78: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 79: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 80: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 81: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 82: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 83: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 84: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 85: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 86: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 87: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 88: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 89: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 90: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 91: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 92: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 93: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 94: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 95: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 96: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 97: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 98: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 99: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 100: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 101: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 102: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 103: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 104: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 105: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 106: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 107: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 108: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 109: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 110: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 111: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 112: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 113: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 114: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 115: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 116: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 117: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 118: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 119: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 120: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 121: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 122: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 123: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 124: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 125: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 126: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 127: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 128: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 129: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 130: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 131: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 132: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 133: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 134: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 135: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 136: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 137: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 138: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 139: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 140: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 141: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 142: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 143: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 144: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 145: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 146: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 147: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 148: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 149: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 150: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 151: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 152: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 153: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 154: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 155: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 156: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 157: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 158: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 159: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 160: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 161: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 162: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 163: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 164: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 165: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 166: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 167: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 168: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 169: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 170: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 171: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 172: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 173: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 174: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 175: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 176: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 177: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 178: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 179: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 180: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 181: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 182: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 183: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 184: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 185: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 186: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 187: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 188: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 189: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 190: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 191: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 192: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 193: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 194: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 195: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 196: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 197: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 198: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 199: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 200: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 201: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 202: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 203: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 204: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 205: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 206: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 207: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 208: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 209: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 210: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 211: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 212: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 213: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 214: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 215: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 216: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 217: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 218: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 219: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 220: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 221: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 222: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 223: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview
Page 224: Mathematical Foundationsricopic.one/mathematical_foundations/mathematical... · 2021. 3. 20. · itselfMathematicsitself truTruth p.1 1.FormuchofthislectureIrelyonthethoroughoverview