Automatic Differentiation

Hamid Reza Ghaffari, Jonathan Li, Yang Li, Zhenghua Nie

Instructor: Prof. Tamas Terlaky
School of Computational Engineering and Science, McMaster University

March 23, 2007



Page 2

Outline

1. Introductions
2. Forward and Reverse Mode
   - Forward methods
   - Reverse methods
   - Comparison
   - Extended knowledge
   - Case Study
3. Complexity Analysis
   - Forward Mode Complexity
   - Reverse Mode Complexity
4. AD Softwares
   - AD tools in MATLAB
   - AD in C/C++ (ADIC)
     - Developers
     - Introduction
     - ADIC Anatomy
     - ADIC Process
     - Example
     - Handling Side Effects
   - References

Page 3

Introductions

Why Do We Need Derivatives?

Optimization via gradient methods:
- Unconstrained Optimization: minimize y = f(x) requires the gradient or Hessian.
- Constrained Optimization: minimize y = f(x) such that c(x) = 0 also requires the Jacobian Jc(x) = [∂c_j/∂x_i].

Solution of nonlinear equations f(x) = 0 by Newton's method,

    x_{n+1} = x_n − [∂f(x_n)/∂x]^{−1} f(x_n),

requires the Jacobian J_F = [∂f/∂x].

Further uses: Parameter Estimation, Data Assimilation, Sensitivity Analysis, Inverse Problems, ...


Page 6

Introductions

How Do We Obtain Derivatives?

- Reliability: the correctness and numerical accuracy of the derivative results.
- Computational Cost: the amount of runtime and memory required for the derivative code.
- Development Time: the time it takes to design, implement, and verify the derivative code, beyond the time to implement the code for computing the underlying function.


Page 9

Introductions

Main Approaches

- Hand Coding
- Divided Differences
- Symbolic Differentiation
- Automatic Differentiation

Page 10

Introductions

Hand Coding

An analytic expression for the derivative is identified first and then implemented by hand in any high-level programming language.

Advantages:
- Accuracy up to machine precision, if care is taken.
- Highly optimized implementation, depending on the skill of the implementer.

Disadvantages:
- Only applicable to "simple" functions, and error-prone.
- Requires considerable human effort.


Page 16

Introductions

Divided Differences

Approximate the derivative of a function f w.r.t. the i-th component of x at a particular point x0 by a difference quotient, e.g.

    ∂f(x)/∂x_i |_{x=x0} ≈ (f(x0 + h·e_i) − f(x0)) / h

where e_i is the i-th Cartesian unit vector.

Page 17

Introductions

Divided Differences (Ctd.)

    ∂f(x)/∂x_i |_{x=x0} ≈ (f(x0 + h·e_i) − f(x0)) / h

Advantages:
- Only f is needed; easy to implement; usable as a "black box".
- Easy to parallelize.

Disadvantages:
- Accuracy is hard to assess and depends on the choice of h.
- Computational complexity is bounded below by (n + 1) × cost(f).
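As an illustration of the formula (a sketch of ours, not from the slides; the test function and step size h are arbitrary choices):

```python
import math

def fd_gradient(f, x0, h=1e-6):
    """Approximate the gradient of f at x0 by forward differences.

    Each component needs one extra evaluation of f, so the whole
    gradient costs (n + 1) evaluations, matching the lower bound
    (n + 1) x cost(f) quoted above.
    """
    f0 = f(x0)
    grad = []
    for i in range(len(x0)):
        xh = list(x0)
        xh[i] += h                  # x0 + h * e_i
        grad.append((f(xh) - f0) / h)
    return grad

# Hypothetical test function: f(x) = x1^2 + sin(x2);
# the exact gradient at (1, 0) is (2, 1).
g = fd_gradient(lambda x: x[0] ** 2 + math.sin(x[1]), [1.0, 0.0])
```

Note how the result carries an O(h) truncation error, which is exactly the accuracy issue listed above.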


Page 23

Introductions

Symbolic Differentiation

Find an explicit derivative expression using computer algebra systems.

Disadvantages:
- The length of the representation of the resulting derivative expressions grows rapidly with the number, n, of independent variables.
- Inefficient in terms of computing time, due to the rapid growth of the underlying expressions.
- Unable to deal with constructs such as branches, loops, or subroutines that are inherent in computer codes.

Page 24

Introductions

Automatic Differentiation

What is Automatic Differentiation?
Algorithmic, or automatic, differentiation (AD) is concerned with the accurate and efficient evaluation of derivatives for functions defined by computer programs. No truncation errors are incurred, and the resulting numerical derivative values can be used for all scientific computations that are based on linear, quadratic, or even higher-order approximations to nonlinear scalar or vector functions.

Page 25

Introductions

Automatic Differentiation (Cont.)

What is the idea behind Automatic Differentiation?
Automatic differentiation techniques rely on the fact that every function, no matter how complicated, is executed on a computer as a (potentially very long) sequence of elementary operations, such as additions and multiplications, and elementary functions, such as sin and cos. By repeatedly applying the chain rule of derivative calculus to the composition of those elementary operations, derivatives can be computed in a completely mechanical fashion.

Page 26

Introductions

How Good Is AD?

- Reliability: accurate to machine precision; no truncation error.
- Computational Cost: forward mode costs 2 ∼ 3n × cost(f); reverse mode costs 5 × cost(f).
- Human Effort: less time is spent preparing a code for differentiation, in particular in situations where computer models are bound to change frequently.

Page 27

Introductions

How Widely Is AD Used?

- Sensitivity Analysis of a Mesoscale Weather Model (application area: Climate Modeling)
- Data assimilation for ocean circulation (application area: Oceanography)
- Intensity Modulated Radiation Therapy (application area: Biomedicine)
- Multidisciplinary Design of Aircraft (application area: Computational Fluid Dynamics)
- The NEOS server (application area: Optimization)
- ...

Source: http://www.autodiff.org/?module=Applications&submenu=&category=all

Page 28

Forward and Reverse Mode

AD Methods: Simple Example

Page 29

Forward and Reverse Mode

Simple Example

Unify all the variables.

Page 30

Forward and Reverse Mode

Forward Method

Differentiate the code:

    u_i = x_i,                 i = 1, ..., n
    u_i = Φ_i({u_j}_{j<i}),    i = n+1, ..., N

Differentiate:

    ∇u_i = e_i,                        i = 1, ..., n
    ∇u_i = Σ_{j<i} c_{i,j} · ∇u_j,     i = n+1, ..., N
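To make the recurrence concrete, here is a minimal sketch of ours (not from the slides), applied to the hypothetical function f(x1, x2) = x1·x2 + sin(x1); every intermediate u_i carries a full length-n gradient vector seeded with the unit vectors e_i:

```python
import math

def forward_mode(x):
    """Forward mode on f(x1, x2) = x1*x2 + sin(x1)."""
    n = len(x)
    # independents: u_i = x_i, gradient seeded with e_i
    u1, g1 = x[0], [1.0, 0.0]
    u2, g2 = x[1], [0.0, 1.0]
    # u3 = u1 * u2, local derivatives c_{3,1} = u2, c_{3,2} = u1
    u3 = u1 * u2
    g3 = [u2 * g1[k] + u1 * g2[k] for k in range(n)]
    # u4 = sin(u1), local derivative c_{4,1} = cos(u1)
    u4 = math.sin(u1)
    g4 = [math.cos(u1) * g1[k] for k in range(n)]
    # y = u3 + u4, local derivatives c = 1, 1
    y = u3 + u4
    gy = [g3[k] + g4[k] for k in range(n)]
    return y, gy

y, gy = forward_mode([2.0, 3.0])
# exact: y = 6 + sin(2), gradient = (3 + cos(2), 2)
```

Each statement updates an n-vector, which is where the factor n in the forward-mode cost comes from.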

Page 31

Forward and Reverse Mode

Reverse Method

Compute the adjoint of the code:

    ū_j = ∂y/∂u_j = ∂(y_1, y_2, ..., y_m)/∂u_j

Compute for the dependent variables:

    ū_{n+p+j} = e_j,    j = 1, ..., m

Compute for the intermediates and independents, u_j, j = n+p, ..., 1:

    ū_j = ∂y/∂u_j = Σ_{i>j} ū_i · c_{i,j}
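A matching sketch of the reverse sweep (again ours, for the same hypothetical f(x1, x2) = x1·x2 + sin(x1)) records the local derivatives c_{i,j} on a tape during the forward pass, then accumulates the adjoints ū_j = Σ_{i>j} ū_i·c_{i,j} in reverse:

```python
import math

def reverse_mode(x):
    """Reverse mode on f(x1, x2) = x1*x2 + sin(x1)."""
    u = {1: x[0], 2: x[1]}
    tape = []                          # entries: (i, [(j, c_ij), ...])
    u[3] = u[1] * u[2]
    tape.append((3, [(1, u[2]), (2, u[1])]))
    u[4] = math.sin(u[1])
    tape.append((4, [(1, math.cos(u[1]))]))
    u[5] = u[3] + u[4]
    tape.append((5, [(3, 1.0), (4, 1.0)]))

    ubar = {i: 0.0 for i in u}
    ubar[5] = 1.0                      # seed the single output: dy/dy = 1
    for i, parents in reversed(tape):  # ubar_j += ubar_i * c_ij
        for j, c in parents:
            ubar[j] += ubar[i] * c
    return u[5], [ubar[1], ubar[2]]

y, grad = reverse_mode([2.0, 3.0])
# one reverse sweep yields the whole gradient (3 + cos(2), 2)
```

Note that the adjoint sweep can only start after all values (and the tape) have been computed, as the slide states.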

Page 32

Forward and Reverse Mode

Forward Methods

- Method: compute the gradient of each variable, and use the chain rule to pass the gradient along.
- Size of computed object: each computation handles vectors of input size n.
- The computation of each variable's gradient proceeds alongside the computation of the variable itself.
- Easy to implement.

Page 33

Forward and Reverse Mode

Forward Methods

[Figure: computing variable values alongside gradient values]

Page 34

Forward and Reverse Mode

Reverse Methods

- Method: compute the adjoint of each variable, and pass the adjoint along.
- Size of computed object: each computation handles vectors of output size m. (Note: the output size is usually 1 in optimization applications.)
- The computation of each variable's adjoint proceeds only after the computation of all variables has completed.

Page 35

Forward and Reverse Mode

Reverse Methods

- Traverse the computational graph in reverse, obtaining the parents of each variable, so as to compute the adjoints.
- Obtain the gradient by computing each partial derivative one by one.
- Harder to implement.

Page 36

Forward and Reverse Mode

Reverse Methods

[Figure: computing variable values alongside adjoint values]

Page 37

Forward and Reverse Mode

Implementation of Reverse Mode

As mentioned above, the implementation of forward mode is relatively straightforward. Here we only compare the important features of source transformation and operator overloading:

- Source transformation: re-order the code upside down.
- Operator overloading: record the computation on a "tape".

Page 38

Forward and Reverse Mode

Implementation of Reverse Mode

Re-ordering the code upside down:

Page 39

Forward and Reverse Mode

Implementation of Reverse Mode

Record the computation on a "tape":
- Record: operation and operands.
- Related technique: checkpointing. If the number of operations grows large, checkpointing prevents the program from exhausting all the memory.

Page 40

Forward and Reverse Mode

Comparison

The comparison between forward mode and reverse mode covers the following topics:
- Computational Complexity
- Memory Required
- Time to Develop

Page 41

Forward and Reverse Mode

Cost of Forward Propagation of Derivatives

Define:
- N_{|c|=1}: the number of unit local derivatives c_{i,j} = ±1;
- N_{|c|≠1}: the number of non-unit local derivatives c_{i,j} ≠ 0, ±1.

Solve for the derivatives in forward order ∇u_{n+1}, ∇u_{n+2}, ..., ∇u_N:

    ∇u_i = Σ_{j≺i} c_{i,j} · ∇u_j,    i = n+1, ..., N,

with each ∇u_i = (∂u_i/∂x_1, ..., ∂u_i/∂x_n) a length-n vector. The flop count flops(fwd) is given by

    flops(fwd) = n·N_{|c|≠1}                   (mults. c_{i,j} · ∇u_j, c_{i,j} ≠ ±1, 0)
               + n·(N_{|c|≠1} + N_{|c|=1})     (adds./subs. of the c_{i,j}·∇u_j terms)
               − n·(p + m)                     (the first term of each of the p + m sums is an assignment, not an add)

    flops(fwd) = n·(2N_{|c|≠1} + N_{|c|=1} − p − m)

Page 42

Forward and Reverse Mode

Cost of Reverse Propagation of Adjoints

Solve for the adjoints in reverse order ū_{n+p}, ū_{n+p−1}, ..., ū_1:

    ū_j = Σ_{i≻j} ū_i · c_{i,j},

with each ū_j = ∂(y_1, y_2, ..., y_m)/∂u_j a length-m vector. The flop count flops(rev) is given by

    flops(rev) = m·N_{|c|≠1}                   (mults. ū_i · c_{i,j}, c_{i,j} ≠ ±1, 0)
               + m·(N_{|c|=1} + N_{|c|≠1})     (adds./subs. of the ū_i·c_{i,j} terms)

    flops(rev) = m·(2N_{|c|≠1} + N_{|c|=1}).

Page 43

Forward and Reverse Mode

Memory Required

It is not certain which mode takes more memory; usually, reverse mode takes more.

The memory cost of forward mode comes from:
- storing one value per variable;
- storing a gradient of input size n per variable.

The memory cost of reverse mode comes from:
- storing one value per variable;
- storing an adjoint of output size m per variable;
- storing the DAG (directed acyclic graph) that represents the function.

Page 44

Forward and Reverse Mode

Memory Required

Forward mode is more likely to use less memory:
1. if the original function reuses variables;
2. if n is so large that reverse mode requires a lot of memory to store the DAG.

Reverse mode is more likely to use less memory:
1. if n is relatively large, so that storing the gradients (of length n) costs more than storing the adjoints (of length m).

Page 45

Forward and Reverse Mode

Time to Develop

Usually, reverse-mode code is harder to develop than forward-mode code, especially when using the source transformation technique.

Page 46

Forward and Reverse Mode

Conclusion

- Use forward mode when n ≪ m, e.g. sensitivity analysis.
- Use reverse mode when m ≪ n, e.g. optimization, where m = 1.

Page 47

Forward and Reverse Mode

Extended Knowledge

Directional Derivatives (forward mode):
- seed d = (d_1, ..., d_n)^T;
- seeding ∇x_i = d_i calculates J_f · d;
- multi-directional derivatives: replace d by D, where D = [d_{ij}], i = 1, ..., n, j = 1, ..., q.
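As an illustration of seeding (a sketch of ours, not from the slides, using the hypothetical function f(x1, x2) = x1·x2 + sin(x1) from earlier): replacing the unit-vector seeds with d makes a single forward pass return J_f · d directly, at a cost independent of n:

```python
import math

def jvp(x, d):
    """Forward mode seeded with a direction d: each variable carries
    the scalar tangent du_i = <grad u_i, d>, so one pass computes the
    directional derivative J_f(x) @ d."""
    u1, du1 = x[0], d[0]        # seeding: tangent of x_i is d_i
    u2, du2 = x[1], d[1]
    u3, du3 = u1 * u2, u2 * du1 + u1 * du2
    u4, du4 = math.sin(u1), math.cos(u1) * du1
    return u3 + u4, du3 + du4

# directional derivative at (2, 3) along d = (1, 1):
# exact value is (3 + cos(2)) * 1 + 2 * 1
y, jd = jvp([2.0, 3.0], [1.0, 1.0])
```

Stacking q directions into a matrix D gives the multi-directional version described above.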

Page 48

Forward and Reverse Mode

Extended Knowledge

Directional Adjoints (reverse mode):
- seed v = (v_1, ..., v_m);
- seeding ȳ_j = v_j calculates v · J_f;
- multi-directional adjoints: replace v by V, where V = [v_{ij}], i = 1, ..., q, j = 1, ..., m.

Page 49

Forward and Reverse Mode

Case Study

Using FADBAD++:
- FADBAD++ was developed by Ole Stauning and Claus Bendtsen.
- Flexible automatic differentiation using templates and operator overloading in ANSI C++.
- Ships as source code only; no additional library required.
- Free to use.

Page 50

Forward and Reverse Mode

Case Study

Using FADBAD++:
- Test function: f(x) = ∏ x_i.
- Objective: test different codings of the function in forward mode, trying to reuse variables.
- Result: basically, no matter how the function is coded, the memory cost is about n · n · 8 bytes; there is no difference between reusing variables or not.

Page 51

Forward and Reverse Mode

Case Study

Using FADBAD++:
- Test function: f(x) = ∏ x_i.
- Objective: testing reverse mode.
- Result: tested up to n = 6500; forward mode ran out of memory, while reverse mode was 127 times faster and took only a few MB.
- Remark: we could not see how the DAG consumes memory in reverse mode; this is more likely to be observable with fewer independent variables but a more complicated function.

Page 52

Complexity Analysis

Code List

The code list is obtained by re-writing the code into elemental binary and unary operations/functions, e.g.

    [ y_1 ]   [ log²(x_1 x_2) + x_2 x_3² − a − x_2           ]
    [ y_2 ] = [ √(b · log(x_1 x_2) + x_2/x_3) − x_2 x_3² + a ]

    v_1 = x_1          v_7  = v_6 · v_2     v_13 = v_8 − v_2
    v_2 = x_2          v_8  = v_7 − a       v_14 = v_5²
    v_3 = x_3          v_9  = 1/v_3         v_15 = √v_12
    v_4 = v_1 · v_2    v_10 = v_2 · v_9     v_16 = v_14 + v_13
    v_5 = log(v_4)     v_11 = b · v_5       v_17 = v_15 − v_8
    v_6 = v_3²         v_12 = v_11 + v_10
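As a sanity check (our own, with arbitrary sample values for x, a, and b), the code list can be executed directly and compared with the closed-form expressions for y_1 and y_2:

```python
import math

def code_list(x1, x2, x3, a, b):
    """Evaluate the slide's code list v1..v17 and return (y1, y2)."""
    v1, v2, v3 = x1, x2, x3
    v4 = v1 * v2
    v5 = math.log(v4)
    v6 = v3 ** 2
    v7 = v6 * v2
    v8 = v7 - a
    v9 = 1.0 / v3
    v10 = v2 * v9
    v11 = b * v5
    v12 = v11 + v10
    v13 = v8 - v2
    v14 = v5 ** 2
    v15 = math.sqrt(v12)
    v16 = v14 + v13    # y1 = log^2(x1 x2) + x2 x3^2 - a - x2
    v17 = v15 - v8     # y2 = sqrt(b log(x1 x2) + x2/x3) - x2 x3^2 + a
    return v16, v17

# arbitrary sample inputs (hypothetical, chosen only for the check)
y1, y2 = code_list(1.5, 2.0, 0.5, 0.25, 3.0)
```

For this code list, N_± = 5 (v8, v12, v13, v16, v17), N_∗ = 4 (v4, v7, v10, v11), and N_f = 5 (v5, v6, v9, v14, v15), so p + m = 14 statements, as the next slide assumes.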

Page 53

Complexity Analysis

Code List (Ctd.)

Assume the code list contains:
- N_± additions/subtractions, e.g. v_14 + v_13;
- N_∗ multiplications, e.g. v_1 · v_2;
- N_f nonlinear functions/operations, e.g. log(v_4), 1/v_3;
- a total of p + m = N_± + N_∗ + N_f statements.

Then:
- each addition/subtraction generates two local derivatives c_{i,j} = ±1;
- each multiplication generates two c_{i,j} ≠ ±1, 0;
- each nonlinear function generates one c_{i,j} ≠ ±1, 0, requiring one nonlinear function evaluation, e.g. v_5 = log(v_4) gives c_{5,4} = 1/v_4.

So we have:

    N_{|c|=1} = 2N_±
    N_{|c|≠1} = 2N_∗ + N_f

Page 54

Complexity Analysis

Complexity of Forward Mode

    flops(J_f) = flops(f) + flops(c_{i,j}) + flops(fwd)

Assume flops(nonlinear function) = w, with w > 1.

The cost of evaluating the function is

    flops(f) = N_∗ + N_± + w·N_f.

The cost of evaluating the local derivatives c_{i,j} is

    flops(c_{i,j}) = w·N_f.

The cost of the forward propagation of derivatives is

    flops(fwd) = n·(2N_{|c|≠1} + N_{|c|=1} − p − m) = n·(3N_∗ + N_± + N_f).

Page 55

Complexity Analysis

Complexity of Forward Mode (Ctd.)

Then for forward mode,

    flops(J_f)/flops(f) = 1 + (w·N_f + n·(3N_∗ + N_± + N_f)) / (N_∗ + N_± + w·N_f)
                        = 1 + 3n·N̄_∗ + n·N̄_± + n·(1/w + 1/n)·w·N̄_f,

where

    (N̄_∗, N̄_±, w·N̄_f) = (N_∗, N_±, w·N_f) / (N_∗ + N_± + w·N_f).

Since N̄_∗ + N̄_± + w·N̄_f = 1 and all coefficients are positive,

    flops(J_f)/flops(f) ≤ 1 + n · max(3, 1, 1/w + 1/n) = 1 + 3n.

When n ≪ m, forward mode is preferred.

Page 56

Complexity Analysis

Complexity of Reverse Mode

    flops(rev) = m·(4N_∗ + 2N_± + 2N_f),

giving

    flops(J_f)/flops(f) = 1 + 4m·N̄_∗ + 2m·N̄_± + m·(2/w + 1/m)·w·N̄_f

and

    flops(J_f)/flops(f) ≤ 1 + m · max(4, 2, 2/w + 1/m) = 1 + 4m.

For m = 1,

    flops(∇f) ≤ 5 · flops(f).

Page 57

AD Softwares

AD Tools in MATLAB: Differentiation Arithmetic

    u⃗ = (u, u′),

where u denotes the value of the function u: R → R evaluated at the point x_0, and u′ denotes the value u′(x_0).

    u⃗ + v⃗ = (u + v, u′ + v′)
    u⃗ − v⃗ = (u − v, u′ − v′)
    u⃗ × v⃗ = (u·v, u·v′ + u′·v)
    u⃗ ÷ v⃗ = (u/v, (u′ − (u/v)·v′)/v)

    x⃗ = (x, 1)
    c⃗ = (c, 0)

Ref: http://www.math.uu.se/ warwick/vt07/FMB/avnm1.pdf
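This differentiation arithmetic maps directly onto operator overloading; here is a minimal Python sketch of ours (the MATLAB and C++ tools discussed in these slides follow the same pattern):

```python
class Dual:
    """A (u, u') pair implementing the differentiation arithmetic above."""

    def __init__(self, val, dot=0.0):
        self.val, self.dot = val, dot   # constants get dot = 0

    def _lift(self, v):
        return v if isinstance(v, Dual) else Dual(v)

    def __add__(self, v):
        v = self._lift(v)
        return Dual(self.val + v.val, self.dot + v.dot)

    def __sub__(self, v):
        v = self._lift(v)
        return Dual(self.val - v.val, self.dot - v.dot)

    def __mul__(self, v):
        v = self._lift(v)
        return Dual(self.val * v.val, self.val * v.dot + self.dot * v.val)

    def __truediv__(self, v):
        v = self._lift(v)
        q = self.val / v.val
        return Dual(q, (self.dot - q * v.dot) / v.val)

# seeding x = (x, 1) makes every expression in x carry df/dx
x = Dual(3.0, 1.0)
f = (x + 1) * (x - 2) / (x + 3)
# f.val = 2/3 and f.dot = 13/18, matching the example on the next page
```

The seeding rules x⃗ = (x, 1) and c⃗ = (c, 0) appear here as the constructor defaults and the `_lift` helper.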


Page 60: Automatic Differentiation - McMaster Universitycs777/presentations/AD.pdf · Automatic Differentiation Automatic Differentiation Hamid Reza Ghaffari , Jonathan Li, Yang Li, Zhenghua

Automatic Differentiation

AD Softwares

AD tools in MATLAB

Example of a Rational Function

f(x) = (x + 1)(x − 2) / (x + 3)

f(3) = 2/3, f′(3) = ?

f⃗(x⃗) = (x⃗ + 1⃗)(x⃗ − 2⃗) / (x⃗ + 3⃗)
      = ((x, 1) + (1, 0)) × ((x, 1) − (2, 0)) / ((x, 1) + (3, 0))

Inserting the value x⃗ = (3, 1) into f⃗ produces

f⃗(3, 1) = ((3, 1) + (1, 0)) × ((3, 1) − (2, 0)) / ((3, 1) + (3, 0))
        = (4, 1) × (1, 1) / (6, 1)
        = (4, 5) / (6, 1)
        = (2/3, 13/18)
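The same computation carried out in code, with exact rationals so that 13/18 comes out exactly; the helper names are ours:

```python
from fractions import Fraction as F

# Pair rules (u, u') for +, -, x, / from the differentiation arithmetic.
def add(u, v): return (u[0] + v[0], u[1] + v[1])
def sub(u, v): return (u[0] - v[0], u[1] - v[1])
def mul(u, v): return (u[0] * v[0], u[0] * v[1] + u[1] * v[0])
def div(u, v):
    q = u[0] / v[0]
    return (q, (u[1] - q * v[1]) / v[0])

x = (F(3), F(1))                                       # x -> (3, 1)
num = mul(add(x, (F(1), F(0))), sub(x, (F(2), F(0))))  # (4,1) x (1,1) = (4,5)
y = div(num, add(x, (F(3), F(0))))                     # (4,5) / (6,1)
assert y == (F(2, 3), F(13, 18))                       # f(3) and f'(3)
```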


Derivatives of Element Functions

Chain Rule:

(g ∘ u)′(x) = u′(x)(g′ ∘ u)(x)

g⃗(u⃗) = g⃗((u, u′)) = (g(u), u′g′(u))

sin u⃗ = sin(u, u′) = (sin u, u′ cos u)
cos u⃗ = cos(u, u′) = (cos u, −u′ sin u)
e^u⃗ = e^(u, u′) = (e^u, u′e^u)
...


Example of Sin

From ../Intlab/gradient/@gradient/sin.m


Example for Element Functions

Evaluate the derivative at x = 0.

f(x) = (1 + x + e^x) sin x

f⃗(x⃗) = (1⃗ + x⃗ + e^x⃗) sin x⃗

f⃗(0, 1) = ((1, 0) + (0, 1) + e^(0,1)) sin(0, 1)
        = ((1, 1) + (e^0, e^0))(sin 0, cos 0)
        = (2, 2)(0, 1) = (0, 2).
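The elemental rules reproduce this result in code as well; a self-contained sketch with function names of our choosing:

```python
import math

# Pair arithmetic plus elemental rules g(u, u') = (g(u), u' g'(u)).
def add(u, v): return (u[0] + v[0], u[1] + v[1])
def mul(u, v): return (u[0] * v[0], u[0] * v[1] + u[1] * v[0])
def d_exp(u):  return (math.exp(u[0]), u[1] * math.exp(u[0]))
def d_sin(u):  return (math.sin(u[0]), u[1] * math.cos(u[0]))

x = (0.0, 1.0)                                  # seed x -> (0, 1)
f = mul(add(add((1.0, 0.0), x), d_exp(x)), d_sin(x))
assert f == (0.0, 2.0)                          # f(0) = 0, f'(0) = 2
```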


High-order Derivatives

u⃗ = (u, u′, u″),

u⃗ + v⃗ = (u + v, u′ + v′, u″ + v″)
u⃗ − v⃗ = (u − v, u′ − v′, u″ − v″)
u⃗ × v⃗ = (uv, uv′ + u′v, uv″ + 2u′v′ + u″v)
u⃗ ÷ v⃗ = (u/v, (u′ − (u/v)v′)/v, (u″ − 2(u/v)′v′ − (u/v)v″)/v)

......
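A quick check of the second-order product rule, propagating (u, u′, u″) triples for u = x² and v = x³ at x = 2 (the function name is ours):

```python
# Second-order forward mode: product rule on (value, first, second derivative).
def mul3(u, v):
    return (u[0] * v[0],
            u[0] * v[1] + u[1] * v[0],
            u[0] * v[2] + 2 * u[1] * v[1] + u[2] * v[0])

u = (4, 4, 2)      # x^2 at x = 2: value 4, derivative 4, second derivative 2
v = (8, 12, 12)    # x^3 at x = 2: value 8, derivative 12, second derivative 12
assert mul3(u, v) == (32, 80, 160)   # x^5 at x = 2: 32, 5x^4 = 80, 20x^3 = 160
```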


INTLab

Developers: Institute for Reliable Computing, Hamburg University of Technology

Mode: Forward
Method: Operator overloading
Language: MATLAB
URL: http://www.ti3.tu-harburg.de/rump/intlab/
Licensing: Open Source


Rosenbrock Function

y1 = 400x1(x1² − x2) + 2(x1 − 1)

y2 = 200(x2 − x1²)
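Assuming y1, y2 are the gradient components of the Rosenbrock function f(x) = 100(x2 − x1²)² + (1 − x1)², one forward-mode pass per input variable reproduces them (the pair helpers are ours, not INTLAB's):

```python
# Gradient of the Rosenbrock function via forward-mode pairs:
# one pass seeded with (1, 0), one with (0, 1).
def add(u, v): return (u[0] + v[0], u[1] + v[1])
def sub(u, v): return (u[0] - v[0], u[1] - v[1])
def mul(u, v): return (u[0] * v[0], u[0] * v[1] + u[1] * v[0])

def rosen(x1, x2):
    # f = 100*(x2 - x1^2)^2 + (1 - x1)^2 in pair arithmetic
    t = sub(x2, mul(x1, x1))
    s = sub((1.0, 0.0), x1)
    return add(mul((100.0, 0.0), mul(t, t)), mul(s, s))

a, b = 3.0, 2.0
g1 = rosen((a, 1.0), (b, 0.0))[1]            # df/dx1
g2 = rosen((a, 0.0), (b, 1.0))[1]            # df/dx2
assert g1 == 400 * a * (a * a - b) + 2 * (a - 1)   # y1
assert g2 == 200 * (b - a * a)                     # y2
```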


One Step of Newton Method with INTLab


TOMLAB/MAD

Developers: Marcus M. Edvall and Kenneth Holmstrom, Tomlab Optimization Inc. (TOMLAB/MAD integration); Shaun A. Forth and Robert Ketzscher, Cranfield University (MAD)

Mode: Forward
Method: Operator overloading
Language: MATLAB
URL: http://tomlab.biz/products/mad/
Licensing: Commercial


One Step of Newton Method with MAD


ADiMat

Developers: Andre Vehreschild, Institute for Scientific Computing, RWTH Aachen University

Mode: Forward
Method: Source transformation, operator overloading
Language: MATLAB
URL: http://www.sc.rwth-aachen.de/vehreschild/adimat.html
Licensing: under discussion


ADiMat’s Example

function [result1, result2] = f(x)
% Compute the sin and square-root of x*2.
% Very simple example for ADiMat website.
% Andre Vehreschild, Institute for Scientific Computing,
% RWTH Aachen University, D-52056 Aachen, Germany.
% [email protected]

result1 = sin(x);
result2 = sqrt(x*2);

Source: http://www.sc.rwth-aachen.de/vehreschild/adimat/example1.html


ADiMat’s Example (cont.)

>> addiff(@f, 'x', 'result1,result2');
>> p = magic(5);
>> g_p = createFullGradients(p);
>> [g_r1, r1, g_r2, r2] = g_f(g_p, p);
>> J1 = [g_r1{:}]; % and
>> J2 = [g_r2{:}];

Source: http://www.sc.rwth-aachen.de/vehreschild/adimat/example1.html


ADiMat’s Example (cont.)

function [g_result1, result1, g_result2, result2] = g_f(g_x, x)
% Compute the sin and square-root of x*2.
% Very simple example for ADiMat website.
% Andre Vehreschild, Institute for Scientific Computing,
% RWTH Aachen University, D-52056 Aachen, Germany.
% [email protected]

g_result1 = ((g_x).* cos(x));
result1 = sin(x);
g_tmp_f_00000 = g_x* 2;
tmp_f_00000 = x* 2;
g_result2 = ((g_tmp_f_00000)./ (2.*sqrt(tmp_f_00000)));
result2 = sqrt(tmp_f_00000);

Source:http://www.sc.rwth-aachen.de/vehreschild/adimat/example1.html
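The generated code is the chain rule applied statement by statement: d/dx sin x = cos x, and d/dx sqrt(2x) = 2/(2·sqrt(2x)). A Python transcription of g_f for scalar inputs checks the two derivative statements (our transcription, not ADiMat output):

```python
import math

# Scalar transcription of ADiMat's generated g_f with seed g_x.
def g_f(g_x, x):
    g_result1 = g_x * math.cos(x)              # derivative of sin(x)
    result1 = math.sin(x)
    g_tmp = g_x * 2                            # derivative of the temp x*2
    tmp = x * 2
    g_result2 = g_tmp / (2 * math.sqrt(tmp))   # chain rule through sqrt
    result2 = math.sqrt(tmp)
    return g_result1, result1, g_result2, result2

g1, r1, g2, r2 = g_f(1.0, 2.0)
assert abs(g1 - math.cos(2.0)) < 1e-12
assert abs(g2 - 1.0 / math.sqrt(2.0 * 2.0)) < 1e-12   # = 1/sqrt(2x)
```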


Matrix Calculus

Definition: If X is p × q and Y is m × n, then dY: = (dY/dX) dX:, where the derivative dY/dX is a large mn × pq matrix and A: denotes the vectorization vec(A).

d(X²): = (X dX + dX X):

d(det(X)) = d(det(Xᵀ)) = det(X) (X⁻ᵀ):ᵀ dX:

d(ln(det(X))) = (X⁻ᵀ):ᵀ dX:

Ref: http://www.ee.ic.ac.uk/hp/staff/dmb/matrix/calculus.html
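The last identity is the familiar d ln det(X) = tr(X⁻¹ dX) written in vec notation. For a 2 × 2 matrix it can be checked by a finite difference in pure Python (this check is ours, not from the slides):

```python
import math

# d(ln det X)/dX[0][0] should equal (X^{-1})[0][0], i.e. tr(X^{-1} E11).
a, b, c, d = 3.0, 1.0, 2.0, 5.0      # X = [[a, b], [c, d]]
det = a * d - b * c
inv00 = d / det                      # top-left entry of X^{-1}

h = 1e-7                             # finite-difference step
fd = (math.log((a + h) * d - b * c) - math.log(det)) / h
assert abs(fd - inv00) < 1e-5
```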


Vandermonde Function

Source: Shaun A. Forth. An Efficient Overloaded Implementation of Forward Mode Automatic Differentiation in MATLAB. ACM Transactions on Mathematical Software, Vol. 32, No. 2, 2006, pp. 195-222


Vandermonde Function (cont.)

Experiment on a PIV 3.0 GHz PC (Windows XP), Matlab Version: 6.5

Source: Shaun A. Forth. An Efficient Overloaded Implementation of Forward Mode Automatic Differentiation in MATLAB. ACM Transactions on Mathematical Software, Vol. 32, No. 2, 2006, pp. 195-222


Vandermonde Function (cont.)

Method       10     20     40     80     160    320    640     1280
Function     0.000  0.000  0.000  0.000  0.000  0.010  0.000   0.000
MAD(Full)    0.070  0.060  0.070  0.130  0.581  2.664  10.535  45.535
MAD(Sparse)  0.071  0.050  0.060  0.060  0.060  0.070  0.100   0.881
INTLab       0.050  0.040  0.040  0.090  0.040  0.050  0.071   0.120
ADiMat       0.231  0.140  0.271  0.601  1.362  3.044  7.340   21.611

Unit of CPU time is seconds. Experiment on a PIII 1000 MHz PC (Windows 2000), Matlab Version: 7.0.1.24704 (R14) Service Pack 1, TOMLAB v5.6, INTLAB Version 5.3, ADiMat (beta) 0.4-r9.


Arrowhead Function

Source: Shaun A. Forth. An Efficient Overloaded Implementation of Forward Mode Automatic Differentiation in MATLAB. ACM Transactions on Mathematical Software, Vol. 32, No. 2, 2006, pp. 195-222


Arrowhead Function (cont.)

Experiment on a PIV 3.0 GHz PC (Windows XP), Matlab Version: 6.5

Source: Shaun A. Forth. An Efficient Overloaded Implementation of Forward Mode Automatic Differentiation in MATLAB. ACM Transactions on Mathematical Software, Vol. 32, No. 2, 2006, pp. 195-222


Arrowhead Function (cont.)

Method       20     40     80     160    320    640    1280
Function     0.010  0.000  0.000  0.000  0.000  0.000  0.000
MAD(Full)    0.180  0.050  0.070  0.200  1.111  4.367  17.796
MAD(Sparse)  0.060  0.060  0.060  0.070  0.080  0.100  0.160
INTLab       0.090  0.051  0.050  0.050  0.081  0.140  0.340
ADiMat       0.911  0.311  0.651  1.262  2.704  6.028  14.581

Unit of CPU time is seconds. Experiment on a PIII 1000 MHz PC (Windows 2000), Matlab Version: 7.0.1.24704 (R14) Service Pack 1, TOMLAB v5.6, INTLAB Version 5.3, ADiMat (beta) 0.4-r9.


BDQRTIC mod


BDQRTIC mod (cont.)

Method       20      40     80     160    320    640     1280
Function     12.809  0.010  0.000  0.000  0.000  0.010   0.000
MAD(Full)    2.604   0.121  0.150  0.490  2.513  10.926  43.162
MAD(Sparse)  0.270   0.120  0.130  0.150  0.201  0.260   0.371
INTLab       2.293   0.080  0.100  0.110  0.150  0.230   0.481
ADiMat       3.455   0.621  1.152  2.544  5.778  14.641  42.671

Unit of CPU time is seconds. Experiment on a PIII 1000 MHz PC (Windows 2000), Matlab Version: 7.0.1.24704 (R14) Service Pack 1, TOMLAB v5.6, INTLAB Version 5.3, ADiMat (beta) 0.4-r9.


Summary of AD softwares in MATLab

Operator overloading makes the forward mode of AD easy to implement via differentiation arithmetic.
All of the AD tools in MATLAB are easy to use.
Sparse storage provides a good way to improve the performance of AD tools.


AD in C/C++ (ADIC)

The Computational Differentiation Group at Argonne National Laboratory

ADIC introduced in 1997 by:

Christian Bischof, Scientific Computing at RWTH Aachen University

Lucas Roh, founder, president and CEO of Hostway Co.

and the other team members.


State of ADIC

ADIC is an Automatic Differentiation tool in ANSI C/C++.

ADIC was introduced in 1997.

Last updated: June 10, 2005.

Official web site: www-new.mcs.anl.gov/adic/down-2.htm.

ADIC uses the forward mode.

Supported Platforms: Unix/Linux.

Selected Application: NEOS

Related Research Group: Argonne National Laboratory, USA


ADIC Anatomy


ADIC Process


func.c

#include "func.h"
#include <math.h>

void func(data_t *pdata)
{
    int i;
    double *x = pdata->x;
    double *y = pdata->y;
    double s, temp;

    s = 0.0;   /* accumulator must start at zero */
    i = 0;
    for (; i < pdata->len; ) {
        s = s + x[i]*y[i];
        i++;
    }

    temp = exp(s);

    pdata->r = temp;
}
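func computes r = exp(Σ xᵢyᵢ), so ∂r/∂xₖ = yₖ · exp(s). Forward-mode tools such as ADIC augment each statement with a derivative statement; a hand-written Python sketch of that transformation (our illustration, not actual ADIC output):

```python
import math

# Forward differentiation of func: propagate ds alongside s,
# seeding dx = e_k to obtain the partial with respect to x[k].
def func_and_partial(x, y, k):
    s, ds = 0.0, 0.0
    for i in range(len(x)):
        ds += (1.0 if i == k else 0.0) * y[i]   # derivative of s += x[i]*y[i]
        s += x[i] * y[i]
    r = math.exp(s)
    dr = ds * math.exp(s)                       # chain rule through exp
    return r, dr

x, y = [0.5, -1.0], [2.0, 0.25]
r, dr = func_and_partial(x, y, k=0)
assert abs(dr - y[0] * r) < 1e-12               # dr/dx0 = y0 * exp(s)
```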


driver.c


Commands

The first command generates the header file ad_deriv.hand derivative function func.ad.c;

The second command compiles and links all neededfunctions and generates ad_func;


Handling Side Effects


For Further Reading in ADIC

Christian H. Bischof, Paul D. Hovland, Boyana Norris.
Implementation of Automatic Differentiation Tools.
PEPM '02, Jan. 14-15, 2002, Portland, OR, USA

Paul D. Hovland and Boyana Norris.
Users' Guide to ADIC 1.1.

C. H. Bischof, L. Roh, A. J. Mauer-Oats.
ADIC: an extensible automatic differentiation tool for ANSI-C.
Mathematics and Computer Science Division, Argonne National Laboratory, Argonne, IL 60439, USA


Reference

C.H. Bischof and H. M. Bucker. Computing Derivatives of Computer Programs, in Modern Methods and Algorithms of Quantum Chemistry: Proceedings, Second Edition, edited by J. Grotendorst, NIC-Directors, 2000, pages 315-327
C. Bischof, A. Carle, P. Khademi, and G. Pusch. Automatic Differentiation: Obtaining Fast and Reliable Derivatives - Fast, in Control Problems in Industry, edited by I. Lasiecka and B. Morton, 1995, pages 1-16
Andreas Griewank. On Automatic Differentiation, in Mathematical Programming: Recent Developments and Applications, edited by M. Iri and K. Tanabe, Kluwer Academic Publishers, 1989
Andreas Griewank. Evaluating Derivatives: Principles and Techniques of Algorithmic Differentiation. Number 19 in Frontiers in Appl. Math. SIAM, Philadelphia, Penn., 2000
Shaun Forth. Introduction to Automatic Differentiation, presentation slides for The 4th International Conference on Automatic Differentiation, July 19-23, University of Chicago, Gleacher Centre, Chicago, USA, 2004
G. F. Corliss. Automatic Differentiation
Warwick Tucker, http://www.math.uu.se/~warwick/vt07/FMB/avnm1.pdf
http://www.autodiff.org/
http://www.ti3.tu-harburg.de/rump/intlab/
http://tomopt.com/tomlab/products/mad/
http://www.sc.rwth-aachen.de/vehreschild/adimat/index.html
Shaun A. Forth. An Efficient Overloaded Implementation of Forward Mode Automatic Differentiation in MATLAB. ACM Transactions on Mathematical Software, Vol. 32, No. 2, 2006, pp. 195-222
Siegfried M. Rump. INTLAB - INTerval LABoratory. Developments in Reliable Computing, Kluwer Academic Publishers, 1999, pp. 77-104
Christian H. Bischof, H. Martin Bucker, Bruno Lang, A. Rasch, Andre Vehreschild. Combining Source Transformation and Operator Overloading Techniques to Compute Derivatives for MATLAB Programs. Proceedings of the Second IEEE International Workshop on Source Code Analysis and Manipulation (SCAM 2002), IEEE Computer Society, 2002


Thanks & Questions

Thanks!

Questions?