
  • Introduction to Algorithmic Differentiation

    Kshitij Kulshreshtha

    Universität Paderborn

    ESSIM Summer School, 13–16 August 2012

    Dresden

  • Schedule

    1. Introduction / Motivation / Function Representation / Forward Mode
    2. Reverse Mode / Higher Order Derivatives
    3. Implementations / ADOL-C
    4. Hands-on Exercises (ADOL-C)

  • Computing Derivatives

    Simulation, Sensitivity Calculation, Optimization

    [Diagram: data and theory feed a modelling step that produces a computer
    program mapping input x to output y for the user; differentiation turns
    this program into an enhanced program that additionally delivers the
    sensitivity ∂y/∂x to an optimization algorithm, which in turn maps
    input x to output y for the user]

  • Computing Derivatives

    Given: a description of the functional relation as
    - a formula: an explicit expression y = F(x)
    - a computer program: ?

    Task: computation of derivatives taking
    - requirements on exactness
    - computational effort
    into account

  • Derivatives: What for?

    - Optimization:
      unconstrained: min f(x),  f: Rⁿ → R
      constrained:   min f(x),  f: Rⁿ → R
                     s.t. c(x) = 0,  c: Rⁿ → Rᵐ
                          h(x) ≤ 0,  h: Rⁿ → Rˡ
    - Solution of nonlinear equation systems:
      F(x) = 0,  F: Rⁿ → Rⁿ
      Newton's method requires F′(x) ∈ Rⁿˣⁿ
    - Simulation of complex procedures:
      - system definition
      - integration of differential equations
    - Sensitivity analysis
    - Real-time control

  • Symbolic Differentiation

    x²       →  2x
    xᵖ       →  p·xᵖ⁻¹
    cos(x)   →  −sin(x)
    exp(x)   →  exp(x)
    log(x)   →  1/x
    g(h(x))  →  g′(h(x)) · h′(x)
    ...

  • Symbolic Differentiation

    - Exact derivatives, e.g.
        f(x) = exp(sin(x²))
        f′(x) = exp(sin(x²)) · cos(x²) · 2x
    - min J(x, u) such that ẋ = f(x, u) + initial conditions;
      reduced formulation J(x, u) → Ĵ(u):
      Ĵ′(u) based on the symbolic adjoint λ̇ = −f_x(x, u)ᵀ λ + terminal conditions
    - Cost (common subexpressions, implementation)
    - Legacy code with a large number of lines:
      a closed-form expression is not available

  • Legacy code (excerpt from euler2d.c):

    read_input_file(argv[1], &code_control);
    code_control.timestep_type = 0;  // calculate timestep size like TAU

    // read in CFD mesh
    read_cfd_mesh(code_control.CFDmesh_name, &gridbase);
    grid[0] = gridbase;
    // remove mesh corner points arising more than once, e.g. for the block
    // structured area and at the interface between the block structured
    // and unstructured area
    remove_double_points(&gridbase, grid);
    // write out mesh in tecplot format
    write_pointdata(name, &(grid[0]));

    // calculate metric of finest grid level
    calc_metric(&(grid[0]), &code_control);
    puts("calc_metric ready");
    // create coarse meshes for multigrid, calculate their metric
    // and initialize forcing functions to zero
    for (i = 1; i < code_control.nlevels; i++) {
        create_coarse_mesh(&(grid[i-1]), &(grid[i]));
        init2zero(&(grid[i]), grid[i].force);
    }
    puts("create_coarse_mesh ready");
    // initialize flow field on all grid levels to free stream quantities
    for (i = 0; i < code_control.nlevels; i++)
        init_field(&(grid[i]), &code_control);
    puts("init_field ready");
    // if selected, read restart file
    if (code_control.restart == 1)
        read_restart("restart", grid, &code_control,
                     &first_residual, &first_step);

    // calculate primitive variables for all grid levels and
    // initialize states at the boundary
    for (i = 0; i < code_control.nlevels; i++) {
        cons2prim(&(grid[i]), &code_control);
        init_bdry_states(&(grid[i]));
    }
    // open file for writing convergence history
    conv = fopen("conv.dat", "w");
    fprintf(conv, "title = convergence\n");
    fprintf(conv, "variables = iter, l2res, lift, drag\n");

    level = 0;
    printf("will perform %d steps\n", code_control.nsteps[level]);

    // starting time of computation
    t1 = time(&t1);
    double lift, drag;

    // loop over all multigrid cycles
    for (it = 0; it < code_control.nsteps[level]; it++) {
        double residual;
        lift = 0.0;
        drag = 0.0;
        // ... time step, residual, lift and drag computation ...
        // print out convergence information to file and standard output
        printf("IT = %d %20.10e %20.10e %20.10e %4.2f\n", sum_it,
               residual / first_residual, lift, drag, weight);
        fprintf(conv, "%d %20.10e %20.10e %20.10e\n", sum_it + first_step,
                residual / first_residual, lift, drag);
        sum_it++;
    }

    // final time of computation
    t2 = time(&t2);
    // print out time needed for the time loop
    printf("Zeit : %f\n", difftime(t2, t1));
    last_step = first_step + code_control.nsteps[0];
    fclose(conv);

    // map solution from cell centers to vertices
    center2point(grid);
    // write out field solution
    write_eulerdata("euler.dat", grid, &code_control);
    // write out solution on walls
    write_surf("eulersurf.dat", grid, &code_control);
    // write restart file
    write_restart("restart", grid, &code_control, first_residual, last_step);

    return 0;
    }

  • Finite Differences

    Idea: Taylor expansion. For smooth f: R → R,

      f(x + h) = f(x) + h·f′(x) + h²·f″(x)/2 + h³·f‴(x)/6 + ...
      ⇒ f(x + h) ≈ f(x) + h·f′(x)

      Df(x) ≈ (f(x + h) − f(x)) / h

    - simple derivative calculation (only function evaluations!)
    - inexact derivatives
    - computational cost often too high:
      F: Rⁿ → R  ⇒  OPS(F′(x)) ≈ (n + 1) · OPS(F(x))
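
    To make the cost estimate concrete, here is a minimal C++ sketch of the
    one-sided difference above (the test function, step size h, and all names
    are illustrative choices, not from the slides):

      #include <cmath>
      #include <vector>

      // One-sided finite-difference gradient: n + 1 evaluations of F,
      // matching the estimate OPS(F'(x)) ≈ (n + 1) · OPS(F(x)).
      template <typename Func>
      std::vector<double> fd_gradient(Func F, std::vector<double> x,
                                      double h = 1e-7) {
          const double f0 = F(x);          // evaluation at the base point
          std::vector<double> g(x.size());
          for (std::size_t i = 0; i < x.size(); ++i) {
              x[i] += h;                   // perturb one coordinate
              g[i] = (F(x) - f0) / h;      // truncation error O(h),
              x[i] -= h;                   // cancellation error O(eps/h)
          }
          return g;
      }

      // Example: f(x) = exp(sin(x1^2)) + x2
      int main() {
          auto f = [](const std::vector<double>& x) {
              return std::exp(std::sin(x[0] * x[0])) + x[1];
          };
          std::vector<double> g = fd_gradient(f, {1.5, 2.0});
          return g.size() == 2 ? 0 : 1;
      }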

  • Example 1

    [figure-only slides: plots for Example 1]

  • Example 2

    [figure-only slides: plots for Example 2]

  • Algorithmic Differentiation (AD)

    Main Products:
    - Quantitative dependence information (local):
      - weighted and directed partial derivatives
      - error and condition number estimates ...
      - Lipschitz constants, interval enclosures ...
      - eigenvalues, Newton steps ...
    - Qualitative dependence information (regional):
      - sparsity structures, degrees of polynomials
      - ranks, eigenvalue multiplicities ...

  • Algorithmic Differentiation (AD)

    Main Properties:
    - No truncation errors
    - Chain rule applied to numbers
    - Applicability to arbitrary programs
    - A priori bounded and/or adjustable costs:
      - total operations count
      - maximal memory requirement
      - total memory traffic
    always relative to the original function. (Griewank, Walther 09)

  • The Hello-World Example of AD

    [Figure: a lighthouse whose beam at angle ωt hits the quay, given by the
    line y₂ = γ·y₁, at the point (y₁, y₂)]

    y₁ = ν·tan(ωt) / (γ − tan(ωt))   and   y₂ = γ·ν·tan(ωt) / (γ − tan(ωt))

  • Automation: computer algebra tools such as Maple, Mathematica, ...

    Maple provides

    [screenshots of Maple output for the example]

  • Forward Mode of AD

    [Figure: F maps the curve x(t) to y(t); the tangent ẋ at x is mapped to
    the tangent ẏ at y = F(x)]

    ẏ(t) = ∂/∂t F(x(t)) = F′(x(t)) · ẋ(t) ≡ Ḟ(x, ẋ)

  • Forward Mode (Lighthouse)

    Function:                  Tangent:
    v₋₃ = x₁                   v̇₋₃ = ẋ₁
    v₋₂ = x₂                   v̇₋₂ = ẋ₂
    v₋₁ = x₃                   v̇₋₁ = ẋ₃
    v₀  = x₄ = t               v̇₀  = ẋ₄
    v₁ = v₋₁ · v₀              v̇₁ = v̇₋₁·v₀ + v₋₁·v̇₀
    v₂ = tan(v₁)               v̇₂ = v̇₁ / cos²(v₁)
    v₃ = v₋₂ − v₂              v̇₃ = v̇₋₂ − v̇₂
    v₄ = v₋₃ · v₂              v̇₄ = v̇₋₃·v₂ + v₋₃·v̇₂
    v₅ = v₄ / v₃               v̇₅ = (v̇₄ − v̇₃·v₅) · (1/v₃)
    v₆ = v₅ · v₋₂              v̇₆ = v̇₅·v₋₂ + v₅·v̇₋₂
    y₁ = v₅                    ẏ₁ = v̇₅
    y₂ = v₆                    ẏ₂ = v̇₆

  • Complexity (Forward Mode)

                v = c    v = u ± w    v = u·w    v = φ(u)
    MOVES       1 + 1    3 + 3        3 + 3      2 + 2
    ADDS        0        1 + 1        0 + 1      0 + 0
    MULTS       0        0            1 + 2      0 + 1
    NLOPS       0        0            0          1 + 1

    OPS(F′(x)·ẋ) ≤ c · OPS(F(x))

    with c ∈ [2, 5/2] platform dependent

  • Reverse Mode AD = Discrete Adjoints

    [Figure: F maps x to y; the normal ȳ of the hyperplane ȳᵀy = c in the
    range is pulled back to the normal of its preimage ȳᵀF(x) = c]

    x̄ᵀ ≡ ȳᵀ F′(x),  i.e.  x̄ = F′(x)ᵀ ȳ ≡ F̄(x, ȳ)

  • Reverse Mode (Lighthouse)

    Forward sweep:
    v₋₃ = x₁;  v₋₂ = x₂;  v₋₁ = x₃;  v₀ = x₄;
    v₁ = v₋₁ · v₀;    v₂ = tan(v₁);
    v₃ = v₋₂ − v₂;    v₄ = v₋₃ · v₂;
    v₅ = v₄ / v₃;     v₆ = v₅ · v₋₂;
    y₁ = v₅;  y₂ = v₆;

    Reverse sweep:
    v̄₅ = ȳ₁;  v̄₆ = ȳ₂;
    v̄₅ += v̄₆·v₋₂;   v̄₋₂ += v̄₆·v₅;
    v̄₄ += v̄₅/v₃;    v̄₃ -= v̄₅·v₅/v₃;
    v̄₋₃ += v̄₄·v₂;   v̄₂ += v̄₄·v₋₃;
    v̄₋₂ += v̄₃;      v̄₂ -= v̄₃;
    v̄₁ += v̄₂/cos²(v₁);
    v̄₋₁ += v̄₁·v₀;   v̄₀ += v̄₁·v₋₁;
    x̄₄ = v̄₀;  x̄₃ = v̄₋₁;  x̄₂ = v̄₋₂;  x̄₁ = v̄₋₃;

  • Complexity (Reverse Mode)

                v = c    v = u ± w    v = u·w    v = φ(u)
    MOVES       1 + 1    3 + 6        3 + 8      2 + 5
    ADDS        0        1 + 2        0 + 2      0 + 1
    MULTS       0        0            1 + 2      0 + 1
    NLOPS       0        0            0          1 + 1

    OPS(ȳᵀF′(x)) ≤ c · OPS(F(x))
    MEM(ȳᵀF′(x)) ~ OPS(F(x))

    with c ∈ [3, 4] platform dependent

    Remarks:
    - The cost of the gradient calculation is independent of n
    - The memory requirement may cause problems! → Checkpointing

  • Algorithmic Differentiation (AD)

    Differentiation of computer programs within machine precision

    Basic forms:
    forward mode:  OPS(F′(x)ẋ) ≤ c₁ · OPS(F),    c₁ ∈ [2, 5/2]
    reverse mode:  OPS(ȳᵀF′(x)) ≤ c₂ · OPS(F),   c₂ ∈ [3, 4]
                   MEM(ȳᵀF′(x)) ~ OPS(F)
    combination:   OPS(ȳᵀF″(x)ẋ) ≤ c₃ · OPS(F),  c₃ ∈ [7, 10]

    Tasks: derivatives (of any order), sparsity patterns, differentiation of
    fixed-point iterations and time-stepping procedures, ...

    Implementations?

  • Intermediate Conclusions

    - Evaluation of derivatives with working accuracy
    - Efficient evaluation of derivatives of any order
    - Numerous tools available,
      based on source transformation or operator overloading
    - Nondifferentiability only on a meager set
    - Optimal Jacobian accumulation is NP-hard
    - Full Jacobians/Hessians often not needed, or sparse
    - see www.autodiff.org

  • Forward Mode of AD

    [Figure: F maps the curve x(t) to y(t); the tangent ẋ at x is mapped to
    the tangent ẏ at y = F(x)]

    ẏ(t) = ∂/∂t F(x(t)) = F′(x(t)) · ẋ(t) ≡ Ḟ(x, ẋ)

  • Evaluation procedure for

      F(x) = exp(sin(x²)),   d/dt F(x(t)) = exp(sin(x²)) · cos(x²) · 2x · ẋ

    v₀ = x          v̇₀ = ẋ
    v₁ = v₀²        v̇₁ = 2·v₀·v̇₀
    v₂ = sin(v₁)    v̇₂ = cos(v₁)·v̇₁
    v₃ = exp(v₂)    v̇₃ = exp(v₂)·v̇₂
    y  = v₃         ẏ  = v̇₃ = d/dt F(x(t))
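
    A direct C++ transcription of this tangent procedure might look as
    follows (the function name F_tangent is illustrative):

      #include <cmath>

      // Hand-coded tangent of F(x) = exp(sin(x^2)): a line-by-line
      // translation of the trace above.
      // Returns y = F(x) and sets *ydot = F'(x) * xdot.
      double F_tangent(double x, double xdot, double* ydot) {
          double v0 = x,            v0d = xdot;
          double v1 = v0 * v0,      v1d = 2.0 * v0 * v0d;
          double v2 = std::sin(v1), v2d = std::cos(v1) * v1d;
          double v3 = std::exp(v2), v3d = std::exp(v2) * v2d;
          *ydot = v3d;              // = d/dt F(x(t))
          return v3;
      }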

  • General: with uᵢ ≡ (vⱼ)_{j≺i}

      v = φᵢ(uᵢ)   ⇒   v̇ = φᵢ′(uᵢ) · u̇ᵢ

    Elementary tangents for intrinsic functions:
    v = sin(u)   →  v̇ = cos(u) · u̇
    v = √u       →  v̇ = 0.5 · u̇ / v
    v = tan(u)   →  v̇ = u̇ / cos²(u)

    Elementary tangents for arithmetic operations:
    v = u ± w    →  v̇ = u̇ ± ẇ
    v = u · w    →  v̇ = ẇ·u + u̇·w
    v = u / w    →  v̇ = (u̇ − v·ẇ) / w

  • Forward Differentiation of the Lighthouse Example

    v₋₃ = x₁         v̇₋₃ = ẋ₁
    v₋₂ = x₂         v̇₋₂ = ẋ₂
    v₋₁ = x₃         v̇₋₁ = ẋ₃
    v₀  = x₄ = t     v̇₀  = ẋ₄
    v₁ = v₋₁ · v₀    v̇₁ = v̇₋₁·v₀ + v₋₁·v̇₀
    v₂ = tan(v₁)     v̇₂ = v̇₁ / cos²(v₁)
    v₃ = v₋₂ − v₂    v̇₃ = v̇₋₂ − v̇₂
    v₄ = v₋₃ · v₂    v̇₄ = v̇₋₃·v₂ + v₋₃·v̇₂
    v₅ = v₄ / v₃     v̇₅ = (v̇₄ − v̇₃·v₅) · (1/v₃)
    v₆ = v₅          v̇₆ = v̇₅
    v₇ = v₅ · v₋₂    v̇₇ = v̇₅·v₋₂ + v₅·v̇₋₂
    y₁ = v₆          ẏ₁ = v̇₆
    y₂ = v₇          ẏ₂ = v̇₇

  • Define φ̇ᵢ(uᵢ, u̇ᵢ) ≡ φᵢ′(uᵢ) · u̇ᵢ

    General tangent procedure:

    [vᵢ₋ₙ, v̇ᵢ₋ₙ] = [xᵢ, ẋᵢ]               i = 1, ..., n
    [vᵢ, v̇ᵢ]     = [φᵢ(uᵢ), φ̇ᵢ(uᵢ, u̇ᵢ)]    i = 1, ..., l
    [yₘ₋ᵢ, ẏₘ₋ᵢ] = [vₗ₋ᵢ, v̇ₗ₋ᵢ]            i = m − 1, ..., 0

  • Complexity:

                v = c    v = u ± w    v = u·w    v = φ(u)
    MOVES       1 + 1    3 + 3        3 + 3      2 + 2
    ADDS        0        1 + 1        0 + 1      0 + 0
    MULTS       0        0            1 + 2      0 + 1
    NLOPS       0        0            0          1 + 1

    OPS(F′(x)·ẋ) ≤ c · OPS(F(x))

    with c ∈ [2, 5/2] platform dependent

    Remarks:
    - How to arrange the evaluation of [vᵢ, v̇ᵢ] optimally?
    - Overwrites are no problem

  • Reverse Mode of AD

    [Figure: F maps x to y; the normal ȳ of the hyperplane ȳᵀy = c in the
    range is pulled back to the normal of its preimage ȳᵀF(x) = c]

    x̄ᵀ ≡ ȳᵀ F′(x),  i.e.  x̄ = F′(x)ᵀ ȳ ≡ F̄(x, ȳ)

  • Evaluation procedure for

      F(x) = exp(sin(x²)),   ȳ·F′(x) = ȳ · exp(sin(x²)) · cos(x²) · 2x

    v₀ = x           v̄₃ = ȳ
    v₁ = v₀²         v̄₂ = exp(v₂)·v̄₃
    v₂ = sin(v₁)     v̄₁ = cos(v₁)·v̄₂
    v₃ = exp(v₂)     v̄₀ = 2·v₀·v̄₁
    y  = v₃          x̄  = v̄₀
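
    The corresponding hand-coded adjoint in C++, as a minimal sketch (the
    name F_adjoint is illustrative): the forward sweep records the
    intermediates, the reverse sweep propagates the adjoints.

      #include <cmath>

      // Hand-coded adjoint of F(x) = exp(sin(x^2)).
      // Returns y = F(x) and sets *xbar = ybar * F'(x).
      double F_adjoint(double x, double ybar, double* xbar) {
          // forward sweep: record the intermediates
          double v0 = x;
          double v1 = v0 * v0;
          double v2 = std::sin(v1);
          double v3 = std::exp(v2);
          // reverse sweep: propagate the adjoints
          double v3b = ybar;
          double v2b = std::exp(v2) * v3b;
          double v1b = std::cos(v1) * v2b;
          double v0b = 2.0 * v0 * v1b;
          *xbar = v0b;              // = ybar * F'(x)
          return v3;
      }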

  • Computation of ȳ F′(x)?

    Rewrite the general tangent procedure

    [vᵢ₋ₙ, v̇ᵢ₋ₙ] = [xᵢ, ẋᵢ]               i = 1, ..., n
    [vᵢ, v̇ᵢ]     = [φᵢ(uᵢ), φ̇ᵢ(uᵢ, u̇ᵢ)]    i = 1, ..., l
    [yₘ₋ᵢ, ẏₘ₋ᵢ] = [vₗ₋ᵢ, v̇ₗ₋ᵢ]            i = m − 1, ..., 0

    in matrix-vector notation.

  • Define

      cᵢⱼ ≡ ∂φᵢ/∂vⱼ

    and let Aᵢ ∈ R⁽ⁿ⁺ˡ⁾ˣ⁽ⁿ⁺ˡ⁾ be the identity matrix whose (n+i)-th row is
    replaced by the row of elemental partials (cᵢ,₁₋ₙ, cᵢ,₂₋ₙ, ..., cᵢ,ᵢ₋₁, 0, ..., 0),
    i.e.

      Aᵢ = I + eₙ₊ᵢ [∇φᵢ(uᵢ) − eₙ₊ᵢ]ᵀ,   i = 1, ..., l,

      Pₙ ≡ [I, 0, ..., 0] ∈ Rⁿˣ⁽ⁿ⁺ˡ⁾,
      Qₘ ≡ [0, ..., 0, I] ∈ Rᵐˣ⁽ⁿ⁺ˡ⁾.

  • The tangent operation v̇ᵢ = φᵢ′(uᵢ)·u̇ᵢ equals v̇ = Aᵢ·v̇.

    The linear transformation ẏ = F′(x)·ẋ equals

      ẏ = Qₘ Aₗ Aₗ₋₁ ... A₂ A₁ Pₙᵀ ẋ,

    with the product representation

      F′(x) = Qₘ Aₗ Aₗ₋₁ ... A₂ A₁ Pₙᵀ ∈ Rᵐˣⁿ.

    The linear transformation x̄ = ȳ F′(x) (reverse mode) equals

      x̄ᵀ = Pₙ A₁ᵀ A₂ᵀ ... Aₗ₋₁ᵀ Aₗᵀ Qₘᵀ ȳᵀ.

  • Because of

      Aᵢᵀ = I + [∇φᵢ(uᵢ) − eₙ₊ᵢ] eₙ₊ᵢᵀ,

    the product Aᵢᵀ v̄ with v̄ ∈ Rⁿ⁺ˡ forms an incremental operation:

    1. indices j ≠ i with j ⊀ i: v̄ⱼ is left unchanged
    2. indices j ≺ i: v̄ⱼ is incremented by v̄ᵢ·cᵢⱼ
    3. index i: v̄ᵢ is set to zero

    Hence, for computing x̄ = ȳ F′(x) one has to
    - evaluate all intermediates vᵢ, obtaining the partial derivatives cᵢⱼ
    - increment the adjoint values v̄ⱼ in reverse order.

  • Corresponding Incremental Adjoint Recursion

    v̄ᵢ = 0                                  i = 1 − n, ..., l
    vᵢ₋ₙ = xᵢ                               i = 1, ..., n
    vᵢ = φᵢ(vⱼ)_{j≺i}                        i = 1, ..., l
    yₘ₋ᵢ = vₗ₋ᵢ                              i = m − 1, ..., 0
    v̄ₗ₋ᵢ = ȳₘ₋ᵢ                              i = 0, ..., m − 1
    v̄ⱼ += v̄ᵢ · ∂φᵢ(uᵢ)/∂vⱼ   for j ≺ i       i = l, ..., 1
    x̄ᵢ = v̄ᵢ₋ₙ                                i = n, ..., 1

    No overwriting ⇒ zeroing out of the v̄ᵢ is omitted.

  • Adjoint Recursion of the Lighthouse Example

    Forward sweep:
    v₋₃ = x₁;  v₋₂ = x₂;  v₋₁ = x₃;  v₀ = x₄;
    v₁ = v₋₁ · v₀;    v₂ = tan(v₁);
    v₃ = v₋₂ − v₂;    v₄ = v₋₃ · v₂;
    v₅ = v₄ / v₃;     v₆ = v₅ · v₋₂;
    y₁ = v₅;  y₂ = v₆;

    Reverse sweep:
    v̄₅ = ȳ₁;  v̄₆ = ȳ₂;
    v̄₅ += v̄₆·v₋₂;   v̄₋₂ += v̄₆·v₅;
    v̄₄ += v̄₅/v₃;    v̄₃ -= v̄₅·v₅/v₃;
    v̄₋₃ += v̄₄·v₂;   v̄₂ += v̄₄·v₋₃;
    v̄₋₂ += v̄₃;      v̄₂ -= v̄₃;
    v̄₁ += v̄₂/cos²(v₁);
    v̄₋₁ += v̄₁·v₀;   v̄₀ += v̄₁·v₋₁;
    x̄₄ = v̄₀;  x̄₃ = v̄₋₁;  x̄₂ = v̄₋₂;  x̄₁ = v̄₋₃;

  • Reverse Operations of Elementals

    Elemental        Reverse
    v = c            v̄ = 0
    v = u − w        ū += v̄;  w̄ -= v̄;  v̄ = 0
    v = u · w        ū += v̄·w;  w̄ += v̄·u;  v̄ = 0
    v = 1/u          ū -= (v·v)·v̄;  v̄ = 0
    v = √u           ū += 0.5·v̄/v;  v̄ = 0
    v = uᶜ           ū += (v̄·c)·v/u;  v̄ = 0
    v = exp(u)       ū += v̄·v;  v̄ = 0
    v = log(u)       ū += v̄/u;  v̄ = 0
    v = sin(u)       ū += v̄·cos(u);  v̄ = 0

  • Complexity:

                v = c    v = u ± w    v = u·w    v = φ(u)
    MOVES       1 + 1    3 + 6        3 + 8      2 + 5
    ADDS        0        1 + 2        0 + 2      0 + 1
    MULTS       0        0            1 + 2      0 + 1
    NLOPS       0        0            0          1 + 1

    OPS(ȳ F′(x)) ≤ c · OPS(F(x))

    with c ∈ [3, 5] platform dependent

    Remark:
    - Overwrites cause problems! See checkpointing.

  • Interpretation of the adjoint values v̄ᵢ:

    - Lagrange multipliers of
        min  ȳᵀy
        s.t. φᵢ(uᵢ) − vᵢ = 0    i = 1, ..., l
             vₗ₋ᵢ − yₘ₋ᵢ = 0    i = 1, ..., m

    - Error propagation factors: with perturbed intermediates
        ṽᵢ = vᵢ(1 + εᵢ),  |εᵢ| ≤ macheps,
      one has
        ỹ − y = Σᵢ₌₁ˡ v̄ᵢ vᵢ εᵢ + h.o.t.  ≤  Σᵢ₌₁ˡ |v̄ᵢ vᵢ| · macheps + h.o.t.

  • Vector Modes

    How to reduce the overhead in forward and reverse mode?
    Propagate a bundle of vectors instead of a single vector!

    Forward mode: compute Y = F′(x)·X ∈ Rᵐˣᵖ for X ∈ Rⁿˣᵖ

    instead of

    ẏ = F′(x)·ẋ ∈ Rᵐ for ẋ ∈ Rⁿ.

    - Replace v̇ⱼ ∈ R by V̇ⱼ ∈ Rᵖ in the tangent procedure
    - Replace ż ∈ R^(l−m) by Ż ∈ R^((l−m)×p)
    - Everything else remains unchanged

  • Vector Tangents:

    v = u ± w     →  V̇ = U̇ ± Ẇ
    v = u · w     →  V̇ = w·U̇ + u·Ẇ
    v = sin(u)    →  V̇ = cos(u)·U̇

    Complexity:
                v = c    v = u ± w    v = u·w    v = φ(u)
    MOVES       1 + p    3 + 3p       3 + 3p     2 + 2p
    ADDS        0        1 + p        0 + p      0 + 0
    MULTS       0        0            1 + 2p     0 + p
    NLOPS       0        0            0          2

    OPS(F′(x)·X) ≤ c · OPS(F(x))

    with c ∈ [1 + p, 1 + 1.5p]
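
    As a sketch of how one elemental propagates a bundle of p tangents at
    once (the names VecAd and vmul are illustrative):

      #include <cstddef>
      #include <vector>

      struct VecAd {
          double val;                // the value v
          std::vector<double> dot;   // p tangent directions Vdot
      };

      // v = u * w with its vector tangent Vdot = w*Udot + u*Wdot:
      // the elemental itself is evaluated once and shared by all directions.
      VecAd vmul(const VecAd& u, const VecAd& w) {
          VecAd v;
          v.val = u.val * w.val;
          v.dot.resize(u.dot.size());
          for (std::size_t k = 0; k < u.dot.size(); ++k)
              v.dot[k] = w.val * u.dot[k] + u.val * w.dot[k];
          return v;
      }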

  • Applications:

    - Computation of the full Jacobian: set Ẋ = Iₙ, then

        OPS(F′(x)) ≤ (1 + 1.5·n) · OPS(F(x))

      Compare with the scalar mode, i.e. F′(x)·eᵢ, i = 1, ..., n:

        OPS(F′(x)) ≈ 2.5·n · OPS(F(x))

      Remarkable run-time savings can be obtained.
    - Computation of sparse Jacobians

  • Reverse mode:

    Compute X̄ = Ȳ·F′(x) ∈ R^(q×n) for Ȳ ∈ R^(q×m)

    instead of

    x̄ = ȳ·F′(x) ∈ Rⁿ for ȳ ∈ Rᵐ.

    Implementation:
    - Replace v̄ᵢ ∈ R by V̄ᵢ ∈ R^q in the adjoint recursion
    - Everything else remains unchanged

  • Complexity:

                v = c    v = u ± w    v = u·w    v = φ(u)
    MOVES       1 + q    3 + 6q       5 + 6q     3 + 4q
    ADDS        0        1 + 2q       0 + 2q     0 + q
    MULTS       0        0            1 + 2q     0 + q
    NLOPS       0        0            0          2

    OPS(Ȳ F′(x)) ≤ c · OPS(F(x))

    with c ∈ [1 + 2q, 1.5 + 2.5q]

    Remarks:
    - Computation of the full Jacobian
    - Remarkable run-time savings can be obtained
    - Application: e.g. matrix compression for sparse Jacobians

  • General Evaluation Procedure:

    vᵢ₋ₙ = xᵢ              for i = 1, ..., n
    vᵢ = φᵢ(vⱼ)_{j≺i}       for i = 1, ..., l
    yₘ₋ᵢ = vₗ₋ᵢ             for i = m − 1, ..., 0

    where
    - vᵢ, i ≤ 0, are the independents
    - vᵢ, i ≥ l − m + 1, are the dependents
    - φᵢ is an elemental function, Φ = set of elemental functions
    - j ≺ i is the dependence relation: vᵢ depends directly on vⱼ

  • Essential / Optional / Vector elementals

                Essential                 Optional                    Vector
    Smooth      u + v, u · v, c,          u − v, u/v, c ± u, c·u,     Σₖ uₖvₖ, Σₖ cₖuₖ,
                rec(u) = 1/u,             uᶜ, arcsin(u), tan(u),      Πₖ uₖ
                exp(u), log(u),           |u|ᶜ with c > 1, ...
                sin(u), cos(u)
    Lipschitz   abs(u) = |u|,             max(u, v), min(u, v)        max{|uₖ|}ₖ, Σₖ |uₖ|,
                ‖(u, v)‖ = √(u² + v²)                                 √(Σₖ uₖ²)
    General     heav(u)                   sign(u), (u > 0) ? v : w

  • Representation as computational graph

    [Figure: computational graph of the lighthouse example, with independent
    nodes v₋₃, v₋₂, v₋₁, v₀ and intermediate nodes v₁, ..., v₇]

  • Representation as an extended system E(x; v) = 0

    - Define
        uᵢ ≡ (vⱼ)_{j≺i}        for i = 1 − n, ..., l
        φᵢ(uᵢ) ≡ xᵢ₊ₙ           for i = 1 − n, ..., 0
        cᵢⱼ ≡ ∂φᵢ/∂vⱼ, called the elemental partials
    - Assume the dependent variables are mutually independent.

    Then one has:
    - i < 1 or j > l − m implies cᵢⱼ ≡ 0
    - The evaluation procedure is equivalent to the nonlinear system

        0 = E(x; v) ≡ (φᵢ(uᵢ) − vᵢ)_{i = 1−n, ..., l}

  • Topics to keep in mind:

    - Composite differentiation (overall differentiability):
      here it is assumed that all φᵢ are differentiable at their
      respective arguments.
    - Overwriting = shared allocation of vᵢ and vⱼ:
      not considered here, but there are ways to handle it.
    - Aliasing of arguments and results:
      not considered here, but again there are ways to handle it.

  • Main Results:

    - Gradients are cheap: ~ function costs
    - Costs grow only quadratically in the degree
    - Nondifferentiability only on a meager set
    - Optimal Jacobian accumulation is NP-hard
    - Jacobian accumulation is often unnecessary

  • Higher Order Derivatives

    Second derivatives: use forward and reverse differentiation together!

      y = F(x)       --reverse differentiation-->   x̄ = ȳ F′(x)

      x̄ = ȳ F′(x)    --forward differentiation-->   x̄̇ = ȳ F″(x) ẋ + ȳ̇ F′(x)

    For a scalar-valued function F with ȳ = 1, ȳ̇ = 0, and direction ẋ one has

      x̄̇ = F″(x) ẋ = Hessian-vector product
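
    A minimal sketch of this forward-over-reverse combination for the scalar
    example f(x) = exp(sin(x²)): running the hand-coded adjoint from earlier
    in dual-number arithmetic yields the Hessian-vector product in the dual
    part (all names here are illustrative):

      #include <cmath>

      struct Dual { double v, d; };            // value and tangent

      Dual operator*(Dual a, Dual b) { return {a.v*b.v, a.d*b.v + a.v*b.d}; }
      Dual dsin(Dual a) { return {std::sin(a.v),  std::cos(a.v) * a.d}; }
      Dual dcos(Dual a) { return {std::cos(a.v), -std::sin(a.v) * a.d}; }
      Dual dexp(Dual a) { return {std::exp(a.v),  std::exp(a.v) * a.d}; }

      // Returns f''(x) * xdot for f(x) = exp(sin(x^2)).
      double hess_vec(double x, double xdot) {
          Dual v0 = {x, xdot};                 // independent with tangent
          Dual v1 = v0 * v0;                   // forward sweep
          Dual v2 = dsin(v1);
          Dual v3 = dexp(v2);
          (void)v3;                            // y = F(x), not needed below
          Dual v3b = {1.0, 0.0};               // reverse sweep with ybar = 1
          Dual v2b = dexp(v2) * v3b;
          Dual v1b = dcos(v1) * v2b;
          Dual two = {2.0, 0.0};
          Dual v0b = two * v0 * v1b;
          // v0b.v = f'(x); its dual part is the directional second derivative
          return v0b.d;                        // = f''(x) * xdot
      }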

  • Implementation

    [Diagram: source code for y = F(x) ∈ Rᵐ is compiled into object code.
    Forward mode: from (x, ẋ) ∈ R²ⁿ compute y = F(x) ∈ Rᵐ and
    ẏ = F′(x)·ẋ ∈ Rᵐ. Reverse mode: intermediate values are stored on a
    stack (tape); from (x, ȳ) ∈ Rⁿ⁺ᵐ compute x̄ = ȳ F′(x) ∈ Rⁿ and, combined
    with the forward mode, x̄̇ = ȳ F″(x) ẋ ∈ Rⁿ]

  • Higher derivatives: think in Taylor polynomials

    Let x be the path

      x(t) ≡ Σⱼ₌₀ᵈ xⱼ tʲ : R → Rⁿ   with   xⱼ = (1/j!) · ∂ʲx(t)/∂tʲ |_{t=0}

    Hence
    - xⱼ is the scaled derivative at t = 0,
    - x₁ = tangent, x₂ = curvature.

    Now consider F: Rⁿ → Rᵐ, d times continuously differentiable.
    ⇒ y(t) ≡ F(x(t)): R → Rᵐ is smooth (chain rule).

  • There exist Taylor coefficients yⱼ ∈ Rᵐ, 0 ≤ j ≤ d, with

      y(t) = Σⱼ₌₀ᵈ yⱼ tʲ + O(t^(d+1)).

    Using the chain rule, one obtains:

      y₀ = F(x₀)
      y₁ = F′(x₀)x₁
      y₂ = F′(x₀)x₂ + ½ F″(x₀)x₁x₁
      y₃ = F′(x₀)x₃ + F″(x₀)x₁x₂ + ⅙ F‴(x₀)x₁x₁x₁

    ⇒ directional derivatives of high order (forward mode):

      OPS(x₀, ..., x_d → y₀, ..., y_d) ≈ d² · OPS(x₀ → y₀)

    using Taylor arithmetic.
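
    A minimal sketch of the Taylor arithmetic behind this estimate:
    propagating d + 1 coefficients through a multiplication is a Cauchy
    product and costs O(d²) operations, which is where the overall d² factor
    comes from (the type alias and function name are illustrative):

      #include <cstddef>
      #include <vector>

      using Taylor = std::vector<double>;   // coefficients [x0, x1, ..., xd]

      // Truncated Taylor multiplication v = u * w:
      // v_j = sum_{k=0..j} u_k * w_{j-k}
      Taylor mul(const Taylor& u, const Taylor& w) {
          const std::size_t d = u.size();
          Taylor v(d, 0.0);
          for (std::size_t j = 0; j < d; ++j)        // coefficient of t^j
              for (std::size_t k = 0; k <= j; ++k)   // Cauchy product
                  v[j] += u[k] * w[j - k];
          return v;
      }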

  • Cross Country / Preaccumulation

    Consider again the Jacobian of the extended system E(x; v):

      E′(x; v) =  ⎛ −I    0    0 ⎞
                  ⎜  B   L−I   0 ⎟
                  ⎝  R    T   −I ⎠

    with F′(x) = R + T(I − L)⁻¹B = Schur complement

    - Forward mode: row-block elimination of T by L − I
    - Reverse mode: column-block elimination of B by L − I
    - Cross country: apply another elimination order

  • Another Interpretation

    Elimination in the Jacobian of the extended system =
    vertex elimination in the computational graph.

    [Figure: example graph for forward and reverse mode, with vertices
    −1, 0, 1, 2, 3, 4, 5, 6 and edge labels c₁,₀, c₁,₋₁, c₂,₁, c₃,₂, c₄,₃,
    c₅,₁, c₅,₄, c₆,₀, c₆,₄]

  • AD by Source Transformation

    Code for Simulation → Code for Derivatives

    func.src →
      1. Lexical analysis
      2. Syntax analysis
      3. Semantic analysis
      4. Static data flow analyses (symbol table, error handler)
      5. AD !!!
      6. Code optimization
      7. Unparsing
    → func&der.drc

  • Remarks

    Pros:
    - Makes use of compiler optimization !!
    - Derivative code is readable

    Cons:
    - Requires a complete compiler, activity analysis, new language features
    - Flexibility

  • Speelpenning's Example

    Speelpenning's function f(x) = Πᵢ₌₁ⁿ xᵢ coded in Fortran:

    SUBROUTINE speelpenning(x, y)
      DOUBLE PRECISION x(10), y
      INTEGER i
      y = 1.0
      DO i = 1, 10
        y = y*x(i)
      END DO
    END

  • Forward Mode Differentiation

    Requirements for the AD tool:
    - determine active variables
    - associate derivative objects
    - generate differentiated statements
    - generate differentiated subroutines if required

  • Forward Mode Differentiation

    SUBROUTINE SPEELPENNING_D(x, xd, y, yd)
      DOUBLE PRECISION x(10), xd(10), y, yd
      INTEGER i
      y = 1.0
      yd = 0.D0
      DO i = 1, 10
        yd = yd*x(i) + y*xd(i)
        y = y*x(i)
      END DO
    END

  • Reverse Mode Differentiation

    Requirements for the AD tool:
    - determine active variables
    - associate derivative objects
    - generate adjoint statements
    - determine values to be recorded
    - reverse the control flow
    - generate the recording part of the code

  • Reverse Mode Differentiation

    SUBROUTINE SPEELPENNING_B(x, xb, y, yb)
      DOUBLE PRECISION x(10), xb(10), y, yb
      INTEGER i, adTo
      y = 1.0
      DO i = 1, 10
        CALL PUSHREAL8(y)
        y = y*x(i)
      END DO
      CALL PUSHINTEGER4(i - 1)
      CALL POPINTEGER4(adTo)
      DO i = adTo, 1, -1
        CALL POPREAL8(y)
        xb(i) = xb(i) + y*yb
        yb = x(i)*yb
      END DO
      yb = 0.D0
    END

  • Source Transformation Tools

    Fortran:
    AMPL        Bell Labs, USA                    forward + reverse
    NAG         RWTH Aachen, Germany + NAG        forward + reverse
    TAF         FastOpt, Germany                  forward + reverse
    Tapenade    INRIA, France                     forward + reverse
    OpenAD      ANL, USA; RWTH Aachen, Germany    forward
    ...

    All tools provide at most second derivatives!
    Goal: get AD into the compiler.

    Matlab:
    ADIMAT          RWTH Aachen, Germany          forward
    MAD (in TOMS)   Univ. Shrivenham, GB          forward
    ...

  • AD by Operator Overloading

    Implementation strategies:
    - New variable type for active variables,
      i.e., for independents, dependents, intermediates (!)
    - Derivative information included as
      - a field in a composite structure
      - generation of an internal function representation

    Principle: new class definition

      type adouble = class
        VAL : double;
        ???
      end;

  • AD by a Field in a Composite Structure

    Idea for the forward mode:

      type adouble = class
        VAL : double;
        DER : array [1..p] of double;
      end;

    All operations are overloaded, e.g., multiplication:

      operator * (U, V : adouble) W : adouble;
        i : integer;
      begin
        W.VAL := U.VAL * V.VAL;
        for i := 1 to p do
          W.DER[i] := U.VAL*V.DER[i] + U.DER[i]*V.VAL;
      end;
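
    Rendered in C++, the same idea might look as follows; this is a sketch
    akin to a tapeless forward mode, not ADOL-C's actual implementation (the
    type name fwd_adouble and the compile-time p are illustrative):

      #include <cmath>

      constexpr int p = 2;                 // number of tangent directions

      struct fwd_adouble {
          double val;
          double der[p];
      };

      // overloaded multiplication: product rule per direction
      fwd_adouble operator*(const fwd_adouble& u, const fwd_adouble& v) {
          fwd_adouble w;
          w.val = u.val * v.val;
          for (int i = 0; i < p; ++i)
              w.der[i] = u.val * v.der[i] + u.der[i] * v.val;
          return w;
      }

      // overloaded intrinsic: chain rule per direction
      fwd_adouble sin(const fwd_adouble& u) {
          fwd_adouble w;
          w.val = std::sin(u.val);
          for (int i = 0; i < p; ++i)
              w.der[i] = std::cos(u.val) * u.der[i];
          return w;
      }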

  • Remarks:

    - Implements the basic rules of differentiation for
      - each arithmetic operation
      - each intrinsic function
    - Easy to realize → tapeless forward mode of ADOL-C

    Idea for the reverse mode: keep the graph in core.
    Calculating adjoints needs backward pointers → new node type:

    [Node layout: VAL, DER, ADJ fields plus IND pointers to the arguments]

  • W := U * V generates

    [Figure: a node W with fields VAL, DER, IND, ADJ whose IND pointers
    reference the argument nodes U and V, each with their own VAL, DER,
    IND, ADJ fields]

    Data dependency in the opposite direction !!

  • Remarks:

    - Calculation of derivatives is only possible after a
      function evaluation at the same point
    - The complete computational graph is kept in core:
      the amount of internal storage may cause problems
    - Access to the nodes is largely random

    Conclusion:
    Derivative calculation in an active class is OK for the forward mode, but
    not practicable for the reverse mode on large problems !!

  • AD by Internal Representation

    Idea:

      class adouble {
        VAL : double;
        IND : integer;
      }

    W = U * V causes recording onto three tapes:

      INDEXCOUNT += 1
      W.IND = INDEXCOUNT
      W.VAL = U.VAL * V.VAL
      W.VAL                  → VALUE-TAPE
      MULT                   → OPERATION-TAPE
      (U.IND, V.IND, W.IND)  → INDEX-TAPE
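
    A minimal sketch of how such tapes can be interpreted backwards to
    propagate adjoints; this illustrates the principle only and is not
    ADOL-C's actual internal format (all names are illustrative):

      #include <vector>

      enum Op { MULT };                    // only MULT is handled here

      struct Tapes {
          std::vector<Op> ops;             // OPERATION-TAPE
          std::vector<int> index;          // INDEX-TAPE (3 entries per op)
          std::vector<double> value;       // VALUE-TAPE (recorded results)
      };

      // Reverse sweep: interpret the tapes backwards, incrementing adjoints.
      // v holds all intermediate values addressed by their IND location,
      // vbar the corresponding adjoints.
      void reverse(const Tapes& t, const std::vector<double>& v,
                   std::vector<double>& vbar) {
          for (int i = static_cast<int>(t.ops.size()) - 1; i >= 0; --i) {
              int u = t.index[3*i], w = t.index[3*i + 1], r = t.index[3*i + 2];
              if (t.ops[i] == MULT) {      // recorded v_r = v_u * v_w
                  vbar[u] += vbar[r] * v[w];
                  vbar[w] += vbar[r] * v[u];
                  vbar[r] = 0.0;
              }
          }
      }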

  • Remarks for Tape-based Derivatives

    - Mechanical application of the chain rule based on the tapes
    - Data access on the tapes is strictly sequential
    - Tapes can be reused for function / derivative
      calculation at different points
    - Several tapes for different control flows depending on
      active variables

  • General Remarks for Operator Overloading

    Pros:
    - Only choice for C/C++
    - Great flexibility with respect to mode/order
    - Simple to maintain

    Cons:
    - Compiler optimization may be lost
    - Only object code for derivatives

  • Operator Overloading Tools

    C/C++:
    AD01      Cranfield Uni., GB                 forward + reverse
    ADOL-C    Uni. Paderborn, Germany            forward + reverse
    CppAD     Uni. Washington, USA               forward + reverse
    FADBAD    TU Denmark                         forward + reverse
    TADIFF    TU Denmark                         forward
    ...

    Matlab:
    ADMAT     Cayuga Research Associates, USA    forward + reverse
    ...

  • Conclusions

    - Evaluation of derivatives with working accuracy
    - Forward mode: OPS(F′(x)ẋ) ≤ c · OPS(F), c ∈ [2, 5/2]
    - Reverse mode: OPS(ȳᵀ F′(x)) ≤ c · OPS(F), c ∈ [3, 4]
      ⇒ Gradients are cheap: ~ function costs !!
    - Combination: OPS(ȳᵀ F″(x) ẋ) ≤ c · OPS(F), c ∈ [7, 10]
    - Cost of higher derivatives grows quadratically in the degree

  • Conclusions

    - Evaluation of derivatives with working accuracy
    - Efficient evaluation of derivatives of any order
    - Numerous tools available,
      based on source transformation or operator overloading
    - Nondifferentiability only on a meager set
    - Optimal Jacobian accumulation is NP-hard
    - Full Jacobians/Hessians often not needed, or sparse
    - see www.autodiff.org

  • Automatic differentiation by overloading in C++: ADOL-C version 2.1

    - Reorganization of taping:
      tape-dependent information kept in separate structures
    - Different differentiation contexts
    - Documented external function facility
    - Documented fixed-point iteration facility
    - Documented checkpointing facility based on revolve
    - Documented parallelization of derivative calculation
    - Coupled with ColPack for exploitation of sparsity
    - Available at COIN-OR since May 2009

  • Source Code Modifications

    - Include the needed header files; easiest way: #include "adolc.h"
    - Define the region that has to be differentiated:
        trace_on(tag, keep);    // start of the active section
        ...
        trace_off(file);        // ... and its end
    - Mark the independents and dependents in the active section:
        xa <<= xp;    // mark independents
        ya >>= yp;    // mark dependents
    - Declare all active variables of type adouble
    - Calculate derivative objects after trace_off(file)

  • Evaluation Procedure (Lighthouse)

    y₁ = ν·tan(ωt) / (γ − tan(ωt))   and   y₂ = γ·ν·tan(ωt) / (γ − tan(ωt))

    v₋₃ = x₁,  v₋₂ = x₂,  v₋₁ = x₃,  v₀ = x₄ = t
    v₁ = v₋₁ · v₀    = φ₁(v₋₁, v₀)
    v₂ = tan(v₁)     = φ₂(v₁)
    v₃ = v₋₂ − v₂    = φ₃(v₋₂, v₂)
    v₄ = v₋₃ · v₂    = φ₄(v₋₃, v₂)
    v₅ = v₄ / v₃     = φ₅(v₄, v₃)
    v₆ = v₅ · v₋₂    = φ₆(v₅, v₋₂)
    y₁ = v₅
    y₂ = v₆

  • Model Generation with ADOL-C

    [Figure: computational graph of the lighthouse example with inputs
    x₁, ..., x₄ and intermediates v₁, ..., v₆]

    Plain evaluation:

      double x1, x2, x3, x4;
      double v1, v2, ...;
      double y1, y2;

      x1 = 3.7; x2 = 0.7;
      x3 = 0.5; x4 = 0.5;

      v1 = x3*x4;
      v2 = tan(v1);
      v3 = x2-v2;
      v4 = x1*v2;
      v5 = v4/v3;
      v6 = v5*x2;

      y1 = v5; y2 = v6;

    Taped evaluation:

      adouble x1, x2, x3, x4;
      adouble v1, v2, ...;
      double y1, y2;

      trace_on(tag);
      x1 <<= 3.7; x2 <<= 0.7;
      x3 <<= 0.5; x4 <<= 0.5;

      v1 = x3*x4;
      v2 = tan(v1);
      v3 = x2-v2;
      v4 = x1*v2;
      v5 = v4/v3;
      v6 = v5*x2;

      v5 >>= y1; v6 >>= y2;
      trace_off();

  • Easy-to-use routines

    int gradient(tag, n, x[n], g[n]):        tag = tape number
                                             n = # indeps
                                             x[n] = values of indeps
                                             g[n] = ∇f(x)

    int jacobian(tag, m, n, x[n], J[m][n]):  tag = tape number
                                             m = # deps
                                             n = # indeps
                                             x[n] = values of indeps
                                             J[m][n] = F′(x)

    int hessian(tag, n, x[n], H[n][n]):      tag = tape number
                                             n = # indeps
                                             x[n] = values of indeps
                                             H[n][n] = ∇²f(x)
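
    Putting the pieces together, taping the lighthouse example and calling a
    driver might look as follows (a minimal sketch assuming ADOL-C 2.x, with
    the values from the model-generation slide):

      #include <adolc/adolc.h>

      int main() {
          const short tag = 1;
          const int n = 4, m = 2;
          double xp[4] = {3.7, 0.7, 0.5, 0.5};
          double yp[2];

          trace_on(tag);                  // start of the active section
          adouble x1, x2, x3, x4;
          x1 <<= xp[0]; x2 <<= xp[1];     // mark independents
          x3 <<= xp[2]; x4 <<= xp[3];
          adouble v1 = x3 * x4;
          adouble v2 = tan(v1);
          adouble v3 = x2 - v2;
          adouble v4 = x1 * v2;
          adouble v5 = v4 / v3;
          adouble v6 = v5 * x2;
          v5 >>= yp[0]; v6 >>= yp[1];     // mark dependents
          trace_off();

          double** J = myalloc2(m, n);    // ADOL-C helper for a dense array
          jacobian(tag, m, n, xp, J);     // J[i][j] = dF_i/dx_j from the tape
          myfree2(J);
          return 0;
      }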

  • Further drivers for optimization:

    - vec_jac(tag, m, n, repeat, x[n], u[m], z[n])
      computes z = uᵀ F′(x)
    - jac_vec(tag, m, n, x[n], v[n], z[n])
      computes z = F′(x) v
    - hess_vec(tag, n, x[n], v[n], z[n])
      computes z = ∇²f(x) v
    - lagra_hess_vec(tag, n, m, x[n], v[n], u[m], h[n])
      computes h = uᵀ F″(x) v; an extension to uᵀ F″(x) V is available
    - jac_solv(tag, n, x[n], b[n], sparse, mode)
      computes w with F′(x) w = b and stores the result in b

  • Drivers for sparse derivative matrices

    Jacobian:
      sparse_jac(tag, m, n, repeat, x, &nnz, &row_ind, &col_ind,
                 &values, options);
      jac_pat(tag, m, n, x, JP, options);
      generate_seed_jac(m, n, JP, &seed, &p, option);

    Hessian:
      sparse_hess(tag, n, repeat, x, &nnz, &row_ind, &col_ind,
                  &values, options);
      hess_pat(tag, n, x, HP, options);
      generate_seed_hess(n, HP, &seed, &p, option);

    → graph coloring routines by ColPack (Gebremedhin, Pothen)

  • Advanced taping techniques

    [Diagrams: a computation consisting of Part A, Part B, and Part C can be
    recorded on one basic ADOL-C tape, on nested ADOL-C tapes, or with parts
    provided as externally differentiated functions]

  • Differentiation of fixed-point iterations

    [Diagrams: a basic ADOL-C tape records the phases initialization,
    iteration φ(·), and evaluation; the iteration is recorded on a nested
    subtape via an ADOL-C function fp_iteration(), and during the reverse
    evaluation this subtape is evaluated repeatedly]

  • Summary and Outlook

    - AD for C and C++: derivatives of any order
    - Easy-to-use routines for standard derivative objects,
      optimization, ODEs, tensors, ...
    - Randomly accessed memory of the original magnitude
    - Sequential access to data on the tape, i.e., array or file
    - Wide range of applications:
      aerodynamics, chemical engineering, medicine, network optimization, ...

    Several ideas for improvement:
    - runtime activity, parallelization, front ends, ...

    Contact: http://www.coin-or.org/projects/ADOL-C.xml