an introduction to the homogenization method in optimal...

71
Optimal Shape Design, Lecture Notes in Maths. 1740, A. Cellina & A. Ornelas eds, 47–156, Springer, 2000. An Introduction to the Homogenization Method in Optimal Design Luc TARTAR Department of Mathematical Sciences Carnegie Mellon University Pittsburgh, PA 15213-3890, U.S.A. 0. Introduction This course describes Optimization/Optimal Design problems which lead to Homogenization questions, together with the method to treat them that Fran¸ cois MURAT and I had developed, mostly in the 70s. I adopt a chronological point of view, which I find best for describing how some new techniques (which are often misattributed nowadays) were introduced for overcoming some difficulties that we had encountered, or for generalizing a particular result that we had found useful in our search. There are many ways to practice research in Mathematics, and I hope that the description of the way we thought in front of new questions could help some to experience for themselves the extraordinary feeling of discovery; it must be said, however, that one rarely finds oneself in exactly the same situations that others had experienced before. What one finds may also have been known to others, a fact that Ennio DE GIORGI had resumed in “Chi cerca trova, chi ricerca ritrova” 1 . In a few occasions it had probably been useful that I did not know the approach that others had followed before, as the one that I created appeared to be different and to offer new possibilities: the knowledge of an old method often makes it difficult to invent a new one, and this is an important obstacle in research. Although the difference between exploration and exploitation is quite obvious for any oil engineer, many researchers in Mathematics spend their life exploiting methods invented by others, mimicking what they have read or heard. There is nothing really wrong about that behaviour for those who understand their role in the scientific community as that of soldiers of an army working for the benefit of the whole community. However, it is a potentially destructive behaviour that some pretend to have invented the ideas that they have read or that they have been taught directly by more creative people, even if they simply misattribute them to some of their friends instead, because too often they have not completely understood the whole potential of the ideas that they use, and they may transmit twisted informations with the main effect of misleading part of the army, and playing therefore the same role as traitors. Having been raised in a religious background, I do not attribute my mathematical ability to my own efforts and therefore I consider it a duty to put the gift that I was given to the service of others. I have noticed that religious teachers from the past seem to have liked using parables, stories so simple that their students asked about their meaning, and oral tradition then transmitted us the initial story, the questions of the students, and one “example” given by the master; because it was so easy to remember, the teaching had therefore been transmitted intact by people who did not even understand it, unaware of the innumerable applications that the little story contained. Some mathematicians behave in this way too, writing general theorems but giving only one or two examples, or they may simply teach a general theory by explaining one example in detail, thinking that every trained mathematician will automatically see what the general idea is. I have done that “mistake” often, and after inventing a method applicable to all variational problems, I had been quite upset when I had found written that I had only solved the case of a diffusion equation, wondering how someone writing such a stupid statement could consider himself a mathematician, and why his coauthors had not jumped out of their seat at such an idiotic remark that could well be attributed to them too. Jealousy, the observation that mathematical ability is too sparsely distributed, or the saying “au pays des aveugles les borgnes sont rois”, 2 come to my mind for explaining the behaviour of those who try 1 From the sentence “Ask and it will be given to you; seek and you will find; knock and the door will be opened to you.” in the gospels (Matthew 7:7, Luke 11:9), the middle part has given rise to the French saying “Qui cherche trouve”, equivalent to the Italian “Chi cerca trova”. The play on the prefix, “re” in French, “ri” in Italian, does not work as well in English (one could replace seek and find by search and discover in order to use research and rediscover). 2 In the land of the blind, a one-eyed man is king. 1

Upload: hoangdat

Post on 09-Aug-2018

215 views

Category:

Documents


0 download

TRANSCRIPT

Optimal Shape Design, Lecture Notes in Maths. 1740, A. Cellina & A. Ornelas eds, 47–156, Springer, 2000.

An Introduction to the Homogenization Method in Optimal Design

Luc TARTAR

Department of Mathematical SciencesCarnegie Mellon University

Pittsburgh, PA 15213-3890, U.S.A.

0. IntroductionThis course describes Optimization/Optimal Design problems which lead to Homogenization questions,

together with the method to treat them that Francois MURAT and I had developed, mostly in the 70s. Iadopt a chronological point of view, which I find best for describing how some new techniques (which areoften misattributed nowadays) were introduced for overcoming some difficulties that we had encountered, orfor generalizing a particular result that we had found useful in our search.

There are many ways to practice research in Mathematics, and I hope that the description of the waywe thought in front of new questions could help some to experience for themselves the extraordinary feelingof discovery; it must be said, however, that one rarely finds oneself in exactly the same situations that othershad experienced before. What one finds may also have been known to others, a fact that Ennio DE GIORGI

had resumed in “Chi cerca trova, chi ricerca ritrova”1.In a few occasions it had probably been useful that I did not know the approach that others had followed

before, as the one that I created appeared to be different and to offer new possibilities: the knowledge of anold method often makes it difficult to invent a new one, and this is an important obstacle in research.

Although the difference between exploration and exploitation is quite obvious for any oil engineer, manyresearchers in Mathematics spend their life exploiting methods invented by others, mimicking what they haveread or heard. There is nothing really wrong about that behaviour for those who understand their role in thescientific community as that of soldiers of an army working for the benefit of the whole community. However,it is a potentially destructive behaviour that some pretend to have invented the ideas that they have read orthat they have been taught directly by more creative people, even if they simply misattribute them to someof their friends instead, because too often they have not completely understood the whole potential of theideas that they use, and they may transmit twisted informations with the main effect of misleading part ofthe army, and playing therefore the same role as traitors.

Having been raised in a religious background, I do not attribute my mathematical ability to my ownefforts and therefore I consider it a duty to put the gift that I was given to the service of others. I havenoticed that religious teachers from the past seem to have liked using parables, stories so simple that theirstudents asked about their meaning, and oral tradition then transmitted us the initial story, the questions ofthe students, and one “example” given by the master; because it was so easy to remember, the teaching hadtherefore been transmitted intact by people who did not even understand it, unaware of the innumerableapplications that the little story contained. Some mathematicians behave in this way too, writing generaltheorems but giving only one or two examples, or they may simply teach a general theory by explaining oneexample in detail, thinking that every trained mathematician will automatically see what the general ideais. I have done that “mistake” often, and after inventing a method applicable to all variational problems,I had been quite upset when I had found written that I had only solved the case of a diffusion equation,wondering how someone writing such a stupid statement could consider himself a mathematician, and whyhis coauthors had not jumped out of their seat at such an idiotic remark that could well be attributed tothem too. Jealousy, the observation that mathematical ability is too sparsely distributed, or the saying “aupays des aveugles les borgnes sont rois”,2 come to my mind for explaining the behaviour of those who try

1 From the sentence “Ask and it will be given to you; seek and you will find; knock and the door will beopened to you.” in the gospels (Matthew 7:7, Luke 11:9), the middle part has given rise to the French saying“Qui cherche trouve”, equivalent to the Italian “Chi cerca trova”. The play on the prefix, “re” in French,“ri” in Italian, does not work as well in English (one could replace seek and find by search and discover inorder to use research and rediscover).

2 In the land of the blind, a one-eyed man is king.

1

to mislead students out of the right path, but it may well be the result of their complete lack of moraleducation.

For some mysterious reason, Francois MURAT and I were the first who put together the various pieces ofthe theory that I will describe, and I will explain what other people had done according to my information. Ihad taught similar lectures in 1983, before some people embarked in a systematic campaign of misattributionof ideas and results: everyone with a minimal mathematical ability may quickly identify the names of thosewho have indulged in intentional misleading, but this does not mean that they have not themselves had somegenuine idea, in which case I will mention their name for that.

1. A counter-example of Francois MURAT

In 1970, Francois MURAT worked on an academic problem of optimization that had been proposed byJacques-Louis LIONS [Li1], and he found that it had no solution [Mu1]. His result was unexpected,3 andas we were sharing an office in Jussieu, we started a long and fruitful collaboration, first discussing somegeneralizations of his first idea [Mu2], and then embarking on the exploration that led us to (re)discover4 thegeneral theory of “Homogenization”5 [Ta1], [Ta2], [Ta3], [Mu3]. The initial problem that Francois MURAT

considered was to minimize the cost function

J(a) =∫ 1

0

|y(x)− 1− x2|2 dx, (1.1)

when the state y solves the equation of state

− d

dx

(ady

dx

)+ a y = 0 in (0, 1), y(0) = 1, y(1) = 2; y ∈ H1(0, 1), (1.2)

and the control a lies in the following admissible control set

A = a | a ∈ L∞(0, 1), α ≤ a ≤ β a.e. in (0, 1), (1.3)

and as he tried to apply the direct method of the Calculus of Variations, he noticed that for a sequencean ∈ A such that

an a+ and1an

1a−

in L∞(0, 1) weak ?, (1.4)

then the corresponding sequence of solutions yn of (1.2) converges in H1(0, 1) weak to the solution y∞ of

− d

dx

(a−

dy∞dx

)+ a+ y∞ = 0 in (0, 1), y∞(0) = 1, y∞(1) = 2; y∞ ∈ H1(0, 1), (1.5)

withandyndx→ a−

dy∞dx

in L2(0, 1) strong, (1.6)

3 We had not heard about the work that Laurence C. YOUNG had done in the 40s [Yo]. I had first methim in 1971 in Madison, but I only learned many years after that he was the inventor of the objects whichI was still using under the name of parametrized measures in my Heriot–Watt lectures in 1978 [Ta8]; it wasRonald DIPERNA who then insisted that they should be called Young measures.

4 We were not aware of the earlier work on G-convergence of Sergio SPAGNOLO [Sp1], [Sp2], and his workwith Antonio MARINO [Ma&Sp], or the work of Tullio ZOLEZZI [Zo]. It was only after having developed ourown approach, which Francois MURAT named H-convergence in [Mu3], that we became aware of these worksand that of Ennio DE GIORGI and Sergio SPAGNOLO [DG&Sp].

5 As in the title of my Peccot lectures in 1977, I have adopted the term Homogenization, first introducedby Ivo BABUSKA [Ba], for describing our general approach of H-convergence; of course, no constraints ofperiodicity are considered. Some authors automatically associate the term with periodic structures, probablybecause they have not understood the general framework introduced by Sergio SPAGNOLO or by FrancoisMURAT and me!

2

and

J(an)→ J(a−, a+) =∫ 1

0

|y∞(x)− 1− x2|2 dx. (1.7)

Indeed, yn is bounded in H1(0, 1) and vn = an dyn

dx is bounded in L2(0, 1), but as its derivative is an yn whichis bounded in L2(0, 1), vn is actually bounded in H1(0, 1). Therefore one can extract a subsequence suchthat ym converges in H1(0, 1) weak and in L2(0, 1) strong to y∞, and vm converges in L2(0, 1) strong to v∞.Then am ym and dym

dx = 1am vm converge in L2(0, 1) weak, respectively to a+y∞ and to dy∞

dx = 1a−v∞, showing

(1.6), and therefore (1.5); the fact that y∞ is uniquely determined by (1.5) shows that the extraction of asubsequence is not necessary.

For α =√

2−1√2, β =

√2+1√

2, Francois MURAT used the particular sequence

an(x) =

1−√

12 −

x2

6 when x ∈(

2k2n ,

2k+12n

), k = 0, . . . , n− 1

1 +√

12 −

x2

6 when x ∈(

2k+12n , 2k+2

2n

), k = 0, . . . , n− 1,

(1.8)

corresponding to

a− =12

+x2

6; a+ = 1; y∞ = 1 + x2 in (0, 1), (1.9)

showing that infa∈A J(a) = 0. He checked then that it was not possible to have y = 1 + x2 in (1.2) forsome a ∈ A, by considering (1.2) as a differential equation for a, and noticing that all nonzero solutions areunbounded, as they are a = C√

xexp(x2

4

).

We were naturally led to characterize all the possible pairs (a−, a+) which could appear in (1.4) and wefound

α ≤ a−(x) ≤ a+(x) ≤ a−(x)(α+ β)− αβa−(x)

≤ β a.e. x ∈ (0, 1), (1.10)

or equivalently1

a+(x)≤ 1a−(x)

≤ α+ β − a+(x)αβ

a.e. x ∈ (0, 1). (1.11)

Our proof of (1.11) easily extended to the following more general situation,6 and the characterization (1.11)corresponds to using the following Lemma, with a and 1

a being the components of U , K being the piece ofhyperbola U1 U2 = 1 with α ≤ U1 ≤ β.

Lemma 1: Let U (n) be a sequence of measurable functions from an open set Ω ⊂ RN into Rp satisfyingU (n) U (∞) in L∞(Ω; Rp) weak ? and U (n)(x) ∈ K a.e. x ∈ Ω. For a bounded set K,7 the characterizationof all the possible limits U (∞) is U (∞)(x) ∈ conv(K), the closed convex hull of K, a.e. x ∈ Ω.Proof: The closed convex hull of K is the intersection of all the closed half spaces which contain K, and aclosed half space H+ has an equation λ | λ ∈ Rp, L(λ) ≥ 0 for some nonconstant affine function L, and ifH+ contains K one has L(Un) ≥ 0 a.e. x ∈ Ω and therefore L(U∞) ≥ 0 a.e. x ∈ Ω, i.e. U∞(x) ∈ H+ a.e.

6 As we had learned that weak convergence is not adapted to nonlinear problems, we were surprised tohave found such a simple characterization, and I called our common thesis advisor, Jacques-Louis LIONS,to ask him if this was not known already. He suggested that I ask Ivar EKELAND, who told me that it hadbeen implicitely used in some work of CASTAING and was related to a classical result of LYAPUNOV, validfor a set endowed with a nonnegative measure without atoms; indeed our proof extended easily to such ageneral case. I only learned in 1975 from Zvi ARTSTEIN about his simple proof of LYAPUNOV’s result [Ar].

7 If K is unbounded, one denotes KM the elements of K of norm ≤ M , and the characterization is thatthere exists M with U∞(x) ∈ conv(KM ) a.e. x ∈ Ω. The functions U (∞) such that U (∞)(x) ∈ conv(K)correspond to the weak ? closure in L∞(Ω; Rp) of the functions taking (a.e.) their values in K, and one caneasily find cases where these two sets of functions are different, showing that the weak ? topology is notmetrizable on these unbounded sets.

3

x ∈ Ω; the conclusion follows, if one is careful to write the closed convex hull as a countable intersection ofclosed half spaces containing K.

Let V ∈ L∞(Ω; Rp) be such that V (x) ∈ clconv(K) a.e. x ∈ Ω. For each m, one can cut Rp into smallcubes of size 1

m and choose a point of conv(K), the convex hull of K, in each cube intersecting clconv(K)and that helps creating a function W (m) ∈ L∞(Ω; Rp) such that |V −W (m)| ≤ 1

m a.e. x ∈ Ω and W (m)

takes only a finite number of values in conv(K). On a measurable subset ω of Ω where W (m) is constant, wewant to construct a sequence of functions converging in L∞(ω; Rp) weak ? to W (m) and taking their valuesin K, and putting these functions together will create a sequence converging in L∞(Ω; Rp) weak ? to W (m),and then, as the weak ? topology of L∞(Ω; Rp) is metrizable on bounded sets, this will ensure that one canapproach V in that topology.

Let W (m) = λ ∈ conv(K) on ω, so that λ =∑i θi ki, with ki ∈ K and the sum is finite (with all θi ≥ 0,∑

i θi = 1). We cut now ω into measurable pieces of diameter at most 1

n , then partition each of such pieces Einto measurable subsets Ei with meas(Ei) = θimeas(E) and define the function Zn to be equal to ki on eachsuch Ei. The claim is then that as n tends to∞, the sequence Zn converges in L∞(ω; Rp) weak ? to λ; as Zn

only takes a finite number of values in K it is bounded, and it is enough to check that for every continuousfunction ϕ with compact support

∫ωϕZn dx →

∫ωϕλdx. As ϕ is uniformly continuous |ϕ(x) − ϕ(y)| ≤ ε

when |x − y| ≤ 1n , so if e ∈ E one has |ϕ(e)

∫EZn dx −

∫EϕZn dx| ≤ εM meas(E) and |ϕ(e)

∫Eλ dx −∫

Eϕλdx| ≤ εM meas(E) where M is a bound of the norm of Rp on K, but as

∫EZn dx =

∑i

∫Eiki dx =∑

i θimeas(E)ki = meas(E)λ =

∫Eλ dx, one deduces |

∫ωϕZn dx−

∫ωϕλdx| ≤ 2εM meas(ω).

We left aside the construction of the Ei from E, which is easy for sets in RN . L being a nonzero linearfunction on RN , the measure of E

⋂x | x ∈ RN , L(x) ≥ t is a continuous function of t which grows from 0

to meas(E) and one obtains the desired partition of E by cutting E by suitable hyperplanes L−1(ti). Theconstruction can be generalized when Ω is any set equipped with a measure without atoms, as stated by aclassical result of LYAPUNOV (in 1975, I learned from Zvi ARTSTEIN a method of his giving a simple proofof that result as well as bang-bang results in control theory [Ar]).

2. The independent discoveries of othersJean-Louis ARMAND had studied at Ecole Polytechnique in Paris a year ahead of me, but we only

met almost fifteen years after graduating, not so much because we were then part time lecturers at EcolePolytechnique in Palaiseau (he in Mechanics, I in Mathematics), but because he had learned about my workby going to visit Konstantin LUR’IE, in Leningrad. Jean-Louis ARMAND had been computing some OptimalDesign problems and he had been puzzled by the fact that in the meetings that he attended, various engineerswere showing results which were quite different, although they were supposed to solve the same problem;however, he seemed to have been alone in thinking that this was the sign of a serious theoretical difficulty.He had tried to find explanations in the litterature, and he had discovered an article by K. LUR’IE whichseemed relevant; he had then traveled to visit him in Leningrad and, having been told about my work there,he had contacted me after his return to Paris and he had mentioned to me what K. LUR’IE had done andwhat he had told him.

K. LUR’IE had extended some ideas of PONTRYAGUIN to partial differential equations, and he had beenable to obtain better necessary conditions of optimality than those given by a classical method.8 However,he had been quite puzzled when he had discovered a situation with no function satisfying his necessaryconditions [Lu]. I do not know if K. LUR’IE had already obtained the right intuition about what was goingon before finding my article [Ta2], but he had mentioned to Jean-Louis ARMAND that it was in [Ta2] that hehad found the missing ideas that he needed. I had not given the detail of my work with Francois MURAT in[Ta2], but I had mentioned the work on G-convergence of Sergio SPAGNOLO [Sp1], [Sp2], as well as the workof Henri SANCHEZ-PALENCIA [S-P1], [S-P2] , which had helped us understand that our work had something

8 It is an idea going back to HADAMARD to push an interface along its normal for computing derivatives offunctionals, in order to obtain necessary conditions of optimality for example. Although many applicationsof that idea may be formal, Francois MURAT and Jacques SIMON have spent some time giving the method arigourous framework [Mu&Si]. However, the classical approach only gives conditions that must be satisfiedalong the interface, while K. LUR’IE’s approach as well as ours give conditions which must be satisfiedeverywhere.

4

to do with the question of effective properties of mixtures; K. LUR’IE had then coined the term G-closure fordescribing the set of all possible effective tensors of admissible mixtures. I will describe later the intuitiveideas behind the necessary conditions of optimality obtained by K. LUR’IE, but it is important to understandthat the reason why we were able to go further was that we were rediscovering and extending the ideas ofLaurence C. YOUNG, while K. LUR’IE was following the ideas of PONTRYAGUIN and he could hardly haverealized that he had taken the wrong track. As pointed out by Laurence C. YOUNG [Yo], if one has obtainedsome necessary conditions of optimality for an optimization problem, and one finds that only one functionsatisfies them, one cannot even assert that it is the solution of the problem, unless of course one has alreadyproved that there exists at least one solution of the problem. Perron’s paradox seems too naive an example,9

but the point of view of PONTRYAGUIN becomes indeed useless if the problem at hand has no solution. Onthe contrary, the point of view of Laurence C. YOUNG is adapted to that kind of situation, and it createsa relaxed problem which answers two questions: first it explains how the minimizing sequences may havetheir limit outside the initial space in the case of nonexistence of solutions, then it does give the necessaryconditions obtained by following the point of view of PONTRYAGUIN, as they are just part of the necessaryconditions of optimality for the relaxed problem. I will describe these questions on an elementary model.

Once we had become aware of the work of Sergio SPAGNOLO, we did find there some ideas which wehad not thought about, and we checked that we could handle them with our methods, but for the questionof Optimal Design which was our initial motivation we needed more precise results. For example, AntonioMARINO and Sergio SPAGNOLO had shown that in order to obtain all the materials with an effective tensorequal to a general symmetric tensor having its eigenvalues between α and β, it was sufficient to mix isotropicmaterials with tensor γ I with γ ∈ [α′, β′] for some α′ > 0 and β′ < ∞ [Ma&Sp], but in order to computenecessary conditions of optimality for questions of Optimal Design we needed a more precise characterization,and we were able to obtain one in dimension 2. As we will see the question of optimal bounds has two sides,one where one must prove inequalities that must be satisfied by every mixture using given proportions, andwe had found a method for doing that, and one where one must build particular mixtures and compute theireffective tensors, and we used repeated layering for that, as Antonio MARINO and Sergio SPAGNOLO haddone, but they had not addressed the first question.

I am not sure if we had found reference to the work of Sergio SPAGNOLO before reading some work ofTullio ZOLEZZI, and one of his articles had puzzled us for a while, as we thought that one of his theoremscontradicted some of ours [Zo]. The puzzling theorem stated that if a sequence an converges weakly to a+

in L∞(Ω) then the corresponding sequence of solutions un converges weakly to the solution associated toa+. Francois MURAT thought that some nuance in Italian might have tricked us in mistranslating whatwas meant, but as we were pondering if “debolmente” could mean anything else than weakly, it suddenlyappeared that our mistake had been to read correctly weakly and to interpret it incorrectly as weakly ?, asindeed it was the first time that we had seen any use of the weak topology of L∞(Ω) in a concrete situation;we understood then why there was a reference to an article of Alexandre GROTHENDIECK, who had shownthat convergence in L∞(Ω) weak implies strong convergence in Lploc(Ω) for every finite p.

In the early 70s, Jean CEA and his team in Nice had been performing some numerical computations forsimilar problems of Optimal Design than the ones which I had been studying with Francois MURAT. We hadbeen aware of some work by Denise CHENAIS [Ch], which means that if one imposes some kind of regularitycondition on an interface between two materials then the set of corresponding characteristic functions belongsto a compact subset of Lp(Ω) for p < ∞, and therefore a classical optimal solution exists; our work hadsuggested that if one does not impose such a condition there may not exist any classical solution, in whichcase one has to use the generalized solutions that we had been studying, corresponding to mixtures.

In 1974, after my talk containing the necessary conditions of optimality described in [Ta2], I had notbeen able to convince Jean CEA that the apparition of mixtures was a real possibility that one had toconsider, and he might have been mistaken because of a result which he had obtained a few years beforewith K. MALANOWSKI [Ce&Ma], corresponding to (4.1)/(4.3) with g(x, u, a) = f(x)u, for which a specialsimplification occurs and a classical solution exists. For a given triangulation, one of the numerical methodsthat Jean CEA had tested consists in choosing each triangle to be entirely made of only one material, and I

9 The paradox quoted in [Yo] is as follows: let n be the largest integer; a necessary condition of optimalityis n ≥ n2, which leaves only two candidates, 0 or 1; therefore the largest integer must be 1!

5

could not convince him that if one refines triangulations enough one may start to see oscillations and thatour analysis is important for that reason. Had the computers been more powerful in those days, he mightindeed have discovered numerical oscillations in refining his triangulations, but the cost would have beenprohibitive at the time and only coarse triangulations were used. The classical way for obtaining necessaryconditions of optimality by pushing the interface in the direction of its normal, an idea of J. HADAMARD

that is often only used at a formal level but which Francois MURAT and Jacques SIMON have put into arigourous framework [Mu&Si], gives conditions that must be satisfied on the interface, while I was obtainingnecessary conditions which are valid everywhere; in his method J. CEA could switch any triangle from onematerial to another, and therefore there was no real interface in his approach and this led him to some kindof discretized necessary condition valid everywhere, and he might have been mistaken by what he may haveseen as a similarity between our results.

After some discussions with Guy CHAVENT, who was studying the related problem of identifying thelocal permeability of an oil field from measurements at various points, I had proposed a numerical approachfor solving numerically the type of optimization problem that we had been studying, but the numericalmethod that I had proposed, and that one of his students had implemented, did not work well at all. As weknew that the optimal mixture that we were looking for could be obtained locally as a layered medium, I hadchosen to parametrize the various possible mixtures with a proportion θ ∈ [0, 1] and with an angle describingthe orientation of layers (as I was considering a 2-dimensional problem), but that method appeared to bequite unstable because when θ is 0 or 1, i.e. the material is isotropic, the orientation of the layers is notwell defined. I did not try another numerical method, but I had learned that even when the solution ofan optimization problem is on the boundary of a set, it might not be a good idea to move only along theboundary of this set in order to find the solution, and a better approach could be to cut through the set inorder to arrive more quickly at the interesting points on the boundary.

In June 1980 in New York, Robert V. KOHN had told me that he had been to a meeting organized byJean CEA and E.J. HAUG, and as I knew that the approach of my work with Francois MURAT had probablynot been mentioned at this meeting, I taught him our ideas about how Homogenization problems appear inOptimal Design problems.

3. An elementary model problemThis model problem was mentioned to me by Ivar EKELAND, and I believe that his answer involved

huge abstract compactified spaces, while my solution is entirely based on our Lemma 1 (which correspondsto a small compactification).

We want to minimize the cost function

J(u) =∫ T

0

(|y|2 − |u|2) dt, (3.1)

where the control u belongs to

Uad = u | u ∈ L∞(0, T ), −1 ≤ u(t) ≤ 1, a.e. t ∈ (0, T ), (3.2)

and the state y is defined by the equation of state

dy

dt= u a.e. on (0, T ); y(0) = 0. (3.3)

We can consider here classical necessary conditions of optimality because Uad is convex; the map u 7→ y isaffine continuous and therefore the map u 7→ J is quadratic continuous, and therefore Frechet differentiable,but the following arguments are valid in cases where only Gateaux differentiability holds. Let u∗ ∈ Uad,corresponding to a state y∗, and let δu ∈ L∞(0, T ) be an admissible direction at u, i.e. u = u∗ + ε δu ∈ Uadfor ε > 0 small. Then y = y∗ + ε δy and J(u) = J(u∗) + ε δJ + o(ε), where

d(δy)dt

= δu; δy(0) = 0, (3.4)

6

and

δJ = 2∫ T

0

(y∗ δy − u∗ δu) dt (3.5)

and the classical necessary conditions of optimality consist in writing that δJ ≥ 0 for all admissible δu. Inorder to eliminate δy so that δJ is expressed only in terms of δu, one introduces the adjoint state10 p∗ by

−dp∗dt

= y∗; p∗(T ) = 0, (3.6)

and a simple integration by parts gives∫ T

0

y∗ δy dt = −∫ T

0

dp∗dt

δy dt =∫ T

0

p∗dδy

dtdt =

∫ T

0

p∗ δu dt, (3.7)

and therefore

δJ = 2∫ T

0

(p∗ − u∗)δu dt. (3.8)

The admissibility of δu means that δu ≥ 0 where u∗ = −1, δu ≤ 0 where u∗ = +1, and δu arbitrary where−1 < u∗ < 1 (one first works on the set of points where −1 + η ≤ u ≤ 1− η for η > 0 and then one takes theunion of these sets for all η > 0), and therefore one immediately deduces the classical necessary conditionsof optimality

u∗ = −1 implies p∗ − u∗ ≥ 0, i.e. p∗ ≥ −1−1 < u∗ < +1 implies p∗ − u∗ = 0, i.e. p∗ = u∗u∗ = +1 implies p∗ − u∗ ≤ 0, i.e. p∗ ≤ +1,

(3.9)

which can be read as giving u∗ as the following multivalued function of p∗p∗ < −1 implies u∗ = +1−1 ≤ p∗ ≤ +1 implies u∗ ∈ −1, p∗,+1p∗ > +1 implies u∗ = −1.

(3.10)

One can notice that the system of these classical necessary conditions of optimality, i.e. (3.3), (3.6) and(3.9)/(3.10), has at least the solution u∗ = 0 on (0, T ), corresponding to y∗ = p∗ = 0 on (0, T ).

The point of view of PONTRYAGUIN for obtaining necessary conditions of optimality consists in com-paring u∗ to another control w ∈ Uad by noticing that any control which jumps from u∗ to w is admissible.In the language of Functional Analysis it means

u = (1− χ)u∗ + χw ∈ Uad for every characteristic function χ of a measurable subset of (0, T ), (3.11)

and it is then natural to consider a sequence χn of characteristic functions such that

χn θ in L∞(0, T ) weak ?, (3.12)

with 0 ≤ θ ≤ 1 a.e. in (0, T ); Lemma 1 also tells us that any such θ can be obtained in this way, as canbe checked easily directly. One notices then that the corresponding functions yn, which satisfy a uniformLipschitz condition, converge uniformly to y∞ solution of

dy∞dt

= (1− θ)u∗ + θ w in (0, T ); y∞(0) = 0, (3.13)

and, using the fact that F((1 − χ)u∗ + χw

)= (1 − χ)F (u∗) + χF (w) for every function F and every

characteristic function χ, one deduces that J(un) converges to J(θ) given by

J(θ) =∫ T

0

(|y∞|2 − (1− θ)|u∗|2 − θ|w|2) dt. (3.14)

10 I do not know who introduced that notion. It does play an important role in PONTRYAGUIN’s approach,and he may have introduced it.

7

If J attains its minimum on Uad at u∗, one deduces that J(0) = J(u∗) ≤ limn J(un) = J(θ), and thereforeJ attains its minimum at 0. One writes then the classical necessary conditions of optimality for J , noticingthat admissibility for δθ means δθ ≥ 0, that δy solves

dδy

dt= (w − u∗)δθ; δy(0) = 0, (3.15)

and, as θ = 0 corresponds to y∞ = y∗, that

δJ =∫ T

0

(2y∗ δy + (|u∗|2 − |w|2)δθ) dt. (3.16)

Using the same p∗ as defined in (3.6), the integration by parts gives a different result because the equationfor δy is different∫ T

0

2y∗ δy dt = −2∫ T

0

dp∗dt

δy dt =∫ T

0

2p∗dδy

dtdt =

∫ T

0

2p∗(w − u∗)δθ dt, (3.17)

and therefore

δJ =∫ T

0

(2p∗(w − u∗) + (|u∗|2 − |w|2)

)δθ dt, (3.18)

and the necessary conditions of optimality for J become

2p∗(w − u∗) + (|u∗|2 − |w|2) ≥ 0 a.e. on (0, T ). (3.19)

It is only now that one lets w vary in Uad and, taking advantage of the fact that p∗ is independent of thechoice of w, (3.19) means

2p∗ u∗ − |u∗|2 = inf−1≤w≤1

(2p∗ w − |w|2) a.e. on (0, T )

= −2|p∗| − 1 a.e. on (0, T ),(3.20)

and therefore u∗ = −1 implies p∗ ≥ 0−1 < u∗ < +1 does not occuru∗ = +1 implies p∗ ≤ 0,

(3.21)

or p∗ < 0 implies u∗ = +1p∗ = 0 implies u∗ = ±1p∗ > 0 implies u∗ = −1,

(3.22)

which are obviously more restrictive than (3.9)/(3.10).

The choice of the model problem comes from the simple direct observation that it has no solution, asI will show later. This fact by itself does not tell much about the existence of solutions for the system ofnecessary conditions of optimality, but the analysis of the relaxed problem that I will also introduce laterwill have as a consequence that no such solution exists.

However, one can see by a direct computation that there is no function u∗ ∈ Uad for which the necessaryconditions of optimality (3.21)/(3.22) hold, with y∗ defined by (3.3) and p∗ defined by (3.6). Indeed

0 =∫ T

0

d(y∗ p∗)dt

dt =∫ T

0

(u∗ p∗ − |y∗|2) dt = −∫ T

0

(|p∗|+ |y∗|2) dt, (3.23)

shows that one must have y∗ = p∗ = 0 a.e. in (0, T ), but y∗ = 0 is incompatible with the condition u∗ = ±1a.e. in (0, T ).

8

The point of view of PONTRYAGUIN usually gives stronger necessary conditions of optimality than theclassical ones.11 If the necessary conditions of optimality (either the classical ones or those of PONTRYAGUIN)have no solution, then the minimization problem cannot have any solution, but there is then no obviousexplanation of what minimizing sequences are doing, for example. Of course, the proof of the necessaryconditions contains a hint about oscillating sequences,12 and it is Laurence C. YOUNG’s point of view tostudy directly such oscillating sequences, in order to create a relaxed problem.13

In order to show directly that our minimization problem has no solution, one first notices that

J(u) > −T for all u ∈ Uad, (3.24)

because |y|2 − |u|2 ≥ −1 a.e. on (0, T ) for every u ∈ Uad implies J(u) ≥ −T , and because one cannot haveJ(u) = −T , which would require both y = 0 and |u| = 1 a.e. on (0, T ), in contradiction with the fact thaty = 0 a.e. on (0, T ) implies u = 0 a.e. on (0, T ). Then one notices that

un 0 in L∞(0, T ) weak ? and |un| = 1 a.e. in (0, T ) imply J(un)→ −T, (3.25)

as un 0 in L∞(0, T ) weak ? implies that yn converges uniformly to 0; an example of such a sequence unbelonging to Uad is defined by un(t) = sign(cos n t) on (0, T ).

One sees also that any minimizing sequence, i.e. any sequence un ∈ Uad such that J(un) → −T =infu∈Uad

J(u), must be such that yn → 0 in L2(0, T ) strong and u2n → 1 in L1(0, T ) strong. Because Uad is

bounded in L∞(0, T ), yn → 0 in L2(0, T ) strong is equivalent to un 0 in L∞(0, T ) weak ?, and because|un| ≤ 1 a.e. in (0, T ), u2

n → 1 in L1(0, T ) strong is equivalent to u2n 1 in L∞(0, T ) weak ?.

The same analysis shows that if a sequence un ∈ Uad converges in L∞(0, T ) weak ? to u, one can deducethat yn converges uniformly to y given by (3.3), but one cannot deduce what the limit of J(un) is. However,if one knows that

un u in L∞(0, T ) weak ?

u2n v in L∞(0, T ) weak ?,

(3.26)

then,

J(un)→ J (u, v) =∫ T

0

(|y|2 − v) dt. (3.27)

Lemma 1 characterizes the pairs (u, v) that one can obtain by (3.26) for a sequence un ∈ Uad, choosing forK the piece of parabola U2 = U2

1 with −1 ≤ U1 ≤ 1, and one can therefore introduce a relaxed problemdefined on

Uad = (u, v) | −1 ≤ u ≤ 1; u2 ≤ v ≤ 1 a.e. in (0, T ), (3.28)

11 If the problem is convex, it gives the same conditions as the classical ones. If the controls are imposedto take values in a discrete set, there is no natural differentiable path from one control to another andtherefore one cannot obtain any classical necessary conditions of optimality. However, even for a convex setof admissible functions, if the equation of state has the form y′ = A(y, u) and the cost function has the formJ(u) =

∫ T0B(y, u) dt, one requires differentiability of A and B in both y and u in order to obtain the classical

necessary conditions of optimality, while one only requires differentiability of A and B in y for obtaining thenecessary conditions of optimality of PONTRYAGUIN.

12 The original proof of Pontryaguin’s principle, which Jacques-Louis LIONS had asked me to read in thelate 60s, contains no Functional Analysis at all, but the idea of switching quickly from u∗ to w is explicitthere. I found the proof shown above much later, probably in the late 70s or early 80s, and it might havebeen used in this way before.

13 Again, I am not really sure about who introduced the term relaxation, but I have probably heard itfirst in the seminar PALLU DE LA BARRIERE at IRIA in the late 60s, when I also heard about parametrizedmeasures, now named Young measures.

9

the state y still being given by (3.3), and the cost function J being given by the formula in (3.27). Theoriginal problem is a subset of the new one as

u ∈ Uad if and only if (u, u2) ∈ UadJ(u) = J (u, u2) for all u ∈ Uad.

(3.29)

By Lemma 1, for every (u, v) ∈ Uad there exists a sequence un with un u and u2n v in L∞(0, T ) weak

?, which imply J(un)→ J (u, v), and using (3.29) one deduces that

infu∈Uad

J(u) = inf(u,v)∈Uad

J (u, v) (3.30)

andu∗ minimizes J on Uad if and only if (u∗, u2

∗) minimizes J on Uad. (3.31)

Let us look for the classical necessary conditions of optimality for (u∗, v∗) ∈ Uad. Actually these conditionswill be sufficient conditions of optimality as Uad is convex and J is a convex function; as J is strictly convexin u, u∗ is known in advance to be unique, but although J is affine in v, one deduces that v∗ = 1 a.e. on(0, T ) from the observation that

J (u, v) ≤ J (u, 1) = K(u)− T with K(u) =∫ T

0

|y|2 dt, (3.32)

with equality if and only if v = 1 a.e. on (0, T ). Obviously K attains its minimum at u∗ = 0, but let us forgetfor a moment that J attains its minimum only at (0, 1), and let us check what the necessary conditions ofoptimality for J are at an arbitrary point (u∗, v∗) ∈ Uad. For an admissible direction (δu, δv), δy is stillgiven by (3.4), but

δJ =∫ T

0

(2y∗ δy − δv) dt, (3.33)

which, using p∗ defined by (3.6) and the integration by parts (3.7) gives

δJ =∫ T

0

(2p∗ δu− δv) dt. (3.34)

As Uad is convex, it is equivalent to restrict attention to the admissible directions of the form (δu, δv) =(w− u∗, w2 − v∗)δη with w ∈ Uad and δη ∈ L∞(0, T ) with δη ≥ 0 a.e. in (0, T ), and therefore the necessaryconditions of optimality can be read as

2p∗(w − u∗)− (w2 − v∗) ≥ 0 a.e. on (0, T ), (3.35)

which in the case v∗ = u∗2 coincide with the PONTRYAGUIN necessary conditions of optimality (3.19).Instead of (3.20), (3.35) implies

2p∗ u∗ − v∗ = −2|p∗| − 1 a.e. on (0, T ), (3.36)

and thereforeu∗ ∈ −sign(p∗); v∗ = 1 a.e. on (0, T ), (3.37)

and the system of necessary conditions (3.3), (3.6), (3.37) gives then u∗ = 0.

Instead of the above relaxed problem, I could have used a set much bigger than Uad by introducingthe set of Young measures, which describe the possible weak ? limits of sequences F (un) for all continuousfunctions F . It would have appeared then that only the limits in L∞(0, T ) weak ? of un and u2

n wereimportant, i.e. the only useful functions F are the identity id, together with id2. Therefore starting with

10

a relaxed problem which is too big for the problem at hand, one has not lost information but one carriessome unnecessary information; one can reduce the size of the relaxed problem by getting rid of a part ofthat unnecessary information.

The preceding analysis has identified a topology which is adapted to the initial problem, namely thatdefined by (3.26). As the weak ? topology of L∞(0, T ) is metrizable on bounded sets, one can define it usinga distance d0 for the set Uad, and (3.26) corresponds to using the new topology associated to the distanced1 defined by d1(f, g) = d0(f, g) + d0(f2, g2). The set Uad is not complete for the metric d1, but Lemma 1describes the completion of Uad which is Uad, and it also shows that Uad is compact for the metric d1. Thefunction J is uniformly continuous for d1, and therefore it extends in a unique way to the completion, andthis extension is the function J defined in (3.27).

The preceding construction also fulfills the following requirements for a relaxed problem.An initial problem is to minimize a function F on a set X, 14 a relaxed problem of it is a topological

space X, a mapping j from X into X and a lower semicontinuous function F defined on X satisfying thefollowing properties

i) F(j(x)

)= F (x) for all x in X,

ii) j(X) is dense in X,iii) For every y ∈ X there exists a sequence xn in X such that j(xn) converges in X to y, and F (y) =

limn F (xn) in the case where the topology of X is metrizable. If the topology of X is not metrizable, forevery ε > 0 and every neighbourhood V of y in X there exists x ∈ X with j(x) ∈ V and |F (y)− F (x)| ≤ ε.

By i) the relaxed problem contains a copy of the initial one; if ii) was not satisfied one would restrictoneself to the closure of j(X); if iii) was not satisfied, one would replace F by a larger function by taking thelower semi-continuous envelope of the function equal to F j−1 on j(X) and +∞ on X \ j(X), eventuallyremoving the point where this envelope is +∞ if one wants to work with functions which are finite everywhere.

I could have used a smaller set than Uad by working on Uad but minimizing the function K(u)−T , andthis would be consistent with the observation (3.32), and although one can still compute infu∈Uad

J(u) fromthis new problem, it cannot help discover what is the adequate topology for the initial problem, as definedby (3.26). This new problem is not a relaxed problem of the initial one, as i) is not satisfied, but the functionu 7→ K(u)−T is the Γ-limit of J , i.e. the lower semi-continuous envelope of J , for a particular topology, theL∞(0, T ) weak ? topology. Indeed, un u in L∞(0, T ) weak ? implies lim infn J(un) ≥ K(u) − T and forevery u ∈ Uad there exists a sequence un ∈ Uad with un u and u2

n 1 in L∞(0, T ) weak ?, and thereforeJ(un)→ J (u, 1) = K(u)− T .

This last point suggests to be careful with the use of Γ-limits, which are not always the right objectsto characterize if one has not choosen the right topology, which might not belong to the list of usualtopologies that one is accustomed to use. Another reason to be careful lies in the difference between G-convergence and H-convergence,15 the latter being more general and adapted to most of the situationsoccuring in Continuum Mechanics/Physics. As many problems of Optimal Design come from engineering,and often involve Elasticity, it is worth mentioning not only the inadequacy of linearized Elasticity, butthe inadequacy of the Γ-convergence approach, which is not the same as Homogenization, to questions ofElasticity. Although an intensive propaganda has made many mathematicians believe that Nature minimizesEnergy, it is obviously not so, and one must remember that “conservation of Energy” is the First Principleof Thermodynamics, which no one doubts (of course, one has to include all forms of Energy, including Heat,but in cases where some energy can be stored and released later, one might have to be careful in writing

14 If X has already a topology, one may forget about it, as j is usually not continuous from X into X.15 When I first heard Ennio DE GIORGI talk about Γ-convergence at the seminar of Jacques-Louis LIONS

at College de France around 1977, I understood it as a natural generalization of his earlier work on G-convergence with Sergio SPAGNOLO [DG&Sp], but I had been impressed by the application that he hadmentioned that energy localized on a surface could appear as the Γ-limit of a three-dimensional problem.The natural association which immediately came to my mind was that surface tension in liquids should bedetermined from three-dimensional laws, and that one should extend the idea of Γ-convergence to evolutionproblems in order to study that question.

11

the balance of Energy). Unfortunately Thermodynamics should be called Thermostatics as it only dealswith questions at equilibrium, and its Second Principle does not explain what are the possible exchangesof Energy under its different forms and only postulates the result of these exchanges, but it would be quitenaive to believe that in an elastic material equilibrium is obtained instantaneously.16

4. H-convergenceAfter his one-dimensional counter-example, Francois MURAT had looked at the more general situation

where u is the solution ofAu = −div

(a grad(u)

)= f in Ω, u ∈ H1

0 (Ω), (4.1)

Ω being a bounded open set of RN , with f ∈ L2(Ω),17 and

a ∈ Aad = a | a ∈ L∞(Ω), α ≤ a ≤ β a.e. in (Ω), (4.2)

with α > 0 (so that by Lax–Milgram lemma, the operator A is an isomorphism from H10 (Ω) onto its dual

H−1(Ω)), with the intention of minimizing a functional J of the form

J(a) =∫

Ω

g(x, u(x), a(x)

)dx, (4.3)

and it is important in the sequel that grad(u) does not enter explicitly in the functional, although somespecial dependence in grad(u) can be allowed, like g(u)grad(u), g(u)a grad(u) or g(u)

(a grad(u).grad(u)

),

with some smoothness and growth properties imposed on g. He had noticed that he could solve explicitlythe special case of “layered” media, involving sequences an ∈ Aad depending on x1 alone for example, andas for (1.4) he assumed that for an interval I such that Ω ⊂ I × RN−1

an a+;1an

1a−

in L∞(I) weak ?, (4.4)

and he had deduced that

un u∞ in H10 (Ω) weak and L2(Ω) strong

an∂un∂x1

a−∂u∞∂x1

in L2(Ω) weak

an∂un∂xj

a+∂u∞∂xj

in L2(Ω) weak for j = 2, . . . , N,

(4.5)

and therefore that u∞ is the solution of

Aeffu∞ = −∑

i,j=1,...,N

∂xi

(Aeffij

∂u∞∂xj

)= f in Ω; u∞ ∈ H1

0 (Ω), (4.6)

16 I heard a talk of Joseph KELLER at a meeting of the Institute for Mathematics and its Applicationsin Minneapolis in 1985, in which he explained damping in real elastic materials by the presence of inho-mogeneities together with the effect of geometric nonlinearity. Elastic waves are scattered by the variousinclusions in the material, or by the grain boundaries in the case of a polycrystal, but without the nonlin-earity of geometric origin in the strain-stress relation there would be no coupling between different modes,and no possible explanation about why Energy gets trapped in higher and higher frequencies, which is thereason why one thinks that one has attained a macroscopic equilibrium.

17 More generally f ∈ H−1(Ω), the dual of H10 (Ω). H1

0 (Ω) is the closure of smooth functions with compactsupport in Ω in the Sobolev space H1(Ω), consisting of functions in L2(Ω) having each of their partial deriva-tives in L2(Ω). As Ω is bounded, Poincare inequality holds on H1

0 (Ω), i.e.∫

Ω|u|2 dx ≤ C

∫Ω|grad(u)|2| dx.

H10 (Ω) is compactly imbedded in L2(Ω) and L2(Ω) is compactly imbedded in H−1(Ω). H−1(Ω) consists of

distributions in Ω which are sums of derivatives of functions in L2(Ω). If the boundary ∂Ω of Ω is smooth,there is a notion of trace on the boundary for functions in H1(Ω), and H1

0 (Ω) consists then of those functionsin H1(Ω) having trace 0 on the boundary.

12

withAeff11 = a−

Aeffjj = a+, j = 2, . . . , N

Aeffij = 0, i 6= j.

(4.7)

Of course, one first extracts a subsequence for which un u∞ in H10 (Ω) weak and L2(Ω) strong, and as (4.5)

has a unique solution, all the sequence converges. For j 6= 1 one has an ∂un

∂xj= ∂(an un)

∂xjand an un a+ u∞

in L2(Ω) weak because un → u∞ in L2(Ω) strong. For computing the limit of an ∂un

∂x1,18 Francois MURAT

used a compactness argument that we had learned from Jacques-Louis LIONS [Li2]: writing Dni = an ∂un

∂xi, he

assumed that f ∈ L2(Ω), and for an interval I in x1 and a cube ω in (x2, . . . , xN ) such that I×ω ⊂ Ω he ob-served that div Dn = f implies that ∂Dn

1∂x1

is bounded in L2(I;H−1(ω)

), and as Dn

1 is bounded in L2(I;L2(ω)

)and the injection of L2(ω) into H−1(ω) is compact, the compactness argument implies that Dn

1 stays in acompact set of L2

(I;H−1(ω)

). A subsequence Dm

1 converges in L2(I;H−1(ω)

)strong to D∞1 and therefore

for ϕ ∈ H10 (ω) the function vm defined in I by vm(x1) =

∫ωDm

1 (x1, x2, . . . , xN )ϕ(x2, . . . , xN ) dx2 . . . dxNconverges in L2(I) strong to v∞ defined in a similar way from D∞1 , and therefore 1

am vm converges in L2(ω)

weak to 1a−v∞, and this implies that D∞1 = a−

∂u∞∂x1

in I × ω; varying the cube I × ω in Ω gives the sameresult in Ω.

I noticed later that the analysis of effective properties of layered media can be greatly simplified by usingthe Div-Curl lemma,19 which we only proved in 1974 after having completed our basic approach describedhere, in an attempt to unify the cases for which we were able to compute explicitly the effective coefficientAeff .20

Lemma 2: (Div-Curl lemma) Let Ω be an open set of RN . Let

En E∞ in L2loc(Ω; RN ) weak

Dn D∞ in L2loc(Ω; RN ) weak

div Dn stays in a compact set of H−1loc (Ω) strong

curl En stays in a compact set of H−1loc (Ω; RN(N−1)/2) strong,

(4.8)

then ∫Ω

( N∑i=1

Eni Dni

)ϕdx→

∫Ω

( N∑i=1

E∞i D∞i

)ϕdx for every ϕ ∈ Cc(Ω), (4.9)

18 He had considered the more general case where an(x) =∏i f

in(xi) with 0 < αi ≤ f in ≤ βi, with

f in f i+ and 1fi

n 1

fi−

in L∞ weak ?, and he had found that an ∂un

∂xj Aeffjj

∂u∞∂xj

in L2(Ω) weak, with

Aeffii = f i−∏j 6=i f

j+ for i = 1, . . . , N , and Aeffij = 0 for i 6= j.

19 Some, who want to avoid mentioning the Div-Curl lemma or the more general theory of CompensatedCompactness which I also developed with Francois MURAT in 1976, do not hesitate to lengthen their proofsin order to use only older methods. The correct behaviour in Mathematics is to mention the shortest proofeven if one does not follow it, usually because the writer finds it too difficult for himself/herself, and assumesthat it would be the same for the reader. Failing to mention such generalizations is a good way to slow downthe progress of Science. In this course, the general Compensated Compactness theory (and the theory ofH-measures which I developed in the late 80s) will only be used in describing methods for obtaining boundson effective coefficients.

20 After learning the term Homogenization, introduced by Ivo BABUSKA, we called these limiting coef-ficients “homogenized” coefficients, but after learning the term effective, from George PAPANICOLAOU, Idecided to adopt it; it is often used by physicists, even if they have almost never defined it correctly.

13

where curl E denotes the lists of all ∂Ei

∂xj− ∂Ei

∂xjand Cc(Ω) is the space of continuous functions with compact

support in Ω.21

In 1974,22 our first proof involved localization in x, Fourier transform, Lagrange formula and Plancherelformula, but in the case where En = grad(un) with un converging weakly to u∞ in H1

loc(Ω), there is aneasier proof by integration by parts, which we had already used: for ϕ ∈ C1

c (Ω) one has∫

Ω(∑iE

ni D

ni )ϕdx =

−〈div(Dn), un ϕ〉−∫

Ωun(∑iD

ni ∂i ϕ) dx and one passes easily to the limit as un ϕ converges in H1

0 (Ω) weakto u∞ ϕ and un converges in L2

loc(Ω) strong to u∞. Most applications of the Div-Curl lemma correspond toEn being a gradient, but I will describe later the Compensated Compactness theorem (Theorem 31), whichI also used for questions of bounds for effective coefficients.

As a corrolary of the Div-Curl lemma, if a sequence of functions ψn which only depends upon x1

converges in L2loc(R) weak to ψ∞, then ψnD

n1 converges in L2

loc(Ω) weak to ψ∞D∞1 if ψn is bounded inL∞; as a simplification, I say that if Dn satisfies the hypothesis in Lemma 2, then Dn

1 does not oscillate inx1.23 Before we had solved the general problem, extending the notion of G-convergence introduced by SergioSPAGNOLO but unaware of his work yet, Francois MURAT had obtained an explicit formula for the effectivecoefficients of a layered media in the general anisotropic case; later, in the Spring 1975, Louis NIRENBERG

had shown me a preprint of W. MCCONNELL (maybe a preliminary version of [MC]), who had derived thegeneral formula for layered media in linearized Elasticity, and it was the analogue of what Francois MURAT

had done, but with more technical computations of Linear Algebra; a few years after I noticed a generalapproach for obtaining the effective behaviour of layered media,24 based on the preceding corollary. Theresult of Francois MURAT can be stated as

En E∞ in L2loc(Ω; RN ) weak; Dn = An(x1)En D∞ in L2

loc(Ω; RN ) weak

div Dn and the components of curl En stay in a compact set of H−1loc (Ω) strong

imply D∞ = Aeff (x1)E∞ a.e. in Ω,

(4.10)

where1An11

1

Aeff11

in L∞(Ω) weak ?

An1iAn11

Aeff1i

Aeff11

in L∞(Ω) weak ? for i = 2, . . . , N

Ani1An11

Aeffi1

Aeff11

in L∞(Ω) weak ? for i = 2, . . . , N

Anij −Ani1A

n1j

An11

Aeffij −Aeffi1 Aeff1j

Aeff11

in L∞(Ω) weak ? for i, j = 2, . . . , N.

(4.11)

21 One cannot use for ϕ the characteristic function of a smooth set, for example, but I have noticed thatone can develop the theory of H-measures with test functions in L∞

⋂VMO, by using a commutation lemma

of COIFMAN, ROCHBERG and WEISS, and therefore one can use ϕ ∈ L∞⋂VMO in the Div-Curl lemma.

22 Joel ROBBIN taught me afterwards how to interpret the Div-Curl lemma in terms of differential forms,and he showed me another proof, based on Hodge decomposition. In 1976, Francois MURAT and I developpedthe Compensated Compactness theory, following our original proof using Fourier transform, and Plancherelformula (Proposition 30, Theorem 31).

23 More precisely, a sequence vn converging weakly in L2loc(Ω) does not oscillate in a direction ξ0 if the

H-measures associated with subsequences do not charge the direction ξ0; a consequence is that vn fn((ξ0.·)

)converges weakly to v∞ f∞

((ξ0.·)

)if fn converges weakly to f∞ in L2

loc(R).24 In 1979, working with Georges DUVAUT as consultants for INRIA, we had been asked about an industrial

application using layers of steel and rubber. I already knew the method shown here in the linear case, andI explained how to use it for nonlinear Elasticity, but I pointed out that there was no general theory ofHomogenization for nonlinear Elasticity (this is still true, as the results based on Γ-convergence do notanswer the right questions).

14

The An are not assumed to be symmetric, and the proof actually shows that uniform ellipticity of theAn is not necessary: in the case of layered media in x1, the result holds if there exists α > 0 such thatAn11 ≥ α a.e. in Ω for all n. The basic idea is that Dn

1 does not oscillate in x1, but also En2 , . . . , EnN ,

because of the information on ∂1Enj −∂j En1 for j ≥ 2; one forms the vector Gn with the “good” components

Dn1 , E

n2 , . . . , E

nN , and one expresses the vector On of the “oscillating” components En1 , D

n2 , . . . , D

nN , and one

has On = Bn(x1)Gn with Bn = Φ(An), and Φ is a well defined (involutive) nonlinear transformation; thecorollary of the Div-Curl lemma shows that the weak limit of BnGn is B∞G∞, and therefore the explanationof (4.11) is that whenever An only depends upon x1 and An11 ≥ α > 0,

Dn = AnEn is the same as

En1Dn

2...

DnN

= Φ(An)

Dn

1

En2...EnN

Φ(An) Φ(Aeff ) in L∞

(Ω;L(RN ; RN )

)weak ? .

(4.12)

In the case of linearized Elasticity, the relation is σnij =∑kl C

nijkl(x1)εnkl, with εnkl = (∂k unl + ∂l u

nk )/2, σn is

the symmetric Caucht stress tensor, and one may always assume that Cnijkl = Cnijlk = Cnjikl for all i, j, k, l,but one does not need to use Cnijkl = Cnklij for all i, j, k, l (which correspond to hyperelastic materials); theequilibrium equations are

∑j ∂j σ

nij = fi for all i; for a direction ξ one defines the acoustic tensor An(ξ)

by Anik(ξ) =∑jl C

nijklξjξl for all i, k, and in the case of layered media in x1, the formula for layers holds

if there exists α > 0 such that (An(e1)λ.λ) ≥ α|λ|2 a.e. in Ω for all λ ∈ RN and all n; in that case thecomponents of the good vector Gn are the σni1 and σn1i for all i and the εnkl for k, l ≥ 2, and the componentsof the oscillating vector On are the εni1 and εn1i for all i and the σnkl for k, l ≥ 2; one rewrites the constitutiverelation Onij =

∑kl Γ

nijklG

nkl and the formulas for layered material in linearized Elasticity that MCCONNELL

had derived consist in writing Γn Γeff .25

Sergio SPAGNOLO had introduced the notion of G-convergence in the late 60s, and unaware of it FrancoisMURAT and I had introduced a slightly different concept in the early 70s,26 for which the name H-convergencewas coined much later.27 We were interested in sequences satisfying

un u∞ in H1loc(Ω) weak

− div(An grad(un)

)= fn → f in H−1

loc strong,(4.13)

where An satisfies

An bounded in L∞(Ω; L(RN ; RN )

); (An λ.λ) ≥ α|λ|2 a.e. in Ω, for all λ ∈ RN and all n, (4.14)

25 In the case of nonlinear Elasticity, the stress tensor used is the Piola–Kirchhoff stress tensor, which isnot symmetric, and the strain F is defined by Fij = δij + ∂j ui where u(x) is the displacement from theinitial position x of a material point to its final position x+u(x); for a material like steel which breaks if onestretches it more than 10%, F lies near the set of rotations SO(3), but in linearized Elasticity one postulates(often wrongly) that it lies near I; in nonlinear Elasticity, the components of the good vector Gn are the σni1for all i and the Fnij for all i but only for j ≥ 2, and the components of the oscillating vector On are the Fni1for all i and the σnij for all i but only for j ≥ 2; in order to be able to compute the constitutive relation asOn = Ψn(Gn) in the case of hyperelastic materials, a natural condition to impose for the stored energy isthe uniform convexity in all directions of the form a⊗ e1.

26 I had met Sergio SPAGNOLO at a CIME course in Varenna in 1970, and he had asked me if my inter-polation results had something to do with his own results, but as soon as he had mentioned that he did notassume any regularity for the coefficients in his work I could tell him that what I had done could not help;however, I did not get a clear idea of what his results were.

27 The name was choosen by Francois MURAT in the lectures that he gave in Algiers [Mu3], shortly after Ihad taught my Peccot lectures in the Spring 1977, where under the title “Homogeneisation dans les equationsaux derivees partielles” I had described my method of oscillating test functions in Homogenization and theCompensated Compactness theorem, but the notion of H-convergence was indeed clear from our early work.

15

with α > 0; the bounds on un were deduced from an application of Lax–Milgram lemma, using specificboundary conditions (we had started with Dirichlet conditions, but after having read that Sergio SPAGNOLO

had noticed that Aeff is the same for different boundary conditions, we checked that this was also clearin our framework). I used notations from Electrostatics, denoting En = grad(un) and Dn = An grad(un),and after having extracted a subsequence so that Dn D∞ in L2

loc(Ω; RN ) weak, the question was toidentify what D∞ was. If one showed that there exists Aeff such that D∞ = Aeff E∞, then with usualboundary conditions u∞ would be the solution of a similar boundary value problem, and this was whatSergio SPAGNOLO had done in the symmetric case; in the nonsymmetric case the knowledge of the inversemapping f 7→ u∞ does not characterize what Aeff is, but as nonsymmetric problems do not occur so oftenin applications, the main advantage of our approach is that after I had introduced my method of oscillatingtest functions,28 it generalizes easily to all sort of linear partial differential equations or systems.29

In the early 70s, we had started by an abstract elliptic framework, where V is a real separable Hilbertspace (corresponding to H1

0 (Ω) in the concrete example that we had in mind), using || · || for the norm onV , || · ||∗ for the norm on V ′, and 〈·, ·〉 for the duality product between V ′ and V , and we had considered abounded sequence An ∈ L(V ;V ′) (corresponding to Anu = −div

(An grad(u)

)in our example), satisfying a

uniform V -ellipticity condition (corresponding to (An(x)ξ.ξ) ≥ α|ξ|2, |An(x)ξ| ≤ M |ξ| for all ξ ∈ RN , a.e.x ∈ Ω in our example), i.e. one assumes that there exist 0 < α ≤M <∞ such that

〈Anu, u〉 ≥ α||u||2 and ||Anu||∗ ≤M ||u|| for all u ∈ V. (4.15)

By Lax–Milgram lemma, each An is an isomorphism from V onto V ′, and the first basic result in thisabstract framework is the following lemma.

Lemma 3: There exists a subsequence Am and a linear continuous operator Aeff from V into V ′ such thatfor every f ∈ V ′, the sequence of solutions um of Amum = f converges in V weak to the solution u∞ ofAeffu∞ = f , and Aeff satisfies

〈Aeffu, u〉 ≥ α||u||2 and ||Aeffu||∗ ≤M2

α||u|| for all u ∈ V, (4.16)

but M2

α can be replaced by M if all the operators An are symmetric.Proof: One has ||(An)−1||L(V ′;V ) ≤ 1

α as An un = f implies α||un||2 ≤ 〈Anun, un〉 = 〈f, un〉 ≤ ||f ||∗||un||,so that ||un|| ≤ 1

α ||f ||∗. One can extract a subsequence um converging in V weak to u∞, and repeating thisextraction for f belonging to a countable dense family F of V ′ and using a diagonal subsequence, one canextract a subsequence Am such that for every f ∈ F , the sequence um = (Am)−1f converges in V weak to alimit S(f); the sequence (Am)−1 being uniformly bounded and F being dense, (Am)−1f converges in V weakto a limit S(f) for every f ∈ V ′. S is a linear continuous operator from V ′ into V , with ||S f || ≤ 1

α ||f ||∗for all f ∈ V ′, and in order to show that S is invertible the mere fact that the operators (An)−1 areuniformly bounded is not sufficient, because in any infinite dimensional Hilbert space, one can construct asequence of symmetric surjective isometries converging weakly to 0 (in L2(0, 1) for example, one can take

28 The method was discovered independently by Leon SIMON [Si]; his student MCCONNELL had only donethe layered case in linearized Elasticity, and he had looked himself at the problem in a general way; it wasthe referee, probably from the Italian school, who had mentioned to him my work. I first wrote about mymethod in [Ta3], but I had explained it to Jacques-Louis LIONS in the Fall 1975 in Marseille [Li3]; he hadbeen convinced by Ivo BABUSKA of the importance in engineering of problems with a periodic structure, buthe had not thought of asking about the proof of our results which I had mentioned in a meeting that he hadorganized in June 1974 [Ta2], either to Francois MURAT who had stayed in Paris, or to me who had spentthe year in Madison, where he actually came in the Spring.

29 As I taught it in my Peccot lectures, my method extends easily to some monotone situations, butas I mentioned at a meeting in Rio de Janeiro a few months after [Ta7], I could not find a good settingfor Homogenization of nonlinear Elasticity. This is still the case, and the spreading error of mistaking Γ-convergence for Homogenization seems to come from the fact that those who commit it had not paid muchattention to the difference between G-convergence and H-convergence.

16

the multiplication by sign(sin(n·)

)). However, the ellipticity condition prevents this difficulty. One notices

that 〈S f, f〉 = limm〈um, f〉 = limm〈Am um, um〉 ≥ α lim infm ||um||2 (and ≤ 1M ||f ||

2∗ in the symmetric case),

and as M ||um|| ≥ ||Amum||∗ = ||f ||∗, one finds 〈S f, f〉 ≥ αM2 ||f ||2∗, and therefore S is invertible by Lax–

Milgram lemma and its inverse Aeff has a norm bounded by M2

α . As um converges in V weak to S f , onehas lim infm ||um||2 ≥ ||S f ||2, so that 〈S f, f〉 ≥ α||S f ||2 for every f ∈ V ′, or equivalently, as we know nowthat S is invertible, 〈Aeff v, v〉 ≥ α||v||2 for every v ∈ V .

Of course, as most results in Functional Analysis, this lemma only gives a general framework and doesnot help much for identifying Aeff in concrete cases,30 but one often uses the information that Aeff isinvertible so that one can choose f ∈ V ′ for which the weak limit u∞ is any element of V prescribed inadvance.

In our concrete example, the problem of G-convergence consists in showing that u∞ solves an equation−div

(Aeff grad(u∞)

)= f , while the problem of H-convergence consists in showing that Am grad(um)

converges in L2(Ω; RN ) weak to Aeff grad(u∞), and this is a different question in nonsymmetric situations,because if one adds to An a constant antisymmetric matrix B (small enough in norm in order to retain theellipticity condition), one will not change the operator An and therefore the preceding abstract result cannothelp identify the precise matrix Aeff , as it only uses the operators An.

Lemma 3 points to a technical difficulty in the nonsymmetric case, because we started with a bound Mfor An and ended with the greater bound M2

α for Aeff , and this is solved by defining differently the boundson the coefficients An.

Definition 4: For 0 < α ≤ β < ∞, M(α, β; Ω) will denote the set of A ∈ L∞(Ω;L(RN ; RN )

)satisfying

(A(x)ξ.ξ) ≥ α|ξ|2 and (A(x)ξ.ξ) ≥ 1β |A(x)ξ|2 for all ξ ∈ RN (or equivalently (A−1(x)ξ.ξ) ≥ 1

β |ξ|2 for all

ξ ∈ RN ), a.e. x ∈ Ω.31 If A is independent of x ∈ Ω, one writes A ∈M(α, β).

The reason for using the sets M(α, β; Ω) is that they are compact for the topology of H-convergencewhich I define now.

Definition 5: We will say that a sequence An ∈ M(α, β; Ω) H-converges to Aeff ∈ M(α′, β′; Ω) for some0 < α′ ≤ β′ <∞,32 if for every f ∈ H−1(Ω), the sequence of solutions un ∈ H1

0 (Ω) of−div(An grad(un)

)= f

converges in H10 (Ω) weak to u∞, and the corresponding sequence An grad(un) converges in L2(Ω; RN ) weak

to Aeff grad(u∞); u∞ is therefore the solution of −div(Aeff grad(u∞)

)= f in Ω.

H-convergence comes from a topology on X =⋃n≥1M

(1n , n; Ω

), union of all M(α, β; Ω), which is the

coarsest topology that makes a list of maps continuous. For f ∈ H−1(Ω) one such map is A 7→ u from Xinto H1

0 (Ω) weak and another one is A 7→ Agrad(u) from X into L2(Ω; RN ) weak, where u is the solutionof −div

(Agrad(u)

)= f . When one restricts that topology to M(α, β; Ω), it is equivalent to consider only

f belonging to a countable bounded set whose combinations are dense in H−1(Ω); then u and Agrad(u)belong to bounded sets respectively of H1

0 (Ω) and L2(Ω; RN ) which are metrizable for their respectuve weaktopology, and therefore the restriction of that topology to M(α, β; Ω) is defined by a countable numberof semi-distances and is therefore defined by a semi-distance. That it is actually a distance can be seenby showing uniqueness of the limit: if a sequence An H-converges to both Aeff and to Beff , then onededuces that Aeff grad(u∞) = Beff grad(u∞) a.e. x ∈ Ω for every f ∈ H−1(Ω), and therefore for everyu∞ ∈ H1

0 (Ω); choosing then u∞ to coincide successively with xj , j = 1, . . . , N , on an open subset ω withcompact closure in Ω, one must have Aeff = Beff a.e. x ∈ ω. Of course, one never needs much from thistopology, but some arguments do make use of the fact that M(α, β; Ω) is metrizable.

However, if one wants to let α tend to 0, like some people do for domains with holes, one must be verycareful because one cannot use arguments based on metrizability in that case.

30 Even when all the operators An are differential operators, it may happen that Aeff is not a differentialoperator and in some cases nonlocal integral corrections must be taken into account.

31 If A ∈ L(RN ; RN ) satisfies (Aξ.ξ) ≥ 1β |Aξ|

2 for all ξ ∈ RN , then |Aξ| ≤ β|ξ| for all ξ ∈ RN , butif A ∈ L(RN ; RN ) satisfies (Aξ.ξ) ≥ α|ξ|2 and |Aξ| ≤ M |ξ| for all ξ ∈ RN , then one can only deduce(Aξ.ξ) ≥ α

M2 |Aξ|2, if A is not symmetric, while of course one has (Aξ.ξ) ≥ 1M |Aξ|

2 if A is symmetric.32 Theorem 6 shows that one can take α′ = α and β′ = β.

17

Theorem 6: For any sequence An ∈ M(α, β; Ω) there exists a subsequence Am and an element Aeff ∈M(α, β; Ω) such that Am H-converges to Aeff .Proof: Using the same argument than in Lemma 3, F being a countable dense set of H−1(Ω), we can extract asubsequence Am such that for every f ∈ F the sequence um ∈ H1

0 (Ω) of solutions of −div(Am grad(um)

)= f

converges in H10 (Ω) weak to u∞ = S(f) and Am grad(um) converges in L2(Ω; RN ) weak to R(f); the same is

true then for all f ∈ H−1(Ω), the operator S is invertible, and R(f) = C u∞ where C is a linear continuousoperator from H1

0 (Ω) into L2(Ω; RN ). It remains to show that C is local, of the form C v = Aeff grad(v) forall v ∈ H1

0 (Ω), and that Aeff ∈ M(α, β; Ω). We first show that for all v ∈ H10 (Ω) one has

(C v.grad(v)

)≥

α|grad(v)|2 and(C v.grad(v)

)≥ 1

β |C v|2 a.e. x ∈ Ω.

For v ∈ H10 (Ω), let f = −div(C v), so that u∞ = v, and let ϕ be a smooth function so that we may use

ϕum and ϕv as test functions. One gets 〈f, ϕ um〉 =∫

Ω

(Am grad(um).ϕ grad(um) + um grad(ϕ)

)dx, and

as um converges strongly to v in L2(Ω) because H10 (Ω) is compactly imbedded into L2(Ω), one deduces that

〈f, ϕ v〉 = limm

∫Ωϕ(Am grad(um).grad(um)

)dx+

∫Ω

(C v.v grad(ϕ)

)dx, but 〈f, ϕ v〉 =

∫Ω

(C v.ϕ grad(v) +

v grad(ϕ))dx, and therefore one deduces that for every smooth function ϕ one has33

∫Ω

ϕ(Am grad(um).grad(um)

)dx→

∫Ω

ϕ(C v.grad(v)

)dx. (4.17)

Choosing now ϕ to be nonnegative, and using the first part of the definition of M(α, β; Ω), we deduce that∫Ω

ϕ(C v.grad(v)

)dx ≥ α lim inf

m

∫Ω

ϕ|grad(um)|2 dx ≥ α∫

Ω

ϕ|grad(v)|2 dx, (4.18)

where the second inequality follows from the fact that grad(um) converges in L2(Ω; RN ) weak to grad(v),and as this inequality holds for all smooth nonnegative functions ϕ, one obtains(

C v.grad(v))≥ α|grad(v)|2 a.e. x ∈ Ω, for every v ∈ H1

0 (Ω). (4.19)

Using the second part of the definition of M(α, β; Ω), we deduce that∫Ω

ϕ(C v.grad(v)

)dx ≥ 1

βlim infm

∫Ω

ϕ|Am grad(um)|2 dx ≥ α∫

Ω

ϕ|C v|2 dx, (4.20)

as Am grad(um) converges in L2(Ω; RN ) weak to C v, so that

(C v.grad(v)

)≥ 1β|C v|2 a.e. x ∈ Ω, for every v ∈ H1

0 (Ω). (4.21)

From (4.21) one deduces

|C v| ≤ β|grad(v)| a.e. x ∈ Ω, for every v ∈ H10 (Ω), (4.22)

and as C is linear, (4.22) implies that

if grad(v) = grad(w) a.e. in an open subset ω, then C v = C w a.e. in ω. (4.23)

Writing Ω as the union of an increasing sequence ωk of open subsets with compact closure in Ω, we defineAeff in the following way: for ξ ∈ RN , we choose vk ∈ H1

0 (Ω) such that grad(vk) = ξ on ωk, and we defineAeff ξ on ωk to be the restriction of C(vk) to ωk; this defines Aeff ξ as a measurable function in Ω becauseC vk and C vl coincide on ωk

⋂ωl by (4.23); (4.23) also implies that Aeff is linear in ξ. If w ∈ H1

0 (Ω)is piecewise affine so that grad(w) is piecewise constant, then (4.23) implies that C w = Aeff grad(w)a.e. x ∈ Ω. As piecewise affine functions are dense in H1

0 (Ω), for each v ∈ H10 (Ω) there is a sequence

33 One could deduce (4.17) directly from the Div-Curl lemma, but I am showing how we first argued, andwe had proved this result before discovering the Div-Curl lemma.

18

wj of piecewise affine functions such that grad(wj) converges strongly in L2(Ω; RN ) to grad(v), and as|C v − Aeff gradwk| = |C v − C wj | ≤ β|grad(v − wj)| a.e. x ∈ Ω, one deduces C v = Aeff grad(v)a.e. x ∈ Ω. Having shown that C v = Aeff grad(v) for a measurable Aeff , (4.19) and (4.21) imply thatAeff ∈M(α, β; Ω) as one can take v to be any affine function in an open subset ω with compact closure inΩ.

At a meeting in Roma in the Spring of 1974,34 I had apparently upset Ennio DE GIORGI by my claimthat my method (which was actually the joint work with Francois MURAT) was more general than the methoddeveloped by the Italian school. I thought that Meyers’s regularity theorem which Sergio SPAGNOLO wasusing in his proof was based on the maximum principle [Me], but Ennio DE GIORGI had told me that it wasnot restricted to second order equations. Actually, it would have been difficult for me at that time to explainhow to perform all the computations for higher order nonnecessarily symmetric equations or to systems likelinearized Elasticity, and even more difficult to explain what to do for nonlinear elliptic equations, but lessthan a year after I had noticed that most of the important known properties of Homogenization of secondorder variational elliptic equations in divergence form could be obtained through repeated applications of theDiv-Curl lemma, and the extension to linear elliptic systems in a variational framework (and some simplenonlinear systems of monotone type) really became straightforward. As my method, which has been wronglycalled the “energy method” and which I prefer to call the “method of oscillating test functions”, uses onlya variational structure, it can be extended with minor changes35 to most of the linear partial differentialequations of Continuum Mechanics (not much is understood for nonlinear equations).

One starts with the same abstract analysis, Lemma 3 and the beginning of the proof of Theorem 6; oneextracts a subsequence Am for which there is a linear continuous operator C from H1

0 (Ω) into L2(Ω; RN )such that for every f ∈ H−1(Ω) the sequence of solutions um ∈ H1

0 (Ω) of −div(Am grad(um)

)= f converges

in H10 (Ω) weak to u∞ and Am grad(um) converges in L2(Ω; RN ) weak to R(f) = C(u∞). One uses then

the Div-Curl lemma to give a new proof that C is a local operator of the form C(v) = Aeff grad(v) withAeff ∈ L∞

(Ω;L(RN ; RN )

).

One constructs a sequence of oscillating test functions vm satisfying

−div((Am)T grad(vm)

)converges in H−1

loc (Ω) strong, (4.24)

where (Am)T is the transposed operator of Am and

vm v∞ in H1(Ω) weak; (Am)T grad(vm) w∞ in L2(Ω; RN ) weak, (4.25)

and one passes to the limit in the identity(Am grad(um).grad(vm)

)=(grad(um).(Am)T grad(vm)

). (4.26)

The Div-Curl lemma applies to the left side of (4.26) which converges in the sense of measures36 to(C(u∞).grad(v∞)

), because div

(Am grad(um)

)is a fixed element of H−1(Ω) and grad(vm) converges in

34 Umberto MOSCO had insisted that I should write something before I left, and instead of visiting Roma,I stayed in the hotel, so nicely located above Piazza di Spagna, writing [Ta1] and another short descriptionconcerning quasi-variational inequalities.

35 Becoming a mathematician requires some ability with abstract concepts, and it is part of the trainingto check that one can apply a general method to particular examples, and it depends upon one’s taste andone’s own scientific stature to decide if it was an exercise or something worth publishing.

36 A sequence sn converges to s∞ in the sense of distributions in Ω if for every ϕ ∈ C∞c (Ω), the space ofindefinitely differentiable functions with compact support in Ω, one has 〈sn, ϕ〉 → 〈s∞, ϕ〉; the sn must bedistributions and according to the theory of distributions of Laurent SCHWARTZ the limit s∞ is automaticallya distribution. The sequence sn converges to s∞ in the sense of measures if the preceding convergence holdsfor every ϕ ∈ Cc(Ω), the space of continuous functions with compact support in Ω; the sn must be Radonmeasures and s∞ is automatically a Radon measure. If sn ∈ L1

loc(Ω), then 〈sn, ϕ〉 means∫

Ωsn ϕdx.

19

L2(Ω; RN ) weak to grad(v∞); the Div-Curl lemma also applies to the right side of (4.26) which converges inthe sense of measures to (grad(u∞).w∞), because of (4.24) and (4.25), and this shows(

C(u∞).grad(v∞))

= (grad(u∞).w∞) a.e. in Ω. (4.27)

One constructs a sequence vm satisfying (4.24) and (4.25) by first choosing an open set Ω′ containing theclosure of Ω, then extending Am in Ω′ \ Ω for example by Am(x) = αI for x ∈ Ω′ \ Ω, and then choosingvm ∈ H1

0 (Ω′) solution of−div

((Am)T grad(vm)

)= g in Ω′, (4.28)

for some g ∈ H−1(Ω′). One obtains a sequence vm bounded in H10 (Ω′) and therefore its restriction to Ω is

bounded in H1(Ω), and a subsequence satisfies (4.24) and (4.25). By Lemma 3 one can choose g ∈ H−1(Ω′)such that v∞ is any arbitrary element of H1

0 (Ω′), and in particular for each j = 1, . . . , N , there exists gj ∈H−1(Ω′) such that v∞ = xj a.e. in Ω, and using these N choices of gj , (4.27) means C u∞ = Aeff grad(u∞)for some Aeff ∈ L2

(Ω;L(RN ; RN )

).

This method quickly gives only an intermediate result and is not very good for questions of bounds,and it must emphasized that questions of bounds are not yet very well understood for general equations orsystems. As pointed out by Francois MURAT, one can easily show that Aeff ∈ L∞

(Ω;L(RN ; RN )

)by using

the following lemma based on the continuity of C, but the information Aeff ∈M(α, β; Ω) is not so naturalin this approach.

Lemma 7: If M ∈ L2(Ω;L(RN ; RN )

)and the operator C defined by C(v) = M grad(v) for all v ∈ H1

0 (Ω) isa linear continuous operator from H1

0 (Ω) into L2(Ω; RN ) of norm ≤ γ, then one has M ∈ L∞(Ω;L(RN ; RN )

)and ||M(x)||L(RN ;RN ) ≤ γ a.e. in Ω.Proof: Let ξ ∈ RN \ 0 and ϕ ∈ C1

c (Ω), the space of functions of class C1 with compact support inΩ. Define ϕn by ϕn(x) = ϕ(x) sinn(ξ.x)

n for x ∈ Ω, which gives a bounded sequence in H10 (Ω) with

limn ||grad(ϕn)||L2(Ω;RN ) = 1√2||ξ ϕ||L2(Ω;RN ) = |ξ|√

2||ϕ||L2(Ω). C(ϕn) = M

( sinn(ξ.·)n grad(ϕ) + ϕ cosn(ξ.·)ξ

),

and therefore limn ||C(ϕn)||L2(Ω;RN ) = 1√2||ϕM ξ||L2(Ω), because ϕ and grad(ϕ) are bounded, so that ϕM ξ

and M grad(ϕ) belongs to L2(Ω; RN ). Therefore one deduces that ||ϕM ξ||L2(Ω) ≤ γ|ξ| ||ϕ||L2(Ω) for everyϕ ∈ C1

c (Ω), and therefore for every ϕ ∈ L2(Ω) by density, and this means ||M ξ||L∞(Ω;RN ) ≤ γ|ξ|, and as thisis valid for all ξ, the lemma is proved.

Using the same approach,37 one can derive a few useful properties of H-convergence, the main toolremaining the Div-Curl lemma.

Proposition 8: If a sequence An ∈ M(α, β; Ω) H-converges to Aeff , then the transposed sequence (An)T

H-converges to (Aeff )T . In particular if a sequence An H-converges to Aeff and if An(x) is symmetric a.e.x ∈ Ω for all n, then Aeff (x) is symmetric a.e. x ∈ Ω.Proof: A ∈ M(α, β; Ω) implies (and therefore is equivalent to) AT ∈ M(α, β; Ω) as A ∈ M(α, β; Ω) means(A(x)ξ.ξ) ≥ α|ξ|2 and (A−1(x)ξ.ξ) ≥ 1

β |ξ|2 for all ξ ∈ RN , a.e. x ∈ Ω, and as (A−1)T = (AT )−1, this is

the same as (AT (x)ξ.ξ) ≥ α|ξ|2 and((AT )−1(x)ξ.ξ

)≥ 1

β |ξ|2 for all ξ ∈ RN , a.e. x ∈ Ω. By Theorem 6 a

subsequence (Am)T H-converges to Beff . For f, g ∈ H−1(Ω), let us define the sequences um, vm ∈ H10 (Ω)

by−div

(Am grad(um)

)= f, −div

((Am)T grad(vm)

)= g in Ω, (4.29)

so that um and vm converge in H10 (Ω) weak respectively to u∞ and v∞, Am grad(um) and (Am)T grad(vm)

converge in L2(Ω; RN ) weak respectively to Aeff grad(u∞) and Beffgrad(v∞), and one can use the Div-Curllemma to take the limit of the identity(

Am grad(um).grad(vm))

=(grad(um).(Am)T grad(vm)

), (4.30)

37 The preceding method is not adapted to nonlinear problems, but I developped a variant where theoscillating test functions satisfy the initial equation instead of the transposed equation, and it extends tothe case of monotone operators.

20

and obtain (Aeff grad(u∞).grad(v∞)

)=(grad(u∞).Beff grad(v∞)

), (4.31)

and as u∞ and v∞ can be arbitrary elements of H10 (Ω) by Lemma 3, this implies (Aeff )T = Beff a.e. in Ω.

The second part of the Proposition results from uniqueness of H-limits.

The next result shows that H-convergence inside Ω is not related to any particular boundary conditionimposed on ∂Ω.

Proposition 9: If a sequence An ∈ M(α, β; Ω) H-converges to Aeff and a sequence un converges inH1loc(Ω) weak to u∞, and div

(An grad(un)

)belongs to a compact set of H−1

loc (Ω) strong, then the sequenceAn grad(un) converges to Aeff grad(u∞) in L2

loc(Ω; RN ) weak.Proof: Let ϕ ∈ C1

c (Ω) so that ϕun converges in H10 (Ω) weak to ϕu∞, ϕgrad(un) converges in L2(Ω; RN )

weak to ϕgrad(u∞); curl(ϕgrad(un)

)has its components bounded in L2(Ω), as they are of the form

∂ϕ∂xj

∂un

∂xk− ∂ϕ

∂xk

∂un

∂xj. As div

(ϕAn grad(un)

)= ϕdiv

(An grad(un)

)+ (An grad

(un).grad(ϕ)

), it belongs to a

compact set of H−1(Ω) strong, as multiplication by ϕ maps H−1loc (Ω) into H−1(Ω) and (An grad

(un).grad(ϕ)

)is bounded in L2(Ω). One extracts a subsequence such that ϕAm grad(um) converges in L2(Ω; RN ) weak tow∞. For f ∈ H−1(Ω), one defines vn ∈ H1

0 (Ω) by −div((An)T grad(vn)

)= f , so that vn converges in H1

0 (Ω)weak to v∞ and (An)T grad(vn) converges in L2(Ω; RN ) weak to (Aeff )T grad(v∞) by Proposition 8. Onethen passes to the limit in both sides of

(ϕAm grad(um).grad(vm)

)=(ϕgrad(um).(Am)T grad(vm)

)by

using the Div-Curl lemma, and one obtains the relation(w∞.grad(v∞)

)=(ϕgrad(u∞).(Aeff )T grad(v∞)

)a.e. in Ω. As v∞ is an arbitrary element of H1

0 (Ω) by Lemma 3, w∞ = ϕAeff grad(u∞) a.e. in Ω, and as ϕis arbitrary in C1

c (Ω) and the limit does not depend upon which subsequence has been chosen, one deducesthat all the sequence An grad(un) converges in L2

loc(Ω; RN ) weak to Aeff grad(u∞).

In the preceding proof, div(An grad(ϕun)

)may not belong to a compact set of H−1(Ω) strong as it is

div(ϕAn grad(un)

)+ div

(unA

n grad(ϕ))

and div(ϕAn grad(un)

)does indeed belong to a compact set of

H−1(Ω) strong as was already used, but it is not clear if div(unA

n grad(ϕ))

does, because unAn grad(ϕ)may only converge in L2(Ω; RN ) weak. We have used then the complete form of the Div-Curl lemma andnot only the special case where one only considers gradients.

Proposition 9 expresses that the boundary conditions used for un are not so important, as long as thesolutions stay bounded, as had been noticed by Sergio SPAGNOLO in the case of G-convergence. We did defineH-convergence by using Dirichlet conditions, but the result inside Ω would be the same for other boundaryconditions, if one can apply Lax–Milgram lemma for existence as we need to start by using Lemma 3. UsingDirichlet conditions has the advantage that no smoothness assumption is necessary for the boundary of Ω.What happens on the boundary ∂Ω may depend upon the particular boundary condition used; the particularcases of nonhomogeneous Dirichlet conditions, Neumann conditions, and other variational conditions can allbe considered at once in the framework of variational inequalities, allowing actually some nonlinearity in theboundary conditions (the nonlinearity inside Ω is a different matter).

The next result states that H-convergence has a local character, extending the corresponding result ofSergio SPAGNOLO for G-convergence.

Proposition 10: If a sequence An ∈ M(α, β; Ω) H-converges to Aeff , and ω is an open subset of Ω, thenthe sequence Mn = An |ω of the restrictions of An to ω H-converges to Meff = Aeff |ω. Therefore if asequence Bn ∈ M(α, β; Ω) H-converges to Beff and An = Bn for all n, a.e. x ∈ ω, then Aeff = Beff a.e.x ∈ ω.Proof: If all An belong to M(α, β; Ω), then all Mn belong to M(α, β;ω) and by Theorem 6 a subse-quence Mm H-converges to some Meff ∈ M(α, β;ω). For f ∈ H−1(ω) and g ∈ H−1(Ω), let us solve−div

(Mm grad(um)

)= f in ω and −div

((Am)T grad(vm)

)= g in Ω so that um converges in H1

0 (ω) weakto u∞, Mm grad(um) converges in L2(ω; RN ) weak to Meff grad(u∞), vm converges in H1

0 (Ω) weak to v∞,and (Am)T grad(vm) converges in L2(Ω; RN ) weak to (Aeff )T grad(v∞). Extending un and u∞ by 0 inΩ \ ω, one can apply the Div-Curl lemma in ω for the left side and in Ω for the right side of the equality(Mm grad(um).grad(vm)

)=(grad(um).(Am)T grad(vm)

), and one obtains

(Meff grad(u∞).grad(v∞)

)=(

grad(u∞).(Aeff )T grad(v∞))

a.e. in ω. As by Lemma 3 u∞ can be arbitrary in H10 (ω) and v∞ arbitrary

21

in H10 (Ω), one deduces that Meff = Aeff a.e. in ω; as the H-limit is independent of the subsequence used,

all the sequence Mn H-converges to Aeff |ω.

Actually if for a measurable subset ω of Ω, one has An = Bn for all n, a.e. x ∈ ω, and the sequencesAn, Bn ∈ M(α, β; Ω) H-converge respectively in Ω to Aeff , Beff , then one has Aeff = Beff a.e. x ∈ ω.This can be proved by applying the same regularity theorem of MEYERS [Me] that Sergio SPAGNOLO usedin the symmetric case. It is equivalent to prove that Aeff = Beff a.e. in ω(ε) for each ε > 0, whereω(ε) is the set of points of ω at a distance at least ε from ∂Ω. Defining vn as above but choosing f ∈H−1(Ω) and un ∈ H1

0 (Ω) instead, the problem is to use χω(ε), the characteristic function of ω(ε), as atest function in the Div-Curl lemma in Ω. For obtaining that result one first takes f, g ∈ W−1,p(Ω) withp > 2, and Meyers’s regularity theorem tells that grad(un) and grad(vn) stay bounded in Lq(ε)

(ω(ε)

)for

some q(ε) ∈ (2, p], and therefore(An grad(un).grad(vn)

)and

(grad(un).(Bn)T grad(vn)

)(which are equal

on ω) are bounded in Lq(ε)/2(ω(ε)

), and converge in Lq(ε)/2

(ω(ε)

)weak to

(Aeff grad(u∞).grad(v∞)

)and(

grad(u∞).(Beff )T grad(v∞))

which are then equal a.e. in ω(ε). As W−1,p(Ω) is dense in H−1(Ω), onecan pass to the limit in this equality in ω(ε) and obtain it for arbitrary f, g ∈ H−1(Ω), i.e. for arbitraryu∞, v∞ ∈ H1

0 (Ω), and that gives Aeff = Beff a.e. in ω(ε), and therefore a.e. in ω.The argument of Proposition 10 is variational and extends therefore to all variational situations, while in

order to extend the preceding argument to a general situation one would have to prove a regularity theoremlike MEYERS’s one, and I do not know if this has been done; it has been checked by Jacques-Louis LIONS

that the analogous statement is valid for some linearized Elasticity systems.

5. Bounds on effective coefficients: first methodIn the case An = anI, which we had investigated first, we knew that the L∞(Ω) weak ? limits of an and

1an , denoted respectively by a+ and 1

a−, are needed for expressing Aeff in the case where an only depends

upon one variable. We had obtained sequences En = grad(un) and Dn = anEn = an grad(un) convergingin L2(Ω; RN ) weak, respectively to E∞ and D∞, and the analogue of (4.17) had told us that an|En|2 wasconverging in the sense of measures to (D∞.E∞). We had then decided to look at the convex hull in R2N+3

of the setK =

(E, aE, a|E|2, a, 1

a

)| E ∈ RN , a ∈ [α, β]

, (5.1)

with the goal of investigating what could be deduced if D∞ and E∞ satisfy the property(E∞, D∞, (D∞.E∞), a+,

1a−

)∈ clconv(K). (5.2)

I will show the necessary computations, which we did not carry out exactly as (5.34)/(5.40) in the early 70s,as we had noticed that a simple argument of convexity showed that

a−I ≤ Aeff ≤ a+I. (5.3)

Lemma 11: If a sequence vn converges in L2(Ω; RN ) weak to v∞, if Mn ∈ M(α, β; Ω) is symmetric a.e.x ∈ Ω and (Mn)−1 converges in L∞

(Ω;L(RN ; RN )

)weak ? to (M−)−1, then for every nonnegative continuous

function ϕ with compact support in Ω, one has

lim infn

∫Ω

ϕ(Mn vn.vn) dx ≥∫

Ω

ϕ(M− v∞.v∞) dx, (5.4)

i.e. if (Mn vn.vn) converges to a Radon measure ν in the sense of measures (i.e. weakly ?), then ν ≥(M− v∞.v∞) in the sense of measures in Ω.Proof: If Ls+(RN ; RN ) denotes the convex cone of symmetric positive operators from RN into itself, Lemma11 follows from the fact that (P, v) 7→ (P−1 v.v) is convex on Ls+(RN ; RN )× RN . Indeed, one has

(P−1 v.v) = (P−10 v0.v0) + 2(P−1

0 v0.v − v0)− (P−10 (P − P0)P−1

0 v0.v0) +R (5.5)

22

and the remainder R is nonnegative for every P0 ∈ Ls+(RN ; RN ) and v0 ∈ RN , as an explicit computationshows that

R =(P (P−1 v − P−1

0 v0).(P−1 v − P−10 v0)

). (5.6)

As an application of Lemma 11, we can deduce upper bounds as well as lower bounds for Aeff , improvingthen the result of Theorem 6; the bounds are expressed in terms of the weak ? limit of An and the weak ?limit of (An)−1.

Proposition 12: Assume that a sequence An ∈ M(α, β; Ω) H-converges to Aeff , if (An)T (x) = An(x) a.e.x ∈ Ω for all n, and satisfies

An A+; (An)−1 (A−)−1 in L∞(Ω;L(RNRN )

)weak ? . (5.7)

Then one hasA− ≤ Aeff ≤ A+ a.e. x ∈ Ω. (5.8)

Proof: In the proof of Theorem 6 we have constructed a sequence grad(un) converging in L2(Ω; RN ) weakto grad(u∞), and such that An grad(un) converges in L2(Ω; RN ) weak to Aeff grad(u∞); moreover we hadshown that

(An grad(un).grad(un)

)converges in the sense of measures to

(Aeff grad(u∞).grad(u∞)

)(which

can be deduced from the Div-Curl lemma).By using Lemma 11 with Mn = An and vn = grad(un) one obtains

(Aeff grad(u∞).grad(u∞)

)≥(

A− grad(u∞).grad(u∞))

in the sense of measures. As both sides of the inequality belong to L1(Ω) theinequality is valid a.e. x ∈ Ω. From the fact that u∞ can be any element of H1

0 (Ω), one can choosegrad(u∞) to be any constant vector λ on an open subset ω with compact closure in Ω, so that one hasproved that (Aeffλ.λ) ≥ (A−λ.λ) for every λ ∈ RN , and therefore Aeff ≥ A− a.e. in Ω.

Similarly,(Aeff grad(u∞).grad(u∞)

)≥((A+)−1Aeff grad(u∞).Aeff grad(u∞)

)in the sense of mea-

sures, by applying Lemma 11 with Mn = (An)−1 and vn = An grad(un), and as both sides of the inequalitybelong to L1(Ω) the inequality is valid a.e. x ∈ Ω, and choosing grad(u∞) = λ on an open subset ω withcompact closure in Ω, one obtains (Aeff λ.λ) ≥ (A+)−1Aeff λ.Aeff λ) for every λ ∈ RN , and thereforeAeff ≥ Aeff (A+)−1Aeff , or equivalently (Aeff )−1 ≥ (A+)−1 or Aeff ≤ A+ a.e. in Ω.

There is an important logical point to be emphasized here, as this kind of result may easily be attributedto a few different persons. It would be interesting to check if those who either claim to have proved it beforeFrancois MURAT and I had proved it in the early 70s, or claim that it had been proved a long time ago bysuch or such a pioneer in Continuum Mechanics or Physics, could show that there was a clear definition ofwhat one was looking for in any of these “proofs”. When one says that something is well known, it onlymeans that it is well known by those who know it well, and in Roma in 1974, I think that Ennio DE GIORGI

did not know about such an inequality; I believe that he would have quickly found a proof if I had askedhim the question instead of saying that I had already proved the result. It is difficult to explain how anyonecould have proved the result before there was a definition of what effective coefficients were, i.e. before thework of Sergio SPAGNOLO in the late 60s or the work of Francois MURAT and me in the early 70s.

Many would probably argue that they knew about effective coefficients much before there was a defi-nition, and indeed some had a good intuition about that question, but many just had a fuzzy idea of whatit was about, and I could observe that at a meeting at the Institute of Mathematics and its Applications inMinneapolis in the Fall of 1995 when one of the speakers challenged the mathematicians by saying that hehad proved a result that mathematicians had not proved. Of course that could well happen and I have alwaysbeen ready to learn from engineers about anything that they may know on interesting scientific matters, butit did not start well because the speaker was working with linearized Elasticity, and if there are still a fewmathematicians who do not know about the defects of linearized Elasticity it is not really my fault becauseI have been a strong advocate of mentioning the known defects of models, and those of linearized Elasticityare well known by now, and I expected a better understanding about this kind of questions from an engineeranyway. The speaker pretended then to have proved bounds for (linear) effective elastic coefficients usinginclusions of (linear) elastic materials that were not elliptic, and as he was claiming that his bounds onlyinvolved proportions I had mentioned to my neighbour and then loud enough to be heard by all that it was

23

false;38 indeed my comment was heard by many but after the talk there was only one person in the audienceinterested in clarifying the question, as John WILLIS came to tell me that the speaker had not really meant tosay nonelliptic, because the materials that he wanted to use were actually strongly elliptic (by opposition tovery strongly elliptic), and certainly an engineer who does not know the definition of ellipticity should avoidchallenging mathematicians in public,39 but that was not the only problem in the statement. The speakerhad only obtained results for Dirichlet conditions and he would have been in great trouble for showing thathis results were local and could be obtained for all kinds of variational boundary conditions. Some peoplemight argue that boundary conditions are not of such importance in Elasticity, as they may have in otherquestions,40 but the problem is that the materials used in the mathematical approach do not exist in the realworld, and if engineers misuse their knowledge and intuition about the real world by considering unrealisticsituations and pretending that they know the mathematical answer to some questions, they may just bewrong: I do not see much reason why engineers should have a better intuition than mathematicians aboutproblems which are completely unrealistic.41

According to the work of Sergio SPAGNOLO in the lates 60s or the work of Francois MURAT and me inthe early 70s, homogenized/effective properties are “local” properties of a mixture of materials, and in thiscourse about Optimal Design it is extremely important to use local properties and to avoid any restrictionlike periodic situations for example, because we are looking for the best design and we should not postulatewhat we would like it to be. It is a different question to consider “global” properties, like how much energy islocated in a container, and some pioneers might have understood effective coefficients only in this restrictedway. People who are interested in the question of how much energy a given domain contains without being

38 For a diffusion equation, the question is very similar to using a function of one complex variable in thestyle of the work of David BERGMAN [Be], and extending it for small negative real values, and one cannotexpect to use materials which are not elliptic without imposing something on the interface, as can be checkedfor the checkerboard pattern, according to a formula of Joseph KELLER [Ke], which George PAPANICOLAOU

pointed out to me in 1980 after I had proved the same result. In the early 80s, Stefano MORTOLA and SergioSTEFFE had shown that using regularity of interfaces one can indeed use some “materials” with nonellipticcoefficients, but there is a relation between a measure of the regularity of the interface and the amount ofnonellipticity allowed [Mo&St1].

39 He may also have mistaken as mathematicians some people who do not hesitate to publish withoutcorrect attribution some results that they have learned from others, without realizing that it may soonbecome apparent that they do not understand very well the subject that they are talking about; the lackof reaction of the audience might have been a sign that many did not care much about publishing wrongresults.

40 I remember a Chemistry teacher showing us a piece of phosphorus in a container full of oil, and heexplained why it was kept in oil by taking a small piece out, and it quickly burst into flames; the boundaryconditions on a piece of phosphorus are important because of chemical reactions taking place precisely atthe boundary.

41 After this incident in Minneapolis, I asked my student Sergio GUTIERREZ to look if one could extendthe theory of Homogenization in linearized Elasticity to some materials which are strongly elliptic but notvery strongly elliptic. As he showed as part of his PhD thesis [Gu], a local theory englobing the very stronglyelliptic materials and for which the formula of layers is valid cannot englobe any (isotropic) material which isstrongly elliptic but not very strongly elliptic, except perhaps for the limiting cases. I considered that resultas a fact that it is unlikely that such materials may exist, and I conjectured that the (linearized) evolutionequation with a single interface with one of these materials could be ill posed. Mort GURTIN has pointed outthat one can obtain some of these materials by linearization around an unstable equilibrium in (nonlinear)Elasticity, and it suggests then that it is unlikely that one could avoid Dirichlet conditions if one uses some ofthese materials, except perhaps by putting enormous forces at the boundary in order to avoid these materialsto become unstable. It is interesting to notice that the publication of Sergio GUTIERREZ’s result has createda strange reaction from a referee, who thought that it was contradicting a result on Γ-convergence; I havenot been able to convince the editor that this was irrelevant and proved that the referee did not understandwhat Homogenization is about if he/she thinks that Γ-convergence is Homogenization, that the correctnessof the computations of Sergio GUTIERREZ is quite easy to check and that there is no reason to impose onhim the burden of explaining the errors that others may have committed elsewhere in unrelated subjects.

24

interested about where this energy is located precisely and how this energy moves around,42 often drift toquite unrealistic questions, and some still use names like Elasticity for these unrealistic questions, luring afew naive mathematicians out of the scientific path.

Let us go back to deriving bounds on effective coefficients, and look at the compatibility of H-convergencewith the usual preorder relation on L(RN ; RN ); let us recall that for A,B ∈ L(RN ; RN ), A ≤ B means thatfor every ξ ∈ RN one has (Aξ.ξ) ≤ (Bξ.ξ); this preorder is not an order on L(RN ; RN ), but it is a partialorder if one restricts attention to symmetric operators.

Proposition 13: [DG&Sp] If a sequence An ∈ M(α, β; Ω) satisfies (An)T = An for all n, a.e. x ∈ Ω andH-converges to Aeff , if a sequence un converges to u∞ in H1

0 (Ω) weak and if ϕ ≥ 0 in Ω with ϕ ∈ Cc(Ω),then one has

lim infn

∫Ω

ϕ(An grad(un).grad(un)

)dx ≥

∫Ω

ϕ(Aeff grad(u∞).grad(u∞)

)dx. (5.9)

For every u∞ ∈ H10 (Ω) there exists a sequence vn converging to u∞ in H1

0 (Ω) weak and such that for everyϕ ∈ Cc(Ω) one has

limn

∫Ω

ϕ(An grad(vn).grad(vn)

)dx =

∫Ω

ϕ(Aeff grad(u∞).grad(u∞)

)dx. (5.10)

Proof: Let f = −div(Aeff grad(u∞)

)and let vn ∈ H1

0 (Ω) be the solution of −div(An grad(vn)

)= f in Ω,

then vn converges to some v∞ in H10 (Ω) weak and as v∞ is solution of −div

(Aeff grad(v∞)

)= f in Ω, one

must have v∞ = u∞ a.e. in Ω; therefore vn converges to u∞ in H10 (Ω) weak and An grad(vn) converges to

Aeff grad(u∞) in L2(Ω; RN ) weak. Then one computes

lim infn

∫Ω

ϕ(An(grad(un)− grad(vn)

).grad(un)− grad(vn)

)dx, (5.11)

which is a nonnegative number. One term is∫

Ωϕ(An grad(un).grad(un)

)dx whose lim infn is what we are

interested in; then, because of the symmetry of An, the other terms are∫

Ωϕ(An grad(vn).grad(−2un +

vn))dx, and the Div-Curl lemma applies so the limit is −

∫Ωϕ(Aeff grad(u∞).grad(u∞)

)dx, and this gives

(5.10). Of course, (5.11) is obtained by using the sequence vn just constructed and applying the Div-Curllemma.

The preceding result is due to Ennio DE GIORGI and Sergio SPAGNOLO [DG&Sp], who were usingcharacteristic functions of measurable sets for ϕ, because the use of Meyers’s regularity result permits toprove the preceding result for ϕ ≥ 0, ϕ ∈ L∞(Ω). Notice that the convexity argument of Lemma 11 givesA− on the right side of (5.11) instead of Aeff , so Proposition 13 is more precise then that Lemma 11 asA− ≤ Aeff by Proposition 12.

Proposition 14: If a sequence An ∈M(α, β; Ω) satisfies (An)T = An for all n, a.e. x ∈ Ω and H-convergesto Aeff , if a sequence Bn ∈ M(α, β; Ω) H-converges to Beff and if An ≤ Bn a.e. in Ω for all n, then onehas Aeff ≤ Beff a.e. x ∈ Ω.Proof: Let g ∈ H−1(Ω) and let vn ∈ H1

0 (Ω) be the sequence of solutions of −div(Bn grad(vn)

)= g in Ω,

which converges in H10 (Ω) weak to v∞, and Bn grad(vn) converges in L2(Ω; RN ) weak to Beff grad(v∞).

Then for ϕ ≥ 0, ϕ ∈ Cc(Ω), one passes to the limit in the inequality∫Ω

ϕ(Bn grad(vn).grad(vn)

)dx ≥

∫Ω

ϕ(An grad(vn).grad(vn)

)dx. (5.12)

42 Many mathematicians still seem to believe in a world described by stationary equations, where energyis minimized, as if they had never heard about the First Principle of “conservation of energy”, and althoughit is a difficult question to explain how some energy is transformed into heat and how good the SecondPrinciple is, it certainly seems utterly unrealistic to believe that after a short time every system finds itspoint of minimum potential energy.

25

The left side converges to∫

Ωϕ(Beff grad(v∞).grad(v∞)

)dx by the Div-Curl lemma; the lim infn of the

right side is ≥∫

Ωϕ(Aeff grad(v∞).grad(v∞)

)dx by Proposition 13, and one obtains∫

Ω

ϕ(Beff grad(v∞).grad(v∞)

)dx ≥

∫Ω

ϕ(Aeff grad(v∞).grad(v∞)

)dx. (5.13)

As v∞ is arbitrary in H10 (Ω), one obtains ϕBeff ≥ ϕAeff a.e. in Ω, and varying ϕ gives the result.

The second part of Proposition 13 is valid without any symmetry requirement, but the first part is notalways true without symmetry. Similarly, Proposition 14 is not true for a general sequence An, even if all theBn are symmetric instead. Furthermore, in Proposition 12 we compared Aeff with A+, and the symmetryhypothesis on An is also important there. Let us construct a counter-example for N ≥ 2 (as every operatoris symmetric if N = 1); define An by

An = I + ψn(x1)(e1 ⊗ e2 − e2 ⊗ e1), (5.14)

whereψn Ψ1; (ψn)2 Ψ2 in L∞(R) weak ?, (5.15)

and choose the sequence ψn so that Ψ2 > (Ψ1)2. The formula for layers (4.11) shows that

An H-converges to Aeff = I + Ψ1(e1 ⊗ e2 − e2 ⊗ e1) +(Ψ2 − (Ψ1)2

)e2 ⊗ e2. (5.16)

As A+ = I+Ψ1(e1⊗e2−e2⊗e1), this gives an example where Aeff ≤ A+ is false. One has An ≤ I for all n,while one does not have Aeff ≤ I, and one has instead Aeff ≥ I, as it must be from Proposition 14 becauseone has An ≥ I for all n. Finally, taking un = u∞ for all n, one has

∫Ωϕ(An grad(un).grad(un)

)dx =∫

Ωϕ| grad(u∞)|2 dx but∫

Ω

(Aeff grad(u∞).grad(u∞)

)dx =

∫Ω

ϕ(|grad(u∞)|2 + (Ψ2 −Ψ2

1)∣∣∣∂u∞∂x2

∣∣∣2) dx (5.17)

showing that in this case (5.9) is not true.

Although in this course on Optimal Design all the problems considered are symmetric, I find useful todescribe general results valid without symmetry assumptions: it is a good training in Mathematics to learnhow to deal with general problems first so that one can easily deduce what to do on simpler problems, andit is usually difficult for those who have been only trained on simple special cases to understand what to dowhen they encounter a new situation. For that reason, I describe estimates which are useful for questions ofperturbations and continuous dependence of H-limits with respect to parameters.

Lemma 15: Let A ∈M(α, β; Ω) and D ∈ L∞(Ω;L(RN ; RN )

)with

||D||L∞(

Ω;L(RN ;RN )) ≤ δ < α, (5.18)

then

A+D ∈M(α− δ, α β − δ

2

α− δ; Ω). (5.19)

Proof: Of course((A+D)ξ.ξ

)≥ α|ξ|2−|D ξ|.|ξ| ≥ (α−δ)|ξ|2. If A and D are symmetric, one has immediately

A+D ∈M(α−δ, β+δ; Ω), but in the general case the replacement for β requires more technical computations.One first notices that (Aξ.ξ) ≥ 1

β |Aξ|2 means

∣∣Aξ− β2 ξ∣∣ ≤ β

2 |ξ|, and drawing a picture in a Euclidean planecontaining ξ and Aξ helps understand how to obtain the above bound and also see why it is optimal.Analytically, defining L by 2L = αβ−δ2

α−δ one wants to show that for all ξ one has |(A+D)ξ−Lξ| ≤ L|ξ|, andthis will be a consequence of |Aξ−Lξ| ≤ (L− δ)|ξ|, i.e. of |Aξ|2− 2L(Aξ.ξ) ≤ (−2δ L+ δ2)|ξ|2, and by thedefinition of L one has −2δ L+ δ2 = (β − 2L)α; then one notices that |Aξ|2 − 2L(Aξ.ξ) ≤ (β − 2L)(Aξ.ξ)which is ≤ (β − 2L)α|ξ|2 as β − 2L ≤ 0.

26

Proposition 16: If sequences An ∈M(α, β; Ω) and Bn ∈M(α′, β′; Ω) H-converge respectively to Aeff andBeff , and |Bn −An|L(RN ;RN ) ≤ ε a.e. x ∈ Ω for all n, then

||Beff −Aeff ||L∞(Ω;L(RN ;RN )) ≤ ε√β β′

αα′. (5.20)

Proof: For f, g ∈ H−1(Ω), one solves

−div(An grad(un)

)= f in Ω; −div

((Bn)T grad(vn)

)= g in Ω, (5.21)

so un, vn converge in H10 (Ω) weak respectively to u∞, v∞, and An grad(un) and (Bn)T grad(vn) converge in

L2(Ω; RN ) weak respectively to Aeff grad(u∞) and (Beff )T grad(v∞). By the Div-Curl lemma one knowsthat

(An grad(un).grad(vn)

)and

(grad(un).(Bn)T grad(vn)

)converge in the sense of measures respectively

to(Aeff grad(u∞), grad(v∞)

)and

(grad(u∞).(Beff )T grad(v∞)

), and so for every ϕ ∈ Cc(Ω) one has

limn

∫Ω

ϕ((Bn −An) grad(un).grad(vn)

)dx =

∫Ω

ϕ((Beff −Aeff ) grad(u∞).grad(v∞)

)dx. (5.22)

Choosing moreover ϕ ≥ 0, and defining

X =∣∣∣∫

Ω

ϕ((Beff −Aeff ) grad(u∞).grad(v∞)

)dx∣∣∣ (5.23)

one deducesX ≤ ε lim sup

n

∫Ω

ϕ|grad(un)| |grad(vn)| dx. (5.24)

Then |grad(un)| |grad(vn)| ≤ aα|grad(un)|2 + b α′|grad(vn)|2 when 4a bαα′ ≥ 1, and as An ∈ M(α, β; Ω)and Bn ∈M(α′, β′; Ω) one has

X ≤ ε lim supn

∫Ω

ϕ[a(An grad(un).grad(un)

)+ b(Bn grad(vn).grad(vn)

)]dx, (5.25)

which gives

X ≤ ε∫

Ω

ϕ[a(Aeff grad(u∞).grad(u∞)

)+ b(Beff grad(v∞).grad(v∞)

)]dx, (5.26)

and thereforeX ≤ ε a β

∫Ω

ϕ|grad(u∞)|2 dx+ ε b β′∫

Ω

ϕ|grad(v∞)|2 dx. (5.27)

As this inequality is true for every ϕ ≥ 0, ϕ ∈ Cc(Ω), one deduces∣∣((Beff −Aeff ) grad(u∞).grad(v∞))∣∣ ≤ ε(a β|grad(u∞)|2 + b β′|grad(v∞)|2

)(5.28)

a.e. in Ω, and optimizing on rationals a and b satisfying 4a bαα′ ≥ 1, one obtains

∣∣((Beff −Aeff ) grad(u∞).grad(v∞))∣∣ ≤ ε√β β′

αα′|grad(u∞)| |grad(v∞)|, in Ω, (5.29)

and as u∞ and v∞ are arbitrary, one obtains (5.20).

Proposition 17: Let P be an open set of Rp. Let An be a sequence defined on Ω×P , such that An(·, p) ∈M(α, β; Ω) for each p ∈ P , and such that the mappings p→ An(·, p) are of class Ck (or real analytic) fromP into L∞

(Ω;L(RN ; RN )

), with bounds of derivatives up to order k independent of n. Then there exists a

subsequence Am such that for every p ∈ P the sequence Am(·, p) H-converges to Aeff (·, p) and p→ Aeff (·, p)is of class Ck (or real analytic) from P into L∞

(Ω;L(RN ; RN )

).

27

Proof: One considers a countable dense set Π of P and, using a diagonal subsequence, one extracts asubsequence Am such that for every p ∈ Π the sequence Am(·, p) H-converges to a limit Aeff (·, p). Usingthe fact that A is uniformly continuous on compact subsets of P and Proposition 16, one deduces then thatp→ Aeff (·, p) is continuous from P into L∞

(Ω;L(RN ; RN )

)and that for every p ∈ P the sequence Am(·, p)

H-converges to Aeff (·, p).Defining the operators Am(p) from V = H1

0 (Ω) into V ′ = H−1(Ω) by Am(p)v = −div(Am(·, p) grad(v)

),

one finds that the mappings p→ Am(p) are of class Ck (or real analytic) from P to L(V ;V ′) and similarlyp→

(Am(p)

)−1 are of class Ck (or real analytic) from P to L(V ′;V ), and finally the operators Rm definedby Rmv = Am(·, p) grad(vm) with vm defined by Am(vm) = Aeff (v) are of class Ck (or real analytic) fromP into L

(V ;L2(Ω; RN )

); all the bounds of derivatives up to order k being independent of m so that the limit

inherits of the same bounds and as Reffv = Aeff (·, p) grad(v), one deduces therefore that p→ Aeff (·, p) isof class Ck (or real analytic) from P into L∞

(Ω;L(RN ,RN )

).

As I mentioned, some of the preceding results will not be used in this course but they serve as a moregeneral training on questions of Homogenization, but let us now come back to the method that FrancoisMURAT and I were following in the early 70s in order to obtain some information on effective coefficients: wehad computed the convex hull of K defined in (5.1), we had used the relation (5.2) (where a first version ofthe Div-Curl lemma had been used), and we concluded that Aeff must satisfy (5.3); then we had noticed aquicker way to prove the result by the convexity argument of Lemma 11, but I will describe in (5.34)/(5.40)how to carry out the computations of the convex hull in a way that will be useful for further generalizations.We were considering the special case where an = χnα + (1 − χn)β, with 0 < α < β < ∞, and χn being asequence of characteristic functions converging in L∞(Ω) weak ? to θ, so that θ(x) represent the proportionof the material of conductivity α near the point x; in that case we denote

λ+(θ) = θ α+ (1− θ)β;1

λ−(θ)=θ

α+

1− θβ

, (5.30)

and the analogue of Proposition 12 told us that the eigenvalues of Aeff were in the interval [λ−(θ), λ+(θ)].In dimension N = 2, the eigenvalues λ1, λ2 of Aeff must satisfy

λ−(θ) =αβ

θ β + (1− θ)α≤ λ1, λ2 ≤ θ α+ (1− θ)β = λ+(θ), (5.31)

defining a square in the (λ1, λ2) plane, and varying θ between 0 and 1, the union of these squares gaveus the constraint that any mixture of two isotropic materials with conductivity α and β must satisfy theinequalities

α ≤ αβ

α+ β −maxλ1, λ2≤ minλ1, λ2 ≤ maxλ1, λ2 ≤ α+ β − αβ

minλ1, λ2≤ β, (5.32)

and we showed that the characterization (5.32) is optimal (while the characterization (5.31) is not optimal).For this we used the formula for layers, which in dimension N creates a tensor Aeff with one eigenvalueequal to λ−(θ) with the eigenvector orthogonal to the layers, and (N − 1) eigenvalues equal to λ+(θ) withthe eigenvectors parallel to the layers, so that if N = 2 all the points on the boundary of the set (5.32) areobtained by layered materials. Then we took one anisotropic material with eigenvalues (γ, δ) and changingthe orientation of the material gave two diagonal tensors A1 = (γ, δ) and A2 = (δ, γ), and using layersorthogonal to the x1 axis, using proportion η of material A1 and proportion (1 − η) of material A2 gavematerials with diagonal tensors

(γ δ

η δ+(1−η)γ , η δ + (1 − η)γ), having therefore determinant equal to γ δ, so

that by taking γ = λ−(θ) and δ = λ+(θ) showed that the piece of equilateral hyperbola joining the two(non isotropic) diagonal corners of the square defined by (5.31) were attainable by mixtures and the unionof these pieces of hyperbolas covered the set (5.32) (independently, Alain BAMBERGER had noticed that indimension N = 2 if one mixes materials with det(A) = c then one has det(Aeff ) = c.

In doing so, we had followed the intuition that if one mixes a few materials which themselves have beenobtained as mixtures of some initial materials, then the result can be obtained by mixing directly the initialmaterials in an adapted way. Mathematically, it is here that the metrizability property mentioned before is

28

important: one tries to define the closure for a metrizable topology of a set containing the tensors of the form(χα+(1−χ)β)I with χ being the characteristic function of an arbitrary measurable set (or of a smoother setlike an open set, for example), and one identifies then some first generation sets contained in the sequentialclosure of the initial set, then one identifies some second generation sets contained in the sequential closureof some first generation sets, and one repeats the process finitely many times, and because the topology ismetrizable every set constructed is included in the sequential closure of the initial set. However, the localcharacter of H-convergence proved in Proposition 10 has also been used: if ωj , j ∈ J , is a countable collectionof disjoint open43 sets such that Ω is the union of all the ωj plus a set of measure 0, and if for each j one hasa sequence Ajn ∈ M(α, β;ωj) which H-converges to Aeffj , then one can glue these pieces together, definingAn =

∑j χωjAjn, which belongs to M(α, β; Ω) and by Theorem 6 a subsequence H-converges to some Aeff ,

and Proposition 10 asserts that the restriction of Aeff to ωj is Aeffj , and therefore all the sequence H-converges to Aeff =

∑j χωj

Aeffj . The preceding arguments enabled us to create in the desired H-closure(or G-closure, as we were working with symmetric tensors) any measurable tensor taking a constant valuebelonging to the set (5.32) for each of the disjoint open sets ωj , and the conclusion came from the remarkthat if a sequence An ∈M(α, β; Ω) converges almost everywhere to A∞ then it H-converges to A∞.

This is the result which I used in 1974 in order to compute necessary conditions of optimality for classicalsolutions [Ta2]. Although Francois MURAT and I had followed the same type of construction that AntonioMARINO and Sergio SPAGNOLO had used in [Ma&Sp], the reason why we had been able to go further wasthat we had obtained the necessary condition of Proposition 12, which had given us in dimension N = 2the conditions (5.31) and (5.32). We were not able at the time to obtain the characterization in dimensionN ≥ 3, or even in the case N = 2 we could not find the optimal characterization improving (5.31), whenone imposes to use given proportions. The first step towards the solution of these more general questionswas my introduction at the end of 1977 of a new method for obtaining bounds for effective coefficients [Ta6];this method makes use of the notion of correctors in Homogenization and it requires the choice of adaptedfunctionals for which one checks the hypotheses by applying the Compensated Compactness theory. Beforedescribing these new ingredients, I want to show what a more precise analysis of (5.2) with the definition(5.1) gives.44

In the case An = an I which we had investigated first, we knew that the L∞(Ω) weak ? limits of an

and 1an , denoted respectively by a+ and 1

a−, were needed in the case where an only depends upon one linear

combination of coordinates. We had constructed sequences En = grad(un) and Dn = anEn = an grad(un)converging in L2(Ω; RN ) weak, respectively to E∞ and D∞, and we had shown by an integration by parts(instead of the Div-Curl lemma which we discovered later) that an|En|2 converges in the sense of measuresto (D∞.E∞). Therefore we decided to look at the convex hull in R2N+3 of the set K defined by (5.1), andone may wonder if it changes much to add other functions of an to the list and look at the convex hull ofthe subset K of R2N+k+1 defined by

K =(E, aE, a|E|2, f1(a), . . . , fk(a)

)| E ∈ RN , a ∈ [α, β]

, (5.33)

and f1, . . . , fk are k given continuous functions on [α, β]. The computations will show that the L∞(Ω) weak ?limits of an and 1

an appear naturally. One characterization of the closed convex hull of K requires consideringquantities of the form (E.u1) + (aE.u2) +C0 a |E|2 +

∑ki=1 Cifi(a), where u1, u2 ∈ RN and C0, . . . , Ck ∈ R,

in order to compute their infimum for E ∈ RN and a ∈ [α, β]. One has then to consider the infimum of(E.u1) + (aE.u2) + C0 a |E|2 for E ∈ RN , and this infimum is −∞ if C0 < 0, or if C0 = 0 and either u1 or

43 In his work on G-convergence, Sergio SPAGNOLO uses Meyers’s regularity theorem [Me], and he can usedisjoint measurable sets.

44 Although we could have done the following computations in the early 70s, I only noticed Lemma 18and Lemma 19 while I was working on a set of lecture notes for my CBMS-NSF course in Santa Cruz inthe Summer 1993 (I have abandoned this project since), and for preparing my lecture for a meeting in Nicein 1995, where I applied our method for the case of mixing arbitrary anisotropic materials [Ta14]; I willdescribe later this extension (Lemma 42), which is based on Lemma 18.

29

u2 is not 0; therefore the typical formula is

minE∈RN

(a|E|2 − 2(E.v)− 2(aE.w)

)= −|v + aw|2

a= −|v|

2

a− 2(v.w)− a |w|2 for v, w ∈ RN . (5.34)

Taking the limit of

an|En|2 − 2(En.v)− 2(anEn.w) ≥ −|v|2

an− 2(v.w)− an |w|2, (5.35)

one obtains

(D∞.E∞)− 2(E∞.v)− 2(D∞.w) ≥ −|v|2

a−− 2(v.w)− a+|w|2, (5.36)

and that inequality is true for every v, w ∈ RN . The best choice for v and w in (5.36) is obtained by solvingthe system

v

a−+ w = E∞

v + a+w = D∞,(5.37)

and there is a difficulty if a− = a+, as one needs to have D∞ = a+E∞; this is not surprising as one always

has a− ≤ a+, with equality if and only if an converges in L1loc(Ω) strong to a+ (i.e. in Lploc(Ω) strong for

every p <∞, because an is bounded in L∞(Ω)). If a− < a+, the solution of (5.37) is given by

v =a−(a+E

∞ −D∞)a+ − a−

w =D∞ − a−E∞

a+ − a−,

(5.38)

and (5.34) leads to

(a+ − a−)(D∞.E∞)− a−(E∞.(a+E

∞ −D∞))−(D∞.(D∞ − a−E∞)

)≥ 0, (5.39)

i.e.(D∞ − a−E∞.D∞ − a+E

∞) ≤ 0, (5.40)

which means that D∞ belongs to the closed ball with diameter [a−E∞, a+E∞], and the formula is stillvalid if a− = a+. Having shown already that D∞ = Aeff E∞ for a symmetric matrix Aeff , the validityof (5.40) for every E∞ ∈ RN is equivalent to Aeff having all its eigenvalues between a− and a+ (definingM = Aeff − α+β

2 I, the condition becomes (M z.z) ≤ β−α2 |z|

2 for all z ∈ RN , equivalent to M having all itseigenvalues with modulus ≤ β−α

2 ).

A slightly different point of view for dealing with (5.40) is that if 0 < b ≤ a < ∞ and two vectorsE,D ∈ RN satisfy (D − aE.D − bE) ≤ 0, then there exists a symmetric B having its eigenvalues betweenb and a such that D = BE. Indeed, assuming b < a, let γ = a+b

2 , δ = a−b2 and F = 1

δ (D − γ E), one has|F | ≤ |E| and one wants to find a symmetric C of norm ≤ 1, equal to 1

δ (B − γ I), such that C E = F , andthere is actually such a C satisfying ||C|| ≤ |F ||E| if E 6= 0; in the case of two unit vectors e1, e2 which are notparallel, the basic construction for finding a symmetric contraction mapping e1 onto e2 is to consider thesymmetry which has e1 ± e2 as eigenvectors with eigenvalues ±1 and if N > 2 the subspace orthogonal toe1 and e2 as eigenspace with an eigenvalue between −1 and +1. If N ≥ 2, the preceding construction showsthat if E 6= 0 and (D − aE.D − bE) = 0 then D = BE for a symmetric B having one eigenvalue b and theN − 1 other eigenvalues equal to a, and such a case appears when one uses layerings.

This helps proving the following result.

Lemma 18: If 0 < α′ ≤ bn ≤ an ≤ β′ <∞ a.e. in Ω, En, Dn ∈ L2(Ω; RN ) for all n, and

(Dn − anEn.Dn − bnEn) ≤ 0 a.e. in Ω, (5.41)

30

withan a∞ in L∞(Ω) weak ?,1bn

1b∞

in L∞(Ω) weak ?,

En E∞ in L2(Ω; RN ) weak,

Dn D∞ in L2(Ω; RN ) weak,(En.Dn) (E∞.D∞) in the sense of measures,

(5.42)

then one has(D∞ − a∞E∞.D∞ − b∞E∞) ≤ 0 a.e. in Ω. (5.43)

Proof: By the preceding analysis, one has Dn = BnEn a.e. in Ω, with Bn symmetric and having itseigenvalues in [bn, an]. Therefore for v, w ∈ RN one has

(Dn.En)− 2(En.v)− 2(Dn.w) = (BnEn.En)− 2(En.v +Bn w) ≥ −((Bn)−1(v +Bn w).(v +Bn w)

)= −

((Bn)−1v.v

)− 2(v.w)− (Bn w.w) ≥ − 1

bn|v|2 − 2(v.w)− an|w|2,

(5.44)a.e. in Ω; after using test functions ϕ ∈ Cc(Ω) with ϕ ≥ 0 in Ω, one obtains

(D∞.E∞)− 2(E∞.v)− 2(D∞.w) ≥ − 1b∞|v|2 − 2(v.w)− a∞|w|2 a.e. in Ω, (5.45)

and as this is the same inequality than (5.36) with a− replaced by b∞ and a+ replaced by a∞, one deducesthe analogue of (5.40), which is (5.43).

Lemma 18 can also be derived as a consequence of the following result.

Lemma 19: Define the real function Φ on RN × RN × (0,∞)× (0,∞) by

Φ(E,D, a, b) =

1a−b (D − aE.D − bE) if b < a,0 if b = a and D = aE,+∞ otherwise,

(5.46)

then Φ is a convex function in(E,D, (E.D), a, 1

b

), and more precisely

Φ(E,D, a, b) = supv,w∈RN

(−(D.E) + 2(E.v) + 2(D.w)− 1

b|v|2 − 2(v.w)− a|w|2

). (5.47)

Proof: Indeed in the case 0 < b < a the quadratic form − 1b |v|

2 − 2(v.w) − a|w|2 is negative definite, andthe supremum is attained when (v, w) solves the analogue of (5.37), v

b + w = E and v + aw = D, whichgives the analogue of (5.38), v = b(aE−D)

a−b and w = D−bEa−b , and the value of the supremum is the quantity

defined in (5.46). If b > a > 0 the quadratic form is not definite and the supremum is +∞. If b = a > 0, thequantity to maximize is −(D.E) + 2(D − aE.w) + 2(E.v + aw)− 1

a |v + aw|2, and the supremum is +∞ ifD − aE 6= 0, and if D = aE it is −(D.E) + a|E|2, which is 0 in that case.

6. Correctors in HomogenizationIn 1975, I heard Ivo BABUSKA mention the importance of amplification factors: in real elastic materials

there is a threshold above which nonelastic effects (plastic deformation or fracture) usually appear. In amixture it is not the average stress which is relevant but the local stress, and therefore one must know anamplification factor for computing the local stresses from the average stress; I did keep this comment inmind when I defined my correctors.45 I first consider the case of layered media, for which one can prove astronger result than for the general case of H-convergence.46

45 I had then heard Jacques-Louis LIONS describe his computations for the periodically modulated case,which he had studied with Alain BENSOUSSAN and George PAPANICOLAOU [Be&Li&Pa], and I found thathis notation with χij created an unnecessary chaos with indices, which I decided to avoid.

46 I did the general framework in 1975 or 1976, but I only noticed later the stronger result for the case oflayered media, because of a lecture at a meeting in Luminy in the Summer 1993 [Ta13], where I consideredfunctionals depending upon the gradient, and for the reasons mentioned in footnote 44.

31

Proposition 20: Let a sequence An ∈ M(α, β; Ω) be such that it only depends upon x1 and that it H-converges to Aeff . If a sequence un converges in H1

loc(Ω) weak to u∞ and div(An grad(un)

)stays in a

compact of H−1loc (Ω) strong, then(

An grad(un))

1→(Aeff grad(u∞)

)1

in L2loc(Ω) strong,

∂un∂xi→ ∂u∞

∂xiin L2

loc(Ω) strong, for i = 2, . . . , N.(6.1)

If one defines the sequence Pn ∈ L∞(Ω;L(RN ; RN )

)by

(Pn)11 =(Aeff )11

(An)11,

(Pn)1j =(Aeff )1j

(Aeff )11− (An)1j

(An)11for j = 2, . . . , N,

(Pn)ij = δij for i = 2, . . . , N, and j = 1, . . . , N,

(6.2)

then one hasgrad(un)− Pn grad(u∞)→ 0 in L2

loc(Ω; RN ) strong. (6.3)

Proof: If one denotes En = grad(un) and Dn = An grad(un) and if one uses the vector Gn and the tensorBn = Φ(An) introduced after (4.11), the statement (6.1) means that Gn converges in L2

loc(Ω; RN ) strong toG∞. In order to prove this statement, one first notices that(

Bn(Gn −G∞).Gn −G∞)

converges to 0 in the sense of measures. (6.4)

Indeed, (BnGn.Gn) = (Dn.En) which converges in the sense of measures to (D∞.E∞) = (B∞G∞.G∞)by the Div-Curl lemma, and as Gn does not oscillate in x1 and Bn only depends upon x1 and convergesin L∞

(Ω;L(RN ; RN )

)weak ? to B∞ = Φ(Aeff ), both (BnGn.G∞) and (BnG∞.Gn) converge in L1

loc(Ω)weak to (B∞G∞.G∞). Then one notices that there exists γ > 0 such that

(Bn λ.λ) ≥ γ|λ|2 for all λ ∈ RN , (6.5)

as one may take γ = αβ2+1 for example, as (BnGn.Gn) = (AnEn.En) ≥ α|En|2 and |Gn|2 ≤ |Dn|2 + |En|2 ≤

(β2 + 1)|En|2. Then (6.1) follows from writing

∂un∂x1

=1

(An)11

[(An grad(un)

)1−

N∑j=2

(An)1j∂un∂xj

], (6.6)

and using the fact that 1(An)11

is uniformly bounded by 1α .

I have mentioned after (4.11) that the formula for computing the effective properties of a layered materialis valid under a much weaker hypothesis than the one used for the general theory of H-convergence, namely An

bounded and An11 ≥ α > 0 a.e. in Ω for layers in the direction x1.47 The main reason for using Lax–Milgramlemma in the general framework is that it enables to construct sequences En = grad(un) converging inL2(ω,RN ) weak to a constant vector with Dn = (An)TEn such that div(Dn) stays in a compact of H−1

loc (Ω);such an abstract construction is not needed in the case of layers because one can immediately write downexplicitly a similar sequence; indeed taking G to be a constant vector and defining On = Bn(x1)G, givesa vector En which is a gradient as (En)1 only depends upon x1 and Ei is constant for i = 2, . . . , N , andthe vector Dn = An(x1)En has divergence 0 as (Dn)1 is constant and (Dn)i only depends upon x1 for

47 It applies to the hyperbolic equation ∂∂t

(ρn(x)∂un

∂t

)− ∂

∂x

(an(x)∂un

∂x

)= f if an ≥ α and ρn ≥ α a.e. for

all n: if ρn converges to ρ∞ and 1an converges in L∞(Ω) weak ? to 1

aeff , then the effective equation has thesame form, with ρn and an replaced by ρ∞ and aeff .

32

i = 2, . . . , N ; the sequence En indeed converges in L2(Ω; RN ) weak to a limit E∞, and E∞ is not constantas (E∞)1 may indeed be a nonconstant function of x1, but it is not so important that the limit be constantas what is necessary for the argument to work is that one can construct N such sequences with limits whichare linearly independent, and this is true here.

However, the strong convergence result of Proposition 20 cannot be true without assuming an ellipticitycondition: let A be a constant tensor with A11 > 0 and (Aξ.ξ) = 0 for some ξ 6= 0 (and ξ 6= e1, as A11 > 0),then the sequence un defined by un(x) = 1

n sin(n(ξ.x)

)satisfies div

(Agrad(un)

)= 0 and Gn converges in

L2(Ω; RN ) weak to 0 but it does not converge in L2loc(Ω; RN ) strong. It is therefore natural to assume that

(Aξ.ξ) does not vanish and the hypothesis An ∈M(α, β; Ω) appears then as a natural restriction when onewants the argument to apply for every direction of layers.

In the general framework of H-convergence, one cannot prove a result as strong as Proposition 20, andthe basic result is the following.

Theorem 21: Let a sequence An ∈ M(α, β; Ω) H-converge to Aeff . Then there is a subsequence Am andan associated sequence Pm of correctors such that

Pm I in L2(Ω;L(RN ; RN )

)weak,

Am Pm Aeff in L2(Ω;L(RN ; RN )

)weak,

curl(Pm λ) = 0 in Ω for all λ ∈ RN ,div(Am Pm λ) stays in a compact of H−1

loc (Ω) strong, for all λ ∈ RN .

(6.7)

For any sequence um converging in H1loc(Ω) weak to u∞ with div

(Am grad(um)

)staying in a compact of

H−1loc (Ω) strong, one has

grad(um)− Pm grad(u∞)→ 0 in L1loc(Ω; RN ) strong. (6.8)

Proof: For an open set Ω′ of RN containing Ω, one extends An by α I in Ω′\Ω and one extracts a subsequenceAm which H-converges to a limit on Ω′; one also denotes this limit Aeff , and by Proposition 10 it mustbe an extension to Ω′ of the H-limit already defined on Ω. Then for i = 1, . . . , N , one chooses a functionϕi ∈ H1

0 (Ω′) such that grad(ϕi) = ei on Ω, and one defines Pm ei = grad(vm) in Ω, where vm ∈ H10 (Ω′) is

the solution of div(Am grad(vm)− Aeff grad(ϕi)

)= 0 in Ω′. By this construction vm converges in H1

0 (Ω′)weak to v∞, solution of div

(Aeff grad(v∞)−Aeff grad(ϕi)

)= 0 in Ω′, i.e. v∞ = ϕi, and therefore grad(vm)

and Am grad(vm) converge in L2(Ω′; RN ) weak, respectively to grad(ϕi) and Aeff grad(ϕi), i.e. Pm ei andAm Pm ei converge in L2(Ω; RN ) weak, respectively to ei and Aeff ei. By repeating this construction fori = 1, . . . , N , one obtains a sequence Pm satisfying (6.7). Actually the construction gives Pm satisfying amore precise condition than (6.7), as one has div(Am Pm λ − Aeff λ) = 0 in Ω for all λ ∈ RN , but it isuseful to impose only (6.7) as there may be slightly different definitions for Pn that may not satisfy thissupplementary requirement.48 The sequence Pm grad(u∞) is bounded in L1(Ω; RN ) because the sequenceof correctors Pm is bounded in L2

(Ω;L(RN ; RN )

). In order to prove (6.8), one chooses g ∈ C(Ω; RN ),

ϕ ∈ Cc(Ω), and one computes the limit of

Xm =∫

Ω

ϕ(Am(grad(um)− Pm g).grad(um)− Pm g

)dx. (6.9)

Writing g =∑k gkek, one expands the integrand in Xm and by (6.7) the Div-Curl lemma applies to

each term:(Am grad(um).grad(um)

)converges in the sense of measures to

(Aeff grad(u∞).grad(u∞)

),

and similarly (Am grad(um).Pm el) converges to (Aeff grad(u∞).el),(Am Pm ek.grad(um)

)converges to(

Aeff ek.grad(u∞))

and (Am Pm ek.Pm el) converges to (Aeff ek.el); as each gk is continuous and ϕ hascompact support, one deduces that

Xm → X∞ =∫

Ω

ϕ(Aeff (grad(u∞)− g).grad(u∞)− g

)dx. (6.10)

48 In the case of layers for example, it is more natural to look for Pm depending only upon x1, and thecorrectors defined in (6.2) satisfy div(Am Pm λ) = 0 in Ω, even if div(Aeff λ) does depend upon x1. In theperiodic case, it is more natural to ask for Pm to be periodic.

33

If u∞ ∈ C1(Ω) one can take g = grad(u∞), and one deduces that Xm → 0; by taking 0 ≤ ϕ ≤ 1 and ϕ = 1 ona compact K of Ω, one deduces that grad(um)− Pm grad(u∞)→ 0 in L2(K; RN ) strong for every compactof Ω. If u∞ ∈ H1(Ω), one cannot use g = grad(u∞) in general, and therefore one approaches grad(u∞) byg ∈ C(Ω; RN ) in order to have

||grad(u∞)− g||L2(Ω;RN ) ≤ ε, (6.11)

implying

X∞ ≤ β∫

Ω

|grad(u∞)− g|2 dx ≤ β ε2. (6.12)

By (6.10) and (6.12) one has

lim supm

∫K

α|grad(um)− Pm g|2 dx ≤ β ε2, (6.13)

from which one deduces

lim supm

∫K

|grad(um)− Pm g| dx ≤ ε√β meas(K)

α. (6.14)

Using (6.11) one deduces that

lim supm

∫K

|grad(um)− Pm grad(u∞)| dx ≤ ε√β meas(K)

α+ C ε, (6.15)

where C is an upper bound for the norm of Pm in L2(Ω;L(RN ; RN )

); therefore grad(um)−Pm grad(u∞)→ 0

in L1(K; RN ) strong and as K is an arbitrary compact of Ω, one obtains the desired result (6.8).

Using better integrability property for grad(u∞), and Meyers’s regularity theorem [Me], one can provethat grad(um) − Pm grad(u∞) converges in Lploc(Ω; RN ) strong to 0 for some p > 1. If Pm is bounded inLqloc

(Ω;L(RN ; RN )

)for some q > 2, as can be shown using Meyers’s regularity theorem [Me], or directly

as in the case of layers where one can take q = ∞, and grad(u∞) ∈ Lr(Ω; RN ) for some r ≥ 2, then onecan take g ∈ Ls(Ω; RN ) in (6.9) and (6.10) with s = 2q

q−2 , and if g is near grad(u∞) in Lr(Ω; RN ) or equalto grad(u∞) if r ≥ s, then Pm(grad(u∞) − g) is small in Ltloc(Ω; RN ) with t = qr

q+r ; one can then choosep = mins, t.

Although the following results will not be used in this course, it is important to realize that correctorsare important even for finding the effective equation for similar equations obtained by adding lower orderterms; for simplicity, I will ignore the advantage of using Meyers’s regularity theorem [Me].

Proposition 22: Let a sequence An ∈ M(α, β; Ω) H-converge to Aeff , let cn be a sequence bounded inLp(Ω; RN ) with p > N if N ≥ 2, p = 2 if N = 1, and assume that a sequence un converges in H1

loc(Ω) weakto u∞ and satisfies

−div(An grad(un)

)+(cn.grad(un)

)→ f in H−1

loc (Ω) strong. (6.16)

Then u∞ satisfies the equation

−div(Aeff grad(u∞)

)+(ceff .grad(u∞)

)= f in Ω, (6.17)

with an effective coefficient ceff such that for a subsequence

(Pm)T cm ceff in L2p/(p+2) weak if N ≥ 2, in the sense of measures if N = 1, (6.18)

for a sequence of correctors Pm.Proof: One extracts a subsequence such that (Pm)T cm converges in L2p/(p+2)(Ω) weak to ceff if N ≥ 2 andin the sense of measures if N = 1.49 Using Sobolev’s imbedding theorem,

(cm.grad(um)

)stays in a compact

49 There could be different subsequences of (Pm)T cm converging to different limits, but Proposition 22shows that all these limits give the same value for

(ceff .grad(u∞)

).

34

of H−1loc (Ω) and therefore Theorem 21 implies that grad(um)−Pm grad(u∞) converges in L1

loc(Ω; RN ) strongto 0, and one wants to prove that(cm.grad(um)

)(ceff .grad(u∞)

)in L2p/(p+2) weak if N ≥ 2, in the sense of measures if N = 1. (6.19)

Indeed (cm.Pm g) converges to (ceff .g) if g ∈ C(Ω; RN ), and if g satisfies (6.11), then (6.13) implies thatboth (cm.grad(um)− Pm g) and (ceff .grad(u∞)− g) have a small norm in L1, so that one deduces (6.19).

One could add a term dnun in the equation, with dn being a bounded sequence in Lq(Ω) with q > N2 for

N ≥ 2, q = 1 for N = 1. The term dnun stays in a compact of H−1loc (Ω) strong and a subsequence converges

to d∞u∞ if for that subsequence dm converges in Lq(Ω) weak to d∞ for N ≥ 2 or in the sense of measuresif N = 1. Without loss of generality, this term can be put into the right hand side converging in H−1

loc (Ω)strong to a known limit.

I conclude by some computations of Francois MURAT giving other properties of the correctors andshowing how to treat cases with terms converging only in H−1

loc (Ω) weak.

Proposition 23: Let a sequence An ∈ M(α, β; Ω) H-converge to Aeff , let bn be a sequence bounded inL2(Ω; RN ) and let unbe a sequence converging in H1

loc(Ω) weak to u∞ and satisfying

−div(An grad(un) + bn

)→ f in H−1

loc (Ω) strong. (6.20)

Then u∞ satisfies−div

(Aeff grad(u∞) + beff

)= f in Ω, (6.21)

with an effective term beff ∈ L2(Ω; RN ) such that

(Πm)T bm beff in the sense of measures, (6.22)

for a subsequence for which Πm denotes the corresponding correctors associated to (Am)T .50

Proof: One extracts a subsequence such that (6.22) holds and

Am grad(um) + bm converges in L2(Ω; RN ) weak to ξ, (6.23)

and one shows thatξ = Aeff grad(u∞) + beff in Ω. (6.24)

For v∞ ∈ C1c (Ω), let vn ∈ H1

0 (Ω) be the sequence of solutions of

div((Am)T grad(vm)− (Aeff )T grad(v∞)

)= 0 in Ω, (6.25)

so thatvm v∞ in H1

0 (Ω) weak,

(Am)T grad(vm) (Aeff )T grad(v∞) in L2(Ω; RN ) weak,

grad(vm)−Πm grad(v∞)→ 0 in L2loc(Ω; RN ) strong,

(6.26)

and one computes the limit of(Am grad(um) + bm.grad(vm)

), which is

(ξ.grad(v∞)

), by using the Div-Curl

lemma. As(grad(um).(Am)T grad(vm)

)has limit

(grad(u∞).(Aeff )T grad(v∞)

)and as

(bm.grad(vm)

)has

the same limit as(bm.Πm grad(v∞)

), i.e.

(beff .grad(v∞)

), one has proved that(

ξ −Aeff grad(u∞)− beff .grad(v∞))

= 0 in Ω, (6.27)

and by density of C1c (Ω) in H1

0 (Ω), one obtains (6.24).

50 From the explicit computation (6.2) of Pn in the case of layers, one sees that Πm is not in general equalto (Pm)T ; of course, if (An)T = An for all n one can choose Πm = Pm.

35

Francois MURAT also noticed that although Pm may only be bounded in Lp(Ω;L(RN ; RN )

)for some

p < ∞ (with p ∈ (2,∞) if one uses Meyers’s regularity theorem [Me], or p = 2 if one does not use it), it isnevertheless true that for q ∈ [2,∞] and any sequence βn bounded in Lq(Ω) and any i, j = 1, . . . , N , all thelimits of subsequences of (Pm)ij βm in the sense of measures actually belong to Lq(Ω). For q = 2 this is what(6.22) asserts for Πm instead of Pm by taking bm = βmej . For q > 2, assume that (Pm)ij βm converges inL2q/(q+2)(Ω) weak to ξ and β2

m converges in Lq/2(Ω) weak to β2∞ with β∞ ∈ Lq(Ω); then one uses the fact

that (Am Pm λ.Pm λ) converges in the sense of measures to (Aeff λ.λ) by the Div-Curl lemma, and thereforethe limit of any (Pm)2

ij in the sense of measures belongs to L∞(Ω). For ϕ ∈ Cc(Ω) with −1 ≤ ϕ ≤ 1 in Ω,one has ∫

Ω

ϕ ξ dx = limm

∫Ω

ϕ(Pm)ij βm dx ≤ limm

∫Ω

|ϕ|(ε(x)

2(Pm)2

ij +1

2ε(x)β2m

)dx (6.28)

for every function ε > 0; therefore∫

Ωϕ ξ dx ≤

∫Ω|ϕ|(ε2C

2 + 12εβ

2∞

)dx and then taking the infimum in ε

gives∫

Ωϕ ξ dx ≤

∫Ω|ϕ|Cβ∞ dx, i.e. |ξ| ≤ Cβ∞ a.e. ∈ Ω by varying ϕ.

Proposition 24: Under the hypotheses of Proposition 23, one has

grad(um)− Pm grad(u∞)− rm → 0 in L1loc(Ω) strong, (6.29)

for some rm (constructed explicitly) which satisfies

rm 0 in L2(Ω; RN ) weak,

Am rm + bm beff in L2(Ω; RN ) weak.(6.30)

Proof: Let ρn ∈ H10 (Ω) be the solution of

div(An grad(ρn) + bn

)= 0 in Ω, (6.31)

so that ρn is bounded in H10 (Ω) and by Proposition 23 a subsequence ρm converges in H1

0 (Ω) weak to ρ∞,solution of

div(Aeff grad(ρ∞) + beff

)= 0 in Ω. (6.32)

Then one notices that div(Am grad(um) − Am grad(ρm)

)→ f in H−1

loc (Ω) strong so that grad(um) −grad(ρm)− Pm

(grad(u∞)− grad(ρ∞)

)→ 0 in L1

loc(Ω; RN ) strong, and therefore one has (6.29) and (6.30)by taking

rm = grad(ρm)− Pm grad(ρ∞). (6.33)

As a corollary, if a sequence An ∈M(α, β; Ω) H-converges to Aeff , if bn is bounded in L2(Ω; RN ), if cn

is bounded in Lp(Ω; RN ) with p > N if N ≥ 2, p = 2 if N = 1, and if un is a sequence converging in H1loc(Ω)

weak to u∞ and satisfies

−div(An grad(un) + bn

)+(cn.grad(un)

)→ f in H−1

loc (Ω) strong, (6.34)

then, using the definitions of ceff and beff given by (6.18) and (6.22), u∞ satisfies

−div(Aeff grad(u∞) + beff

)+(ceff .grad(u∞)

)+ e∞ = f in Ω, (6.35)

where(cn.rn) e∞ in L2p/(p+2)(Ω) weak. (6.36)

As many seem to believe that Homogenization means periodicity, it is important to notice that in theframework of G-convergence that Sergio SPAGNOLO had developped in the late 60s or in the frameworkof H-convergence that Francois MURAT and I had developed in the early 70s, there were no conditions ofperiodicity. As Francois MURAT and I were looking at questions of Optimal Design, there was no reason

36

for thinking that periodicity had anything to do with our problem, and when we discovered that HenriSANCHEZ-PALENCIA had been working on asymptotic methods for periodic structures [S-P1], [S-P2], ithelped us understand that what we had been doing was related to effective properties of mixtures, butit was not more useful for our purpose. In the Fall of 1974, after I had described my work in Madison,Carl DE BOOR had mentioned some work by Ivo BABUSKA; this work was restricted to some engineeringapplications where periodicity is natural, and when I first met Ivo BABUSKA in the Spring 1975 [Ba], Idid learn from him about some practical questions, quite unrelated to those that we were interested in ourwork.51 In the Fall of 1975, at a IUTAM meeting in Marseille, I learned that Jacques-Louis LIONS had beenconvinced by Ivo BABUSKA of the importance of Homogenization for periodic structures and had workedwith Alain BENSOUSSAN and George PAPANICOLAOU, and I showed him my method of oscillating testfunctions associated with the Div-Curl lemma, and the first mention of it appears then in the article whichhe wrote for the proceedings [Li3]. It was only on the occasion of my lectures on our method at Breau-sans-Nappe in the Summer 1983 [Mu&Ta1] that George PAPANICOLAOU told me that he finally understood whyI had insisted so much about working without periodicity assumptions. Although I taught about generalquestions of Homogenization in my Peccot lectures in the Spring 1977, many who attended these lecturesbut specialized in questions with periodic structures seem to have forgotten to either quote that they wereusing my method or that my method was not restricted to periodic situations: it might be for that reasonthat Olga OLEINIK rediscovered my method by considering first quasi-periodic situations and then generalsituations.

Some people, who seem to try to avoid mentioning either the name of Sergio SPAGNOLO for the intro-duction of G-convergence in the late 60s or the names of Francois MURAT and me for the introduction ofH-convergence in the early 70s, often state that it is enough to consider periodic media; they may be unawarethat such a statement is perfectly meaningless for someone who does not know that there exists a generaltheory; they may not realize either that for those who know about the general theory it clearly shows thatthey have been unable to understand the general framework. It seems that many who started by studyingthe special case of periodic structures have had some trouble learning about the general framework, whilefor all those who have started by learning the general framework, the case of periodic structures appears asthe following simple exercise.

In the periodic setting, one starts with a period cell Y , generated by N linearly independent vectorsy1, . . . , yN , of RN i.e.

Y =y | y ∈ RN , y =

N∑i=1

ξi yi, 0 ≤ ξi ≤ 1 for i = 1, . . . , N, (6.37)

and one says that a function g defined on RN is Y -periodic if

g(y + yi) = g(y) a.e. y ∈ RN , for i = 1, . . . , N. (6.38)

For A ∈M(α, β; RN ) and Y -periodic, one defines An by

An(x) = A( xεn

)a.e. x ∈ Ω, (6.39)

where εn tends to 0.

Proposition 25: The whole sequence An defined by (6.39) H-converges to a constant Aeff , independentof the particular sequence εn used, and Aeff can be computed in the following way. For λ ∈ RN , letwλ ∈ H1

loc(RN ) be the Y -periodic solution (defined up to addition of a constant) of

div(A(grad(wλ) + λ)

)= 0 in RN , (6.40)

51 I could imagine some real situations where our work could be useful, at least after we would have madesome progress on the question of characterization of effective coefficients. I think that it was on this occasionthat I learned from Ivo BABUSKA about the importance of amplification factors for stress, but I do not recallever hearing him mention that the defects of linearized Elasticity were quite worse for mixtures than forhomogeneous materials, and I only realized that many years after.

37

and let P ∈ H1loc

(RN ;L(RN ; RN )

)be the Y -periodic function defined by

Pλ = grad(wλ) + λ a.e. in RN . (6.41)

ThenAeff λ =

1meas(Y )

∫Y

A(grad(wλ) + λ) dy for every λ ∈ RN , (6.42)

and a sequence of corrector is defined by

Pn(x) = P( xεn

)a.e. x ∈ RN . (6.43)

Proof: The sequence un defined by

un(x) = (λ.x) + εnwλ

( xεn

), a.e. x ∈ RN , (6.44)

satisfies un ∈ H1loc(RN ) and div

(An grad(un)

)= 0 in RN . The sequence un converges in H1

loc(RN ) weak tou∞, defined by u∞(x) = (λ.x), grad(un) is the rescaled version of λ + grad(wλ) which is Y -periodic andtherefore it converges in L2

loc(RN ; RN ) weak to its average on the period cell Y , i.e. to grad(u∞) = λ, andAn grad(un) ∈ L2

loc(RN ; RN ) is the rescaled version of A(λ + grad(wλ)

)which is Y -periodic and therefore

it converges in L2loc(RN ; RN ) weak to its average on Y , i.e. to the value Aeff λ as defined by (6.42); using

N linearly independent λ ∈ RN characterizes the H-limit of An as Aeff .As grad(un) = Pn grad(u∞) a.e., and Pn satisfies the conditions (6.7), the sequence Pn gives acceptable

correctors.

Once correctors had become natural objects for studying Homogenization, it was very natural to usethem for obtaining bounds on effective coefficients.

7. Bounds on effective coefficients: second methodA first difference between this new method that I introduced at the end of 1977 in [Ta7] and the preceding

one that I had used with Francois MURAT in the early 70s based on (5.1) and (5.2) (after having used anearlier version of the Div-Curl lemma), is that instead of considering one sequence of solutions one considersN linearly independent sequences of solutions which are the columns of the sequence of correctors Pn. Asecond difference is that the Div-Curl lemma had to be replaced by the more general theory of CompensatedCompactness that I had just developed with Francois MURAT in the meantime [Mu4], [Ta4], [Ta5], [Ta6],[Ta8].52 If An ∈M(α, β; Ω) H-converges to Aeff , then any sequence of correctors Pm has the property that

Pm P∞ = I in L2(Ω;L(RN ; RN )

)weak,

curl(Pm λ) stays in a compact of H−1loc

(Ω;La(RN ; RN )

)strong for all λ ∈ RN ,

(7.1)

52 Jacques-Louis LIONS had asked Francois MURAT to generalize our Div-Curl lemma, and he had givenhim an article by SCHULENBERGER and WILCOX which he thought related. Francois MURAT first proveda bilinear theorem: a sequence Un converged weakly to U∞ and satisfied a list of differential constraints,another sequence V n converged weakly to V∞ and satisfied another list of differential constraints, and hecharacterized which bilinear forms B had the property that B(Un, V n) automatically converged in the senseof measures to B(U∞, V∞). I told him that the bilinear setting looked artificial and that a quadratic settingwas more natural: for a sequence Un converging weakly to U∞ and satisfying a list of differential constraints,he then characterized which quadratic forms Q are such that Q(Un) automatically converges in the sense ofmeasures to Q(U∞). While he was giving a talk about his results at the seminar that Jacques-Louis LIONS

was organizing at Institut Henri POINCARE, it suddenly occurred to me that the right question was to lookat quadratic forms Q such that if Q(Un) converges in the sense of measures to ν then one automatically hasν ≥ Q(U∞) and before the end of the talk I had checked that the same method that we had used for theDiv-Curl lemma gave me the right characterization, and I did not even need the hypothesis of constant rankthat Francois MURAT had to impose, because of a slightly different method of proof.

38

where La(RN ; RN ) is the space of antisymmetric matrices, and if one defines the sequence Qm by

Qm = Am Pm, (7.2)

then Qm has the property that

Qm Q∞ = Aeff in L2(Ω;L(RN ; RN )

)weak,

div(Qm λ) stays in a compact of H−1loc (Ω) strong for all λ ∈ RN .

(7.3)

Of course each column of Pm plays the role of a vector Em and each column of Qm plays the role of avector Dm for which the Div-Curl lemma applies, and this means that (Qm)TPm converges in the senseof measures to (Q∞)TP∞ = (Aeff )T , but the Compensated Compactness theorem creates a few otherinteresting inequalities. While I was visiting the Mathematics Research Center in Madison in the Fall 1977,I had found a crucial additive to the Compensated Compactness theorem, as I had discovered a way touse a formal computation based on “entropies” for passing to the limit in general systems.53 As I waswondering what I should talk about at a meeting in Versailles in December 1977, and I had thought ofimproving bounds on effective coefficients, it was then natural that I tried to use more general functionals,not necessarily quadratic.

Theorem 26: Assume that F is a continuous function on L(RN ; RN )×L(RN ; RN ) which has the propertythat

Pm P∞ in L2(Ω;L(RN ; RN )

)weak,

Qm Q∞ in L2(Ω;L(RN ; RN )

)weak,

curl(Pm λ) stays in a compact of H−1loc

(Ω;La(RN ; RN )

)strong for all λ ∈ RN ,

div(Qm λ) stays in a compact of H−1loc (Ω) strong for all λ ∈ RN ,

(7.4)

imply

lim infm→∞

∫Ω

F (Pm, Qm)ϕdx ≥∫

Ω

F (P∞, Q∞)ϕdx for all ϕ ∈ Cc(Ω), ϕ ≥ 0. (7.5)

One defines the function g on L(RN ; RN ), possibly taking the value +∞, by

g(A) = supP∈L(RN ;RN )

F (P,AP ). (7.6)

Then if An ∈M(α, β; Ω) H-converges to Aeff , then one has

lim infn→∞

∫Ω

g(An)ϕdx ≥∫

Ω

g(Aeff )ϕdx for all ϕ ∈ Cc(Ω), ϕ ≥ 0. (7.7)

Proof: Of course, one assumes that the left side of (7.7) is < +∞, one extracts a subsequence Am forwhich lim infm is a limit and a sequence of correctors Pm exists. For X ∈ C1

(Ω;L(RN ; RN )

), the sequences

Pm = PmX and Qm = QmX satisfy (7.4) with P∞ = X and Q∞ = Aeff X, and therefore by (7.5) one has

lim infm→∞

∫Ω

F (PmX,Am PmX)ϕdx ≥∫

Ω

F (X,Aeff X)ϕdx for all ϕ ∈ Cc(Ω), ϕ ≥ 0, (7.8)

53 This is the improvement which I call the Compensated Compactness Method, on which I based mylectures at Heriot–Watt University in the Summer 1978 [Ta8]. Of course, my framework was never restrictedto hyperbolic systems, and I had already explained in [Ta5] how to use it for minimization problems, and I haddescribed again the same example in [Ta8] in order to show that my approach based on characterizing Youngmeasures associated to a given list of differential constraints was better than the programme that otherspreferred of looking only at sequentially weakly lower semi-continuous functionals. Of course, “entropies”were never specific to hyperbolic situations, and before discussing the case of hyperbolic systems, I hadexplained how “entropies” explain the sequential weak continuity of Jacobian determinants of size largerthan 2 as examples of the Compensated Compactness theorem. In [Ta6] I had advocated a different fact,that “entropy conditions” were also necessary for stationary solutions of Elasticity.

39

and as F (PmX,Am PmX) ≤ g(Am) by (7.6), one deduces that

lim infm→∞

∫Ω

g(Am)ϕdx ≥∫

Ω

F (X,Aeff X)ϕdx for all X ∈ C1(Ω;L(RN ; RN )

), (7.9)

and for all ϕ ∈ Cc(Ω), ϕ ≥ 0. For X ∈ L∞(Ω;L(RN ; RN )

), there exists a sequence Xn ∈ C1

(Ω;L(RN ; RN )

)such that Xn stays bounded in L∞

(Ω;L(RN ; RN )

)and converges a.e. to X, and by Lebesgue dominated

convergence theorem F (Xn, Aeff Xn) converges in L1(Ω) strong to F (X,Aeff X) and therefore (7.9) is true

for all X ∈ L∞(Ω;L(RN ; RN )

).

For r <∞ letgr(A) = sup

||P ||≤rF (P,AP ), (7.10)

which is continuous, as F is uniformly continuous on bounded sets. For ε > 0 let Mε be a measurablefunction taking only a finite number of distinct values in L(RN ; RN ) and such that ||Mε−Aeff || ≤ ε a.e. inΩ. Then one can choose a measurable Xε taking only a finite number of distinct values in L(RN ; RN ), suchthat ||Xε|| ≤ r and F (Xε,MεXε) = gr(Mε) a.e. in Ω, and as gr(Mε) converges uniformly to gr(Aeff ) as εtends to 0, one deduces from the inequality (7.9) for Xε that one has

lim infm→∞

∫Ω

g(Am)ϕdx ≥∫

Ω

gr(Aeff )ϕdx for all r < +∞ and all ϕ ∈ Cc(Ω), ϕ ≥ 0.. (7.11)

Then gr(Aeff ) increases and converges to g(Aeff ) as r increases to +∞, and one deduces (7.7) by Beppo-Levi’s theorem.

Of course, the (quadratic) theorem of Compensated Compactness, which I will state and prove a littlelater, provides an analytic characterization of all the homogeneous quadratic functions F which are suchthat (7.4) implies (7.5), namely it is true if and only if

F (η ⊗ ξ,Qξ) ≥ 0 for all η, ξ ∈ RN and all Qξ ∈ L(RN ; RN ) satisfying Qξ ξ = 0. (7.12)

It was only in June 1980, while I was visiting the Courant Institute at New York University, that Itried to find which F would be suitable for the case of mixing isotropic materials, restricting myself tothe case where Aeff would also be isotropic, i.e. equal to aeff I, and I decided then to look for func-tions F satisfying (7.12) which would also be invariant under a change of orthonormal basis. Of courseas a consequence of the Div-Curl lemma the functions F±ij (P,Q) = ±(QPT )ij = ±

∑kQik Pjk do satisfy

(7.12), and therefore F±(P,Q) = ±trace(QPT ) give two such invariant functions F satisfying (7.12). Astrace(PT P ),

(trace(P )

)2, trace(QT Q) and(trace(Q)

)2 are invariant under a change of orthonormal basis,I checked which linear combinations of these particular functions would satisfy (7.12). It is obvious thatF1(P ) = trace(PT P ) −

(trace(P )

)2 does satisfy (7.12), because if P = ξ ⊗ η then trace(PT P ) = |ξ|2 |η|2

and trace(P ) = (ξ.η), and therefore trace(PT P ) ≥(trace(P )

)2 by Cauchy–Schwarz’s inequality. Then Ifound that F2(Q) = (N −1)trace(QT Q)−

(trace(Q)

)2 also satisfies (7.12), by applying the following lemmato Qξ whose rank is at most N − 1.

Lemma 27: If M ∈ L(RN ; RN ) then

rank(M) trace(MT M)−(trace(M)

)2 ≥ 0. (7.13)

Proof: If rank(M) = k, one chooses an orthogonal basis such that the range of M is spanned by the firstk vectors of the basis, and then trace(M) =

∑iMii and trace(MT M) =

∑i,jM

2ij ≥

∑iM

2ii, which is

≥ 1k (∑iMii)2 by Cauchy–Schwarz’s inequality.

Among the combinations of these particular functions, I quickly selected two simple ones, correspondingto the following two lemmas. In June 1980, I only computed g(A) for A = λ I, but as Francois MURAT

40

suggested in the Fall that the same functionals would also give an optimal result for anisotropic A, we didtogether the computations for general symmetric A, and I show this general computation below.

Lemma 28: If

F1(P,Q) = α[trace(PT P )−

(trace(P )

)2]− trace(QPT ) + 2trace(P ), (7.14)

then for A ∈ L(RN ; RN ) with AT = A and A ≥ α I, and denoting λ1, . . . , λN , the eigenvalues of A, one has

g1(A) =τ

1 + α τ, with τ =

N∑j=1

1λj − α

. (7.15)

Proof: Of course, if α is an eigenvalue of A then τ =∞ and g1(A) = 1α . One chooses an orthonormal basis

where A is diagonal, and the form of F1(P,AP ) is unchanged, and one must compute

supP

N∑i,j=1

P 2ij − α

( N∑i=1

Pii

)2

−N∑

i,j=1

λi P2ij + 2

N∑i=1

Pii

), (7.16)

and for i 6= j a good choice for Pij is 0 (it does not really matter what Pij is if λi = α), and one must thencompute

supP

( N∑i=1

(α− λi)P 2ii − α

( N∑i=1

Pii

)2

+ 2N∑i=1

Pii

). (7.17)

If∑i Pii is a given value t, then in the case where λi > α for all i, maximizing

∑i(α − λi)P 2

ii is obtainedby taking Pii = C

λi−α for all i, so that t = C τ , and one finds t by maximizing −C2 τ − α t2 + 2t, i.e. by

maximizing − t2

τ − α t2 + 2t, which gives the value of t and the maximum equal to τ

1+α τ . If λi = α for somei, the best is to take Pii = t and Pjj = 0 for j 6= i, and then the best value of t and the maximum are equalto 1

α .

Lemma 29: If

F2(P,Q) = (N − 1)trace(QT Q)−(trace(Q)

)2 − β(N − 1)trace(QPT ) + 2trace(Q), (7.18)

then for A ∈ L(RN ; RN ) with AT = A and A ≤ β I, and denoting λ1, . . . , λN the eigenvalues of A, one has

g2(A) =σ

σ +N − 1, with σ =

N∑j=1

λjβ − λj

. (7.19)

Proof: Of course, if β is an eigenvalue of A then σ = ∞ and g2(A) = 1. One chooses an orthonormal basiswhere A is diagonal, and the form of F2(P,AP ) is unchanged, and one must compute

supP

((N − 1)

N∑i,j=1

λ2iP

2ij −

( N∑i=1

λi Pii

)2

− β(N − 1)N∑

i,j=1

λi P2ij + 2

N∑i=1

λi Pii

), (7.20)

and for i 6= j a good choice for Pij is 0 (it does not really matter what Pij is if λi = β), and one must thencompute

supP

((N − 1)

N∑i=1

(λi − β)λi P 2ii −

( N∑i=1

λi Pii

)2

+ 2N∑i=1

λi Pii

). (7.21)

If∑i λi Pii is a given value s, then in the case where λi < β for all i, maximizing

∑i(λi−β)λi P 2

ii is obtainedby taking Pii = C

β−λifor all i, so that s = C σ, and one finds s by maximizing −(N − 1)C2 σ − s2 + 2s, i.e.

by maximizing −N−1σ s2 − s2 + 2s, which gives the value of s and the maximum equal to σ

σ+N−1 . If λi = β

41

for some i, the best is to take Pii = sλi

and Pjj = 0 for j 6= i, and then the best value of s and the maximumare equal to 1.

Of course, I had also considered more general combinations like

F3(P,Q) = −trace(QPT ) + a[trace(PT P )−

(trace(P )

)2]+ b[(N − 1)trace(QT Q)−

(trace(Q)

)2]+ 2c trace(P ) + 2d trace(Q) with a, b ≥ 0,

(7.22)for which the computation of g3(γ I) requires to compute

supP

[(−γ + a+ b(N − 1)γ2)trace(PT P )− (a+ bγ2)

(trace(P )

)2 + 2(c+ γ d)trace(P )]. (7.23)

In order to have g3(γ I) < +∞, one needs to have −γ+a+ b(N − 1)γ2 ≤ 0, and one can then choose all nondiagonal coefficients of P equal to 0; for trace(P ) given one wants to minimize trace(PT P ), and therefore oneonly considers P = p I, and one wants then to maximize

(−γ+a+b(N−1)γ2−N(a+bγ2)

)p2 +2(c+γ d)p

),

and one obtains

g3(γ I) =(c+ γ d)2

(N − 1)a+ γ + b γ2if a, b ≥ 0 and − γ + a+ b(N − 1)γ2 ≤ 0. (7.24)

As it was not so easy to handle, I had choosen the simplification of considering either b = d = 0, whichcorresponds to Lemma 28, or a = c = 0, which corresponds to Lemma 29. I was interested in characterizingthe possible effective tensors Aeff of mixtures obtained by using proportion θ of an isotropic materialwith tensor α I and proportion 1 − θ of an isotropic material with tensor β I, i.e. I considered An =(χn α + (1 − χn)β)I with a sequence of characteristic functions χn converging in L∞(Ω) weak ? to θ, andAn H-converging to Aeff . I already knew (5.3); in order to show explicitly the dependence in θ, (5.3) meansthat the eigenvalues λ1, . . . , λN of Aeff satisfy

λ−(θ) ≤ λj ≤ λ+(θ), j = 1, . . . , N a.e. in Ω, (7.25)

where, as in (5.3)

λ+(θ) = θ α+ (1− θ)β, 1λ−(θ)

α+

1− θβ

, (7.26)

Theorem 26 asserts thatg(Aeff ) ≤ θ g(α I) + (1− θ)g(β I) a.e. in Ω, (7.27)

whenever g is associated to a function F for which (7.4) implies (7.5). For the particular function g1 givenby Lemma 28, one has g1(α I) = 1

α and g1(β I) = N/(β−α)1+αN/(β−α) = N

(N−1)α+β , and therefore (7.27) means thatτeff

1+α τeff ≤ θα + (1−θ)N

(N−1)α+β = (N−θ)α+θ β

α(

(N−1)α+β) , which gives for τeff the upper bound

τeff =N∑j=1

1λj − α

≤ (N − θ)α+ β

(1− θ)α(β − α). (7.28)

Equality occurs for the case of layers, which according to (4.11) corresponds to Aeff having one eigenvalueequal to λ−(θ) and the N − 1 others equal to λ+(θ), i.e

1λ−(θ)− α

+N − 1

λ+(θ)− α=

(N − θ)α+ β

(1− θ)α(β − α). (7.29)

For the particular function g2 given by Lemma 29, one has g2(α I) = N α/(β−α)N α/(β−α)+N−1 = N α

α+(N−1)β and

g2(β I) = 1, and therefore (7.27) means that σeff

σeff +N−1≤ θ N α

α+(N−1)β + (1 − θ) = (θ N+1−θ)α+(1−θ)(N−1)βα+(N−1)β ,

which gives for σeff the upper bound

σeff =N∑j=1

λjβ − λj

≤ (θ N + 1− θ)α+ (1− θ)(N − 1)βθ(β − α)

, (7.30)

42

and equality occurs for the case of layers, i.e

1β − λ−(θ)

+N − 1

β − λ+(θ)=

(θ N + 1− θ)α+ (1− θ)(N − 1)βθ(β − α)

. (7.31)

I discuss now the basic result of Compensated Compactness theory, which has been used for Lemma 29through the condition (7.12); Lemma 28 is more easy, and actually follows from the Div-Curl lemma, statedin (4.8)/(4.9) in Lemma 2, and mostly used in the case of gradients for which a simple proof by integrationby parts has been shown, except for an application in Proposition 9. The necessity of a condition like (7.12)is easy and the general result is not even restricted to quadratic functionals [Ta8].

Proposition 30: Assume that Ω is an open subset of RN , Aijk, i = 1, . . . , q, j = 1, . . . , p, k = 1, . . . , N , arereal constants and F is a continuous real function on Rp such that, whenever

Un U∞ in L∞(Ω; Rp) weak ?F (Un) V∞ in L∞(Ω) weak ?p∑j=1

N∑k=1

Aijk∂Unj∂xk

= 0 for i = 1, . . . , q,(7.32)

one can deduce thatV∞ ≥ F (U∞) a.e. in Ω. (7.33)

Then F is Λ-convex, i.e.

t 7→ F (a+ tλ) is convex for all a ∈ Rp and all λ ∈ Λ, (7.34)

where

Λ =λ | λ ∈ Rp : there exists ξ ∈ RN \ 0,

p∑j=1

N∑k=1

Aijkλjξk = 0 for i = 1, . . . , q. (7.35)

Proof: Let λ ∈ Λ and ξ ∈ RN \ 0 satisfy the condition in (7.35), then if one takes

Un(x) = a+ λ fn(

(ξ.x)), (7.36)

with fn smooth, one has

p∑j=1

N∑k=1

Aijk∂Unj∂xk

=( p∑j=1

N∑k=1

Aijkλjξk

)(fn)′

((ξ.x)

)= 0. (7.37)

One chooses a sequence χn of characteristic functions converging in L∞(R) weak ? to θ and a regularizationfn = ρn ? χn such that fn − χn converges almost everywhere to 0, one has

Un ≈ χn(a+ λ) + (1− χn)a, F (Un) ≈ χnF (a+ λ) + (1− χn)F (a)U∞ = θ(a+ λ) + (1− θ)a, V∞ = θ F (a+ λ) + (1− θ)F (a).

(7.38)

By hypothesis, one has V∞ ≥ F (U∞) and (7.34) follows by varying θ ∈ (0, 1), a ∈ RN , and λ ∈ Λ.

The sufficiency of a condition like (7.12) comes from applying a general result valid for quadraticfunctionals, which I often call the quadratic theorem of Compensated Compactness [Ta8].

Theorem 31: Let Q be a real homogeneous quadratic form on Rp which is Λ-convex, with Λ defined in(7.35), or equivalently

Q(λ) ≥ 0 for all λ ∈ Λ. (7.39)

43

IfUn U∞ in L2

loc(Ω; Rp) weakQ(Un) ν in the sense of measuresp∑j=1

N∑k=1

Aijk∂Unj∂xk

stays in a compact of H−1loc (Ω) strong for i = 1, . . . , q,

(7.40)

then one hasν ≥ Q(U∞) in the sense of measures. (7.41)

Proof: Un − U∞ satisfies (7.40) with U∞ replaced by 0, and ν replaced by ν −Q(U∞), and one may thenassume that U∞ = 0 with the goal of proving that ν ≥ 0. For ϕ ∈ C1

c (Ω), let Wn = ϕUn, which is extendedby 0 outside Ω and let us prove that

lim infn→∞

∫RN

Q(Wn) dx ≥ 0. (7.42)

This shows that 〈ν, ϕ2〉 ≥ 0 for all ϕ ∈ C1c (Ω), and by density for all ϕ ∈ Cc(Ω), and as every nonnegative

function in Cc(Ω) is a square, one deduces that ν ≥ 0 in the sense of measures.If Q(U) =

∑ij qij Ui Uj with qij = qji for all i, j = 1, . . . , p, I still denote Q the Hermitian extension to

Cp, i.e. Q(U) =∑ij qij Ui Uj , and by Plancherel formula, (7.42) is equivalent to

lim infn→∞

∫RN

Q(FWn) dξ ≥ 0, (7.43)

where F denotes Fourier transform, for which I use Laurent SCHWARTZ’s notations

FWn(ξ) =∫

RN

Wn(x) e−2iπ(x.ξ) dx. (7.44)

One can replace Q by <Q in (7.43), because the integral in (7.42) and therefore in (7.43) is real, and onenotices that (7.39) is equivalent to

<Q(λ) ≥ 0 for all λ ∈ Λ + iΛ ⊂ Cp. (7.45)

As Wn converges in L2(RN ; Rp) weak to 0 and keeps its support in a fixed compact set K of RN , itsFourier transform converges pointwise to 0 and is uniformly bounded and therefore by Lebesgue dominatedconvergence theorem it converges in L2

loc(RN ) strong to 0, and the problem for proving (7.43) lies in thebehaviour of FWn at infinity. Information at infinity is given by the partial differential equations satisfiedby Wn, and because

∑j

∑k Aijk

∂Wnj

∂xkmust converge in H−1(RN ) strong to 0 for i = 1, . . . , q, one deduces

thatq∑i=1

∫RN

11 + |ξ|2

∣∣∣ p∑j=1

N∑k=1

Aijk FWnj (ξ)ξk

∣∣∣2 dξ → 0. (7.46)

For |ξ| large ξk√1+|ξ|2

≈ ξk

|ξ| , and (7.46) tells that near infinity FWn is near Λ + iΛ, where <Q ≥ 0, and a

proof of (7.43) follows from the fact that for every ε > 0 there exists Cε such that

<Q(Z) ≥ −ε|Z|2 − Cεq∑i=1

∣∣∣ p∑j=1

N∑k=1

Aijk Zjξk|ξ|

∣∣∣2 for all Z ∈ Cp, ξ ∈ RN \ 0. (7.47)

Applying (7.47) to Z = FWn(ξ) and integrating in ξ for |ξ| ≥ 1, gives a lower bound for∫|ξ|≥1

<Q(FWn) dξwhere the coefficient of −ε is bounded as Wn is bounded in L2(RN ; Rp) and the coefficient of Cε tendsto 0 by (7.46), and therefore one deduces that lim infn

∫RN <Q(FWn) dξ ≥ −M ε, and letting ε tend to

0 proves (7.43). The inequality (7.47) is proved by contradiction: if there exists ε0 > 0 and a sequence

44

Zn ∈ CN , ξn ∈ RN \ 0 such that <Q(Zn) < −ε0|Zn|2 − n∑i

∣∣∑jk Aijk Z

njξn

k

|ξn|∣∣2, then after normalizing

Zn to |Zn| = 1, and extracting a subsequence such that Zn converges to Z∞ and ηn = ξn

|ξn| converges toη∞, one finds that <Q(Z∞) ≤ −ε0, which contradicts the fact that (Z∞, η∞) satisfies the conditions of thedefinition of Λ in (7.35), showing that Z∞ ∈ Λ + iΛ and therefore implying <Q(Z∞) ≥ 0 by (7.45).

8. Computation of effective coefficientsI computed the bounds (7.28) and (7.30) in the Fall 1980 with Francois MURAT, but in June 1980 in New

York I had only done the corresponding computations for the case where Aeff = aeff I, and having shown mynew bounds to George PAPANICOLAOU he had suggested that I compare them with the Hashin–Shtrikmanbounds [Ha&Sh], which I was hearing about for the first time. As for the first method for obtaining boundsfor effective coefficients that I had used before with Francois MURAT, that second method which I haddeveloped in [Ta7] was not restricted to symmetric tensors, and I could not understand what would replacein more general problems the particular minimization formulation that Zvi HASHIN and S. SHTRIKMAN

had used, but there was an obvious gap in their “proof”, and at the time I could not find a mathematicalargument which could explain their computation.54 However, the bounds which I had just found were indeedthe same as the formal bounds that they had derived, and I had therefore given the first mathematical proofthat the Hashin–Shtrikman bounds are indeed valid for mixtures of two isotropic materials in the case wherethe effective tensor is isotropic. I had not yet thought of showing that my bounds were optimal and could beattained (which I would have tried with the method of successive layerings, which was the only simple explicitconstruction that I knew), and as the construction of coated spheres that Zvi HASHIN and S. SHTRIKMAN

had used was clear enough to me, I easily transformed it into a correct mathematical argument, but I didnot try to compare with what the repeated formula for layerings would have given.

When I went back to Paris after spending the Summer at the Mathematics Research Center in Madison,I showed my computations to Francois MURAT an he suggested that the same functionals might also giveoptimal bounds for anisotropic effective tensors, and therefore we computed (7.28) and (7.30) and we tried toshow that the bounds were attained by a construction of coated confocal ellipsoids. Edward FRAENKEL wasvisiting Paris in the Fall of 1980, and as I had mentioned to him our plan, he had given us some advice aboutthe way to compute with ellipsoids, but we could not follow precisely what he had told us. We tried thenfamilies of general surfaces, and in order to simplify a very technical computation, we made a simplifyingassumption, and that gave us exactly the case of confocal ellipsoids, but using different formulas than theones that Edward FRAENKEL had advocated. Indeed the set of bounds (7.25), (7.28), (7.30) gave thecharacterization of the effective tensors of mixtures using exactly proportion θ of an isotropic material withtensor α I and proportion 1−θ of an isotropic material with tensor β I. I will not describe the computationsfor confocal ellipsoids, for which I refer to [Ta9], as I will describe a simpler approach later.

I described our results at a meeting at New York University in June 1981, and they gave the missinglink in the method that I had partially described in 1974 [Ta2]. As I suggested that the case of mixing morethan two isotropic materials would probably be very similar with a construction like the Hashin–Shtrikmancoated spheres in the isotropic case, with materials of increasing conductivity from inside out or from outsidein depending upon which bound was considered, I was surprised to hear a comment by a young participantthat even for three materials it was not so, and that hiding the best conductor in the middle was sometimesgiving a better effective conductivity than if the best conductor was put outside. The comment was comingfrom a young Australian physicist, still a graduate student at the time, who has since imposed himself asthe best specialist for questions of bounds of effective coefficients, Graeme MILTON.55

54 In their argument, Zvi HASHIN and S. SHTRIKMAN used something that did not make any (mathemat-ical) sense at the time, and it seems now related to H-measures, which I only introduced in the late 80s fora different purpose [Ta12]; the new method for deriving bounds which I wrote in [Ta12], generalizing myearlier approach of [Ta7], has actually some analogy with the argument of Zvi HASHIN and S. SHTRIKMAN,but it is not restricted to minimization problems.

55 In the early 90s, while I visited Graeme MILTON in New York, he gave me a physical explanation ofwhy it is sometime good to hide the best conductor available; he argued that if one has a spherical core of avery poor conductor, the electric current tries to avoid it and that creates a high concentration of field linesnear the surface of the poor conductor and therefore it is there that the best conductor is more useful.

45

In the Spring 1982, I gave an introductory course to Homogenization at Ecole Polytechnique, and asI thought that our construction with confocal ellipsoids could not be avoided, I had asked two students,Philippe BRAIDY and Didier POUILLOUX, to make a numerical study comparing the materials that we hadconstructed by using confocal ellipsoids and those that could be constructed by successive layerings, whichwas the method that we had used for the results quoted in [Ta2]. Contrary to my mistaken expectations,they reported that the numerical computations showed that the two sets were the same, and a few daysafter they had a proof of it, using N layerings in orthogonal directions, where in each layering the directionorthogonal to the layers is a common eigenvector for the two materials being mixed [Br&Po]. I immediatelychecked that the repeated layering construction has the same restricted “generalized BERGMAN function”than our construction with confocal ellipsoids. For example, in my interpretation of the construction withcoated spheres of Zvi HASHIN and S. SHTRIKMAN in June 1980, for a parameter θ ∈ [0, 1] I used a sequence ofVitali coverings with smaller and smaller coated spheres, all showing the same proportion of volume betweenthe inside spheres and the outside spherical coats; the geometry being given I imagined all the interiorspheres filled with an isotropic material with tensor α I and all the outside spherical coats filled with anisotropic material with tensor β I, and then I showed that the sequence An defined in this way H-converges toΦ0(α, β) I, where Φ0(α, β) is one of the two Hashin–Shtrikman bounds, corresponding to equality in (7.28).56

This is exactly the type of functions used by David BERGMAN when one mixes two isotropic conductors andone expects the resulting effective material to be isotropic whatever the ratio of the two conductivities are,and therefore one has Φ0(α, β) = αF

(βα

). David BERGMAN made the important observation that F extends

to the complex upper half plane into a holomorphic function satisfying =(F (z)

)≥ 0 [Be].57 What I call a

“generalized Bergman function” is a similar situation where a sequence of geometries is given for mixing rmaterials with proportions θ1, . . . , θr, and if the r materials used have tensors M1, . . . ,Mr, then the resultingeffective tensor is Φ(M1, . . . ,Mr). One assumes that for j = 1, . . . , r, Mj ∈M(αj , βj ; Ω), and that one has rsequences of characteristic functions of measurable sets from a partition χnj , j = 1, . . . , r, satisfying χnj χ

nk = 0

for j 6= k,∑j χ

nj = 1 a.e. in Ω, and one assumes that χnj converges in L∞(Ω) weak ? to θj for j = 1, . . . , r;

then one uses Proposition 17 in order to show that there is a subsequence for which for all such M1, . . . ,Mr,∑j χ

mj Mj H-converges to an element Φ(M1, . . . ,Mr) of M(α, β; Ω) with 0 < α = minα1, . . . , αr ≤ β =

maxβ1, . . . , βr < ∞. I use the qualificative restricted for expressing the fact that one restricts attentionto a special class of M1, . . . ,Mr, for example isotropic tensors m1 I, . . . ,mr I,58 and I do not know how tocompute the generalized Bergman function for a geometry of coated spheres or confocal ellipsoids, and itmay be dependent of other properties of the Vitali covering used.59 In their computation, Philippe BRAIDY

and Didier POUILLOUX had used the same method that Antonio MARINO and Sergio SPAGNOLO or Francois

56 My proof relied on the fact that, despite the huge arbitrariness in the choice of the Vitali coverings, forany λ ∈ RN I could write explicit solutions of div

(An grad(un)

)= 0 for which grad(un) converges weakly to

λ and I could compute the limit of An grad(un). My construction was local, and I followed the computationthat I had just read in the article of Zvi HASHIN and S. SHTRIKMAN [Ha&Sh], i.e. the explicit solutionof div

(Agrad(v)

)= 0 in a coated sphere domain with affine boundary conditions on the outside coat; this

computation is a by-product of the formula for the change in electric field created by an isotropic sphericalconducting inclusion in an infinite isotropic medium, a classical formula for physicists who associate it withvarious names, some as famous as MAXWELL, but it must have been known to GAUSS and to DIRICHLET,who seems to be credited for a similar formula for an ellipsoid (and he may therefore have known the formulasthat Francois MURAT and I (re)discovered for our construction with coated ellipsoids).

57 The idea may have been used before, and I think that I had heard such an idea attributed to PRAGER.In dimension N = 2, the function F also satisfies the relation F (z)F

(1z

)= 1 by an argument of Joseph

KELLER [Ke], and in the early 80s Graeme MILTON showed me that all such functions can be obtained.58 Ken GOLDEN and George PAPANICOLAOU have studied functions of r complex variables F appearing

when one imposes the restriction Φ(m1 I, . . . ,mr I) = F (m1, . . . ,mr)I for all m1, . . . ,mr > 0.59 The computations of Zvi HASHIN and S. SHTRIKMAN for coated spheres and diffusion equation consists

in looking for solutions of the form xjf(r) and one finds that f must satisfy a differential equation; theyalso used the same construction of coated spheres for linearized Elasticity with isotropic materials, and theycould compute the effective bulk modulus because it corresponds to applying a uniform pressure and thedisplacement has the form x g(r), and one finds that g must satisfy a differential equation. I do not know howto compute the effective shear modulus for the geometry of coated spheres, and it may depend upon which

46

MURAT and I had used in the early 70s, and the reiteration of the layering formula was simple enough becauseat each step the direction orthogonal to the layers was a common eigenvector of both (symmetric) tensorswhich were mixed, and actually the two tensors had a common basis of eigenvectors. During the Spring1983, while I was visiting the Mathematical Sciences Research Institute in Berkeley, I tried to compute theformula for mixing arbitrary materials in arbitrary directions, having in mind to reiterate the procedure. Iwanted to rewrite formula (4.11) in a more intrinsic way, and I could easily deduce what the formula (4.11)would become if I used layers orthogonal to a vector e, i.e. An depending only upon (x.e), but that didnot change much, and it was a different idea that simplified the computation. Using layers orthogonal to efor mixing two materials with tensors A and B, with respective proportions θ and 1 − θ, the simplificationcame by considering θ small, and because the formula appeared to have the form B + θ F (A,B, e) + o(θ),it suggested to write a differential equation B′ = F (A,B, e) and integrate it. In other terms, for e fixed,increasing the proportion of A from 0 to 1 creates a curve going from B to A in the space of matrices, andI first computed that curve by considering it as the trajectory of a differential equation, which was easy towrite down. One can first rewrite formula (4.11) for layers orthogonal to e.

1(An e.e)

1

(Aeff e.e)in L∞(Ω) weak ?

(An f.e)(An e.e)

(Aeff f.e)(Aeff e.e)

in L∞(Ω) weak ? for every f⊥e

(An e.g)(An e.e)

(Aeff e.g)(Aeff e.e)

in L∞(Ω) weak ? for every g⊥e

(An f.g)− (An f.e)(An e.g)(An e.e)

(Aeff f.g)− (Aeff f.e)(Aeff e.g)(Aeff e.e)

in L∞(Ω) weak ? for every f⊥e, g⊥e,

(8.1)where I have used the Euclidean structure of RN .60 Mixing A with a small proportion θ and B withproportion 1− θ in layers orthogonal to e gives then

1(Aeff e.e)

=1− θ

(B e.e)+

θ

(Ae.e)

(Aeff e.e) = (B e.e) + θ[(B e.e)− (B e.e)2

(Ae.e)

]+ o(θ),

(8.2)

and then for f and g orthogonal to e(Aeff f.e)(Aeff e.e)

=(1− θ)(B f.e)

(B e.e)+θ(Af.e)(Ae.e)

(Aeff f.e) = (B f.e) + θ[ (Af.e)(B e.e)

(Ae.e)− (B f.e)(B e.e)

(Ae.e)

]+ o(θ)

(8.3)

(Aeff e.g)(Aeff e.e)

=(1− θ)(B e.g)

(B e.e)+θ(Ae.g)(Ae.e)

(Aeff e.g) = (B e.g) + θ[ (Ae.g)(B e.e)

(Ae.e)− (B e.g)(B e.e)

(Ae.e)

]+ o(θ)

(8.4)

(Aeff f.g)− (Aeff f.e)(Aeff e.g)(Aeff e.e)

= (1− θ)(B f.g)− (1− θ)(B f.e)(B e.g)(B e.e)

+ θ(Af.g)− θ(Af.e)(Ae.g)(Ae.e)

(Aeff f.g) = (B f.g) + θ(

(Af.g)− (B f.g)−((B f.e)− (Af.e)

)((B e.g)− (Ae.g)

)(Ae.e)

)+ o(θ).

(8.5)

Vitali covering is used (if I understood correctly what Graeme MILTON told me a few years ago, he knew thatit does depend upon the covering). Gilles FRANCFORT and Francois MURAT have computed in [Fr&Mu] thecomplete effective elasticity tensors of mixtures, but following the method of multiple layerings, adaptingthe extension that I had given in [Ta9] of the computation of Philippe BRAIDY and Didier POUILLOUX.

60 It can be avoided by denoting E the ambient vector space, taking e as an element of the dual E ′, andconsidering the tensors A,B, as elements of L(E ′, E), as well as e⊗ e which appears in some formulas.

47

The form of (8.5) suggests that one has

Aeff = B + θ[A−B − (B −A)

e⊗ e(Ae.e)

(B −A)]

+ o(θ), (8.6)

and indeed this is compatible with (8.2)/(8.4). When e and A are given, formula (8.6) corresponds to adifferential equation

M ′ = A−M − (M −A)e⊗ e

(Ae.e)(M −A). (8.7)

The integral curve corresponds to the formula for layering with a material with tensor A, with layers orthogo-nal to e. Formula (8.1), which corresponds to the first lines of (8.2)/(8.5) when one mixes two materials withtensors A and B says that the integral curves become straight lines if one performs the change of variableA 7→

(1

(Ae.e) ,(Af.e)(Ae.e) ,

(Ae.g)(Ae.e) , (Af.g)− (Af.e)(Ae.g)

(Ae.e)

)when f, g span the subspace orthogonal to e. In the early

70s, we knew that if one does not pay attention to the proportions used of various materials, formula (4.11)means that the set of effective tensors has the property that all its images by maps like the one mentionedabove are automatically convex. This condition gives a geometric characterization of the sets that one cannotenlarge by layering, at least for the case where one is not allowed to rotate the materials used, in the casewhere one starts with some anisotropic materials; in realistic problems, one must also allow for rotations ofthe materials used, i.e. the set must be stable by mappings A 7→ PT AP for P ∈ SO(N), with N = 2 orN = 3 usually.

Assuming that M −A is invertible, (8.7) can be written as

[(M −A)−1]′ = −(M −A)−1A′(M −A)−1 = (M −A)−1 +e⊗ e

(Ae.e), (8.8)

whch is a linear equation in (M − A)−1. Using τ as variable, and assuming that τ = 0 corresponds to B,the solution of (8.8) is

(M −A)−1 = − e⊗ e(Ae.e)

+ eτ(

(B −A)−1 +e⊗ e

(Ae.e)

), (8.9)

and if M corresponds to having used proportion η(τ) of A and 1 − η(τ) of B, then for θ small η(τ + θ) =θ + (1− θ)η(τ) + o(θ) gives η′ = 1− η and therefore η = 1− e−τ or equivalently eτ = 1

1−η for proportion ηof A, giving

(M −A)−1 =(B −A)−1

1− η+

η

1− ηe⊗ e

(Ae.e)for proportion η of A. (8.10)

If (B − A)z = 0 for a nonzero vector z, then (8.7) shows that (M − A)z = 0, and in this case one mustreinterpret (8.10). Of course, exchanging the role of A and B and changing η into 1 − η, (8.10) is replacedby

(M −B)−1 =(A−B)−1

η+

1− ηη

e⊗ e(B e.e)

for proportion η of A. (8.11)

With formula (8.10) at hand I could easily reiterate the layering process with various directions of layers, withthe condition that each layering uses the material with tensor A, and it gave the following generalization ofthe formula which had been obtained by Philippe BRAIDY and Didier POUILLOUX in the special case whereA and B have a common basis of eigenvectors and each e is one of these common eigenvectors.

Proposition 32: For η ∈ (0, 1), let ξ1, . . . , ξp be p positive numbers with∑j ξj = 1− η, let e1, . . . , ep be p

nonzero vectors of RN , then using proportion η of material with tensor A and proportion 1− η of materialwith tensor B, one can construct by multiple layerings the material with tensor M such that

(M −B)−1 =(A−B)−1

η+

( p∑j=1

ξjej ⊗ ej

(B ej .ej)

). (8.12)

48

Proof: Of course, one assumes that B − A is invertible, as the formula must be reinterpreted if B − A isnot invertible. One starts from M0 = A and by induction one constructs Mj by layering Mj−1 and B inproportions ηj and 1− ηj , with layers orthogonal to ej . Formula (8.11) gives

(Mj −B)−1 =(Mj−1 −B)−1

ηj+

1− ηjηj

e⊗ e(B e.e)

for j = 1, . . . , p, (8.13)

which is adapted to reiteration and provides (8.12) with

η = η1 · · · ηpξ1 = 1− η1, ξj = η1 · · · ηj−1(1− ηj) for j = 1, . . . , p,

(8.14)

which gives ξ1 + . . .+ ξj = 1− η1 · · · ηj for j = 1, . . . , p, and this defines in a unique way ηj for j = 1, . . . , p.

The preceding computations did not require any symmetry assumption for A or B. The characterizationof the sum

∑j ξj

ej⊗ej

(B ej .ej) for all ξj > 0 with sum 1 − η and all nonzero vectors ej depends only on thesymmetric part of B (and of η).

Lemma 33: If B is symmetric positive definite then for ξ1, . . . , ξp > 0 and nonzero vectors e1, . . . , ep, onehas

p∑j=1

ξjej ⊗ ej

(B ej .ej)= B−1/2KB−1/2, with K symmetric nonnegative and trace(K) =

p∑j=1

ξj , (8.15)

and conversely any such K can be obtained in this way.Proof: Putting ej = B−1/2 fj for j = 1, . . . , p, one has K =

∑j ξj

fj⊗fj

|fj |2 , and each fj⊗fj

|fj |2 is a nonnegativesymmetric tensor with trace 1, and (8.15) follows. Conversely if K is a symmetric nonnegative tensor withtrace equal to S, then there is an orthonormal basis of eigenvectors f1, . . . , fN , with K fj = κj fj and κj ≥ 0for j = 1, . . . , N , and

∑j κj = S, so that K =

∑j κj fj ⊗ fj .

Using Proposition 32 and Lemma 33, with A = α I and B = β I, one can construct materials with asymmetric tensor M with eigenvalues λ1, . . . , λN , and (8.12) and Lemma 33 mean that

1λj − β

≥ 1η(α− β)

for j = 1, . . . , N

N∑j=1

1λj − β

=N

η(α− β)+

1− ηη β

,

(8.16)

i.e. λj ≤ λ+(η) for j = 1, . . . , N , and equality in (7.30), which implies λj ≥ λ−(η) for j = 1, . . . , N , becauseof (7.31). Exchanging the roles of A and B one can obtain another part of the boundary of possible effectivetensors with equality in (7.28), and filling the interior of the set is then easy.

After I had mentioned these new results to Robert KOHN, who was also visiting MSRI at the time,he wondered if one could find a more direct proof, and I therefore proved again the formulas (8.10)/(8.11)directly.

Lemma 34: Mixing materials with tensor A and B with respective proportions η and 1 − η in layersorthogonal to e gives an effective tensor Aeff given by

Aeff = η A+ (1− η)B − η(1− η)(B −A)e⊗ e

(1− η)(Ae.e) + η(B e.e)(B −A). (8.17)

Proof: One considers a sequence of characteristic functions χn converging in L∞(R) weak ? to η and depend-ing only upon (x.e), and one chooses An = χnA+ (1− χn)B. For an arbitrary vector E∞ ∈ RN , one con-structs a sequence En = grad(un) converging in L2

loc(RN ; RN ) weak to E∞, depending only upon (x.e) and

49

satisfying div(An grad(un)

)= 0, and one computes the limit in L2

loc(RN ; RN ) weak of Dn = An grad(un),which will be D∞ = Aeff E∞, with Aeff given by (8.17).

One looks for EA, EB ∈ RN such that one can take

En = χnEA + (1− χn)EBDn = χnAEA + (1− χn)BEBη EA + (1− η)EB = E∞,

(8.18)

and the constraints curl(En) = div(Dn) = 0 become

EB − EA = c e

(BEB −AEA.e) = 0,(8.19)

and then one should haveη AEA + (1− η)BEB = Aeff E∞. (8.20)

One chooses thenEA = E∞ + cA e; EB = E∞ + cB e; η cA + (1− η)cB = 0, (8.21)

and (8.19) requires that ((B −A)E∞.e

)+ cB(B e.e)− cA(Ae.e) = 0, (8.22)

and (8.21)/(8.22) give ((1− η)(Ae.e) + η(B e.e)

)cA = (1− η)

((B −A)E∞.e

)((1− η)(Ae.e) + η(B e.e)

)cB = −η

((B −A)E∞.e

),

(8.23)

and therefore (8.20) becomes

Aeff E∞ = (η A+ (1− η)B)E∞ +

((B −A)E∞.e

)(1− η)(Ae.e) + η(B e.e)

(η(1− η)Ae− η(1− η)B e

), (8.24)

and as (8.24) is true for every E∞ ∈ RN , one deduces formula (8.17) for Aeff .

One deduces then (8.10)/(8.11) from (8.17) by applying a result of Linear Algebra.

Lemma 35: If M ∈ L(E ,F) is invertible, and if a ∈ F , b ∈ E ′, then M +a⊗ b is invertible if (M−1a.b) 6= −1and

(M + a⊗ b)−1 = M−1 − 11 + (M−1a.b)

M−1(a⊗ b)M−1. (8.25)

Proof: One wants to solve (M + a⊗ b)x = y, i.e. M x+ a(b.x) = y, and therefore x = M−1 y− tM−1 a witht = (b.x), but one needs then to have t = (b.M−1 y)− t(M−1 a.b), which is possible because (M−1 a.b) 6= −1,and gives x = M−1 y −M−1 a (b.M−1 y)

1+(M−1 a.b) , and as y is arbitrary it gives (8.25).

A few years ago, working with Francois MURAT on the relation between Young measures and H-measures,61 we computed the analog of formula (8.17) when one mixes r different materials.

61 I have used in various publications not related to the purpose of this course our construction of admissiblepairs of a Young measure and a H-measure associated with a sequence, the first time at a meeting for the600th anniversary of the University of Ferrara in 1991. Our redaction of these results is still a draft whichwe have not looked at for years, but I have nevertheless given it to a few persons, and I wonder how manywill have claimed our results as theirs.

50

Lemma 36: Mixing r materials with tensors M1, . . . ,Mr, with respective proportions η1, . . . , ηr, in layersorthogonal to e, gives an effective tensor Meff given by

Meff =r∑i=1

ηiMi −∑

1≤i<j≤r

ηiηj(Mi −Mj)Rij(Mi −Mj)

Rij =1

(Mi e.e)e⊗ eH

1(Mj e.e)

for i, j = 1, . . . , r

H =r∑

k=1

ηk(Mk e.e)

.

(8.26)

Proof: As for the proof of Lemma 34, one uses

Ei = E∞ + ci e in the layers of material #i, for i = 1, . . . , r, andr∑i=1

ηi ci = 0, (8.27)

and one must have (MiEi.e) = (Mj Ej .e) if there is an interface between material #i and material #j, andtherefore there exists a constant C such that

(MiEi.e) = C for i = 1, . . . , r. (8.28)

With the definition of Ei, i = 1, . . . , r, (8.28) implies

ci =C − (MiE

∞.e)(Mi e.e)

for i = 1, . . . , r, (8.29)

and the condition∑i ηici = 0 gives

H C =r∑i=1

ηi(MiE

∞.e)(Mi e.e)

, (8.30)

with H given in (8.26). Using (8.28) one obtains

H (Mi e.e)ci =( r∑j=1

ηj(Mj E

∞.e)(Mj e.e)

)− (MiE

∞.e)(Mi e.e)

=r∑j=1

ηj

((Mj −Mi)E∞.e

)(Mj e.e)

for i = 1, . . . , r. (8.31)

This gives

Meff E∞ = (r∑i=1

ηiMi)E∞ +1H

r∑i=1

ηi(Mi e.e)

( r∑j=1

ηj

((Mj −Mi)E∞.e

)(Mj e.e)

)Mi e

= (r∑i=1

ηiMi)E∞ −1

2H

r∑i,j=1

ηiηj

((Mi −Mj)E∞.e

)(Mi e.e)(Mj e.e)

(Mi −Mj)e

= (r∑i=1

ηiMi)E∞ −1H

∑i<j

ηiηj(Mi −Mj)e⊗ e

(Mi e.e)(Mj e.e)(Mi −Mj)E∞,

(8.32)

proving (8.26).

I have shown the derivation of the differential equation (8.7) because it has some intrinsic interest. Inoticed later that by using relaxation techniques related to Lemma 1, one can replace e⊗e

(Ae.e) in (8.7) by any

convex combination∑j θj

ej⊗ej

(Aej .ej) , or more generally∫

SN−1e⊗e

(Ae.e) dπ(e) for a probability measure π on thesphere SN−1, and this gives a differential analogue of formula (8.12).62

62 One can also use a convex combination in (A.e), and I had hoped that this trick would give morecharacterization of effective coefficients. I did talk about this method at a meeting in Minneapolis in 1985,but I only mentioned it in writing for a meeting in Los Alamos in 1987.

51

As I will explain in the next chapter, I also discovered in the Spring 1983 that the characterizationobtained with Francois MURAT was not absolutely necessary for solving the problems that we had in mind.There was an obvious generalization to mixing an arbitrary number of isotropic materials, but there weresome technical details for mixing anisotropic materials, and I only noticed the following results much later(see footnote 44), and Lemma 18 played a crucial role. Although the two methods for obtaining bounds oneffective coefficients that I have described in chapters 5 and 7 are valid for nonnecessarily symmetric operators(as is the third one that I have mentioned in footnote 54, based on H-measures), the only applications that Iknow use symmetric operators,63 and because one allows for arbitrary rotations the set of effective operatorscorresponding to mixtures using precise proportions of each constituent is a set of matrices defined in termsof their eigenvalues, and that point is worth discussing in detail.

One starts from a finite number of materials with symmetric anisotropic tensors Mi, i = 1, . . . , r, andmixing them with local proportions ηi, i = 1, . . . , r, with

∑i ηi = 1 a.e. in Ω, means that one considers

sequences χni of characteristic functions of disjoint measurable sets for i = 1, . . . , r, i.e. χni χnj = 0 whenever

i 6= j, such thatχni ηi in L∞(Ω) weak ?

An =r∑i=1

χni (Rn)T MiRn, with Rn ∈ SO(N) a.e. in Ω

An H-converges to Aeff ,

(8.33)

and one writesAeff ∈ K(η1, . . . , ηr;M1, . . . ,Mr) a.e. in Ω, (8.34)

and the claim is that the set K(η1, . . . , ηr;M1, . . . ,Mr) of effective materials obtained by mixing the ma-terials M1, . . . ,Mr with “exact proportions” η1, . . . , ηr only depends upon Aeff through its eigenvaluesλeff1 , . . . , λeffN , which satisfy

(λeff1 , . . . , λeffN ) ∈ Λ(η1, . . . , ηr;M1, . . . ,Mr) a.e. in Ω, (8.35)

where Λ(η1, . . . , ηr;M1, . . . ,Mr) is a subset of RN which is invariant by permutation of the coordinates, be-cause one has not imposed any rule for ordering the eigenvalues. Like for the discussion following Proposition17, the statement above is clear from the intuitive understanding of what mixing is about, but I have notgiven any precise mathematical definition of what the set K(η1, . . . , ηr;M1, . . . ,Mr) is yet. In 1983, RobertKOHN had asked me a question showing that he was concerned about a problem of this kind: in my workwith Francois MURAT, obtained for mixing two isotropic materials, we had found a necessary condition ofthe type Aeff ∈ S(θ) for a set of matrices S(θ) which was our candidate for K(θ, 1− θ;α I, β I), and becausethis set is convex it is easy to approach a function (θ,A) such that A(x) ∈ S

(θ(x)

)a.e. x ∈ Ω by a piecewise

constant function satisfying the same constraint, as after decomposing Ω into small open cubes (plus a set ofmeasure 0), one can replace θ on any such small cube ω by its average θ on ω and replace A by its projectionon S(θ), giving a new function (θ, A) satisfying the same constraint with θ piecewise constant; then on eachsmall cube one approaches A by a piecewise constant function (on much smaller cubes) taking its values inS(θ). One also uses the fact that the Hausdorff distance from S(θ1) to S(θ2) is O(|θ1 − θ1|. However, theconvexity of each S(θ) is not really important, and one can give a precise meaning of (8.34), for the case(8.33) or for other still more general situations, in the following way.

63 Around 1984, I generalized a formula of Joseph KELLER valid for dimension N = 2 [Ke], and I laterlearned from Graeme MILTON that he had also discovered the same result (and he had kindly proposed thatI cosign an article where he was using these formulas; I do not know the precise reference of his article).He had been led to these formulas by studying Hall effect, and if my memory is correct it is a nonzerovalue of a magnetic field which is responsible for the appearance of a nonsymmetric tensor, and this is theonly instance of a non symmetric situation which I have heard about in real problems. Later I tried to usethese formulas with Michel ARTOLA in order to prove a conjecture of Stefano MORTOLA and Sergio STEFFE

[Mo&St2] (Sergei KOZLOV told me in 1993 in Trieste that the conjecture is false, but he did not provideenough information, so that I do not know if he had proved it or if he just thought that it was wrong).

52

Definition 37: For nonnegative real numbers θ1, . . . , θr, with∑i θi = 1, and P ∈ L(RN ; RN ), one says that

P belongs to K(θ1, . . . , θr;M1, . . . ,Mr) if and only if there exist sequences χni of characteristic functions ofdisjoint measurable sets for i = 1, . . . , r, and a sequence of rotations Rn ∈ SO(N) such that (8.33) holds,and moreover such that there exists x∗ ∈ Ω, Lebesgue point of Aeff and of η1, . . . , ηr, with ηi(x∗) = θi fori = 1, . . . , r, and Aeff (x∗) = P .

Of course, because the set of Lebesgue points of any vector valued function is dense, the fact that (8.34)is valid is now merely the statement of Definition 37, but one must show that this definition is consistentwith the intuitive idea of mixing materials. The proof makes use of an obvious fact, that H-convergencecommutes with translations and dilations. Commuting with translations means that if An H-converges toAeff in Ω and if for a ∈ RN one has Bn(x) = An(x+a) a.e. x ∈ Ω−a, then Bn H-converges to Beff in Ω−aand Beff (x) = Aeff (x+a) a.e. in Ω−a; commuting with dilations, or rescaling, means that if for s 6= 0 onehas Cn(x) = An(s x) a.e. x ∈ s−1 Ω, then Cn H-converges to Ceff in s−1 Ω and Ceff (x) = Aeff (s x) a.e.in s−1 Ω. More generally, one has the following result about changing variables in H-convergence, identicalto the formula first proved by Sergio SPAGNOLO in the case of G-convergence.

Lemma 38: If ϕ is a diffeomorphism from Ω onto ϕ(Ω) and An H-converges to Aeff in Ω, and Bn is definedon ϕ(Ω) by

Bn(ϕ(x)

)=

1det(∇ϕ(x)

)∇ϕ(x)An(x)∇ϕT (x) a.e. x ∈ Ω, (8.36)

then Bn H-converges in ϕ(Ω) to Beff , and

Beff(ϕ(x)

)=

1det(∇ϕ(x)

)∇ϕ(x)Aeff (x)∇ϕT (x) a.e. x ∈ Ω. (8.37)

Proof: If −div(An grad(un)

)= f in Ω, one defines vn in ϕ(Ω) by vn(y) = un

(ϕ−1(y)

)a.e. y ∈ ϕ(Ω), or

equivalently un(x) = vn(ϕ(x)

)a.e. x ∈ Ω, so that grad(un)(x) = ∇ϕT (x) grad(vn)

(ϕ(x)

)a.e. x ∈ Ω.

Writing then the equation in variational form∫

Ω

(An grad(un).grad(w)

)dx =

∫Ωf w dx for all w ∈ C1

c (Ω),one finds that −div

(Bn grad(vn)

)= g in ϕ(Ω), with Bn given by (8.36) and g

(ϕ(x)

)= 1

det(∇ϕ(x)

)f(x) a.e.

x ∈ Ω, in the case f ∈ L2(Ω), with straightforward generalization in the case f ∈ H−1(Ω). Then one relatesthe weak limits of grad(vn) and of Bn grad(vn) in ϕ(Ω) to the weak limits of grad(un) and of An grad(vn)in Ω, and one finds that Beff is defined by (8.37).

Lemma 39: For nonnegative real numbers θ1, . . . , θr, with∑i θi = 1, and P ∈ K(θ1, . . . , θr;M1, . . . ,Mr),

there exist sequences χni of characteristic functions of disjoint measurable sets for i = 1, . . . , r, and a sequenceRn ∈ L∞

(Ω;SO(N)

)such that (8.33) holds, with ηi = θi a.e. in Ω for i = 1, . . . , r, and Aeff = P a.e. in Ω.

Proof: One proves the Lemma with Ω replaced by a cube Q centered at 0 and large enough to containΩ; then one restricts the result to Ω, using Proposition 10. By Definition 37 there exists a point x∗ andsequences χni , i = 1, . . . , r, Rn, which are not yet the ones needed in the Lemma, and one obtains the desiredones by translation and rescaling. For an integer k large enough so that x∗ + 1

kQ ⊂ Ω, one defines χn,ki ,i = 1, . . . , r, Rn,k, An,k in Q by χn,ki (x) = χni

(x∗ + x

k

)a.e. x ∈ Q, i = 1, . . . , r, Rn,k(x) = Rn

(x∗ + x

k

),

An,k(x) = An(x∗ + x

k

)a.e. x ∈ Q. By Lemma 38, for k fixed An,k H-converges in Q to Aeff,k, defined

by Aeff,k(x) = Aeff(x∗ + x

k

)a.e. x ∈ Q, and because x∗ is a Lebesgue point of Aeff , Aeff,k converges in

L∞(Q) weak ? and L1(Q) strong to the constant tensor P = Aeff (x∗), and therefore Aeff,k H-converges toP in Q. Similarly, for k fixed χn,ki converges in L∞(Q) weak ? to χ∞,k defined by χ∞,k(x) = ηi

(x∗+ x

k

)a.e.

x ∈ Q for i = 1, . . . , r, and because x∗ is a Lebesgue point of each ηi, χ∞,k converges in L∞(Q) weak ? andL1(Q) strong to the constant function θi = ηi(x∗), for i = 1, . . . , r. Using the metrizability of H-convergencerestricted to M(α, β;Q), when 0 < α ≤ β < ∞ have been choosen so that Mi ∈ M(α, β) for i = 1, . . . , r,and the metrizability of L∞(Q) weak ? convergence on bounded sets, there exists a diagonal subsequenceindexed by n′, k′ such that An

′,k′ H-converges in Q to the constant tensor P , and χn′,k′

i converges in L∞(Q)weak ? to the constant function θi, for i = 1, . . . , r.

Lemma 40: If for i = 1, . . . , r, ηi ∈ L∞(Ω), ηi ≥ 0 a.e. in Ω, with∑i ηi = 1 a.e. in Ω, and P ∈

L∞(Ω;L(RN ; RN )

)with P (x) ∈ K(η1(x), . . . , ηr(x);M1, . . . ,Mr) a.e. x ∈ Ω, then there exists sequences χni

53

of characteristic functions of disjoint measurable sets for i = 1, . . . , r, and a sequence Rn ∈ L∞(Ω;SO(N)

)such that (8.33) holds and Aeff = P in Ω.Proof: Let g ∈ L1(Ω; Rp) and for ε > 0 let ρε be defined by ρε(x) = 1

εN ρ1

(xε

)with ρ1 ∈ L1(RN ), nonnegative

and with compact support; then∫

Ω×RN |g(x) − g(x − y)|ρε(y) dx dy tends to 0 as ε tends to 0 (as it is≤ 2||g||L1 ||ρ1||L1 , it is enough to prove the result for a dense subspace, and for g ∈ Cc(RN ) it is immediateas g is uniformly continuous). For ρ1 the characteristic function of the cube (0, 1)N , and δ = 1

k one canthen choose ε small enough to have

∫|x−z|≤ε |g(x)− g(z)| dx dz ≤ δ εN . Then decomposing RN into disjoint

cubes ωj of size ε√N

(plus a set of measure 0), one chooses for each cube ωj a point zj ∈ ωj such that∫ωj|g(x)− g(zj)| dx ≤ 2

εN

∫ωj×ωj

|g(x)− g(z)| dx dz ≤ 2εN

∫x∈ωj ,|x−z|≤ε |g(x)− g(z)| dx dz, and if gk denotes

the function equal to g(zj) on ωj , then g is piecewise constant and one has ||g − gk||L1(RN ) ≤ 1k .

One applies the preceding analysis to g = (η1, . . . , ηr, P ), and gk = (ηk1 , . . . , ηkr , P

k) and one has P k ∈K(ηk1 , . . . , η

kr ;M1, . . . ,Mr) on each cube ωj . Using Lemma 39, as well as Proposition 10 in order to glue

the pieces together, there exist sequences χn,ki of characteristic functions of disjoint measurable sets fori = 1, . . . , r, and a sequence Rn,k ∈ L∞

(Ω;SO(N)

)such that (8.33) holds, with ηi replaced by ηki for

i = 1, . . . , r, and Aeff replaced by P k. As k tends to∞, ηki converges in L∞(Ω) weak ? and L1(Ω) strong toηk for i = 1, . . . , r, and P k converges in L∞(Ω) weak ? and L1(Ω) strong to P , and therefore P k H-convergesto P in Ω. Using the metrizability of H-convergence restricted to M(α, β;Q), and the metrizability of L∞(Q)weak ? convergence on bounded sets, there exists a diagonal subsequence indexed by n′, k′ such that χn

′,k′

i

converges in L∞(Ω) weak ? to ηi, for i = 1, . . . , r, and An′,k′ H-converges in Ω to P .

Because of the local character of H-convergence expressed by Proposition 10, there is no mention of aparticular open set in Definition 37, which is actually valid without any hypothesis of symmetry and withoutthe use of rotations Rn (depending measurably in x) in (8.33). However, it is precisely the symmetry of Mi,i = 1, . . . , r, and the use of these rotations which implies that (8.34) has the form (8.35), and this is seenby using Lemma 38 with ϕ(x) = Rx with R ∈ SO(N): it shows that if P ∈ K(η1, . . . , ηr;M1, . . . ,Mr) thenRP RT ∈ K(η1, . . . , ηr;M1, . . . ,Mr) for any R ∈ SO(N), and as P is symmetric, all tensors with the sameeigenvalues than P belong to the set K(η1, . . . , ηr;M1, . . . ,Mr), and one deduces (8.35).64

I must say that prior to writing these notes I had not been so interested in writing down the detail ofthe analysis which I just showed, considering that these are uninteresting details which everyone with a goodknowledge of measure theory can check, and that the important questions of Homogenization lie elsewhere.I have heard some mention of a work of Gianni DAL MASO and Robert KOHN which might have addressedthis question in general, or it may have been the answer to the question that Robert KOHN was asking mein 1983, which was related to linearized Elasticity, which is not a frame indifferent theory, as he alreadyknew; his concern was to identify which are the invariant quantities replacing the eigenvalues in the case offourth order tensors Cijkl. I do not see any special difficulty in adapting Definition 37 to this case, but as Iconsider that questions in linearized Elasticity are too often spoiled by unrealistic effects which deprive themathematical results of much of their value, I have preferred not to be involved in such questions.

The set K(η1, . . . , ηr;M1, . . . ,Mr) is not necessarily convex: in dimension N = 2 with only one materialM1 symmetric with distinct eigenvalues α < β, K(1;M1) is the set of symmetric matrices with eigenvaluesλ1, λ2 ∈ [α, β] with λ1 λ2 = αβ. However, some projections of K(η1, . . . , ηr;M1, . . . ,Mr) are convex [Ta14],and this is not dependent on symmetry requirements or use of rotations.

Lemma 41: Consider η1, . . . , ηr, nonnegative numbers adding up to 1, and M1, . . . ,Mr, tensors fromM(α, β); then for every set of N − 1 vectors a1, . . . , aN−1 of RN ,

(P a1, . . . , P aN−1) | P ∈ K(η1, . . . , ηr;M1, . . . ,Mr) is convex. (8.38)

64 The first method for obtaining bounds on effective coefficients, which I developed with Francois MURAT

in the early 70s, and which I have described in chapter 5, is not restricted to symmetric operators, but I donot know what kind of transformations to impose in the nonsymmetric case. I have not read carefully thearticle of Graeme MILTON mentioned in footnote 63 as the only instance of a nonsymmetric situation thatI have heard of in a realistic situation, but knowing that the Hall effect is concerned with electrical currentsin thin ribbons, the macroscopic direction of the current is obviously imposed and therefore the situation isnot subject to frame indifference.

54

MoreoverP1, P2 ∈ K(η1, . . . , ηr;M1, . . . ,Mr) and rank(P2 − P1) ≤ N − 1imply θ P1 + (1− θ)P2 ∈ K(η1, . . . , ηr;M1, . . . ,Mr) for all θ ∈ (0, 1).

(8.39)

Proof: For P1, P2 ∈ K(η1, . . . , ηr;M1, . . . ,Mr) and θ ∈ (0, 1) one exhibits P3 ∈ K(η1, . . . , ηr;M1, . . . ,Mr)such that

P3 ai = θ P1 a

i + (1− θ)P2 ai for i = 1, . . . , N − 1, (8.40)

and this is done by layering materials with tensors P1 and P2, respectively with proportions η and 1 − η,and this obviously creates a tensor belonging to K(η1, . . . , ηr;M1, . . . ,Mr), and then one uses formula (8.37)of Lemma 34. The nonzero vector e orthogonal to the layers must satisfy(

(P2 − P1)ai.e)

= 0 for i = 1, . . . , N − 1, (8.41)

and this is possible because there are only N − 1 vectors ai, but if P2 − P1 does not have full rank, one cantake e orthogonal to the range of P2 − P1, and (8.39) follows.

It means that if one considers the first N − 1 columns of elements from K(η1, . . . , ηr;M1, . . . ,Mr) in abasis a1, . . . , aN , then one has a convex set. If the basis is orthogonal and all Mi are symmetric, then thechoice (8.41) for e in the proof of Lemma 41 means that (P2 − P1)e is proportional to aN and the last termin the formula (8.37) is then proportional to aN ⊗ aN and therefore if one considers the N2 − 1 entries ofelements P of K(η1, . . . , ηr;M1, . . . ,Mr) obtained by deleting the entry PNN , then one has a convex set.Even though K(η, 1− η;α I, β I) is characterized by explicit inequalities (7,25), (7.28), (7.30), I do not knowa simple description of what all these convex sets are in this special case, but if one just wants to identifyone column of elements of K(η1, . . . , ηr;M1, . . . ,Mr), then one has the following result in the symmetric caseusing rotations [Ta14].

Lemma 42: Consider η1, . . . , ηr, nonnegative numbers adding up to 1, and M1, . . . ,Mr, symmetric tensorsfrom M(α, β), with eigenvalues λj(Mi), i = 1, . . . , r, j = 1, . . . , N and N ≥ 2; then for every E ∈ RN ,

(P E | P ∈ K(η1, . . . , ηr;M1, . . . ,Mr) is the closed ball with diameter [bE, aE]

a =r∑i=1

ηi maxjλj(Mi)

1b

=r∑i=1

ηiminjλj(Mi)

.

(8.42)

Proof: Let λ−(Mi) = minjλj(Mi) and λ+(Mi) = maxjλj(Mi). Let e1, . . . , eN be an orthonormalbasis, and rotate the materials in such a way that it becomes a basis of eigenvector for each RTi MiRi,with RTi MiRi e1 = λ−(Mi)e1 and RTi MiRi eN = λ+(Mi)eN ; layering in direction orthogonal to e1 thesematerials RTi MiRi with proportion ηi gives by formula (4.11) a tensor P ∈ K(η1, . . . , ηr;M1, . . . ,Mr)such that P e1 = b e1 and P eN = a eN . If E belongs to the plane spanned by e1 and eN , then P E =b(E.e1)e1 +a(E.eN )eN , so that P E−bE = (a−b)(E.eN )eN and P E−aE = (b−a)(E.e1)e1, and therefore(P E− bE.P E−aE) = 0, showing that P E belongs to the sphere with diameter [bE, aE]. By choosing allpossible orientations, the set of P E contains at least the sphere with diameter [bE, aE], and as by Lemma41, the set that we are looking for is convex, it must contain the closed ball with diameter [bE, aE].

That the desired set is included in the closed ball with diameter [bE, aE] follows from Proposition12 and Proposition 14 (or from Lemma 18). Assume that An =

∑i χ

ni (Rn)T MiR

n H-converges to Aeff

in Ω, where the χni are characteristic functions of disjoint measurable sets with χni converging in L∞(Ω)weak ? to ηi for i = 1, . . . , r. As (Rn)T MiR

n has eigenvalues between λ−(Mi) and λ+(Mi) then one hasBn =

(∑i χ

ni λ−(Mi)

)I ≤ An ≤

(∑i χ

ni λ+(Mi)

)= Cn, and if one extracts a subsequence such that Bm

H-converges to Beff and Cm H-converges to Ceff , then by Proposition 14 one has Beff ≤ Aeff ≤ Ceff .By Proposition 12, an upper bound for Ceff is C+, the limit in L∞

(Ω;L(RN ; RN )

)weak ? of Cn, which by

(8.42) is a I, and a lower bound for Beff is B−, where (B−)−1 is the limit in L∞(Ω;L(RN ; RN )

)weak ? of

(Bn)−1, which by (8.42) is 1b I. One has then b I ≤ Aeff ≤ a I, and therefore (as was shown before Lemma

18), Aeff E belongs to the closed ball with diameter [bE, aE], a.e. in Ω.

55

9. Necessary conditions of optimality: first stepAt a meeting in New York in 1981, I had described the characterization found with Francois MURAT of

all effective materials obtained by mixing two isotropic materials, i.e. (7.25) (Proposition 10, described in[Ta1]) and (7.28), (7.30), obtained by following my second method for obtaining bounds satisfied by effectivecoefficients (Theorem 26, described in [Ta7], and Lemmas 28, 29). It was clear that our program for questionsof Optimal Design initialized in the early 70s was completed, but I did not try immediately to extend thecomputation of necessary conditions that I had done in [Ta2].

During the Spring 1983, I was invited at a Midwest PDE Seminar in Madison, and I heard MichaelRENARDY talk about his work with Daniel JOSEPH on Poiseuille flows of two immiscible fluids.65 For acylinder with arbitrary cross section Ω, there are infinitely many possible Poiseuille flows, but when Ω is adisc one only observes the flow where the less viscous fluid occupies an annular region near the boundary,and they had imagined that it was related to a maximization of dissipation. This situation is describedby (4.1) with f = 1 (a is the viscosity of the fluid, independent of the direction along the pipe, and thegradient of the pressure is constant, pointing in the direction of the pipe), and one wants to maximize∫

Ωa|grad(u)|2 dx =

∫Ωu dx, and therefore it corresponds to g(x, u, a) = −u in (4.3).There is indeed a

classical solution in the case of a circular cross section, which is as described, but as they were conjecturingthe existence of a classical solution in general, I told Michael RENARDY that on the contrary I expectedthat no classical solution would exist in general.66 In order to check about this question, I looked at thenecessary conditions of optimality, as I had done in [Ta2], but using now the characterization that FrancoisMURAT and I had obtained two years before. Surprisingly, I discovered that our precise characterizationcould be ignored completely, and that the same necessary conditions could be deduced from the crudebounds that we had obtained almost ten years earlier.67 Assuming that a classical solution exists, and thatthe interface between the two materials is smooth enough, the necessary conditions of optimality imply thatsome Dirichlet conditions and some Neumann conditions must be satisfied on the interface, which is quiteunlikely in general. On my way back to France I stopped in New York, and discussing about this matterin the lounge at the Courant Institute, the precise argument for rejecting that possibility was mentioned tome by Joel SPRUCK, who reminded me of a result of James SERRIN that I had heard at a conference inJerusalem in 1972 (and he told me about a quicker proof by Hans WEINBERGER); this result assumes thedomain to be simply connected, and shows that among simply connected domains a classical solution onlyexists for circular domains.68

In July 1983, I described completely our method for a Summer Course CEA - EDF - INRIA on Ho-mogenization at Breau sans Nappe, which gave Francois MURAT and I the occasion to write [Mu&Ta1]. Asthis first text had been written in French, we wrote a summary in English for a conference at Isola d’Elba inthe Fall [Mu&Ta2]. In November, I took the occasion of a meeting in Paris dedicated to Ennio DE GIORGI

for writing down the details of the characterization obtained with Francois MURAT in 1980 [Ta9], charac-terization which I had first described at a meeting in New York in 1981 but only quoted in [Mu&Ta1], andin [Ta9] I gave our initial construction with coated ellipsoids generalizing the idea of using coated spheresthat I had learned in the original work of Zvi HASHIN and S. SHTRIKMAN [Ha&Sh], just after it had beenpointed out to me by George PAPANICOLAOU, and for the computations with multiple layerings I gave thegeneralization that I had obtained in the Spring 1983 in Berkeley of the work initially done the year beforeby Philippe BRAIDY and Didier POUILLOUX [Br&Po].

65 In applications, it either meant melted polymers or crude oil and water, the water being added in smallquantity in a pipeline for lubricating it.

66 The idea that turbulent flows are trying to optimize something has been suggested before, as I have readin a book by Daniel JOSEPH [Jo], but it is not clear if fluids are trying to optimize something, and what itcould be; however, this idea is consistent with the Homogenization approach that in order to optimize somecriteria one might have to create adequate microstructures.

67 As I learned later, this point had already been noticed in the late 70s by RAITUM [Ra].68 Of course, when Ω is not a disc this negative result does not tell what configurations will be chosen by a

mixture of fluids in a pipe, as the question of what mixtures of fluids really try to optimize must be studiedfurther. Daniel JOSEPH, Michael RENARDY and Yuriko RENARDY have also studied the stability of thesolution for circular domains, and for such questions of stability one must go back to the full Navier–Stokesequation, of course.

56

In the early 70s, Francois MURAT and I had developed the Homogenization approach in order to describea relaxed problem, and this is a way to prove existence of generalized solutions. For the sake of generality, Idescribe here the more general framework that I developed later in [Ta14], instead of the restricted problemof mixing only two isotropic materials, as we had done in [Mu&Ta1]. For a bounded open set Ω of RN , oneconsiders a mixture of r symmetric possibly anisotropic materials M1, . . . ,Mr ∈ M(α, β), in finite quantityκ1, . . . , κr, assuming that

∑i κi ≥ meas(Ω), of course. Then one defines

A =r∑i=1

χiRT MiR in Ω, (9.1)

where χi, i = 1, . . . , r, are characteristic functions of disjoint measurable sets with union equal to Ω (up toa set of measure 0), and R is measurable and takes values in SO(N), a.e. in Ω, and one assumes that∫

Ω

χi dx ≤ κi for i = 1, . . . , r. (9.2)

Then one solves the state equation

−div(Agrad(u)

)= f in Ω, with u ∈ H1

0 (Ω), (9.3)

with f ∈ H−1(Ω), and I have choosen homogeneous Dirichlet conditions for u as an example. Finally onewants to minimize the cost function J defined by

J(χ1, . . . , χr, R) =∫

Ω

r∑i=1

χi Fi(x, u(x)

)dx, (9.4)

so that there is no cost for rotating materials, and one adds regularity and growth hypotheses on F1, . . . , Fr,in order to have

u 7→ Fi(·, u) is sequentially continuous from H10 (Ω) weak into L1(Ω) strong, for i = 1, . . . , r, (9.5)

which is usually obtained by assuming that each Fi is a Caratheodory function, and that∑i |Fi(x, v)| ≤

µ(x)+C|v|p for all v ∈ R a.e. x ∈ Ω, with µ ∈ L1(Ω) and p < 2NN−2 for N ≥ 2 (but for N = 2 one can assume

instead that∑i |Fi(x, v)| ≤ µ(x) + G(|v|) and allow some exponential growth for G), and for N = 1 one

assumes that∑i sup|v|≤r |Fi(x, v)| ≤ µr(x) with µr ∈ L1(Ω) for all r. In the case where

∑i κi > meas(Ω),

one can also add in J a term of the form G(∫

Ωχ1 dx, . . . ,

∫Ωχr dx).

The set of admissible A is not empty, because∑i κi ≥ meas(Ω), and it is included in some M(α, β; Ω),

so that all the u obtained belong to a bounded set of H10 (Ω). For a sequence An, for example a minimizing

sequence, one can extract a subsequence which H-converges to Aeff , and also such that for i = 1, . . . , r, thesequence χni converges in L∞(Ω) weak ? to a nonnegative function ηi, which satisfies and∫

Ω

ηi dx ≤ κi for i = 1, . . . , r, (9.6)

with∑i ηi = 1 in Ω, and therefore by Definition 37

Aeff ∈ K(η1, . . . , ηr;M1, . . . ,Mr) a.e. in Ω, (9.7)

and also such that un converges in H10 (Ω) weak to u∞, and therefore (9.5) implies

J(χn1 , . . . , χnr , R

n)→∫

Ω

( r∑i=1

ηiFi(x, u∞))dx. (9.8)

By Lemma 40, if r nonnegative functions ηi, i = 1, . . . , r, satisfy∑i ηi = 1 in Ω and (9.6), and P ∈

K(η1, . . . , ηr;M1, . . . ,Mr) a.e. in Ω, there exist sequences χni of characteristic functions of disjoint measurable

57

sets converging in L∞(Ω) weak ? to ηi for i = 1, . . . , r, and An H-converges to P , but there is a small technicalpoint to resolve, because the sequence created in the proof of Lemma 40 might not satisfy (9.2). In thatcase one has

∫Ωχni dx ≤ κi + εn for i = 1, . . . , r, and εn tends to 0 because

∫Ωηi dx ≤ κi for i = 1, . . . , r. For

each i one can change χni on a set of measure at most εn in order to create sequences χni of characteristicfunctions of disjoint measurable sets converging in L∞(Ω) weak ? to ηi for i = 1, . . . , r and satisfying(9.2), and An =

∑i χ

ni (Rn)T MiR

n H-converges to P , and one must show that P = P a.e. in Ω. As theperturbations are not small in L∞ norm, one cannot apply Proposition 16, but one can control the effect ofsmall perturbations of the coefficients in Ls norm with s < ∞ (with the coefficients staying in M(α, β; Ω))by using Meyers’s regularity theorem [Me]; however, one can also construct a more direct proof based onProposition 10 as follows. If for i = 1, . . . , r, one chooses a small cube ωi such that

∫ωiηi dx > 0, and one

chooses n large enough so that∫ωiχni dx ≥ εn for i = 1, . . . , r, then one can manage to make the changes

only inside⋃i ωi, and as An = An in Ω \ (

⋃i ωi), Proposition 10 implies that P = P in Ω \ (

⋃i ωi); the ωi

can be taken as small as one wants (around a density point of ηi), and therefore P = P a.e. in Ω.One has therefore identified a relaxed problem (as defined in the end of chapter 3): the new set of

controls is the set of (η1, . . . , ηr, A) satisfying

0 ≤ ηi a.e. in Ω for i = 1, . . . , r,r∑i=1

ηi = 1 a.e. in Ω∫Ω

ηi dx ≤ κi for i = 1, . . . , r

A ∈ K(η1, . . . , ηr;M1, . . . ,Mr) a.e. in Ω,

(9.9)

the state u is still given by (9.3), and the cost function J to be minimized is given by

J(η1, . . . , ηr, A) =∫

Ω

[ r∑i=1

ηi Fi(x, u(x)

)]dx. (9.10)

The initial problem corresponds to the case where each ηi is a characteristic function.

Lemma 43: The function J given by (9.10) with u defined by (9.3) attains its minimum on the set describedby (9.9).Proof: The topology on the new control set described by (9.9) is the L∞(Ω) weak ? topology for each ηi andthe H-convergence topology for A, and this topology is metrizable. The new control set is the completion ofthe initial control set, and it is compact for this topology (this uses Theorem 6, Lemma 40, and the simpletrick shown above for enforcing (9.2), together with classical results of Functional Analysis for dealing withthe L∞(Ω) weak ? topology). Moreover, J is continuous for that topology. A consequence of all these factsis that the minimum of J is attained.

The questions are rather different for more general functionals depending upon grad(u), as we alreadyknew in the early 70s; I have described some very partial results on this question in [Ta13]. However, theproperty (9.4)/(9.5) is still true for some functionals depending upon grad(u) in a “fake” way. For example,(9.3) implies that

∫Ω

(Agrad(u).grad(u)

)dx =

∫Ωf u dx if f ∈ L2(Ω), so that one should not really say that

the first integral depends upon grad(u)! There are other situations of this kind like the following ones, whereone cannot make grad(u) disappear completely; for simplicity, I do not try to use Meyers’s regularity theorem[Me]. Assume that for some j, one has J(χ1, . . . , χr, R) =

∫Ωg(u) ∂u∂xj

v dx, with g continuous and havinglimited growth |g(u)| ≤ C(1+ |u|p) and p < N

N−2 if N ≥ 2, with v ∈ Lq(Ω) and p(

12 −

1N

)+ 1

2 + 1q ≤ 1, so that

q < ∞ and the integral makes sense by an application of Sobolev imbedding theorem. If v is not regular,one cannot make grad(u) disappear entirely by integration by parts, but one can decompose v as v1 + v2

with v1 small in Lq(Ω) and v2 ∈ C1c (Ω), and one notices that

∫Ωg(u) ∂u∂xj

v2 dx = −∫

ΩG(u) ∂v2∂xj

dx, whereG(s) =

∫ s0g(σ) dσ; because the last integral is sequentially continuous on H1

0 (Ω) weak, one finds that thesame is true for J as a uniform limit of sequentially weakly continuous functionals. One can prove the sameresult differently, and it also applies to the case where one has J(χ1, . . . , χr, R) =

∫Ωg(u)

(Agrad(u)

)jv dx,

58

with the same hypotheses on g and v. Because the sequences considered are such that An grad(un) convergesin L2(Ω; RN ) weak to Aeff grad(u∞), it is enough to show that g(un)v converges in L2(Ω) strong to g(u∞)v,and using another decomposition of v as v3 + v4 with v3 small in Lq(Ω) and v4 ∈ L∞(Ω), it is enough toshow that g(un) converges in L2(Ω) strong to g(u∞), and this follows from Sobolev imbedding theorem andthe fact that the injection of H1

0 (Ω) into L2(Ω) is compact.

The next step is to write necessary conditions of optimality, and therefore one wants to use differen-tiable paths between the candidate to optimality and any other control. One assumes then that each Fi iscontinuously differentiable in u, that ∂Fi(·,u)

∂u is a Caratheodory function, and that some growth condition issatisfied so that

u 7→ ∂Fi(·, u)∂u

is continuous from H10 (Ω) into Ls(Ω) ⊂ H−1(Ω), (9.11)

for example∣∣∂Fi

∂u

∣∣ ≤ g(x) + C|u|q with g ∈ Lr(Ω) and r ≥ 2NN+2 if N ≥ 3 or r > 1 if N = 2 (and r = 1 for

N = 1), and q ≤ N+2N−2 if N ≥ 3 or q < ∞ if N ≤ 2. The condition of optimality will be simplified by using

an adjoint state, defined by

−div(AT grad(p)

)=

r∑i=1

ηi∂Fi(·, u)∂u

in Ω, p ∈ H10 (Ω), (9.12)

and although I am dealing with symmetric A here, I have used AT in (9.12) in order to show what is neededin a nonsymmetric situation. In 1983, I was checking the conditions of optimality for the special case ofmixing two anisotropic materials, using the characterization obtained with Francois MURAT, i.e. (7.25),(7.28) and (7.30), and as these conditions define a convex set of matrices, I could use straight segmentsfor mixtures corresponding to the same local proportions. I discovered that the precise form of (7.28) and(7.30) is not really important, and generalizing to mixing more than two isotropic materials became an easyexercise, but I only understood the general case of mixing anisotropic materials in given proportions whenI observed the property of Lemma 42. For the generalization, which I show directly without starting by theparticular case that I had done first, I will use the notation

B(η1, . . . , ηr;M1, . . . ,Mr) = A | b I ≤ A ≤ a I

1b

=r∑i=1

ηiminjλj(Mi)

a =r∑i=1

maxjλj(Mi),

(9.13)

which defines a convex set B(η1, . . . , ηr;M1, . . . ,Mr) containing K(η1, . . . , ηr;M1, . . . ,Mr), but although theyare different sets, Lemma 42 tells that some of their projections are identical.

Lemma 44: For N ≥ 2, let η∗1 , . . . , η∗r , A

∗ satisfy (9.9), and

J(η∗1 , . . . , η∗r , A

∗) ≤ J(η∗1 , . . . , η∗r , A) for all A ∈ K(η∗1 , . . . , η

∗r ,M1, . . . ,Mr), (9.14)

with J defined by (9.10). Then, if u∗ and p∗ are the corresponding solutions of respectively (9.3) and (9.12),one has∫

Ω

(A∗ grad(u∗).grad(p∗)

)dx ≥

∫Ω

(B grad(u∗).grad(p∗)

)dx for all B ∈ B(η∗1 , . . . , η

∗r ,M1, . . . ,Mr). (9.15)

Proof: For B ∈ B(η∗1 , . . . , η∗r ;M1, . . . ,Mr), and ε ∈ (0, 1), one defines

A(ε) = (1− ε)A∗ + εB = A∗ + ε δA, (9.16)

and A(ε) ∈ B(η∗1 , . . . , η∗r ;M1, . . . ,Mr) because it is convex and contains K(η∗1 , . . . , η

∗r ,M1, . . . ,Mr). As A(ε)

is analytic in ε with values in M(α, β; Ω), the operator Aε = −div(A(ε)grad(·)

)is analytic with values in

59

L(H1

0 (Ω), H−1(Ω))

and invertible by Lax–Milgram lemma, and therefore its inverse A−1ε is analytic with

values in L(H−1(Ω), H1

0 (Ω)), so that the corresponding solution u(ε) is analytic in ε with values in H1

0 (Ω),and its derivative δu at ε = 0 satisfies

u(ε) = u∗ + ε δu+ o(ε) in H10 (Ω)

div(A∗ grad(δu) + δA grad(u∗)

)= 0,

(9.17)

where δA = B − A∗. The function J(η∗1 , . . . , η

∗r , A(ε)

)is therefore differentiable in ε, and its derivative δJ

at ε = 0 satisfiesJ(η∗1 , . . . , η

∗r , A(ε)

)= J(η∗1 , . . . , η

∗r , A

∗) + ε δJ + o(ε)

δJ =∫

Ω

( r∑i=1

η∗i∂Fi(x, u∗)

∂u

)δu dx,

(9.18)

and using the definition (9.12) of the adjoint state p∗ and (9,17), one finds

δJ =∫

Ω

((A∗)T grad(p∗).grad(δu)

)dx =

∫Ω

(A∗ grad(δu).grad(p∗)

)dx

= −∫

Ω

(δA grad(u∗).grad(p∗)

)dx =

∫Ω

((A∗ −B) grad(u∗).grad(p∗)

)dx.

(9.19)

If one had A(ε) ∈ K(η∗1 , . . . , η∗r ,M1, . . . ,Mr) for all ε ∈ (0, 1), one would have J

(η∗1 , . . . , η

∗r , A(ε)

)≥

J(η∗1 , . . . , η∗r , A

∗) by (9.14), and therefore δJ ≥ 0 by (9.18), and that would give (9.15) for the particu-lar choice of B. However, almost everywhere in Ω, A(ε)grad

(u(ε)

)belongs to the closed ball with diameter

[b grad(u(ε)

), a grad

(u(ε)

)], and therefore by Lemma 42

there exists M(ε) ∈ K(η∗1 , . . . , η∗r ,M1, . . . ,Mr) : A(ε)grad

(u(ε)

)= M(ε)grad

(u(ε)

)a.e. in Ω. (9.20)

Because A(ε) and M(ε) create then the same state u(ε), one has

J(η∗1 , . . . , η

∗r , A(ε)

)= J

(η∗1 , . . . , η

∗r ,M(ε)

)≥ J(η∗1 , . . . , η

∗r , A

∗), (9.21)

and therefore the conclusion δJ ≥ 0 is valid.

In the derivation of (9.20), there is a small technical difficulty, because Lemma 42 was given without anydependence upon x ∈ Ω, and one must then check that there is a measurable M(ε) satisfying (9.20) (ε > 0being fixed). Of course, one can make Lemma 42 more precise by constructing an explicit lifting, whichmaps η1, . . . , ηr, B ∈ B(η1, . . . , ηr;M1, . . . ,Mr), e ∈ RN to A ∈ K(η1, . . . , ηr;M1, . . . ,Mr) with Ae = B e,but a natural construction relies on different cases, e being 0 or not, B e being parallel to e or not, B e beingon the sphere with diameter [b(η)e, a(η)e] or not, and so on, and in each case one only checks continuityof the lifting, so that the constructed A is measurable. One may also avoid this question of measurabilityaltogether by restricting one’s attention to specific mixtures obtained by layering in arbitrary directions, andin the end it gives the same condition (because in Lemma 42 the elements on the boundary of the ball arecreated by layerings). One chooses an orthonormal basis e1, . . . , eN , and one orients material #i so that ej isan eigenvector for the eigenvalue λj(Mi), with eigenvalues increasing with j (so that λ1(Mj) = minj λj(Mi)and λN (Mi) = maxj λj(Mi)), and then one uses layers orthogonal to e1, with proportions η∗1 , . . . , η

∗r , and

one obtains a material with tensor

P =N∑j=1

πj ej ⊗ ej ∈ K(η∗1 , . . . , η∗r ;M1, . . . ,Mr)

b = π1 ≤ . . . ≤ πN = a a.e. in Ω.

(9.22)

For a measurable subset ω of Ω one keeps A(ε) = A∗ in Ω\ω, and in ω one defines A(ε) by mixing the materialwith tensor A∗ and the material with tensor P , in layers orthogonal to e and with respective proportions

60

1− ε and ε. Of course, one uses Lemma 40 for showing that such an A(ε) is allowed in (9.14), and formula(8.17) gives

A(ε) = A∗ + ε δA+ o(ε) in ω with δA = P −A∗ − (P −A∗) e⊗ e(P e.e)

(P −A∗), (9.23)

and this different definition of δA corresponds to a different δu to use in (9.17) and (9.18), and the lastequality in (9.19) changes because of the different definition of δA, and by varying ω one deduces that (9.9)and (9.14) imply that for all e 6= 0 one has

(A∗ grad(u∗).grad(p∗)

)≥(P grad(u∗).grad(p∗)

)−((P −A∗)grad(u∗).e

)((P −A∗)grad(p∗).e

)(P e.e)

, (9.24)

a.e. in Ω, and by choosing e 6= 0 orthogonal to (P −A∗)grad(u∗), one finds(A∗ grad(u∗).grad(p∗)

)≥(P grad(u∗).grad(p∗)

)for all P satisfying (9.22), a.e. in Ω. (9.25)

From (9.15) or (9.25) one deduces the following necessary condition of optimality.

Proposition 45: For N ≥ 2, let η∗1 , . . . , η∗r , A

∗ satisfy (9.9) and (9.14), then at almost every point where|grad(u∗)| |grad(p∗)| 6= 0 one has

if grad(p∗) = c grad(u∗) with c > 0, then A∗ grad(u∗) = a grad(u∗), A∗ grad(p∗) = a grad(p∗), (9.26)

if grad(p∗) = c grad(u∗) with c < 0, then A∗ grad(u∗) = b grad(u∗), A∗ grad(p∗) = b grad(p∗), (9.27)

if grad(p∗) is not parallel to grad(u∗) then

A∗ grad(u∗) =a+ b

2grad(u∗) +

a− b2|grad(u∗)||grad(p∗)|

grad(p∗)

A∗ grad(p∗) =a+ b

2grad(p∗) +

a− b2|grad(p∗)||grad(u∗)|

grad(u∗).

(9.28)

Proof: As |grad(u∗)| 6= 0, when one changes the basis e1, . . . , eN , the vector P grad(u∗) spans the spherewith diameter [b grad(u∗), a grad(u∗)], and the quantity

(P grad(u∗).grad(p∗)

)attains its maximum at a

unique point of the sphere; as A∗ grad(u∗) belongs to the closed ball with diameter [b grad(u∗), a grad(u∗)],A∗ grad(u∗) must be equal to the value of P grad(u∗) for the P for which the maximum is attained, and thisgives the formulas for A∗ grad(u∗) in (9.26)/(9.28). The corresponding formulas for A∗ grad(p∗) followby a simple remark from Linear Algebra, applied to A∗ − a+b

2 I, namely that if M is symmetric withnorm ≤ γ and if for a vector E 6= 0 one has M E = F and |F | = γ|E|, then M F = γ2E (because|M F − γ2E|2 = |M F |2 − 2γ2(F.M E) + γ4|E|2 ≤ γ2(γ2|E|2 − |F |2|) = 0).

On the set Ω+ where neither grad(u∗) nor grad(p∗) vanish, one may therefore replace A∗ by P which isobtained by layering the materials with the same proportions η∗1 , . . . , η

∗r , and one has the same cost because

one can keep the same u∗, as A∗ grad(u∗) = P grad(u∗) a.e. in Ω+. On the set Ωu where grad(u∗) = 0, onecan also replace A∗ by any material without changing A∗ grad(u∗) (which remains 0), and in particular onecan replace A∗ by a material obtained by layering the materials with the same proportions. However, thesituation is different for the set Ωp where grad(p∗) = 0 and grad(u∗) 6= 0, but one can adapt an argument ofRAITUM [Ra] for replacing A∗ by materials obtained by layering but with different proportions (if A∗ grad(u∗)is not on the sphere with diameter [b grad(u∗), a grad(u∗)], it may only belong to the boundary of anothersphere obtained by changing b or a, and therefore one needs to change some of the proportions). Let

Ωc =x | x ∈ Ω,

(A∗grad(u∗)− b grad(u∗).A∗grad(u∗)− a grad(u∗)

)< 0, (9.29)

i.e. the set where A∗grad(u∗) cannot be obtained by a layered material so that grad(u∗) 6= 0, and thepreceding analysis shows that on Ωc one must have grad(p∗) = 0, i.e. Ωc ⊂ Ωp. One defines b(η) and a(η)for

η = (η1, . . . , ηr) ∈ L∞(Ωc;Rr) such that ηi ≥ 0 a.e. in Ωc for i = 1, . . . , r,r∑i=1

ηi = 1 a.e. in Ωc, (9.30)

61

by the formulas1b(η)

=r∑i=1

ηiminj λj(Mi)

, a(η) =r∑i=1

ηi maxjλj(Mi), (9.31)

and one imposes the two constraints(A∗grad(u∗)− a(η)grad(u∗).A∗grad(u∗)− b(η)grad(u∗)

)≤ 0 a.e. in Ωc, (9.32)∫

Ωc

ηi dx =∫

Ωc

η∗i dx, i = 1, . . . , r, (9.33)

and one wants to minimize

Jc(η) =∫

Ωc

( r∑i=1

ηi Fi(x, u∗))dx. (9.34)

The set of η satisfying (9.30), (9.32)/(9.33) contains η∗ (as (9.32) is true by Lemma 42). The condition(9.30) defines a convex set, which is L∞(Ωc;Rr) weak ? compact; the constraint (9.32) defines a L∞(Ωc; R2)weak ? closed convex set in

(a, 1

b

)by Lemma 19, and by (9.31) the set of η satisfying (9.32) is therefore

convex, and L∞(Ωc; Rr) weak ? closed, and similarly for the constraint (9.33). As Jc is linear and L∞(Ωc; Rr)weak ? continuous, it attains its minimum on at least one extreme point of the L∞(Ωc; Rr) weak ? compactconvex set defined by (9.30) and (9.32)/(9.33). One shows then that for such an extreme point one musthave equality in (9.32) a.e. in Ωc, by using an argument of Zvi ARTSTEIN [Ar]. Indeed, assume that(A∗grad(u∗) − a(η)grad(u∗).A∗grad(u∗) − b(η)grad(u∗)

)≤ −ε < 0 on a subset K ⊂ Ωc with positive

measure; one may assume that one can find r + 1 disjoint subsets Kk ⊂ K with positive measure and oneach such set Kk one can find two distinct indices i(k), j(k) ∈ 1, . . . , r such that ηi(k), ηj(k) ≥ δk > 0 a.e.in Kk (if it was not true, a.e. in Kk there would only be one ηi different from 0 and thus equal to 1, in whichcase a(η) = b(η) and one could not be in Ωc); for |ck| ≤ δk, one can add ck to ηi(k) in Kk and subtract ckto ηj(k) in Kk and still satisfy (9.30), and by restricting a little more |ck| one can still satisfy (9.32); thenone has r+ 1 small arbitrary constants that one can play with and only r linear constraints (9.33) that theymust satisfy, and it leaves the possibility to move both ways in at least one direction while satisfying all theconstraints, contradicting the assumption that one has choosen an extreme point. Therefore one has foundan optimal solution of the initial problem which can be realized by layerings inside Ωc.

At least one solution of the optimization problem corresponds then to a material obtained by layerings(with varying proportions and directions), and therefore at the end the Homogenization questions seem todisappear from the analysis, and one may wonder if it was really necessary to study Homogenization in thefirst place.

There is an analogous question about Functional Analysis. One wants to show the existence of acharacteristic function χ ∈ L∞(Ω) which minimizes

∫Ωχf dx, where f is given in L1(Ω) and where the

minimization is only considered for characteristic functions of measurable subsets of Ω satisfying a finitenumber of constraints

∫Ωχ gi dx = αi for i = 1, . . . ,m, where g1, . . . , gm are given in L1(Ω); of course,

one assumes that at least one such χ exists. The proof of existence that I know was taught to me by ZviARTSTEIN in 1975 [Ar],69 and it consists in minimizing L(θ) =

∫Ωθ f dx among functions θ ∈ C = θ |

θ ∈ L∞(Ω), 0 ≤ θ ≤ 1 a.e. in Ω,∫

Ωθ gi dx = αi for i = 1, . . . ,m; as C is a nonempty compact set for the

L∞(Ω) weak ? topology (which is metrizable when restricted to C), and L is continuous for this topology,the minimum of L is attained, and because C is convex and L is affine, the minimum is actually attained ona (nonempty) convex compact subset C0 of C, and C0 has at least one extreme point θ0 by Krein–Milmantheorem. Finally, Zvi ARTSTEIN argues that θ0 belongs to a finite dimensional face of the bigger convex setC1 = θ | θ ∈ L∞(Ω), 0 ≤ θ ≤ 1 a.e. in Ω, and using the fact that the Lebesgue measure has no atoms, heshows that θ0 must be a characteristic function, which is therefore optimal among characteristic functionsfrom C.

In both situations, one wants to minimize on a set X some function Fi, belonging to a family of suchfunctions indexed by i ∈ I, and one has constructed a compact space X containing X as a dense subspace,

69 The following argument is precisely Zvi ARTSTEIN’s proof of Lyapunov’s theorem.

62

and each Fi becomes the restriction to X of a function Fi which is continuous on X. Of course, each functionFi attains its minimum on X, but one exhibits a subset Y such that X ⊂ Y ⊂ X (and equal to X in thesecond case), which has the property that for each i ∈ I the set of minima of Fi intersects Y . Of course, asX is dense in X, Y is also dense in X, and therefore not compact if Y 6= X. It is not clear if there existsanother topology for which Y is compact with X dense in Y and each Fi is the restriction to X of a functionGi which is lower semi-continuous on Y .

I do not know any obvious abstract framework which explains how to avoid mentioning FunctionalAnalysis or Homogenization in these examples, and although it could be useful to develop special results forthe growing number of students who hope to use only elementary tools in Mathematics, it seems impossible tomaintain such a goal if one is interested in practical problems. It is indeed often emphasized by those who havedealt with practical problems of Optimization, that one is rarely given a very precise function to minimize,and that one must reassert the goal in view of preliminary results. For what concerns Homogenizationappearing in the problems that I have been considering, an obvious remark is that in the realization of suchan optimal solution including mixtures in some areas, one needs to reconsider the function to be minimizedby adding the cost of creating these mixtures.70

Actually, as was pointed out later by Robert KOHN, there is another natural class of functionals wherethe optimal solution seems to require more precise information on optimal bounds, and more general designsthan simple layerings. In identification problems, one does a few experiments with the same arrangement ofmaterials, and one tries to estimate some coefficient by using a finite number of measurements.As a simplemodel of this kind, one considers q equations of the form

−div(Agrad(ui)

)= fi in Ω, ui ∈ H1

0 (Ω), for i = 1, . . . , q, (9.35)

and all these equations use the same A ∈ M(α, β; Ω) but different functions fi ∈ H−1(Ω), i = 1, . . . , q, andq functions vi ∈ H1

0 (Ω) (or simply vi ∈ L2(Ω)) are given for i = 1, . . . , q, corresponding to measurements ina material using an unknown value A0 that one wants to identify.71 One idea would be to minimize a costfunction J of the form

J(A) =∫

Ω

( q∑i=1

|ui − vi|2)dx, (9.36)

and if one knows that A is one of the materials M1, . . . ,Mr rotated in an arbitrary way (a.e. in Ω), one getsa problem where the Homogenization approach will appear, but if one considers functionals of the form

J1(A) =∫

Ω

( q∑i=1

|grad(ui − vi)|2)dx, (9.37)

the situation is quite different [Ta13]. In the case of (9.36), minimizing sequences will make some Aeff ∈K(η1, . . . , ηr;M1, . . . ,Mr) appear, and there exists then an optimal homogenized A, and one may wonder

70 It is unfortunately often the case that some people dealing with questions in Elasticity not only forgetto mention the defects of Linearized Elasticity but only consider extremely particular functionals. Onthe contrary, on the engineering side, as was emphasized by Martin BENDSOE at a meeting in Trieste in1993, it is important to have an interactive point of view when dealing with Optimal Design problems; onerarely needs to compute with high precision which mixtures will appear in connection with minimizing aparticular functional, because one may well abandon this particular functional during the interactive partof the procedure, and one may end up minimizing something different, in principle more adapted to theengineering application.

71 Practical problems may include elecrical or heat conduction questions or permeability of oil reservoirs,and they do not correspond to homogeneous Dirichlet data. Instead of using different fi one may use differentnonhomogeneous Dirichlet data on some parts of the boundary and Neumann data on other parts of theboundary, and the different measurements may give the value of ui or the flux (Agrad(ui).n) on various otherparts of the boundary. Another important class of problems arising in applications deals with eigenvalues,in which case one may measure a few of the lowest eigenvalues corresponding to some unknown A0, whichone tries to identify from these measurements. It is not difficult however to adapt the methods described inthis course (for some academic models) in order to deal with these more realistic variants.

63

what its relation with the actual A0 is, for example if in the case where Mi = µi I for i = 1, . . . , r andtherefore A0 is locally an isotropic material, but one finds that all the optimal A are anisotropic in someregions. Of course, if the measured values were exactly those associated with A0 when one uses (9.35), thenone would have J(A0) = 0, but measurements are not always entirely accurate and this explains why, evenwhen A0 only takes isotropic values, it may do so on relatively small pieces, and A0 may well be near ananisotropic A in a distance corresponding to H-convergence (mentioned after Definition 5). If one decidesthen to minimize (9.36) and one wants to write necessary conditions of optimality, an increase δA createsincreases δui, i = 1, . . . , q, solutions of

−div(Agrad(δui) + δA grad(ui)

)= 0 in Ω, δui ∈ H1

0 (Ω), i = 1, . . . , q, (9.38)

and an increase δJ given by

δJ =∫

Ω

2( q∑i=1

(ui − vi)δui)dx, (9.39)

and in order to express δJ in terms of δA, one introduces q adjoint states p1, . . . , pq, solutions of

−div(Agrad(pi)

)= ui − vi in Ω, pi ∈ H1

0 (Ω), i = 1, . . . , q, (9.40)

and the necessary condition of optimality becomes

δJ = −2∫

Ω

[ q∑i=1

(δA grad(ui).grad(pi)

)]dx ≥ 0 for admissible δA. (9.41)

In the first step of the obtention of necessary conditions that I have described, i.e. keeping the proportionsη∗1 , . . . , η

∗r fixed, one sees that it would be useful to characterize the set of

(Agrad(u1), . . . , A grad(uq)

)for

A ∈ K(η1, . . . , ηr;M1, . . . ,Mr), but for q ≥ 2 the analogue of Lemma 42 is not known. Even for the case ofmixing two isotropic materials, i.e. M1 = α I,M2 = β I, for which I had obtained the characterization (7.25),(7.28), (7.30) of K(θ, 1− θ;α I, β I) with Francois MURAT in 1980, I do not know a simple characterizationof this set, although Lemma 41 asserts that it is convex if q ≤ N − 1. It was for this case that Robert KOHN

had pointed out that the full characterization seems necessary, but I do not know if the necessary condition(9.41) has given rise to a precise analysis of what optimal solutions look like.

10. Necessary conditions of optimality: second stepWe can now look at the second step of the derivation of necessary conditions of optimality, which

consists in varying η1, . . . , ηr, and is therefore more classical. I begin by treating two very special examples,corresponding to p = u or to p = −u, where one can avoid Homogenization almost completely. The specialexamples correspond to minimizing either the cost function J1 or the cost function J2 defined by

J1(χ1, . . . , χr, R) =∫

Ω

f u dx

J2(χ1, . . . , χr, R) = −∫

Ω

f u dx,

(10.1)

and it is important that in (10.1) one uses the same function f which appears in (9.3). The reason why thesetwo functionals are special is that the definition (9.12) of the adjoint state in the case of minimizing J1 givesp = u, and therefore A∗ grad(u∗) = a(η∗)grad(u∗) by (9.26) for N ≥ 2, while in the case of minimizing J2

it gives p = −u, and therefore A∗ grad(u∗) = b(η∗)grad(u∗) by (9.27) (which is valid for N ≥ 1, and alsofor minimizing J1 in the case N = 1), where the functions a and b are defined by (9.13), repeated as (9.31).Another important fact for the argument is that, because A is symmetric a.e. in Ω, solving (9.3) amountsto a minimization problem, namely

Minv∈H10 (Ω)

∫Ω

[(Agrad(v).grad(v)

)− 2f v

]dx is attained for v = u defined by (9.3),

and the value of the minimum is −∫

Ω

f u dx.

(10.2)

64

Lemma 46: The minimization of J1 is equivalent to the following min-max problem

MaxηMinv∈H10 (Ω)

∫Ω

(a(η)|grad(v)|2 − 2f v

)dx if N ≥ 2, (10.3)

MaxηMinv∈H10 (Ω)

∫Ω

(b(η)|grad(v)|2 − 2f v

)dx if N = 1, (10.4)

the set of η on which one maximizes is given by (9.6), repeated in (9.9), and in both cases u∗ is defined in aunique way.Proof: As already mentioned, for any optimal solution u∗ one has p∗ = u∗, and therefore one does notneed the argument of RAITUM described after Proposition 45. For N ≥ 2 one has (9.26) at points wheregrad(u∗) 6= 0, and as it holds automatically at points where grad(u∗) = 0, the formula (10.2) gives (10.3).For N = 1 one automatically has Aeff = b(η)I and therefore (9.27) holds, and the formula (10.2) gives(10.4).

That the problems in (10.3) or (10.4) have at least one solution follows from classical min-max theorems,because by using the information A ∈ M(α, β; Ω) one may restrict the minimization in v to a large enoughclosed ball of H1

0 (Ω), which is a compact convex set for the weak topology of H10 (Ω), the set of η is a compact

convex set for the L∞(Ω; Rr) weak ? topology, and the functional is convex lower semi-continuous in v andconcave upper semi-continuous in η (it is actually linear continuous in the case N ≥ 2). Moreover, becausethe functional is strictly convex in v, the solution u∗ is unique.

Of course, if without knowing anything about Homogenization one has guessed that the problem (10.3)is a way to solve the initial problem of minimizing J1, one still has to use Homogenization in order tointerpret what a solution η∗ means, and that it can be created by layerings. Jean CEA and K. MALANOWSKI

had unknowingly taken advantage of a similar miracle in [Ce&Ma], but they had started from acceptingall possible γ I with α ≤ γ ≤ β, and they did not even have to explain by Homogenization the (classical)solution that they had obtained.

Lemma 47: The minimization of J2 is equivalent to the following minimization problem

MinηMinv∈H10 (Ω)

∫Ω

(b(η)|grad(v)|2 − 2f v

)dx, (10.5)

and the set of η on which one minimizes is given by (9.6), repeated in (9.9), and b(η∗)grad(u∗) is defined ina unique way.Proof: As already mentioned, for any optimal solution u∗ one has p∗ = −u∗, and therefore one does notneed the argument of RAITUM described after Proposition 45. For N ≥ 2 one has (9.27) at points wheregrad(u∗) 6= 0, and as it holds automatically at points where grad(u∗) = 0, the formula (10.2) gives (10.5).

That the problem in (10.5) has a solution follows from classical theorems in Optimization, because byusing the information A ∈ M(α, β; Ω) one may restrict the minimization in v to a large enough closed ballof H1

0 (Ω), which is a compact convex set for the weak topology of H10 (Ω), the set of η is a compact convex

set for the L∞(Ω; Rr) weak ? topology, and the functional is convex in (v, η) and lower semi-continuous byLemma 11 with M = b(η) I; moreover, as P = M−1 in formula (5.6), one sees that the functional is strictlyconvex in b(η)grad(v) and therefore b(η∗)grad(u∗) is unique.

Of course, if without knowing anything about Homogenization one has guessed that the problem (10.5)is a way to solve the initial problem of minimizing J2, one still has to use Homogenization in order to interpretwhat a solution η∗ means, and that it can be created by layerings.

These two examples are not always instances of an intermediate problem in the abstract setting men-tioned after (9.34), and they do not always fit the framework of relaxed problems described in chapter 3either, because the set X for which the minimization of J1 or J2 is considered (which is the set of charac-teristic functions χ1, . . . , χr of disjoints sets satisfying (9.2), together with a measurable rotation R so thatA is defined by (9.1)) is not always included in the set Y on which (10.3) or (10.4) are considered (which isthe set of η1, . . . , ηr satisfying (9.6), and A is a(η)I for minimizing J1 for N ≥ 2, or b(η)I for minimizing J1

for N = 1 or for minimizing J2). X can only be considered as a subset of Y if one can avoid the use of the

65

rotation R, i.e. in the case where Mi = µi I for i = 1, . . . , r; however, even in this case, the problem is not acompletion, except for N = 1, where there is a formula for Aeff , as Aeff = b(η).

Coming back to the general case, one wants now to write that the optimal mixture using the proportionsη∗1 , . . . , η

∗r (with A∗ corresponding to a layered material) is better than any other mixture using different

proportions η1, . . . , ηr, and for this task one follows the arguments used in Lemma 44 (or the variant beforeProposition 45), but one starts now from A ∈ K(η1, . . . , ηr;M1, . . . ,Mr), with proportions η1, . . . , ηr satis-fying (9.9). By mixing then A with A∗ in an adapted way one will create an admissible differentiable pathη(ε) = η∗+ε δη+o(ε), A(ε) = A∗+ε δA+o(ε), u(ε) = u∗+ε δu+o(ε), J

(η(ε), A(ε)

)= J(η∗, A∗)+εδJ+o(ε),

with δu and δA still related by (9.17), but with δJ given now by

δJ =∫

Ω

[ r∑i=1

δηi Fi(x, u∗) +( r∑i=1

η∗i∂Fi(x, u∗)

∂u

)δu]dx, (10.6)

and using the definition (9.12) of the adjoint state p∗ and (9.17), one finds

δJ =∫

Ω

[ r∑i=1

δηi Fi(x, u∗)−(δA grad(u∗).grad(p∗)

)]dx. (10.7)

Proposition 48: Let η∗1 , . . . , η∗r , A

∗ satisfy (9.9) and

J(η∗1 , . . . , η∗r , A

∗) ≤ J(η1, . . . , ηr, A) for all η1, . . . , ηr, A satisfying (9.9), (10.8)

with J defined by (9.10). For N ≥ 2, (10.8) implies∫Ω

[ r∑i=1

η∗i Fi(x, u∗)−

(A∗ grad(u∗).grad(p∗)

)]dx ≤

∫Ω

[ r∑i=1

ηi Fi(x, u∗)−(B grad(u∗).grad(p∗)

)]dx

for all η1, . . . , ηr satisfying (9.9), B ∈ B(η1, . . . , ηr;M1, . . . ,Mr),(10.9)

with u∗, p∗,B(η1, . . . , ηr;M1, . . . ,Mr) defined by (9.3) (using A∗), (9.12) (using A∗), and (9.13). For N = 1,one has a∗ = b(η∗) and (10.8) implies∫

Ω

[ r∑i=1

(Fi(x.u∗) +

(a∗)2

λ(Mi)d u∗

dx

d p∗

dx

)η∗i

]dx ≤

∫Ω

[ r∑i=1

(Fi(x.u∗) +

(a∗)2

λ(Mi)d u∗

dx

d p∗

dx

)ηi

]dx

for all η1, . . . , ηr satisfying (9.9).

(10.10)

Proof: Let η1, . . . , ηr satisfy (9.9), and B ∈ B(η1, . . . , ηr;M1, . . . ,Mr). For ε ∈ (0, 1), one defines η(ε) by

ηi(ε) = (1− ε)η∗i + ε ηi, (10.11)

so that η(ε) satisfies (9.9). Using Lemma 42 (and preceding remarks for constructing measurable liftings),there exists A such that

Agrad(u∗) = B grad(u∗), A ∈ K(η1, . . . , ηr;M1, . . . ,Mr), a.e. x ∈ Ω. (10.12)

For a nonvanishing e ∈ L∞(Ω; RN ) one defines A(ε) by layering A∗ and A with respective proportions 1− εand ε, in layers orthogonal to e, and Lemma 34 gives

A(ε) = (1− ε)A∗ + εA− ε(1− ε)(A−A∗) e⊗ eε(A∗ e.e) + (1− ε)(Ae.e)

(A−A∗), (10.13)

and by construction one has A(ε) ∈ K(η1(ε), . . . , ηr(ε);M1, . . . ,Mr) a.e. in Ω. Of course, (10.8) implies thatδJ ≥ 0, for which one uses (10.7); the value of δηi following from (10.11) is δηi = ηi − η∗i , and the value

66

of δA following from (10.13) is δA = A − A∗ − (A − A∗) e⊗e(Ae.e) (A − A∗), so that

(δA grad(u∗).grad(p∗)

)=(

(A−A∗)grad(u∗).grad(p∗))− 1

(Ae.e)

((A−A∗)grad(u∗).e

)((A−A∗)e.grad(p∗)

), and by choosing e orthogonal

to (A − A∗)grad(u∗), one has(δA grad(u∗).grad(p∗)

)=((A − A∗)grad(u∗).grad(p∗)

), which by (10.12) is(

(B −A∗)grad(u∗).grad(p∗)), and δJ ≥ 0 is then exactly (10.9).

In the case N = 1, A ∈ K(η1, . . . , ηr;M1, . . . ,Mr) means A = b(η), with b defined in (9.13), andtherefore δA =

∑i∂b∂ηi

δηi; (9.13) implies that ∂b∂ηi

= − b2

λ(Mi)(where λ(Mi) is the value of the coefficient for

the material Mi), and therefore one has

δJ =∫

Ω

[ r∑i=1

(Fi(x.u∗) +

(a∗)2

λ(Mi)d u∗

dx

d p∗

dx

)δηi

]dx, (10.14)

from which one immediately deduces (10.10).

Inequality (10.9) consists in minimizing a linear functional on a convex set, and it can be further simpli-fied by noticing that B ∈ B(η1, . . . , ηr;M1, . . . ,Mr) only enters (10.9) through D = B grad(u∗), and when Bruns through B(η1, . . . , ηr;M1, . . . ,Mr), D spans the closed ball of diameter [b(η) grad(u∗), a(η) grad(u∗)],and therefore (10.9) is equivalent to∫

Ω

[ r∑i=1

η∗i Fi(x, u∗)−

(A∗ grad(u∗).grad(p∗)

)]dx ≤

∫Ω

[ r∑i=1

ηi Fi(x, u∗)−(D.grad(p∗)

)]dx

for all η1, . . . , ηr satisfying (9.9),(D − b(η) grad(u∗).D − a(η) grad(u∗)

)≤ 0,

(10.15)

and the convexity of the set of η1, . . . , ηr, D used in (10.15) follows from Lemma 19, and the convexity ofthe functions a and 1

b , consequence of their definition in (9.13).One can go further in the analysis of the necessary conditions obtained, (10.9) or (10.15) for the case

N ≥ 2, or (10.10) for the case N = 1, as they consist in minimizing linear functionals on convex sets definedby linear constraints, and Lagrange multipliers can then be used for making these conditions more precise,but I will not describe this question here as it is a more classical subject.

11. ConclusionI have now completed the description of the particular subject of this course, which was to show how

questions of Homogenization appear in Optimal Design problems, following the work which I had pioneeredin the early 70s with Francois MURAT.

It is time now that I describe the intuitive ideas behind the necessary conditions of optimality obtainedby Konstantin LUR’IE, as they were explained to me in the early 80s by Jean-Louis ARMAND, after hehad visited LUR’IE in Leningrad, where he had been told about my work [Ta2]. As I do not read much, Ido not know where these ideas have appeared in print,72 and although the idea is quite natural, it seemshard to transform into sound mathematical estimates.73 Assume that we consider a problem involving two(isotropic) materials, with imposed global proportions, and that we want to test the optimality of a givenclassical design, with a smooth interface between the two materials; the classical idea, which goes back toHADAMARD,74 consists in pushing the interface along its normal of variable amounts (with one constraint

72 Having learned about some ideas of LUR’IE, I do not need a published reference in order to attributethese ideas to him (and I could not remember of anyone else claiming them as his/hers, although I havenot checked the work of Richard DUFFIN, as I mention in footnote 75). I may be alone in thinking thatif a new idea is only mentioned orally by someone who does not put it in print immediately it should beattributed to this person, eventually with the names of those who would have found the same idea later butindependently, but not with the names of those who had heard about the idea and had put it in print undertheir name, expecting to acquire fame for an idea which was not theirs.

73 I wonder then if LUR’IE had been able to carry out these computations in a mathematician’s way, or ifhe had just acquired convincing evidence that some formulas must hold, as nonmathematicians often do.

74 I have already mentioned the precise analysis along this line of thought, carried out by Francois MURAT

and Jacques SIMON in [Mu&Si].

67

related to the global proportion imposed), and computing the change in the cost function leads to a necessarycondition of optimality valid along the interface; LUR’IE’s first idea was to work away from the interface,taking a small sphere imbedded in one material and an identical one imbedded in the other material andexchanging their content, and computing the change in the cost function leads to a necessary condition ofoptimality valid everywhere;75 LUR’IE’s second idea was to consider ellipsoids of the same volume insteadof spheres, and his new necessary conditions of optimality were stronger; as he could now play with theorientation of the ellipsoids and the ratio of the axes, he realized that it was better to take them veryslender, and in the limit he could understand that layered materials were important for his problem. AsI have mentioned earlier, this was a quite good extension of the ideas of PONTRYAGUIN to a setting ofpartial differential equations, but not quite the right way to discover the analysis which I had performedwith Francois MURAT, which was in some way a good extension of the better ideas of Laurence C. YOUNG.

The description of the method developed in this course, which corresponds in part to results which I hadobtained with Francois MURAT in the 70s, is analogous to what I had already taught in 1983 at the CEA -EDF - INRIA Summer course at Breau-sans-Nappe, written in [Mu&Ta1], and similar to what I had taughtagain in 1986 in Durham [Ta11]. I have included here much more of the basic results on Homogenization,which I had only alluded to before, and actually the description of the original method which I had followedwith Francois MURAT in the early 70s had never appeared in print before, the reason being that it had beengreatly simplified by my method of oscillating test functions which extends easily to all (linear) variationalformulations; because I had decided to use a chronological point of view in this course, in order to show hownew ideas had appeared, it was natural that I should describe first our original ideas, even though I hadimproved them later. I have added a few simplifications, which I had first written in 1995 in [Ta14], andwhich were therefore not included in my previous courses on the subject.

As I have mentioned earlier, some people have led a campaign of misattribution of my ideas which seemsto have intensified around 1983. I may have inadvertently added to the confusion by forgetting to mentionthe reference [Ta7] of my second method for obtaining bounds on effective coefficients,76 and I did not thinkthat it had any importance because I could not imagine that there were people ready to steal an idea thatthey would have heard if they thought that it had not been written yet.77 Unfortunately, the confusion mayalso have increased because Graeme MILTON called my method the “translation method”,78 and many haveused the possibility of quoting my method by this new name, without attributing it correctly.

I have been told that Alexandre GROTHENDIECK has analyzed in an unpublished book a few ways inwhich misattribution of ideas is organized,79 I have not read it and I cannot assert that my observationson this unfortunate aspect of academic behaviour coincide or not with his own. This “book” had first beenmentioned to me in 1984 by Jean LERAY, who had pointed out that it was a good sign that my ideas werestolen, and that many who steal others ideas would probably like that some of their ideas be stolen too, as itwould prove that they had had some of their own; Jean LERAY also had to face such an adverse behaviour,but if some of his ideas had been “borrowed” by a famous mathematician who had shown enough creativity

75 In the late 80s, my late colleague Richard DUFFIN had mentioned to me that he had worked on questionsof Optimal Design, and he used to call such necessary conditions a “principle of democracy”. Unfortunately,as I knew about such questions, I failed to enquire about his precise results, so that I do not know if hisconditions were of the Hadamard type, or the Lur’ie type, and I do not know when he had first obtainedsuch results.

76 The reason for not giving the reference of that 1977 conference in Versailles was that the organizershad forgotten to send me a copy of the proceedings, and therefore I did not know the exact reference of myarticle.

77 It had not been my intention to hide the existence of a written reference in order to confront later thosewho were ready to steal my ideas.

78 I do not find that name so well adapted to what my method is about. On the other hand, when I wasa student (in Paris in the late 60s), this precise term was used for describing a method of Louis NIRENBERG

for proving regularity of solutions of elliptic equations in smooth domains.79 Laurent SCHWARTZ told me recently that the publication of GROTHENDIECK’s book, “Recoltes et

Semailles”, was not possible because of the numerous personal attacks that it contains. It is neverthelessavailable on the Internet, in Russian translation!

68

of his own, for political reasons which were not too dissimilar to those which I had encountered myself morethan thirty years later, I have not found myself as fortunate and many who use my ideas without saying itpresent such a distorted view that one does not have to be a very good student in Mathematics for performingthe small detective work of identifying those who have stolen ideas that they do not even understand wellenough.

This being said, I must say that I think that the worst sin of a teacher is to induce students in error,and I do consider it actually a minor sin to forget to name the inventor of an idea,80 but a major sin to give abad explanation of what an idea is, or to forget mentioning an important idea on a subject. In consequence,if someone would feel such a pressure for avoiding to mention the author of one of the ideas that I havedescribed in this course, it would be better if he/she would start by learning well the content of this course,and then teach an improved version of it.

12. AcknowledgementsI want to thank Arrigo CELLINA and Antonio ORNELAS for their invitation to teach in the CIME / CIM

Summer course on Optimal Design, held in Troia in June 1998. The first meeting that I ever attended wasa CIME course in Varenna in 1970 (where my advisor, Jacques-Louis LIONS, was one of the main speakers),and I had enjoyed very much the working atmosphere of that course, which was also the occasion of my firstvisit to Italy. I had not been able to attend another CIME course since, and as I had not been able to accepta previous invitation to teach in such a course, it was a great pleasure to lecture in a CIME course, and asurprise that such a course would actually be held in Portugal.

My research is supported by Carnegie Mellon University, and the National Science Foundation (grantDMS-97.04762, and through a grant to the Center for Nonlinear Analysis). It is a pleasure to acknowledgethe support of the Max Planck Institute for Mathematics in the Sciences in Leipzig for my sabbatical year1997-98.

13. References[Ar] ARTSTEIN Z., “Look for the extreme points,” SIAM Review 22 (1980), 172–185.[Ba] BABUSKA I., “Homogenization and its application. Mathematical and computational problems,” Nu-merical Solution of Partial Differential Equations-III, (SYNSPADE 1975, College Park MD, May 1975),Academic Press, New York, 1976, 89–116.[Be&Li&Pa] BENSOUSSAN A. & LIONS J.-L. & PAPANICOLAOU G., Asymptotic Analysis for Periodic Struc-tures, Studies in Mathematics and its Applications 5. North-Holland, Amsterdam 1978.[Be] BERGMAN D., “Bulk physical properties of composite media,” Homogenization methods: theory andapplications in physics (Breau-sans-Nappe, 1983), 1–128, Collect. Dir. Etudes Rech. Elec. France, 57,Eyrolles, Paris, 1985.[Br&Po] BRAIDY P. & POUILLOUX D., Memoire d’Option, Ecole Polytechnique, 1982, unpublished.[Ce&Ha] Optimization of distributed parameter structures, Vol. I, II, J. Cea & E. J. Haug eds. Proceedingsof the NATO Advanced Study Institute on Optimization of Distributed Parameter Structural Systems, IowaCity, May 20-June 4, 1980. NATO advanced Study Institute Series E: Applied Sciences, 49, 50, MartinusNijhoff Publishers, The Hague, 1981.[Ce&Ma] CEA J. & MALANOWSKI K., “An example of a max-min problem in partial differential equations,”SIAM J. Control 8 1970, 305–316.[Ch] CHENAIS D., “Un ensemble de varietes a bord Lipschitziennes dans RN : compacite, theoreme deprolongements dans certains espaces de Sobolev,” C. R. Acad. Sci. Paris Ser. A-B 280 (1975), Aiii,A1145–A1147.[DG&Sp] DE GIORGI E. & SPAGNOLO S., “Sulla convergenza degli integrali dell’energia per operatori elliticidel 2 ordine,” Boll. Un. Mat. Ital. (4) 8 (1973), 391–411.[Fr&Mu] FRANCFORT G. & MURAT F., “Homogenization and Optimal Bounds in Linear Elasticity”, Arch.Rational Mech. Anal., 94, 1986, 307–334.

80 My religious upbringing forbids me to steal, but my memory is not perfect, and I may have forgotten toquote some authors for their ideas. If I realized that I had made such a mistake, either by being told aboutit or by finding it myself, I would certainly try to give a corrected statement on the next occasion where Iwould write on the subject (and I hope that a second lapse of memory would not occur at that time).

69

[Gu] GUTIERREZ S., “Laminations in Linearized Elasticity and a Lusin type Theorem for Sobolev Spaces,”PhD thesis, Carnegie Mellon University, May 1997.[Ha&Sh] HASHIN Z. & SHTRIKMAN S., “A variational approach to the theory of effective magnetic perme-ability of multiphase materials”, J. Applied Phys. 33, (1962) 3125–3131.[Jo] JOSEPH D., Stability of fluid motions I, II, Springer Tracts in Natural Philosophy, Vol. 27, 28, Springer-Verlag, Berlin-New York, 1976.[Ke] KELLER J., “A theorem on the conductivity of a composite medium,” J. Mathematical Phys. 5 (1964)548–549.[Li1] LIONS J.-L., Controle optimal de systemes gouvernes par des equations aux derivees partielles, Dunod- Gauthier-Villars, Paris, 1968.[Li2] LIONS J.-L. , Quelques methodes de resolution des problemes aux limites non lineaires, Dunod; Gauthier-Villars, Paris, 1969.[Li3] LIONS J.-L., “Asymptotic behaviour of solutions of variational inequalities with highly oscillating co-efficients,” Applications of methods of functional analysis to problems in mechanics (Joint Sympos., IU-TAM/IMU, Marseille, 1975) , pp. 30–55. Lecture Notes in Math., 503. Springer, Berlin, 1976.[Lu] LUR’IE K. A., “On the optimal distribution of the resistivity tensor of the working substance in amagnetohydrodynamic channel,” J. Appl. Mech. (PMM) 34 (1970), 255–274.[Ma&Sp] MARINO A. & SPAGNOLO S., “Un tipo di approssimazione dell’operator

∑ij Di

(aijDj

)con oper-

atori∑j Dj(bDj),” Ann. Scuola Norm. Sup. Pisa Cl. Sci. (3) 23 (1969), 657–673.

[MC] MCCONNELL W., “On the approximation of elliptic operators with discontinuous coefficients,” Ann.Scuola Norm. Sup. Pisa Cl. Sci. (4) 3 (1976), no. 1, 123–137.[Me] MEYERS N., “An Lp-estimate for the gradient of solutions of second order elliptic divergence equations,”Ann. Scuola Norm. Sup. Pisa (3) 17 (1963), 189–206.[Mo& St1] MORTOLA S. & STEFFE S., Unpublished report, Scuola Normale Superiore, Pisa.[Mo& St2] MORTOLA S. & STEFFE S., “A two-dimensional homogenization problem,” Atti Accad. Naz.Lincei Rend. Cl. Sci. Fis. Mat. Natur. (8) 78 (1985), no. 3, 77–82.[Mu1] MURAT F., “Un contre-exemple pour le probleme du controle dans les coefficients,” C. R. Acad. Sci.Paris, Ser. A-B 273 (1971), A708–A711.[Mu2] MURAT F., “Theoremes de non-existence pour des problemes de controle dans les coefficients,” C. R.Acad. Sci. Paris, Ser. A-B 274 (1972), A395–A398.[Mu3] MURAT F., “H-convergence,” Seminaire d’analyse fonctionnelle et numerique, Universite d’Alger,1977-78. Translated into English as MURAT F. & TARTAR L., “H-convergence,” Topics in the mathematicalmodelling of composite materials, 21–43, Progr. Nonlinear Differential Equations Appl., 31, BirkhauserBoston, Boston, MA, 1997.[Mu4] MURAT F., “Compacite par compensation,” Ann. Scuola Norm. Sup. Pisa Cl. Sci. (4), 5 (1978),489–507.[Mu&Si] MURAT F. & SIMON J., “Sur le controle par un domaine geometrique,” Publication 76015, Labora-toire d’Analyse Numerique, Universite Paris 6, 1976.[Mu&Ta1] MURAT F. & TARTAR L., “Calcul des variations et homogeneisation,” Homogenization meth-ods: theory and applications in physics (Breau-sans-Nappe, 1983), 319–369, Collect. Dir. Etudes Rech.Elec. France, 57, Eyrolles, Paris, 1985. Translated into English as “Calculus of variations and homogeniza-tion,” Topics in the mathematical modelling of composite materials, 139–173, Prog. Nonlinear DifferentialEquations Appl., 31, Birkhauser Boston, M A, 1997.[Mu&Ta2] MURAT F. & TARTAR L., “Optimality conditions and homogenization,” Nonlinear variationalproblems (Isola d’Elba, 1983), 1–8, Res. Notes in Math., 127, Pitman, Boston, Mass. -London, 1985.[Ra] RAITUM U.E., “The extension of extremal problems connected with a linear elliptic equation,” SovietMath. Dokl. 19, (1978), 1342-1345.[S-P1] SANCHEZ-PALENCIA E., “Solutions periodiques par rapport aux variables d’espaces et applications,”C. R. Acad. Sci. Paris, Ser. A-B 271 (1970), A1129–A1132.[S-P2] SANCHEZ-PALENCIA E., “Equations aux derivees partielles dans un type de milieux heterogenes,” C.R. Acad. Sci. Paris, Ser. A-B 272 (1972), A395–A398.[Si] SIMON L., “On G-convergence of elliptic operators,” Indiana Univ. Math. J. 28 (1979), no. 4, 587–594.

70

[Sp1] SPAGNOLO S., “Sul limite delle soluzioni di problemi di Cauchy relativi all’equazione del calore,” Ann.Scuola Norm. Sup. Pisa (3) 21 (1967), 657–699.[Sp2] SPAGNOLO S., “Sulla convergenza di soluzioni di equazioni paraboliche ed ellitiche,” Ann. ScuolaNorm. Sup. Pisa (3) 22 (1968), 571–597.[Ta1] TARTAR L., “Convergence d’operateurs differentiels,” Atti Giorni Analisi Convessa e Applicazioni(Roma, 1974), 101–104.[Ta2] TARTAR L., “Problemes de controle des coefficients dans des equations aux derivees partielles,” Controltheory, numerical methods and computer systems modelling (Internat. Sympos., IRIA LABORIA, Rocquen-court, 1974), pp. 420–426. Lecture Notes in Econom. and Math. Systems, Vol. 107, Springer, Berlin, 1975.Translated into English as MURAT F. & TARTAR L., “On the control of coefficients in partial differentialequations,” Topics in the mathematical modelling of composite materials, 1–8, Prog. Nonlinear DifferentialEquations Appl., 31, Birkhauser Boston, MA, 1997.[Ta3] TARTAR L., “Quelques remarques sur l’homogeneisation,” Functional analysis and numerical analysis(Tokyo and Kyoto, 1976), 469–481, Japan Society for the Promotion of Science, Tokyo, 1978.[Ta4] TARTAR L., “Weak convergence in nonlinear partial differential equations,” Existence Theory in Non-linear Elasticity, 209–218. The University of Texas at Austin, 1977.[Ta5] TARTAR L., “Une nouvelle methode de resolution d’equations aux derivees partielles non lineaires,”Journees d’Analyse Non Lineaire (Proc. Conf., Besancon, 1977), pp. 228–241, Lecture Notes in Math.,665, Springer, Berlin, 1978.[Ta6] TARTAR L., “Nonlinear constitutive relations and homogenization,” Contemporary developments incontinuum mechanics and partial differential equations (Proc. Internat. Sympos., Inst. Mat., Univ. Fed.Rio de Janeiro, Rio de Janeiro) , pp. 472–484, North-Holland Math. Stud., 30, North-Holland, Amsterdam-New York, 1978.[Ta7] TARTAR L., “Estimations de coefficients homogeneises,” Computing methods in applied sciences andengineering (Proc. Third Internat. Sympos., Versailles, 1977), I, pp. 364–373, Lecture Notes in Math.,704, Springer, Berlin, 1979. Translated into English as “Estimations of homogenized coefficients,” Topics inthe mathematical modelling of composite materials, 9–20, Prog. Nonlinear Differential Equations Appl., 31,Birkhauser Boston , MA, 1997.[Ta8] TARTAR L., “Compensated compactness and applications to partial differential equations,” Nonlinearanalysis and mechanics: Heriot-Watt Symposium, Vol. IV, pp. 136–212. Res. Notes in Math., 39, Pitman,San Francisco, Calif., 1979.[Ta9] TARTAR L., “Estimations fines des coefficients homogeneises,” Ennio De Giorgi colloquium (Paris,1983), 168–187, Res. Notes in Math., 125, Pitman, Boston, MA-London, 1985.[Ta10] TARTAR L., “Remarks on homogenization,” Homogenization and effective moduli of materials andmedia (Minneapolis, Minn., 1984/1985), 228–246, IMA Vol. Math. Appl., 1, Springer, New York-Berlin,1986.[Ta11] TARTAR L., “The appearance of oscillations in optimization problems,” Nonclassical continuum me-chanics (Durham, 1986), 129–150, London Math. Soc. Lecture Note Ser., 122, Cambridge Univ. Press,Cambridge-New York, 1987.[Ta12] TARTAR L., “H-measures, a new approach for studying homogenisation, oscillations and concentrationeffects in partial differential equations,” Proc. Roy. Soc. Edinburgh Sect. A 115, (1990), no. 3-4, 193–230.[Ta13] TARTAR L., “Remarks on Optimal Design Problems,” Calculus of variations, homogenization andcontinuum mechanics (Marseille, 1993), 279–296, Ser. Adv. Math. Appl. Sci., 18, World Sci. Publishing,River Edge, NJ, 1994.[Ta14] TARTAR L., “Remarks on the homogenization method in optimal design problems,” Homogenizationand applications to material sciences (Nice, 1995), 393–412, GAKUTO Internat. Ser. Math. Sci. Appl., 9,Gakkokotosho, Tokyo, 1995 .[Yo] YOUNG, L. C., Lectures on the Calculus of Variation and Optimal Control Theory, (W. B. Saunders,Philadelphia: 1969).[Zo] ZOLEZZI T., “Teoremi d’esistensa per problemi di controllo ottimo retti da equazioni ellittiche oparaboliche,” Rend. Sem. Mat. Padova, 44 (1970), 155–173.

71