

arXiv:1709.07073v3 [math.OC] 15 Nov 2018

A Unified Approach to the Global Exactness of

Penalty and Augmented Lagrangian Functions I:

Parametric Exactness

M.V. Dolgopolik

November 16, 2018

Abstract

In this two-part study we develop a unified approach to the analysis of the global exactness of various penalty and augmented Lagrangian functions for constrained optimization problems in finite dimensional spaces. This approach allows one to verify in a simple and straightforward manner whether a given penalty/augmented Lagrangian function is exact, i.e. whether the problem of unconstrained minimization of this function is equivalent (in some sense) to the original constrained problem, provided the penalty parameter is sufficiently large. Our approach is based on the so-called localization principle that reduces the study of global exactness to a local analysis of a chosen merit function near globally optimal solutions. In turn, such local analysis can usually be performed with the use of sufficient optimality conditions and constraint qualifications.

In the first paper we introduce the concept of global parametric exactness and derive the localization principle in the parametric form. With the use of this version of the localization principle we recover existing simple necessary and sufficient conditions for the global exactness of linear penalty functions, and for the existence of augmented Lagrange multipliers of Rockafellar-Wets' augmented Lagrangian. Also, we obtain completely new necessary and sufficient conditions for the global exactness of general nonlinear penalty functions, and for the global exactness of a continuously differentiable penalty function for nonlinear second-order cone programming problems. We briefly discuss how one can construct a continuously differentiable exact penalty function for nonlinear semidefinite programming problems, as well.

1 Introduction

One of the main approaches to the solution of a constrained optimization problem consists in the reduction of this problem to an unconstrained one (or a sequence of unconstrained problems) with the use of merit (or auxiliary) functions. Such merit functions are usually defined as a certain convolution of the objective function and constraints, and they almost always include a penalty parameter that must be properly chosen for the reduction to work. This approach led to the development of various penalty and barrier methods [50, 4, 6, 3], primal-dual methods based on the use of augmented Lagrangians [10] and many other methods of constrained optimization.


There exist numerous results on the duality theory for various merit functions, such as penalty and augmented Lagrangian functions. A modern general formulation of the augmented Lagrangian duality for nonconvex problems based on a geometric interpretation of the augmented Lagrangian in terms of subgradients of the optimal value function was proposed by Rockafellar and Wets in [92], and further developed in [61, 127, 62, 89]. Let us also mention several extensions [55, 15, 128, 122, 129, 14, 130, 110] of this augmented Lagrangian duality theory aiming at including some other augmented Lagrangian and penalty functions into the unified framework proposed in [92]. A general duality theory for nonlinear Lagrangian and penalty functions was developed in [94, 97, 90, 109]. Another general approach to the study of duality, based on the image space analysis, was systematically developed in [48, 56, 57, 85, 69, 132, 133, 116].

In contrast to duality theory, few attempts [119, 47, 29, 22, 49] have been made to develop a general theory of the global exactness of merit functions, despite the abundance of particular results on the exactness of various penalty/augmented Lagrangian functions. Furthermore, the existing general results on global exactness are unsatisfactory, since they are very restrictive and cannot be applied to many particular cases.

Recall that a penalty function is called exact iff its points of global minimum coincide with globally optimal solutions of the constrained optimization problem under consideration. The concept of exactness of a linear penalty function was introduced by Eremin [45] and Zangwill [120] in the mid-1960s, and was further investigated by many researchers (see [91, 46, 7, 59, 64, 93, 84, 17, 115, 2, 20, 19, 21, 28, 29, 23, 121, 39] and the references therein). A class of continuously differentiable exact penalty functions was introduced by Fletcher [51] in 1970. Fletcher's penalty function was modified and thoroughly investigated in [51, 52, 88, 58, 11, 8, 60, 26, 27, 77, 18, 54, 1]. Di Pillo and Grippo proposed to consider an exact augmented Lagrangian function [24] in 1979. This class of augmented Lagrangian functions was studied and applied to various optimization problems in [30, 25, 76, 35, 33, 34, 32, 31, 44, 43, 78, 36, 53], while a general theory of globally exact augmented Lagrangian functions was developed by the author in [42]. The theory of nonlinear exact penalty functions was developed by Rubinov and his colleagues [96, 98, 95, 97, 118] in the late 1990s and the early 2000s. Finally, a new class of exact penalty functions was introduced by Huyer and Neumaier [63] in 2003. Later on, this class of penalty functions was studied by many researchers, and applied to various optimization problems, including optimal control problems [9, 106, 68, 82, 71, 65, 70, 83, 124, 38, 37, 41].

It should be noted that the problem of the existence of global saddle points of augmented Lagrangian functions is closely related to the exactness property of these functions. This problem was studied for general cone constrained optimization problems in [100, 131], for mathematical programming problems in [73, 108, 72, 80, 126, 113, 107, 104, 105], for nonlinear second-order cone programming problems in [125], for nonlinear semidefinite programming problems in [114, 79], and for semi-infinite programming problems in [99, 16]. A general theory of the existence of global saddle points of augmented Lagrangian functions for cone constrained optimization problems was presented in [42]. Finally, there is also the problem of the existence of augmented Lagrange multipliers, which can be viewed as the study of the global exactness of Rockafellar-Wets' augmented Lagrangian function. Various results on the existence of augmented Lagrange multipliers were obtained in [100, 131, 40, 99, 66, 67, 16].


The analysis of the proofs of the main results of the aforementioned papers indicates that the underlying ideas of these papers largely overlap. Our main goal is to unveil the core idea behind these results, and to present a general theory of the global exactness of penalty and augmented Lagrangian functions for finite dimensional constrained optimization problems that can be applied to all existing penalty and augmented Lagrangian functions. The central result of our theory is the so-called localization principle. This principle allows one to reduce the study of the global exactness of a given merit function to a local analysis of the behaviour of this function near globally optimal solutions of the original constrained problem. In turn, such local analysis can usually be performed with the use of sufficient optimality conditions and/or constraint qualifications. Thus, the localization principle furnishes one with a simple technique for proving the global exactness of almost any merit function with the use of the standard tools of constrained optimization (namely, constraint qualifications and optimality conditions). The localization principle was first derived by the author for linear penalty functions in [39], and was further extended to other penalty and augmented Lagrangian functions in [42, 41, 40].

In order to include almost all imaginable penalty and augmented Lagrangian functions into the general theory, we introduce and study the concept of global exactness for an arbitrary function depending on the primal variables, the penalty parameter and some additional parameters, and do not impose any assumptions on the structure of this function. Instead, natural assumptions on the behaviour of this function arise within the localization principle as necessary and sufficient conditions for the global exactness.

It might seem natural to adopt the approach of the image space analysis [56, 57, 85, 69, 132, 133, 116] for the study of global exactness. However, the definition of a separation function from the image space analysis imposes some assumptions on the structure of admissible penalty/augmented Lagrangian functions, which create unnecessary restrictions. In contrast, our approach to the global exactness avoids any such assumptions.

Finally, let us note that there are several possible ways to introduce the concept of the global exactness of a merit function. Each part of this two-part study is devoted to the analysis of one of the possible approaches to the definition of global exactness. In this paper we study the so-called global parametric exactness, which naturally arises during the study of various exact penalty functions and augmented Lagrange multipliers.

The paper is organized as follows. In Section 3 we introduce the definition of global parametric exactness and derive the localization principle in the parametric form. This version of the localization principle is applied to the study of the global exactness of several penalty and augmented Lagrangian functions in Section 4. In particular, in this section we recover existing necessary and sufficient conditions for the global exactness of linear penalty functions, and for the existence of augmented Lagrange multipliers. We also obtain completely new necessary and sufficient conditions for the global exactness of a continuously differentiable penalty function for nonlinear second-order cone programming problems, and briefly discuss how one can define a globally exact continuously differentiable penalty function for nonlinear semidefinite programming problems. Necessary preliminary results are given in Section 2.


2 Preliminaries

Let X be a finite dimensional normed space, and let M, A ⊂ X be nonempty sets. Throughout this article, we study the following optimization problem

min f(x) subject to x ∈ M, x ∈ A, (P)

where f : X → R ∪ {+∞} is a given function. Denote by Ω = M ∩ A the set of feasible points of this problem. From this point onwards, we suppose that there exists x ∈ Ω such that f(x) < +∞, and that there exists a globally optimal solution of (P).

Our aim is to somehow “get rid” of the constraint x ∈ M in the problem (P) with the use of an auxiliary function F(·). Namely, we want to construct an auxiliary function F(·) such that globally optimal solutions of the problem (P) can be easily recovered from points of global minimum of F(·) on the set A. To be more precise, our aim is to develop a general theory of such auxiliary functions.

Remark 2.1. It should be underlined that only the constraint x ∈ M is incorporated into the auxiliary function F(·), while the constraint x ∈ A must be taken into account explicitly. Usually, the set A represents “simple” constraints such as bound or linear ones. Alternatively, one can utilize one auxiliary function in order to “get rid” of one kind of constraints, and then utilize a different type of auxiliary function in order to “get rid” of another kind of constraints. Overall, the division of the constraints into the main ones (x ∈ M) and the additional ones (x ∈ A) gives one more flexibility in the choice of tools for solving constrained optimization problems.
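As a small numerical sketch of this split (our own toy example, not taken from the paper), consider minimizing x1 + x2 subject to the "main" constraint x1² + x2² = 1 (the set M, incorporated into F via a penalty term) and the "simple" bound constraints −2 ≤ x1, x2 ≤ 2 (the set A, handled explicitly by restricting the search to the box). The solution is (−1/√2, −1/√2) ≈ (−0.707, −0.707).

```python
import numpy as np

# Toy illustration of the M/A split (hypothetical example, not from the paper):
#   minimize  x1 + x2
#   subject to  x1^2 + x2^2 = 1      -- "main" constraint M, penalized
#               -2 <= x1, x2 <= 2    -- "simple" constraint set A, kept explicit
def F(x1, x2, c):
    # F incorporates only the constraint x in M via the penalty term
    return x1 + x2 + c * np.abs(x1 ** 2 + x2 ** 2 - 1.0)

grid = np.linspace(-2.0, 2.0, 801)      # brute-force scan restricted to A
X1, X2 = np.meshgrid(grid, grid)
values = F(X1, X2, c=5.0)
i, j = np.unravel_index(np.argmin(values), values.shape)
print(X1[i, j], X2[i, j], values.min())  # near (-0.707, -0.707), value near -sqrt(2)
```

Minimizing the penalized objective over the box A alone recovers (approximately) the solution of the original constrained problem, which is exactly the workflow the remark describes.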

Let Λ be a nonempty set of parameters that are denoted by λ, and let c > 0 be the penalty parameter. Hereinafter, we suppose that a function F : X × Λ × (0, +∞) → R ∪ {+∞}, F = F(x, λ, c), is given. A connection between this function and the problem (P) is specified below.

The function F can be, for instance, a penalty function with Λ being a one-point set or an augmented Lagrangian function with λ being a Lagrange multiplier. However, in order not to restrict ourselves to any specific case, we call F(x, λ, c) a separating function for the problem (P).

Remark 2.2. The motivation behind the term “separating function” comes from a geometric interpretation of many penalty and augmented Lagrangian functions as nonlinear functions separating certain nonconvex sets. This point of view on penalty and augmented Lagrangian functions is systematically utilized within the image space analysis [56, 57, 85, 69, 78, 132, 133, 116].

Remark 2.3. Let us note that since we consider only separating functions depending on the penalty parameter c > 0, the so-called objective penalty functions (see, e.g., [49, 87, 86]) cannot be considered within our theory.

3 A General Theory of Parametric Exactness

In the first part of our study, we consider the simplest case when one minimizes the function F(x, λ, c) with respect to x, and views λ as a tuning parameter. Let us introduce the formal definition of exactness of the function F(x, λ, c) in this case.


Definition 3.1. The separating function F(x, λ, c) is said to be globally parametrically exact iff there exist λ∗ ∈ Λ and c∗ > 0 such that for any c ≥ c∗ one has

argmin_{x∈A} F(x, λ∗, c) = argmin_{x∈Ω} f(x).

The greatest lower bound of all such c∗ > 0 is called the least exact penalty parameter of the function F(x, λ∗, c), and is denoted by c∗(λ∗), while λ∗ is called an exact tuning parameter.

Thus, if F(x, λ, c) is globally parametrically exact and an exact tuning parameter λ∗ is known, then one can choose a sufficiently large c > 0 and minimize the function F(·, λ∗, c) over the set A in order to find globally optimal solutions of the problem (P). In other words, if the function F(x, λ, c) is globally exact, then one can remove the constraint x ∈ M with the use of the function F(x, λ, c) without losing any information about globally optimal solutions of the problem (P).

Our main goal is to demonstrate that the study of the global parametric exactness of the separating function F(x, λ, c) can be easily reduced to the study of the local behaviour of F(x, λ, c) near globally optimal solutions of the problem (P). This reduction procedure is called the localization principle.

At first, let us describe a desired local behaviour of the function F(x, λ, c) near optimal solutions.

Definition 3.2. Let x∗ be a locally optimal solution of the problem (P). The separating function F(x, λ, c) is called locally parametrically exact at x∗ iff there exist λ∗ ∈ Λ, c∗ > 0 and a neighbourhood U of x∗ such that for any c ≥ c∗ one has

F(x, λ∗, c) ≥ F(x∗, λ∗, c) ∀ x ∈ U ∩ A.

The greatest lower bound of all such c∗ > 0 is called the least exact penalty parameter of the function F(x, λ∗, c) at x∗, and is denoted by c∗(x∗, λ∗), while λ∗ is called an exact tuning parameter at x∗.

Thus, F(x, λ, c) is locally parametrically exact at a point x∗ with an exact tuning parameter λ∗ iff there exists c∗ > 0 such that x∗ is a local (uniformly with respect to c ∈ [c∗, +∞)) minimizer of the function F(·, λ∗, c) on the set A. Observe also that if the function F(x, λ, c) is nondecreasing in c, then F(x, λ, c) is locally parametrically exact at x∗ with an exact tuning parameter λ∗ iff there exists c∗ such that x∗ is a local minimizer of F(·, λ∗, c∗) on the set A.

Recall that c > 0 in F(x, λ, c) is called the penalty parameter; however, the connection of the parameter c with penalization is unclear from the definition of the function F(x, λ, c). We need the following definition in order to clarify this connection.

Definition 3.3. Let λ∗ ∈ Λ be fixed. One says that F(x, λ, c) is a penalty-type separating function for λ = λ∗ iff there exists c0 > 0 such that if

1. {cn} ⊂ [c0, +∞) is an increasing unbounded sequence;

2. xn ∈ argmin_{x∈A} F(x, λ∗, cn) for all n ∈ N;

3. x∗ is a cluster point of the sequence {xn},

then x∗ is a globally optimal solution of the problem (P).


Roughly speaking, F(x, λ, c) is a penalty-type separating function for λ = λ∗ iff global minimizers of F(·, λ∗, c) on the set A tend to globally optimal solutions of the problem (P) as c → +∞. Thus, if the separating function F(x, λ, c) is of penalty-type, then c plays the role of a penalty parameter, since an increase of c forces global minimizers of F(·, λ∗, c) to get closer to the feasible set of the problem (P).
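A standard illustration of penalty-type behaviour without exactness (our own toy example, not from the paper) is the smooth quadratic penalty for the problem min f(x) = x subject to x ≥ 1, i.e. M = {x : x ≥ 1} and A = R: the function F(x, c) = x + c·max(0, 1 − x)² has the unique global minimizer x(c) = 1 − 1/(2c), which tends to the optimal solution x∗ = 1 as c → +∞ but is infeasible for every finite c.

```python
import numpy as np

# Quadratic (penalty-type but non-exact) penalty for: min x  subject to  x >= 1.
# Its global minimizer x(c) = 1 - 1/(2c) approaches the solution x* = 1 as
# c -> +infinity, illustrating the "penalty-type" property.
def F(x, c):
    return x + c * np.maximum(0.0, 1.0 - x) ** 2

xs = np.linspace(-5.0, 5.0, 1000001)      # brute-force scan of F(., c)
for c in (1.0, 10.0, 100.0):
    x_c = xs[np.argmin(F(xs, c))]
    print(f"c = {c:6.1f}  minimizer ~ {x_c:.4f}  (analytic value {1 - 1/(2*c):.4f})")
```

The minimizers 0.5, 0.95, 0.995 creep towards the feasible set as c grows, which is exactly what the penalty-type condition formalizes.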

Note that if the function F(·, λ∗, c) does not attain a global minimum on the set A for any c greater than some c0 > 0, then, formally, F(x, λ, c) is a penalty-type separating function for λ = λ∗. Similarly, if all sequences {xn}, such that xn ∈ argmin_{x∈A} F(x, λ∗, cn), n ∈ N, and cn → +∞ as n → ∞, do not have cluster points, then F(x, λ, c) is a penalty-type separating function for λ = λ∗, as well. Therefore we need an additional definition that allows one to exclude such pathological behaviour of the function F(x, λ, c) as c → ∞ (see [39], Sections 3.2–3.4, for the motivation behind this definition).

Recall that A is a subset of a finite dimensional normed space X.

Definition 3.4. Let λ∗ ∈ Λ be fixed. The separating function F(x, λ, c) is said to be non-degenerate for λ = λ∗ iff there exist c0 > 0 and R > 0 such that for any c ≥ c0 the function F(·, λ∗, c) attains a global minimum on the set A, and there exists x(c) ∈ argmin_{x∈A} F(x, λ∗, c) such that ‖x(c)‖ ≤ R.

Roughly speaking, the non-degeneracy condition does not allow global minimizers of F(·, λ∗, c) on the set A to escape to infinity as c → ∞. Note that if the set A is bounded, then F(x, λ, c) is non-degenerate for λ = λ∗ iff the function F(·, λ∗, c) attains a global minimum on the set A for any c large enough.
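A hypothetical degenerate toy (our own example, not from the paper): minimize arctan(x) subject to x ≤ 0, with A = R and F(x, c) = arctan(x) + c·max(0, x). For every c > 0 the infimum of F(·, c) over A equals −π/2 and is never attained; minimizing sequences escape to −∞, so F fails the non-degeneracy condition even though it is (formally) of penalty-type.

```python
import numpy as np

# Degenerate example: F(x, c) = arctan(x) + c * max(0, x) for the problem
# min arctan(x) s.t. x <= 0.  F(., c) is strictly increasing, so its minimum
# over any box [-R, 1] sits at the left endpoint -R; enlarging the box drives
# the minimizer to -infinity, and no global minimizer over A = R exists.
def F(x, c):
    return np.arctan(x) + c * np.maximum(0.0, x)

for R in (10.0, 100.0, 1000.0):
    xs = np.linspace(-R, 1.0, 100001)
    x_min = xs[np.argmin(F(xs, 10.0))]
    print(f"minimizer over [-{R:.0f}, 1] is {x_min:.1f} (the left endpoint)")
```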

Now, we are ready to formulate and prove the localization principle. Recall that Ω is the feasible set of the problem (P). Denote by Ω∗ the set of globally optimal solutions of this problem.

Theorem 3.1 (Localization Principle in the Parametric Form I). Suppose that the validity of the condition

Ω∗ ∩ argmin_{x∈A} F(x, λ∗, c) ≠ ∅ (1)

for some λ∗ ∈ Λ and c > 0 implies that F(x, λ, c) is globally parametrically exact with the exact tuning parameter λ∗. Let also Ω be closed, and f be l.s.c. on Ω. Then the separating function F(x, λ, c) is globally parametrically exact if and only if there exists λ∗ ∈ Λ such that

1. F(x, λ, c) is of penalty-type and non-degenerate for λ = λ∗;

2. F(x, λ, c) is locally parametrically exact with the exact tuning parameter λ∗ at every globally optimal solution of the problem (P).

Proof. Suppose that F(x, λ, c) is globally parametrically exact with an exact tuning parameter λ∗. Then for any c > c∗(λ∗) one has

argmin_{x∈A} F(x, λ∗, c) = Ω∗.

In other words, for any c > c∗(λ∗) every globally optimal solution x∗ of the problem (P) is a global (and hence local, uniformly with respect to c ∈ (c∗(λ∗), +∞)) minimizer of F(·, λ∗, c) on the set A. Thus, F(x, λ, c) is locally parametrically


exact with the exact tuning parameter λ∗ at every globally optimal solution of the problem (P).

Fix an arbitrary x∗ ∈ Ω∗. Then for any c > c∗(λ∗) the point x∗ is a global minimizer of F(·, λ∗, c), which implies that F(x, λ, c) is non-degenerate for λ = λ∗. Furthermore, if a sequence {xn} ⊂ A is such that xn ∈ argmin_{x∈A} F(x, λ∗, cn) for all n ∈ N, where cn → +∞ as n → ∞, then due to the global exactness of F one has that for all n large enough the point xn coincides with one of the globally optimal solutions of (P), which implies that xn ∈ Ω and f(xn) = min_{x∈Ω} f(x). Hence, applying the facts that Ω is closed and f is l.s.c. on Ω, one can easily verify that a cluster point of the sequence {xn}, if it exists, is a globally optimal solution of (P). Thus, F(x, λ, c) is a penalty-type separating function for λ = λ∗.

Let us prove the converse statement. Our aim is to verify that there exist c > 0 and x∗ ∈ Ω∗ such that

inf_{x∈A} F(x, λ∗, c) = F(x∗, λ∗, c). (2)

Then taking into account condition (1) one obtains that the separating function F(x, λ, c) is globally parametrically exact. Arguing by reductio ad absurdum, suppose that (2) is not valid. Then, in particular, for any n ∈ N one has

inf_{x∈A} F(x, λ∗, n) < F(x∗, λ∗, n) ∀ x∗ ∈ Ω∗. (3)

By condition 1, the function F(x, λ, c) is non-degenerate for λ = λ∗. Therefore there exist n0 ∈ N and R > 0 such that for any n ≥ n0 there exists xn ∈ argmin_{x∈A} F(x, λ∗, n) with ‖xn‖ ≤ R.

Recall that X is a finite dimensional normed space. Therefore there exists a subsequence {x_{n_k}} converging to some x∗. Consequently, applying the fact that F(x, λ, c) is a penalty-type separating function for λ = λ∗ one obtains that x∗ ∈ Ω∗. By condition 2, F(x, λ, c) is locally parametrically exact at x∗ with the exact tuning parameter λ∗. Therefore there exist c0 > 0 and a neighbourhood U of x∗ such that for any c ≥ c0 one has

F (x, λ∗, c) ≥ F (x∗, λ∗, c) ∀x ∈ U ∩ A. (4)

Since the subsequence {x_{n_k}} converges to x∗, there exists k0 such that for any k ≥ k0 one has x_{n_k} ∈ U. Moreover, one can suppose that n_k ≥ c0 for all k ≥ k0. Hence with the use of (4) one obtains that

F(x_{n_k}, λ∗, n_k) ≥ F(x∗, λ∗, n_k),

which contradicts (3) and the fact that x_{n_k} ∈ argmin_{x∈A} F(x, λ∗, n_k). Thus, F(x, λ, c) is globally parametrically exact.

Remark 3.1. (i) Condition (1) simply means that in order to prove the global parametric exactness of F(x, λ, c) it is sufficient to check that at least one globally optimal solution of the problem (P) is a point of global minimum of the function F(·, λ∗, c), instead of verifying that the sets Ω∗ and argmin_{x∈A} F(x, λ∗, c) actually coincide. It should be pointed out that in most particular cases the validity of condition (1) is equivalent to global parametric exactness. In fact, the equivalence between (1) and global parametric exactness automatically, i.e.


without any additional assumptions, holds true in all but one example (see subsection 4.4 below) presented in this article. Note, finally, that condition (1) is needed only to prove the “if” part of the theorem.

(ii) The theorem above describes how to construct a globally exact separating function F(x, λ, c). Namely, one has to ensure that a chosen function F(x, λ, c) is of penalty-type (which can be guaranteed by adding a penalty term to the function F(x, λ, c)), non-degenerate (which can usually be guaranteed by the introduction of a barrier term into the function F(x, λ, c)) and is locally exact near all globally optimal solutions of the problem (P), which is typically done with the use of constraint qualifications (metric (sub-)regularity assumptions) and/or sufficient optimality conditions. Below, we present several particular examples illustrating the usage of the previous theorem.

(iii) Note that the previous theorem can be reformulated as a theorem describing necessary and sufficient conditions for a tuning parameter λ∗ ∈ Λ to be exact. It should also be mentioned that the theorem above can be utilized in order to obtain necessary and/or sufficient conditions for the uniqueness of an exact tuning parameter. In particular, it is easy to see that a globally exact tuning parameter λ∗ is unique if there exists x∗ ∈ Ω∗ such that a locally exact tuning parameter at x∗ is unique.

The theorem above can be vaguely formulated as follows. The separating function F(x, λ, c) is globally parametrically exact iff it is of penalty-type, non-degenerate and locally exact at every globally optimal solution of the problem (P). Thus, under natural assumptions the function F(x, λ, c) is globally exact iff it is exact near globally optimal solutions of the original problem. That is why Theorem 3.1 is called the localization principle.

Let us reformulate the localization principle in a form that is slightly more convenient for applications.

Theorem 3.2 (Localization Principle in the Parametric Form II). Suppose that the validity of the condition

Ω∗ ∩ argmin_{x∈A} F(x, λ∗, c) ≠ ∅

for some λ∗ ∈ Λ and c > 0 implies that F(x, λ, c) is globally parametrically exact with the exact tuning parameter λ∗. Let also the sets A and Ω be closed, the objective function f be l.s.c. on Ω, and the function F(·, λ, c) be l.s.c. on A for all λ ∈ Λ and c > 0. Then the separating function F(x, λ, c) is globally parametrically exact if and only if there exists λ∗ ∈ Λ such that

1. F(x, λ, c) is of penalty-type for λ = λ∗;

2. there exist c0 > 0, x∗ ∈ Ω∗ and a bounded set K ⊂ A such that

S(c, x∗) := {x ∈ A | F(x, λ∗, c) < F(x∗, λ∗, c)} ⊂ K ∀ c ≥ c0; (5)

3. F(x, λ, c) is locally parametrically exact at every globally optimal solution of the problem (P) with the exact tuning parameter λ∗.

Proof. Suppose that F(x, λ, c) is globally parametrically exact with the exact tuning parameter λ∗. Then, as was proved in Theorem 3.1, F(x, λ, c) is a


penalty-type separating function for λ = λ∗, and F(x, λ, c) is locally parametrically exact with the exact tuning parameter λ∗ at every globally optimal solution of the problem (P). Furthermore, from the definition of global exactness it follows that S(c, x∗) = ∅ for all c > c∗(λ∗) and x∗ ∈ Ω∗, which implies that (5) is satisfied for all c0 > c∗(λ∗), x∗ ∈ Ω∗ and any bounded set K.

Let us prove the converse statement. By our assumption there exist c0 > 0 and x∗ ∈ Ω∗ such that for all c ≥ c0 the sublevel set S(c, x∗) is contained in a bounded set K and, thus, is bounded. Therefore, taking into account the facts that the function F(·, λ∗, c) is l.s.c. on A and the set A is closed, one obtains that F(·, λ∗, c) attains a global minimum on the set A at a point x(c) ∈ K (if S(c, x∗) = ∅ for some c ≥ c0, then x(c) = x∗). From the fact that K is bounded it follows that there exists R > 0 such that ‖x(c)‖ ≤ R for all c ≥ c0, which implies that F(x, λ, c) is non-degenerate for λ = λ∗. Consequently, applying Theorem 3.1 one obtains the desired result.

Note that the definition of global parametric exactness does not specify how the optimal value of the problem (P) and the infimum of the function F(·, λ∗, c) over the set A are connected. In some particular cases (see subsection 4.4 below), this fact might significantly complicate the application of the localization principle. Therefore, let us show how one can incorporate an assumption on the value of inf_{x∈A} F(x, λ∗, c) into the localization principle.

Definition 3.5. The separating function F(x, λ, c) is said to be strictly globally parametrically exact if F(x, λ, c) is globally parametrically exact, and there exists c0 > 0 such that

inf_{x∈A} F(x, λ∗, c) = f∗ ∀ c ≥ c0, (6)

where λ∗ is an exact tuning parameter, and f∗ = inf_{x∈Ω} f(x) is the optimal value of the problem (P). An exact tuning parameter satisfying (6) is called strictly exact.

Arguing in a similar way to the proofs of Theorems 3.1 and 3.2, one can easily extend the localization principle to the case of strict exactness.

Theorem 3.3 (Strengthened Localization Principle in the Parametric Form I). Suppose that the validity of the conditions

Ω∗ ∩ argmin_{x∈A} F(x, λ∗, c) ≠ ∅, min_{x∈A} F(x, λ∗, c) = f∗ (7)

for some λ∗ ∈ Λ and c > 0 implies that F(x, λ, c) is strictly globally parametrically exact with λ∗ being a strictly exact tuning parameter. Let also Ω be closed, and f be l.s.c. on Ω. Then the separating function F(x, λ, c) is strictly globally parametrically exact if and only if there exists λ∗ ∈ Λ such that

1. F(x, λ, c) is of penalty-type and non-degenerate for λ = λ∗;

2. F(x, λ, c) is locally parametrically exact at every globally optimal solution of the problem (P) with the exact tuning parameter λ∗;

3. there exists c0 > 0 such that F(x∗, λ∗, c) = f∗ for all x∗ ∈ Ω∗ and c ≥ c0.


Theorem 3.4 (Strengthened Localization Principle in the Parametric Form II). Suppose that the validity of the conditions

    Ω∗ ∩ argmin_{x∈A} F(x, λ∗, c) ≠ ∅,  min_{x∈A} F(x, λ∗, c) = f∗

for some λ∗ ∈ Λ and c > 0 implies that F(x, λ, c) is strictly globally parametrically exact with λ∗ being a strictly exact tuning parameter. Let also the sets A and Ω be closed, the objective function f be l.s.c. on Ω, and the function F(·, λ, c) be l.s.c. on A for all λ ∈ Λ and c > 0. Then the separating function F(x, λ, c) is strictly globally parametrically exact if and only if there exist λ∗ ∈ Λ and c0 > 0 such that

1. F(x, λ, c) is of penalty-type for λ = λ∗;

2. there exists a bounded set K such that {x ∈ A | F(x, λ∗, c) < f∗} ⊂ K for all c ≥ c0;

3. F(x, λ, c) is locally parametrically exact with the exact tuning parameter λ∗ at every globally optimal solution of the problem (P);

4. F(x∗, λ∗, c) = f∗ for all x∗ ∈ Ω∗ and c ≥ c0.

4 Applications of the Localization Principle

Below, we provide several examples demonstrating how one can apply the localization principle in the parametric form to the study of the global exactness of various penalty and augmented Lagrangian functions.

4.1 Example I: Linear Penalty Functions

We start with the simplest case when the function F(x, λ, c) is affine with respect to the penalty parameter c and does not depend on any additional parameters. Let a function ϕ: X → [0,+∞] be such that ϕ(x) = 0 iff x ∈ M. Define

    F(x, c) = f(x) + cϕ(x).

The function F(x, c) is called a linear penalty function for the problem (P).
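As a standard illustration (a classical construction, not specific to the text above), for mathematical programming constraints the penalty term ϕ may be taken to be the ℓ1 constraint violation:

```latex
% Classical \ell_1 penalty term for
%   M = \{ x : g_i(x) \le 0,\ i = 1,\dots,m,\ h_j(x) = 0,\ j = 1,\dots,s \}.
% Note that \varphi(x) = 0 if and only if x \in M, as required above.
\varphi(x) = \sum_{i=1}^{m} \max\{0,\, g_i(x)\} + \sum_{j=1}^{s} |h_j(x)|,
\qquad
F(x, c) = f(x) + c\,\varphi(x).
```

For this choice ϕ is l.s.c. (even continuous) whenever the functions gi and hj are continuous, so Theorem 4.1 below is applicable.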

Remark 4.1. In order to rigorously include linear penalty functions (as well as nonlinear penalty functions from the following two examples) into the theory of parametrically exact separating functions one has to define Λ to be a one-point set, say Λ = {−1}, introduce a new separating function F(x, −1, c) ≡ F(x, c), and consider the separating function F(x, λ, c) instead of the penalty function F(x, c). However, since this transformation is purely formal, we omit it for the sake of shortness. Moreover, since in the case of penalty functions the parameter λ is absent, it is natural to omit the term "parametric", and say that F(x, c) is globally/locally exact.


Let us obtain two simple characterizations of the global exactness of the linear penalty function F(x, c) with the use of the localization principle (Theorems 3.1 and 3.2). These characterizations were first obtained by the author in [39], Theorems 3.10 and 3.17.

Before we formulate the main result, let us note that F(x∗, c) = f∗ for any globally optimal solution x∗ of the problem (P) and for all c > 0. Therefore, in particular, the linear penalty function F(x, c) is globally parametrically exact iff it is strictly globally parametrically exact.

Theorem 4.1 (Localization Principle for Linear Penalty Functions). Let A and Ω be closed, and let f and ϕ be l.s.c. on A. Then the linear penalty function F(x, c) is globally exact if and only if F(x, c) is locally exact at every globally optimal solution of the problem (P) and one of the following two conditions is satisfied:

1. F is non-degenerate;

2. there exists c0 > 0 such that the set {x ∈ A | F(x, c0) < f∗} is bounded.

Proof. Note that F(x∗, c) = f(x∗) = f∗ for any x∗ ∈ Ω∗ and c > 0. Therefore

    Ω∗ ∩ argmin_{x∈A} F(x, c) ≠ ∅  =⇒  Ω∗ ⊂ argmin_{x∈A} F(x, c).

Note also that if x ∉ M, then either F(x, c) is strictly increasing in c or F(x, c) = +∞ for all c > 0. On the other hand, if x ∈ M, then F(x, c) = f(x). Consequently, if for some c0 > 0 one has Ω∗ ⊂ argmin_{x∈A} F(x, c0), then for any c > c0 one has Ω∗ = argmin_{x∈A} F(x, c). Thus, the validity of the condition Ω∗ ∩ argmin_{x∈A} F(x, c) ≠ ∅ for some c > 0 implies the global exactness of F(x, c).

Our aim, now, is to verify that F is a penalty-type separating function. Then applying Theorems 3.1 and 3.2 one obtains the desired result.

Indeed, let {cn} ⊂ (0,+∞) be an increasing unbounded sequence, xn ∈ argmin_{x∈A} F(x, cn) for all n ∈ N, and let x∗ be a cluster point of the sequence {xn}. By [39], Proposition 3.5, one has ϕ(xn) → 0 as n → ∞. Hence taking into account the facts that A is closed and ϕ is l.s.c. on A one gets that x∗ ∈ A and ϕ(x∗) = 0. Therefore x∗ is a feasible point of the problem (P).

As it was noted above, for any y∗ ∈ Ω∗ one has F(y∗, c) = f(y∗) for all c > 0. Hence taking into account the definition of xn and the fact that the function ϕ is non-negative one gets that f(xn) ≤ f(y∗) for all n ∈ N. Consequently, with the use of the lower semicontinuity of f one obtains that f(x∗) ≤ f(y∗), which implies that x∗ is a globally optimal solution of the problem (P). Thus, F(x, c) is a penalty-type separating function.

Let us also give a different formulation of the localization principle for linear penalty functions in which the non-degeneracy condition is replaced by some more widely used conditions.

Corollary 4.1. Let A and Ω be closed, and let f and ϕ be l.s.c. on A. Suppose also that one of the following conditions is satisfied:

1. the set {x ∈ A | f(x) < f∗} is bounded;


2. there exist c0 > 0 and δ > 0 such that the function F(·, c0) is bounded from below on A and the set {x ∈ A | f(x) < f∗, ϕ(x) < δ} is bounded;

3. there exist c0 > 0 and a feasible point x0 of the problem (P) such that the set {x ∈ A | F(x, c0) ≤ f(x0)} is bounded;

4. the function f is coercive on the set A, i.e. f(xn) → +∞ as n → ∞ for any sequence {xn} ⊂ A such that ‖xn‖ → +∞ as n → ∞;

5. there exists c0 > 0 such that the function F(·, c0) is coercive on the set A;

6. the function ϕ is coercive on the set A and there exists c0 > 0 such that the function F(·, c0) is bounded from below on A.

Then F (x, c) is globally exact if and only if it is locally exact at every globallyoptimal solution of the problem (P).

Proof. One can easily verify that if one of the above assumptions holds true, then the set {x ∈ A | F(x, c0) < f∗} is bounded for some c0 > 0. Then applying the localization principle for linear penalty functions one obtains the desired result.

Remark 4.2. The corollary above provides an example of how one can reformulate the localization principle in a particular case with the use of some well-known and widely used conditions such as coercivity or the boundedness of a certain sublevel set. For the sake of shortness, we do not provide such reformulations of the localization principle for the particular separating functions F(x, λ, c) studied below. However, let us underline that one can easily reformulate the localization principle with the use of coercivity-type assumptions in any particular case.

For the sake of completeness, let us also formulate simple sufficient conditions for the local exactness of the function F. These conditions are well-known (see, e.g., [39], Theorem 2.4 and Proposition 2.7) and rely on an error bound for the penalty term ϕ.

Proposition 4.1. Let x∗ be a locally optimal solution of the problem (P), and f be Hölder continuous with exponent α ∈ (0, 1] in a neighbourhood of x∗. Suppose also that there exist τ > 0 and r > 0 such that

    ϕ(x) ≥ τ [dist(x, Ω)]^α  ∀x ∈ A: ‖x − x∗‖ < r,

where dist(x, Ω) = inf_{y∈Ω} ‖x − y‖. Then the linear penalty function F(x, c) is locally exact at x∗.
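To see how the proposition works, consider the following one-dimensional toy example (ours, not taken from [39]):

```latex
% An illustration of Proposition 4.1 (our example).
% Take X = A = \mathbb{R}, f(x) = -x, M = \Omega = (-\infty, 0], so that x^* = 0,
% and let \varphi(x) = \max\{0, x\} = \mathrm{dist}(x, \Omega).
% Then f is Lipschitz continuous (\alpha = 1, L = 1), and the error bound
% holds with \tau = 1 and any r > 0. The linear penalty function
F(x, c) = -x + c \max\{0, x\}
% satisfies F(x, c) = -x \ge 0 = F(x^*, c) for x \le 0 and
% F(x, c) = (c - 1)x > 0 for x > 0 whenever c > 1,
% so F(\cdot, c) attains its global minimum only at x^* = 0 for every c > 1,
% i.e. F is globally (hence locally) exact.
```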

4.2 Example II: Nonlinear Penalty Functions

Let a function ϕ: X → [0,+∞] be as above. For the sake of convenience, suppose that the objective function f is non-negative on X. From the theoretical point of view this assumption is not restrictive, since one can always replace the function f with the function e^{f(·)}. Furthermore, it should be noted that the non-negativity assumption on the objective function f is standard in the theory of nonlinear penalty functions (cf. [96, 98, 95, 97, 118]).


Let a function Q: [0,+∞]^2 → (−∞,+∞] be fixed. Suppose that the restriction of Q to the set [0,+∞)^2 is strictly monotone, i.e. Q(t1, s1) < Q(t2, s2) for any (t1, s1), (t2, s2) ∈ [0,+∞)^2 such that t1 ≤ t2, s1 ≤ s2 and (t1, s1) ≠ (t2, s2). Suppose also that Q(+∞, s) = Q(t,+∞) = +∞ for all t, s ∈ [0,+∞].

Define

    F(x, c) = Q(f(x), cϕ(x)).

Then F(x, c) is a nonlinear penalty function for the problem (P). This type of nonlinear penalty functions was studied in [96, 98, 95, 97, 118].

The simplest particular example of a nonlinear penalty function is the function F(x, c) of the form

    F(x, c) = ((f(x))^q + (cϕ(x))^q)^{1/q}   (8)

with q > 0. Here

    Q(t, s) = (t^q + s^q)^{1/q}.

Clearly, this function is monotone. In this article, the function (8) is called the q-th order nonlinear penalty function for the problem (P). Let us note that the least exact penalty parameter of the q-th order nonlinear penalty function is often smaller than the least exact penalty parameter of the linear penalty function f(x) + cϕ(x) (see [98, 97] for more details).

Let us obtain a new simple characterization of the global exactness of the nonlinear penalty function F(x, c), which does not rely on any assumptions on the perturbation function for the problem (P) (cf. [98, 97]). Furthermore, to the best of the author's knowledge, exact nonlinear penalty functions have only been considered for mathematical programming problems, while our results are applicable in the general case.

Theorem 4.2 (Localization Principle for Nonlinear Penalty Functions). Let the set A be closed, and the functions f, ϕ and F(·, c) be l.s.c. on the set A. Suppose also that Q(0, s) → +∞ as s → +∞. Then the nonlinear penalty function F(x, c) is globally exact if and only if it is locally exact at every globally optimal solution of the problem (P) and one of the two following assumptions is satisfied:

1. the function F(x, c) is non-degenerate;

2. there exists c0 > 0 such that the set {x ∈ A | Q(f(x), c0ϕ(x)) < Q(f∗, 0)} is bounded.

Proof. From the fact that Q is strictly monotone it follows that for any c > 0 one has

    F(x, c) = Q(f(x), cϕ(x)) = Q(f(x), 0) > Q(f∗, 0)  ∀x ∈ Ω \ Ω∗.

Furthermore, if for some c0 > 0 one has

    inf_{x∈A} F(x, c0) := inf_{x∈A} Q(f(x), c0ϕ(x)) = Q(f∗, 0),

then applying the strict monotonicity of Q again one obtains that for any c > c0 the following inequality holds true:

    F(x, c) = Q(f(x), cϕ(x)) > Q(f∗, 0)  ∀x ∈ A \ Ω.


Therefore the validity of the condition Ω∗ ∩ argmin_{x∈A} F(x, c0) ≠ ∅ for some c0 > 0 is equivalent to the global exactness of F(x, c) by virtue of the fact that for any c > 0 and x∗ ∈ Ω∗ one has F(x∗, c) = Q(f∗, 0).

Let us verify that F is a penalty-type separating function. Then applying Theorems 3.1 and 3.2 one obtains the desired result.

Indeed, let {cn} ⊂ (0,+∞) be an increasing unbounded sequence, xn ∈ argmin_{x∈A} F(x, cn) for all n ∈ N, and let x∗ be a cluster point of the sequence {xn}. Let us check, at first, that ϕ(xn) → 0 as n → ∞. Arguing by reductio ad absurdum, suppose that there exist ε > 0 and a subsequence {x_{n_k}} of the sequence {xn} such that ϕ(x_{n_k}) > ε for all k ∈ N. Hence applying the monotonicity of Q one obtains that

    F(x_{n_k}, c_{n_k}) := Q(f(x_{n_k}), c_{n_k} ϕ(x_{n_k})) ≥ Q(0, c_{n_k} ε)  ∀k ∈ N.

Consequently, taking into account the fact that Q(0, s) → +∞ as s → +∞ one gets that F(x_{n_k}, c_{n_k}) → +∞ as k → ∞, which contradicts the fact that

    inf_{x∈A} F(x, c) ≤ F(y∗, c) = Q(f∗, 0) < +∞  ∀c > 0, y∗ ∈ Ω∗   (9)

(the inequality Q(f∗, 0) < +∞ follows from the strict monotonicity of Q). Thus, ϕ(xn) → 0 as n → ∞. Applying the fact that A is closed and ϕ is l.s.c. on A one gets that the cluster point x∗ is a feasible point of the problem (P).

Note that from (9) and the monotonicity of Q it follows that f(xn) ≤ f∗ for all n ∈ N. Hence taking into account the fact that f is l.s.c. on A one obtains that f(x∗) ≤ f∗, which implies that x∗ is a globally optimal solution of (P). Thus, F(x, c) is a penalty-type separating function.

Let us also obtain new simple sufficient conditions for the local exactness of the function F(x, c) which can be applied to the q-th order nonlinear penalty function with q ∈ (0, 1). Note that since the function Q is strictly monotone, the point (0, 0) is a global minimizer of Q on the set [0,+∞] × [0,+∞]. Therefore, if x∗ is a locally optimal solution of (P) such that f(x∗) = 0, then x∗ is a global minimizer of F(·, c) on A for all c > 0, which implies that F(x, c) is locally exact at x∗. Thus, it is sufficient to consider the case f(x∗) > 0.

Theorem 4.3. Let x∗ be a locally optimal solution of the problem (P) such that f(x∗) > 0. Suppose that f is Hölder continuous with exponent α ∈ (0, 1] near x∗, and there exist τ > 0 and r > 0 such that

    ϕ(x) ≥ τ [dist(x, Ω)]^α  ∀x ∈ A: ‖x − x∗‖ < r.   (10)

Suppose also that there exist t0 > 0 and c0 > 0 such that

    Q(f(x∗) − t, c0 t) ≥ Q(f(x∗), 0)  ∀t ∈ [0, t0).   (11)

Then the nonlinear penalty function F(x, c) is locally exact at x∗.

Proof. Since f is Hölder continuous with exponent α near the locally optimal solution x∗ of the problem (P), there exist L > 0 and δ < r such that

    f(x) ≥ f(x∗) − L[dist(x, Ω)]^α ≥ 0  ∀x ∈ A: ‖x − x∗‖ < δ


([39], Proposition 2.7). Consequently, applying (10) and the fact that Q is monotone one obtains that for any x ∈ A with ‖x − x∗‖ < δ one has

    Q(f(x), cϕ(x)) ≥ Q(f(x∗) − L[dist(x, Ω)]^α, cτ[dist(x, Ω)]^α).

Hence with the use of (11) one gets that there exist t0 > 0 and c0 > 0 such that for any c ≥ Lc0/τ one has

    Q(f(x), cϕ(x)) ≥ Q(f(x∗), 0)  ∀x ∈ A: ‖x − x∗‖ < min{δ, (t0/L)^{1/α}},

which implies that F(x, c) is locally exact at x∗ and c∗(x∗) ≤ Lc0/τ.

Remark 4.3. Assumption (11) always holds true for the q-th order nonlinear penalty function with 0 < q ≤ 1. Indeed, applying the fact that the function ω(t) = t^q is Hölder continuous with exponent q and the Hölder coefficient C = 1 on [0,+∞) one obtains that

    (f(x∗) − t)^q + c^q t^q ≥ f(x∗)^q − t^q + c^q t^q ≥ f(x∗)^q

for any t ∈ [0, f(x∗)) and c ≥ 1. Hence

    Q(f(x∗) − t, ct) ≥ Q(f(x∗), 0)  ∀t ∈ [0, f(x∗)), ∀c ≥ 1,

which implies the required result.

Remark 4.4. Note that assumption (11) in the theorem above cannot be strengthened. Namely, one can easily verify that if the nonlinear penalty function F is locally exact at a locally optimal solution x∗ for all Lipschitz continuous functions f and all functions ϕ satisfying (10), then (11) holds true (one simply has to define f(x) = −L dist(x, Ω) and ϕ(x) = dist(x, Ω)). Note also that the q-th order nonlinear penalty function does not satisfy assumption (11) for q > 1.

4.3 Example III: Continuously Differentiable Exact Penalty Functions

In this section, we utilize the localization principle in order to improve existing results on the global exactness of continuously differentiable exact penalty functions. A continuously differentiable exact penalty function for mathematical programming problems was introduced by Fletcher in [51, 52]. Later on, Fletcher's penalty function was modified and thoroughly investigated by many researchers [29, 22, 88, 58, 11, 8, 60, 26, 27, 77, 18, 54, 1]. Here, we study a modification of the continuously differentiable penalty function for nonlinear second-order cone programming problems proposed by Fukuda, Silva and Fukushima in [54]. However, it should be pointed out that the results of this subsection can be easily extended to the case of any existing modification of Fletcher's penalty function.

Let X = A = R^d, and suppose that the set M has the form

    M = {x ∈ R^d | gi(x) ∈ Q_{li+1}, i ∈ I, h(x) = 0},


where gi: X → R^{li+1}, I = {1, . . . , r}, and h: X → R^s are given functions, and

    Q_{li+1} = {y = (y0, ȳ) ∈ R × R^{li} | y0 ≥ ‖ȳ‖}

is the second-order (Lorentz) cone of dimension li + 1 (here ‖·‖ is the Euclidean norm). In this case the problem (P) is a nonlinear second-order cone programming problem.
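As a simple illustration (standard, and not specific to [54]): for li = 2 the cone Q^3 is the familiar "ice-cream" cone in R^3, and a second-order cone constraint is a smooth reformulation of a norm inequality.

```latex
% For l_i = 2 the Lorentz cone is
%   Q^3 = \{ (y_0, y_1, y_2) \in \mathbb{R}^3 : y_0 \ge \sqrt{y_1^2 + y_2^2} \},
% so a constraint g_i(x) \in Q^3 with g_i(x) = (g_{i,0}(x), g_{i,1}(x), g_{i,2}(x))
% is equivalent to the norm inequality
g_{i,0}(x) \;\ge\; \sqrt{g_{i,1}(x)^2 + g_{i,2}(x)^2}.
```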

Following the ideas of [54], let us introduce a continuously differentiable exact penalty function for the problem under consideration. Suppose that the functions f, gi, i ∈ I, and h are twice continuously differentiable. For any λ = (λ1, . . . , λr) ∈ R^{l1+1} × . . . × R^{lr+1} and µ ∈ R^s denote by

    L(x, λ, µ) = f(x) + Σ_{i=1}^{r} ⟨λi, gi(x)⟩ + ⟨µ, h(x)⟩

the standard Lagrangian function for the nonlinear second-order cone programming problem. Here ⟨·, ·⟩ is the inner product in R^k. For a chosen x ∈ R^d consider the following unconstrained minimization problem, which allows one to obtain an estimate of Lagrange multipliers:

    min_{λ,µ}  ‖∇x L(x, λ, µ)‖^2 + ζ1 Σ_{i=1}^{r} (⟨λi, gi(x)⟩^2 + ‖(λi)0 ḡi(x) + (gi)0(x) λ̄i‖^2)
               + (ζ2/2) (‖h(x)‖^2 + Σ_{i=1}^{r} dist^2(gi(x), Q_{li+1})) · (‖λ‖^2 + ‖µ‖^2),   (12)

where ζ1 and ζ2 are some positive constants, λi = ((λi)0, λ̄i) ∈ R × R^{li}, and the same notation is used for gi(x). Observe that if (x∗, λ∗, µ∗) is a KKT-point of the problem (P), then (λ∗, µ∗) is a globally optimal solution of problem (12) (see [54]). Moreover, it is easily seen that for any x ∈ R^d there exists a globally optimal solution of this problem, which we denote by (λ(x), µ(x)). In order to ensure that an optimal solution is unique one has to utilize a proper constraint qualification.

Recall that a feasible point x is called nondegenerate ([13], Def. 4.70), if

    (Jg1(x), . . . , Jgr(x), Jh(x)) R^d + lin T_{Q_{l1+1}}(g1(x)) × · · · × lin T_{Q_{lr+1}}(gr(x)) × {0} = R^{l1+1} × · · · × R^{lr+1} × R^s,

where Jgi(x) is the Jacobian of gi(x), "lin" stands for the lineality subspace of a convex cone, i.e. the largest linear space contained in this cone, and T_{Q_{li+1}}(gi(x)) is the contingent cone to Q_{li+1} at the point gi(x). Let us note that the nondegeneracy condition can be expressed as a "linear independence-type" condition (see [54], Lemma 3.1, and [12], Proposition 19). Furthermore, by [13], Proposition 4.75, the nondegeneracy condition guarantees that if x is a locally optimal solution of the problem (P), then there exists a unique Lagrange multiplier at x.

Suppose that every feasible point of the problem (P) is nondegenerate. Then one can verify that a globally optimal solution (λ(x), µ(x)) of problem (12) is


unique for all x ∈ R^d, and the functions λ(·) and µ(·) are continuously differentiable ([54], Proposition 3.3).

Now we can introduce a new continuously differentiable exact penalty function for nonlinear second-order cone programming problems, which is a simple modification of the penalty function from [54]. Namely, choose α > 0 and κ ≥ 2, and define

    p(x) = a(x) / (1 + Σ_{i=1}^{r} ‖λi(x)‖^2),   q(x) = b(x) / (1 + ‖µ(x)‖^2),   (13)

where

    a(x) = α − Σ_{i=1}^{r} dist^κ(gi(x), Q_{li+1}),   b(x) = α − ‖h(x)‖^2.

Finally, denote Ωα = {x ∈ R^d | a(x) > 0, b(x) > 0}, and define

    F(x, c) = f(x) + (c/(2p(x))) Σ_{i=1}^{r} [dist^2(gi(x) + (p(x)/c) λi(x), Q_{li+1}) − (p(x)^2/c^2) ‖λi(x)‖^2]
              + ⟨µ(x), h(x)⟩ + (c/(2q(x))) ‖h(x)‖^2,   (14)

if x ∈ Ωα, and F(x, c) = +∞ otherwise. Let us point out that F(x, c) is, in essence, a straightforward modification of the Hestenes-Powell-Rockafellar augmented Lagrangian function to the case of nonlinear second-order cone programming problems [74, 75, 125] with Lagrange multipliers λ and µ replaced by their estimates λ(x) and µ(x). One can easily verify that the function F(·, c) is l.s.c. on R^d, and continuously differentiable on its effective domain (see [54]).

Let us first obtain simple necessary and sufficient conditions for the global exactness of continuously differentiable penalty functions.

Theorem 4.4 (Localization Principle for C1 Penalty Functions). Let the functions f, gi, i ∈ I, and h be twice continuously differentiable, and suppose that every feasible point of the problem (P) is nondegenerate. Then the continuously differentiable penalty function F(x, c) is globally exact if and only if it is locally exact at every globally optimal solution of the problem (P) and one of the two following assumptions is satisfied:

1. the function F(x, c) is non-degenerate;

2. there exists c0 > 0 such that the set {x ∈ R^d | F(x, c0) < f∗} is bounded.

In particular, if the set {x ∈ R^d | f(x) < f∗ + γ, a(x) > 0, b(x) > 0} is bounded for some γ > 0, then F(x, c) is globally exact if and only if it is locally exact at every globally optimal solution of the problem (P).

Proof. Our aim is to apply the localization principle in the parametric form to the separating function (14). To this end, define G(·) = (g1(·), . . . , gr(·)), K = Q_{l1+1} × . . . × Q_{lr+1}, and introduce the function

    Φ(x, c) = min_{y∈K−G(x)} (−p(x)⟨λ(x), y⟩ + (c/2)‖y‖^2).   (15)


Note that the minimum is attained at a unique point y(x, c) due to the facts that K − G(x) is a closed convex set, and the function on the right-hand side of the above equality is strongly convex in y. Observe also that

    F(x, c) = f(x) + (1/p(x)) Φ(x, c) + ⟨µ(x), h(x)⟩ + (c/(2q(x))) ‖h(x)‖^2   (16)

(see [100], formulae (2.5) and (2.7)). Consequently, the function F(x, c) is non-decreasing in c.

From (15) and (16) it follows that F(x, c) ≤ f(x) for any feasible point x (in this case y = 0 ∈ K − G(x)). Let, now, (x∗, λ∗, µ∗) be a KKT-point of the problem (P). Then by [54], Proposition 3.3(c), one has λ(x∗) = λ∗ and µ(x∗) = µ∗, which, in particular, implies that λi(x∗) ∈ (Q_{li+1})∗ and ⟨λi(x∗), gi(x∗)⟩ = 0, where (Q_{li+1})∗ is the polar cone of Q_{li+1}. Then applying the standard first order necessary and sufficient conditions for a minimum of a convex function on a convex set one can easily verify that the infimum in

    dist^2(gi(x∗) + (p(x∗)/c) λi(x∗), Q_{li+1}) = inf_{z∈Q_{li+1}} ‖gi(x∗) + (p(x∗)/c) λi(x∗) − z‖^2

is attained at the point z = gi(x∗). Therefore F(x∗, c) = f(x∗) for all c > 0 (see (14)). In particular, if x∗ is a globally optimal solution of the problem (P), then F(x∗, c) ≡ f∗.

Suppose that for some c0 > 0 one has

    Ω∗ ∩ argmin_{x∈R^d} F(x, c0) ≠ ∅.   (17)

Then

    min_x F(x, c) = f∗ = F(x∗, c)  ∀c ≥ c0, ∀x∗ ∈ Ω∗   (18)

due to the fact that F(x, c) is nondecreasing in c. Thus, for all c ≥ c0 one has Ω∗ ⊆ argmin_{x∈R^d} F(x, c).

Let, now, c > c0 and x∗ ∈ argmin_x F(x, c) be arbitrary. Clearly, if h(x∗) ≠ 0, then F(x∗, c) > F(x∗, c0), which is impossible. Therefore h(x∗) = 0. Let, now, y(x∗, c) be such that

    Φ(x∗, c) = −p(x∗)⟨λ(x∗), y(x∗, c)⟩ + (c/2)‖y(x∗, c)‖^2

(see (15)). From the definitions of x∗ and c0 it follows that

    f∗ = F(x∗, c0) = f(x∗) + (1/p(x∗)) Φ(x∗, c0) ≤ f(x∗) + (1/p(x∗)) (−p(x∗)⟨λ(x∗), y⟩ + (c0/2)‖y‖^2)

for any y ∈ K − G(x∗). Hence for any y ∈ (K − G(x∗)) \ {0} one has

    f(x∗) + (1/p(x∗)) (−p(x∗)⟨λ(x∗), y⟩ + (c/2)‖y‖^2) > f∗.

Consequently, taking into account the first equality in (18) and the definition of x∗ one obtains that y(x∗, c) = 0 and Φ(x∗, c) = 0, which yields that 0 ∈ K − G(x∗), i.e. x∗ is feasible, and F(x∗, c) = f(x∗) = f∗. Therefore x∗ ∈ Ω∗. Thus, argmin_{x∈R^d} F(x, c) = Ω∗ for all c > c0 or, in other words, the validity of condition (17) implies that the penalty function F(x, c) is globally exact.

Let us now check that F(x, c) is a penalty-type separating function. Then applying Theorems 3.1 and 3.2 we arrive at the required result.

Indeed, let {cn} ⊂ (0,+∞) be an increasing unbounded sequence, xn ∈ argmin_x F(x, cn) for all n ∈ N, and x∗ be a cluster point of the sequence {xn}. As it was noted above, F(y∗, c) = f∗ for any globally optimal solution y∗ of the problem (P). Therefore F(xn, cn) ≤ f∗ for all n ∈ N. On the other hand, minimizing the function ω(x, t) = −‖µ(x)‖t + ct^2/(2q(x)) with respect to t one obtains that

    F(xn, cn) ≥ f(xn) − Σ_{i=1}^{r} (p(xn)/(2cn)) ‖λi(xn)‖^2 − (q(xn)/(2cn)) ‖µ(xn)‖^2 ≥ f(xn) − α/cn.   (19)

Hence passing to the limit as n → +∞ one obtains that f(x∗) ≤ f∗. Therefore it remains to show that x∗ is a feasible point of the problem (P).

Arguing by reductio ad absurdum, suppose that x∗ is not feasible. Let, at first, h(x∗) ≠ 0. Then there exist ε > 0 and a subsequence {x_{n_k}} such that ‖h(x_{n_k})‖ ≥ ε for all k ∈ N. Note that since {xn} is a convergent sequence and the function µ(·) is continuous, there exists µ0 > 0 such that ‖µ(xn)‖ ≤ µ0 for all n ∈ N. Furthermore, it is obvious that ‖h(xn)‖^2 < α for all n ∈ N. Consequently, one has

    F(x_{n_k}, c_{n_k}) ≥ f(x_{n_k}) − α/(2c_{n_k}) − µ0 √α + c_{n_k} ε^2 / (2(α − ε^2))

(clearly, one can suppose that ε^2 < α). Therefore lim sup_{n→∞} F(xn, cn) = +∞, which is impossible. Thus, h(x∗) = 0.

Suppose, now, that gi(x∗) ∉ Q_{li+1} for some i ∈ I. Then there exist ε > 0 and a subsequence {x_{n_k}} such that dist(gi(x_{n_k}), Q_{li+1}) ≥ ε for all k ∈ N. Note that p(x)‖λi(x)‖/c < α/c. Consequently, one has dist(gi(x_{n_k}) + p(x_{n_k})λi(x_{n_k})/c_{n_k}, Q_{li+1}) ≥ ε/2 for any sufficiently large k ∈ N. Therefore

    F(x_{n_k}, c_{n_k}) ≥ f(x_{n_k}) + c_{n_k} ε^2 / (2(α − ε^κ)) − α/c_{n_k}

for any k large enough (obviously, we can assume that ε^κ < α). Passing to the limit as k → ∞ one obtains that lim sup_{n→∞} F(xn, cn) = +∞, which is impossible. Thus, x∗ is a feasible point of the problem (P).

Finally, note that from (19) it follows that

    {x ∈ R^d | F(x, c) < f∗} ⊆ {x ∈ R^d | f(x) < f∗ + γ, a(x) > 0, b(x) > 0}

for all c > α/γ.

Remark 4.5. (i) Let us note that the local exactness of penalty function (14) at a globally optimal solution of the problem (P) can be easily established with the use of second order sufficient optimality conditions (see [54], Theorem 5.7).
(ii) Note that from the proof of the theorem above it follows that F(x∗, c) = f(x∗) for any KKT-point (x∗, λ∗, µ∗) of the problem (P).


Following the underlying idea of the localization principle and utilizing some specific properties of the continuously differentiable penalty function (14), we can obtain stronger necessary and sufficient conditions for the global exactness of this function than the ones in the theorem above. These conditions do not rely on the local exactness property and, furthermore, strengthen existing sufficient conditions for the global exactness of continuously differentiable exact penalty functions for nonlinear second-order cone programming problems ([54], Proposition 4.9). However, it should be emphasized that these conditions heavily rely on the particular structure of the penalty function under consideration.

Theorem 4.5. Let the functions f, gi, i ∈ I, and h be twice continuously differentiable, and suppose that every feasible point of the problem (P) is nondegenerate. Then the continuously differentiable penalty function F(x, c) is globally exact if and only if there exists c0 > 0 such that the set {x ∈ R^d | F(x, c0) < f∗} is bounded. In particular, it is exact, if there exists γ > 0 such that the set {x ∈ R^d | f(x) < f∗ + γ, a(x) > 0, b(x) > 0} is bounded.

Proof. Denote S(c) = {x ∈ R^d | F(x, c) < f∗}. If F(x, c) is globally exact, then, as it is easy to check, one has S(c) = ∅ for any sufficiently large c. Therefore it remains to prove the "if" part of the theorem.

If S(c) = ∅ for some c > 0, then F(x, c) ≥ f∗ for all x ∈ R^d, and arguing in the same way as at the beginning of the proof of Theorem 4.4 one can easily obtain the desired result. Thus, one can suppose that S(c) ≠ ∅ for all c > 0.

Choose an increasing unbounded sequence {cn} ⊂ [c0,+∞). Taking into account the facts that F(·, c) is l.s.c. and nondecreasing in c, and applying the boundedness of the set S(c0), one obtains that for any n ∈ N the function F(·, cn) attains a global minimum at a point xn ∈ S(cn) ⊆ S(c0). Applying the boundedness of the set S(c0) once again one obtains that there exists a cluster point x∗ of the sequence {xn}. Replacing, if necessary, the sequence {xn} with its subsequence, we can suppose that xn converges to x∗. As it was shown in Theorem 4.4, F(x, c) is a penalty-type separating function. Therefore x∗ is a globally optimal solution of the problem (P), and F(x∗, c) = f∗ for all c > 0.

From the fact that xn is a point of global minimum of F(·, cn) it follows that ∇xF(xn, cn) = 0. Then applying a direct modification of [54], Proposition 4.3, to the case of penalty function (14) one obtains that for any xn in a sufficiently small neighbourhood of x∗ (i.e. for any sufficiently large n ∈ N) the triplet (xn, λ(xn), µ(xn)) is a KKT-point of the problem (P). Hence taking into account Remark 4.5 one gets that F(xn, cn) = f(xn) ≥ f∗ for any sufficiently large n ∈ N, which contradicts our assumption that S(c) ≠ ∅ for all c > 0 due to the definition of xn. Thus, the penalty function F(x, c) is globally exact.

Let us note that the results of this subsection can be easily extended to the case of nonlinear semidefinite programming problems (cf. [42], Sections 8.3 and 8.4). Namely, suppose that A = R^d, and let

    M = {x ∈ R^d | G(x) ⪯ 0, h(x) = 0},

where G: X → S^l and h: X → R^s are given functions, S^l is the set of all l × l real symmetric matrices, and the relation G(x) ⪯ 0 means that the matrix G(x) is negative semidefinite. We suppose that the space S^l is equipped with the


Frobenius norm ‖A‖_F = √(Tr(A^2)). In this case the problem (P) is a nonlinear semidefinite programming problem.

Suppose that the functions f, G and h are twice continuously differentiable.

For any λ ∈ S^l and µ ∈ R^s denote by

    L(x, λ, µ) = f(x) + Tr(λG(x)) + ⟨µ, h(x)⟩

the standard Lagrangian function for the nonlinear semidefinite programming problem. For a chosen x ∈ R^d consider the following unconstrained minimization problem, which allows one to compute an estimate of Lagrange multipliers:

    min_{λ,µ}  ‖∇x L(x, λ, µ)‖^2 + ζ1 Tr(λ^2 G(x)^2)
               + (ζ2/2) (‖h(x)‖^2 + dist^2(G(x), S^l−)) · (‖λ‖_F^2 + ‖µ‖^2),   (20)

where ζ1 and ζ2 are some positive constants, and S^l− is the cone of l × l real negative semidefinite matrices. One can verify (cf. [42], Lemma 4) that for any x ∈ R^d there exists a unique globally optimal solution (λ(x), µ(x)) of this problem, provided every feasible point of the problem (P) is non-degenerate, i.e. provided for any feasible x one has

    (DG(x), Jh(x)) R^d + lin T_{S^l−}(G(x)) × {0} = S^l × R^s

(see [13], Def. 4.70). Let us note that, as in the case of second-order cone programming problems, the above nondegeneracy condition can be rewritten as a "linear independence-type" condition (see [13], Proposition 5.71).

Now we can introduce the first continuously differentiable exact penalty function for nonlinear semidefinite programming problems. Namely, choose α > 0 and κ ≥ 1, and define

    p(x) = a(x) / (1 + Tr(λ(x)^2)),   q(x) = b(x) / (1 + ‖µ(x)‖^2),

where

    a(x) = α − (Tr([G(x)]+^2))^κ,   b(x) = α − ‖h(x)‖^2,

and [·]+ denotes the projection of a matrix onto the cone of l × l positive semidefinite matrices. Denote Ωα = {x ∈ R^d | a(x) > 0, b(x) > 0}, and define

    F(x, c) = f(x) + (1/(2cp(x))) (Tr([cG(x) + p(x)λ(x)]+^2) − p(x)^2 Tr(λ(x)^2))
              + ⟨µ(x), h(x)⟩ + (c/(2q(x))) ‖h(x)‖^2,   (21)

if x ∈ Ω_α, and F(x, c) = +∞ otherwise. Let us point out that F(x, c) is, in essence, a direct modification of the Hestenes-Powell-Rockafellar augmented Lagrangian function to the case of nonlinear semidefinite programming problems [103, 101, 123, 102, 111, 81, 112, 114, 117], with the Lagrange multipliers λ and µ replaced by their estimates λ(x) and µ(x). One can verify that the function F(·, c) is l.s.c. on R^d and continuously differentiable on its effective domain. Furthermore, it is possible to extend Theorems 4.4 and 4.5 to the case of the continuously differentiable penalty function (21), thus obtaining the first necessary and sufficient conditions for the global exactness of C^1 penalty functions for nonlinear semidefinite programming problems. However, we do not present the proofs of these results here and leave them to the interested reader.
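For illustration, once the multiplier estimates are available, the penalty function (21) is cheap to evaluate: the projection [·]_+ reduces to an eigenvalue decomposition. The following minimal sketch is ours; the arguments `fx`, `Gx`, `hx` and the multiplier estimates `lam`, `mu` are assumed precomputed, and the default values of α and κ are arbitrary illustrative choices.

```python
import numpy as np

def pos_part(M):
    """Projection [M]_+ of a symmetric matrix onto the PSD cone."""
    w, V = np.linalg.eigh(M)
    return (V * np.maximum(w, 0.0)) @ V.T

def F(fx, Gx, hx, lam, mu, c, alpha=10.0, kappa=1.0):
    """Sketch of the penalty function (21).

    fx = f(x), Gx = G(x), hx = h(x); lam, mu stand for the multiplier
    estimates lambda(x), mu(x).  Returns +inf outside Omega_alpha.
    """
    a = alpha - np.trace(pos_part(Gx) @ pos_part(Gx)) ** kappa
    b = alpha - hx @ hx
    if a <= 0.0 or b <= 0.0:
        return np.inf                      # x lies outside Omega_alpha
    p = a / (1.0 + np.trace(lam @ lam))
    q = b / (1.0 + mu @ mu)
    shifted = pos_part(c * Gx + p * lam)   # [c G(x) + p(x) lambda(x)]_+
    val = fx
    val += (np.trace(shifted @ shifted) - p**2 * np.trace(lam @ lam)) / (2*c*p)
    val += mu @ hx + c / (2.0 * q) * (hx @ hx)
    return val
```

As a sanity check on the formula: at a feasible point (G(x) negative semidefinite, h(x) = 0) with zero multiplier estimates, every penalty term vanishes and F(x, c) = f(x).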

4.4 Example IV: Rockafellar-Wets’ Augmented Lagrangian Function

The separating functions studied in the previous examples do not depend on any additional parameters apart from the penalty parameter c. This fact does not allow one to fully understand the concept of parametric exactness. In order to illuminate the main features of parametric exactness, in this example we consider a separating function that depends on additional parameters, namely Lagrange multipliers. Below, we apply the general theory of parametrically exact separating functions to the augmented Lagrangian function introduced by Rockafellar and Wets in [92] (see also [100, 61, 62, 131, 40, 99, 66, 67, 16]).

Let P be a topological vector space of parameters. Recall that a function Φ: X × P → R ∪ {+∞} ∪ {−∞} is called a dualizing parameterization function for f iff Φ(x, 0) = f(x) for any feasible point of the problem (P). A function σ: P → [0, +∞] such that σ(0) = 0 and σ(p) > 0 for all p ≠ 0 is called an augmenting function. Let, finally, Λ be a vector space of multipliers, and let the pair (Λ, P) be equipped with a bilinear coupling function 〈·, ·〉: Λ × P → R.

Following the ideas of Rockafellar and Wets [92], define the augmented Lagrangian function

L(x, λ, c) = inf_{p∈P} (Φ(x, p) − 〈λ, p〉 + cσ(p)).    (22)

We suppose that L(x, λ, c) > −∞ for all x ∈ X, λ ∈ Λ and c > 0. Let us obtain simple necessary and sufficient conditions for the strict global parametric exactness of the augmented Lagrangian function L(x, λ, c) with the use of the localization principle. These conditions were first obtained by the author in [40].
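To connect (22) with classical constructions, consider (as an illustration of ours, not taken from the text) the equality-constrained problem min{ f(x) : h(x) = 0 } with the canonical dualizing parameterization and a quadratic augmenting function:

```latex
\Phi(x,p) =
\begin{cases}
  f(x), & h(x) = p,\\
  +\infty, & h(x) \neq p,
\end{cases}
\qquad \sigma(p) = \|p\|^2 .
\]
Then the infimum in (22) is attained at the single admissible point $p = h(x)$, which gives
\[
\mathscr{L}(x,\lambda,c) = f(x) - \langle \lambda, h(x)\rangle + c\,\|h(x)\|^2 ,
```

i.e., up to the sign convention for λ, the classical quadratic augmented Lagrangian of Hestenes-Powell-Rockafellar type. In particular, for this parameterization the attainment assumption on the infimum in (22) holds trivially.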

Remark 4.6. It is worth mentioning that in the context of the theory of augmented Lagrangian functions, a vector λ∗ ∈ Λ is a strictly exact tuning parameter of the function L(x, λ, c) iff λ∗ supports an exact penalty representation of the problem (P) (see [92], Definition 11.60). Furthermore, if the infimum in (22) is attained for all x, λ and c, then the strict global parametric exactness of the augmented Lagrangian function L(x, λ, c) is equivalent to the existence of an augmented Lagrange multiplier (see [92], Theorem 11.61, and [40], Proposition 4 and Corollary 1). Moreover, in this case λ∗ is a strictly exact tuning parameter iff it is an augmented Lagrange multiplier.

Remark 4.7. Clearly, the definitions of strict global parametric exactness and global parametric exactness do not coincide in the case of the augmented Lagrangian function L(x, λ, c) defined above. For example, let P be a normed space, Λ be the topological dual of P, σ(p) = ‖p‖^2, and

Φ(x, p) = f(x) + max{−1, −‖p‖}, if x ∈ Ω;    Φ(x, p) = +∞, if x ∉ Ω.


Then, as is easy to check, L(x, λ, c) is globally parametrically exact with the exact tuning parameter λ∗ = 0 and c∗(λ∗) = 0, but it is not strictly globally parametrically exact, since L(x, λ, c) < f(x) for all x ∈ Ω, λ ∈ Λ and c > 0. When one compares strict global parametric exactness and global parametric exactness in the case of the augmented Lagrangian L(x, λ, c), it appears that strict global parametric exactness is more natural in this case. Apart from the many connections of strict global parametric exactness with existing results on the augmented Lagrangian function L(x, λ, c) pointed out above, the application of the localization principle leads to simpler results in the case of strict global parametric exactness. In particular, it is rather difficult to verify that the validity of the condition Ω∗ ∩ argmin_{x∈A} L(x, λ∗, c) ≠ ∅ implies global parametric exactness, while the condition

Ω∗ ∩ argmin_{x∈A} L(x, λ∗, c) ≠ ∅,    min_{x∈A} L(x, λ∗, c) = f∗    (23)

is equivalent to the strict global parametric exactness of L(x, λ, c) under some natural assumptions (see Theorem 4.6 below).

Recall that the augmenting function σ is said to have a valley at zero iff for any neighbourhood U ⊂ P of zero there exists δ > 0 such that σ(p) ≥ δ for any p ∈ P \ U. The assumption that the augmenting function σ has a valley at zero is widely used in the literature on augmented Lagrangian functions (see, e.g., [15, 129, 130, 131]).
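Two simple illustrations (ours, not from the text) may help here. In a normed space, every augmenting function of power type has a valley at zero, while positivity away from zero alone is not enough:

```latex
\sigma_1(p) = \|p\|^{\gamma},\ \gamma > 0:\qquad
\inf_{\|p\| \ge r} \sigma_1(p) = r^{\gamma} > 0
\quad\text{(valley at zero)};
\]
\[
\sigma_2(p) = \frac{p^2}{1 + p^4}\ \ (P = \mathbb{R}):\qquad
\inf_{|p| \ge r} \sigma_2(p) = 0
\quad\text{(no valley at zero)} .
```

Indeed, σ_2 is a legitimate augmenting function (σ_2(0) = 0 and σ_2(p) > 0 for p ≠ 0), but σ_2(p) → 0 as |p| → ∞, so no uniform lower bound δ exists outside a neighbourhood of zero.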

Theorem 4.6 (Localization Principle for Augmented Lagrangian Functions). Suppose that the following assumptions are valid:

1. A and Ω are closed;

2. f and L(·, λ, c) for all λ ∈ Λ and c > 0 are l.s.c. on A;

3. Φ is l.s.c. on A × {0};

4. σ has a valley at zero;

5. there exists r > 0 such that for any c ≥ r, x ∈ A and λ ∈ Λ one has

argmin_{p∈P} (Φ(x, p) − 〈λ, p〉 + cσ(p)) ≠ ∅,

i.e. the infimum in (22) is attained.

Then the augmented Lagrangian function L(x, λ, c) is strictly globally parametrically exact if and only if there exist λ∗ and c_0 > 0 such that L(x, λ, c) is locally parametrically exact at every globally optimal solution of the problem (P) with the exact tuning parameter λ∗,

L(x∗, λ∗, c) = f∗  ∀x∗ ∈ Ω∗  ∀c ≥ c_0,

and one of the following two conditions is valid:

1. the function L(x, λ, c) is non-degenerate for λ = λ∗;

2. the set { x ∈ A | L(x, λ∗, c_0) < f∗ } is bounded.


Proof. The fact that the validity of (23) is equivalent to the strict global parametric exactness of the function L(x, λ, c) follows directly from [40], Proposition 4 and Corollary 1. Furthermore, by [40], Proposition 8, the function L(x, λ, c) is a penalty-type separating function for any λ ∈ Λ. Then applying Theorems 3.3 and 3.4 one obtains the desired result.

Remark 4.8. Note that under the assumptions of the theorem the multiplier λ∗ is a strictly exact tuning parameter if and only if it is an augmented Lagrange multiplier [40]. Thus, the theorem above, in essence, contains necessary and sufficient conditions for the existence of augmented Lagrange multipliers for the problem (P). See [40] for applications of this theorem to some particular optimization problems.

Note that from the localization principle it follows that for the strict global parametric exactness of the augmented Lagrangian L(x, λ, c) it is necessary that there exists a tuning parameter λ∗ ∈ Λ such that λ∗ is a locally exact tuning parameter at every globally optimal solution of the problem (P). One can give a simple interpretation of this condition in the case when L(x, λ, c) is a proximal Lagrangian. Namely, let L(x, λ, c) be the proximal Lagrangian (see [92], Example 11.57), and suppose that it is strictly globally parametrically exact with a strictly exact tuning parameter λ∗ ∈ Λ. By the definition of strict global exactness, any globally optimal solution x∗ of the problem (P) is a global minimizer of the function L(·, λ∗, c) for all sufficiently large c. Then, applying the first order necessary optimality condition to the function L(·, λ∗, c), one can easily verify that under natural assumptions the pair (x∗, λ∗) is a KKT-point of the problem (P) for any x∗ ∈ Ω∗ (see [100], Proposition 3.1). Consequently, one gets that for the strict global parametric exactness of the augmented Lagrangian function L(x, λ, c) it is necessary that there exists a Lagrange multiplier λ∗ such that the pair (x∗, λ∗) is a KKT-point of the problem (P) for any globally optimal solution x∗ of this problem. In particular, if there exist two globally optimal solutions of the problem (P) with disjoint sets of Lagrange multipliers, then the proximal Lagrangian cannot be strictly globally parametrically exact.

For the sake of completeness, let us mention that in the case of augmented Lagrangian functions, sufficient conditions for local exactness are typically derived with the use of sufficient optimality conditions. In particular, the validity of second order sufficient optimality conditions at a given globally optimal solution x∗ guarantees that the proximal Lagrangian is locally parametrically exact at x∗ with the corresponding Lagrange multiplier being a locally exact tuning parameter (see, e.g., [5], Theorem 9.3.3; [104], Theorem 2.1; [72], Theorem 2; [107], Theorem 2.3; [80], Theorems 3.1 and 3.2; [126], Theorem 2.8; [131], Proposition 3.1; [125], Theorem 2.3; [114], Theorem 3, etc.).

5 Conclusions

In this paper we developed a general theory of global parametric exactness of separating functions for finite-dimensional constrained optimization problems. This theory allows one to reduce a constrained optimization problem to an unconstrained one, provided an exact tuning parameter is known. With the use of the general results obtained in this article we recovered existing results on the global exactness of linear penalty functions and Rockafellar-Wets’ augmented Lagrangian function. We also obtained new simple necessary and sufficient conditions for the global exactness of nonlinear and continuously differentiable penalty functions.

References

[1] R. Andreani, E. H. Fukuda, and P. J. Silva. A Gauss-Newton approach for solving constrained optimization problems using differentiable exact penalties. J. Optim. Theory Appl., 156:417–449, 2013.

[2] T. A. Antczak. Lower bound for the penalty parameter in the exact minimax penalty function method for solving nondifferentiable extremum problems. J. Optim. Theory Appl., 159:437–453, 2013.

[3] A. Auslender. Penalty and barrier methods: a unified framework. SIAM J. Optim., 10:211–230, 1999.

[4] A. Auslender, R. Cominetti, and M. Haddou. Asymptotic analysis for penalty and barrier methods in convex and linear programming. Math. Oper. Res., 22:43–62, 1997.

[5] M. S. Bazaraa, H. D. Sherali, and C. M. Shetty. Nonlinear Programming: Theory and Algorithms. John Wiley & Sons Inc., New Jersey, 2006.

[6] A. Ben-Tal and M. Zibulevsky. Penalty/barrier multiplier methods for convex programming problems. SIAM J. Optim., 7:347–366, 1997.

[7] D. P. Bertsekas. Necessary and sufficient conditions for a penalty method to be exact. Math. Program., 9:87–99, 1975.

[8] D. P. Bertsekas. Constrained Optimization and Lagrange Multiplier Methods. Academic Press, New York, 1982.

[9] L. Bingzhuang and Z. Wenling. A modified exact smooth penalty function for nonlinear constrained optimization. J. Inequal. Appl., 1, 2012.

[10] E. G. Birgin and J. M. Martinez. Practical Augmented Lagrangian Methods for Constrained Optimization. SIAM, Philadelphia, 2014.

[11] P. T. Boggs and J. W. Tolle. Augmented Lagrangians which are quadratic in the multiplier. J. Optim. Theory Appl., 31:17–26, 1980.

[12] J. F. Bonnans and C. H. Ramírez. Perturbation analysis of second-order cone programming problems. Math. Program., 104:205–227, 2005.

[13] J. F. Bonnans and A. Shapiro. Perturbation Analysis of Optimization Problems. Springer Science+Business Media, New York, 2000.

[14] R. S. Burachik, A. N. Iusem, and J. G. Melo. Duality and exact penalization for general augmented Lagrangians. J. Optim. Theory Appl., 147:125–140, 2010.

[15] R. S. Burachik and A. Rubinov. Abstract convexity and augmented Lagrangians. SIAM J. Optim., 18:413–436, 2007.


[16] R. S. Burachik, X. Q. Yang, and Y. Y. Zhou. Existence of augmented Lagrange multipliers for semi-infinite programming problems. J. Optim. Theory Appl., 173:471–503, 2017.

[17] J. V. Burke. An exact penalization viewpoint on constrained optimization. SIAM J. Control Optim., 29:968–998, 1991.

[18] G. Contaldi, G. Di Pillo, and S. Lucidi. A continuously differentiable exact penalty function for nonlinear programming problems with unbounded feasible set. Oper. Res. Lett., 14:153–161, 1993.

[19] V. F. Demyanov. Exact penalty function in problems of nonsmooth optimization. Vestn. St. Peterb. Univ. Math., 27:16–22, 1994.

[20] V. F. Demyanov. Nonsmooth optimization. In G. Di Pillo and F. Schoen, editors, Nonlinear Optimization. Lecture Notes in Mathematics, vol. 1989, pages 55–164. Springer-Verlag, Berlin-Heidelberg, 2010.

[21] V. F. Demyanov, G. Di Pillo, and F. Facchinei. Exact penalization via Dini and Hadamard conditional derivatives. Optim. Methods Softw., 9:19–36, 1998.

[22] G. Di Pillo. Exact penalty methods. In E. Spedicato, editor, Algorithms for Continuous Optimization: the State of the Art, pages 1–45. Kluwer Academic Press, Boston, 1994.

[23] G. Di Pillo and F. Facchinei. Exact barrier function methods for Lipschitz programs. Appl. Math. Optim., 32:1–31, 1995.

[24] G. Di Pillo and L. Grippo. A new class of augmented Lagrangians in nonlinear programming. SIAM J. Control Optim., 17:618–628, 1979.

[25] G. Di Pillo and L. Grippo. A new augmented Lagrangian function for inequality constraints in nonlinear programming problems. J. Optim. Theory Appl., 36:495–519, 1982.

[26] G. Di Pillo and L. Grippo. A continuously differentiable exact penalty function for nonlinear programming problems with inequality constraints. SIAM J. Control Optim., 23:72–84, 1985.

[27] G. Di Pillo and L. Grippo. An exact penalty method with global convergence properties for nonlinear programming problems. Math. Program., 36:1–18, 1986.

[28] G. Di Pillo and L. Grippo. On the exactness of a class of nondifferentiable penalty functions. J. Optim. Theory Appl., 57:399–410, 1988.

[29] G. Di Pillo and L. Grippo. Exact penalty functions in constrained optimization. SIAM J. Control Optim., 27:1333–1360, 1989.

[30] G. Di Pillo, L. Grippo, and F. Lampariello. A method for solving equality constrained optimization problems by unconstrained minimization. In K. Iracki, K. Malanowski, and S. Walukiewicz, editors, Optimization Techniques: Proceedings of the 9th IFIP Conference on Optimization Techniques, pages 96–105. Springer-Verlag, Berlin, Heidelberg, 1980.


[31] G. Di Pillo, G. Liuzzi, S. Lucidi, and L. Palagi. An exact augmented Lagrangian function for nonlinear programming with two-sided constraints. Comput. Optim. Appl., 25:57–83, 2003.

[32] G. Di Pillo, G. Liuzzi, S. Lucidi, and L. Palagi. Fruitful uses of smooth exact merit functions in constrained optimization. In G. Di Pillo and A. Murli, editors, High Performance Algorithms and Software for Nonlinear Optimization, pages 201–225. Kluwer Academic Publishers, Dordrecht, 2003.

[33] G. Di Pillo and S. Lucidi. On exact augmented Lagrangian functions in nonlinear programming. In G. Di Pillo and F. Giannessi, editors, Nonlinear Optimization and Applications, pages 85–100. Plenum Press, New York, 1996.

[34] G. Di Pillo and S. Lucidi. An augmented Lagrangian function with improved exactness properties. SIAM J. Optim., 12:376–406, 2001.

[35] G. Di Pillo, S. Lucidi, and L. Palagi. An exact penalty-Lagrangian approach for a class of constrained optimization problems with bounded variables. Optim., 28:129–148, 1993.

[36] G. Di Pillo, G. Liuzzi, and S. Lucidi. An exact penalty-Lagrangian approach for large-scale nonlinear programming. Optim., 60:223–252, 2011.

[37] M. V. Dolgopolik. Smooth exact penalty functions II: a reduction to standard exact penalty functions. Optim. Lett., 10:1541–1560, 2016.

[38] M. V. Dolgopolik. Smooth exact penalty functions: a general approach. Optim. Lett., 10:635–648, 2016.

[39] M. V. Dolgopolik. A unifying theory of exactness of linear penalty functions. Optim., 65:1167–1202, 2016.

[40] M. V. Dolgopolik. Existence of augmented Lagrange multipliers: reduction to exact penalty functions and localization principle. Math. Program., 2017.

[41] M. V. Dolgopolik. A unifying theory of exactness of linear penalty functions II: parametric penalty functions. Optim., 66:1577–1622, 2017.

[42] M. V. Dolgopolik. Augmented Lagrangian functions for cone constrained optimization: the existence of global saddle points and exact penalty property. J. Glob. Optim., 71:237–296, 2018.

[43] X. Du, Y. Liang, and L. Zhang. Further study on a class of augmented Lagrangians of Di Pillo and Grippo in nonlinear programming. J. Shanghai Univ. (Engl. Ed.), 10:293–298, 2006.

[44] X. Du, L. Zhang, and Y. Gao. A class of augmented Lagrangians for equality constraints in nonlinear programming problems. Appl. Math. Comput., 172:644–663, 2006.

[45] I. I. Eremin. Penalty method in convex programming. Soviet Math. Dokl., 8:459–462, 1966.


[46] J. P. Evans, F. J. Gould, and J. W. Tolle. Exact penalty functions in nonlinear programming. Math. Program., 4:72–97, 1973.

[47] Y. G. Evtushenko and V. G. Zhadan. Exact auxiliary functions in nonconvex optimization. In W. Oettli and D. Pallaschke, editors, Advances in Optimization, pages 217–226. Springer-Verlag, Berlin, Heidelberg, 1992.

[48] Yu. G. Evtushenko, A. M. Rubinov, and V. G. Zhadan. General Lagrange-type functions in constrained global optimization. Part I: auxiliary functions and optimality conditions. Optim. Methods Softw., 16:193–230, 2001.

[49] Yu. G. Evtushenko, A. M. Rubinov, and V. G. Zhadan. General Lagrange-type functions in constrained global optimization. Part II: exact auxiliary functions. Optim. Methods Softw., 16:231–256, 2001.

[50] A. V. Fiacco and G. P. McCormick. Nonlinear Programming: Sequential Unconstrained Minimization Techniques. SIAM, Philadelphia, 1990.

[51] R. Fletcher. A class of methods for nonlinear programming with termination and convergence properties. In J. Abadie, editor, Integer and Nonlinear Programming, pages 157–173. North-Holland, Amsterdam, 1970.

[52] R. Fletcher. An exact penalty function for nonlinear programming with inequalities. Math. Program., 5:129–150, 1973.

[53] E. Fukuda and B. F. Lourenço. Exact augmented Lagrangian functions for nonlinear semidefinite programming. Comput. Optim. Appl., 71:457–482, 2018.

[54] E. H. Fukuda, P. J. S. Silva, and M. Fukushima. Differentiable exact penalty functions for nonlinear second-order cone programs. SIAM J. Optim., 22:1607–1633, 2012.

[55] R. N. Gasimov and A. M. Rubinov. On augmented Lagrangians for optimization problems with a single constraint. J. Glob. Optim., 28:153–173, 2004.

[56] F. Giannessi. Constrained Optimization and Image Space Analysis. Volume 1: Separation of Sets and Optimality Conditions. Springer, New York, 2005.

[57] F. Giannessi. On the theory of Lagrangian duality. Optim. Lett., 1:9–20, 2007.

[58] T. Glad and E. Polak. A multiplier method with automatic limitation of penalty growth. Math. Program., 17:140–155, 1979.

[59] S. P. Han and O. L. Mangasarian. Exact penalty functions in nonlinear programming. Math. Program., 17:251–269, 1979.

[60] S. P. Han and O. L. Mangasarian. A dual differentiable exact penalty function. Math. Program., 25:293–306, 1983.

[61] X. X. Huang and X. Q. Yang. A unified augmented Lagrangian approach to duality and exact penalization. Math. Oper. Res., 28:533–552, 2003.


[62] X. X. Huang and X. Q. Yang. Further study on augmented Lagrangian duality theory. J. Glob. Optim., 31:193–210, 2005.

[63] W. Huyer and A. Neumaier. A new exact penalty function. SIAM J. Optim., 13:1141–1158, 2003.

[64] A. D. Ioffe. Necessary and sufficient conditions for a local minimum. I: a reduction theorem and first order conditions. SIAM J. Control Optim., 17:245–250, 1979.

[65] C. Jiang, Q. Lin, C. Yu, K. L. Teo, and G.-R. Duan. An exact penalty method for free terminal time optimal control problem with continuous inequality constraints. J. Optim. Theory Appl., 154:30–53, 2012.

[66] C. Kan and W. Song. Augmented Lagrangian duality for composite optimization problems. J. Optim. Theory Appl., 165:763–784, 2015.

[67] C. Kan and W. Song. Second-order conditions for existence of augmented Lagrange multipliers for eigenvalue composite optimization problems. J. Glob. Optim., 63:77–97, 2015.

[68] B. Li, C. J. Yu, K. L. Teo, and G. R. Duan. An exact penalty function method for continuous inequality constrained optimal control problem. J. Optim. Theory Appl., 151:260–291, 2011.

[69] J. Li, S. Q. Feng, and Z. Zhang. A unified approach for constrained extremum problems: image space analysis. J. Optim. Theory Appl., 159:69–92, 2013.

[70] Q. Lin, R. Loxton, K. L. Teo, and Y. H. Wu. Optimal feedback control for dynamic systems with state constraints: an exact penalty approach. Optim. Lett., 8:1535–1551, 2014.

[71] Q. Lin, R. Loxton, K. L. Teo, Y. H. Wu, and C. Yu. A new exact penalty method for semi-infinite programming problems. J. Comput. Appl. Math., 261:271–286, 2014.

[72] Q. Liu, W. M. Tang, and X. M. Yang. Properties of saddle points for generalized augmented Lagrangian. Math. Meth. Oper. Res., 69:111–124, 2009.

[73] Q. Liu and X. Yang. Zero duality and saddle points of a class of augmented Lagrangian functions in constrained non-convex optimization. Optim., 57:655–667, 2008.

[74] Y. J. Liu and L. W. Zhang. Convergence analysis of the augmented Lagrangian method for nonlinear second-order cone optimization problems. Nonlinear Anal.: Theory, Methods, Appl., 67:1359–1373, 2007.

[75] Y. J. Liu and L. W. Zhang. Convergence of the augmented Lagrangian method for nonlinear optimization problems over second-order cones. J. Optim. Theory Appl., 139:557–575, 2008.

[76] S. Lucidi. New results on a class of exact augmented Lagrangians. J. Optim. Theory Appl., 58:259–282, 1988.


[77] S. Lucidi. New results on a continuously differentiable exact penalty function. SIAM J. Optim., 2:558–574, 1992.

[78] H. Luo, H. Wu, and J. Liu. Some results on augmented Lagrangians in constrained global optimization via image space analysis. J. Optim. Theory Appl., 159:360–385, 2013.

[79] H. Luo, H. Wu, and J. Liu. On saddle points in semidefinite optimization via separation scheme. J. Optim. Theory Appl., 165:113–150, 2015.

[80] H. Z. Luo, G. Mastroeni, and H. X. Wu. Separation approach for augmented Lagrangians in constrained nonconvex optimization. J. Optim. Theory Appl., 144:275–290, 2010.

[81] H. Z. Luo, H. X. Wu, and G. T. Chen. On the convergence of augmented Lagrangian methods for nonlinear semidefinite programming. J. Glob. Optim., 54:599–618, 2012.

[82] C. Ma, X. Li, K.-F. Cedric Yiu, and L.-S. Zhang. New exact penalty function for solving constrained finite min-max problems. Appl. Math. Mech.-Engl. Ed., 33:253–270, 2012.

[83] C. Ma and L. Zhang. On an exact penalty function method for nonlinear mixed discrete programming problems and its applications in search engine advertising problems. Appl. Math. Comput., 271:642–656, 2015.

[84] O. L. Mangasarian. Sufficiency of exact penalty minimization. SIAM J. Control Optim., 23:30–37, 1985.

[85] G. Mastroeni. Nonlinear separation in the image space with applications to penalty methods. Appl. Anal., 91:1901–1914, 2012.

[86] Z. Meng, C. Dang, M. Jiang, X. Xu, and R. Shen. Exactness and algorithm of an objective penalty function. J. Glob. Optim., 56:691–711, 2013.

[87] Z. Meng, Q. Hu, C. Dang, and X. Yang. An objective penalty function method for nonlinear programming. Appl. Math. Lett., 17:683–689, 2004.

[88] H. Mukai and E. Polak. A quadratically convergent primal-dual algorithm with global convergence properties for solving optimization problems with equality constraints. Math. Program., 9:336–349, 1975.

[89] A. Nedic and A. Ozdaglar. A geometric framework for nonconvex optimization duality using augmented Lagrangian functions. J. Glob. Optim., 40:545–573, 2008.

[90] J.-P. Penot and A. M. Rubinov. Multipliers and general Lagrangians. Optim., 54:443–467, 2005.

[91] T. Pietrzykowski. An exact potential method for constrained maxima. SIAM J. Numer. Anal., 6:299–304, 1969.

[92] R. T. Rockafellar and R. J.-B. Wets. Variational Analysis. Springer-Verlag, Berlin, 1998.


[93] E. Rosenberg. Exact penalty functions and stability in locally Lipschitz programming. Math. Program., 30:340–356, 1984.

[94] A. M. Rubinov. Abstract Convexity and Global Optimization. Kluwer Academic Publishers, Boston–Dordrecht–London, 2000.

[95] A. M. Rubinov and R. N. Gasimov. Strictly increasing positively homogeneous functions with applications to exact penalization. Optim., 52:1–28, 2003.

[96] A. M. Rubinov, B. M. Glover, and X. Q. Yang. Decreasing functions with applications to penalization. SIAM J. Optim., 10:289–313, 1999.

[97] A. M. Rubinov and X. Q. Yang. Lagrange-Type Functions in Constrained Non-Convex Optimization. Kluwer Academic Publishers, Dordrecht, 2003.

[98] A. M. Rubinov, X. Q. Yang, and A. M. Bagirov. Penalty functions with a small penalty parameter. Optim. Methods Softw., 17:931–964, 2002.

[99] J.-J. Rückmann and A. Shapiro. Augmented Lagrangians in semi-infinite programming. Math. Program., Ser. B., 116:499–512, 2009.

[100] A. Shapiro and J. Sun. Some properties of the augmented Lagrangian in cone constrained optimization. Math. Oper. Res., 29:479–491, 2004.

[101] D. Sun, J. Sun, and L. Zhang. The rate of convergence of the augmented Lagrangian method for nonlinear semidefinite programming. Math. Program., 114:349–391, 2008.

[102] J. Sun. On methods for solving nonlinear semidefinite optimization problems. Numer. Algebra Control Optim., 1:1–14, 2011.

[103] J. Sun, L. W. Zhang, and Y. Wu. Properties of the augmented Lagrangian in nonlinear semidefinite optimization. J. Optim. Theory Appl., 129:437–456, 2006.

[104] X. L. Sun, D. Li, and K. I. M. McKinnon. On saddle points of augmented Lagrangians for constrained nonconvex optimization. SIAM J. Optim., 15:1128–1146, 2005.

[105] C. Wang, Q. Liu, and B. Qu. Global saddle points of nonlinear augmented Lagrangian functions. J. Glob. Optim., 68:125–146, 2017.

[106] C. Wang, C. Ma, and J. Zhou. A new class of exact penalty functions and penalty algorithms. J. Glob. Optim., 58:51–73, 2014.

[107] C. Wang, J. Zhou, and X. Xu. Saddle points theory of two classes of augmented Lagrangians and its applications to generalized semi-infinite programming. Appl. Math. Optim., 59:413–434, 2009.

[108] C.-Y. Wang and D. Li. Unified theory of augmented Lagrangian methods for constrained global optimization. J. Glob. Optim., 44:433–458, 2009.

[109] C. Y. Wang, X. Q. Yang, and X. M. Yang. Unified nonlinear Lagrangian approach to duality and optimal paths. J. Optim. Theory Appl., 135:85–100, 2007.


[110] C. Y. Wang, X. Q. Yang, and X. M. Yang. Nonlinear augmented Lagrangian and duality theory. Math. Oper. Res., 38:740–760, 2012.

[111] Z. Wen, D. Goldfarb, and W. Yin. Alternating direction augmented Lagrangian methods for semidefinite programming. Math. Program. Comput., 2:203–230, 2010.

[112] H. Wu, H. Luo, X. Ding, and G. Chen. Global convergence of modified augmented Lagrangian methods for nonlinear semidefinite programming. Comput. Optim. Appl., 56:531–558, 2013.

[113] H. X. Wu and H. Z. Luo. Saddle points of general augmented Lagrangians for constrained nonconvex optimization. J. Glob. Optim., 53:683–697, 2012.

[114] H. X. Wu, H. Z. Luo, and J. F. Yang. Nonlinear separation approach for the augmented Lagrangian in nonlinear semidefinite programming. J. Glob. Optim., 59:695–727, 2014.

[115] Z. Y. Wu, F. S. Bai, X. Q. Yang, and L. S. Zhang. An exact lower order penalty function and its smoothing in nonlinear programming. Optim., 53:51–68, 2004.

[116] Y. D. Xu and S. J. Li. Nonlinear separation functions and constrained extremum problems. Optim. Lett., 8:1149–1160, 2014.

[117] H. Yamashita and H. Yabe. A survey of numerical methods for nonlinear semidefinite programming. J. Oper. Res. Soc. Jpn., 58:24–60, 2015.

[118] X. Q. Yang and X. X. Huang. Partially strictly monotone and nonlinear penalty functions for constrained mathematical programs. Comput. Optim. Appl., 25:293–311, 2003.

[119] Yu. G. Yevtushenko and V. G. Zhadan. Exact auxiliary functions in optimization problems. U.S.S.R. Comput. Math. Math. Phys., 30:31–42, 1990.

[120] W. I. Zangwill. Nonlinear programming via penalty functions. Manag. Sci., 13:344–358, 1967.

[121] A. J. Zaslavski. Optimization on Metric and Normed Spaces. Springer Science+Business Media, New York, 2010.

[122] L. Zhang and X. Yang. An augmented Lagrangian approach with a variable transformation in nonlinear programming. Nonlinear Anal., 69:2095–2113, 2008.

[123] X. Y. Zhao, D. Sun, and K.-C. Toh. A Newton-CG augmented Lagrangian method for semidefinite programming. SIAM J. Optim., 20:1737–1765, 2010.

[124] F. Zheng and L. Zhang. Constrained global optimization using a new exact penalty function. In D. Gao, N. Ruan, and W. Xing, editors, Advances in Global Optimization, pages 69–76. Springer, Cham, 2015.


[125] J. Zhou and J. S. Chen. On the existence of saddle points for nonlinear second-order cone programming problems. J. Glob. Optim., 62:459–480, 2015.

[126] J. Zhou, N. Xiu, and C. Wang. Saddle point and exact penalty representation for generalized proximal Lagrangians. J. Glob. Optim., 56:669–687, 2012.

[127] Y. Y. Zhou and X. Q. Yang. Some results about duality and exact penalization. J. Glob. Optim., 29:497–509, 2004.

[128] Y. Y. Zhou and X. Q. Yang. Augmented Lagrangian function, non-quadratic growth condition and exact penalization. Oper. Res. Lett., 34:127–134, 2006.

[129] Y. Y. Zhou and X. Q. Yang. Duality and penalization in optimization via an augmented Lagrangian function with applications. J. Optim. Theory Appl., 140:171–188, 2009.

[130] Y. Y. Zhou and X. Q. Yang. Augmented Lagrangian functions for constrained optimization problems. J. Glob. Optim., 52:95–108, 2012.

[131] Y. Y. Zhou, J. C. Zhou, and X. Q. Yang. Existence of augmented Lagrange multipliers for cone constrained optimization problems. J. Glob. Optim., 58:243–260, 2014.

[132] S. K. Zhu and S. J. Li. Unified duality theory for constrained extremum problems. Part I: image space analysis. J. Optim. Theory Appl., 161:738–762, 2014.

[133] S. K. Zhu and S. J. Li. Unified duality theory for constrained extremum problems. Part II: special duality schemes. J. Optim. Theory Appl., 161:763–782, 2014.
