decomposition and invariance of measures, and statistical transformation models
Lecture Notes in Statistics
Edited by J. Berger, S. Fienberg, J. Gani, K. Krickeberg, and B. Singer
58
Ole E. Barndorff-Nielsen, Preben Blæsild, Poul Svante Eriksen
Decomposition and Invariance of Measures, and Statistical Transformation Models
Springer-Verlag New York Berlin Heidelberg London Paris Tokyo Hong Kong
Authors
Ole E. Barndorff-Nielsen, Preben Blæsild, Department of Theoretical Statistics, Institute of Mathematics, Aarhus University, 8000 Aarhus, Denmark
Poul Svante Eriksen, Department of Mathematics and Computer Science, Institute of Electronic Systems, Aalborg University Center, Strandvejen 19, 9000 Aalborg, Denmark
Mathematical Subject Classification: 20-02, 20G99, 22-02, 22D99, 22E99, 28-02, 28A50, 28C10, 53C21, 53C65, 57S20, 57S25, 58A15, 58C35, 62-02, 62A05, 62A10, 62E15, 62F99, 62H99.
ISBN-13: 978-0-387-97131-5 e-ISBN-13: 978-1-4612-3682-5 DOI: 10.1007/978-1-4612-3682-5
This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in other ways, and storage in data banks. Duplication of this publication or parts thereof is only permitted under the provisions of the German Copyright Law of September 9, 1965, in its version of June 24, 1985, and a copyright fee must always be paid. Violations fall under the prosecution act of the German Copyright Law.
© Springer-Verlag Berlin Heidelberg 1989
2847/3140-543210 - Printed on acid-free paper
Preface
The present set of notes grew out of our interest in the study of
statistical transformation models, in particular exponential
transformation models. The latter class comprises as special cases all
fully tractable models for multivariate normal observations. The theory
of decomposition and invariance of measures provides essential tools
for the study of transformation models. While the major aspects of that
theory are treated in a number of mathematical monographs, mostly as
part of much broader contexts, we have found no single account in the
literature which is sufficiently comprehensive for statistical
purposes. This volume aims to fill the gap and to indicate the
usefulness of measure decomposition and invariance theory for the
methodology of statistical transformation models.
In the course of the work with these notes we have benefitted much
from discussions with Steen Arne Andersson, Jørgen Hoffmann-Jørgensen
and Jørgen Granfeldt Petersen. We are also very indebted to Jette
Hamborg and Oddbjørg Wethelund for their eminent secretarial
assistance.
May 1989
Ole E. Barndorff-Nielsen
Preben Blæsild
Poul Svante Eriksen
CONTENTS
Preface
1. Introduction
2. Topological groups and actions
3. Matrix Lie groups
4. Invariant, relatively invariant, and quasi-invariant measures
5. Decomposition and factorization of measures
6. Construction of invariant measures
7. Exterior calculus
8. Statistical transformation models
Further results and exercises
References, with author index
Subject index
Notation index
1. Introduction
Decomposition or disintegration of measures and construction of
invariant measures play essential roles in mathematics and in various
fields of applied mathematics.
In particular, the mathematical methodology in question is required
for certain advanced parts of parametric statistics. See, for instance,
Fraser (1979), Muirhead (1982), Barndorff-Nielsen, Blæsild, Jensen and
Jørgensen (1982), Eaton (1983), Baddeley (1983), Andersson, Brøns and
Jensen (1983), Farrell (1985), and Barndorff-Nielsen (1983, 1988).
However, a comprehensive exposition of the relevant mathematical
results is not available in the statistical literature, nor in the
mathematical, and where the various results can be found they are often
rather inaccessible, being included in some advanced and more
comprehensive mathematical treatise.
The present notes constitute an attempt to remedy this situation
somewhat, particularly as concerns the need in statistics. An important
starting base for the work has been an excellent set of notes by
Andersson (1978). We have tried to strike a suitable balance between a
complete account of the mathematical theory and a mere skeleton of
formulas, thus providing the interested reader with enough details and
references to enable him to complete the account mathematically, if he
so desires. The reader is assumed to have some, rather limited,
elementary knowledge of topology, group theory and differential
geometry, and - if he or she wishes to study the statistical
applications in section 8 - a considerable knowledge of parametric
statistics.
In section 2 we discuss the actions of groups on spaces, with the
ensuing concepts of orbital decompositions and right and left
factorizations of groups. Some basic results on matrix Lie groups and
the associated Lie algebras are provided in section 3. Section 4
contains the definitions of invariant, relatively invariant and
quasi-invariant measures and conditions for the existence of such
measures, while methods for constructing invariant measures are
considered in section 6. Part of the material of section 6 is
intimately connected to questions of decomposition and factorization of
measures, which is the subject of section 5. The exterior calculus of
differential geometry often provides an efficient and elegant way of
decomposing a measure or finding an invariant measure. The most
relevant aspects of exterior calculus are outlined in section 7.
The final section 8 illustrates the usefulness of the mathematical
tools by deriving the key properties of statistical transformation
models. That section has been organized with the aim of enabling a
reader with only a limited knowledge of the material in sections 2-7 to
follow the main lines of the developments. However, a good general
knowledge of parametric statistics is needed for a full appreciation of
the discussion in section 8.
Another area of statistics in which decomposition, or disintegration,
of measures is of great importance is that of spatial statistics.
However, except for a derivation of the Blaschke-Petkantschin formula,
we do not touch upon that area here. In particular, we do not discuss
Palm measures and Gibbs kernels. Outstanding accounts of the
mathematical theory of these are available in Kallenberg (1983) and
Karr (1986). See also Stoyan, Kendall and Mecke (1987) and Karr (1988).
Examples are considered throughout the book, and a small collection
of further results and exercises is included at the end. In particular,
an outline of the main properties of exponential transformation models
is given as exercise 25. (Most of the examples in section 8 are, in
fact, concerned with models of this type.)
Some of the results discussed, especially in section 8, are novel
but most of the material stems from the existing literature. The
relation of the material presented here to other parts of the
literature is indicated in the bibliographical notes which conclude
each of the sections 2-8.
2. Topological groups and actions
In this section we introduce the concept of an action of a topological
group $G$ on a topological space $\mathcal{X}$. Furthermore we lay down
a set of topological conditions to ensure that there exists a measure
on $\mathcal{X}$ which is invariant under the action of $G$.
A group $G$ that is also a topological space is called a topological
group if the mapping
$$G \times G \to G, \qquad (g_1, g_2) \mapsto g_1 g_2^{-1}$$
is continuous (with respect to the product topology on $G \times G$ and
the topology on $G$).
Let $\mathcal{X}$ be a topological space and $G$ a topological group.
A mapping $\gamma$ from $G$ into the symmetric group
$\mathcal{S}(\mathcal{X})$ over $\mathcal{X}$, i.e. the set of
one-to-one transformations of $\mathcal{X}$ onto $\mathcal{X}$ with
composition of transformations as composition rule, is called an action
of $G$ on $\mathcal{X}$ if
(i) $\gamma$ is a homomorphism
(ii) the mapping
$$(g, x) \mapsto \gamma(g)(x)$$
is continuous.
It is clear from (ii) that $\gamma(g)$ is continuous and since (i)
implies that $\gamma(g)^{-1} = \gamma(g^{-1})$, we must have that
$\gamma(g)$ is a homeomorphism for all $g \in G$. For short we often
write $gx$ for $\gamma(g)(x)$. The subset $Gx = \{gx \mid g \in G\}$ is
called the orbit of $x$, and the set $\mathcal{X}$ is partitioned into
the collection of orbits. We will call this collection the orbit space
and denote it by $G \backslash \mathcal{X}$. The mapping
$$\pi: \mathcal{X} \to G \backslash \mathcal{X}, \qquad x \mapsto Gx$$
is called the orbit projection. We will assume that
$G \backslash \mathcal{X}$ is endowed with the quotient topology, i.e.
$A \subseteq G \backslash \mathcal{X}$ is open if
$\pi^{-1}(A) \subseteq \mathcal{X}$ is open.
We say that $G$ acts transitively on $\mathcal{X}$ if there is one
orbit only, i.e. for every $x_1, x_2 \in \mathcal{X}$ there exists a
$g \in G$ such that $g x_1 = x_2$. Since $\mathcal{X}$ is said to be a
homogeneous space if for all $x_1, x_2 \in \mathcal{X}$ there exists a
homeomorphism $f$ of $\mathcal{X}$ so that $f(x_1) = x_2$, it is clear
that if $G$ acts transitively on $\mathcal{X}$ then $\mathcal{X}$ is a
homogeneous space.
Example 2.1. Linear and affine groups. Consider the general linear
group $GL(n)$ consisting of the invertible $n \times n$ matrices endowed
with the usual matrix multiplication. The group $GL(n)$ is a
topological group acting linearly on $\mathbb{R}^n$ by
$$GL(n) \times \mathbb{R}^n \to \mathbb{R}^n, \qquad (A, x) \mapsto Ax$$
where $Ax$ is the ordinary matrix-vector product. The space
$\mathbb{R}^n$ is divided into the two orbits $\{0\}$ and
$\mathbb{R}^n \setminus \{0\}$, so that $GL(n)$ acts transitively on
$\mathbb{R}^n \setminus \{0\}$.
The subgroup of $GL(n)$ consisting of matrices $A$ having positive
determinant is denoted by $GL_+(n)$ and is termed the positive general
linear group.
The group $GL(n)$ is a subgroup of the general affine group $GA(n)$
which consists of all pairs $[A, f]$ where $A \in GL(n)$ and
$f \in \mathbb{R}^n$, the group operation being given by
$$[A', f'][A, f] = [A'A, A'f + f'].$$
Restricting $A$ to have positive determinant one obtains the positive
general affine group $GA_+(n)$.
The two groups $GA(n)$ and $GA_+(n)$ act transitively on
$\mathbb{R}^n$ by
$$([A, f], x) \mapsto Ax + f.$$
□
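The composition rule of the affine group can be checked numerically. The following is a minimal sketch in plain Python for $n = 2$; the helper names `mat_mul`, `mat_vec`, `compose` and `act` are illustrative assumptions and do not come from the text. It verifies that acting with the product $[A', f'][A, f]$ agrees with acting first with $[A, f]$ and then with $[A', f']$.

```python
def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def mat_vec(A, x):
    """Ordinary matrix-vector product Ax."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

def compose(g2, g1):
    """Group operation [A', f'][A, f] = [A'A, A'f + f']."""
    (A2, f2), (A1, f1) = g2, g1
    shifted = [a + b for a, b in zip(mat_vec(A2, f1), f2)]
    return (mat_mul(A2, A1), shifted)

def act(g, x):
    """Action ([A, f], x) -> Ax + f."""
    A, f = g
    return [a + b for a, b in zip(mat_vec(A, x), f)]

g1 = ([[2, 1], [0, 1]], [1, -1])   # [A, f], det A = 2
g2 = ([[1, 3], [1, 4]], [0, 2])    # [A', f'], det A' = 1
x = [5, 7]

# The action is a homomorphism: (g2 g1) x = g2 (g1 x).
assert act(compose(g2, g1), x) == act(g2, act(g1, x))
```

Integer entries are used so that the comparison is exact.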
The action of $G$ on $\mathcal{X}$ is said to be free if all orbits
are 'copies' of $G$, i.e. for all $x \in \mathcal{X}$ we have that, if
$y \in Gx$ then there exists only one $g \in G$ such that $y = gx$.
For $x \in \mathcal{X}$ we define $G_x$, the isotropy group or
isotropic group of $x$, as the subgroup of $G$ consisting of the
elements that leave $x$ fixed, i.e. $G_x = \{g \in G \mid gx = x\}$.
Obviously, $G$ acts freely if and only if $G_x = \{e\}$, the identity
element, for all $x \in \mathcal{X}$.
Now let us choose orbit representatives $u$, i.e. on each orbit of
$\mathcal{X}$ we select one point $u$ to represent that orbit. We may
think of $u$ as a function of $x$, $u(x)$ being the representative of
the orbit $Gx$, and this function is a maximal invariant since
$u(gx) = u(x)$ and $u(x_1) \neq u(x_2)$ if $x_2 \notin Gx_1$. If $G$
acts freely then there exists a unique element $z(x) \in G$ such that
$x = z(x)u(x)$, and since $z(gx) = gz(x)$ we have that $z$ is
equivariant. In this way we have defined a kind of coordinate system on
$\mathcal{X}$ and the mapping $x \mapsto (z, u)$ is termed an orbital
decomposition of $x$.
Example 2.2. Scale action. The positive multiplicative group
$\mathbb{R}_+^*$ acts on $\mathbb{R}^n$ by
$$\mathbb{R}_+^* \times \mathbb{R}^n \to \mathbb{R}^n, \qquad (a, x) \mapsto ax.$$
The orbits are $\{0\}$ and half-lines extending from zero. One finds
that $\mathbb{R}_+^*$ acts freely on $\mathbb{R}^n \setminus \{0\}$ and
that $x \mapsto (\|x\|, \|x\|^{-1} x)$, $\|\cdot\|$ indicating the
Euclidean norm on $\mathbb{R}^n$, is an orbital decomposition of $x$.
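As an illustration of the scale action, the following minimal sketch computes the orbital decomposition $x \mapsto (\|x\|, \|x\|^{-1} x)$ and checks that the second component is invariant and the first equivariant. The function name `decompose` and the test vector are assumptions made for the example only.

```python
import math

def decompose(x):
    """Return (z, u) = (||x||, x / ||x||) for a nonzero vector x."""
    z = math.sqrt(sum(t * t for t in x))
    return z, [t / z for t in x]

x = [3.0, 4.0]
a = 2.5                                   # an element of the positive reals
z, u = decompose(x)
z_a, u_a = decompose([a * t for t in x])  # decomposition of ax

# u is invariant: u(ax) = u(x).
assert all(math.isclose(p, q) for p, q in zip(u, u_a))
# z is equivariant: z(ax) = a z(x).
assert math.isclose(z_a, a * z)
# x is recovered as z(x) u(x).
assert all(math.isclose(z * p, t) for p, t in zip(u, x))
```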
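Example 2.3. Location-scale action. Let
$G = GA_+(1) = \mathbb{R}_+^* \times \mathbb{R}$ be the group of
location-scale transformations. Then, given an
$x_0 \in \mathbb{R}^n \setminus \{0\}$, $G$ acts on $\mathbb{R}^n$ by
$$G \times \mathbb{R}^n \to \mathbb{R}^n, \qquad ([a, f], x) \mapsto ax + f x_0.$$
The isotropic groups are seen to take the form
$$G_x = \begin{cases} \{[a, \alpha(1-a)] \mid a > 0\} & \text{if } x = \alpha x_0,\ \alpha \in \mathbb{R} \\ \{e\} & \text{otherwise.} \end{cases}$$
It follows that $G$ acts freely on
$\mathcal{X} = \mathbb{R}^n \setminus \{x \mid x = \alpha x_0,\ \alpha \in \mathbb{R}\}$.
Let $\langle \cdot, \cdot \rangle$ be an inner product on
$\mathbb{R}^n$ and define
$$\bar{x} = \langle x_0, x_0 \rangle^{-1} \langle x_0, x \rangle,$$
$$s(x)^2 = \langle x_0, x_0 \rangle^{-1} \langle x, x \rangle - \bar{x}^2.$$
Then $u(x) = s(x)^{-1}(x - \bar{x} x_0)$ is an orbit representative and
$x \mapsto ((s(x), \bar{x}), u(x))$ is an orbital decomposition. □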
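The location-scale decomposition can be sketched numerically. The snippet below is a minimal illustration, assuming the standard inner product and $x_0 = (1, 1, 1)$, so that $\bar{x}$ is the sample mean and $s(x)$ the standard deviation with divisor $n$; the function names are hypothetical. It checks that $u$ is invariant and that $(s, \bar{x})$ transforms as $[a, f][s, \bar{x}] = [as, a\bar{x} + f]$.

```python
import math

def inner(x, y):
    """Standard inner product on R^n (an assumption for this sketch)."""
    return sum(a * b for a, b in zip(x, y))

def decompose(x, x0):
    """Return ((s, xbar), u) such that x = s * u + xbar * x0."""
    n0 = inner(x0, x0)
    xbar = inner(x0, x) / n0
    s = math.sqrt(inner(x, x) / n0 - xbar ** 2)
    u = [(xi - xbar * x0i) / s for xi, x0i in zip(x, x0)]
    return (s, xbar), u

x0 = [1.0, 1.0, 1.0]
x = [1.0, 2.0, 6.0]
a, f = 3.0, -2.0                                    # group element [a, f]
gx = [a * xi + f * x0i for xi, x0i in zip(x, x0)]   # ax + f x0

(s, xbar), u = decompose(x, x0)
(s_g, xbar_g), u_g = decompose(gx, x0)

# u is invariant under the action.
assert all(math.isclose(p, q, abs_tol=1e-12) for p, q in zip(u, u_g))
# z = (s, xbar) is equivariant: z(gx) = [a, f] z(x).
assert math.isclose(s_g, a * s)
assert math.isclose(xbar_g, a * xbar + f)
```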
Example 2.4. Commutator action. Let $H = \{U_i\}_{i=1}^{p}$ be a
finite subgroup of the orthogonal group
$O(n) = \{U \in GL(n) \mid UU^* = I_n\}$. Consider the commutator group
$$C(n,H) = \{A \in GL(n) \mid A U_i = U_i A,\ i = 1, \ldots, p\}.$$
Then $C(n,H)$ acts on $\mathbb{R}^n$ by
$$(A, x) \mapsto Ax.$$
The structure of such actions is totally clarified and, in a
statistical context, the structure is described in Andersson, Brøns and
Jensen (1982). Andersson, Brøns and Jensen study the class of
$n$-dimensional normal distributions with a covariance matrix which is
invariant under $H$. The set of such covariance matrices is given by
$\{AA^* \mid A \in C(n,H)\}$. Here we just mention a few properties of
the prescribed actions.
Define the $n \times n$ symmetric matrix
$$s(x)$$
and suppose
$$\mathcal{X} = \{x \in \mathbb{R}^n \mid s(x) \text{ is positive definite}\}$$
is non-empty. Then it can be shown that $\mathcal{X}$ is open and
$\mathbb{R}^n \setminus \mathcal{X}$ has Lebesgue measure zero.
Moreover, $C(n,H)$ acts freely on $\mathcal{X}$.
Further, under the class of $n$-dimensional normal distributions with
mean vector $0$ and a covariance matrix of the form $AA^*$,
$A \in C(n,H)$, we have that $s(x)$ is the maximum likelihood estimate
of $AA^*$.
A very simple example is provided by taking $H$ to be the group of
permutations, i.e. if $\sigma \in \mathcal{S}(n)$ - the symmetric group
of order $n$ - then $U(\sigma) \in H$ is the mapping that permutes the
coordinates of $x$ according to $\sigma$, and
$$C(n,H) = \{A = \{a_{ij}\} \in GL(n) \mid a_{ii} = a,\ i = 1, \ldots, n,\ a_{ij} = b,\ i \neq j\}.$$
Furthermore, $s(x)_{i,j}$ depends only on whether $i = j$ or
$i \neq j$.
The associated normal model is the class of distributions that are
invariant under permutation of the coordinates. □
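The description of $C(n,H)$ for the permutation group can be checked directly for $n = 3$. The sketch below assumes that $U(\sigma)$ is the usual permutation matrix sending the basis vector $e_j$ to $e_{\sigma(j)}$; that convention, and the helper names, are assumptions of the example.

```python
from itertools import permutations

def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def perm_matrix(sigma):
    """Permutation matrix sending e_j to e_{sigma(j)} (assumed convention)."""
    n = len(sigma)
    return [[1 if sigma[j] == i else 0 for j in range(n)] for i in range(n)]

n, a, b = 3, 5, 2
# Matrix with a on the diagonal and b off the diagonal; det = (a-b)^2 (a+2b).
A = [[a if i == j else b for j in range(n)] for i in range(n)]
H = [perm_matrix(s) for s in permutations(range(n))]

# A commutes with every permutation matrix, so A belongs to C(n, H).
assert all(mat_mul(A, U) == mat_mul(U, A) for U in H)

# A diagonal matrix with distinct entries does not commute with all of H.
B = [[1, 0, 0], [0, 2, 0], [0, 0, 3]]
assert any(mat_mul(B, U) != mat_mul(U, B) for U in H)
```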
When $G$ does not act freely, it is often the case that we can
choose an orbit representative $u$ so that the isotropic group of $u$
is independent of $u$, i.e. $G_{u(x)} = K$ for all
$x \in \mathcal{X}$. In this case there exists for any pair $x$ and
$x'$ of elements of $\mathcal{X}$ a one-to-one mapping of $Gx$ onto
$Gx'$ and we say that $\mathcal{X}$ has orbits of constant type.
Furthermore, there exists a unique element
$z(x) \in G/K = \{gK \mid g \in G\}$, where
$gK = \{gk \mid k \in K\}$, such that $x = gu(x)$ for $g \in z(x)$. In
this case the mapping $x \mapsto (z, u)$ is also termed an orbital
decomposition.
Example 2.5. The orthogonal group
$O(n) = \{U \in GL(n) \mid UU^* = I_n\}$ acts as a subgroup of $GL(n)$
linearly on $\mathbb{R}^n \setminus \{0\}$. The orbits are the
collection of spheres $\{S(r) \mid r > 0\}$, where
$S(r) = \{x \in \mathbb{R}^n \mid \|x\| = r\}$. If we choose the orbit
representatives $u$ as $u(x) = (0, \ldots, 0, \|x\|)^* \in \mathbb{R}^n$,
it is clear that the isotropic group $K$ of $u(x)$ is the set of
matrices
$$\begin{pmatrix} U & 0 \\ 0 & 1 \end{pmatrix}, \qquad U \in O(n-1).$$
This shows that $\mathbb{R}^n \setminus \{0\}$ has constant orbit
type. □
Example 2.6. Let $SO^{\uparrow}(1,q)$ denote the connected component
of the pseudo-orthogonal group $O(1,q)$ which contains the identity
element $e$. Here $O(1,q)$ is the group of linear transformations of
$\mathbb{R}^{q+1}$ leaving the symmetric form
$x_1^2 - x_2^2 - \cdots - x_{q+1}^2$ invariant, and
$SO^{\uparrow}(1,q)$ constitutes a subgroup of $O(1,q)$. If we consider
the $(q+1) \times (q+1)$ matrix
$$I_{1,q} = \begin{pmatrix} 1 & 0 \\ 0 & -I_q \end{pmatrix}$$
then, as will be discussed in example 3.2, $SO^{\uparrow}(1,q)$ has the
matrix representation
$$\{A \in GL(q+1) \mid \det A = 1,\ A I_{1,q} A^* = I_{1,q}\}.$$
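As a subgroup of $GL(q+1)$, $SO^{\uparrow}(1,q)$ acts linearly on
$\mathbb{R}^{q+1}$ and the orbit of
$u_0^* = (1, 0, \ldots, 0) \in \mathbb{R}^{q+1}$ is the unit
hyperboloid
$$\{x \in \mathbb{R}^{q+1} \mid x_1 > 0,\ x * x = 1\}$$
where the scalar product $*$ is defined by
$x * y = x_1 y_1 - x_2 y_2 - \cdots - x_{q+1} y_{q+1}$. The set
$\{0\}$ constitutes an orbit of a particular type. For $q > 1$ and
$x \in \mathbb{R}^{q+1} \setminus \{0\}$ we obtain different orbit
types according to whether $x * x > 0$, $x * x = 0$ or $x * x < 0$. If
$x * x \geq 0$ we define the 'hyperbolic length' as
$r(x) = \sqrt{x * x}$. We are in particular interested in the open
subset
$$\mathcal{X} = \{x \in \mathbb{R}^{q+1} \mid x * x > 0,\ x_1 > 0\}.$$
If $u(x) = r(x) u_0 \in \mathcal{X}$, $x \in \mathcal{X}$, then
$A \in G_{u(x)}$ if and only if
$$A = \begin{pmatrix} 1 & 0 \\ 0 & U \end{pmatrix}$$
where $U$ is an element of $GL(q)$ such that $UU^* = I_q$ and
$\det U = 1$.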
(We use $\det$ to indicate the determinant of a matrix and
$|\cdot|$ to indicate the numerical value of the determinant of a
matrix $A$, i.e. $|A|$ is short for $|\det A|$.) The set of such
matrices $U$ constitutes the special orthogonal group $SO(q)$, cf.
example 3.2. Thus, for $x \in \mathcal{X}$ we have that $G_{u(x)}$ is
essentially equal to $SO(q)$ and we write $G_{u(x)} \cong SO(q)$. Since
$G_{u(x)}$ does not depend on $x$ we have that $\mathcal{X}$ is of
constant orbit type.
The above-mentioned action on the orbit corresponding to $r(x) = 1$
has been studied in a statistical context by Jensen (1981), cf. also
example 8.7. □
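The invariance of the hyperbolic length can be illustrated for $q = 2$. The following minimal sketch uses a hyperbolic rotation (boost) in the first two coordinates and an ordinary rotation in the last two; both matrices and all names are choices made for this example, assuming the symmetric form $x_1^2 - x_2^2 - x_3^2$.

```python
import math

def minkowski(x, y):
    """Scalar product x*y = x_1 y_1 - x_2 y_2 - ... - x_{q+1} y_{q+1}."""
    return x[0] * y[0] - sum(a * b for a, b in zip(x[1:], y[1:]))

def mat_vec(A, x):
    """Ordinary matrix-vector product Ax."""
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]

t, phi = 0.7, 1.1
# Hyperbolic rotation mixing x_1 and x_2.
boost = [[math.cosh(t), math.sinh(t), 0.0],
         [math.sinh(t), math.cosh(t), 0.0],
         [0.0, 0.0, 1.0]]
# Block matrix diag(1, U) with U a rotation in SO(2).
rot = [[1.0, 0.0, 0.0],
       [0.0, math.cos(phi), -math.sin(phi)],
       [0.0, math.sin(phi), math.cos(phi)]]

x = [3.0, 1.0, -2.0]              # x*x = 9 - 1 - 4 = 4 and x_1 is positive
r = math.sqrt(minkowski(x, x))    # hyperbolic length of x
y = mat_vec(boost, x)
u0 = [1.0, 0.0, 0.0]

# The boost preserves x*x and keeps the first coordinate positive.
assert math.isclose(minkowski(y, y), minkowski(x, x))
assert y[0] > 0
# diag(1, U) leaves u0, and hence r(x) u0, fixed.
assert mat_vec(rot, u0) == u0
```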
Example 2.7. Consider the vector space $gl(p,n)$ of $p \times n$
matrices. The group $G = O(p) \times O(n)$ acts on $gl(p,n)$ by
$$G \times gl(p,n) \to gl(p,n), \qquad ((U, V), X) \mapsto UXV^*.$$
Suppose $p \leq n$ and let $\lambda_i(X) \geq 0$, $i = 1, \ldots, p$,
where $\lambda_1^2(X) \geq \cdots \geq \lambda_p^2(X)$ are the ordered
eigenvalues of $XX^*$. Then
$$u(X) = (\mathrm{diag}(\lambda_1(X), \ldots, \lambda_p(X)),\ 0)$$
is an orbit representative. The orbit type of $X$ is determined by the
set of equalities in the relation
$\lambda_1(X) \geq \cdots \geq \lambda_p(X) \geq 0$.
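In particular, the orbits of
$\mathcal{X} = \{X \mid \lambda_1(X) > \cdots > \lambda_p(X) > 0\}$ are
of the same type since we may infer that if $X \in \mathcal{X}$ then
$(U, V) \in G_{u(X)}$ if and only if
$$V = \begin{pmatrix} U & 0 \\ 0 & W \end{pmatrix}$$
where $W \in O(n-p)$ and $U = \mathrm{diag}(c_1, \ldots, c_p)$,
$c_i \in \{\pm 1\}$, $i = 1, \ldots, p$. It follows that
$G_{u(X)} \cong K$ where $K$ may be thought of as the group
$\{\pm 1\}^p \times O(n-p)$.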
Among the remaining orbits, we find one of particular interest,
namely
$$St(p,n) = \{X \in gl(p,n) \mid \lambda_1(X) = \cdots = \lambda_p(X) = 1\} = \{X \in gl(p,n) \mid XX^* = I_p\}.$$
This is known as the Stiefel manifold and plays a fundamental role in
orientation statistics, see e.g. Downs (1972) and Khatri and Mardia
(1977). Cf. also example 8.8. □
Example 2.8. Consider the set $gl(p,n)_t$ of $p \times n$ matrices of
rank $p$, $p \leq n$. This is an open subset of $gl(p,n)$. The group
$GL(p)$ acts on $gl(p,n)_t$ by
$$GL(p) \times gl(p,n)_t \to gl(p,n)_t, \qquad (A, X) \mapsto AX.$$
The rows of $X$ span a $p$-dimensional subspace of $\mathbb{R}^n$. Any
other set of $p$ vectors spanning the same subspace is given by the
rows of $AX$ for some $A \in GL(p)$. It follows that the set of
$p$-dimensional subspaces of $\mathbb{R}^n$ can be identified with
$GL(p) \backslash gl(p,n)_t$. This is known as the Grassmann manifold
and is denoted by $G(p,n)$. The Grassmann manifold and similar objects
play a central role in the development of stereological procedures,
see Santaló (1979). For instance it is of interest to define a uniform
distribution on $G(p,n)$. It seems natural to seek a distribution
which is invariant under rotations of the subspaces, since this
corresponds to choosing a subspace which is oriented at random. This
leads to considering the action of $O(n)$ on $G(p,n)$ given by
$$O(n) \times G(p,n) \to G(p,n), \qquad (U, GL(p)X) \mapsto GL(p)XU^*.$$
This action is transitive, which means - as we shall see in example
4.6 - that the uniform (invariant) distribution is uniquely
determined. □
A subset $H$ of a topological group $G$ is a (topological) subgroup
of $G$ if $H$ is a subgroup of the group $G$ and a closed subset of the
topological space $G$. Notice that if $H$ is a subgroup of a
topological group $G$ and $H$ is an open subset of $G$ then $H$ is a
closed subset of $G$.
Let $G$ be a topological group and $H$ a (topological) subgroup of
$G$. Then $H$ acts on $G$ in two ways.
The left action $\delta_H$ of $H$ on $G$ is defined by
$$\delta_H: H \times G \to G, \qquad (g_0, g) \mapsto g_0 g.$$
The homeomorphism $\delta_H(g_0)$ is called left translation by $g_0$
and is often denoted by $L_{g_0}$. It is clear that this action is
free, and the orbits are the right cosets $Hg$, $g \in G$. If
$u: G \to G$ is an orbit representative with respect to this action,
then letting $U = u(G)$ we have that
$$G = HU \tag{2.1}$$
i.e. every $g \in G$ has a unique representation as $g = hu$, where
$h \in H$ and $u \in U$. We call (2.1) a right factorization of $G$
with respect to $H$.
Similarly the right action $\varepsilon_H$ of $H$ on $G$ is given by
$$\varepsilon_H: H \times G \to G, \qquad (g_0, g) \mapsto g g_0^{-1}$$
and the right translation by $g_0$ is also denoted by $R_{g_0}$. The
orbits under this action are the left cosets $gH$, $g \in G$. If
$v: G \to G$ is an orbit representative of $\varepsilon_H$ and
$V = v(G)$ then
$$G = VH \tag{2.2}$$
is a left factorization of $G$ with respect to $H$.
We denote the collections of right and left cosets respectively by
$H \backslash G$ and $G/H$, i.e.
$$H \backslash G = \{Hg \mid g \in G\}, \qquad G/H = \{gH \mid g \in G\}.$$
Since $H$ is closed, $G$ acts transitively on both $G/H$ and
$H \backslash G$ by
$$G \times G/H \to G/H, \qquad (g, g'H) \mapsto gg'H \tag{2.3}$$
and
$$G \times H \backslash G \to H \backslash G, \qquad (g, Hg') \mapsto Hg'g^{-1}. \tag{2.4}$$
We must require that $H$ is closed to ensure the continuity of (2.3)
and (2.4). We shall refer to (2.3) and (2.4) as the natural actions of
$G$ on $G/H$ and $H \backslash G$.
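Cosets, factorizations and the natural action can be made concrete in a finite group, where the topological requirements are trivially satisfied. The sketch below is an illustration only: it takes $G$ to be the symmetric group on three letters and $H$ the two-element subgroup generated by a transposition, both chosen for the example, and checks that a set $V$ of left coset representatives gives a unique factorization $g = vh$.

```python
from itertools import permutations

def mul(g, h):
    """Composition of permutations: (g h)(i) = g(h(i))."""
    return tuple(g[i] for i in h)

G = list(permutations(range(3)))
e = (0, 1, 2)
H = [e, (1, 0, 2)]            # subgroup generated by one transposition

left_cosets = {frozenset(mul(g, h) for h in H) for g in G}    # gH
right_cosets = {frozenset(mul(h, g) for h in H) for g in G}   # Hg

# One representative v per left coset gives G = VH with unique g = v h.
V = [min(c) for c in left_cosets]
products = [mul(v, h) for v in V for h in H]
assert sorted(products) == sorted(G)

def act(g, coset):
    """Natural action of G on G/H: (g, g'H) -> g g'H."""
    return frozenset(mul(g, x) for x in coset)

# The natural action on G/H is transitive.
assert all(any(act(g, c1) == c2 for g in G)
           for c1 in left_cosets for c2 in left_cosets)
```

In this example the left and right cosets differ as sets, since the chosen $H$ is not a normal subgroup.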
Similarly, in the case where $G$ is a topological group acting
transitively on a topological space $\mathcal{X}$ and $H$ is the
isotropic group of some $x_0 \in \mathcal{X}$, we will be interested in
a left factorization of $G$ with respect to $H$, because $\mathcal{X}$
is in one-to-one correspondence with $G/H$ by the mapping
$gH \mapsto gx_0$. Several examples illustrating the construction of
such factorizations are given in the next chapter.
We will now list a set of topological regularity conditions which
become relevant in connection with invariant measures.
A topological space $\mathcal{X}$ is said to be locally compact if
(i) for all $x \in \mathcal{X}$, there exists an open neighbourhood
$U$ of $x$ such that the closure of $U$ is compact
(ii) for all pairs $x, y \in \mathcal{X}$, $x \neq y$, there exist
open neighbourhoods $U$ of $x$ and $V$ of $y$ such that $U$ and $V$
are disjoint (Hausdorff condition).
We will call $\mathcal{X}$ an LCD-space if $\mathcal{X}$ is a locally
compact topological space with a denumerable base of open sets. Quite
importantly, the latter condition allows us to identify the concepts of
Radon measures and regular abstract measures on $\mathcal{X}$ (cf., for
instance, Andersson, 1978).
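Suppose that $G$ acts on $\mathcal{X}$ and that $G$ and $\mathcal{X}$
are LCD-spaces. Below we present a condition which ensures that
$G \backslash \mathcal{X}$ is an LCD-space.
Consider the mapping
$$\Gamma: G \times \mathcal{X} \to \mathcal{X} \times \mathcal{X}, \qquad (g, x) \mapsto (gx, x) \tag{2.5}$$
and let
$$\mathcal{V} = \Gamma(G \times \mathcal{X}) = \{(gx, x) \mid g \in G,\ x \in \mathcal{X}\}. \tag{2.6}$$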
We then define $(G, \mathcal{X})$ to be a standard transformation
group if
(i) $G$ and $\mathcal{X}$ are LCD-spaces
(ii) $\mathcal{V}$ is a closed subset of
$\mathcal{X} \times \mathcal{X}$
(iii) every compact subset of $\mathcal{V}$ is the image under
$\Gamma$ of a compact subset of $G \times \mathcal{X}$.
With this definition we have
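Theorem 2.1. Suppose $(G, \mathcal{X})$ is a standard transformation
group. Then
(i) $G \backslash \mathcal{X}$ is an LCD-space
(ii) the mapping
$$G/G_x \to Gx, \qquad gG_x \mapsto gx$$
is a homeomorphism for all $x \in \mathcal{X}$
(iii) the orbit $Gx$ is closed in $\mathcal{X}$ for all
$x \in \mathcal{X}$. □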
A subclass of the standard transformation groups is obtained by
requiring that the action of $G$ on $\mathcal{X}$ is proper, which
means that the mapping $\Gamma$ defined by (2.5) is proper. In general
a mapping $f$ from one topological space to some other topological
space is said to be proper provided that $f$ is continuous and the
inverse image under $f$ of every compact set is compact. In case of a
proper action it follows that theorem 2.1 is true and in addition, it
can be shown that
(iv) the isotropic group $G_x = \{g \in G \mid gx = x\}$ is compact,
for all $x \in \mathcal{X}$.
Another theorem, which is a useful corollary of theorem 2.1, is
given by
Theorem 2.2. Let $G$ be a topological group and $H$ a closed subgroup
of $G$. If $G$ is an LCD-space then the left [right] coset space
$G/H$ [$H \backslash G$] is an LCD-space. □
The following lemma concerns the question of existence of a
measurable orbital decomposition.
Lemma 2.1. Let $(G, \mathcal{X})$ be a standard transformation group.
Then there exists a mapping
$$(z, u): \mathcal{X} \to G \times \mathcal{X}$$
so that $x = z(x)u(x)$, where $u(gx) = u(x)$, $g \in G$, and so that
$(z, u)$ is Borel-measurable. □
In the case where $\mathcal{X}$ has constant orbit type, we may
improve the lemma as follows.
Lemma 2.2. Let $(G, \mathcal{X})$ be a standard transformation group
and suppose that $\mathcal{X}$ has constant orbit type. Then there
exists a Borel measurable orbital decomposition. □
It is worth stressing that all of the considerations carried out in
this section also hold for the opposite group $G^{o}$ of $G$, which is
defined to be a topological copy of $G$ endowed with the multiplication
rule
$$g_1 \circ g_2 = g_2 g_1. \tag{2.7}$$
If $G$ acts on $\mathcal{X}$ then $G^{o}$ acts on $\mathcal{X}$ by the
prescription
$$\gamma_{G^{o}}(g): x \mapsto \gamma_G(g^{-1})(x). \tag{2.8}$$
Let $I: G \to G^{o}$ denote the identity map. If $H$ is a subgroup of
$G$, then $H^{o} = I(H)$ is a subgroup of $G^{o}$, and
$H^{o} \backslash G^{o}$ is homeomorphic to $G/H$, showing that we can
restrict our discussion to left coset spaces. Also, in connection with
the theory of invariant measures, to be discussed in the next section,
the concept of the opposite group turns out to be a valuable tool.
Bibliographical notes
An extensive and general treatment of groups and actions is found in
Bourbaki (1970), chapter 1. Husain (1966) gives a most readable
introduction to topological groups, which also covers results on
locally compact groups. The proof of theorem 2.1 is given in Eriksen
(1989) and is based on results extracted from Bourbaki (1960), chapter
3. Lemma 2.1 is essentially a reformulation of Jespersen (1985),
theorem 2.2. The theorem by Jespersen is based on a major result of
Effros (1965) concerning the existence of a measurable orbit
representative. Lemma 2.2 is obtained by a slight modification of the
proof of lemma 2.1. The details may be found in Eriksen (1989). The
question of existence of a measurable orbital decomposition has also
been considered by Wijsman (1967, 1986) in some special cases, with the
purpose of describing the distribution of maximal invariant statistics.
3. Matrix Lie groups
This section contains a brief introduction to matrix Lie groups. The
focus is on developing tools for the construction of factorizations of
a group with respect to some subgroup. In this connection the Lie
algebra and the exponential map are the central concepts. It may be
noted that all the groups occurring in the examples considered in these
notes are matrix Lie groups.
Throughout the rest of these notes we often consider an
$m$-dimensional differentiable manifold $M$ and we then use the
following notation. A chart around $p \in M$ is a pair $(U, \varphi)$
consisting of an open neighbourhood $U$ around $p$ and a
diffeomorphism $\varphi: U \to \mathbb{R}^m$. Letting $\psi$ denote the
inverse mapping of $\varphi$, i.e. $\psi = \varphi^{-1}$, and setting
$V = \varphi(U) \subseteq \mathbb{R}^m$ we refer to the pair
$(V, \psi)$, or sometimes just to $\psi$, as a (local) parametrization
of $U$.
A group $G$ is said to be an $m$-dimensional Lie group if
(i) $G$ is an $m$-dimensional differentiable manifold
(ii) the mapping
$$G \times G \to G, \qquad (g_1, g_2) \mapsto g_1 g_2^{-1}$$
is smooth.
Condition (ii) is equivalent to smoothness of the mappings
$g \mapsto g^{-1}$ and $(g_1, g_2) \mapsto g_1 g_2$.
Example 3.1. $GL(k)$ - the group of invertible $k \times k$ matrices -
is a Lie group:
$GL(k)$ is an open subset of $\mathbb{R}^{k^2}$, and may thus be
considered as a $k^2$-dimensional differentiable manifold with the same
differentiable structure as $\mathbb{R}^{k^2}$.
Multiplication and inversion are infinitely often differentiable
mappings, i.e. they are smooth. □
A subset $H$ of a Lie group $G$ is called a Lie subgroup if
(i) $H$ is a submanifold of $G$
(ii) $H$ is a subgroup of $G$
(iii) $H$ is a topological group.
Note that $H$ need not be a topological subgroup. However, we have
the following theorem.
Theorem 3.1. Let $G$ be a Lie group and $H$ a closed subgroup of $G$.
Then $H$ is uniquely structured as a topological Lie subgroup of
$G$. □
Example 3.2. Invariance subgroups. Let $\Sigma \in GL(k)$ and let
$$G = \{g \in GL(k) \mid g \Sigma g^* = \Sigma\}.$$
It is easy to verify that $G$ is a group and, since the mapping
$$\varphi: GL(k) \to GL(k), \qquad g \mapsto g \Sigma g^*$$
is continuous, it follows that $G = \varphi^{-1}(\{\Sigma\})$ is
closed, i.e. $G$ is a topological Lie subgroup of $GL(k)$.
In particular, this applies to the case where $k = p + q$ and
$$\Sigma = I_{p,q} = \begin{pmatrix} I_p & 0 \\ 0 & -I_q \end{pmatrix},$$
the group thus determined being the pseudo-orthogonal group of order
$(p,q)$ which is denoted by $O(p,q)$, i.e.
$$O(p,q) = \{A \in GL(p+q) \mid A I_{p,q} A^* = I_{p,q}\}.$$
It is clear that if $A \in O(p,q)$ then $\det(A) \in \{\pm 1\}$. This
gives rise to the definition of the special pseudo-orthogonal group of
order $(p,q)$ as
$$SO(p,q) = \{A \in GL(p+q) \mid A I_{p,q} A^* = I_{p,q},\ \det(A) = 1\}.$$
It can be shown that the connected component of $SO(p,q)$ containing
the identity element is
$$SO^{\uparrow}(p,q) = \{A \in GL(p+q) \mid A I_{p,q} A^* = I_{p,q},\ \det(A) = 1,\ \det(A_{11}) > 0\}$$
where $A_{11} \in GL(p)$ is determined by
$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}.$$
In case $q = 0$ the groups $O(p) = O(p,0)$, $SO(p) = SO(p,0)$ are
simply the orthogonal group and the special orthogonal group,
respectively. □
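The defining conditions of the pseudo-orthogonal groups are easy to test numerically. The following minimal sketch takes $p = q = 1$, assuming $I_{1,1} = \mathrm{diag}(1, -1)$; the two test matrices (a hyperbolic rotation and minus the identity) and the helper names are choices made for this illustration.

```python
import math

def mat_mul(A, B):
    """Product of two square matrices given as nested lists."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def transpose(A):
    return [list(col) for col in zip(*A)]

def preserves_form(A, J):
    """Check A J A* = J entrywise, up to rounding error."""
    lhs = mat_mul(mat_mul(A, J), transpose(A))
    return all(math.isclose(a, b, abs_tol=1e-12)
               for ra, rb in zip(lhs, J) for a, b in zip(ra, rb))

def det2(A):
    return A[0][0] * A[1][1] - A[0][1] * A[1][0]

I11 = [[1.0, 0.0], [0.0, -1.0]]
t = 0.4
boost = [[math.cosh(t), math.sinh(t)],
         [math.sinh(t), math.cosh(t)]]
minus_identity = [[-1.0, 0.0], [0.0, -1.0]]

# Both matrices lie in SO(1,1): they preserve the form and have det 1.
assert preserves_form(boost, I11) and math.isclose(det2(boost), 1.0)
assert preserves_form(minus_identity, I11) and det2(minus_identity) == 1.0
# Only the boost has a positive upper-left block A_11.
assert boost[0][0] > 0
assert not minus_identity[0][0] > 0
```

Under the characterization above, the boost therefore belongs to the identity component while minus the identity does not.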
A finite-dimensional vector space $L$ over $\mathbb{R}$ is called a (real) Lie algebra if there exists a rule of composition $(A,B) \to [A,B]$ in $L$ satisfying
(i) $[\alpha A+\beta B,C] = \alpha[A,C] + \beta[B,C]$, $\alpha,\beta \in \mathbb{R}$
(ii) $[A,B] = -[B,A]$
(iii) $[A,[B,C]] + [B,[C,A]] + [C,[A,B]] = 0$.
Condition (iii) is known as the Jacobi identity, and $[A,B]$ is called the Lie product of $A$ and $B$. A subspace $N$ of $L$ is a (Lie) subalgebra if $A,B \in N \Rightarrow [A,B] \in N$. For $0 \neq A \in L$ and $\alpha,\beta \in \mathbb{R}$ one has by (i) and (ii)
$$[\alpha A,\beta A] = \alpha\beta[A,A] = -\alpha\beta[A,A]$$
so that
$$[\alpha A,\beta A] = 0.$$
This shows that every one-dimensional subspace of $L$ is a trivial subalgebra.
Consider the vector space $gl(k)$ of real $k \times k$ matrices and the commutator product
$$gl(k) \times gl(k) \to gl(k), \qquad (A,B) \to AB - BA.$$
This is a Lie product making $gl(k)$ into a Lie algebra.
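The three Lie algebra axioms can be checked numerically for the commutator product. The following is a minimal sketch, assuming matrices are stored as nested Python lists; the helper names (`bracket`, `jacobi_sum`) and the test matrices are illustrative choices and do not come from the text.

```python
# Sketch: the commutator [A, B] = AB - BA on gl(k), with integer matrices
# stored as lists of rows (an illustrative representation, not from the text).

def matmul(a, b):
    """Product of two square matrices given as lists of rows."""
    n = len(a)
    return [[sum(a[i][r] * b[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]

def add(a, b):
    """Entrywise sum of two matrices."""
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def neg(a):
    """Entrywise negative of a matrix."""
    return [[-x for x in row] for row in a]

def bracket(a, b):
    """Lie product [A, B] = AB - BA."""
    return add(matmul(a, b), neg(matmul(b, a)))

def jacobi_sum(a, b, c):
    """[A,[B,C]] + [B,[C,A]] + [C,[A,B]]; the zero matrix by the Jacobi identity."""
    return add(add(bracket(a, bracket(b, c)), bracket(b, bracket(c, a))),
               bracket(c, bracket(a, b)))

A = [[1, 2], [0, 3]]
B = [[0, 1], [4, 5]]
C = [[2, 0], [1, 1]]
ZERO = [[0, 0], [0, 0]]
```

With integer entries the arithmetic is exact, so `jacobi_sum(A, B, C)` can be compared with `ZERO` directly, and `bracket(A, A)` illustrates why one-dimensional subspaces are trivial subalgebras.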
The exponential map is the mapping defined by
$$\exp: gl(k) \to GL(k), \qquad A \to \exp(A),$$
where
$$\exp(A) = \sum_{n=0}^{\infty} \frac{A^n}{n!}.$$
Let $A,B \in gl(k)$. Simple calculations show that
$$\exp(A)^* = \exp(A^*)$$
and
$$\exp(A+B) = \exp(A)\exp(B) \quad \text{if } AB = BA.$$
In particular, for $B = -A$
$$\exp(A)\exp(-A) = \exp(0) = I;$$
thus $\exp(A)$ belongs to $GL(k)$ and
$$\exp(A)^{-1} = \exp(-A).$$
For $S \in GL(k)$ and $A \in gl(k)$ it is seen that
$$S\exp(A)S^{-1} = \exp(SAS^{-1}).$$
These relations are referred to below as (3.1)-(3.6).
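The properties of the exponential map listed above lend themselves to a numerical check. The sketch below assumes a truncated power series is accurate enough for small matrices; the function `expm`, the truncation length and the test matrices are assumptions made for illustration only.

```python
# Sketch: matrix exponential by truncated power series, used to check the
# stated properties numerically. All names here are illustrative.

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][r] * b[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]

def scale(a, c):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def expm(a, terms=40):
    """Sum of A^n / n! for n < terms (adequate for matrices of moderate norm)."""
    result = identity(len(a))
    power = identity(len(a))          # holds A^j / j!
    for j in range(1, terms):
        power = scale(matmul(power, a), 1.0 / j)
        result = add(result, power)
    return result

def max_diff(a, b):
    """Largest absolute entrywise difference between two matrices."""
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

A = [[0.2, -1.0], [0.5, 0.3]]
B = add(scale(A, 2.0), identity(2))   # B = 2A + I commutes with A
S = [[1.0, 1.0], [0.0, 1.0]]
S_INV = [[1.0, -1.0], [0.0, 1.0]]     # inverse of S
```

For instance, `max_diff(matmul(expm(A), expm(scale(A, -1.0))), identity(2))` should be at rounding-error level, in line with $\exp(A)^{-1} = \exp(-A)$.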
Turning back to the groups, define a group to be connected, if it is
not the union of two non-empty and disjoint open sets.
The following theorem characterizes the Lie algebra of a Lie subgroup of $GL(k)$.
Theorem 3.2. Let $H$ be an $m$-dimensional Lie subgroup of $GL(k)$. Then there exists an $m$-dimensional subalgebra $\mathfrak{h}$ of $gl(k)$ such that the following holds.
Let $A \in gl(k)$, $A \neq 0$. Then we have
$$H_A = \{\exp(As) \mid s \in \mathbb{R}\} \text{ is a (connected) one-dimensional Lie subgroup of } H \iff A \in \mathfrak{h}.$$
Every connected one-dimensional Lie subgroup of $H$ has the form $H_A$ for some $A \in \mathfrak{h}$, and $\mathfrak{h}$ is called the Lie algebra of $H$. Conversely, if $\mathfrak{h}$ is a subalgebra of $gl(k)$, then there exists a Lie subgroup of $GL(k)$ having $\mathfrak{h}$ as its Lie algebra. □
It is seen from (3.4) that $H_A$ is always a connected one-dimensional Lie subgroup of $GL(k)$ and that for $\lambda \neq 0$ we have $H_A = H_{\lambda A}$.
The space $gl(k)$ is, in fact, the Lie algebra of $GL(k)$.
Remark. Different subgroups can have the same Lie algebra. This is illustrated by taking a subgroup $H$ which is not connected and letting $H_e$ be the connected component of $H$ containing the identity. Then $H_e$ is a subgroup and the connected one-dimensional subgroups of $H$ and $H_e$ must coincide, i.e. the algebras coincide. This is exemplified by $GL(k)$, where the identity component is $GL_+(k)$, which consists of the matrices with positive determinant. □
Example 3.3. The Lie algebra of $O(p,q)$. Let $I_{p,q} \in GL(p+q)$ be the matrix
$$I_{p,q} = \begin{pmatrix} I_p & 0 \\ 0 & -I_q \end{pmatrix}$$
and consider the pseudo-orthogonal group of order $(p,q)$
$$O(p,q) = \{B \in GL(p+q) \mid B I_{p,q} B^* = I_{p,q}\}.$$
This is a Lie group and for $A \in gl(p+q)$ we get
$$\exp(As)\, I_{p,q} \exp(As)^* = I_{p,q}, \quad s \in \mathbb{R}$$
$$\iff I_{p,q} \exp(As)^* I_{p,q}^{-1} = \exp(-As), \quad s \in \mathbb{R}$$
$$\iff \exp(I_{p,q} A^* I_{p,q}^{-1} s) = \exp(-As), \quad s \in \mathbb{R}$$
so that the Lie algebra consists of the matrices $o(p,q) = \{A \in gl(p+q) \mid I_{p,q} A = -A^* I_{p,q}\}$. If $q = 0$ then $o(p) = o(p,0)$ is the set of skew-symmetric matrices. □
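As an illustration of this example, one can pick a matrix satisfying $I_{p,q}A = -A^*I_{p,q}$ and confirm numerically that $\exp(As)$ preserves the form. The sketch below takes $p = 2$, $q = 1$; the particular matrix `A`, the value $s = 0.7$ and all function names are assumptions chosen for the demonstration.

```python
# Sketch: a member of o(2,1) exponentiates into O(2,1). Helper names are illustrative.

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][r] * b[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]

def scale(a, c):
    return [[c * x for x in row] for row in a]

def add(a, b):
    return [[x + y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]

def transpose(a):
    return [list(col) for col in zip(*a)]

def max_diff(a, b):
    return max(abs(x - y) for ra, rb in zip(a, b) for x, y in zip(ra, rb))

def expm(a, terms=40):
    """Truncated power series for exp(A)."""
    result, power = identity(len(a)), identity(len(a))
    for j in range(1, terms):
        power = scale(matmul(power, a), 1.0 / j)
        result = add(result, power)
    return result

def form_matrix(p, q):
    """I_{p,q}: diagonal matrix with p entries +1 followed by q entries -1."""
    n = p + q
    return [[(1.0 if i < p else -1.0) if i == j else 0.0 for j in range(n)]
            for i in range(n)]

def in_lie_algebra(a, p, q, tol=1e-12):
    """Check the condition I_{p,q} A = -A* I_{p,q}."""
    i_pq = form_matrix(p, q)
    return max_diff(matmul(i_pq, a), scale(matmul(transpose(a), i_pq), -1.0)) < tol

def preserves_form(b, p, q, tol=1e-9):
    """Check the group condition B I_{p,q} B* = I_{p,q}."""
    i_pq = form_matrix(p, q)
    return max_diff(matmul(matmul(b, i_pq), transpose(b)), i_pq) < tol

# Skew-symmetric in the first two coordinates, symmetric in the mixed entries.
A = [[0.0, 0.3, -1.2],
     [-0.3, 0.0, 0.5],
     [-1.2, 0.5, 0.0]]
```

By contrast, the identity matrix does not satisfy the algebra condition, and `expm(identity(3))` fails `preserves_form`.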
The Lie algebra of an $m$-dimensional subgroup $H$ of $GL(k)$ can often be obtained easily in the way just illustrated. Alternatively, let $(U,\varphi)$ be a chart containing $I$, i.e. $U$ is an open subset of $H$ containing the identity element and $\varphi$ maps $U$ diffeomorphically onto an open subset $V$ of $\mathbb{R}^m$. Let $\psi = \varphi^{-1}$, the inverse mapping of $\varphi$, and let $v_0 = \varphi(I)$. Then the vectors $\partial\psi/\partial v_i\,|_{v=v_0}$, $i = 1,\dots,m$, form a basis of the Lie algebra, i.e. the Lie algebra is in fact the tangent space of $H$ at $I$.
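Example 3.4. Let
$$T_{+1}(k) = \{T = \{t_{ij}\} \in GL(k) \mid t_{ii} = 1,\ t_{ij} = 0,\ i > j\}$$
denote the group of $k \times k$ upper triangular matrices with ones in the diagonal.
Then the mapping
$$\psi: \mathbb{R}^{k(k-1)/2} \to T_{+1}(k), \qquad (t_{ij})_{i<j} \to \{t_{ij}\}$$
is such that $\varphi = \psi^{-1}$ determines a chart with $\psi(0) = I$, and the partial derivative of $\psi$ with respect to $t_{ij}$, $i < j$, is the matrix with a one in position $(i,j)$ and zeroes elsewhere.
Thus the Lie algebra $t_{+1}(k)$ of $T_{+1}(k)$ consists of the $k \times k$ upper triangular matrices with zeroes in the diagonal. □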
Next we discuss the question of factorization of Lie groups. Let $G$ and $H$ be closed subgroups of $GL(k)$ with Lie algebras $\mathfrak{g}$ and $\mathfrak{h}$. Suppose $H$ is a subgroup of $G$, implying that $\mathfrak{h}$ is a subalgebra of $\mathfrak{g}$. As mentioned in section 2 we are often interested in factorizations of $G$ with respect to $H$. In the following we illustrate a construction technique, which is applicable in most cases. Let $\mathfrak{f}$ denote a complement to $\mathfrak{h}$ in $\mathfrak{g}$, i.e. $\mathfrak{g}$ can be written as the direct sum
$$\mathfrak{g} = \mathfrak{f} \oplus \mathfrak{h}.$$
If $K = \exp(\mathfrak{f})$ then we often have that $G = HK = KH$. One could also hope that the mapping
$$\mathfrak{f} \times H \to G, \qquad (a,h) \to \exp(a)h$$
was a diffeomorphism, but in general the exponential map is not one-to-one. A simple example is provided by
$$\exp\begin{pmatrix} 0 & \theta \\ -\theta & 0 \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix}, \quad \theta \in \mathbb{R}.$$
However, in most cases we can find an open subset $u$ of $\mathfrak{f}$ so that the mapping
$$u \times H \to G, \qquad (a,h) \to \exp(a)h$$
is differentiable, one-to-one and onto $G$, except maybe for a closed null set (a submanifold of dimension less than the dimension of $G$).
The technique may be generalized as follows.
Let
$$\mathfrak{f} = \mathfrak{f}_1 \oplus \dots \oplus \mathfrak{f}_q$$
and define
$$\psi: \mathfrak{f}_1 \oplus \dots \oplus \mathfrak{f}_q \to G, \qquad (a_1,\dots,a_q) \to \exp(a_1)\cdots\exp(a_q).$$
We might then seek for open subsets $u_i \subset \mathfrak{f}_i$, $i = 1,\dots,q$, so that the mapping
$$u_1 \oplus \dots \oplus u_q \times H \to G, \qquad (a_1,\dots,a_q,h) \to \psi(a_1,\dots,a_q)h$$
is differentiable, one-to-one and (almost) onto $G$.
We now turn to some examples.
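Example 3.5. A factorization of $SO(p)$. The special orthogonal group $SO(p) = \{U \in GL(p) \mid \det(U) = 1,\ UU^* = I_p\}$ acts linearly and transitively on the unit sphere $S^{p-1} = \{x \in \mathbb{R}^p \mid \|x\| = 1\}$ by the law
$$SO(p) \times S^{p-1} \to S^{p-1}, \qquad (U,x) \to Ux.$$
Let $x_0^* = (0,\dots,0,1) \in \mathbb{R}^p$. The isotropic group $K$ of $x_0$ consists of the matrices
$$\begin{pmatrix} U & 0 \\ 0 & 1 \end{pmatrix}, \quad U \in SO(p-1),$$
i.e. $K$ is isomorphic to $SO(p-1)$, and we will, with a slight abuse of notation, write $SO(p-1)$ for $K$. The Lie algebra of $SO(p)$ is the set $so(p)$ of $p \times p$ skew-symmetric matrices. It is seen that
$$so(p) = so(p-1) \oplus \mathfrak{f}$$
where
$$\mathfrak{f} = \left\{ \begin{pmatrix} 0 & a \\ -a^* & 0 \end{pmatrix} : a \in \mathbb{R}^{p-1} \right\}.$$
Let
$$F_i = \begin{pmatrix} 0 & e_i \\ -e_i^* & 0 \end{pmatrix}, \quad i = 1,\dots,p-1,$$
where $e_i$ denotes the $i$-th unit vector of $\mathbb{R}^{p-1}$, and define
$$R: \Theta \to SO(p), \qquad (\theta_1,\dots,\theta_{p-1}) \to \exp(\theta_{p-1}F_{p-1})\cdots\exp(\theta_1 F_1)$$
where $\Theta = \{\theta \in \mathbb{R}^{p-1} \mid -\pi/2 < \theta_i < \pi/2,\ i = 1,\dots,p-2,\ \theta_{p-1} \in (-\pi/2,\pi/2) \cup (\pi/2,3\pi/2)\}$. It is easy to see that
$$\exp(\theta_i F_i) = \begin{pmatrix} I_{i-1} & 0 & 0 & 0 \\ 0 & \cos\theta_i & 0 & \sin\theta_i \\ 0 & 0 & I_{p-1-i} & 0 \\ 0 & -\sin\theta_i & 0 & \cos\theta_i \end{pmatrix},$$
$i = 1,\dots,p-2$. This means that $\exp(\theta_i F_i)$ is a rotation in the plane spanned by the $i$'th and the last coordinate vector.
It follows that
$$\psi_1: \Theta \to S^{p-1}, \qquad \theta \to R(\theta)x_0$$
is a diffeomorphism of $\Theta$ onto $S^{p-1} \setminus \{(x_i) \in \mathbb{R}^p \mid x_p = 0\}$, and that $\psi_1^{-1}(x)$ is the set of polar coordinates of $x$. Furthermore, the mapping
$$R: \Theta \times SO(p-1) \to SO(p), \qquad (\theta,U) \to R(\theta)U$$
is differentiable, one-to-one and 'almost' onto $SO(p)$. □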
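The construction in this example can be traced numerically for $p = 3$. The sketch below builds the generators $F_i$ as matrices with $+1$ in position $(i,p)$ and $-1$ in position $(p,i)$, which is how the block form above is read here; that reading, the function names and the angles are assumptions for illustration.

```python
import math

# Sketch for p = 3: exp(theta F_i) as a plane rotation and R(theta) x_0 on the sphere.

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][r] * b[r][j] for r in range(n)) for j in range(n)]
            for i in range(n)]

def expm(a, terms=40):
    """Truncated power series for exp(A)."""
    n = len(a)
    result, power = identity(n), identity(n)
    for j in range(1, terms):
        power = [[x / j for x in row] for row in matmul(power, a)]
        result = [[x + y for x, y in zip(r, s)] for r, s in zip(result, power)]
    return result

def generator(i, p, theta=1.0):
    """theta * F_i, with 1-based index i: +theta at (i, p), -theta at (p, i)."""
    f = [[0.0] * p for _ in range(p)]
    f[i - 1][p - 1] = theta
    f[p - 1][i - 1] = -theta
    return f

def rotation(i, p, theta):
    """Closed form of exp(theta F_i): rotation in the (i, p) coordinate plane."""
    r = identity(p)
    r[i - 1][i - 1] = r[p - 1][p - 1] = math.cos(theta)
    r[i - 1][p - 1] = math.sin(theta)
    r[p - 1][i - 1] = -math.sin(theta)
    return r

def big_r(thetas):
    """R(theta) = exp(theta_{p-1} F_{p-1}) ... exp(theta_1 F_1)."""
    p = len(thetas) + 1
    result = identity(p)
    for i, theta in enumerate(thetas, start=1):
        result = matmul(rotation(i, p, theta), result)
    return result

def apply(m, x):
    return [sum(a * b for a, b in zip(row, x)) for row in m]

thetas = [0.4, -1.1]
x0 = [0.0, 0.0, 1.0]
x = apply(big_r(thetas), x0)   # point of S^2 with polar coordinates thetas
```

Here `x` works out to $(\sin\theta_1,\ \cos\theta_1\sin\theta_2,\ \cos\theta_1\cos\theta_2)$, which makes the polar-coordinate interpretation of $\psi_1^{-1}$ concrete.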
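Example 3.6. Factorizations of $SO^{\uparrow}(1,q)$. The linear action of the group $SO^{\uparrow}(1,q)$ on $\mathbb{R}^{q+1}$ has been considered in example 2.6. The unit hyperboloid $H^q$ was characterized as $SO^{\uparrow}(1,q)/SO(q)$, i.e. we are interested in a left factorization of $SO^{\uparrow}(1,q)$ with respect to $SO(q)$. The Lie algebra of $SO^{\uparrow}(1,q)$ is given by (cf. example 3.3)
$$so(1,q) = \left\{ \begin{pmatrix} 0 & a^* \\ a & U \end{pmatrix} : a \in \mathbb{R}^q,\ U \in so(q) \right\}.$$
It follows that
$$so(1,q) = \mathfrak{f}_1 \oplus so(q)$$
where
$$\mathfrak{f}_1 = \left\{ \begin{pmatrix} 0 & a^* \\ a & 0 \end{pmatrix} : a \in \mathbb{R}^q \right\}.$$
The exponential map
$$\mathfrak{f}_1 \to SO^{\uparrow}(1,q), \qquad a \to \exp(a)$$
can be shown to be a diffeomorphism onto the set of boosts, i.e. the elements in $SO^{\uparrow}(1,q)$ which are of the form
$$B = \begin{pmatrix} x_1 & \tilde{x}^* \\ \tilde{x} & \tilde{B} \end{pmatrix}, \quad \tilde{x} = (x_2,\dots,x_{q+1})^*,$$
where
$$\tilde{B} = \begin{pmatrix} 1 + \dfrac{x_2^2}{1+x_1} & \dfrac{x_2 x_3}{1+x_1} & \cdots \\ \vdots & \ddots & \vdots \\ \cdots & \cdots & 1 + \dfrac{x_{q+1}^2}{1+x_1} \end{pmatrix}.$$
If $B(q)$ denotes the set of boosts, then it can be shown that $SO^{\uparrow}(1,q)$ has both a left and a right factorization as $SO^{\uparrow}(1,q) = B(q)SO(q) = SO(q)B(q)$.
In relativity theory the group $O(1,3)$ is known as the Lorentz group and the elements of $B(3)$ are also called pure Lorentz transformations. Our interest in $SO^{\uparrow}(1,q)$ and the above factorizations derives from their importance for the hyperboloid exponential model, see Jensen (1981) and example 8.7.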
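An alternative factorization is due to the so-called Iwasawa decomposition of $so(1,q)$, which is given by
$$so(1,q) = \mathfrak{f}_2 \oplus so(q)$$
where $\mathfrak{f}_2$, consisting of the matrices $z(\mu,t)$, $(\mu,t) \in \mathbb{R}^q$, is a Lie subalgebra of $so(1,q)$. If $A(\mu) = \exp(z(\mu,0))$ and $C(t) = \exp(z(0,t))$ then
$$A(\mu) = \begin{pmatrix} \cosh\mu & \sinh\mu & 0 \\ \sinh\mu & \cosh\mu & 0 \\ 0 & 0 & I_{q-1} \end{pmatrix}.$$
Let $P(\mu,t) = A(\mu)C(t)$ and $P(q) = \{P(\mu,t) \mid (\mu,t) \in \mathbb{R}^q\}$. Then
$$SO^{\uparrow}(1,q) = P(q)SO(q)$$
determines a left factorization and
$$P(\mu,t)u_0 = \begin{pmatrix} \cosh\mu + \tfrac{1}{2}e^{\mu}\|t\|^2 \\ \sinh\mu + \tfrac{1}{2}e^{\mu}\|t\|^2 \\ t \end{pmatrix}$$
parametrizes $H^q$. The group $P(q)$ acts transitively and freely on $H^q$. □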
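Both factorizations of this example can be probed numerically. The sketch below assumes the standard form of a boost, with first row and column given by a point $x$ of the unit hyperboloid and remaining block $I + \tilde{x}\tilde{x}^*/(1+x_1)$, and it reads the Iwasawa parametrization as $(\cosh\mu + \tfrac12 e^{\mu}\|t\|^2,\ \sinh\mu + \tfrac12 e^{\mu}\|t\|^2,\ t)$. Both readings, and all function names, are assumptions made for illustration.

```python
import math

# Sketch: boosts preserve the form x*y = x_1 y_1 - x_2 y_2 - ... and the
# Iwasawa-type parametrization lands on the unit hyperboloid.

def minkowski(x, y):
    """Indefinite product x_1 y_1 - x_2 y_2 - ... - x_{q+1} y_{q+1}."""
    return x[0] * y[0] - sum(a * b for a, b in zip(x[1:], y[1:]))

def boost(x):
    """Symmetric (q+1) x (q+1) boost whose first column is the hyperboloid point x."""
    x1, rest = x[0], x[1:]
    q = len(rest)
    b = [[0.0] * (q + 1) for _ in range(q + 1)]
    b[0][0] = x1
    for i in range(q):
        b[0][i + 1] = b[i + 1][0] = rest[i]
        for j in range(q):
            b[i + 1][j + 1] = (1.0 if i == j else 0.0) + rest[i] * rest[j] / (1.0 + x1)
    return b

def iwasawa_point(mu, t):
    """Point (cosh mu + e^mu |t|^2 / 2, sinh mu + e^mu |t|^2 / 2, t)."""
    half = 0.5 * math.exp(mu) * sum(v * v for v in t)
    return [math.cosh(mu) + half, math.sinh(mu) + half] + list(t)

rest = [0.7, -1.5, 2.0]
x = [math.sqrt(1.0 + sum(v * v for v in rest))] + rest   # point of H^3
B = boost(x)
point = iwasawa_point(0.8, [1.3, -0.4])                  # also a point of H^3
```

The rows of `B` are pseudo-orthonormal with respect to `minkowski`, which is the condition $B I_{1,q} B^* = I_{1,q}$ written out entrywise.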
Example 3.7. A factorization of $SO^{\uparrow}(p,q)$. The notation introduced in example 3.5 and example 3.6 will be used without reference. We will now make a generalization of these examples.
Consider the $(p+q) \times (p+q)$ matrix
$$I_{p,q} = \begin{pmatrix} I_p & 0 \\ 0 & -I_q \end{pmatrix}, \quad p \geq 2,\ q \geq 1.$$
This defines a scalar product $*$ on $\mathbb{R}^{p+q}$ by $x * y = x^* I_{p,q}\, y$, $x, y \in \mathbb{R}^{p+q}$.
Let $SO^{\uparrow}(p,q)$ denote the connected component of the pseudo-orthogonal group of order $(p,q)$, i.e. the group $O(p,q)$ of linear transformations of $\mathbb{R}^{p+q}$ leaving the symmetric form $x * x$ invariant. Then $SO^{\uparrow}(p,q)$ has the matrix representation (cf. example 3.2)
$$SO^{\uparrow}(p,q) = \{A \in GL(p+q) \mid \det A = 1,\ \det A_{11} > 0,\ A I_{p,q} A^* = I_{p,q}\}$$
where $A_{11} \in GL(p)$ is determined by
$$A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}.$$
The action of $SO^{\uparrow}(p,q)$ on $\mathbb{R}^{p+q}$ is defined by
$$SO^{\uparrow}(p,q) \times \mathbb{R}^{p+q} \to \mathbb{R}^{p+q}, \qquad (A,x) \to Ax.$$
Now let $x_0^* = (0,\dots,0,1) \in \mathbb{R}^p$. Then the orbit of $u_0$ is the generalized hyperboloid $H^{p,q}$ and the isotropic group of $u_0$ is isomorphic to $SO^{\uparrow}(p-1,q)$, so we wish to make a left factorization of $SO^{\uparrow}(p,q)$ with respect to $SO^{\uparrow}(p-1,q)$.
With a chance of confusion we define, in accordance with the previous examples,
$$R(\theta) = \begin{pmatrix} R(\theta) & 0 \\ 0 & I_q \end{pmatrix}, \quad \theta \in \Theta,$$
and
$$P(\mu,t) = \begin{pmatrix} I_{p-1} & 0 \\ 0 & P(\mu,t) \end{pmatrix}.$$
If $Q(\theta,\mu,t) = R(\theta)P(\mu,t)$ and $Q(p-1,q) = \{Q(\theta,\mu,t) \mid \theta \in \Theta,\ (\mu,t) \in \mathbb{R}^q\}$, then
$$SO^{\uparrow}(p,q) = Q(p-1,q)\,SO^{\uparrow}(p-1,q)$$
except for a closed null set. Moreover, $Q(\theta,\mu,t)u_0$ parametrizes the generalized hyperboloid except for a null set. □
Bibliographical notes
The theory of Lie groups is treated in many mathematical textbooks on differential geometry, see for instance Cohn (1957) and Helgason (1978). In particular, semisimple Lie groups form a very well described class of Lie groups, but in a statistical context it does not seem natural to impose semisimplicity. Our considerations on factorization are more or less of our own making, but it may be noted that the existence of an Iwasawa decomposition follows from semisimplicity, see e.g. Helgason (1978) or Barut and Raczka (1980).
4. Invariant, relatively invariant, and quasi-invariant measures
In this section we discuss existence and uniqueness of invariant, relatively invariant and quasi-invariant measures on a space $\mathcal{X}$ with an acting group $G$. In particular, the left and right invariant measures on $G$ itself are considered, and several basic formulas relating these are derived. Various disintegration formulas are also presented.
In the sequel it is assumed that $(G,\mathcal{X})$ is a standard transformation group, as defined on p. 12.
Let $\mathcal{K}(\mathcal{X})$ denote the real-valued continuous functions on $\mathcal{X}$ with compact support. A Radon measure on $\mathcal{X}$ is a linear functional $\mu: \mathcal{K}(\mathcal{X}) \to \mathbb{R}$ with the property that $\mu(f) \geq 0$ for $f \geq 0$. The set of such measures will be denoted by $\mathcal{M}(\mathcal{X})$. The measures on $\mathcal{X}$ which are traditionally used in statistics are regular (abstract) measures, i.e. mappings $\eta$ defined on $\mathcal{B}$, the $\sigma$-ring generated by the compact sets in $\mathcal{X}$, satisfying (i) $\eta(B) \geq 0$, (ii) $\eta(A \cup B) = \eta(A) + \eta(B)$ for $A \cap B = \emptyset$, (iii) $\lim_{n\to\infty} \eta(A_n) = \eta(\bigcup_{n=1}^{\infty} A_n)$ for $A_1 \subseteq \dots \subseteq A_n \subseteq \dots$ and (iv) $\eta(B) = \sup\{\eta(K) \mid K \subseteq B,\ K \text{ compact}\}$. Under the present topological conditions there exists a one-to-one correspondence between Radon measures and regular measures given by
$$\mu(f) = \int_{\mathcal{X}} f(x)\,d\eta(x), \quad f \in \mathcal{K}(\mathcal{X}),$$
where the right-hand side is an ordinary integral with respect to the abstract measure $\eta$ (cf., for instance, Andersson, 1978). As in the classical abstract measure theory, a Radon measure $\mu$ can be extended to a larger class of functions called the $\mu$-integrable functions. If $B \in \mathcal{B}$ then the indicator function $1_B$ is $\mu$-integrable and the regular abstract measure $\eta$ is simply determined by
$$\eta(B) = \mu(1_B).$$
Since the two kinds of measures coincide under the topological regularity conditions we adopt, we do not distinguish between them in the subsequent discussions and we use the common notation
$$\mu(f) = \int_{\mathcal{X}} f(x)\,d\mu(x), \quad f \in \mathcal{L}(\mu),$$
where $\mathcal{L}(\mu)$ is the vector space of $\mu$-integrable functions.
If $g$ is a proper one-to-one transformation of $\mathcal{X}$ onto $\mathcal{X}$ we let $g\mu$ denote the measure $\mu$ lifted by $g$, i.e.
$$g\mu(f) = \mu(f \circ g)$$
where $\circ$ signifies composition of mappings. Further, suppose $\mu$ is absolutely continuous with respect to a measure $\nu$ and let $h$ denote the corresponding density (or Radon-Nikodym derivative), i.e.
$$d\mu(x) = h(x)\,d\nu(x).$$
Then $g\mu$ is absolutely continuous with respect to $g\nu$ and we have the important formula for transformation of the density
$$d(g\mu)(x) = h(g^{-1}x)\,d(g\nu)(x).$$
A measure $\mu$ on $\mathcal{X}$ is said to be invariant relative to the group $G$ acting on $\mathcal{X}$ if
$$g\mu = \mu, \quad g \in G.$$
Here, for short, we write $g\mu$ for $\gamma(g)\mu$, the measure $\mu$ lifted by $\gamma(g)$. For the construction of invariant measures, as discussed in the next section, it is convenient to introduce the more general concepts of relatively invariant measures and quasi-invariant measures.
Let $\chi$ be a continuous mapping from $G$ into the positive reals $\mathbb{R}_+^*$ such that
$$\chi(gg') = \chi(g)\chi(g'), \quad g,g' \in G,$$
i.e. $\chi$ is a group homomorphism. A mapping of this kind is called a multiplier on $G$, and a measure $\mu$ on $\mathcal{X}$ is said to be relatively invariant with multiplier $\chi$ if
$$g^{-1}\mu = \chi(g)\mu, \quad g \in G, \qquad (4.1)$$
or, equivalently, in terms of differentials
$$d(g^{-1}\mu)(x) = \chi(g)\,d\mu(x).$$
We shall often denote such a measure by $\mu_\chi$. Note that an invariant measure is a relatively invariant measure with multiplier $\chi = 1$. On the other hand, if $\mu \neq 0$ is a measure fulfilling (4.1) for some function $\chi$, then $\chi$ is a multiplier in relation to which $\mu$ is relatively invariant.
For relatively invariant measures one has the following existence and uniqueness theorem.
Theorem 4.1. Suppose $G$ acts transitively and properly on $\mathcal{X}$. Then for every multiplier $\chi$ on $G$ there exists one and, up to multiplication by a positive constant, only one relatively invariant measure $\mu_\chi$ on $\mathcal{X}$ with multiplier $\chi$. □
Let $\chi$ be a multiplier and let $m$ be a positive continuous function on $\mathcal{X}$ which satisfies
$$m(gx) = \chi(g)m(x), \quad g \in G,\ x \in \mathcal{X}.$$
Then $m$ is called a modulator with (associated) multiplier $\chi$. When it is important to make the dependence of $m$ on $\chi$ explicit we write $m_\chi$ for $m$.
Concerning the existence of modulators, we have the following theorems.
Theorem 4.2. If $G$ acts properly on $\mathcal{X}$, then to every multiplier $\chi$ there exists a modulator $m$ having $\chi$ as its associated multiplier. □
A subgroup $K$ of $G$ is said to be regular if $gKg^{-1} \subseteq K$, $g \in G$, implies that $gKg^{-1} = K$. Furthermore, if $(G,\mathcal{X})$ has constant orbit type $G/K$ with $K$ regular we say that $(G,\mathcal{X})$ has regular orbit type.
Theorem 4.3. Suppose that $(G,\mathcal{X})$ has regular orbit type, where the orbits are homeomorphic to $G/K$.
Let $\chi$ be any multiplier fulfilling
$$\chi(k) = 1, \quad k \in K.$$
Then there exists a modulator with $\chi$ as its associated multiplier. □
The importance of the concept of modulator lies in the fact that by means of modulators it is possible to construct any relatively invariant measure, in particular an invariant measure, on $\mathcal{X}$ from any other relatively invariant measure on $\mathcal{X}$. Specifically, suppose $\mu_{\chi_0}$ is relatively invariant with multiplier $\chi_0$ and that we wish to find a measure $\mu_\chi$ which is relatively invariant with some other multiplier $\chi$. Let $m_{\chi\chi_0^{-1}}$ be a modulator with associated multiplier $\chi\chi_0^{-1}$ (where $(\chi\chi_0^{-1})(g) = \chi(g)\chi_0(g)^{-1}$). Then, as is simple to check, the measure $\mu_\chi$ given by
$$\mu_\chi = m_{\chi\chi_0^{-1}}\,\mu_{\chi_0} \qquad (4.2)$$
is, in fact, relatively invariant with multiplier $\chi$.
Example 4.1. $GA_+(1)$-invariant measure. Let the setup be as in example 2.3, which is concerned with the location-scale group. Let $\lambda$ be the restriction to $\mathcal{X} = \{x \in \mathbb{R}^n \mid s(x) > 0\}$ of Lebesgue measure on $\mathbb{R}^n$. Evidently, $\lambda$ is relatively invariant with multiplier $\chi([a,f]) = a^n$. Furthermore, it is easily verified that $m(x) = s(x)^n$ is a modulator with $\chi$ as associated multiplier. Hence
$$d\mu(x) = s(x)^{-n}\,d\lambda(x)$$
is an invariant measure on $\mathcal{X}$. □
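The modulator property $m(gx) = \chi(g)m(x)$ in this example is easy to confirm numerically. The sketch below assumes $s(x)$ is the root of the summed squared deviations from the mean; the exact definition of $s$ in example 2.3 is not reproduced here, and any scale statistic with $s(ax+f) = a\,s(x)$ would serve equally. Function names and sample values are illustrative.

```python
import math

# Sketch: location-scale action on R^n, the multiplier a^n and the modulator s(x)^n.

def scale_statistic(x):
    """s(x): root of the summed squared deviations from the mean (an assumed choice)."""
    mean = sum(x) / len(x)
    return math.sqrt(sum((v - mean) ** 2 for v in x))

def act(a, f, x):
    """Action of [a, f] on x: every coordinate is mapped to a * x_i + f."""
    return [a * v + f for v in x]

def multiplier(a, n):
    """chi([a, f]) = a^n, the multiplier of Lebesgue measure on R^n."""
    return a ** n

def modulator(x):
    """m(x) = s(x)^n."""
    return scale_statistic(x) ** len(x)

def invariant_density(x):
    """Density s(x)^{-n} of the invariant measure relative to Lebesgue measure."""
    return 1.0 / modulator(x)

x = [0.3, 2.1, -1.4, 0.9]
a, f = 2.5, -7.0
gx = act(a, f, x)
```

Invariance of $s(x)^{-n}\,d\lambda(x)$ then amounts to `invariant_density(gx) * multiplier(a, len(x))` agreeing with `invariant_density(x)`, since the Jacobian of the action is $a^n$.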
Example 4.2. CCn,Hl-invariant measure. In the setup of example 2.4,
let $\lambda$ be the restriction to $\mathcal{X} = \{x \in \mathbb{R}^n \mid s(x) \text{ is positive definite}\}$ of Lebesgue measure on $\mathbb{R}^n$. It is clear that $\lambda$ is relatively invariant with multiplier $\chi(A) = |A|$. Furthermore, $m(x) = |s(x)|^{1/2}$ is easily shown to be a modulator with $\chi$ as associated multiplier. It follows that
$$d\mu(x) = |s(x)|^{-1/2}\,d\lambda(x)$$
is an invariant measure on $\mathcal{X}$. □
By the above mentioned existence and uniqueness result for invariant measures there exist measures $\alpha$ and $\beta$ on $G$ which are invariant, respectively, under left action $\delta$ and right action $\varepsilon$ of $G$ on itself, and these measures are termed left invariant and right invariant, respectively. (Alternatively, $\alpha$ and $\beta$ are called left and right Haar measure.) Below we show that there exists a multiplier $\Delta$, called the modular function or the module of $G$, such that
$$\alpha = \Delta\beta \qquad (4.3)$$
(with suitable choice of the arbitrary multiplicative constants for $\alpha$ and $\beta$). When we wish to make the dependence of $\alpha$, $\beta$ or $\Delta$ on $G$ explicit we use the notations $\alpha_G$, $\beta_G$ and $\Delta_G$. It follows that $\alpha$ is relatively invariant under $\varepsilon$ with multiplier $\Delta^{-1}$ and that $\beta$ is relatively invariant under $\delta$ with multiplier $\Delta^{-1}$. We also note, and later prove, the important formula
$$\int f(g^{-1})\,d\alpha = \int f(g)\,d\beta, \qquad (4.4)$$
and the relations
$$\alpha_\chi = \chi\alpha, \qquad (4.5)$$
$$\beta_{\chi^{-1}} = \chi\beta \qquad (4.6)$$
and
$$\alpha_\chi = \beta_{(\chi\Delta)^{-1}}, \qquad (4.7)$$
where for instance (4.7) means that a relatively right invariant measure with multiplier $(\chi\Delta)^{-1}$ is relatively left invariant with multiplier $\chi$.
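Proofs of formulas (4.3)-(4.7). Let $\delta(g)$ and $\varepsilon(g^{-1})$ denote left and right translations by $g$, respectively. Obviously, $\delta \circ \varepsilon = \varepsilon \circ \delta$, i.e. left and right translations commute.
To prove (4.3), first note that $\varepsilon(g)\alpha$ is left invariant, as appears from the calculations
$$\delta(g')(\varepsilon(g)\alpha)(f) = (\varepsilon(g)\alpha)(f \circ \delta(g')) = \alpha(f \circ \delta(g') \circ \varepsilon(g)) = \alpha(f \circ \varepsilon(g) \circ \delta(g'))$$
$$= (\delta(g')\alpha)(f \circ \varepsilon(g)) = \alpha(f \circ \varepsilon(g)) = \varepsilon(g)\alpha(f).$$
Applying theorem 4.1 and the remark preceding it we find that there exists a multiplier $\Delta$ such that
$$\varepsilon(g)\alpha = \Delta(g)\alpha,$$
i.e. $\alpha$ is relatively right invariant with multiplier $\chi = \Delta^{-1}$. Letting $\chi_0 = 1$, formula (4.3) follows from (4.2) on noticing that $m(g) = \Delta(g)$ is a modulator corresponding to the multiplier $\Delta^{-1} = \chi\chi_0^{-1}$; in fact
$$m(\varepsilon(g)g') = m(g'g^{-1}) = \Delta(g)^{-1}m(g').$$
To prove (4.5) we observe that
$$d(\delta(g^{-1})(\chi\alpha))(\bar{g}) = \chi(g\bar{g})\,d(\delta(g^{-1})\alpha)(\bar{g}) = \chi(g)\chi(\bar{g})\,d\alpha(\bar{g}) = \chi(g)\,d(\chi\alpha)(\bar{g}).$$
Formula (4.6), which is proved similarly, implies that
$$\beta_{(\chi\Delta)^{-1}} = \chi\Delta\beta = \chi\alpha$$
and this proves (4.7). Finally, for $f \in \mathcal{K}(G)$ let $\check{f}$ be defined by
$$\check{f}(g) = f(g^{-1})$$
and let $\check{\alpha}(f) = \alpha(\check{f})$. Using
$$(f \circ \varepsilon(g))^{\vee} = \check{f} \circ \delta(g)$$
it follows that
$$\varepsilon(g)\check{\alpha}(f) = \check{\alpha}(f \circ \varepsilon(g)) = \alpha(\check{f} \circ \delta(g)) = \delta(g)\alpha(\check{f}) = \alpha(\check{f}) = \check{\alpha}(f)$$
showing that $\check{\alpha}$ is right invariant. Consequently $\check{\alpha}(f) = \beta(f)$, which is equivalent to (4.4). □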
The group $G$ is said to be unimodular if $\Delta(g) = 1$, $g \in G$. If a group is compact or commutative it is unimodular. Subgroups of unimodular groups are not in general unimodular (cf. example 4.3 below). A general method for calculating modular functions will be given in section 6, see formula (6.8).
Example 4.3. Triangular group. The triangular group $T_+(n)$ is the group of $n \times n$ upper triangular matrices with positive diagonal elements. It is a subgroup of the general linear group $GL(n)$, and the latter is unimodular as shown in example 6.1. However, the module of $T_+(n)$ is, as will be proved in example 6.2,
$$\Delta(T) = \prod_{i=1}^{n} t_{ii}^{\,2i-n-1}$$
where $t_{ii}$ denotes the $i$-th diagonal element of $T \in T_+(n)$. Hence $T_+(n)$ is not unimodular. □
More generally, suppose that $g^{-1}\mu$ and $\mu$ are equivalent (i.e. mutually absolutely continuous) for every $g$ in $G$. Then there exists a nonnegative function $\chi$ on $G \times \mathcal{X}$ such that
$$g^{-1}\mu = \chi(g,\cdot)\mu$$
or, written in terms of differentials,
$$d(g^{-1}\mu)(x) = \chi(g,x)\,d\mu(x).$$
Transforming this identity by $g'$ we obtain
$$\chi(g'g,x)\,d\mu(x) = \chi(g',gx)\chi(g,x)\,d\mu(x).$$
This leads to defining a quasi-multiplier to be a positive continuous function $\chi$ on $G \times \mathcal{X}$ with the property that
$$\chi(g'g,x) = \chi(g',gx)\chi(g,x)$$
and $\mu$ is said to be a quasi-invariant measure with quasi-multiplier $\chi$ if
$$d(g^{-1}\mu)(x) = \chi(g,x)\,d\mu(x), \quad g \in G,\ x \in \mathcal{X}.$$
Theorem 4.4. Suppose $G$ acts transitively on $\mathcal{X}$, let $u$ be an element of $\mathcal{X}$ and let $K$ denote the isotropic group at $u$, i.e. $K = G_u$. Then $K$ is closed, $\mathcal{X}$ and $G/K$ are in homeomorphic correspondence, and
(i) There exists a quasi-invariant measure $\mu$ on $\mathcal{X}$ with some quasi-multiplier $\chi$. Any two quasi-invariant measures are equivalent, and any quasi-multiplier determines the corresponding quasi-invariant measure uniquely, up to a multiplicative constant.
Furthermore, letting $\pi$ denote the mapping from $\mathcal{X}$ to $G/K = \{gK : g \in G\}$ which establishes the homeomorphic connection between these two spaces, we have
(ii) A positive continuous function $\chi$ on $G \times \mathcal{X}$ is a quasi-multiplier for some quasi-invariant measure $\mu$ on $\mathcal{X}$ if and only if $\chi$ is of the form
$$\chi(g,x) = \rho(gz)/\rho(z), \quad z \in \pi(x), \qquad (4.8)$$
where $\rho$ is a positive locally integrable function on $G$ satisfying
$$\rho(gk) = \frac{\Delta_K(k)}{\Delta_G(k)}\,\rho(g), \quad k \in K,\ g \in G. \qquad (4.9)$$
(Note that, because of (4.9), the right hand side of (4.8) is the same whichever $z$ in $\pi(x)$ is chosen.)
(iii) A quasi-invariant measure $\mu$ and the associated $\rho$-function (as described in (ii) above) are related by
$$\int_G f(g)\rho(g)\,d\alpha_G(g) = \int_{G/K}\int_K f(gk)\,d\alpha_K(k)\,d\nu(gK) \qquad (4.10)$$
where $\nu = \pi\mu$. □
It is to be noted that if $G$ is a group and if $K$ is an arbitrary closed subgroup of $G$ then the above theorem applies with $\mathcal{X} = G/K$ and $\gamma$ as the natural action of $G$ on $G/K$.
Corollary 4.1. Suppose $G$ acts transitively on $\mathcal{X}$ and let $K$ be an isotropic group of this action. Then there exists a relatively invariant measure $\mu$ on $\mathcal{X}$ with multiplier $\chi$ if and only if
$$\chi(k) = \frac{\Delta_K(k)}{\Delta_G(k)}, \quad k \in K. \qquad (4.11)$$
This measure is unique up to a multiplicative constant and $\nu = \pi\mu$ (defined in theorem 4.4) satisfies
$$\int_G f(g)\chi(g)\,d\alpha_G(g) = \int_{G/K}\int_K f(gk)\,d\alpha_K(k)\,d\nu(gK).$$
□
If the action of $G$ on $\mathcal{X}$ is proper then $K$ is compact (cf. (iv) of the remark after theorem 2.1) and hence $\Delta_G(k) = \Delta_K(k) = 1$, $k \in K$, since the only compact subgroup of $(\mathbb{R}_+^*,\cdot)$ is $\{1\}$. It follows, in this case, that to every multiplier $\chi$ on $G$ there exists a unique (except for a constant) relatively invariant measure on $\mathcal{X}$ with $\chi$ as the associated multiplier.
Corollary 4.2. Suppose $G$ acts transitively on $\mathcal{X}$ and let $K$ be an isotropic group of this action. Then there exists an invariant measure $\mu$ on $\mathcal{X}$ if and only if
$$\Delta_G(k) = \Delta_K(k), \quad k \in K. \qquad (4.12)$$
This measure is unique up to a multiplicative constant and $\nu = \pi\mu$ satisfies
$$\int_G f(g)\,d\alpha_G(g) = \int_{G/K}\int_K f(gk)\,d\alpha_K(k)\,d\nu(gK).$$
The same relation for the opposite group $G^o$ is seen to be equivalent to
$$\int_G f(g)\,d\beta_G(g) = \int_{K\backslash G}\int_K f(kg)\,d\beta_K(k)\,d\nu^o(Kg).$$
□
Proof. The first decomposition is a simple consequence of (4.10), since we can choose $\rho(g) = 1$. Now, considering the opposite group and applying this decomposition we obtain
$$\int_{G^o} f(g)\,d\alpha_{G^o}(g) = \int_{G^o/K^o}\int_{K^o} f(g \circ k)\,d\alpha_{K^o}(k)\,d\nu^o(g \circ K^o)$$
where $\circ$ denotes multiplication in $G^o$. If $I: G^o \to G$ is the identity mapping, then $I(\alpha_{G^o}) = \beta_G$ and $I(\alpha_{K^o}) = \beta_K$. Since we can identify $G^o/K^o$ and $K\backslash G$, the last assertion easily follows. □
Example 4.4. A generalization of Fubini's theorem. A subgroup $K$ of $G$ is said to be normal if $gkg^{-1} \in K$, $k \in K$, $g \in G$. If $K$ is normal then $G/K = \{gK : g \in G\}$ can be endowed with a group structure by the following prescription of the group operation
$$(gK)(g'K) = gg'K.$$
With this structure $G/K$ is called the quotient group of $G$ and $K$. If $G$ is a Lie group and if $K$ is a closed normal subgroup of $G$ then $G/K$ is a Lie group. Left invariant measure $\alpha_{G/K}$ on $G/K$ is clearly also invariant under the action of $G$ on $G/K$. It follows from corollary 4.2 that $\Delta_G(k) = \Delta_K(k)$, $k \in K$, and that
$$\int_G f(g)\,d\alpha_G(g) = \int_{G/K}\int_K f(gk)\,d\alpha_K(k)\,d\alpha_{G/K}(gK).$$
In case $G = V$ is a vector space and $K = L$ is a subspace, we may identify $V/L$ with a complement $M$ to $L$ in $V$, i.e. $V = L \oplus M$. We then recognize the well-known factorization, due to Fubini,
$$\lambda_V = \lambda_L \otimes \lambda_M$$
where $\lambda$ denotes Lebesgue measure and $\otimes$ indicates product measure. □
Example 4.5. Factorizations of $GL(n)$, and existence of associated invariant measures. The group $GL(n)$ may be written as a product in the following two fashions
$$GL(n) = O(n)T_+(n) = T_+(n)O(n).$$
Both $O(n)$ and $T_+(n)$ are closed subgroups of $GL(n)$, and $GL(n)$ and $O(n)$ are unimodular while $T_+(n)$ is not (cf. examples 6.1, 6.5 and 4.3). Therefore, by corollary 4.2 there exists a $GL(n)$-invariant measure on $GL(n)/O(n) = T_+(n)$ but not on $GL(n)/T_+(n) = O(n)$.
If we consider the action of $GL(n)$ on the space $PD(n)$ of positive definite matrices defined by
$$GL(n) \times PD(n) \to PD(n), \qquad (M,\Sigma) \to M\Sigma M^* \qquad (4.13)$$
then $O(n)$ is the isotropy group of $I \in PD(n)$ and since $GL(n)$ and $O(n)$ are both unimodular it follows from corollary 4.2 that there exists a measure $\nu$ on $PD(n)$ which is invariant under the action (4.13). □
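Suppose that $G$ acts transitively on $\mathcal{X}$ and that the isotropic group $K$ of $x_0 \in \mathcal{X}$ is compact. As noted after corollary 4.1 we then have $\Delta_G(k) = \Delta_K(k) = 1$, $k \in K$. Applying corollary 4.2 we obtain that $\mathcal{X}$ has a $G$-invariant measure $\mu$.
The compactness of $K$ also implies that if $f \in \mathcal{K}(\mathcal{X})$ then
$$\tilde{f} = (g \to f(gx_0)) \in \mathcal{K}(G)$$
and since the measure $\bar{\alpha}$ on $\mathcal{X}$ given by
$$\bar{\alpha}(f) = \alpha_G(\tilde{f})$$
is invariant, it follows from the uniqueness that (except maybe for a constant) $\bar{\alpha} = \mu$, i.e. the invariant measure $\mu$ on $\mathcal{X}$ is given by
$$\mu(f) = \int_G f(gx_0)\,d\alpha_G(g).$$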
Example 4.6. Invariant measures on $St(p,n)$ and $G(p,n)$: existence and uniqueness. The above applies, in particular, to the Stiefel manifold (example 2.7) and the Grassmann manifold (example 2.8). In both cases $G$ is also compact, so that $\alpha_G$ may be chosen as a probability measure, and thereby $\mu$ becomes the unique invariant probability measure on $\mathcal{X}$. □
Example 4.7. $SO^{\uparrow}(1,q)$-invariant measure on $H^q$ and $B(q)$: existence. As discussed in example 3.6, $SO^{\uparrow}(1,q)$ is both right and left factorizable, as
$$SO^{\uparrow}(1,q) = SO(q)B(q) = B(q)SO(q).$$
Consider the action of $SO^{\uparrow}(1,q)$ on the unit hyperboloid
$$H^q = \{x = (x_1,x_2,\dots,x_{q+1})^* \in \mathbb{R}^{q+1} : x * x = 1,\ x_1 > 0\}$$
given in matrix representation by
$$SO^{\uparrow}(1,q) \times H^q \to H^q, \qquad (A,x) \to Ax.$$
The isotropy group at $(1,0,\dots,0) \in H^q$ is $SO(q)$, which is compact. Furthermore, $SO^{\uparrow}(1,q)$ is unimodular (as will be shown in example 6.5). Hence (4.12) is satisfied and there exists an $SO^{\uparrow}(1,q)$-invariant measure on $H^q$ or, equivalently, on the set of boosts $B(q)$. □
Example 4.8. $SO(p,q)$-invariant measure on $H^{p,q}$: existence. Consider (cf. example 3.7) the action of $SO(p,q)$ on the generalized hyperboloid $H^{p,q}$, where the linear and transitive action is defined by
$$SO(p,q) \times H^{p,q} \to H^{p,q}, \qquad (A,x) \to Ax. \qquad (4.14)$$
The isotropic group at $x_0 \in H^{p,q}$ is isomorphic to $SO(p-1,q)$, which is noncompact if $p > 1$. However, $SO(p-1,q)$ is a closed subgroup of $SO(p,q)$, and $SO(p,q)$ is unimodular for arbitrary $p$ and $q$ (cf. example 6.5). It follows that there exists an invariant measure on $H^{p,q}$ under the action (4.14). □
Example 4.9. $GA_+(1)$-invariant measures. The location-scale group $GA_+(1)$ may be identified with the group $G$ of $2 \times 2$ matrices of the form
$$g = \begin{pmatrix} a & f \\ 0 & 1 \end{pmatrix}, \quad a > 0,\ f \in \mathbb{R}.$$
It is easily seen that the left and right invariant measures on $G$ are given by
$$d\alpha(g) = a^{-2}\,da\,df$$
and
$$d\beta(g) = a^{-1}\,da\,df.$$
Consequently, by (4.3),
$$\Delta(g) = a^{-1}$$
showing that $G$ is not unimodular.
Let $K$ be the subgroup of $G$ consisting of elements of the form
$$k = \begin{pmatrix} 1 & f \\ 0 & 1 \end{pmatrix}, \quad f \in \mathbb{R}.$$
Here $K$ is closed, noncompact, and unimodular (because it is commutative). Furthermore, $\Delta_G(k) = 1$ for every $k \in K$, and consequently there exists a $G$-invariant measure on $G/K$.
Letting $x \in \mathbb{R}$ be represented as the two-dimensional vector $(x,1)^*$, the action of $GA_+(1)$ on $\mathbb{R}^n$ with $n = 1$, as defined in example 2.3, i.e.
$$([a,f],x) \to ax+f,$$
becomes
$$\begin{pmatrix} a & f \\ 0 & 1 \end{pmatrix}\begin{pmatrix} x \\ 1 \end{pmatrix} = \begin{pmatrix} ax+f \\ 1 \end{pmatrix}$$
and it is seen that $K$ and $G/K$ may be identified with the groups of location and scale transformations of $\mathbb{R}$, respectively.
Let $K$ now denote the isotropic group $G_{(0,1)^*}$ of this action. The elements of $K$ are of the form
$$k = \begin{pmatrix} a & 0 \\ 0 & 1 \end{pmatrix}, \quad a > 0.$$
Since $K$ is noncompact and unimodular (because it is commutative) we may conclude, by the remark after theorem 2.1, that the action is not proper and, by corollary 4.2, that there exists no invariant measure under the action of $G$ on $\mathbb{R}$. □
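The left and right invariant measures of this example can be checked through the change-of-variables formula: a measure with density $d$ is invariant under a map $T$ when $d(Tg)\,|\det DT(g)| = d(g)$. The sketch below assumes group elements are encoded as pairs `(a, f)` and approximates the Jacobian by central differences; the encoding, the function names and the sample points are illustrative assumptions.

```python
# Sketch: left and right Haar densities on the location-scale group, with
# elements [[a, f], [0, 1]] encoded as tuples (a, f).

def multiply(g, h):
    """Matrix product of [[a, f], [0, 1]] and [[a', f'], [0, 1]] in (a, f) coordinates."""
    return (g[0] * h[0], g[0] * h[1] + g[1])

def left_density(g):
    """Density a^{-2} of the left invariant measure with respect to da df."""
    return g[0] ** -2

def right_density(g):
    """Density a^{-1} of the right invariant measure with respect to da df."""
    return g[0] ** -1

def jacobian_det(transform, g, eps=1e-6):
    """Central-difference Jacobian determinant of a map of the (a, f) half-plane."""
    a, f = g
    da = [(u - v) / (2 * eps)
          for u, v in zip(transform((a + eps, f)), transform((a - eps, f)))]
    df = [(u - v) / (2 * eps)
          for u, v in zip(transform((a, f + eps)), transform((a, f - eps)))]
    return da[0] * df[1] - da[1] * df[0]

def defect(density, transform, g):
    """density(T g) |det DT(g)| - density(g); zero when the measure is T-invariant."""
    return density(transform(g)) * abs(jacobian_det(transform, g)) - density(g)

g0 = (2.0, -3.0)
h = (0.5, 1.25)

def left_translate(g):
    return multiply(g0, g)

def right_translate(g):
    return multiply(g, g0)
```

The ratio `left_density(h) / right_density(h)` equals `1 / h[0]`, in agreement with $\Delta(g) = a^{-1}$ and relation (4.3), while `defect(left_density, right_translate, h)` is clearly nonzero, reflecting that the group is not unimodular.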
Bibliographical notes
The main theorem of this section is theorem 4.4, which is a reproduction of theorems 4.1.1 and 4.3.1 of Barut and Raczka (1980), which however contains no proofs. It may also be extracted from Reiter (1968), chapter 8, which gives a nice treatment of the concepts and provides the necessary proofs. Theorem 4.2 is a consequence of proposition 2.4.7 of Bourbaki (1963), chapter 7, which also contains a general treatment of the contents of this section. Theorem 4.3 is proved in Eriksen (1989).
5. Decomposition and factorization of measures
Suppose a space $\mathcal{X}$ is partitioned into disjoint subsets $\mathcal{X}_\pi$, $\pi \in \Pi$, and let $\mu$ be a measure on $\mathcal{X}$. If for each $\pi \in \Pi$ we have a measure $\rho_\pi$ on $\mathcal{X}_\pi$ and if there is a measure $\kappa$ on $\Pi$ such that
$$\int_{\mathcal{X}} f(x)\,d\mu(x) = \int_{\Pi}\int_{\mathcal{X}_\pi} f(x)\,d\rho_\pi(x)\,d\kappa(\pi) \qquad (5.1)$$
for every integrable function $f$ then $((\rho_\pi)_{\pi\in\Pi},\kappa)$ is said to constitute a decomposition of $\mu$, and we speak of (5.1) as a disintegration formula. We have already in section 4 encountered decompositions and disintegrations, cf. formula (4.10), to which we return below.
Two types of questions arise here:
(a) Given measures $\mu$ and $\kappa$ on $\mathcal{X}$ and $\Pi$, respectively, does there exist a family of measures $(\rho_\pi)_{\pi\in\Pi}$ such that (5.1) holds?
(b) Given measures $\mu$ on $\mathcal{X}$ and, for every $\pi \in \Pi$, $\rho_\pi$ on $\mathcal{X}_\pi$, does there exist a measure $\kappa$ on $\Pi$ such that (5.1) holds?
The answer to (a) is affirmative under very weak conditions (see for instance theorem 5.1 below and theorem 15.3.3 in Kallenberg (1983)), and the real problem lies in constructing or describing the measures $\rho_\pi$. On the other hand, a positive answer to (b) is available under considerable restrictions only. In this case there is, in addition, the problem of constructing or describing $\kappa$.
Next, let $u$ and $s$ be mappings defined on $\mathcal{X}$ and with range spaces $\mathcal{U}$ and $\mathcal{S}$, respectively, and let again $\mu$ denote a measure on $\mathcal{X}$. One may then ask whether there is an associated factorization of the lifted measure $(u,s)\mu$ as
$$(u,s)\mu = \kappa \otimes \rho, \qquad (5.2)$$
where $\kappa$ and $\rho$ are measures on $\mathcal{U}$ and $\mathcal{S}$, and one may seek to describe or construct $\kappa$ or $\rho$, or both.
This factorization problem is closely related to the decomposition problem. Specifically, suppose $u$ is a mapping of $\mathcal{X}$ onto $\Pi$ such that $\mathcal{X}_\pi = \{x : u(x) = \pi\}$ for every $\pi \in \Pi$. If, moreover, the level sets $\mathcal{X}_\pi$ are all (or almost all) of the same mathematical character so that each can be identified with a certain set $\mathcal{S}$, and if $\rho$ is a measure on $\mathcal{S}$ and $\rho_\pi$ the corresponding measure on $\mathcal{X}_\pi$, then (5.1) and (5.2) are essentially the same.
The questions outlined above constitute in a sense the main technical problem of mathematical statistics. We shall be concerned here exclusively with exact and explicit solutions to these questions, when the partition of $\mathcal{X}$ is generated by a group $G$ acting on $\mathcal{X}$, and we shall pay attention to the possibility of characterizing $\kappa$, or $\rho$, or the $\rho_\pi$'s by invariance. Importantly, however, even when exact solutions are not available it is often possible to obtain highly accurate approximate solutions, see Barndorff-Nielsen (1988).
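If $\mathcal{X}$ is a topological space, then $\mathcal{M}(\mathcal{X})$ denotes the collection of measures on $\mathcal{X}$, and for $\mu \in \mathcal{M}(\mathcal{X})$ we let $\mathcal{L}(\mu)$ denote the vector space of $\mu$-integrable functions on $\mathcal{X}$.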
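Suppose in the following that $(G,\mathcal{X})$ is a standard transformation group. A solution of question (a) is then available as
Theorem 5.1. Suppose that $\mu \in \mathcal{M}(\mathcal{X})$ is $\sigma$-finite. Let $\kappa \in \mathcal{M}(G\backslash\mathcal{X})$ be a $\sigma$-finite measure which is equivalent to $\pi\mu$, $\pi$ denoting the orbit projection. Such a measure $\kappa$ exists, even though $\pi\mu$ is not in general $\sigma$-finite. Then there exists a collection of measures $(\rho_\pi)_{\pi \in G\backslash\mathcal{X}}$ such that
i) $\rho_{\pi(x)} \in \mathcal{M}(Gx)$ for $\kappa$-almost every $\pi(x) \in G\backslash\mathcal{X}$
ii) $\pi \to \rho_\pi(f) \in \mathcal{L}(\kappa)$ for every $f \in \mathcal{L}(\mu)$
iii) $\int_{\mathcal{X}} f(x)\,d\mu(x) = \int_{G\backslash\mathcal{X}}\int_{Gx} f(y)\,d\rho_{\pi(x)}(y)\,d\kappa(\pi(x))$ whenever $f \in \mathcal{L}(\mu)$. □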
Combining this theorem with theorem 4.3 we obtain, in generalisation of corollary 4.2,
Corollary 5.1. The measure $\mu$ is quasi-invariant with quasi-multiplier $\chi(g,x)$ if and only if for $\kappa$-almost every $\pi(x) \in G\backslash\mathcal{X}$ the measure $\rho_{\pi(x)}$ is quasi-invariant on $Gx$ with quasi-multiplier $\chi(g,y)$, $(g,y) \in G \times Gx$.
Suppose that $\mu$ is quasi-invariant with quasi-multiplier $\chi$. Then there exists an invariant measure on $\mathcal{X}$, which is equivalent to $\mu$, if and only if
$$\chi(g,x) = 1, \quad g \in G_x, \qquad (5.3)$$
for $\mu$-almost every $x \in \mathcal{X}$.
Finally, if $G_\pi$ is the isotropic group of some $y \in Gx$, with $\pi = \pi(x)$, then (5.3) holds if and only if
$$\Delta_{G_\pi}(k) = \Delta_G(k), \quad k \in G_\pi, \qquad (5.4)$$
for $\kappa$-almost every $\pi \in G\backslash\mathcal{X}$. □
Proof. Suppose that $\mu$ is quasi-invariant with quasi-multiplier
$\chi(g,x)$. It is easy to see that this is equivalent to the statement
that for $\kappa$-almost every $\pi(x) \in G\backslash\mathfrak{X}$ we
have that $\rho_{\pi(x)}$ is quasi-invariant on $Gx$ with quasi-multiplier
$\chi(g,y)$, $(g,y) \in G \times Gx$. Suppose $\lambda$ is invariant and
equivalent to $\mu$. Let $\frac{d\lambda}{d\mu}(x) = h(x) > 0$. Then we
have
$h(x)\,d\mu(x) = h(gx)\chi(g,x)\,d\mu(x)$
so that
$h(x) = h(gx)\chi(g,x)$.
It follows that (5.3) must be satisfied.
On the other hand, the quasi-multiplier $\chi(g,y)$ of $\rho_\pi$
satisfies, according to (4.8) and (4.9),
$\chi(k,y) = \Delta_G(k)/\Delta_{G_y}(k)$, $k \in G_y$.
This shows the equivalence of (5.3) and (5.4). Suppose that (5.3) is
satisfied, and let $(z,u)$ be a decomposition of the type described in
lemma 2.1. If $m(x) = \chi(z(x),u(x))$, then a bit of calculation shows
that $m(gx) = \chi(g,x)m(x)$ and it is now easy to see that
$m(x)^{-1}d\mu(x)$ is invariant. □
The function $m$ is called a modulator with quasi-multiplier $\chi$.
Note that the last lines in the proof of corollary 5.1 contain a
method, under the assumption (5.3), for the construction of a modulator
on the basis of an orbital decomposition and ultimately for the
construction of an invariant measure. This method will be considered in
more detail in section 6.
A more explicit solution to (a) is available in terms of geometric
measures when $\mathfrak{X}$ is a d-dimensional Riemannian manifold with
metric $\phi$. Recall that if $E_{ix}$, $i = 1,\ldots,d$, denotes the
coordinate frame at $x \in \mathfrak{X}$ corresponding to a local
parametrization $(V,\psi)$ then the geometric measure $\sigma$ is given by
$d\sigma(x) = |q(v)|^{1/2}\,d\lambda(v)$, $x = \psi(v)$. (5.5)
Here $q$ is the $d \times d$ matrix with elements
$q_{ij}(v) = \phi_x(E_{ix},E_{jx})$ where $x = \psi(v)$ and $\lambda$
denotes Lebesgue measure on $V$.
In the special case where $\mathfrak{X}$ is a submanifold of
$\mathbb{R}^k$ (endowed with the inherited metric) the modulating factor
in (5.5) may be calculated as
$|q(v)|^{1/2} = J_\psi(v)$. (5.6)
Here $J_\psi$ denotes the (generalized) Jacobian
$|D\psi\,D\psi^*|^{1/2}$, where $D\psi$ is the $d \times k$ matrix
$\partial\psi^*/\partial v$.
Suppose that the sets $\mathfrak{X}_\pi$ are submanifolds of
$\mathfrak{X}$ determined as the level sets of a differentiable mapping
$u$ from $\mathfrak{X}$ to some other Riemannian manifold $\mathfrak{U}$.
Letting $\mu$ and $\kappa$ be the geometric measures on $\mathfrak{X}$ and
$\mathfrak{U}$, respectively, and writing $\mathfrak{X}_u$ instead of
$\mathfrak{X}_\pi$ we have that the decomposition of $\mu$ is determined by
$\rho_u = |Du\,Du^*|^{-1/2}\,\sigma^u$. (5.7)
Here $\sigma^u$ denotes geometric measure on $\mathfrak{X}_u$ and $Du$ is
the differential of the mapping $u$. In principle the determinant in
(5.7) should be calculated by representing $Du$ as a matrix corresponding
to (arbitrary) local orthonormal (relative to the Riemannian metric)
bases for $\mathfrak{X}$ and $\mathfrak{U}$; see, however, exercise 19.
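The generalized Jacobian in (5.6) can be checked numerically in a simple case. The following is a minimal sketch, assuming the usual spherical-coordinate parametrization of the unit sphere in three dimensions (an illustration chosen here, not an example from the text); the helper names are hypothetical. The Gram determinant is formed from the matrix of partial derivatives, which agrees with $|D\psi\,D\psi^*|$ up to the transpose convention.

```python
import math

def numerical_derivative(psi, v, h=1e-6):
    """Return the k x d matrix D with D[i][j] = d psi_i / d v_j (central differences)."""
    d = len(v)
    columns = []
    for j in range(d):
        plus, minus = list(v), list(v)
        plus[j] += h
        minus[j] -= h
        fp, fm = psi(plus), psi(minus)
        columns.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    k = len(columns[0])
    return [[columns[j][i] for j in range(d)] for i in range(k)]

def generalized_jacobian(psi, v):
    """Square root of the 2 x 2 Gram determinant of the derivative matrix (d = 2 parameters)."""
    D = numerical_derivative(psi, v)
    g11 = sum(row[0] * row[0] for row in D)
    g12 = sum(row[0] * row[1] for row in D)
    g22 = sum(row[1] * row[1] for row in D)
    return math.sqrt(g11 * g22 - g12 * g12)

def sphere(v):
    """Assumed parametrization of the unit sphere: v = (theta, phi)."""
    theta, phi = v
    return [math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta)]

J = generalized_jacobian(sphere, [1.0, 0.5])
print(J, math.sin(1.0))  # the two values should agree closely
```

For this parametrization the modulating factor reduces to $\sin\theta$, so the geometric measure is the familiar surface element $\sin\theta\,d\theta\,d\varphi$.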
Next, we present a situation where a solution of (b) is feasible.
Suppose $G$ acts properly on $\mathfrak{X}$. Denoting the level sets of
the orbit projection $\pi$ by $\mathfrak{X}_\pi$,
$\pi \in G\backslash\mathfrak{X}$, we may for each
$\pi \in G\backslash\mathfrak{X}$ define a measure
$\rho_\pi \in \mathcal{M}(\mathfrak{X}_\pi)$ by the prescription
$\rho_\pi(f) = \int_G f(gx)\,d\beta(g)$, $x \in \mathfrak{X}_\pi$, (5.8)
where $\beta$ is the right invariant measure on $G$ and where the right
hand side is, in fact, the same whichever $x \in \mathfrak{X}_\pi$ is
considered.
Theorem 5.2. Suppose $G$ acts properly on $\mathfrak{X}$, define
measures $\rho_\pi$ by (5.8) and let $\mu$ be a measure in
$\mathcal{M}(\mathfrak{X})$ such that
$g\mu = \Delta(g)\mu$, $g \in G$, (5.9)
i.e. $\mu$ is relatively invariant with multiplier $\Delta^{-1}$. Then
there exists a measure $\kappa$ on $G\backslash\mathfrak{X}$ such that
$((\rho_\pi)_{\pi \in G\backslash\mathfrak{X}},\kappa)$ is a decomposition
of $\mu$. The corresponding disintegration formula may be written
$\int_{\mathfrak{X}} f(x)\,d\mu(x) = \int_{G\backslash\mathfrak{X}} \int_G f(gx)\,d\beta(g)\,d\kappa(\pi(x))$ (5.10)
where $\beta$ is right invariant measure on $G$. □
The measure $\kappa$ in theorem 5.2 is called the quotient measure and is
often denoted by $\mu/\beta$. It should be noted that the measures $\mu$
satisfying (5.9) are the only ones for which there exists a $\kappa$ such
that (5.10) is satisfied.
Note that in the particular case where the action of $G$ on
$\mathfrak{X}$ is free, so that $\mathfrak{X} = G \times \mathfrak{X}/G$,
theorem 5.2 yields, in effect, a factorization
$\mu = \beta \otimes \kappa$. (5.11)
Next, we give a similar decomposition without the properness condition,
but where we suppose that $\mathfrak{X}$ has regular orbit type $G/K$ and
that
(5.12)
Consider the subset $\mathfrak{X}_0$ of $\mathfrak{X}$ consisting of the
points with isotropic group $K$, i.e.
$\mathfrak{X}_0 = \{x \in \mathfrak{X} \mid G_x = K\}$. (5.13)
Then we might hope for a decomposition onto $G/K \times \mathfrak{X}_0$.
However, in general $\mathfrak{X}_0$ does not represent the orbit space.
Suppose that $u \in \mathfrak{X}_0$ and $gu \in \mathfrak{X}_0$,
$g \in G$. Then $K = G_{gu} = gG_ug^{-1} = gKg^{-1}$. This gives rise
to considering the normalizer $H$ of $K$ defined by
$H = \{g \in G \mid gKg^{-1} = K\}$. (5.14)
It follows that $K$ is a normal subgroup of $H$ and that the group $H/K$
acts freely on $\mathfrak{X}_0$ by
$\gamma_{H/K}: H/K \times \mathfrak{X}_0 \to \mathfrak{X}_0$
$(hK,u) \to hu$. (5.15)
This suggests that we seek a decomposition onto $G/K \times (H/K)\backslash\mathfrak{X}_0$.
First we need some preparations in order to state the theorem. Since
K is a normal subgroup of H, it follows that H acts on K by the
law
f:HxK~K
(h,k) ~ hkh-1 (5.16)
It is easy to see that $\alpha_K$ is relatively invariant. Denote the
multiplier by $\chi_0$, i.e.
(5.17)
In particular, we have that
(5.18)
From (5.12) and (5.18) it follows that
$\chi(hK) = \Delta_G(h)^{-1}\chi_0(h)^{-1}$ (5.19)
is a well-defined multiplier on $H/K$. According to theorem 4.1 there
exists a modulator $n$ on $\mathfrak{X}_0$ with $\chi$ as the associated
multiplier, i.e.
$n(hu) = \chi(hK)n(u)$, $u \in \mathfrak{X}_0$, $hK \in H/K$. (5.20)
If we let $\alpha_{G/K}$ denote the measure on $G/K$ which is invariant
under the action of $G$ then for each $\pi \in G\backslash\mathfrak{X}$ we
may define a measure $\rho_\pi \in \mathcal{M}(\mathfrak{X}_\pi)$ by the
prescription
$\rho_\pi(f) = \int_{G/K} n(u(x))f(gKu(x))\,d\alpha_{G/K}(gK)$, $x \in \mathfrak{X}_\pi$, (5.21)
where $u: \mathfrak{X} \to \mathfrak{X}_0$ is an arbitrary orbit
representative. The definition of $\rho_\pi$ is independent of $u$. We now
have the following theorem.
Theorem 5.3. Let $(G,\mathfrak{X})$ be a standard transformation group
of regular orbit type and suppose (5.12) is satisfied. Define $\rho_\pi$
by (5.21) and suppose $\mu$ is an invariant measure on $\mathfrak{X}$.
Then there exists a measure $\kappa$ on $G\backslash\mathfrak{X}$ such
that $((\rho_\pi)_{\pi \in G\backslash\mathfrak{X}},\kappa)$ is a
decomposition of $\mu$. The corresponding disintegration formula may be
written
$\int_{\mathfrak{X}} f(x)\,d\mu(x) = \int_{G\backslash\mathfrak{X}} \int_{G/K} n(u(x))f(gKu(x))\,d\alpha_{G/K}(gK)\,d\kappa(\pi(x))$. (5.22)
□
A straightforward application of theorem 5.3 shows the following
factorization theorem.
Theorem 5.4. Suppose $(G,\mathfrak{X})$ is a standard transformation
group of regular orbit type, and let $G$ act transitively on the
LCD-space $\mathfrak{Y}$. Furthermore, let
$s: \mathfrak{X} \to \mathfrak{Y}$ be a continuous mapping which commutes
with the actions of $G$ on $\mathfrak{X}$ and $\mathfrak{Y}$, i.e.
$s(gx) = gs(x)$, and suppose $(s,\pi)$ is proper.
If $\mu$ is an invariant measure on $\mathfrak{X}$ then
$(s,\pi)\mu = \rho \otimes \kappa$ (5.23)
where $\rho$ is the unique (up to a multiplicative constant) invariant
measure on $\mathfrak{Y}$ and $\kappa$ is a certain measure on the orbit
space $G\backslash\mathfrak{X}$. □
If $G$ acts properly on $\mathfrak{X}$ and on $\mathfrak{Y}$ and
$s: \mathfrak{X} \to \mathfrak{Y}$ is continuous and commuting with the
action of $G$, then the conditions of theorem 5.4 are satisfied. In this
case $\kappa$ equals the quotient measure $m^{-1}\mu/\beta$ where $\beta$
is the right invariant measure on $G$ and $m$ is a modulator on
$\mathfrak{X}$ with $\Delta$ as the associated multiplier.
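Example 5.1. The action of SO(p,q) on a cone. Consider the cone
$\mathfrak{X} = \{x \in \mathbb{R}^{p+q} \mid x * x > 0$, and $x_1 > 0$ if $p = 1\}$
and define $r(x) = (x * x)^{1/2}$, $x \in \mathfrak{X}$. Let $\mu$ be
Lebesgue measure on $\mathfrak{X}$, which is an open subset of
$\mathbb{R}^{p+q}$. The group $SO(p,q)$ acts linearly on
$\mathbb{R}^{p+q}$ and $\mathfrak{X}$ is an invariant subset under this
action. Hence $SO(p,q)$ also acts on $\mathfrak{X}$ and $\mu$ is
invariant. Let $\mathfrak{Y} = H^{p,q}$ and $s(x) = r(x)^{-1}x$. Then
$(s,r)\mu = \rho \otimes \kappa$
determines $\rho$ as an invariant measure on $H^{p,q}$. In particular,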
I f(s)dp (s) HP,q
CD
I f 1 (r)dK.(r) I f 2 (s)dp(s). o HP,q
I f(s(x»dx. {x€~lro~r(x)~r1}
1. Then
If, for instance, $q = 0$ this is the well-known characterization of
the invariant measure on $S^{p-1}$ given by
$\rho(A) = \lambda(\{x \in \mathbb{R}^p \mid 0 < \|x\| \le 1,\ \|x\|^{-1}x \in A\})$. □
In some situations it is possible to characterize the measure $\kappa$
in theorems 5.1-5.4 by invariance as will now be discussed.
Suppose that $(G,\mathfrak{X})$ is a standard transformation group. As
we shall see in a moment, there are instances where there exists a
'supplementary' group $H$ acting on $\mathfrak{X}$ by
$(h,x) \to xh^{-1}$, say. Combining this with the action of $G$, we obtain
an action of $G \times H$ on $\mathfrak{X}$ given by
$(G \times H) \times \mathfrak{X} \to \mathfrak{X}$
$((g,h),x) \to gxh^{-1}$. (5.24)
By referring to the preceding theorems, it is easy to establish the
following theorem.
Theorem 5.5. Suppose that the action (5.24) is transitive and that
$G$ (respectively $H$) acts transitively on the LCD-space $\mathfrak{Y}$
(respectively $\mathfrak{Z}$) and in addition that
$(s,r): \mathfrak{X} \to \mathfrak{Y} \times \mathfrak{Z}$
is a proper mapping with the property that
$s(gxh^{-1}) = gs(x)$
$r(gxh^{-1}) = hr(x)$.
Furthermore, suppose that both of the actions of $G$ and of $H$ on
$\mathfrak{X}$ - in turn - fulfil the conditions of either theorem 5.2 or
theorem 5.4.
Then if $\mu$ is invariant on $\mathfrak{X}$ under the action of
$G \times H$ we have that
$(s,r)(\mu) = \alpha \otimes \rho$
where $\alpha$ (respectively $\rho$) is invariant measure on
$\mathfrak{Y}$ (respectively $\mathfrak{Z}$). □
Remark. Observe that both $\alpha$ and $\rho$ may be interpreted as
quotient measures.
Example 5.2.
-1 (T, x) -+ T X
Consider example 5.1. The group * IR+ acts on 3:
Furthermore, r(x)-(p+q) ~(x) is invariant under the action of
* SO(p,q) x IR+, since ~ is relatively invariant with multiplier
by
that we may consider the action of
-1 (T,r) -+ T r.
The last equality also implies
* on ~ = r(~) = ffi+ given by
Applying theorem 5.5 we obtain that
$(s,r)(r(x)^{-(p+q)}d\mu(x)) = d\alpha(s)\,d\rho(r)$
where $\alpha$ (respectively $\rho$) is the invariant measure on
$H^{p,q}$ (respectively $\mathbb{R}_+^*$). □
Example 5.3. Consider examples 2.3 and 4.1 and the action of G
GA+(1) = ffi: x ffi on ~ = (x € ffinls(x) > O} given by
and
ant
[a,f]: X -+ ax + fxO
<','> an inner product on ffin.
u: ~ -+ ~ = {x€ffinlx=o, s(x)=1}
u (x) -1 -s (x) (x-xxo )
Furthermore, the maximal invari
is g'iven by
Let x € ffin be considered as a row vector and define
where
O(n) (V € GL(n) I <xV,yV>
is the group of isometries.
The measure djL(x) = s(X)-n dA(x) is invariant under the action of
G x H and it follows from theorem 5.5 that
«s,X) ,u) (IL) a ® p
where $\alpha$ is left invariant on $G$ and $\rho$ is invariant on
$\mathfrak{U}$ under the action of $H$. In fact, a bit of reflection
reveals that $\mathfrak{U} \cong O(n-1)/O(n-2)$, i.e. $\mathfrak{U}$ can
be considered as a sphere in $\mathbb{R}^{n-1}$, which is also clear by
noting that $\mathfrak{U}$ is the intersection between the sphere
$\{x \in \mathbb{R}^n \mid \langle x,x \rangle = 1\}$ and the hyperplane
$\{x \in \mathbb{R}^n \mid \langle x,x_0 \rangle = 0\}$. □
Another situation where the quotient measure can be characterized by
invariance, is obtained by considering a closed subgroup H of a group
G and supposing that
Then, for every g € G and h € H,
which shows that $\Delta_G$ is a modulator on $G$ with respect to left
action $\delta_H$ of $H$ on $G$. Hence, by (4.3), the measure
$\Delta_G^{-1}\alpha_G = \beta_G$ is a relatively invariant measure on $G$
with multiplier $\Delta_H^{-1}$ and we may therefore use theorem 5.2 to
decompose this measure. In fact, (5.11) applies so that $\beta_G$ is
factorized as $\beta_G = \beta_H \otimes \kappa$ with $\kappa$ being the
quotient measure $\beta_G/\beta_H$. Comparing this to corollary 4.2 we see
that $\beta_G/\beta_H$ must be invariant measure on $H\backslash G$,
relative to the natural action of $G$ on $H\backslash G$. There is, of
course, a similar factorization of $\alpha_G$, and we may summarize these
comments in the two formulas
$G = H \times G/H$
$\alpha_G = \alpha_H \otimes \alpha_G/\alpha_H$ (5.25)
and
$G = H \times H\backslash G$
$\beta_G = \beta_H \otimes \beta_G/\beta_H$ (5.26)
where $\alpha_G/\alpha_H$ and $\beta_G/\beta_H$ are measures on $G/H$ and
$H\backslash G$, invariant under the natural actions of $G$ on $G/H$ and
$H\backslash G$, respectively.
Bibliographical notes
A general treatment of disintegration of measures is given in
Bourbaki (1959). Theorems 1, 3, 4 and corollary 1 of this section are
proved in Eriksen (1989). Except for the conditions of orbit regularity,
theorem 4 generalizes lemma 3 of Andersson, Brøns and Jensen (1983),
cf. also Barndorff-Nielsen, Blæsild, Jensen and Jørgensen (1982).
Example 5.1 shows a situation which is not covered by this lemma, but
where the conditions for applying theorem 5.4 are fulfilled. Theorem
5.2 is a reproduction of proposition 2.2.4 of Bourbaki (1963), chapter
7. Finally, this section also contains a few remarks on Riemannian
geometry and we refer to Boothby (1975) for an excellent introduction
to Riemannian geometry and to Tjur (1980) for more details concerning
the decomposition of geometric measures.
6. Construction of invariant measures
We shall discuss here methods of constructing a $G$-invariant measure
$\mu$ on the space $\mathfrak{X}$, in the sense of expressing $\mu$ on the
form
$d\mu(x) = \varphi(x)\,d\lambda(x)$ (6.1)
for some function $\varphi$ and where $\lambda$ is some given measure on
$\mathfrak{X}$. If $\mathfrak{X}$ is an open subset of $\mathbb{R}^r$ then
$\lambda$ will typically be Lebesgue measure. More generally, $\lambda$
may be the geometric measure on $\mathfrak{X}$ when $\mathfrak{X}$ is a
Riemannian manifold (formula (5.5)).
The construction of $\varphi$ can often be achieved as follows. One
starts by showing that $\lambda$ is quasi-invariant under $G$ with
quasi-multiplier $\chi(g,x)$, say. Supposing that the conditions of
corollary 5.1 are fulfilled, we know that there exists an invariant
measure of the form (6.1). Essentially we have to check that
$\chi(k,x) = 1$, $x \in \mathfrak{X}$, $k \in G_x$.
Subsequently we seek a modulator $m$ with quasi-multiplier $\chi$, i.e. a
positive function $m$ on $\mathfrak{X}$ for which
$m(gx) = \chi(g,x)m(x)$.
Finally, we define $\varphi(x)$ as $1/m(x)$ and then $\mu$, given by
(6.1), is invariant. The problem is thus reduced to constructing a
suitable modulator $m$. The construction method on which we will
concentrate has, in fact, already been applied at the end of the proof of
corollary 5.1, i.e. if $(z,u): \mathfrak{X} \to G \times \mathfrak{U}$ is
an orbital decomposition as in lemma 2.1, then
$m(x) = \chi(z(x),u(x))$ is a modulator.
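The recipe can be tried out in the simplest possible setting. The sketch below is an illustration chosen here, not an example from the text: the multiplicative group of positive reals acts on the positive half-line by $x \to gx$, Lebesgue measure is assumed to have quasi-multiplier $\chi(g,x) = g$ under the convention $m(gx) = \chi(g,x)m(x)$, and $m(x) = x$ is then a modulator. The code checks by crude numerical integration that $m(x)^{-1}dx$ is invariant; the function names and the test function are hypothetical.

```python
import math

def integrate(h, a=1e-3, b=200.0, n=400_000):
    """Midpoint rule for the integral of h over [a, b]."""
    step = (b - a) / n
    return step * sum(h(a + (i + 0.5) * step) for i in range(n))

def f(x):
    """Arbitrary test function, numerically negligible outside [1e-3, 200]."""
    return math.exp(-math.log(x) ** 2)

def m(x):
    """Candidate modulator: m(g x) = g * m(x)."""
    return x

g = 3.0
lhs = integrate(lambda x: f(g * x) / m(x))  # integral of f(gx) against dx / m(x)
rhs = integrate(lambda x: f(x) / m(x))      # integral of f(x) against dx / m(x)
print(lhs, rhs)  # both should be close to sqrt(pi)
```

The two integrals agree, which is the statement that $d\mu(x) = x^{-1}dx$ is invariant under scaling.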
More specifically, we will treat the case where $\mathfrak{X}$ is a
subset of $\mathbb{R}^n$ and where $G$ acts linearly on $\mathfrak{X}$,
i.e. $\{\gamma(g) \mid g \in G\}$ are linear transformations of
$\mathbb{R}^n$ leaving $\mathfrak{X}$ invariant. Moreover, we will assume
that the following regularity conditions are satisfied
i) $\mathfrak{X}$ is an m-dimensional differentiable submanifold of $\mathbb{R}^n$
ii) $G = \{\gamma(g) \mid g \in G\}$ is a closed subgroup of $GL(n)$.
The last condition implies that $G$ is itself a differentiable
submanifold of the vector space of $n \times n$ matrices. For simplicity
we will assume that $\mathfrak{X}$ is covered by a single chart, i.e.
there exists an open subset $V$ of $\mathbb{R}^m$ and
$\psi: V \to \mathbb{R}^n$ so that $\psi$ is a parametrization of
$\mathfrak{X}$ and $\psi$ has differential $D\psi$ of rank $m$. The
geometric measure $\lambda$ on $\mathfrak{X}$ is given by
$d\lambda(x) = J_\psi(v)\,d\lambda_m(v)$, $x = \psi(v)$,
where $\lambda_m$ is Lebesgue measure on $V$ and
$J_\psi(v) = |D\psi^*(v)D\psi(v)|^{1/2}$, cf. (5.6). The mapping
$\Gamma(g)(v) = \psi^{-1}(\gamma(g)\psi(v))$ on $V$ induced by
$\gamma(g)$ is a diffeomorphism and the Jacobian of $\Gamma(g)$ can be
calculated as
$J_{\Gamma(g)}(v) = |D\psi^*(v)\gamma(g)^*\gamma(g)D\psi(v)|^{1/2}\,J_\psi(\Gamma(g)(v))^{-1}$.
Furthermore, the geometric measure on $\mathfrak{X}$ is seen to be
quasi-invariant with quasi-multiplier
$J_{\gamma(g)}(x) = |D\psi^*(\psi^{-1}(x))\gamma(g)^*\gamma(g)D\psi(\psi^{-1}(x))|^{1/2}\,J_\psi(\psi^{-1}(x))^{-1}$. (6.2)
Now, suppose that $\mathfrak{X}$ has regular orbit type $G/K$ and let
$(z,u): \mathfrak{X} \to G/K \times \mathfrak{U}$ be an orbital
decomposition according to lemma 2.2. Summarizing the considerations
above we then have that there exists an invariant measure $\mu$ on
$\mathfrak{X}$ if and only if, for every $u$,
$J_{\gamma(k)}(u) = 1$, $k \in K$. (6.3)
In this case the function
$m(x) = J_{\gamma(g)}(u)$, $g \in z(x)$, (6.4)
is a modulator with quasi-multiplier (6.2), and hence the measure $\mu$
given by
$d\mu(x) = J_{\gamma(g)}(u)^{-1}\,d\lambda(x)$ (6.5)
is invariant under the action $\gamma$ of $G$ on $\mathfrak{X}$. Note
that, by corollary 5.1, condition (6.3) is equivalent to
$\Delta_K(k) = \Delta_G(k)$, $k \in K$.
In particular, let $\mathfrak{X} = G$ and consider the left action
$\delta$ of $G$ on itself. This action is transitive and the isotropic
group of any element in $G$ is trivial. The conditions for applying (6.5)
are therefore trivially fulfilled and we find that left invariant measure
$\alpha$ on $G$ exists and is of the form
$d\alpha(g) = J_{\delta(g)}(e)^{-1}\,d\lambda(g)$. (6.6)
Similarly we find that right invariant measure $\beta$ is given by
$d\beta(g) = J_{\varepsilon(g^{-1})}(e)^{-1}\,d\lambda(g)$. (6.7)
On comparing (4.3), (6.6) and (6.7) we obtain the following important
formula for calculating the module of $G$
$\Delta(g) = J_{\varepsilon(g^{-1})}(e)\,J_{\delta(g)}(e)^{-1}$. (6.8)
Let us introduce an action $C = C_G$ - the conjugation diffeomorphism -
of $G$ on itself by the prescription
$C(g)g' = gg'g^{-1}$.
Thus $C = \varepsilon \circ \delta = \delta \circ \varepsilon$ and
$\Delta(g) = J_{C(g)}(e)^{-1}$. (6.9)
Example 6.1. The invariant measures on GL(n) and GA(n). Let
$V = \{(v_i)_{i=1}^n \mid v_1,\ldots,v_n$ is a basis of $\mathbb{R}^n\}$.
Then $V$ is an open subset of $(\mathbb{R}^n)^n$ and
$\psi: V \to GL(n)$
$(v_i)_{i=1}^n \to \{v_1,\ldots,v_n\}$
parametrizes $GL(n)$. We have $J_\psi = 1$ so that the geometric measure
$\lambda$ on $GL(n)$ is the restriction of the Lebesgue measure on
$(\mathbb{R}^n)^n$ to the open set $GL(n)$. Now let $g \in GL(n)$ and let
$\Gamma(g) = \psi^{-1} \circ \delta(g) \circ \psi$. Then
$\Gamma(g)((v_i)_{i=1}^n) = (gv_i)_{i=1}^n$ is linear and
$J_{\Gamma(g)}(v) = |g|^n$ so that
$d\alpha(g) = |g|^{-n}\,d\lambda(g)$
is left invariant. Similarly we have
$d\beta(g) = |g|^{-n}\,d\lambda(g)$,
showing that $GL(n)$ is unimodular.
The general affine group
$GA(n) = \{[g,x] \mid g \in GL(n), x \in \mathbb{R}^n\}$ with the
multiplication rule $[g_0,x_0][g,x] = [g_0g,g_0x+x_0]$ is parametrized by
$\tilde\psi: V \times \mathbb{R}^n \to GA(n)$
$(v,x) \to [\psi(v),x]$
and the geometric measure is clearly given by $d\lambda(g)\,dx$. The
mapping
$\tilde\psi^{-1} \circ \delta([g,x_0]) \circ \tilde\psi(v,x) = ((gv_i)_{i=1}^n, gx+x_0)$
is affine with Jacobian $|g|^{n+1}$ so that
$d\alpha(g,x) = |g|^{-n-1}\,d\lambda(g)\,dx$
is left invariant.
With the parametrization
$\bar\psi: V \times \mathbb{R}^n \to GA(n)$
$(v,x) \to [\psi(v)^*,x]$
it is seen that
$\bar\psi^{-1} \circ \varepsilon([g,x_0]^{-1}) \circ \bar\psi(v,x)$ is
linear with Jacobian $|g|^n$ so that
$d\beta(g,x) = |g|^{-n}\,d\lambda(g)\,dx$
is right invariant and the module of $GA(n)$ is given by
$\Delta_{GA(n)}(g,x) = |g|^{-1}$. □
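The two Jacobians used for $GA(n)$ can be verified directly for small $n$. The following is a minimal sketch for $n = 2$, assuming row-major coordinates for the matrix part; the fixed element `[g0, x0]` and all helper names are hypothetical choices for illustration. The linear parts of left and right multiplication are assembled as matrices on the six coordinates and their determinants are compared with $|g|^{n+1}$ and $|g|^n$.

```python
def det(a):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    size = len(a)
    result = 1.0
    for c in range(size):
        p = max(range(c, size), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for cc in range(c, size):
                a[r][cc] -= factor * a[c][cc]
    return result

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matrix_of_linear_map(func, dim):
    """Matrix whose columns are the images of the standard basis vectors."""
    cols = []
    for j in range(dim):
        e = [0.0] * dim
        e[j] = 1.0
        cols.append(func(e))
    return [[cols[j][i] for j in range(dim)] for i in range(dim)]

n = 2
g0 = [[2.0, 1.0], [0.5, 3.0]]   # matrix part of the fixed group element
x0 = [0.7, -1.2]                # translation part of the fixed group element

def unpack(vec):
    return [vec[i * n:(i + 1) * n] for i in range(n)], vec[n * n:]

def pack(a, x):
    return [v for row in a for v in row] + list(x)

def left_linear_part(vec):
    # [g0, x0][a, x] = [g0 a, g0 x + x0]; the constant x0 does not affect the Jacobian
    a, x = unpack(vec)
    gx = [sum(g0[i][k] * x[k] for k in range(n)) for i in range(n)]
    return pack(matmul(g0, a), gx)

def right_map(vec):
    # [a, x][g0, x0] = [a g0, a x0 + x], which is linear in (a, x)
    a, x = unpack(vec)
    ax0 = [sum(a[i][k] * x0[k] for k in range(n)) for i in range(n)]
    return pack(matmul(a, g0), [ax0[i] + x[i] for i in range(n)])

dim = n * n + n
jac_left = det(matrix_of_linear_map(left_linear_part, dim))
jac_right = det(matrix_of_linear_map(right_map, dim))
d = det(g0)
print(jac_left, d ** (n + 1))
print(jac_right, d ** n)
```

The ratio of the two Jacobians is $|g|$, consistent with a module of $|g|^{-1}$ once the densities $|g|^{-n-1}$ and $|g|^{-n}$ are compared.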
Example 6.2. The invariant measures on T+(n). Consider example 4.3 and
introduce the following parametrization of $T_+(n)$. Let
$t_k = (t_{1k},\ldots,t_{kk})^*$, $k = 1,\ldots,n$, with $t_k$ ranging
over a set $S_k$, and define $\psi: \prod_{k=1}^n S_k \to T_+(n)$ by
$(t_k)_{k=1}^n \to T$ where
$(T)_{ij} = t_{ij}$ if $i \le j$, and $(T)_{ij} = 0$ otherwise.
It is obvious that $J_\psi \equiv 1$ so that the geometric measure on
$T_+(n)$ is given by $d\lambda(T) = \prod_{i \le j} dt_{ij}$. If
$g = \{g_{ij}\}_{i,j=1}^n \in T_+(n)$ and $h_k = \{g_{ij}\}_{i,j=1}^k$
then the mapping
$\psi^{-1} \circ \delta(g) \circ \psi(t) = (h_kt_k)_{k=1}^n$ is linear
with determinant
$\prod_{k=1}^n |h_k| = \prod_{k=1}^n \prod_{i=1}^k g_{ii} = \prod_{k=1}^n g_{kk}^{n-k+1}$.
Consequently, the left invariant measure on $T_+(n)$ is given by
$d\alpha(T) = \prod_{i=1}^n t_{ii}^{i-n-1}\,d\lambda(T)$.
In a similar way we obtain the right invariant measure as
$d\beta(T) = \prod_{i=1}^n t_{ii}^{-i}\,d\lambda(T)$
and the module of $T_+(n)$ is given by
$\Delta(T) = \prod_{i=1}^n t_{ii}^{2i-n-1}$. □
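The determinants behind the left and right invariant measures on $T_+(n)$ can be confirmed numerically. This is a minimal sketch for $n = 3$, assuming the upper-triangular convention ($t_{ij}$ for $i \le j$) of the example; the particular matrix `g` and the helper names are hypothetical.

```python
def det(a):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    size = len(a)
    result = 1.0
    for c in range(size):
        p = max(range(c, size), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for cc in range(c, size):
                a[r][cc] -= factor * a[c][cc]
    return result

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def matrix_of_linear_map(func, dim):
    """Matrix whose columns are the images of the standard basis vectors."""
    cols = []
    for j in range(dim):
        e = [0.0] * dim
        e[j] = 1.0
        cols.append(func(e))
    return [[cols[j][i] for j in range(dim)] for i in range(dim)]

n = 3
idx = [(i, j) for i in range(n) for j in range(i, n)]  # upper-triangular positions

def to_matrix(vec):
    t = [[0.0] * n for _ in range(n)]
    for value, (i, j) in zip(vec, idx):
        t[i][j] = value
    return t

def to_vector(t):
    return [t[i][j] for (i, j) in idx]

g = [[2.0, 0.3, -1.0],
     [0.0, 0.5, 4.0],
     [0.0, 0.0, 3.0]]

jac_left = det(matrix_of_linear_map(
    lambda v: to_vector(matmul(g, to_matrix(v))), len(idx)))   # T -> g T
jac_right = det(matrix_of_linear_map(
    lambda v: to_vector(matmul(to_matrix(v), g)), len(idx)))   # T -> T g

expected_left = 1.0
expected_right = 1.0
for i in range(1, n + 1):
    expected_left *= g[i - 1][i - 1] ** (n - i + 1)
    expected_right *= g[i - 1][i - 1] ** i
print(jac_left, expected_left)
print(jac_right, expected_right)
```

The left Jacobian $\prod g_{ii}^{n-i+1}$ and the right Jacobian $\prod g_{ii}^{i}$ give the densities $\prod t_{ii}^{i-n-1}$ and $\prod t_{ii}^{-i}$, whose ratio is the module $\prod t_{ii}^{2i-n-1}$.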
Example 6.3. GL(n)-invariant measure on PD(n). Consider the action
of $GL(n)$ on $PD(n)$ given by (4.13). It is clear that $PD(n)$ is an
open subset of $S(n)$, the vector space of symmetric $n \times n$
matrices, so that the geometric measure on $PD(n)$ is the restriction
$d\Sigma$ of the Lebesgue measure on $S(n)$. The isotropic group of
$I \in PD(n)$ is $O(n)$, and $GL(n) = T_+(n)O(n)$ is a left
factorization. This means that if $\Sigma_0 = T_0T_0^* \in PD(n)$ with
$T_0 \in T_+(n)$, we just have to calculate the determinant of the linear
isomorphism
$\Gamma(T_0): S(n) \to S(n)$
$\Sigma \to T_0\Sigma T_0^*$.
The invariant measure is then given by $|\Gamma(T)|^{-1}d\Sigma$ where
$\Sigma = TT^*$. Now, a bit of calculation shows that
$|\Gamma(T_0)| = |T_0|^{n+1} = |\Sigma_0|^{(n+1)/2}$, so that
$|\Sigma|^{-(n+1)/2}\,d\Sigma$ (6.10)
is the invariant measure on $PD(n)$.
Alternatively, consider the measure on $PD(n)$ given by
$\mu(f) = \int_{GL(n)} f(AA^*)\,d\alpha_G(A)$
where $\alpha_G$ is the (left) Haar measure on $GL(n)$. To see that this
integral is well-defined suppose that $f \in \mathcal{K}(PD(n))$, i.e.
$f$ is continuous with compact support. Then we have to verify that
$(A \to f(AA^*)) \in \mathcal{K}(GL(n))$, but this is obvious since the
isotropic groups are compact. Furthermore, it is easy to see that $\mu$
is invariant, i.e. $g\mu = \mu$, $g \in GL(n)$. Now $T_+(n) \times O(n)$
acts transitively and freely on $GL(n)$ by the prescription
$\gamma(T,U): A \to TAU^*$.
By Example 6.1, $\alpha_G$ is both left and right Haar measure, implying
that $\alpha_G$ is also invariant under $\gamma$ and the same holds for
the measure given by
$\nu(f) = \int_{T_+(n)} \int_{O(n)} f(TU^*)\,dU\,d\alpha_{T_+}(T)$
where $dU$ is the (left) Haar measure on $O(n)$ and $\alpha_{T_+}$ is the
left Haar measure on $T_+(n)$. This shows that
$\mu(f) = \int_{GL(n)} f(AA^*)\,d\alpha_G(A) = \int_{T_+(n)} \int_{O(n)} f(TU^*UT^*)\,dU\,d\alpha_{T_+}(T) = \int_{T_+(n)} f(TT^*)\,d\alpha_{T_+}(T)$
if $dU$ is normalized to be a probability measure on $O(n)$.
Consequently, we obtain that
$\mu(f) = \int_{T_+(n)} f(TT^*)\,d\alpha_{T_+}(T)$ (6.11)
is another way of representing the invariant measure on $PD(n)$. □
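The determinant identity $|\Gamma(T_0)| = |T_0|^{n+1} = |\Sigma_0|^{(n+1)/2}$ is the "bit of calculation" on which the example rests, and it can be checked numerically. The sketch below does so for $n = 3$, taking the coordinates $\sigma_{ij}$, $i \le j$, on the symmetric matrices; the matrix `T0` and the helper names are hypothetical.

```python
def det(a):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    size = len(a)
    result = 1.0
    for c in range(size):
        p = max(range(c, size), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for cc in range(c, size):
                a[r][cc] -= factor * a[c][cc]
    return result

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

n = 3
idx = [(i, j) for i in range(n) for j in range(i, n)]  # coordinates sigma_ij, i <= j

def to_symmetric(vec):
    s = [[0.0] * n for _ in range(n)]
    for value, (i, j) in zip(vec, idx):
        s[i][j] = value
        s[j][i] = value
    return s

T0 = [[2.0, 0.3, -1.0],
      [0.0, 0.5, 4.0],
      [0.0, 0.0, 3.0]]

def gamma(vec):
    """Sigma -> T0 Sigma T0^* in the coordinates sigma_ij, i <= j."""
    s = matmul(matmul(T0, to_symmetric(vec)), transpose(T0))
    return [s[i][j] for (i, j) in idx]

cols = []
for j in range(len(idx)):
    e = [0.0] * len(idx)
    e[j] = 1.0
    cols.append(gamma(e))
jac = det([[cols[j][i] for j in range(len(idx))] for i in range(len(idx))])

sigma0 = matmul(T0, transpose(T0))
print(jac, det(T0) ** (n + 1), det(sigma0) ** ((n + 1) / 2))
```

All three printed numbers agree, so the density $|\Sigma|^{-(n+1)/2}$ exactly compensates the Jacobian of the action.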
Example 6.4. The Blaschke-Petkantschin formula. In continuation of
example 2.8 and example 4.6 let $dX$ denote the restriction to
$gl(p,n)^t$ of Lebesgue measure on $gl(p,n)$ and consider the action of
$G = GL(p) \times O(n)$ on $gl(p,n)^t$ given by
$\gamma: G \times gl(p,n)^t \to gl(p,n)^t$
$((A,U),X) \to AXU^*$.
Then $dX$ is relatively invariant with multiplier
$\chi(A,U) = |\det(A)|^n$, as follows from observing that
$\gamma(A,U) = \delta(A) \circ \varepsilon(U)$ and that $\delta(A)$,
$\varepsilon(U)$ are linear. Secondly, we have in analogy with the
considerations in example 6.1 that
$\det(\gamma(A,U)) = \det(\delta(A))\det(\varepsilon(U)) = \det(A)^n\det(U)^p = \pm\det(A)^n$.
On the other hand, let $|A|^{-p}dA$ denote the invariant measure on
$GL(p)$ and let $dL$ denote the invariant measure on
$G(p,n) = GL(p)\backslash gl(p,n)^t$.
Now, define a measure $\mu$ on $gl(p,n)^t$ by
$\mu(f) = \int_{G(p,n)} \int_{GL(p)} f(AX)\,|A|^n\,|A|^{-p}\,dA\,dL$.
It is easily verified that $\mu$ is also relatively invariant with
multiplier $\chi$. With a suitable normalization it follows that we have
$\int_{gl(p,n)^t} f(X)\,dX = \int_{G(p,n)} \int_{GL(p)} f(AX)\,|A|^{n-p}\,dA\,dL$.
This is known as the Blaschke-Petkantschin formula (Blaschke 1935a,
1935b and Petkantschin 1936), and is a useful tool in stochastic
geometry, e.g. in proving unbiasedness of stereological estimators (see,
for instance, Miles, 1979, Jensen and Gundersen 1988, and Jensen, Kieu
and Gundersen, 1988). □
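The multiplier $|\det(A)|^n$ of $dX$ can be checked by writing the linear map $X \to AXU^*$ as a matrix on the $pn$ coordinates of $X$. The following is a minimal sketch for $p = 2$, $n = 3$; the matrices `A` and `U` and the helper names are hypothetical choices for illustration, with `U` a rotation about the third axis.

```python
import math

def det(a):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    size = len(a)
    result = 1.0
    for c in range(size):
        p = max(range(c, size), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for cc in range(c, size):
                a[r][cc] -= factor * a[c][cc]
    return result

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

p_dim, n_dim = 2, 3
A = [[2.0, 1.0], [0.5, 3.0]]
c, s = math.cos(0.7), math.sin(0.7)
U = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]   # an element of O(3)

def action(vec):
    """X -> A X U^* with X stored row by row as a vector of length p*n."""
    X = [vec[i * n_dim:(i + 1) * n_dim] for i in range(p_dim)]
    Y = matmul(matmul(A, X), transpose(U))
    return [v for row in Y for v in row]

dim = p_dim * n_dim
cols = []
for j in range(dim):
    e = [0.0] * dim
    e[j] = 1.0
    cols.append(action(e))
jac = det([[cols[j][i] for j in range(dim)] for i in range(dim)])
print(abs(jac), abs(det(A)) ** n_dim)
```

Only $A$ contributes to the absolute value of the determinant, since $|\det(U)| = 1$.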
Example 6.5. Invariance subgroups of GL(n). Let $\Sigma \in GL(n)$ and
let $H$ be a closed subgroup of
$G = \{g \in GL(n) \mid g\Sigma g^* = \Sigma\}$. Suppose that
$\psi: V \to \mathbb{R}^{n^2}$ is a local parametrization such that
$\varphi = \psi^{-1}$ provides a chart on $H$, and let $L_g$ ($R_g$)
denote left (right) translation by $g \in GL(n)$. Then $L_g$ and $R_g$
are linear isomorphisms of $\mathbb{R}^{n^2}$. In order to determine a
left invariant measure on $H$, we must calculate the Jacobian at the
identity of the diffeomorphism
$\Gamma(L_g) = \psi^{-1} \circ L_g \circ \psi: V \to V$, $\psi^{-1}(g) \in V$.
Now, and with -1
~ (e) , we obtain the follow-
ing equation for the differentials
This implies that
* ~ ~ ~ (e) 0 L -1 0 R~ 0 ~(e),
~
where the last equality follows from the fact that L * 0 L -1 0 Lg g ~
L -1. ~
Hence, taking determinants it follows that
where the constant c is given by
Consequently, a left invariant measure on H is given by
similarly, replacing Lg by Rg and observing that R * 0 R~ 0 Rg g
R~, it follows that a is also right invariant, showing that H is
unimodular.
In particular, this example applies in case H is a closed subgroup
of O(n) or O(1,q) or O(p,q). 0
Example 6.6. Orbits of invariance subgroups. Let H be a group of
the type considered in the previous example and let Xo € mn. Then we
want to determine an invariant measure on ~ = {gxolg € H}, i.e. on
the orbit of
Suppose that is a parametrization of ~. Then
arguments similar to those given in example 6.5 show that
(6.13)
is an invariant measure on ~. We consider three special cases.
(i) Next, consider example 3.5 and the parametrization of
given by the polar coordinates
"'1 (9) R(9)XO' 9 € 9 , (6.14)
where R(9) is a sequence of rotations. Some tedious calculations
yield
J(9)
so that the invariant measure on $S^{p-1}$, which is also the geometric
measure, is given by
$d\mu_1(\theta) = J(\theta)\,d\theta$. (6.15)
(ii) Reconsider example 3.6 and the parametrization of Hq given by
U € IR, (6.16)
where 'l'(u,t) = chu + 1/2 e U lltll2 and ~(u,t) = shu + 1/2 e Ulltll2.
Performing the requisite calculations, we obtain
{-I-lItIl 2
D>/I;(U,t)II D>/I2(U,t) = ,q -t
* } -t
-I q-l
which has determinant $(-1)^q$ so that the invariant measure on $H^q$ is
given by
$du\,dt_1 \cdots dt_{q-1}$. (6.17)
(iii) Consider example 3.7 and the parametrization of HP,q (except
for a null set) given by
[
'T (u,tHl (8)] ~(8,u,t) = ~(:,t) (6.18)
Calculations yield
* ~ (8,u,t)I ~(8,u,t) p,q
which has determinant $(-1)^q J(\theta)^2 \tau^{2(p-1)}$, so that the
invariant measure on $H^{p,q}$ can be represented as
$d\mu(\theta,u,t) = (\operatorname{ch} u + \tfrac{1}{2}e^u\|t\|^2)^{p-1}\,J(\theta)\,d\theta\,du\,dt$. (6.19)
An alternative parametrization of HP,q is given by
;j;(8,s) (6.20)
where /3(s) (1 + IIsIl2)1/2, and we obtain
-2 * J /3 ss-I
with determinant $(-1)^q J(\theta)^2 \beta^{2(p-2)}$, so that another
representation of the invariant measure is given by
$d\mu(\theta,s) = (1 + \|s\|^2)^{(p-2)/2}\,J(\theta)\,d\theta\,ds$. (6.21)
o
Example 6.7. sol (l,gj-invariant measure on cone surface. Consider 1 the action of so (l,q), q ~ 2 on the orbit
!'r o o}
In this case the methodology of the previous example breaks down, be
cause the determinant in (6.13) is identically zero. Instead we use
(6.5) to construct an invariant measure. This requires an orbital de
composition, which may be constructed via Lie algebras along the lines
in section 3, leading to the conclusion that
i.e. ~o is isomorphic to a generalized hyperboloid.
However, the decomposition is almost trivial to establish in the
following way.
We consider the parametrization
"'(x) = ["~"] ,
and define the decomposition by
"'(x) = g(x)uo
where
e~ (1,0, •• ,0) € mq ,
g(x) o } {Z(X) U(x) 0
{~(IIXII+IIXIl-1) ~(IIXII-IIXIl-1)} z(x)
~(IIXII-IIXIl-1) ~(IIXII+IIXIl-l)
and U(x) € O(q) is a sequence of rotations determined by
since
x II xII
it is easy to see that
Applying (6.5) it follows that
$d\mu(x) = \|x\|^{-1}\,dx$
is the invariant measure on $\mathfrak{X}_0$.
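The invariance of $\|x\|^{-1}dx$ can be spot-checked for a single boost. The sketch below takes $q = 2$ and assumes that the cone surface is the set of points $(\|x\|, x)$ and that a boost with rapidity $t$ mixes the first coordinate with $x_1$; these conventions and the function names are assumptions made for the illustration. Invariance at a point means that the density at the image times the Jacobian of the induced map equals the density at the original point.

```python
import math

def boost(x, t):
    """Induced action on x of a boost mixing the first coordinate ||x|| with x[0]."""
    r = math.hypot(x[0], x[1])
    return (math.sinh(t) * r + math.cosh(t) * x[0], x[1])

def jacobian_det(func, x, h=1e-6):
    """Determinant of the 2 x 2 derivative matrix of func at x (central differences)."""
    fxp, fxm = func((x[0] + h, x[1])), func((x[0] - h, x[1]))
    fyp, fym = func((x[0], x[1] + h)), func((x[0], x[1] - h))
    a = (fxp[0] - fxm[0]) / (2 * h)
    c = (fxp[1] - fxm[1]) / (2 * h)
    b = (fyp[0] - fym[0]) / (2 * h)
    d = (fyp[1] - fym[1]) / (2 * h)
    return a * d - b * c

def density(x):
    """Candidate invariant density 1 / ||x||."""
    return 1.0 / math.hypot(x[0], x[1])

x = (0.8, -1.5)
t = 0.9
y = boost(x, t)
ratio = density(y) * abs(jacobian_det(lambda z: boost(z, t), x)) / density(x)
print(ratio)  # should be close to 1
```

Rotations of $x$ leave both $\|x\|$ and Lebesgue measure unchanged, so the boost is the only non-trivial case to check.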
For later purposes we note that ~o is also invariant under the
action of ~: x sol (l,q) given by
(a ,A) : [ II XxII]
and it is obvious that $\mu$ is relatively invariant with multiplier
$\chi(a,A) = a^{q-1}$. □
We close this section by an example illustrating theorem 5.5, which
characterizes the quotient measure by invariance.
Example 6.8. $SO(p,q) \times \mathbb{R}_+^*$-invariant measure on a cone.
Consider example 5.2 and the action of $SO(p,q) \times \mathbb{R}_+^*$ on
$\mathfrak{X} = \{x \in \mathbb{R}^{p+q} \mid x * x > 0$, and $x_1 > 0$ if $p = 1\}$, given by
The mapping
(s, r) :
transforms the invariant measure
respectively p, is invariant on
r(x)-(p+q) dx into
HP,q, respectively
where a,
The invariant measure on is given by -1 r dr
f f(x)r(x)-(P+q)dx !I
CX>
f f f(r- 1s)r-1drda(s) HP,q 0
or equivalently, applying (6.21),
f f(x)dx !I
so that
In case $q = 0$ this is the well-known polar decomposition of Lebesgue
measure on $\mathbb{R}^p$. □
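The polar decomposition mentioned for $q = 0$ can be illustrated numerically. The sketch below takes $p = 2$ and uses the familiar form in which $dx$ corresponds to $r^{p-1}\,dr$ times arc length on the unit circle; the test function and the quadrature parameters are hypothetical choices, and the integrals are simple midpoint sums.

```python
import math

def f(x1, x2):
    """Arbitrary rapidly decaying, non-radial test function."""
    return math.exp(-(x1 * x1 + x2 * x2)) * (1.0 + x1 * x1)

def cartesian_integral(n=300, L=6.0):
    """Midpoint rule for the integral of f over the square [-L, L]^2."""
    h = 2 * L / n
    total = 0.0
    for i in range(n):
        x1 = -L + (i + 0.5) * h
        for j in range(n):
            x2 = -L + (j + 0.5) * h
            total += f(x1, x2)
    return total * h * h

def polar_integral(nr=300, nt=200, R=6.0):
    """Midpoint rule in (r, theta) with the factor r^(p-1), here p = 2."""
    hr = R / nr
    ht = 2 * math.pi / nt
    total = 0.0
    for i in range(nr):
        r = (i + 0.5) * hr
        for j in range(nt):
            th = (j + 0.5) * ht
            total += f(r * math.cos(th), r * math.sin(th)) * r
    return total * hr * ht

lhs = cartesian_integral()
rhs = polar_integral()
print(lhs, rhs)  # both approximate 3*pi/2
```

Dividing Lebesgue measure by $r(x)^p$ turns the radial factor into $r^{-1}dr$, the invariant measure on the positive reals that appears in the example.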
Bibliographical notes
This section consists primarily of examples involving the notion of
differentiable manifolds and Lie groups. A comprehensive, but rather
succinct, exposition of these concepts is given by Helgason (1978),
chapters I and II, while Cohn (1957) provides a very nice introduction
to Lie groups.
7. Exterior calculus
The exterior calculus of differential geometry provides procedures
for factorization of measures and for the construction of invariant
measures, which in many cases constitute a shortcut to the result. We
wish here to indicate the technique so as to enable the reader to apply
it without having to study exterior calculus as such. Accordingly, the
discussion will in the present section be somewhat informal in
comparison with the previous sections. For a comprehensive and rigorous
exposition of exterior calculus see, for instance, Edelen (1985).
Exterior calculus can be said to be the calculus of differentials.
We shall start by illustrating, through an example, how manipulations
with differentials can sometimes, in a simple and elegant way, lead to
a desired factorization of a measure. Actually, in the example, we use
- except for a reference to the result (5.11) - little more than
standard reasoning of ordinary calculus.
Example 7.1. A factorization of Lebesgue measure on PD(n). Let
$d\mu(\Sigma) = \prod_{i \le j} d\sigma_{ij}$ (7.1)
be Lebesgue measure on the set $PD(n)$ of positive definite $n \times n$
matrices viewed as a subset of $\mathbb{R}^{n(n+1)/2}$. We seek a
factorization of $\mu$ corresponding to the spectral decomposition
$\Sigma = U\Lambda U^*$ (7.2)
of an arbitrary matrix $\Sigma \in PD(n)$, where $\Lambda$ is the diagonal
matrix of eigenvalues of $\Sigma$ and $U$ is an element of $O(n)$.
Disregarding the nullset $N$ of $PD(n)$ corresponding to multiple
characteristic roots we may assume that the diagonal elements of
$\Lambda$ satisfy $0 < \lambda_{11} < \lambda_{22} < \cdots < \lambda_{nn}$.
The set of all such $\Lambda$ will be denoted by $\mathfrak{L}$, and we
shall write $\lambda_i$ for $\lambda_{ii}$. There are $(n-1)n/2$
functionally independent elements of $U$ and we choose to work with
$u_{ij}$, $1 \le i < j \le n$, as such a set of elements.
The group $G = O(n)$ acts on $PD(n)$ by the law
$O(n) \times PD(n) \to PD(n)$
$(U,\Sigma) \to U\Sigma U^*$.
Under this action the set $\mathfrak{X} = PD(n)\backslash N$ is invariant
and of constant orbit type, the matrix $\Lambda$ is a maximal invariant
and the isotropy group at $\Lambda$ consists of the set of diagonal
matrices whose diagonal elements are +1 or -1. Furthermore, the action is
proper and the measure (7.1) is invariant under the action. Hence, by
theorem 5.2, we have that $\mu$ factorizes in the sense that for any
integrable $f$
$\int_{\mathfrak{X}} f(\Sigma)\,d\mu(\Sigma) = \int_{\mathfrak{L}} \int_G f(U\Lambda U^*)\,d\beta(U)\,d\kappa(\Lambda)$ (7.3)
where $\beta$ is right invariant measure on $O(n)$.
The mapping
$G \times \mathfrak{L} \to \mathfrak{X}$
$(U,\Lambda) \to U\Lambda U^*$
is locally a diffeomorphism and consequently $\kappa$ has a density
$h(\Lambda)$ with respect to Lebesgue measure $\prod d\lambda_i$ on
$\mathfrak{L}$. To find $h(\Lambda)$ we first note that (7.2) entails
$d\Sigma = dU\Lambda U^* + Ud\Lambda U^* + U\Lambda dU^*$ (7.4)
where $d\Sigma$, $dU$ and $d\Lambda$ are the $n \times n$ matrices
consisting of the differentials of $\Sigma$, $U$ and $\Lambda$,
respectively. Since $UU^* = I$ we have $dUU^* + UdU^* = 0$ and hence
$d\Sigma = Ud\Lambda U^* + dUU^*\Sigma - \Sigma dUU^*$ (7.5)
from which we may determine an expression for
$\prod_{i \le j} d\sigma_{ij}$ and consequently the form of
$h(\Lambda)$. In view of the factorization (7.3) it suffices to consider
the case $U = I$, when
$d\Sigma = d\Lambda + dU\Lambda - \Lambda dU$
or, expressed in terms of the entries of the matrices,
$d\sigma_{ij} = \delta_{ij}\,d\lambda_i + (\lambda_j - \lambda_i)\,du_{ij}$
where $\delta_{ij}$ is the Kronecker delta. Consequently,
$h(\Lambda) = \prod_{i<j} (\lambda_j - \lambda_i)$. □
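The density $h(\Lambda)$ can be confirmed numerically at $U = I$. The following is a minimal sketch for $n = 3$, assuming that $O(3)$ is parametrized near the identity by a product of three plane rotations, so that each angle changes exactly one $u_{ij}$, $i < j$, to first order (up to sign); the eigenvalues and all helper names are hypothetical. The Jacobian of $(\lambda,\theta) \to U\Lambda U^*$ is computed by central differences.

```python
import math

def det(a):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    size = len(a)
    result = 1.0
    for c in range(size):
        p = max(range(c, size), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for cc in range(c, size):
                a[r][cc] -= factor * a[c][cc]
    return result

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def rotation(i, j, t):
    """Rotation by angle t in the (i, j) coordinate plane of R^3."""
    r = [[1.0 if a == b else 0.0 for b in range(3)] for a in range(3)]
    r[i][i] = r[j][j] = math.cos(t)
    r[i][j] = -math.sin(t)
    r[j][i] = math.sin(t)
    return r

def sigma_coords(params):
    """(lambda_1..3, theta_12, theta_13, theta_23) -> entries sigma_ij, i <= j."""
    lam = params[:3]
    U = matmul(matmul(rotation(0, 1, params[3]), rotation(0, 2, params[4])),
               rotation(1, 2, params[5]))
    L = [[lam[i] if i == j else 0.0 for j in range(3)] for i in range(3)]
    S = matmul(matmul(U, L), transpose(U))
    return [S[0][0], S[1][1], S[2][2], S[0][1], S[0][2], S[1][2]]

def numerical_jacobian(func, x, h=1e-6):
    m = len(x)
    cols = []
    for j in range(m):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = func(xp), func(xm)
        cols.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    return [[cols[j][i] for j in range(m)] for i in range(m)]

lam = [1.0, 2.0, 4.0]
jac = abs(det(numerical_jacobian(sigma_coords, lam + [0.0, 0.0, 0.0])))
vandermonde = (lam[1] - lam[0]) * (lam[2] - lam[0]) * (lam[2] - lam[1])
print(jac, vandermonde)
```

The absolute Jacobian equals the product of eigenvalue differences, in agreement with the entrywise formula for $d\sigma_{ij}$.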
Let $M$ be an m-dimensional differentiable manifold and let
$\omega^1,\ldots,\omega^m$ be local coordinates on $M$. With each point
$p$ of $M$ we associate an m-dimensional vector space $TM_p^*$ (the dual
of the tangent space $TM_p$ of $M$ at $p$) and a particular basis of
$TM_p^*$, the $m$ vectors of the basis being denoted by
$d\omega^1,\ldots,d\omega^m$. The details need not concern us here; what
is important in the present context is how the basis changes under a
change of coordinates. Suppose $\psi^1,\ldots,\psi^m$ is a set of
alternative coordinates, and let us denote generic coordinates of
$\omega = (\omega^1,\ldots,\omega^m)$ and $\psi = (\psi^1,\ldots,\psi^m)$
by $\omega^r, \omega^s, \ldots$ and $\psi^a, \psi^b, \ldots$,
respectively. Furthermore, if $f$ is a differentiable function on $M$ we
write $f_{/r}$ for $\partial f(\omega)/\partial\omega^r$, and in
particular we then have $\psi^a_{/r} = \partial\psi^a/\partial\omega^r$.
The transformation rule for the basis vectors may now be stated as
$d\psi^a = \psi^a_{/r}\,d\omega^r$. (7.6)
Here, and in the sequel, we employ the Einstein summation convention.
Note that this transformation rule fits in with the standard (mostly
informal) way of expressing the differential $df$ of a function $f$ as
$df(\omega) = f_{/r}\,d\omega^r$ (7.7)
for if on the right hand side of the expression
$df(\psi) = f_{/a}\,d\psi^a$ we insert (7.6) there results
$f_{/a}\psi^a_{/r}\,d\omega^r = f_{/r}\,d\omega^r$, i.e. $df(\omega)$.
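It is possible to introduce a product of vectors in $TM_p^*$, termed
the exterior product or the wedge product. In particular, the exterior
product of $m$ basis vectors $d\omega^{r_1},\ldots,d\omega^{r_m}$, in
that order, is denoted by
$\bigwedge_{i=1}^m d\omega^{r_i} = d\omega^{r_1} \wedge \cdots \wedge d\omega^{r_m}$. (7.8)
Again, most of the details need not concern us here, but it is essential
to know that the wedge product has the properties that if two or more of
the indices $r_1,\ldots,r_m$ coincide then (7.8) equals 0, i.e.
$d\omega^{r_1} \wedge \cdots \wedge d\omega^{r_m} = 0$ if $r_i = r_j$ for some $i \ne j$, (7.9)
and if $\sigma(1),\ldots,\sigma(m)$ is a permutation of $1,\ldots,m$ then
$d\omega^{\sigma(1)} \wedge \cdots \wedge d\omega^{\sigma(m)} = \mathrm{sign}(\sigma)\,d\omega^1 \wedge \cdots \wedge d\omega^m$. (7.10)
Using these rules and the distributive property of the wedge product we
obtain from (7.6) that
$d\psi^1 \wedge \cdots \wedge d\psi^m = J_\psi(\omega)\,d\omega^1 \wedge \cdots \wedge d\omega^m$ (7.11)
where $J_\psi(\omega)$ is the Jacobian determinant of the transformation
from $\omega$ to $\psi$.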
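As a small worked illustration of how the wedge rules produce a Jacobian, consider the plane with $\omega = (r,\theta)$ and $\psi = (x,y)$ the Cartesian coordinates. This choice of coordinates is an assumption made here for illustration and is not taken from the text. Only the vanishing of repeated factors and the sign change under transposition are used.

```latex
\begin{align*}
x &= r\cos\theta, & y &= r\sin\theta,\\
dx &= \cos\theta\,dr - r\sin\theta\,d\theta, &
dy &= \sin\theta\,dr + r\cos\theta\,d\theta,\\
dx \wedge dy
  &= r\cos^2\theta\,dr\wedge d\theta - r\sin^2\theta\,d\theta\wedge dr \\
  &= r(\cos^2\theta + \sin^2\theta)\,dr\wedge d\theta
   = r\,dr\wedge d\theta .
\end{align*}
```

The terms in $dr \wedge dr$ and $d\theta \wedge d\theta$ drop out, and the remaining factor $r$ is the Jacobian determinant of the transformation from $(r,\theta)$ to $(x,y)$.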
As discussed in section 6, the determination of an invariant measure
can usually be reduced to the calculation of a Jacobian. The point we
want to make here is that the above-mentioned properties of exterior
multiplication are convenient for such calculations. Example 7.1 can,
in fact, be seen as illustrating this. As another instance we consider
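Example 7.2. Jacobian of the Cholesky decomposition. Consider the
one-to-one mapping of $T_+(n)$ onto $PD(n)$ given by $T \to \Sigma$ where
$\Sigma = T^*T$
is the (unique) Cholesky decomposition of $\Sigma$. Denoting the entries
of $T$ and $\Sigma$ by $t_{ij}$ and $\sigma_{ij}$, respectively, we have
$\sigma_{ij} = \sum_{k=1}^{i} t_{ki}t_{kj}$, $i \le j$,
and hence
$d\sigma_{ij} = \sum_{k=1}^{i} (t_{ki}\,dt_{kj} + t_{kj}\,dt_{ki})$. (7.12)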
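To obtain the Jacobian $J_\Sigma(T)$ of the transformation from $T$ to
$\Sigma$ we calculate the exterior product
$\bigwedge_{i \le j} d\sigma_{ij}$. In view of (7.9), when $dt_{ij}$ has
occurred in one of the factors we may disregard it in any subsequent
factor. Bearing this in mind we write out (7.12) as
$d\sigma_{11} = 2t_{11}\,dt_{11}$
$d\sigma_{12} = t_{11}\,dt_{12} + \ldots$
$\ldots$
$d\sigma_{1n} = t_{11}\,dt_{1n} + \ldots$
$\ldots$
$d\sigma_{nn} = 2t_{nn}\,dt_{nn} + \ldots$
from which it follows that
$\bigwedge_{i \le j} d\sigma_{ij} = 2^n \prod_{i=1}^n t_{ii}^{n-i+1} \bigwedge_{i \le j} dt_{ij}$
so that
$J_\Sigma(T) = 2^n \prod_{i=1}^n t_{ii}^{n-i+1}$. □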
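The Jacobian of the Cholesky map can be checked against a direct numerical computation. This is a minimal sketch for $n = 3$, assuming $T$ upper triangular and $\Sigma = T^*T$ as in the example, with both $T$ and $\Sigma$ coordinatized by their entries with $i \le j$; the entries of `t` and the helper names are hypothetical.

```python
def det(a):
    """Determinant via Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    size = len(a)
    result = 1.0
    for c in range(size):
        p = max(range(c, size), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            result = -result
        result *= a[c][c]
        for r in range(c + 1, size):
            factor = a[r][c] / a[c][c]
            for cc in range(c, size):
                a[r][cc] -= factor * a[c][cc]
    return result

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

n = 3
idx = [(i, j) for i in range(n) for j in range(i, n)]  # positions with i <= j

def upper(vec):
    t = [[0.0] * n for _ in range(n)]
    for value, (i, j) in zip(vec, idx):
        t[i][j] = value
    return t

def sigma_coords(vec):
    """Entries sigma_ij, i <= j, of Sigma = T^* T."""
    T = upper(vec)
    S = matmul(transpose(T), T)
    return [S[i][j] for (i, j) in idx]

def numerical_jacobian(func, x, h=1e-5):
    m = len(x)
    cols = []
    for j in range(m):
        xp, xm = list(x), list(x)
        xp[j] += h
        xm[j] -= h
        fp, fm = func(xp), func(xm)
        cols.append([(a - b) / (2 * h) for a, b in zip(fp, fm)])
    return [[cols[j][i] for j in range(m)] for i in range(m)]

t = [2.0, 0.3, -1.0, 0.5, 4.0, 3.0]   # (t11, t12, t13, t22, t23, t33)
jac = abs(det(numerical_jacobian(sigma_coords, t)))

formula = 2.0 ** n
for i in range(1, n + 1):
    t_ii = upper(t)[i - 1][i - 1]
    formula *= t_ii ** (n - i + 1)
print(jac, formula)
```

Because the map is quadratic in the entries of $T$, central differences reproduce the derivative essentially exactly, and the two numbers agree up to rounding.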
Let $1 \le r \le m$. A linear combination of the type
$f_{i_1 \ldots i_r}\,d\omega^{i_1} \wedge \cdots \wedge d\omega^{i_r}$ (7.13)
where the functions $f_{i_1 \ldots i_r}$ are smooth and we are using the
Einstein summation convention, is called a (differential) r-form on $M$.
Without loss of generality we assume that $f_{i_1 \ldots i_r}$ is
skew-symmetric, i.e.
$f_{i_{\sigma(1)} \ldots i_{\sigma(r)}} = \mathrm{sign}(\sigma)\,f_{i_1 \ldots i_r}$
for any permutation $\sigma(1),\ldots,\sigma(r)$ of $1,\ldots,r$. Note
that it is implicit in (7.13) that for any given coordinate system on $M$
we have a collection of smooth functions and that, in an obvious
notation, these are related by
$f_{i_1 \ldots i_r}(\omega) = \tilde f_{a_1 \ldots a_r}(\psi)\,\psi^{a_1}_{/i_1} \cdots \psi^{a_r}_{/i_r}$. (7.14)
(In the language of differential geometry this means that
$f_{i_1 \ldots i_r}$ is a covariant tensor of rank $r$.)
A differential r-form (7.13) is said to be invariant under the action
of a Lie group $G$ on the manifold $M$ if for every fixed $g \in G$ and
with the coordinate system $\psi$ defined by $\psi = g\omega$, i.e.
$\psi(p) = \omega(g^{-1}p)$ for any point $p \in M$, we have
f. . (gw) 1 1 ···1r
(7.15 )
where, in this context, 6 is the Kronecker delta.
We conclude this section by exemplifying how the invariant measure
on an m-dimensional differentiable manifold M under a transitive
action of a Lie group G may be determined as the exterior product of
m invariant 1-forms.
In general, suppose that M is a submanifold of $\mathbb{R}^n$ and that G
is a closed subgroup of GL(n). Let K be the isotropy group of $x_0 \in M$,
and suppose that $\Delta_K(k) = \Delta_G(k)$, $k \in K$, i.e. there exists an
invariant measure on M (cf. corollary 4.2). Furthermore, we will
assume that G = HK is a left factorization of G with respect to K,
so that $M = Hx_0$. Then for $x = hx_0$, $h \in H$, we have that
$$h^{-1}\,dh\,x_0 = (\omega_1(x),\dots,\omega_n(x))^{*}$$
is a vector of invariant 1-forms on M. If we choose m linearly
independent 1-forms $\tilde\omega_1,\dots,\tilde\omega_m$ among $\omega_1,\dots,\omega_n$
then the exterior product
$$d\mu(x) = \tilde\omega_1(x) \wedge \dots \wedge \tilde\omega_m(x)$$
determines an invariant measure on M.
Example 7.3. The invariant measure on $H^{p,q}$. According to example
2.6, $H^{p,q}$ is parametrized by
Q(9,y,J,L,t)uO
where x = ~1(9,y) E Sp-l and (~) = ~2(J,L,t) E Hq . For short, let Q
Q(9,y,J,L,t). since Q = R(9,y)P(J,L,t) = RP it follows that
NOw, let
:} . Then i,j
* (w 1 ,···,w 1 ) P p- ,p
1, ... ,p, are invariant 1-forms on sp-1 and w*p =
is a vector of linearly independent 1-forms deter-
mining the invariant measure on Sp-1 given by
dw(x) = w 1 (X)A ... Aw 1 (x) P p- ,p
Similarly, if
{:
then determines an invariant measure on Hq •
combining the above we obtain
-1 Q dQuo [aw*p 1 2 a w + PO~
awp:~ + P*O
where
[::::v + ..J is a set of linearly independent, invariant 1-forms on HP,q.
w A ... Aw Aw = 0 we have that the invariant measure on 1p p-1,p pp
the point Quo = (~x) is given by
p-1 a dw(x)Adp(a,v)
Since HP,q at
where $d\omega$ is the invariant measure on $S^{p-1}$ and $d\rho$ is invariant on
$H^{q}$. This is in agreement with the result obtained by combining (6.15),
(6.17) and (6.19). □
Bibliographical notes
For more details and further examples concerning the topics discussed
in this section, see Santaló (1979) part II, Muirhead (1982)
chapter 2, and Farrell (1985).
8. Statistical transformation models
Many of the most important models in statistics are transformation
models, or partly transformational. In this section we present the key
aspects of the powerful theory of transformation models. This theory
draws heavily on the theory of decomposition and invariance of measures
considered in sections 2-7.
To make the account accessible without a detailed knowledge of the
material in the previous sections, we have chosen to formulate the
definition and the main theorems of transformation models as
non-technically as possible and to discuss the more technical aspects,
including the appropriate regularity conditions, in the form of comments.
We do assume, however, that the reader is familiar with the basic notions
relating to group actions and to invariant measures, as given in the
beginning of sections 2 and 4, respectively. Furthermore, a good
knowledge of parametric statistics is essential for a full appreciation of
the import of the material in this section.
We say that a class of probability measures $\mathcal{P}$ defined on a sample
space $\mathfrak{X}$ constitutes a transformation model with acting group G if
G acts on $\mathfrak{X}$ and if there exists an element P of $\mathcal{P}$ such that
$\mathcal{P} = \{gP: g \in G\}$. Here gP denotes the measure P lifted by the
transformation corresponding to $g \in G$, i.e.
$$(gP)(A) = P(g^{-1}A) \tag{8.1}$$
for every $A \in \mathcal{A}$, the class of measurable sets. Note that, more
generally, an action of G on $\mathfrak{X}$ induces, via formula (8.1), an action of
G on the set of all probability measures on $\mathfrak{X}$. The equality
$\mathcal{P} = \{gP: g \in G\}$ says that $\mathcal{P}$ is invariant under this induced
action and that $\mathcal{P}$ consists of a single orbit. If we require only that
$\mathcal{P}$ be invariant relative to G we have the concept of a composite
transformation model with acting group G.
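A toy illustration of the lifted measure (8.1) may help fix ideas. The sketch below is not taken from the text: it assumes a hypothetical finite sample space $\mathbb{Z}_5$ on which the group $\mathbb{Z}_5$ acts by translation, and a hypothetical base measure `P`. It forms $gP$ through $(gP)(\{x\}) = P(\{g^{-1}x\})$ and collects the orbit $\{gP: g \in G\}$, which is then a (pure) transformation model in the sense just defined.

```python
from fractions import Fraction

M = 5  # hypothetical sample space Z_5 = {0, ..., 4}; Z_5 acts on it by translation

def act(g, x):
    """Action of the group element g on the sample point x."""
    return (g + x) % M

def inverse(g):
    """Group inverse of g in Z_5."""
    return (-g) % M

def lift(g, P):
    """Return gP as a dict x -> mass, using (gP)({x}) = P({g^{-1} x}) as in (8.1)."""
    return {x: P[act(inverse(g), x)] for x in range(M)}

# Hypothetical base measure P on Z_5 (masses sum to one).
P = {0: Fraction(1, 2), 1: Fraction(1, 4), 2: Fraction(1, 8),
     3: Fraction(1, 16), 4: Fraction(1, 16)}

def orbit(P):
    """The family {gP : g in G}, i.e. the transformation model generated by P."""
    return [lift(g, P) for g in range(M)]

if __name__ == "__main__":
    for g, Q in enumerate(orbit(P)):
        print(g, [str(Q[x]) for x in range(M)])
```

Lifting twice agrees with lifting by the product, so `lift` is an action of the group on measures, and the orbit is invariant under it.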
In the present section we consider primarily models $\mathcal{P}$ which are
pure transformation models, but some discussion of composite
transformation models is provided at the end of the section. After some
first examples of transformation models we formulate the concept of a
standard transformation model. Virtually all transformation models met in
statistics are of this type, and for such models we then state and
prove a theorem which is the main tool for statistical inference under
transformation models. The proof of this result (Theorem 8.1) relies
strongly on the theory developed in sections 2-7, but in line with our
introductory remark we have endeavoured to formulate the theorem so as
to make it comprehensible and applicable without close attention, at
first, to the requisite technicalities and regularity conditions. A
considerable diversity of examples is subsequently discussed, to
illustrate the versatility and power of this theorem. The concluding
discussion of composite transformation models contains as a main result a
formula for the marginal likelihood based on the maximal invariant
statistic.
As an incidental comment it may be noted that most of the models
considered in the examples of the present section are exponential
transformation models, i.e. they are both transformation models and of
the exponential family type. Such models thus possess two quite
different sets of general, theoretically and practically significant,
properties. They are therefore of particular interest and they have various
special structures of import. These structures are indicated in
exercise 25 (p. 128). A brief exposition of the structures may also be
found in section 2.5 of Barndorff-Nielsen (1988), where references to
the literature are given.
Example 8.1. Location-scale model. Let G be the representation
of the location-scale group considered in example 4.9, i.e. G is the
group of 2 x 2 matrices
$$g = \begin{pmatrix} \sigma & \xi \\ 0 & 1 \end{pmatrix}, \qquad \sigma > 0,\ \xi \in \mathbb{R}.$$
The location-scale group acts on $\mathbb{R}^n$ by the law
$$G \times \mathbb{R}^n \to \mathbb{R}^n, \qquad
(g,(x_1,\dots,x_n)) \to (\xi+\sigma x_1,\dots,\xi+\sigma x_n). \tag{8.2}$$
Here $\sigma > 0$ and $\xi \in \mathbb{R}$ are the scale and location entries of $g$.
Let $f: \mathbb{R} \to \mathbb{R}_+$ denote an arbitrary probability density function
(with respect to Lebesgue measure on $\mathbb{R}$) and let P be the probability
measure on $\mathbb{R}^n$ corresponding to the assumption that the variables
$x_1,\dots,x_n$ are stochastically independent and identically distributed
each with probability density function f. Then
$$\frac{d(gP)}{d\lambda_n}(x_1,\dots,x_n) = \sigma^{-n}\prod_{i=1}^{n} f\!\left(\frac{x_i-\xi}{\sigma}\right) \tag{8.3}$$
where $\lambda_n$ denotes Lebesgue measure on $\mathbb{R}^n$.
The model $\mathcal{P} = \{gP: g \in G\}$ is a transformation model, called the
location-scale model corresponding to f. □
Example 8.2. Additive effects model. (Multivariate two-way analysis
of variance). Let $x_{ij\nu}$, $i=1,\dots,k$, $j=1,\dots,l$, $\nu=1,\dots,m$, be
independent, p-dimensional random vectors with $x_{ij\nu}$ following a
multivariate normal distribution. We shall discuss the model for (fixed)
additive effects, that is $x_{ij\nu}$ is assumed to have mean vector of the form
$\alpha_i + \beta_j + \gamma$ and covariance matrix $\Sigma$, where the p-dimensional
vectors $\alpha_i$ ($i=1,\dots,k$), $\beta_j$ ($j=1,\dots,l$) and $\gamma$ and the symmetric
positive definite matrix $\Sigma$ are all considered as unknown parameters. We
let $n = klm$ and $q = klmp$.
Define G to be that subgroup of $GA_+(q)$ whose typical element
$g = [B,f]$ has f of the form $f_{ij\nu} = \alpha_i + \beta_j + \gamma$ and B of the form
$B = A \otimes I_n$ with $I_n$ the n x n identity matrix and A an arbitrary
element of $GL_+(p)$. In order to identify the effects we assume
$\sum_i \alpha_i = 0$ and $\sum_j \beta_j = 0$.
The additive effects model is a transformation model under the
action of G on $\mathbb{R}^q$ inherited from the action of $GA_+(q)$ as defined
in example 2.1. If we let P denote the probability measure on $\mathbb{R}^q$
corresponding to the standard normal distribution $N_q(0,I)$ then
$gP = N_q(f, \Sigma \otimes I_n)$ with $\Sigma = AA^{*}$. □
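Example 8.3. von Mises-Fisher model. The von Mises-Fisher
distribution on $S^{p-1} = \{x \in \mathbb{R}^p : \|x\| = 1\}$ (or the circular normal
distribution in $\mathbb{R}^p$) with mean direction $\xi \in S^{p-1}$ and concentration
parameter $\kappa \ge 0$, which we denote by $C_p(\xi,\kappa)$, has probability density
function with respect to the surface measure $\mu$ on $S^{p-1}$ of the form
$$a_p(\kappa)\, e^{\kappa\, \xi\cdot x}, \qquad x \in S^{p-1}. \tag{8.4}$$
Here $\cdot$ indicates the usual inner product in $\mathbb{R}^p$ and $a_p(\kappa)$ is a
norming constant given by
$$a_p(\kappa)^{-1} = (2\pi)^{p/2}\, \kappa^{1-p/2}\, I_{p/2-1}(\kappa)$$
where $I_{p/2-1}$ denotes the modified Bessel function of the first kind
and of order $p/2-1$.
As mentioned in example 3.5, the special orthogonal group
$SO(p) = \{U \in GL(p) \mid \det(U) = 1,\ UU^{*} = I_p\}$ acts on $S^{p-1}$ by the law
$$SO(p) \times S^{p-1} \to S^{p-1}, \qquad (U,x) \to Ux.$$
Since the measure $\mu$ is SO(p)-invariant one has that
$$X \sim C_p(\xi,\kappa) \ \Rightarrow\ UX \sim C_p(U\xi,\kappa) \tag{8.5}$$
(where X denotes a random variate and $\sim$ is read "distributed as").
The sample space $\mathfrak{X}$ corresponding to a sample of size n from the
von Mises-Fisher distribution is
$$\mathfrak{X} = (S^{p-1})^{n}$$
and SO(p) acts on $\mathfrak{X}$ by the law
$$SO(p) \times \mathfrak{X} \to \mathfrak{X}, \qquad (U,(x_1,\dots,x_n)) \to (Ux_1,\dots,Ux_n). \tag{8.6}$$
The measure $\mu^{\otimes n}$ on $\mathfrak{X}$ is invariant under this action. If
$P_{(\xi,\kappa)}$ denotes the probability measure on $\mathfrak{X}$ corresponding to a sample
$x_1,\dots,x_n$ from the $C_p(\xi,\kappa)$-distribution it follows from (8.4) that
$$\frac{dP_{(\xi,\kappa)}}{d\mu^{\otimes n}}(x_1,\dots,x_n) = a_p(\kappa)^{n}\, e^{\kappa\, \xi\cdot x_+} \tag{8.7}$$
where $x_+ = x_1 + \dots + x_n$. Furthermore, (8.5) implies that
$$UP_{(\xi,\kappa)} = P_{(U\xi,\kappa)} \quad \text{for } U \in SO(p). \tag{8.8}$$
The action of SO(p) on $S^{p-1}$ is transitive, implying that for every
fixed $\kappa > 0$ the class of probability measures
$$\mathcal{P}_\kappa = \{P_{(\xi,\kappa)} \mid \xi \in S^{p-1}\}$$
constitutes a pure transformation model, whereas the class
$$\mathcal{P} = \{P_{(\xi,\kappa)} \mid \xi \in S^{p-1},\ \kappa \ge 0\}$$
constitutes a composite transformation model. □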
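For $p = 3$ the norming constant of the Fisher distribution has an elementary form, since $I_{1/2}(\kappa) = \sqrt{2/(\pi\kappa)}\,\sinh\kappa$ gives $a_3(\kappa) = \kappa/(4\pi\sinh\kappa)$. The following sketch checks by quadrature that the density $a_3(\kappa)e^{\kappa\,\xi\cdot x}$ integrates to one. It assumes that $\mu$ is the unnormalised surface measure on $S^2$ (total mass $4\pi$) and takes $\xi$ to be the north pole; the function names are hypothetical.

```python
import math

def fisher_norming_constant(kappa):
    """a_3(kappa) = kappa / (4 pi sinh kappa), with the limit 1/(4 pi) at kappa = 0."""
    if kappa == 0.0:
        return 1.0 / (4.0 * math.pi)
    return kappa / (4.0 * math.pi * math.sinh(kappa))

def total_mass(kappa, steps=20000):
    """Integrate a_3(kappa) * exp(kappa * cos(theta)) over S^2 by the midpoint rule.

    With xi the north pole, xi . x = cos(theta) and the surface element is
    sin(theta) d theta d phi, so the phi-integral contributes a factor 2 pi.
    """
    h = math.pi / steps
    acc = 0.0
    for i in range(steps):
        theta = (i + 0.5) * h
        acc += math.exp(kappa * math.cos(theta)) * math.sin(theta)
    return fisher_norming_constant(kappa) * 2.0 * math.pi * acc * h

if __name__ == "__main__":
    for kappa in (0.0, 0.5, 2.5, 10.0):
        print(kappa, total_mass(kappa))
```

By the rotation invariance of $\mu$ the same total mass is obtained for any other mean direction, which is the content of (8.5).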
Example 8.4. Elliptical models. Consider the action of GA(p) on
$\mathbb{R}^p$ given by
$$([A,a],x) \to Ax + a.$$
It is well-known that the class of (regular) multivariate normal
distributions is invariant under this action and constitutes an example
of a transformation model, i.e. the action is transitive. If we
consider the parameterization $\mathcal{N}_p = \{N_p(\xi,\Sigma) \mid \xi \in \mathbb{R}^p,\ \Sigma \in PD(p)\}$,
where $\xi$ is the mean vector and $\Sigma$ is the covariance matrix, then the
induced action of GA(p) on $\mathbb{R}^p \times PD(p)$ is given by
$$([A,a],(\xi,\Sigma)) \to (A\xi + a,\ A\Sigma A^{*}).$$
The isotropy group of $(0,I)$ equals O(p), i.e.
$$\mathcal{N}_p \cong GA(p)/O(p).$$
Now, suppose that P is a probability measure on $\mathbb{R}^p$ having density
with respect to some quasi-invariant measure. It is then a natural
question to ask when the model $\mathcal{P} = \{gP \mid g \in GA(p)\}$ can be parametrized
by GA(p)/O(p). This problem has been solved by Jespersen (1989) who
shows that $\mathcal{P}$ has to be elliptical, i.e. P has density f with
respect to Lebesgue measure on $\mathbb{R}^p$ of the form
$$f(x) = h(\|x\|^2), \qquad x \in \mathbb{R}^p,$$
where $h: \mathbb{R}_+ \to \mathbb{R}_+$ is continuous and
$$\int_0^\infty h(s)\, s^{p/2-1}\, ds = \Gamma(p/2)\,\pi^{-p/2}.$$
In terms of densities we have
$$\frac{d([A,\xi]P)}{dx}(x) = f_{(\xi,\Sigma)}(x) = |\Sigma|^{-1/2}\, h\big((x-\xi)^{*}\Sigma^{-1}(x-\xi)\big),$$
where $\Sigma = AA^{*}$.
If $X \sim f_{(\xi,\Sigma)}$ and $\int_0^\infty h(s)\, s^{p/2}\, ds < \infty$, then the mean value
of X is $\xi$ whereas the covariance matrix of X is proportional to $\Sigma$. The
class of multivariate normal distributions is determined by
$$h(s) = (2\pi)^{-p/2} \exp(-s/2),$$
while
$$h(s) = (n\pi)^{-p/2}\, \frac{\Gamma((n+p)/2)}{\Gamma(n/2)}\, (1+s/n)^{-(p+n)/2}$$
defines the class of multivariate Student distributions with n
degrees of freedom. When n = 1 these are called the multivariate Cauchy
distributions. □
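The normalisation condition $\int_0^\infty h(s)\,s^{p/2-1}\,ds = \Gamma(p/2)\,\pi^{-p/2}$ can be checked numerically for the two generators named in example 8.4. The sketch below is a rough quadrature under stated assumptions: it uses the substitution $s = t/(1-t)$ and the midpoint rule, which is adequate for $p \ge 2$ and moderately light tails but would need refinement for $p = 1$ or for the Cauchy case. The function names are hypothetical.

```python
import math

def h_normal(s, p):
    """Generator of the p-variate normal family."""
    return (2.0 * math.pi) ** (-p / 2.0) * math.exp(-s / 2.0)

def h_student(s, p, n):
    """Generator of the p-variate Student family with n degrees of freedom."""
    const = (n * math.pi) ** (-p / 2.0) * math.gamma((n + p) / 2.0) / math.gamma(n / 2.0)
    return const * (1.0 + s / n) ** (-(p + n) / 2.0)

def radial_integral(h, p, steps=200000):
    """Midpoint rule for int_0^inf h(s) s^(p/2-1) ds after substituting s = t/(1-t)."""
    width = 1.0 / steps
    total = 0.0
    for i in range(steps):
        t = (i + 0.5) * width
        s = t / (1.0 - t)
        # ds = dt / (1 - t)^2
        total += h(s) * s ** (p / 2.0 - 1.0) / (1.0 - t) ** 2
    return total * width

def target(p):
    """Right hand side Gamma(p/2) * pi^(-p/2) of the normalisation condition."""
    return math.gamma(p / 2.0) * math.pi ** (-p / 2.0)

if __name__ == "__main__":
    p = 3
    print(radial_integral(lambda s: h_normal(s, p), p), target(p))
    print(radial_integral(lambda s: h_student(s, p, 4), p), target(p))
```

The condition is simply the statement that $f(x) = h(\|x\|^2)$ integrates to one over $\mathbb{R}^p$, written in polar coordinates.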
Before we give the definition of a standard transformation model we
recapitulate the necessary concepts concerning the action of a group G
on a space $\mathfrak{X}$. An orbit representative u is a maximal invariant
function defined on $\mathfrak{X}$ and with values in $\mathfrak{X}$, i.e. $u(gx) = u(x) \in \mathfrak{X}$
for $x \in \mathfrak{X}$ and $g \in G$ and $u(x) = u(x')$ if and only if there exists
a $g \in G$ such that $x' = gx$. The space $\mathfrak{X}$ is of regular orbit type
relative to the action of G if there exists an orbit representative
u and a regular subgroup K of G such that
$G_{u(x)} = \{g \in G \mid gu(x) = u(x)\} = K$ for every $x \in \mathfrak{X}$. In that case we
refer to u and K as the orbit representative and the isotropy group
associated to G. A mapping t defined on $\mathfrak{X}$ and with range space
$\mathcal{T}$ is equivariant if $t(x) = t(x')$ implies $t(gx) = t(gx')$ for every
$g \in G$. The group G acts on the range space $\mathcal{T}$ of an equivariant
mapping t by the prescription
$$G \times \mathcal{T} \to \mathcal{T}, \qquad (g,t) \to gt$$
where
$$gt = t(gx) \quad \text{if } t = t(x).$$
We are now set for the following definition.
Definition 8.1. A transformation model with sample space $\mathfrak{X}$,
acting group G and probability measure family $\mathcal{P} = \{gP: g \in G\}$ is said
to be a standard transformation model if the following conditions are
satisfied:
(i) There exists an invariant measure $\mu$ on $\mathfrak{X}$ such that
$P \ll \mu$ (i.e. P is absolutely continuous with respect to
$\mu$). We let p denote the density of P with respect to $\mu$,
i.e. $p = dP/d\mu$.
(ii) $\mathfrak{X}$ is of regular orbit type relative to the action of G.
The orbit representative and the isotropy group associated to
G will be denoted by u and K, respectively.
(iii) K is a subgroup of $\tilde K$, where $\tilde K = G_P = \{g \in G \mid gP = P\}$.
(iv) There exist equivariant statistics r and s and a statistic
v with range spaces $\mathcal{R}$, $\mathcal{S}$ and $\mathcal{V}$, respectively, such that:
(a) G acts transitively on $\mathcal{R}$ and on $\mathcal{S}$, and for $u \in \mathcal{U}$,
the range space of u, one has $G_{r(u)} = K$ and $G_{s(u)} = \tilde K$,
so that the actions of G on $\mathcal{R}$ and $\mathcal{S}$ may be
identified with the natural actions of G on the spaces
$G/K$ and $G/\tilde K$, respectively.
(b) $\mathcal{V}$ is in one-to-one correspondence with $\tilde K/K$.
(c) There exist measures $\rho$, $\sigma$, $\nu$ and $\tau$ on $\mathcal{R}$, $\mathcal{S}$, $\mathcal{U}$
and $\mathcal{V}$, respectively, such that under the bijections
$$\mathfrak{X} \to \mathcal{R} \times \mathcal{U},\ x \to (r,u) \qquad\text{and}\qquad
\mathfrak{X} \to \mathcal{S} \times \mathcal{U} \times \mathcal{V},\ x \to (s,u,v)$$
the invariant measure $\mu$ is lifted as
$$\mu \to \rho \otimes \nu \qquad\text{and}\qquad \mu \to \sigma \otimes \nu \otimes \tau$$
where $\rho$ and $\sigma$ are invariant measures under the action
of G on $\mathcal{R}$ and $\mathcal{S}$, respectively. □
Comments
(A) As stated in theorem 8.1 below, condition (i) of definition 8.1
implies that the transformed probability measure gP has a density,
which will be denoted by gp, with respect to the invariant measure.
This density is of the form
$$gp(x) = \frac{d(gP)}{d\mu}(x) = p(g^{-1}x), \qquad x \in \mathfrak{X}.$$
Consequently, by condition (iii), an equivalent characterization of the
subgroup $\tilde K$ is given by
$$\tilde K = \{g \in G \mid p(g^{-1}\,\cdot\,) = p(\,\cdot\,)\ \langle\mu\rangle\}. \tag{8.9}$$
(Here $\langle\mu\rangle$ indicates that the equality holds except possibly for a set
of $\mu$ measure 0.)
(B) On account of (8.9), there exists a one-to-one correspondence
between the set $\mathcal{P}$ of probability measures and the set of left cosets
$G/\tilde K = \{g\tilde K \mid g \in G\}$, i.e. the transformation model $\mathcal{P}$ may be
parametrized by $G/\tilde K$. However, such a parametrization is often considered
as artificial and we have here chosen to use the space $\mathcal{S}$ as a
parameter set. According to (a) in condition (iv) the set $\mathcal{S}$ may be
thought of as a representation of the space of cosets $G/\tilde K$ and in that
case the statistic s is typically the maximum likelihood estimator.
In applications one often has a left factorization $G = H\tilde K$ of G with
respect to $\tilde K$, as discussed in sections 2 and 3, and in that case $\mathcal{S}$
is simply a representation of the subset H of G.
(C) For a transformation model satisfying that the maximum likelihood
estimate s(x) exists uniquely for every $x \in \mathfrak{X}$ and for which
the orbit representative u in (ii) is chosen such that $G_{s(u)} = \tilde K$ for
every $u \in \mathcal{U}$, it is easily seen that $K \subseteq \tilde K$. This observation may be
viewed as the justification for condition (iii) in definition 8.1.
Models with $K = \tilde K$ are special instances of the so-called balanced
transformation models, introduced and studied in Barndorff-Nielsen
(1983, 1988). Note that if $K = \tilde K$ the statistic v in condition (iv)
is trivial. The location-scale model with sample size $n \ge 2$ is
balanced, but for n = 1 we have $K \supset \tilde K$ (cf. the continuation of
example 8.1 below).
(D) From unbalanced transformation models, in particular unbalanced
exponential transformation models, it is sometimes possible by
marginalization to a minimal sufficient statistic or by conditioning on an
ancillary statistic to obtain balanced transformation models (cf. the
continuations of example 8.3 below). According to the statistical
principles of sufficiency and ancillarity a model derived in one of these
ways from an original transformation model is an appropriate basis for
inference on the model parameters.
(E) In example 8.4 and example 8.6 below the action of G on the
sample space is free, i.e. K = {e}, but the transformation model is
unbalanced. However, in both examples there exists a subgroup $G_0 \subseteq G$
generating the same transformation model as G, but such that the
model corresponding to $G_0$ is balanced.
(F) Finally, note that the assumption $G_{r(u)} = K$ in (a) of
condition (iv) implies that the mapping $x \to (r,u)$ is, in fact, a
bijection. Furthermore, according to (a) and (b) we have the identifications
$$\mathcal{R} \cong G/K \cong G/\tilde K \times \tilde K/K \cong \mathcal{S} \times \mathcal{V}$$
and hence the mapping $x \to (u,s,v)$ is also a bijection, and it
corresponds to the identification
$$\mathfrak{X} \cong \mathfrak{X}/G \times G/\tilde K \times \tilde K/K. \qquad \Box$$
Until now no regularity conditions have been mentioned in relation
to a standard transformation model. These conditions, which are all of
topological nature, are as follows:
(i) $(G,\mathfrak{X})$ is a standard transformation group (cf. p. 12).
(ii) The spaces $\mathcal{R}$, $\mathcal{S}$ and $\mathcal{V}$ are LCD-spaces (cf. p. 12) and
$(G,\mathcal{R})$ and $(G,\mathcal{S})$ are standard transformation groups.
(iii) The mappings $x \to (r,u)$ and $x \to (s,u,v)$ are proper mappings
(cf. p. 13).
Remarks. Note that endowing $\mathcal{P}$ with the weak topology, condition
(i) implies that G acts on $\mathcal{P}$ according to the law (cf. p. 3)
$$G \times \mathcal{P} \to \mathcal{P}, \qquad (g,P) \to gP,$$
where gP is given by (8.1).
In all examples of non-trivial standard transformation models we
know of, the isotropy group $\tilde K$ is compact, which by (iii) in
definition 8.1 implies that K is compact. We have not been able to prove
that in general $\tilde K$ is compact; however, this is certainly the case if
the action of G on $\mathcal{P}$ is proper, cf. (iv) on p. 13.
Example 8.1. Location-scale model (continued). According to example
4.9, if n = 1 there exists no invariant measure on $\mathbb{R}$ under the
action (8.2) so in that case the location-scale model is not standard.
From the same example it follows that if we let $K = G_{(0,1)^{*}}$, the
isotropy group corresponding to x = 0, one has that
$$K = \left\{ \begin{pmatrix} \sigma & 0 \\ 0 & 1 \end{pmatrix} : \sigma > 0 \right\}.$$
For $n \ge 2$ let
$$s(x)^2 = \sum_{i=1}^{n} (x_i - \bar x)^2$$
where $\bar x = \frac{1}{n}\sum_{i=1}^{n} x_i$. Restricting the sample space to
$\mathfrak{X} = \{x \in \mathbb{R}^n : s(x) > 0\}$, it follows from example 4.1 that the
measure $\mu$ on $\mathfrak{X}$ given by
$$d\mu(x) = s(x)^{-n}\, d\lambda_n(x) \tag{8.10}$$
is invariant under the action of the location-scale group as defined by
(8.2). From (8.3) and (8.10) one obtains that
$$p(x) = \frac{dP}{d\mu}(x) = s(x)^{n} \prod_{i=1}^{n} f(x_i). \tag{8.11}$$
As shown in example 2.3 the action is free, i.e. K consists of the
2 x 2 unity matrix $I_2$ alone, and the 'configuration statistic', i.e.
$$u(x) = \left(\frac{x_1-\bar x}{s},\dots,\frac{x_n-\bar x}{s}\right) \tag{8.12}$$
is an orbit representative and is an ancillary statistic. Furthermore,
the mapping
$$r: \mathfrak{X} \to \mathbb{R}_+ \times \mathbb{R}, \qquad x \to (s(x),\bar x)$$
defines an equivariant statistic such that $r(u(x)) = (1,0)$ for all
$x \in \mathfrak{X}$ and it follows from example 2.3 that the mapping $x \to (r,u)$ is
one-to-one and onto.
Since f is assumed to be the probability density function of a
non-degenerate distribution one has that G acts freely on $\mathcal{P}$, i.e.
$gP = g'P$ if and only if $g = g'$. Consequently $\tilde K = \{I_2\}$. Identifying
$$g = \begin{pmatrix} \sigma & \xi \\ 0 & 1 \end{pmatrix} \in G$$
with $(\sigma,\xi) \in \mathbb{R}_+ \times \mathbb{R}$ it follows that $\mathcal{R} = \mathbb{R}_+ \times \mathbb{R}$ and that the
action of G on $\mathcal{R}$ is
$$G \times (\mathbb{R}_+ \times \mathbb{R}) \to \mathbb{R}_+ \times \mathbb{R}, \qquad (g,(s,t)) \to (\sigma s,\ \sigma t + \xi) \tag{8.13}$$
or, equivalently, the left action of G on G. From example 4.9 we
obtain that the invariant measure $\rho$ on $\mathcal{R}$ is given by
$$d\rho(s,t) = s^{-2}\, ds\, dt. \tag{8.14}$$
According to example 5.3 the measure $\nu$ turns out to be the uniform
measure on the set $\{x \in \mathbb{R}^n \mid \sum_{i=1}^{n} x_i = 0,\ \|x\| = 1\}$ which may be
identified with $S^{n-2}$.
The above considerations show that the location-scale model is a
standard transformation model if $n \ge 2$. □
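The orbital decomposition $x \to (r,u)$ of the location-scale model is easy to verify on a concrete sample. The sketch below assumes the normalisation $s(x) = \|x - \bar x e\|$ (so that $\|u\| = 1$, in line with the description of $\nu$ above); the function names and the sample values are hypothetical. It checks that the configuration statistic is invariant, that $r = (s,\bar x)$ transforms according to (8.13), and that $x = s\,u + \bar x$.

```python
import math

def mean(x):
    """Sample mean xbar."""
    return sum(x) / len(x)

def scale(x):
    """s(x) = ||x - xbar e||; this normalisation is an assumption of the sketch."""
    m = mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x))

def configuration(x):
    """Configuration statistic (8.12): u_i = (x_i - xbar) / s."""
    m, s = mean(x), scale(x)
    return [(v - m) / s for v in x]

def r_stat(x):
    """Equivariant statistic r(x) = (s(x), xbar)."""
    return (scale(x), mean(x))

def act_on_sample(sigma, xi, x):
    """Action (8.2): x_i -> xi + sigma * x_i."""
    return [xi + sigma * v for v in x]

def act_on_r(sigma, xi, r):
    """Action (8.13): (s, t) -> (sigma * s, sigma * t + xi)."""
    s, t = r
    return (sigma * s, sigma * t + xi)

if __name__ == "__main__":
    x = [0.3, -1.2, 2.5, 0.7]          # hypothetical sample
    gx = act_on_sample(2.0, -3.0, x)   # scale 2, location -3
    print(configuration(x))
    print(configuration(gx))           # same configuration
    print(r_stat(gx), act_on_r(2.0, -3.0, r_stat(x)))
```

The pair $(r(x),u(x))$ determines $x$, which is the one-to-one property used in the example.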
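Example 8.2. Additive effects model (continued). An invariant
measure on $\mathbb{R}^q$ under the action of the group G, previously defined in
this example, exists if and only if f > p, where f = n-(k+l-1) is
the 'number of degrees of freedom'. Such a measure is constructed
below. The condition f > p, which we henceforth assume to be
satisfied, is also necessary and sufficient for the existence and uniqueness
with probability 1 of the maximum likelihood estimator
$(\hat\alpha,\hat\beta,\hat\gamma,\hat\Sigma)$ of $(\alpha,\beta,\gamma,\Sigma)$, where we have set
$\alpha = (\alpha_1,\dots,\alpha_k)$ and $\beta = (\beta_1,\dots,\beta_l)$.
Let
$$\bar x_{i++} = \frac{1}{lm}\sum_{j,\nu} x_{ij\nu}, \qquad
\bar x_{+j+} = \frac{1}{km}\sum_{i,\nu} x_{ij\nu}, \qquad
\bar x_{+++} = \frac{1}{n}\sum_{i,j,\nu} x_{ij\nu}$$
and
$$z = \sum_{i,j,\nu} (x_{ij\nu} - \bar x_{i++} - \bar x_{+j+} + \bar x_{+++})(x_{ij\nu} - \bar x_{i++} - \bar x_{+j+} + \bar x_{+++})^{*}.$$
Defining $\mathfrak{X}$ by
$$\mathfrak{X} = \{x \in \mathbb{R}^q : z \text{ positive definite}\},$$
we have that $\mathfrak{X}$ is an open and invariant subset of $\mathbb{R}^q$ and that the
probability mass of $\mathfrak{X}$ is 1. Moreover, on $\mathfrak{X}$,
$$\hat\alpha_i = \bar x_{i++} - \bar x_{+++}, \qquad \hat\beta_j = \bar x_{+j+} - \bar x_{+++}, \qquad \hat\gamma = \bar x_{+++}$$
and
$$\hat\Sigma = \frac{1}{n}\, z.$$
Lebesgue measure $\lambda$ on $\mathbb{R}^q$ is relatively invariant under G with
multiplier $\chi(g) = |A|^{n}$ which, since $\Sigma = AA^{*}$, may be reexpressed as
$\chi(g) = |\Sigma|^{n/2}$. Further, one sees immediately that $|\hat\Sigma|^{n/2}$ is a
modulator with multiplier $\chi(g)$. Consequently, by (4.2), the measure
$$d\mu(x) = |\hat\Sigma(x)|^{-n/2}\, d\lambda(x) \tag{8.15}$$
is an invariant measure on $\mathfrak{X}$, and the corresponding model function of
the additive model is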
p(x;a,{j,'Y,~)
(8.16)
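where all multiplicative constants have been absorbed in $\mu$.
The group G acts on the range space $\mathcal{S}$ of the maximum likelihood
estimator
$$s = (\hat\alpha,\hat\beta,\hat\gamma,\hat\Sigma) \tag{8.17}$$
by
$$gs = (A\hat\alpha + \alpha,\ A\hat\beta + \beta,\ A\hat\gamma + \gamma,\ A\hat\Sigma A^{*}), \tag{8.18}$$
where $\alpha$, $\beta$ and $\gamma$ here denote the components of the translation part f of
$g = [B,f]$. Under this action, Lebesgue measure $\lambda_{\mathcal{S}}$ on $\mathcal{S}$ is relatively
invariant with multiplier $|A|^{k+l+p}$. In fact, it follows from (8.18) that
the multiplier is $|A|^{(k-1)+(l-1)+1+(p+1)} = |A|^{k+l+p}$. Consequently an
invariant measure on the range space $\mathcal{S}$ of s is given by
$$d\sigma(s) = |\hat\Sigma|^{-(k+l+p)/2}\, d\lambda_{\mathcal{S}}(s). \tag{8.19}$$
In view of these properties one sees that the additive effects model
is a standard transformation model with K = {e}, $\tilde K = SO(p)$ and with
s, $\mu$ and $\sigma$ given by (8.17), (8.15) and (8.19). □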
Example 8.3. von Mises-Fisher model (continued). For simplicity we
restrict ourselves to the case p = 3, i.e. to the Fisher distribution
on the sphere and furthermore we assume that $\kappa$ is fixed. In order to
establish that this model is a standard transformation model, note
first that (8.7) implies that (i) in definition 8.1 is fulfilled.
Letting P be the Fisher distribution with mean direction $(0,0,1)^{*}$ it
follows from example 3.5 and formula (8.8) that
$$\tilde K = \left\{ \begin{pmatrix} U & 0 \\ 0 & 1 \end{pmatrix} : U \in SO(2) \right\},$$
i.e. $\tilde K$ is the set of rotations in the 1-2 plane.
Since the action of SO(3) on $S^2$ is transitive we may, if the
sample size n equals 1, consider $(0,0,1)^{*}$ as an orbit representative
and so, in this case, one has $K = \tilde K$.
Now suppose that $n \ge 2$ and let
where $x_+ = x_1 + \dots + x_n$. Clearly $\mathfrak{X}$ is invariant under the action (8.6)
and $\mu^{\otimes n}$ is an invariant measure on $\mathfrak{X}$. In order to find an orbit
representative we now, inspired by the considerations in example 3.5,
construct a mapping $z: \mathfrak{X} \to G$ such that
$$z(Ux) = z(x)U^{-1} \quad \text{for } U \in SO(3),$$
by the following considerations. Let x € ~ and let x+
If we let
(8.20)
otherwise
we define R23 (X) as the uniquely determined rotation in the 2-3
plane, such that
similarly we determine R13 (x) as the unique rotation in the 1-3 plane
such that the corresponding angle lies in the interval [-v/2,v/2] and
such that
(8.21)
Finally, we let i = 1,2, ... ,n and j = min{i:
(Xi1,Xi2)~(O,O)} and determine R12 (X) as the unique rotation in the
1-2 plan such that
The function z: ~ ~ G defined by
z (x) (8.22)
satisfies (8.20) and it follows that u: ~ ~ ~ given by
u(x) = z(x)x , X E ~ ,
is an orbit representative and that r: ~ ~ G given by
-1 rex) = z (x) , x E ~ , (8.23)
is equivariant. Consequently, one has that x ~ (r(x),u(x» is an
orbital decomposition, i.e.
x r(x)u(x) . (8.24)
The action of G on ~ is free. In order to prove this it suffices
* For u E ~ one has that u+ = (O,O,lIu+lI)
and that there exists a j such that (Uj1 'Uj2 ) ~ (0,0). The condi
tion
Uu u
implies in particular that
(8.25)
and
(8.26)
From (8.25) one obtains that U must be a rotation in the 1-2 plane
and (8.26) then implies that $U = I_3$. Since $K = \{I_3\}$ we have that
$K \subseteq \tilde K$ and $G/K = SO(3)$. Thus setting $\mathcal{R} = SO(3)$ and letting r be the
mapping defined by (8.22) and (8.23), the part of (iv)(a) in definition
8.1 pertaining to the statistic r is fulfilled. Furthermore, letting
$\mathcal{S} = S^2$ we define the equivariant mapping s in definition 8.1 by
Note that (8.21) implies that
$$s(x) = x_+/\|x_+\|. \tag{8.27}$$
Finally, since $\tilde K/K = \tilde K$ we set $\mathcal{V} = \tilde K$ and define v by
$$v: \mathfrak{X} \to \tilde K, \qquad x \to R_{12}(x)^{-1}.$$
Noticing that the SO(3)-invariant measures $\rho$ and $\sigma$ on $\mathcal{R}$ and $\mathcal{S}$
are, respectively, the invariant measure on SO(3) and the surface
measure on $S^2$, we have now established that the family of Fisher
distributions with fixed concentration parameter $\kappa$ is a standard
transformation model. □
Theorem 8.1. Main theorem on transformation models. Let $\mathcal{P}$ be a
standard transformation model. Then, in the notation of definition 8.1
and under the topological regularity conditions specified above,
the following conclusions hold:
(i) $\mathcal{P}$ is dominated by $\mu$, i.e. $\mathcal{P} \ll \mu$, and if $gp = d(gP)/d\mu$
then
$$gp(x) = p(g^{-1}x). \tag{8.28}$$
(ii) The maximal invariant statistic u has a distribution which
does not depend on g, i.e. u(gP) = uP for every $g \in G$.
Furthermore, $uP \ll \nu$ and the density of uP with respect
to $\nu$ is
$$p(u) = \int_{\mathcal{R}} p(r,u)\, d\rho(r) \quad \langle\nu\rangle. \tag{8.29}$$
(iii) The conditional distribution of r given u and under gP
has density with respect to $\rho$ of the form
$$gp(r \mid u) = c(u)\, p(g^{-1}r,u) \quad \langle\rho\rangle \tag{8.30}$$
where $c(u)^{-1} = p(u)$.
(iv) If p(s,u,v) does not, in fact, depend on v then v is
distribution constant (i.e. its distribution does not depend
on g) and is independent of (s,u). Furthermore, the
conditional distribution of s given (u,v) and under gP
has density with respect to $\sigma$ of the form
$$gp(s \mid u,v) = gp(s \mid u) = c(u)\, p(g^{-1}s,u) \quad \langle\sigma\rangle \tag{8.31}$$
where p(s,u) denotes the density of (s,u) under P with
respect to $\sigma \otimes \nu$ and where
$$c(u)^{-1} = \int_{\mathcal{S}} p(s,u)\, d\sigma(s).$$
(v) If, in addition to the assumption in (iv), p(s,u,v) is of
the form $p_0(u)\,p_1(s,w)$ for some nonnegative functions $p_0$
and $p_1$ and where w is some invariant statistic then
(a) (s,w) is sufficient
(b) s and (u,v) are conditionally independent given w
(c) the conditional distribution of s given w, or
equivalently given u, and under gP has density with
respect to $\sigma$ of the form
$$gp(s \mid w) = c(w)\, p_1(g^{-1}s,w) \quad \langle\sigma\rangle \tag{8.32}$$
where
$$c(w)^{-1} = \int_{\mathcal{S}} p_1(s,w)\, d\sigma(s). \qquad \Box$$
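Conclusion (i), $gp(x) = p(g^{-1}x)$, can be checked numerically on the location-scale model of example 8.1. The sketch below is illustrative only and rests on assumptions that are not in the text: $f$ is taken to be the standard normal density, and $s(x) = \|x - \bar x e\|$. It compares $p(g^{-1}x)$, with $p(x) = s(x)^n\prod f(x_i)$ as in (8.11), against the Lebesgue density (8.3) multiplied by $d\lambda_n/d\mu = s(x)^n$ from (8.10). Function names are hypothetical.

```python
import math

def std_normal_pdf(z):
    """Illustrative choice of the density f (an assumption of this sketch)."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def mean(x):
    return sum(x) / len(x)

def scale(x):
    """s(x) = ||x - xbar e|| (assumed normalisation)."""
    m = mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x))

def p(x, f=std_normal_pdf):
    """Density (8.11) of P with respect to the invariant measure mu."""
    return scale(x) ** len(x) * math.prod(f(v) for v in x)

def inverse_action(sigma, xi, x):
    """g^{-1} x for g with scale sigma and location xi."""
    return [(v - xi) / sigma for v in x]

def gp_via_theorem(sigma, xi, x, f=std_normal_pdf):
    """gp(x) computed as p(g^{-1} x), i.e. through (8.28)."""
    return p(inverse_action(sigma, xi, x), f)

def gp_direct(sigma, xi, x, f=std_normal_pdf):
    """gp(x) computed as the Lebesgue density (8.3) times s(x)^n from (8.10)."""
    n = len(x)
    lebesgue_density = sigma ** (-n) * math.prod(f((v - xi) / sigma) for v in x)
    return lebesgue_density * scale(x) ** n

if __name__ == "__main__":
    x = [0.4, -0.9, 1.7]  # hypothetical sample
    print(gp_via_theorem(1.5, 0.25, x), gp_direct(1.5, 0.25, x))
```

The agreement does not depend on the particular normalisation of $s$, only on $s(g^{-1}x) = s(x)/\sigma$.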
Remark. It may be shown that if one of the following two conditions
is fulfilled
(a) G has a left factorization with respect to K of the form
G = HK
where H is a subgroup of G (cf. formula (2.2))
(b) K is a normal subgroup of G (cf. example 4.4)
then p(s,u,v) does not depend on v, i.e. the condition in (iv) of
theorem 8.1 is satisfied.
Proof of theorem 8.1
For any measurable set A such that $g^{-1}A$ is measurable one has,
due to the invariance of $\mu$, that
$$(gP)(A) = P(g^{-1}A) = \int_{g^{-1}A} p(x)\, d\mu(x) = \int_{A} p(g^{-1}x)\, d\mu(x)$$
which proves (i).
According to (c) in definition 8.1 one has for any $\mu$-integrable
function f that
$$\int_{\mathfrak{X}} f(x)\, d\mu(x) = \int_{\mathcal{U}} \int_{\mathcal{R}} f(r,u)\, d\rho(r)\, d\nu(u). \tag{8.33}$$
Using (i) and the G-invariance of the measure $\rho$ and of the statistic
u we find for all measurable subsets U of $\mathcal{U}$ and for all $g \in G$,
that
$$\begin{aligned}
u(gP)(U) = (gP)(u^{-1}U) &= \int_{\mathfrak{X}} 1_U(u(x))\, p(g^{-1}x)\, d\mu(x)\\
&= \int_{U} \int_{\mathcal{R}} p(g^{-1}r,u)\, d\rho(r)\, d\nu(u)\\
&= \int_{U} \int_{\mathcal{R}} p(r,u)\, d\rho(r)\, d\nu(u)
\end{aligned}$$
and the proof of (ii) is complete.
To prove (iii), note that (i) and (8.33) imply that the distribution
of (r,u) under gP has density
$$gp(r,u) = p(g^{-1}r,u)$$
with respect to the measure $\rho \otimes \nu$. Using (ii) the proof of (iii) is
easily completed.
Since gp(s,u,v) denotes the density under gP of (s,u,v) with
respect to the measure $\sigma \otimes \nu \otimes \tau$, where $\tau$ is the measure on $\mathcal{V}$,
one has
$$\int_{\mathcal{S}} \int_{\mathcal{U}} \int_{\mathcal{V}} gp(s,u,v)\, d\tau(v)\, d\nu(u)\, d\sigma(s) = 1.$$
Consequently, the condition that gp(s,u,v) does not depend on v
implies that $\tau(\mathcal{V})$ must be finite. Normalizing the measure $\tau$ such
that $\tau(\mathcal{V}) = 1$ it follows that the densities of v and (s,u) under
gP with respect to $\tau$ and $\sigma \otimes \nu$ are, respectively,
$$gp(v) = 1$$
and
$$gp(s,u) = gp(s,u,v_0),$$
where $v_0$ is an arbitrary element of $\mathcal{V}$. This implies that v is
distribution constant and independent of (s,u). Furthermore, since
$$gp(s,u) = p(g^{-1}s,u),$$
the proof of (iv) may be completed by an argument similar to that in
the proof of (iii).
Under the conditions in (v) one has that
$$gp(s,u,v) = gp(s,u) = p(g^{-1}s,u) = p_0(u)\, p_1(g^{-1}s,w) \tag{8.34}$$
which implies that (s,w) is sufficient. Furthermore, it follows from
(8.34) that
$$gp(u,v) = p_0(u) \int_{\mathcal{S}} p_1(g^{-1}s,w(u))\, d\sigma(s)$$
and we obtain that the conditional distribution of s given (u,v)
and under gP has derivative with respect to $\sigma$ of the form
$$p_1(g^{-1}s,w(u)) \Big/ \int_{\mathcal{S}} p_1(s,w(u))\, d\sigma(s).$$
Since this expression depends on (u,v) only through w(u) the
remaining conclusions in (v) are easily proved. □
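Example 8.1. Location-scale model (continued). From (8.11) and
theorem 8.1, or more directly from (8.3) and (8.10), it follows that
$$gp(x) = p(g^{-1}x) = (s(x)/\sigma)^{n} \prod_{i=1}^{n} f\!\left(\frac{x_i-\xi}{\sigma}\right)$$
for $x \in \mathfrak{X}$ and
$$g = \begin{pmatrix} \sigma & \xi \\ 0 & 1 \end{pmatrix} \in G,$$
or equivalently that
$$gp(r,u) = p(g^{-1}r,u) = (s/\sigma)^{n} \prod_{i=1}^{n} f\!\left(\frac{s u_i + \bar x - \xi}{\sigma}\right)$$
where $r = (s,\bar x)$ and where u and $g^{-1}r$ are given by (8.12) and
(8.13), respectively. Using (ii) in theorem 8.1 and (8.12) one obtains
that
$$p(u) = \int_{-\infty}^{\infty} \int_{0}^{\infty} \prod_{i=1}^{n} f(s u_i + \bar x)\, s^{n-2}\, ds\, d\bar x \quad \langle\nu\rangle$$
and from (iii) in theorem 8.1 that
$$gp(r \mid u) = p(u)^{-1}\, (s/\sigma)^{n} \prod_{i=1}^{n} f\!\left(\frac{s u_i + \bar x - \xi}{\sigma}\right) \quad \langle\rho\rangle. \qquad \Box$$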
Example 8.2. Additive effects model (continued). It is
straightforward to check that all of theorem 8.1 applies to the additive
effects model. In particular, conclusion (v) holds with w degenerate
and combining this with (8.16) and (8.19) one sees that the distribution
of $s = (\hat\alpha,\hat\beta,\hat\gamma,\hat\Sigma)$ has density with respect to Lebesgue measure
given by the right hand side of (8.16) multiplied by $|\hat\Sigma|^{-(k+l+p)/2}$.
In view of the way in which this density is factorized, the (stochastic)
independence of the four variates $\hat\alpha$, $\hat\beta$, $\hat\gamma$ and $\hat\Sigma$ as well as their
distributional laws can be read off directly from this result. In
particular, the form of the density of $\hat\Sigma$, i.e. the Wishart distribution,
is immediately seen to be
$$p(\hat\Sigma;\Sigma) = c\, |\Sigma|^{-(n-k-l+1)/2}\, |\hat\Sigma|^{(n-k-l-p)/2}\, e^{-\frac{n}{2}\mathrm{tr}\{\Sigma^{-1}\hat\Sigma\}}$$
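for some constant c independent of $\hat\Sigma$ and $\Sigma$. A further, special
argument is required to show that
$$c = \left(\frac{n}{2}\right)^{pf/2} \Big/ \Big\{ \pi^{p(p-1)/4} \prod_{\nu=1}^{p} \Gamma\big(\tfrac{1}{2}(f+1-\nu)\big) \Big\}.$$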
Note also that the independence of $s = (\hat\alpha,\hat\beta,\hat\gamma,\hat\Sigma)$ and of the
likelihood ratio statistic w for testing the assumptions of additivity
follows simply from (v)(b) of theorem 8.1, according to which s is
independent of the maximal invariant statistic u. In fact,
w = n
where $\tilde\Sigma = n^{-1} \sum_{i,j,\nu} (x_{ij\nu} - \bar x_{ij+})(x_{ij\nu} - \bar x_{ij+})^{*}$,
from which it is seen that w
is invariant, i.e. w is a function of u only. □
Example 8.3. von Mises-Fisher model (continued). Setting
$\xi = U(0,0,1)^{*}$ it follows from (8.7) that
$$Up(x) = a_3(\kappa)^{n}\, e^{\kappa\, \xi\cdot x_+}. \tag{8.35}$$
Furthermore, letting $w(x) = \|x_+\|$ formula (8.24) implies that
$w(x) = \|u_+\|$. Finally, using (8.27) we may rewrite (8.35) as
$$Up(s,u,v) = a_3(\kappa)^{n}\, e^{\kappa w\, \xi\cdot s}.$$
From (v) in theorem 8.1 it follows that
$$Up(s \mid w) = a_3(\kappa w)\, e^{\kappa w\, \xi\cdot s} \quad \langle\sigma\rangle$$
i.e. the conditional distribution of $s = x_+/\|x_+\|$ given $w = \|x_+\|$ is
the Fisher distribution with mean direction $\xi$ and concentration
parameter $\kappa\|x_+\|$. Observe that this conditional model is a standard
transformation model with $K = \tilde K$. □
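The roles of $w = \|x_+\|$ as an invariant statistic and of $s = x_+/\|x_+\|$ as an equivariant statistic can be illustrated numerically. The sketch below uses a hypothetical sample of three unit vectors and a hypothetical rotation built from two plane rotations; it checks that rotating every observation leaves $w$ unchanged and rotates $s$ accordingly.

```python
import math

def matmul(A, B):
    """Product of two 3 x 3 matrices given as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def rotation(alpha, beta):
    """An element of SO(3): rotate by alpha in the 1-2 plane, then by beta in the 2-3 plane."""
    ca, sa = math.cos(alpha), math.sin(alpha)
    cb, sb = math.cos(beta), math.sin(beta)
    R12 = [[ca, -sa, 0.0], [sa, ca, 0.0], [0.0, 0.0, 1.0]]
    R23 = [[1.0, 0.0, 0.0], [0.0, cb, -sb], [0.0, sb, cb]]
    return matmul(R23, R12)

def apply(U, v):
    """Matrix-vector product U v."""
    return [sum(U[i][k] * v[k] for k in range(3)) for i in range(3)]

def unit_vector(theta, phi):
    """Point of S^2 with colatitude theta and longitude phi."""
    return [math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta)]

def resultant(sample):
    """x_+ = x_1 + ... + x_n."""
    return [sum(v[i] for v in sample) for i in range(3)]

def norm(v):
    return math.sqrt(sum(c * c for c in v))

def direction(sample):
    """s(x) = x_+ / ||x_+||."""
    xp = resultant(sample)
    w = norm(xp)
    return [c / w for c in xp]

if __name__ == "__main__":
    sample = [unit_vector(0.3, 1.0), unit_vector(1.2, -0.4), unit_vector(2.0, 2.5)]
    U = rotation(0.7, -1.1)
    rotated = [apply(U, v) for v in sample]
    print(norm(resultant(sample)), norm(resultant(rotated)))
    print(direction(rotated), apply(U, direction(sample)))
```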
Example 8.4. Elliptical models (continued). We now turn to some
inferential aspects for the elliptical distributions.
Instead of considering the action of GA(p) on $\mathbb{R}^p \times PD(p)$, it is
convenient to work with a free action. This is obtained by considering
the subgroup $TA_+(p) = T_+(p) \times \mathbb{R}^p$, which acts transitively and freely
on $\mathbb{R}^p \times PD(p)$. Furthermore, the action of $TA_+(p)$ on $\mathbb{R}^p$ gives rise
to the same families of elliptical distributions as is obtained by the
action of GA(p).
In the following we consider the repeated sampling situation for the
elliptical family corresponding to h, i.e. the model
where
dP (x) dx
Our sample space is now $(\mathbb{R}^p)^n$ and we will represent an observation
in this space by the p x n matrix
$$X = (x_1,\dots,x_n)$$
where $x_i \in \mathbb{R}^p$ are column vectors. The action of $TA_+(p)$ takes the
form
$$(T,\xi): X \to TX + \xi e^{*}$$
where $e^{*} = (1,1,\dots,1) \in \mathbb{R}^n$. In order to apply theorem 8.1 we first
seek an orbital decomposition.
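Let
$$\bar x = \frac{1}{n}\sum_{i=1}^{n} x_i, \qquad X_c = X - \bar x e^{*},$$
$$S = \frac{1}{n}\, X_c X_c^{*} = \frac{1}{n}\sum_{i=1}^{n} (x_i-\bar x)(x_i-\bar x)^{*}$$
and
$$\mathfrak{X} = \{X \in gl(p,n) \mid \det(S(X)) > 0\}.$$
Supposing n > p we have that the complement of $\mathfrak{X}$ is a closed
invariant subset of gl(p,n) of Lebesgue measure zero and we restrict
our sample space to $\mathfrak{X}$.
If $X \in \mathfrak{X}$ let
$$S(X) = T(X)T(X)^{*}$$
be the Cholesky decomposition of S(X) and let
$$u(X) = T(X)^{-1}\big(X - \bar x(X)e^{*}\big) \in \mathcal{U}$$
where
$$\mathcal{U} = \{X \in gl(p,n) \mid \bar x(X) = 0,\ S(X) = I_p\}.$$
Then the mapping
$$X \to \big((T(X),\bar x(X)),\, u(X)\big)$$
defines an orbital decomposition, i.e.
$$X = T(X)u(X) + \bar x(X)e^{*} = (T(X),\bar x(X))(u(X)).$$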
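The orbital decomposition $X = T(X)u(X) + \bar x(X)e^{*}$ can be computed explicitly for a small data matrix. The sketch below is a minimal illustration under an assumption that is not fixed by the text: the Cholesky factor $T(X)$ is taken to be lower triangular with positive diagonal. Function names and the data matrix are hypothetical. It checks that $u(X)$ has zero row means and satisfies $S(u) = I_p$, and that $X$ is recovered from $(T,\bar x,u)$.

```python
import math

def row_means(X):
    """xbar(X): the vector of row means of the p x n matrix X."""
    return [sum(row) / len(row) for row in X]

def centered(X):
    """X_c = X - xbar e^*."""
    m = row_means(X)
    return [[v - m[i] for v in row] for i, row in enumerate(X)]

def sample_cov(X):
    """S(X) = (1/n) X_c X_c^*."""
    Xc = centered(X)
    p, n = len(Xc), len(Xc[0])
    return [[sum(Xc[i][c] * Xc[j][c] for c in range(n)) / n for j in range(p)]
            for i in range(p)]

def cholesky_lower(S):
    """Lower triangular T with positive diagonal and S = T T^* (assumed convention)."""
    p = len(S)
    T = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            acc = S[i][j] - sum(T[i][k] * T[j][k] for k in range(j))
            T[i][j] = math.sqrt(acc) if i == j else acc / T[j][j]
    return T

def solve_lower(T, B):
    """Return U with T U = B by forward substitution, column by column."""
    p, n = len(B), len(B[0])
    U = [[0.0] * n for _ in range(p)]
    for c in range(n):
        for i in range(p):
            acc = B[i][c] - sum(T[i][k] * U[k][c] for k in range(i))
            U[i][c] = acc / T[i][i]
    return U

def orbital_decomposition(X):
    """Return (T(X), xbar(X), u(X)) with u(X) = T(X)^{-1} (X - xbar e^*)."""
    T = cholesky_lower(sample_cov(X))
    return T, row_means(X), solve_lower(T, centered(X))

def recompose(T, xbar, U):
    """Rebuild X = T U + xbar e^*."""
    p, n = len(U), len(U[0])
    return [[sum(T[i][k] * U[k][c] for k in range(p)) + xbar[i] for c in range(n)]
            for i in range(p)]

if __name__ == "__main__":
    X = [[1.0, 2.5, -0.5, 3.0], [0.2, -1.0, 4.0, 1.5]]  # hypothetical data, p = 2, n = 4
    T, xbar, U = orbital_decomposition(X)
    print(T, xbar)
    print(sample_cov(U))
```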
If dX is the restriction to $\mathfrak{X}$ of Lebesgue measure on gl(p,n),
then the measure
$$d\mu(X) = |T(X)|^{-n}\, dX$$
is invariant under the action of $TA_+(p)$ on $\mathfrak{X}$. This means, according
to theorem 5.4, that
$$(T,\bar x,u)(\mu) = \alpha \otimes \kappa$$
where $\alpha$ is the left invariant measure on $TA_+(p)$ and $\kappa$ is some
measure on $\mathcal{U}$. The next step is to characterize $\kappa$.
For that purpose we introduce a supplementary action. Let
$$H = \{V \in SO(n) \mid Ve = e\}.$$
Then H acts on $\mathfrak{X}$ by
$$V: X \to XV^{*}$$
and we have that
$$\bar x(XV^{*}) = \bar x(X), \qquad S(XV^{*}) = S(X)$$
and
$$u(XV^{*}) = u(X)V^{*}.$$
Furthermore, $\mu$ is invariant under the action of H, and H acts
transitively on $\mathcal{U}$. Applying theorem 5.5 shows that $\kappa$ is the
invariant measure on $\mathcal{U}$ under the action of H. In fact
$$\mathcal{U} = SO(n-1)/SO(n-1-p) = St(p,n-1),$$
i.e. $\mathcal{U}$ can be considered as a Stiefel manifold.
Applying theorem 8.1 with
p(X) en
~(X) c4t
n 11
i=l
H,
K
and H acts
is the invari-
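it is now a fairly trivial matter to see that the density of $(T,\bar{x})$
conditionally on $u = (u_1, \ldots, u_n)$ and with respect to the left invariant
measure on $TA_+(p)$ is given by
$$p(T,\bar{x};\xi,\Sigma \mid u) = \frac{|T|^{n}}{p(u)\,|\Sigma|^{n/2}} \prod_{i=1}^{n} h\big((Tu_i + \bar{x} - \xi)^* \Sigma^{-1} (Tu_i + \bar{x} - \xi)\big)$$
and that $u$ has density
$$p(u) = \int_{TA_+(p)} |T|^{n} \prod_{i=1}^{n} h(\|Tu_i + \bar{x}\|^2)\, d\alpha(T,\bar{x}) \qquad (8.36)$$
with respect to the invariant measure on $\mathcal{U}$.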
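In particular, if $h(s) = (2\pi)^{-p/2} \exp(-\tfrac{1}{2}s)$ we obtain the multivariate
normal model, where the product of $h$-terms in (8.36) depends on $(T,\bar{x})$ only,
because $\sum_i \|Tu_i + \bar{x}\|^2 = n\,\mathrm{tr}(TT^*) + n\|\bar{x}\|^2$ for $u \in \mathcal{U}$, showing
that $(T,\bar{x})$ is independent of $u$ and that $u$ is uniformly distributed
on the Stiefel manifold $\mathcal{U}$.
□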
Example 8.5. Elliptical MANOVA models. Example 8.4 is easily generalized
to cover a MANOVA model for an elliptical family. Therefore, we
only give the details necessary for completing the discussion of these
models.
Let
$$\mathcal{M} = \{\Gamma W \in gl(p,n) \mid \Gamma \in gl(p,q)\},$$
where $W$ is a known $q \times n$ (design) matrix of rank $r$. Suppose $n - r \ge p$,
and let $Q$ denote the $n \times n$ matrix of the orthogonal
projection onto the subspace of $\mathbb{R}^n$ spanned by the rows of $W$, i.e.
$$MQ = M, \qquad M \in \mathcal{M},$$
and
Now, $G = T_+(p) \times \mathcal{M}$ acts on $gl(p,n)$ by
$$(T,M): X \to TX + M.$$
Letting
$$M(X) = XQ,$$
$$\mathcal{X} = \{X \in gl(p,n) \mid \det(X(I_n - Q)X^*) > 0\},$$
$$S(X) = T(X)T(X)^*, \qquad T(X) \in T_+(p), \quad X \in \mathcal{X},$$
and
$$\mathcal{U} = \{X \in gl(p,n) \mid M(X) = 0,\ S(X) = I_p\},$$
it follows that
$$X \to ((T(X), M(X)), u(X))$$
is an orbital decomposition. Next, the group
$$H = \{V \in O(n) \mid QV^* = Q\}$$
acts on $\mathcal{X}$ by $V: X \to XV^*$
and transitively on $\mathcal{U}$ by $u \to uV^*$.
Furthermore, $(T,M)$ is invariant whereas $u$ is equivariant under $H$.
Finally
$$d\mu(X) = |T(X)|^{-n}\, dX$$
is invariant under the action of $G \times H$.
It is now straightforward to continue in the same way as we did in
example 8.4. It may be noted that example 8.2 is also covered by the
present model. □
Example 8.6. Conical surface model. In the following we consider a
class of distributions on
$$\mathbb{R}^q \setminus \{0\} \cong \mathcal{X}_0 = \{(\|x\|, x^*)^* \in \mathbb{R}^{q+1} \mid x \in \mathbb{R}^q,\ x \ne 0\}$$
which has recently been characterized by invariance, cf. Letac (1988).
This class of distributions is a subclass of the multivariate generalized
hyperbolic distributions, see Barndorff-Nielsen (1977), Blæsild
(1981), and Blæsild and Jensen (1981) p. 65. In the present context we
will refer to this class of distributions as the conical surface
distributions in $\mathbb{R}^{q+1}$.
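According to example 6.7 we have that $\mathbb{R}^*_+ \times SO^{\uparrow}(1,q)$ acts transitively
on $\mathcal{X}_0$ by
$$(a,A): t \to aAt$$
and that
$$d\mu(x) = \|x\|^{-1}\, dx$$
is relatively invariant with multiplier $\chi_1(a,A) = a^{q-1}$.
Consider the probability measure
$$dP(x) = c_q\, e^{-e_0 \cdot t(x)}\, d\mu(x),$$
where $e_0 = (1,0,\ldots,0)^* \in \mathbb{R}^{q+1}$, $t(x) = (\|x\|, x^*)^*$
and
$$c_q^{-1} = \int_{\mathbb{R}^q \setminus \{0\}} \|x\|^{-1} e^{-\|x\|}\, dx = \int_{S^{q-1}} dv \int_0^\infty r^{-1} e^{-r} r^{q-1}\, dr = \omega_q \Gamma(q-1).$$
Here $\omega_q$ is the surface area of the unit ball in $\mathbb{R}^q$.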
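Transforming $P$ by $(a,A)$ yields
$$\frac{d((a,A)P)}{d\mu}(x) = a^{-q+1}\, c_q \exp\big(-a^{-1}(A^{-1})^* e_0 \cdot t(x)\big).$$
If $\theta = a^{-1}(A^{-1})^* e_0$ it follows from example 6.8 that $\theta$ traverses
$$\Theta = \{\theta \in \mathbb{R}^*_+ \times \mathbb{R}^q \mid \theta * \theta > 0\}$$
and we clearly have that $\theta * \theta = a^{-2}$, i.e.
$$\frac{d((a,A)P)}{d\mu}(x) = c_q\, (\theta * \theta)^{(q-1)/2} \exp(-\theta \cdot t(x)),$$
where $\theta = a^{-1}(A^{-1})^* e_0$.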
Subsequently, we consider the situation where we have a sample
$x_1, \ldots, x_n$, $n \ge 2$, from this class of distributions.
Letting (a,A)P = P9 and ~ = (x1 , ... ,xn ) we have that
n ! t(xi ), and it is easily seen that
i=l
As pointed out for the elliptical distributions, it is convenient to
work with a free and transitive action on $\Theta$. For that purpose we
restrict ourselves to consider the action of $G = \mathbb{R}^*_+ \times P(q)$, where
$P(q)$ is defined in example 3.6. Furthermore, we restrict our sample
space to $\mathcal{X} = \{\mathbf{x} \mid t_+(\mathbf{x}) * t_+(\mathbf{x}) > 0\}$. This leaves out a closed invariant
subset of Lebesgue measure zero, and $G$ acts freely on $\mathcal{X}$.
Next, we want to describe the distribution of the sufficient statistic
$t_+$ and the orbit projection $v$, i.e. we want to transform $P_\theta^{\otimes n}$
by the mapping
In order to apply theorem 8.1 we have to find an invariant measure
on ~n ~. The measure ~ is relatively invariant with multiplier
J(n (a ,A) n(q-1) a , i.e. we need a modulator which is invariant under
(l,A), A € P(q). Now t+(~)*t+(~) is the maximal invariant under the
action of P(q) on ~ and it is easily seen that
We may conclude that is a modulator with J(n
as associated multiplier. Hence
is invariant and
Finally, we need to describe the invariant measure $\tau$ on $\mathcal{T}$, which
by example 6.8 is given by
$$d\tau(t) = (t*t)^{-(q+1)/2}\, dt.$$
Application of theorem 8.1 now yields that $v$ and $t_+$ are independent
and that the distribution of $t_+$ is given by
$$c_{n,q}\, (\theta*\theta)^{n(q-1)/2}\, (t*t)^{n(q-1)/2-(q+1)/2}\, e^{-\theta \cdot t}, \qquad (8.37)$$
where $c_{n,q}$ is a normalizing constant, which is independent of $\theta$.
Setting $\theta = (1,0,\ldots,0)^* \in \mathbb{R}^{q+1}$ in (8.37) a direct calculation
shows that
$$c_{n,q} = \frac{\pi^{-q/2}\, \Gamma\big(\tfrac{1}{2}(n(q-1)+1)\big)}{\Gamma\big(\tfrac{1}{2}(n-1)(q-1)\big)\, \Gamma(n(q-1))}.$$
□
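Example 8.7. Conical model. In continuation of the preceding example,
we focus on a family $\mathcal{P}$ of distributions as described in
(8.37), i.e. $\mathcal{P} = \{P_\theta \mid \theta \in \Theta\}$ and
$$\frac{dP_\theta}{d\mu}(t) = a_{\lambda,q}\, (\theta*\theta)^{\lambda}\, (t*t)^{\lambda}\, e^{-\theta \cdot t}. \qquad (8.38)$$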
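Here $t \in \mathcal{T}$ and
$$\mathcal{T} = \{t \in \mathbb{R}^*_+ \times \mathbb{R}^q \mid t*t > 0\}, \qquad \Theta = \{\theta \in \mathbb{R}^*_+ \times \mathbb{R}^q \mid \theta*\theta > 0\},$$
and
$$d\mu(t) = (t*t)^{-(q+1)/2}\, 1_{(0,\infty)}(t*t)\, dt$$
is invariant under the action of $G = \mathbb{R}^*_+ \times P(q)$ on $\mathcal{T}$. The family is
well-defined for all $\lambda > (q-1)/2$ and the norming constant $a_{\lambda,q}$ in
(8.38) is given by
$$a_{\lambda,q} = \frac{\pi^{-q/2}\, \Gamma(\lambda + \tfrac{1}{2})}{\Gamma\big(\lambda - \tfrac{1}{2}(q-1)\big)\, \Gamma(2\lambda)}.$$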
We proceed to investigate a submodel of $\mathcal{P}$ which is invariant under
$P(q)$. Any such model corresponds to a fixed value of the maximal invariant,
when we consider the action of $P(q)$ on $\Theta$. A maximal invariant
is given by $\theta*\theta$, and consequently $(\theta*\theta)^{1/2} = \kappa_0$ corresponds
to the submodel
$$\frac{dP_{\kappa_0,\omega}}{d\mu}(t) = a_{\lambda,q}\, \kappa_0^{2\lambda}\, (t*t)^{\lambda}\, e^{-\kappa_0\, \omega \cdot t},$$
where $\omega \in H^q$. When we are making inference for $\theta$, we should condition
on the maximal invariant, and hence ancillary, statistic $r(t) = (t*t)^{1/2}$.
(In the terminology of theorem 8.1 the maximal invariant
statistic is denoted by $u$. Here we use the letter $r$ for the maximal
invariant statistic $(t*t)^{1/2}$ since this statistic is in fact the
(hyperbolic) resultant length of the vector $t$.) If $t \to (r(t), s(t))$,
where $s(t) = r(t)^{-1}t \in H^q$, we have by example 6.8 that
$$(r,s)(\mu) = \rho \otimes \sigma,$$
where $\sigma$ is invariant on $H^q$. It follows that the distribution of $s$
conditionally on $r$ has density with respect to $\sigma$ given by
$$c_q(r\kappa_0)\, e^{-r\kappa_0\, \omega \cdot s}.$$
Here $c_q(\cdot)$ is a normalizing constant given by
$$c_q(\kappa) = \frac{(\kappa/2\pi)^{(q-1)/2}}{2K_{(q-1)/2}(\kappa)}, \qquad (8.39)$$
where $K_{(q-1)/2}$ denotes the modified Bessel function of the third kind
and with index $(q-1)/2$.
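As a plausibility check of (8.39), the normalizing constant can be compared with a direct numerical integration over $H^q$. The sketch below rests on assumptions that are not spelled out in the text: the invariant measure on $H^q$ is taken in polar coordinates as $\sinh^{q-1}\chi \, d\chi$ times the surface measure of $S^{q-1}$, the direction parameter is placed at $(1,0,\ldots,0)$ so that the exponent is $-\kappa \cosh\chi$, and $K_\nu$ is evaluated through the standard integral representation $K_\nu(x) = \int_0^\infty e^{-x\cosh t}\cosh(\nu t)\,dt$. All function names are hypothetical.

```python
import math

def simpson(f, a, b, m=4000):
    """Composite Simpson rule with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def bessel_k(nu, x, upper=12.0):
    """K_nu(x) via int_0^inf exp(-x cosh t) cosh(nu t) dt, truncated at `upper`."""
    return simpson(lambda t: math.exp(-x * math.cosh(t)) * math.cosh(nu * t), 0.0, upper)

def c_formula(q, kappa):
    """Normalizing constant in the form of (8.39)."""
    nu = (q - 1) / 2
    return (kappa / (2 * math.pi)) ** nu / (2 * bessel_k(nu, kappa))

def c_direct(q, kappa, upper=12.0):
    """Reciprocal of the integral of exp(-kappa cosh(chi)) over H^q."""
    sphere_area = 2 * math.pi ** (q / 2) / math.gamma(q / 2)      # area of S^{q-1}
    radial = simpson(
        lambda chi: math.exp(-kappa * math.cosh(chi)) * math.sinh(chi) ** (q - 1),
        0.0, upper)
    return 1.0 / (sphere_area * radial)

kappa = 1.5
ratios = {q: c_formula(q, kappa) / c_direct(q, kappa) for q in (1, 2, 3, 4)}

# For q = 2 the index is 1/2 and K_{1/2}(x) = sqrt(pi / (2x)) exp(-x),
# so the constant reduces to kappa * exp(kappa) / (2 pi).
c2_closed = kappa * math.exp(kappa) / (2 * math.pi)
```

Under these assumptions every entry of `ratios` should be close to 1, and `c_formula(2, kappa)` should agree with `c2_closed`.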
This class of distributions has been considered for observations on
$H^q$, i.e.
$$\frac{dP_{(\kappa,\omega)}}{d\sigma}(s) = c_q(\kappa)\, e^{-\kappa\, \omega \cdot s},$$
and is known as the hyperboloid model, see Jensen (1981). However,
Jensen writes the density in terms of the $*$ product in the following
way
$$\frac{dP_{(\kappa,\eta)}}{d\sigma}(s) = c_q(\kappa)\, e^{-\kappa\, \eta * s}, \qquad (8.40)$$
and refers to the parameters $\eta = I_{1,q}\,\omega \in H^q$ and $\kappa \in \mathbb{R}_+$
as the direction parameter and the concentration parameter, respectively.
Note that the family is a transformation model only when $\kappa$ is
fixed. In the case of a sample $s_1, \ldots, s_n$ this means that it is more
difficult to derive the distribution of the minimal sufficient statistic
$\bar{s} = \frac{1}{n}\sum_{i=1}^{n} s_i$ since we must also characterize the quotient
measure corresponding to the maximal invariant $\bar{s} * \bar{s}$. See the continuation
of this example at the end of the section. □
Example 8.8. von Mises-Fisher matrix model. In the following we
give a brief introduction to the so-called von Mises-Fisher matrix
distributions, see Downs (1972) and Jupp and Mardia (1979). These
distributions constitute a model for observations on the Stiefel manifold
$St(p,n) = \{X \in gl(p,n) \mid XX^* = I_p\}$, i.e. observations of $p$ mutually
orthogonal directions in $\mathbb{R}^n$.
If $dX$ denotes the $O(p) \times O(n)$-invariant probability measure on
$St(p,n)$ - as described in example 4.6 - the family in question is
given by $\mathcal{P} = \{P_M \mid M \in gl(p,n)\}$ with
$$\frac{dP_M}{dX}(X) = c(M) \exp(\mathrm{tr}(MX^*)),$$
where tr indicates the trace and $c(M)^{-1} = {}_0F_1(n/2, MM^*/4)$ is a
hypergeometric function of matrix argument, cf. James (1964).
Since the action of $O(p) \times O(n)$ on $St(p,n)$ is given by
$$(U,V): X \to UXV^*$$
it is simple to see that
$$\frac{d((U,V)P_M)}{dX}(X) = c(M) \exp(\mathrm{tr}(UMV^*X^*)),$$
i.e. the induced action on the parameter space $gl(p,n)$ is given
by
$$(U,V): M \to UMV^*.$$
This action has been considered in example 2.7, where $\lambda_1(M) \ge \ldots \ge \lambda_p(M) \ge 0$,
$\{\lambda_i^2(M)\}_{i=1}^{p}$ being the eigenvalues of $MM^*$, are characterized
as a maximal invariant.
If is fixed, it follows that ~A o
~A = (M € gl{p,n) IMM* o
formation model and that
* c(Ao)eXp(tr(MX »
where the norming constant
diag(A~l, ... ,A~p'O, ... ,O) € gl(n,n), depends on AO only.
is a trans-
As noted in example 2.7 there are multiple types of orbits, i.e.
essentially different models, depending on the set of equalities in the
relation $\lambda_{01} \ge \ldots \ge \lambda_{0p} \ge 0$.
We do not proceed further, but note that the von Mises-Fisher model
on the sphere corresponds to $p = 1$, and that the complications exhibited
in example 8.3 do not at all simplify in this more general setting. □
We finally turn to a brief discussion of composite transformation
models, as defined at the beginning of this section. Such a model consists
of a class of probability measures $\mathcal{P}$ which may be partitioned
as $\mathcal{P} = \{\mathcal{P}_\lambda: \lambda \in \Lambda\}$ for some index set $\Lambda$ with each subclass $\mathcal{P}_\lambda$
being a transformation model relative to a group $G$ acting on the
sample space $\mathcal{X}$, the group $G$ and its action on $\mathcal{X}$ being the same
for all $\lambda$. Thus $\mathcal{P}_\lambda$ is of the form $\mathcal{P}_\lambda = \{gP_\lambda: g \in G\}$. The variate $\lambda$
is called the index parameter of the model.
Since the group $G$ and its action on $\mathcal{X}$ are assumed to be independent
of $\lambda \in \Lambda$ it seems natural to require that the same holds for the
quantities in definition 8.1 related to the action of G on ~. Con
sequently, we assume that the quantities ~, u, v, K, r, p and ~
do not depend on A € A. The remaining quantities in definition 8.1
may depend on A and in that case we write PA' etc.
We then say that $\mathcal{P}$ is a standard composite transformation model provided
the only quantities which do, in fact, depend on $\lambda$ are $P_\lambda$ and
$p(\cdot\,;g,\lambda)$, where
and we speak of $\mathcal{P}$ as a balanced standard composite transformation
model if, in addition, $K = \tilde{K}$.
Let $\mathcal{P} = \{\mathcal{P}_\lambda: \lambda \in \Lambda\}$ be a standard composite transformation model and
suppose that the conclusions of theorem 8.1 apply to each of the transformation
models $\mathcal{P}_\lambda$. In particular, one then has that
$$p(x;g,\lambda) = p(g^{-1}x;e,\lambda) \qquad (8.41)$$
and, furthermore, that the marginal distribution of u has density
with respect to u of the form
p(u:X) = J p(r,u:e,X)dp(r) <u> • (8.42)
In many contexts of statistical inference interest centers on likelihood
functions. In the present setting the primary likelihood function
is
$$L(g,\lambda;x) = p(x;g,\lambda) \qquad (8.43)$$
considered as a function of $(g,\lambda)$ for fixed $x$. Factors in $L$ depending
on $x$ alone are considered irrelevant. Note, however, that
according to assumptions one has that $G_{P_\lambda} = K$. Consequently, the
density function p(x:g,X) depends on g only through gK or equiva
lently only through s, where s denotes the point in ~, considered
as the parameter space, corresponding to gK (cf. comment (B) to defi
nition 8.1). Thus the primary likelihood function (8.43) may be re
written as
$$L(\tilde{s},\lambda;x) = p(x;\tilde{s},\lambda) \qquad (8.44)$$
Similarly,
$$L(\lambda;u) = p(u;\lambda), \qquad (8.45)$$
with $p(u;\lambda)$ given by (8.42), is a marginal likelihood function for $\lambda$
based on observation of the maximal invariant statistic $u$ alone,
factors depending on $u$ (or even $x$) alone being again considered
irrelevant.
We shall now show that the formula (8.45) can be transformed into
another expression for the marginal likelihood $L(\lambda;u)$ in terms of an
integral of the primary likelihood $L(\tilde{s},\lambda;x)$, but an integral with
respect to $\tilde{s}$ (or, equivalently, a part of $g$) rather than $r$ (or,
equivalently, a part of $x$). The disregarding of parameter free factors
is essential in this connection and the resulting formula (8.46)
is often simpler to apply than (8.45). Specifically we have
Theorem 8.2. Let $\mathcal{P} = \{\mathcal{P}_\lambda: \lambda \in \Lambda\}$ be a standard composite transformation
model with index parameter $\lambda$. Suppose that for every $\lambda \in \Lambda$
the conclusions of theorem 8.1 apply to $\mathcal{P}_\lambda$ and that, in addition, the
isotropy group $K$ is compact. Then the marginal likelihood $L(\lambda;u)$
based on the maximal invariant statistic $u$ may be calculated as
$$L(\lambda;u) = \int L(\tilde{s},\lambda;x)\, \Delta_G^{-1}(\tilde{s})\, d\alpha(\tilde{s}). \qquad (8.46)$$
In (8.46) a denotes the invariant measure on ~ and the modular
function AG of the group G is considered as a function of s, or
equivalently as a function of the left coset gK. This function is
well-defined because of the compactness of K which implies that
$\Delta_G(k) = 1$ for every $k \in K$. □
Proof. The basic formula used in the proof given here of theorem
8.2 is formula (5.25) stating that
where $H$ is a closed subgroup of $G$ and where $\alpha_{G/H}$ denotes the
invariant measure relative to the natural action of $G$ on $G/H$.
Since K is compact it follows from the assumptions that K is
compact and consequently that aK(K) < 00. From (8.45) and (8.42),
using (8.47) with H = K after a transformation, it follows that
L(A;U) P(U;A)
f p(r,u;e,A)dp(r) ~
Disregarding the factor
(4.4) and (4.3) we find that
L(A;U) J P(gK,u;e,A)daG(g) G
~ -1 J p(K,u;g ,A)daG(g) G
and using, respectively, (8.41),
~ -1 J p(K,U;g,A)A G (g)daG(g) G
Now, let $g_0 \in G$ be such that $x = g_0u$. From the invariance of the
measure $\alpha_G$ and from (8.41) we obtain that
~ -1 -1-1 L(U;A) = J p(K,u;go g,A)A G (go g)daG(g)
G
where in the last equality we have used the fact that the quantities
P(goK,U;g;A) and AG(g) depend on 9 only through gK. This fact in
conjunction with (8.47) with H = K implies that
Disregarding the factor AG(go)aK(K) and transforming we find that
L(u;X)
~ -1 ~ ~ f p(X;s,X)A G (s)da(s) ';/
~ -1 ~ ~ f L(S,X;X)A G (s)da(s) , ';/
as was to be proved. o
The importance of theorems 8.1 and 8.2 is intimately connected with
two basic principles of statistical inference, those of conditionality
and of marginalization. According to the first, inference on $g$, or on
that part of $g$ on which $gP$ genuinely depends, i.e. $gK$, should be
performed conditionally on a suitable ancillary statistic. Under the
assumptions of theorem 8.1, $(u,v)$ constitutes such a statistic and
the conditional model given $(u,v)$ reduces to that determined by the
model function $gp(s \mid w)$, for which the formula (8.32) is available.
The principle of marginalization implies that under a composite transformation
model the proper basis for inference on the index parameter
$\lambda$ is the marginal distribution of the maximal invariant statistic $u$.
The density function for this distribution is given by (8.42), and
while that formula is often rather intractable the derived expression
(8.46) for the marginal likelihood is more manageable.
We close this section by showing an application of formula (8.46).
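Example 8.7. Conical model (continued). As mentioned above, the
class of hyperboloid distributions $P_{(\kappa,\tilde{s})}$ given by (8.39) and (8.40),
and with direction parameter $\tilde{s} \in H^q$ and concentration parameter
$\kappa \in \mathbb{R}_+$, is a standard composite transformation model with $\kappa$ as index
parameter. The likelihood function corresponding to a sample $\mathbf{s} = (s_1, \ldots, s_n)$
of size $n \ge 2$ is
$$\frac{dP^{\otimes n}_{(\kappa,\tilde{s})}}{d\sigma^{\otimes n}}(\mathbf{s}) = c_q(\kappa)^n\, e^{-\kappa\, \tilde{s} * s_+}, \qquad (8.48)$$
where $s_+ = \sum_{i=1}^{n} s_i$. Expressing the minimal sufficient statistic $s_+$
in terms of the maximal invariant statistic $u = (s_+ * s_+)^{1/2}$ and the
maximum likelihood estimator $\hat{s}$ of $\tilde{s}$ one obtains
$$s_+ = u\hat{s}.$$
Inserting this in (8.48) the primary likelihood function takes the form
$$L(\tilde{s},\kappa;\mathbf{s}) = c_q(\kappa)^n\, e^{-\kappa u\, \tilde{s} * \hat{s}}. \qquad (8.49)$$
Since the invariant measure in (8.46) is here the $SO^{\uparrow}(1,q)$-invariant measure $\sigma$
on the parameter space $H^q$, and the modular function of $SO^{\uparrow}(1,q)$ is
identically 1, it follows immediately from (8.39), (8.40), and (8.49) that the
marginal likelihood (8.46) for $\kappa$ is
$$L(\kappa;u) = \frac{c_q(\kappa)^n}{c_q(\kappa u)}, \qquad (8.50)$$
where $c_q(\cdot)$ is given by (8.39).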
Formula (8.50) may be compared with the actual distribution of $u$.
By formula (8.42) this distribution has density function with respect
to Lebesgue measure of the form
$$p(u;\kappa) = L(\kappa;u)\, h_n(u) \qquad (8.51)$$
with $L(\kappa;u)$ as in (8.50). Rukhin (1974) has shown that, for $q = 1$,
$$h_n(u) = \pi^{n+1}\, \mathrm{Re}\Big( i^{n+1} \int_0^\infty \{H_0^{(1)}(x)\}^n J_0(ux)\, x\, dx \Big), \qquad (8.52)$$
where $H_0^{(1)}$ is a Hankel function and $J_0$ is a Bessel function, - a
rather redoubtable expression. In contrast, for $q = 2$ we have simply
$$h_n(u) = \frac{(2\pi)^{n-1}}{(n-2)!}\, u(u-n)^{n-2},$$
cf. Jensen (1981). For $q \ge 3$ the form of $h_n$ is not known.
If we apply the asymptotic relation
$$K_\nu(x) \sim \sqrt{\pi/2}\; x^{-1/2} e^{-x}, \qquad x \to \infty,$$
to (8.50) we find that
$$L(\kappa;u) \sim (2\pi)^{-(n-1)q/2}\, u^{-q/2}\, \kappa^{(n-1)q/2}\, e^{-\kappa(u-n)}.$$
Suppose we adopt the right hand side expression as an approximation for
$L(\kappa;u)$. This expression is recognised as the likelihood function of a
gamma distribution and this suggests that $2\kappa(u-n)$ is approximately
distributed as $\chi^2((n-1)q)$, i.e.
$$2\kappa(u-n) \mathrel{\dot\sim} \chi^2((n-1)q). \qquad (8.53)$$
A more detailed calculation shows that this is indeed the case, see Jensen
(1981). As mentioned above, the exact distribution of $u$ is known only
for $q = 1$ and $2$. In the latter case (8.53) is, in fact, exact and
for $q = 1$ it represents a considerable simplification over the exact
result, cf. (8.50)-(8.52). □
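The statement that (8.53) is exact for $q = 2$ lends itself to a small Monte Carlo illustration. The sketch below makes several assumptions of its own: the direction parameter is fixed at $(1,0,0)$, the invariant measure on $H^2$ is written as $\sinh\chi\, d\chi\, d\varphi$ (so that $\cosh\chi - 1$ is exponential with rate $\kappa$ and $\varphi$ is uniform), and the function names and the seed are hypothetical. It compares the first two moments of $2\kappa(u-n)$ with those of $\chi^2(2(n-1))$, which has mean $2(n-1)$ and variance $4(n-1)$.

```python
import math
import random

def minkowski(x, y):
    """The * product x*y = x0 y0 - x1 y1 - ... - xq yq."""
    return x[0] * y[0] - sum(a * b for a, b in zip(x[1:], y[1:]))

def sample_h2(kappa, rng):
    """One draw on H^2 with direction (1,0,0) and concentration kappa."""
    c = 1.0 + rng.expovariate(kappa)          # cosh(chi)
    r = math.sqrt(c * c - 1.0)                # sinh(chi)
    phi = rng.uniform(0.0, 2.0 * math.pi)
    return (c, r * math.cos(phi), r * math.sin(phi))

def resultant_length(points):
    """u = (s_+ * s_+)^(1/2) for the vector sum s_+ of the sample."""
    s_plus = [sum(coords) for coords in zip(*points)]
    return math.sqrt(minkowski(s_plus, s_plus))

def simulate(n, kappa, reps, seed=0):
    """Simulated values of 2 kappa (u - n)."""
    rng = random.Random(seed)
    return [2 * kappa * (resultant_length([sample_h2(kappa, rng) for _ in range(n)]) - n)
            for _ in range(reps)]

n, kappa = 5, 2.0
values = simulate(n, kappa, reps=20000)
mean = sum(values) / len(values)
var = sum((v - mean) ** 2 for v in values) / (len(values) - 1)
```

With $n = 5$ the sample mean and variance should lie near 8 and 16 respectively, up to Monte Carlo error.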
Bibliographical notes
Some key references to the statistical literature on transformation
models are Barnard (1963), Fraser (1979), Barndorff-Nielsen, Blæsild,
Jensen and Jørgensen (1982) and Barndorff-Nielsen (1983, 1988).
Theorem 8.1 is an extended version of theorem 3.1 in Barndorff-Nielsen,
Blæsild, Jensen and Jørgensen (1982).
In the balanced case, i.e. for $K = \tilde{K}$, the results of theorems 8.1
and 8.2 are stated in Barndorff-Nielsen (1988), chapter 2.
Further results and exercises
1. Let $G$ be a group acting on $\mathcal{X}$ and assume there exists a mapping
$$z: \mathcal{X} \to G$$
which is equivariant, i.e. $z(gx) = gz(x)$ for all $x \in \mathcal{X}$ and all $g \in G$.
Show that the mapping
$$u: \mathcal{X} \to \mathcal{X}, \qquad x \to (z(x))^{-1}x$$
is maximal invariant.
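A concrete instance of exercise 1 may help fix ideas. The sketch below assumes, purely for illustration, the location-scale group $G = \mathbb{R}_+ \times \mathbb{R}$ acting on $\mathbb{R}^n$ by $(a,b): x \to ax + b$, with the equivariant map $z(x)$ = (sample standard deviation, sample mean); none of these choices or function names come from the exercise itself.

```python
import math

def z(x):
    """Equivariant map into G: z(x) = (standard deviation, mean), so z(gx) = g z(x)."""
    n = len(x)
    m = sum(x) / n
    s = math.sqrt(sum((xi - m) ** 2 for xi in x) / n)
    return (s, m)

def act(g, x):
    """Action of g = (a, b) on a data vector: x -> a x + b."""
    a, b = g
    return [a * xi + b for xi in x]

def compose(g, h):
    """Group product (a, b)(c, d) = (a c, a d + b)."""
    return (g[0] * h[0], g[0] * h[1] + g[1])

def inverse(g):
    a, b = g
    return (1.0 / a, -b / a)

def u(x):
    """Candidate maximal invariant u(x) = z(x)^{-1} x."""
    return act(inverse(z(x)), x)

x = [2.0, -1.0, 0.5, 4.0, 3.5]
g = (2.5, -7.0)
gx = act(g, x)

equivariance_gap = max(abs(p - q) for p, q in zip(z(gx), compose(g, z(x))))
invariance_gap = max(abs(p - q) for p, q in zip(u(gx), u(x)))
reconstruction_gap = max(abs(p - q) for p, q in zip(act(z(x), u(x)), x))
```

Invariance of `u` follows from equivariance of `z`, and the identity `x = z(x) u(x)` shows that two points with the same value of `u` lie on the same orbit, which is the maximality part of the exercise in this special case.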
[Section 2]
2. i) Let $M$ be a linear subspace of $\mathbb{R}^n$ of dimension $m < n$.
Show that $G = \mathbb{R}^*_+ \times M$ acts on $\mathbb{R}^n$ by
$$G \times \mathbb{R}^n \to \mathbb{R}^n, \qquad ([a,\mu],x) \to ax + \mu.$$
ii) Let $p$ denote the orthogonal projection on $M$ with respect to
the usual inner product on $\mathbb{R}^n$ and define
$$s(x) = \|x - p(x)\|.$$
Show that G acts freely on
Show that the action of G on
is transitive but not free and determine the isotropy group at
o.
iii) Let
$$\mathcal{U} = \{x \in \mathcal{X} \mid s(x) = 1,\ p(x) = 0\}$$
and define
$$u: \mathcal{X} \to \mathcal{U}, \qquad x \to s(x)^{-1}(x - p(x)).$$
Show that
$$x \to ([s(x),p(x)], u(x))$$
is an orbital decomposition.
[Section 2]
3. Let $PD(n)$ denote the set of positive definite $n \times n$ matrices
and $T_+(n)$ the group of upper triangular matrices with positive
diagonal elements.
i) Show that $T_+(n)$ acts transitively and freely on $PD(n)$ by
$$T_+(n) \times PD(n) \to PD(n), \qquad (T,\Sigma) \to T\Sigma T^*.$$
(Consider the Cholesky decomposition of $\Sigma$).
ii) Let P(k) = (m:)k denote the multiplicative group of positive
vectors, i.e.
X,TJ € P(k) .
Show that G P(k) x T+(n) acts freely on PD(n)k+1 by
G x PD(n)k+1
«T,X), (SO,Sl, ... ,Sk»
iii) * Let So = TOTO' TO € T+(n) be the Cholesky decomposition of
So' and define
i 1, ... ,k
Show that
is an orbital decomposition.
[section 2]
4. Let $SL(2)$ be the special linear group $\{A \in GL(2) \mid \det(A) = 1\}$ and
consider the action of $SL(2)$ on $gl(2)$ - the vector space of $2 \times 2$
matrices - given by
$$SL(2) \times gl(2) \to gl(2), \qquad (A,B) \to ABA^*.$$
Let
$$gl^+(2) = \{B \in gl(2): B = B^*\}$$
and
$$gl^-(2) = \{B \in gl(2): B = -B^*\}.$$
Show that
$$gl(2) = gl^+(2) \oplus gl^-(2),$$
i.e. $gl(2)$ is the direct sum of $gl^+(2)$ and $gl^-(2)$.
Show that $gl^+(2)$ is invariant under the action of $SL(2)$, i.e.
$ABA^* \in gl^+(2)$ if $A \in SL(2)$, $B \in gl^+(2)$.
Similarly, show that $gl^-(2)$ is invariant and that the action of
$SL(2)$ on $gl^-(2)$ is trivial.
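Show that $gl^+(2)$ has 4 orbit types corresponding to the following
conditions on $B$.
i) $B = 0$.
Show that $GB = \{0\}$ and that $G_B = SL(2)$.
ii) $\det(B) = 0$, $B \ne 0$.
Let $\delta \in \{\pm 1\}$ denote the sign of the non-zero eigenvalue of $B$.
Show that
$$u(B) = \begin{bmatrix} \delta & 0 \\ 0 & 0 \end{bmatrix}$$
is an orbit representative and that
$$G_{u(B)} = \left\{ \begin{bmatrix} \sigma & a \\ 0 & \sigma \end{bmatrix} \,\middle|\, a \in \mathbb{R},\ \sigma \in \{\pm 1\} \right\}.$$
iii) $\det(B) = \lambda > 0$.
Let $\delta \in \{\pm 1\}$ denote the sign of the eigenvalues of $B$. Show
that
$$u(B) = \delta\sqrt{\lambda} \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}$$
is an orbit representative and that $G_{u(B)} = SO(2)$.
iv) $\det(B) = \lambda < 0$.
Show that
$$u(B) = \sqrt{-\lambda} \begin{bmatrix} 1 & 0 \\ 0 & -1 \end{bmatrix}$$
is an orbit representative and that $G_{u(B)} = SO(1,1)$.
[Section 2]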
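The basic facts behind exercise 4 can be verified on a concrete integer example, where all arithmetic is exact. The matrices `A`, `B` and `J` below are arbitrary illustrative choices: the check shows that the action $B \to ABA^*$ of a matrix with determinant 1 preserves symmetry and the determinant (the invariant separating the orbit types), and that it fixes a skew-symmetric matrix.

```python
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def det2(M):
    return M[0][0] * M[1][1] - M[0][1] * M[1][0]

def act(A, B):
    """The action (A, B) -> A B A^*."""
    return matmul(matmul(A, B), transpose(A))

A = [[2, 3], [1, 2]]        # det(A) = 1, so A lies in SL(2)
B = [[1, 2], [2, -3]]       # symmetric, det(B) = -7
J = [[0, 1], [-1, 0]]       # skew-symmetric

B_moved = act(A, B)
J_moved = act(A, J)
```

For a $2 \times 2$ matrix one has $AJA^* = \det(A)\,J$, which is why the action on the skew-symmetric part is trivial exactly when $\det(A) = 1$.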
~ Let (G,~) be a transformation group.
Show that the following conditions are equivalent to (G,~) being
standard respectively to properness of the action. 01)
Suppose that {(Yn,Xn)}n=l ~ v and (Yn'xn ) ~ (Yo,xo ) € ~x~. n~
Standard: Then there exists a sequence 01)
{gn}n=l ~ G such that
01)
Properness: For every sequence {gn}n=l so that Yn 01)
have that {gn}n=l has a convergent subsequence.
[Section 2]
6. Let $H$ be a closed subgroup of $G$ and consider the action
$$\tau: H \times G \to G, \qquad (h,g) \to hg.$$
Show that $\tau$ is proper.
[Section 2]
we
7. Show that the action of $G$ on $\mathcal{X}$ is proper if $G$ is compact.
[Section 2]
8. Let $H$ be a closed subgroup of $G$ and consider the action
$$\tau: H \times G \to G, \qquad (h,g) \to hgh^{-1}.$$
Show that properness of $\tau$ is equivalent to $H$ being compact.
[Section 2]
9. Let $(G,\mathcal{X})$ be a transformation group and $H$ a closed subgroup of
$G$.
Show that if $(G,\mathcal{X})$ is standard respectively $G$ is acting properly
then - with the inherited action of $H$ - $(H,\mathcal{X})$ is standard respectively
$H$ is acting properly.
[Section 2]
10. If $A \in gl(n)$ then there exists an upper triangular matrix $T$
and an invertible matrix $S$ such that $A = S^{-1}TS$ (the Jordan normal
form). Note that $S$ and $T$ have some complex entries in case $A$ has
some complex conjugate eigenvalues.
i) Show by referring to (3.6) that
$$\det(\exp(As)) = \exp(\mathrm{tr}(A)s), \qquad s \in \mathbb{R}.$$
ii) Show that the Lie algebra of $SL(n) = \{S \in GL(n) \mid \det(S) = 1\}$ is
given by
$$sl(n) = \{A \in gl(n) \mid \mathrm{tr}(A) = 0\}.$$
[Section 3]
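The identity in exercise 10 i) can be checked numerically for a particular matrix. The sketch below is a plain illustration with an arbitrarily chosen $3 \times 3$ matrix `A`; the matrix exponential is computed from its power series, which is adequate only for small matrices of moderate norm, and the helper names are hypothetical.

```python
import math

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def expm(A, terms=60):
    """Matrix exponential by its power series sum_k A^k / k!."""
    n = len(A)
    result = [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]
    term = [row[:] for row in result]
    for k in range(1, terms):
        term = [[v / k for v in row] for row in matmul(term, A)]
        result = [[r + t for r, t in zip(rr, tr)] for rr, tr in zip(result, term)]
    return result

def det3(M):
    (a, b, c), (d, e, f), (g, h, i) = M
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def trace(M):
    return sum(M[i][i] for i in range(len(M)))

A = [[0.2, 1.0, -0.5],
     [0.3, -0.4, 0.7],
     [-0.6, 0.1, 0.5]]
s = 1.3
lhs = det3(expm([[s * v for v in row] for row in A]))
rhs = math.exp(trace(A) * s)

# Removing the trace gives an element of sl(3); its exponential has determinant 1.
A0 = [[A[i][j] - (trace(A) / 3 if i == j else 0.0) for j in range(3)] for i in range(3)]
det_traceless = det3(expm(A0))
```

The second computation illustrates part ii): a traceless matrix exponentiates into the special linear group.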
11. Reconsider exercise 4 and the 3 orbit types under the action of
$SL(2)$ on $gl^+(2) \setminus \{0\}$. The Lie algebra $sl(2)$ of $SL(2)$ is given by
exercise 10.
i) Consider the set
$$\mathcal{X}_1 = \{B \in gl^+(2) \mid \det(B) > 0\}$$
with constant orbit type, and determine the Lie algebra ~ of
Show that
s1(2) st (2) ® 'j{
where
st(2) = {[~ _~] la,b E IR} .
Show that st(2) is the Lie algebra of the group
ST(2) = {[~
Finally, show that ST(2) acts freely on ~1' i.e. the equa
tion
B * Tu(B)T T E ST(2)
uniquely determines an orbital decomposition.
ii) Consider the following sets, each having their own orbit type:
~2 {BEgl+(2) Idet(B) = 0, B#O}
~3 {BEgl+(2) Idet(B)<O} .
Identify in both cases the Lie algebra ~ of the isotropic
group GU(B) and show that
sl(2) so(2) ~ :1/ ~ ~
where
This leads to considering
SO(2) = {V(S) = [c~SS -sinS] Is E [0 2W)} s~nS cosS '
and
Show that in both cases the system
* B V(S)O(a)u(B)O(a)V(S) S E [O,w), a > 0
has a unique solution, i.e. determines an orbital decomposition.
The similarity between these two decompositions indicates that
the concept 'constant orbit type' might be relaxed by only requiring
isotropy groups to be isomorphic, as is the case in this
example.
[Section 3]
12. Let G be the subgroup of GL+(3) determined by
G
Determine /3 G and
13. Let G be the subgroup
{ [~ 0
G Y 0
of GL (3) . Find a G , /3 G and
Let H be the subgroup of
H { [g
o x o
~] :
AG·
G
0
Y 0
~]: x > 0, y,z E rn} .
[sections 4 and 6]
x,y,z E rn, xy "F o}
determined by
~] : y "F o} .
Show that there exists no G-invariant measure on G/H.
[Sections 4 and 6]
14. Let G be the subgroup of GL+ (3) given by
{[~ 0 log X]
> 0, y,z E rn} . G x z : x 0 1
Find a G , /3 G and AG·
Let H be the subgroup of G determined by
H = {[~ 0 ~]: y E rn} . 1 0
Show that H is unimodular and that there exists an invariant measure
under the natural action of G on G/H. Determine this measure.
[Sections 4 and 6]
15. Consider the action of $GA_+(n)$ on $\mathbb{R}^n$ given by
$$GA_+(n) \times \mathbb{R}^n \to \mathbb{R}^n, \qquad ([A,\xi],x) \to Ax + \xi.$$
Let $\mu$ be the measure corresponding to the $n$-dimensional normal
distribution with mean $0$ and variance $I_n$, i.e.
~ (dx)
where ~n denotes Lebesgue measure on ~n and where
Show that $\mu$ is a quasi-invariant measure. Find the quasi-multiplier
$\chi([A,\xi],x)$ and determine the function $\rho$ in formula (4.8).
[Section 4]
16. Let Hand K be closed subgroups of G such that the action
H x K: G ~ G
(h,k): g ~ hgk-1
is proper and transitive. Let a G be a left Haar measure on G.
Show that - under the action of H x K - the measure is rela-
tively invariant with multiplier l(h,k)
NOW, let ~ € meG) be determined by
~ (f)
where and are left Haar measures on H and K, respective-
lye
Show that ~ is equal to a G, up to a constant, and summarize the
result in the formula
[Section 4]
17. Let ~ be a d-dimensional manifold and let (V,~) be a local
parametrization. Recall that for x € ~(V) the corresponding coordi
nate frames E1x , ... ,Edx of T!x' the tangent space at x, are given
by
-E...... (fo~) (v) , av1
where f is a smooth function defined in a neighbourhood of x = ~(v). Let E1X , ... ,Edx be the frames corresponding to an alternative
local parametrization (Z,C). Show that using Einsteins summation
convention one has
where v and
Let ~ be a Riemannian manifold with metric ~, i.e. for every x
€ ~ the value ~x of ~ at x is a positive definite, bilinear form
on T!x' which in the local parametrization (V,~) is represented by
Show that
where refers to the representation of ~ with respect to the
local parametrization (Z,C).
Finally show that the geometric measure a on ~ given by formula
(5.5) is a geometrical quantity, i.e. it does not depend on the local
parametrization.
[section 5]
18. Consider the cone
as a Riemannian submanifold of i.e. the metric on is the
restriction to ~ of the usual Riemannian (or Euclidean) metric on
m3 • Let u be the mapping
u: ~ -+ m+ x -+ ( * ) 1/2 x x ,
where
For every u € m+ find the geometric measure on
~u {x€~lu(x)=u}.
Consider m+ as a Riemannian submanifold of m and find the de
composition of Lebesgue measure on ~ determined by formula (5.7).
[Section 5]
19. Let ~ be a d-dimensional Riemannian manifold with metric ~. A
local parametrization (Z,C) is called orthonormal at x € ~ (with
respect to ~) if for x = C(z) one has
{jab '
where {j denotes Kroneckers delta function and where the d x d ma-
trix is the representation of
system z.
~ x in the coordinate
Let (V,~) be an arbitrary local parametrization and let q(v) be
the corresponding representation of ~. For a fixed x € ~(V) let v
= ~-1(x) and let q1/2(v) be a square root of q(v), i.e. q1/2(V)
is a d x d matrix satisfying q(v) = q1/2(V)q1/2(v)*. Show, using
exercise 17, that if z: V -+ Rd is a one-to-one mapping such that
* az (v) av q1/2 (v) ,
then one obtains an orthonormal local parametrization (Z,C) at x by
letting
Z z (V)
and
C = .;ov
where v = z -1 the inverse mapping of z. Now let, in addition, u be a differentiable mapping from ~ into
some m-dimensional Riemannian manifold ~ with m < d and metric ~.
Let (V,~) be a local parametrization at u(x) such that ~(V) = u(';(V» and let (Z,r) be a i-orthonormal parametrization at u(x)
constructed as above. Thus, expressed in terms of the local coordinate
systems v and v, the mapping u becomes
and expressed in the coordinates z and z u is given by
With this notation it follows from formula (5.7) and the remarks imme
diately after the formula, that the differential OU and the modulat
ing factor evaluated at x € ~ are given by, respectively,
OU
and
louou*I-1/ 2
* az
a" a"* -1/2 I u* ~I az az
Show that these quantities expressed in terms of the arbitrary
coordinates v and v become, respectively,
Ou
and
q1/2(V} aK* q-1/2(v} av
Consequently, the factor louou*I-1/ 2 in formula (5.7) may be calcu
lated without reference to orthonormal parametrizations.
[section 5]
20. Let ~ be a family of probability measures on ~ dominated by a
a-finite measure ~. Identify ~ with the set of densities ~ =
{~(-) Ip€~} and assume that ~ (or ~) is a d-dimensional manifold.
Let
w -+ p(- ;w} dP w ~ (.)
be a parametrization of ~. Show that the Fisher information i(w}
{irs(w}}, given by
defines a Riemannian metric on ~ (provided that i(w) is positive
definite} .
[Section 5]
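21. Identify the set of normal distributions $\mathcal{N} = \{N(\xi,\sigma^2): \xi \in \mathbb{R},\ \sigma^2 > 0\}$
with the set $\mathbb{R}_+ \times \mathbb{R} = \{[\sigma,\xi]: \sigma > 0,\ \xi \in \mathbb{R}\}$ and consider the Fisher
information on $\mathcal{N}$. Show that the geometric measure $\zeta$ on $\mathcal{N}$ corresponding
to the Fisher information satisfies
$$\zeta = \sqrt{2}\, \alpha_{GA_+(1)}$$
and conclude that $\zeta$ is invariant under the action of $GA_+(1)$ on $\mathcal{N}$
given, in obvious notation, by
$$GA_+(1) \times \mathcal{N} \to \mathcal{N}, \qquad ([\tilde{\sigma},\tilde{\xi}], N(\xi,\sigma^2)) \to \tilde{\sigma} N(\xi,\sigma^2) + \tilde{\xi}.$$
[Section 5]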
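The relation asked for in exercise 21 can be made plausible numerically before it is proved. The sketch below assumes the coordinates $(\sigma, \xi)$, computes the Fisher information as the expected outer product of the score by numerical integration, and compares the square root of its determinant with $\sqrt{2}\,\sigma^{-2}$, where $\sigma^{-2}\, d\sigma\, d\xi$ is the left invariant measure on $GA_+(1)$ obtained in exercise 23. The function names and the integration range of 12 standard deviations are illustrative choices.

```python
import math

def simpson(f, a, b, m=4000):
    """Composite Simpson rule with an even number m of subintervals."""
    h = (b - a) / m
    total = f(a) + f(b)
    for i in range(1, m):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def fisher_information(sigma, xi):
    """2 x 2 Fisher information of N(xi, sigma^2) in the coordinates (sigma, xi)."""
    def density(x):
        return math.exp(-0.5 * ((x - xi) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

    def score(x):
        zed = (x - xi) / sigma
        # Partial derivatives of the log density with respect to sigma and xi.
        return ((zed * zed - 1) / sigma, zed / sigma)

    lo, hi = xi - 12 * sigma, xi + 12 * sigma
    return [[simpson(lambda x: score(x)[r] * score(x)[c] * density(x), lo, hi)
             for c in range(2)] for r in range(2)]

def geometric_density(sigma, xi):
    """Density of the geometric measure with respect to d(sigma) d(xi)."""
    i = fisher_information(sigma, xi)
    return math.sqrt(i[0][0] * i[1][1] - i[0][1] * i[1][0])

points = [(0.5, -1.0), (1.0, 0.0), (3.0, 2.5)]
ratios = [geometric_density(s, x) * s ** 2 for s, x in points]
```

Each entry of `ratios` should be close to $\sqrt{2} \approx 1.4142$, independently of $(\sigma, \xi)$.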
22. Consider the unit hyperboloid
as a Riemannian submanifold of ffi3 and show that the geometric measure
on H2 is given by
Prove, using (8.39), (8.40) and the formula
that the density function of the hyperboloid distribution on H2 with 2 direction parameter ~ = (~O'~l'~2) € H and concentration parameter
K € ffi+ can be expressed as
Prove (or take for granted) that the corresponding Fisher informa
tion is
-2 0 0 K
2 - (1+K)~1~2
i (K '~l'~2) 0 (1+K) (1+~2)
2 2 ~O ~O
- (1+K) ~l ~2 2
0 (1+K) (1+~1)
2 2 ~O ~O
Now assume that K is known. Show that the geometric measure a i
on corresponding to the Fisher information for the
unknown parameters is given by
[Sections 5 and 8]
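23. Let $G$ be a closed subgroup of $GL(n)$, i.e. $G$ is a Lie group.
Then $g^{-1}dg$ determines a set of left invariant one-forms on $G$, i.e.
$$(hg)^{-1}d(hg) = g^{-1}dg.$$
These may be used to construct a left Haar measure on $G$, as described
on p. 72.
Show that $(dg)g^{-1}$ may be used for the construction of a right
Haar measure on $G$.
A simple example is provided by $GA_+(1)$, where
$$\begin{bmatrix} a & \mu \\ 0 & 1 \end{bmatrix}^{-1} \begin{bmatrix} da & d\mu \\ 0 & 0 \end{bmatrix} = \begin{bmatrix} a^{-1}da & a^{-1}d\mu \\ 0 & 0 \end{bmatrix}$$
determines $a^{-2}\,da\,d\mu$ as left invariant, whereas
$$\begin{bmatrix} da & d\mu \\ 0 & 0 \end{bmatrix} \begin{bmatrix} a & \mu \\ 0 & 1 \end{bmatrix}^{-1} = \begin{bmatrix} a^{-1}da & d\mu - a^{-1}\mu\,da \\ 0 & 0 \end{bmatrix}$$
determines $a^{-1}\,da\,d\mu$ as right invariant.
Use this method to determine left and right Haar measures on the
groups considered in exercises 12, 13 and 14.
[Section 7]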
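The two Haar measures on $GA_+(1)$ quoted in exercise 23 can be confirmed with a change-of-variables check. The sketch below represents a group element as a pair $(a, \mu)$ with product $(a,\mu)(b,\eta) = (ab, a\eta + \mu)$ and estimates Jacobians by central differences; a density $f$ defines a measure invariant under a translation $\varphi$ when $f(\varphi(g))\,|\det D\varphi(g)| = f(g)$. The function names and test points are hypothetical.

```python
def multiply(g, h):
    """Group product in GA_+(1): (a, mu)(b, eta) = (a b, a eta + mu)."""
    return (g[0] * h[0], g[0] * h[1] + g[1])

def jacobian_det(phi, g, eps=1e-6):
    """Determinant of the 2 x 2 Jacobian of phi at g, by central differences."""
    cols = []
    for k in range(2):
        plus = list(g)
        minus = list(g)
        plus[k] += eps
        minus[k] -= eps
        fp, fm = phi(tuple(plus)), phi(tuple(minus))
        cols.append([(fp[r] - fm[r]) / (2 * eps) for r in range(2)])
    return cols[0][0] * cols[1][1] - cols[1][0] * cols[0][1]

def invariance_defect(density, phi, g):
    """density(phi(g)) |det D phi(g)| - density(g); zero for an invariant measure."""
    return density(phi(g)) * abs(jacobian_det(phi, g)) - density(g)

left_density = lambda g: g[0] ** -2      # a^{-2} da dmu
right_density = lambda g: g[0] ** -1     # a^{-1} da dmu

h = (2.5, -1.0)
g = (0.8, 3.0)
left_translation = lambda x: multiply(h, x)
right_translation = lambda x: multiply(x, h)

left_ok = invariance_defect(left_density, left_translation, g)
right_ok = invariance_defect(right_density, right_translation, g)
left_under_right = invariance_defect(left_density, right_translation, g)
```

The first two defects vanish up to rounding, while the third does not, which reflects the fact that $GA_+(1)$ is not unimodular.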
24. Let $S \in PD(p)$, the set of $p \times p$ positive definite matrices.
Recall from p. 70-71 that if
* S T T
where T € T+(p), is the Cholesky decomposition of S then
dS " ds .. i~j 1J
The p-dimensional gamma function fp is given by
for A > (p-1)/2.
Show that
$$\int_{PD(p)} |S|^{\lambda-(p+1)/2}\, e^{-\mathrm{tr}\,S}\, dS = \pi^{p(p-1)/4} \prod_{i=1}^{p} \Gamma\big(\lambda - \tfrac{1}{2}(i-1)\big)$$
and conclude that for ! € PO(p) one has that
Thus setting $\lambda = f/2$, where $f$ is an integer greater than or equal
to $p$, one obtains the density function of the $p$-dimensional Wishart
distribution with $f$ degrees of freedom and variance $\Sigma$
$$\frac{|S|^{(f-p-1)/2}}{2^{pf/2}\, \Gamma_p(f/2)\, |\Sigma|^{f/2}}\, e^{-\frac{1}{2}\mathrm{tr}(\Sigma^{-1}S)}, \qquad S \in PD(p).$$
Show that the maximum likelihood estimator ! on p. 94 is dis
tributed according to a p-dimensional Wishart distribution with f = 1 n-(k+1-1) degrees of freedom and variance n!.
[Sections 7 and 8]
25. This exercise is concerned with the structure of exponential
transformation models. More detailed discussions of such models may be
found in Barndorff-Nielsen, Blæsild, Jensen and Jørgensen (1982), Eriksen
(1984) and Barndorff-Nielsen (1988). For notation and terminology
concerning exponential families the reader is referred to Barndorff-Nielsen
(1978).
Let $\mathcal{P}$ be a standard transformation model on $\mathcal{X}$ with acting group
$G$ and assume, in addition, that $\mathcal{P}$ is an exponential model of order
$d$ with minimal representation
$$p(x;g) = p(g^{-1}x) = a(\theta(g))\, b(x)\, e^{\theta(g) \cdot t(x)},$$
$\mu$ being a $G$-invariant measure on $\mathcal{X}$. Furthermore, assume that the
minimal sufficient statistic $t$ is continuous.
i) using the affine relationship between respectively the canonical
parameters and the canonical statistics of two minimal represen
tations of 1 (cf. for instance lemma 8.1 in Barndorff-Nielsen,
1978) show that there exist uniquely determined subgroups G and G of GA(d),
( [A (g) , B (g) ]: gEG}
G * -1 ~ {[A (g ) ,B(g)]: gEG}
such that:
a) t(gx) A(g)t(x)+B(g)
b) 9(g)
and
c) the mappings
G .... G
9 .... [A(g) ,B(g)]
and
G .... G
* -1 ~
9 .... [A (g ) ,B(g)]
are representations of the group G, i.e. the mappings are
homomorphisms from G into GA(d) .
ii) Let
iii)
6(g) a(9(e))/a(9(g))e-9 (g).B(g)
and prove that
Let E.(g)
given by
~ -1 b(gx) = 6(g)b(x)eB(g )·t(x)
In 6(g) and let M(g) be the element in GL(d+2)
M(g)
A(g)
o ~* -1 B (g )
B(g) o
1 o
E. (g) 1
Show that the mapping
9 -+ M(g)
is a group representation of G, i.e. a homomorphism from G
into GL(d+2).
iv) Let +(x) = In b(x) and let s denote the (d+2)-dimensional
statistic given by
* * s (x) (t (x),1,+(x))
Furthermore, let (g,u) be an orbital decomposition of x,
i.e. x = guo Show that
s(x) M(g)s(u)
and, in conclusion, show that there exists a constant (d+2)-di
mensional vector c such that
p(x) = eC·M(g)S(U)
v) Let e denote the parameter space of the full exponential fami
ly ~ containing ~, i.e.
Show that the group G = ([A*(g-l),B(g)] IgEG} leaves e in
variant and conclude that if the action of G on e given by
(G,e) -+ e (g,9) -+ A*(g-1)9+B(g)
is not transitive then ~ is a composite transformation model.
[Section 8]
26. Denote by ~ the 2 x 2 matrix
{ C1 -1 C1 -C1
and consider the group of 2 x 2 matrices
G = {~ : $\sigma > 0$}
with the usual matrix multiplication as the rule of composition. Letting
$(x_1, \ldots, x_n)$, $x_i \in \mathbb{R}^2$, denote a point in gl(2,n), we define an
action ~ of G on gl(2,n) by
Now let ~ denote the transformation model on gl(2,n) generated by
G and by the probability measure $P_0$ under which $x_1, \ldots, x_n$ are n
independent observations from a bivariate normal distribution with mean
0 and variance equal to the identity matrix.
i) Determine the set ~ where the maximum likelihood estimate $\hat{\sigma}$
of $\sigma$ exists uniquely.
ii) Show that G acts freely and properly on ~ (see exercise 5)
and that the complement ~c is closed and of Lebesgue measure
zero.
iii) Show that the Lebesgue measure on ~ is invariant under the
action of G.
iv) Find a maximal invariant statistic u (see exercise 1).
v) Find the conditional distribution of the maximum likelihood estimate $\hat{\sigma}^2$ of $\sigma^2$ given the maximal invariant statistic u.
[Section 8]
References*
Andersson, S. (1978): Invariant measures. Tech. Rep. No. 129, Dept. Statistics, Stanford University. [1, 12, 28]
Andersson, S., Brøns, H. and Jensen, S.T. (1982): An algebraic theory for normal statistical models. Unpublished manuscript. Inst. of Math. Stat., Univ. Copenhagen. [6]
Andersson, S., Brøns, H. and Jensen, S.T. (1983): Distribution of eigenvalues in multivariate statistical analysis. Ann. Statist. 11, 392-415. [1, 53]
Baddeley, A.J. (1983): Applications of the coarea formula to stereology. Proceedings of Second International Workshop on Stereology and Stochastic Geometry. Memoirs No. 6, Dept. of Theor. Statist., University of Aarhus. [1]
Barnard, G.A. (1963): Some logical aspects of the fiducial argument. J. Roy. Statist. Soc. B 25, 111-114. [112]
Barndorff-Nielsen, O.E. (1977): Exponentially decreasing distributions for the logarithm of particle size. Proc. Roy. Soc. Lond. A 353, 401-419. [100]
Barndorff-Nielsen, O.E. (1978): Information and Exponential Families in Statistical Theory. Wiley, Chichester. [128, 129]
Barndorff-Nielsen, O.E. (1983): On a formula for the distribution of the maximum likelihood estimator. Biometrika 70, 343-365. [1, 82, 112]
Barndorff-Nielsen, O.E. (1988): Parametric Statistical Models and Likelihood. Lecture Notes in Statistics 50. Springer, Heidelberg. [1, 43, 75, 82, 112, 128]
Barndorff-Nielsen, O.E., Blæsild, P., Jensen, J.L. and Jørgensen, B. (1982): Exponential transformation models. Proc. R. Soc. Lond. A 379, 41-65. [1, 53, 112, 128]
Barut, A.O. and Raczka, R. (1980): Theory of Group Representations and Applications. Polish Scientific Publishers, Warszawa. [27, 41]
Blaschke, W. (1935a): Integralgeometrie 1. Ermittlung der Dichten für lineare Unterräume im $E_n$. Actualités Sci. Indust. 252, 1-22. [60]
Blaschke, W. (1935b): Integralgeometrie 2. Zu Ergebnissen von M.W. Crofton. Bull. Math. Soc. Roumaine Sci. 37, 3-11. [60]
*Numbers in square brackets refer to the pages where the paper is cited.
Blæsild, P. (1981): On the two-dimensional hyperbolic distribution and some related distributions: with an application to Johannsen's bean data. Biometrika 68, 251-263. [100]
Blæsild, P. and Jensen, J.L. (1981): Multivariate distributions of hyperbolic type. In C. Taillie, G.P. Patil and B.A. Baldessari (Eds.): Statistical Distributions in Scientific Work Vol. 4, 45-66. D. Reidel Publ. Co., Dordrecht. [100]
Boothby, W.M. (1975): An Introduction to Differentiable Manifolds and Riemannian Geometry. Academic Press, New York. [53]
Bourbaki, N. (1959): Éléments de Mathématique, Intégration. Chapitre 6. Hermann, Paris. [53]
Bourbaki, N. (1960): Éléments de Mathématique, Topologie Générale. Chapitre 3 à 4. Hermann, Paris. [14]
Bourbaki, N. (1963): Éléments de Mathématique, Intégration. Chapitre 7 à 8. Hermann, Paris. [41, 53]
Bourbaki, N. (1970): Éléments de Mathématique, Algèbre. Chapitre 1 à 3. Hermann, Paris. [14]
Cohn, P.M. (1957): Lie Groups. Cambridge University Press. [27, 66]
Downs, T.D. (1972): Orientation statistics. Biometrika 59, 665-676. [9, 105]
Eaton, M.L. (1983): Multivariate Statistics. Wiley, Chichester. [1]
Edelen, D.G.B. (1985): Applied Exterior Calculus. Wiley, New York. [66]
Effros, E.G. (1965): Transformation groups and C*-algebras. Ann. Math. 81, 38-55. [14]
Eriksen, P.S. (1984): A note on the structure theorem for exponential transformation models. Research Report 101, Dept. Theor. Statist., Aarhus University. [128]
Eriksen, P.S. (1989): Some results on the decomposition of quasi-invariant measures. Inst. Electr. Syst., University of Aalborg. (In preparation). [14, 41, 53]
Farrell, R.H. (1985): Multivariate Calculation. Springer, New York. [1, 74]
Fraser, D.A.S. (1979): Inference and Linear Models. McGraw-Hill, New York. [1, 112]
Helgason, S. (1978): Differential Geometry, Lie Groups, and Symmetric Spaces. Academic Press, New York. [27, 66]
Husain, T. (1966): Introduction to Topological Groups. Saunders. [14]
James, A.T. (1964): Distributions of matrix variates and latent roots derived from normal samples. Ann. Math. Statist. 35, 475-501. [105]
Jensen, E.B. and Gundersen, H.J.G. (1988): Fundamental stereological formulae based on isotropically oriented probes through fixed points with applications to particle analysis. J. Microsc. (To appear.) [60]
Jensen, E.B., Kieu, K. and Gundersen, H.J.G. (1988): On the stereological estimation of reduced moment measures. Research Report 174, Dept. Theor. Statist., Aarhus University. [60]
Jensen, J.L. (1981): On the hyperboloid distribution. Scand. J. Statist. 8, 193-206. [8, 24, 104, 111, 112]
Jespersen, N.C.B. (1985): On the structure of transformation models. Cand. scient. thesis, Inst. Math. Statist., Univ. Copenhagen. [14]
Jespersen, N.C.B. (1989): On the structure of transformation models. Ann. Statist. 17, 195-208. [78]
Jupp, P.E. and Mardia, K.V. (1979): Maximum likelihood estimators for the matrix von Mises-Fisher and Bingham distributions. Ann. Statist. 7, 599-606. [105]
Kallenberg, O. (1983): Random Measures. (Third edition.) Akademie-Verlag, Berlin, and Academic Press, New York. [2, 42]
Karr, A.F. (1986): Point Processes and their Statistical Inference. Dekker, New York. [2]
Karr, A.F. (1988): Palm distributions of point processes and their applications to statistical inference. Contemporary Mathematics 80, 331-358. [2]
Khatri, C.G. and Mardia, K.V. (1977): The von Mises-Fisher matrix distribution in orientation statistics. J. Roy. Statist. Soc. Ser. B 39, 95-106. [9]
Letac, G. (1988): Les familles exponentielles statistiques invariantes par les groupes du cône et du paraboloïde de révolution. (To appear in Ann. Statist.) [100]
Miles, R.E. (1979): Some new integral geometric formulae, with stochastic applications. J. Appl. Prob. 16, 592-606. [60]
Muirhead, R.J. (1982): Aspects of Multivariate Statistical Theory. Wiley, New York. [1, 74]
Petkantschin, B. (1936): Integralgeometrie 6. Zusammenhänge zwischen den Dichten der linearen Unterräume im n-dimensionalen Raume. Abh. Math. Sem. Univ. Hamburg 11, 249-310. [60]
Reiter, H. (1968): Classical Harmonic Analysis and Locally Compact Groups. Oxford University Press. [41]
Rukhin, A.L. (1974): Strongly symmetrical families and statistical analysis of their parameters. Zap. Naučn. Sem. Leningrad. Otdel. Mat. Inst. Steklov 43, 59-87. (English translation (1978): J. Soviet Math. 9, 886-910.) [111]
Santaló, L.A. (1979): Integral Geometry and Geometric Probability. Encyclopedia of Mathematics and Its Applications, Vol. 1. Addison-Wesley, London. [10, 74]
Stoyan, D., Kendall, W.S. and Mecke, J. (1987): Stochastic Geometry and Its Applications. Akademie-Verlag, Berlin, and Wiley, New York. [2]
Tjur, T. (1980): Probability Based on Radon Measures. Wiley, Chichester. [53]
Wijsman, R.A. (1967): Cross-sections of orbits and their application to densities of maximal invariants. Proc. Fifth Berkeley Symp. Math. Statist. and Prob., University of California Press, Berkeley and Los Angeles, 1, 389-400. [14]
Wijsman, R.A. (1986): Global cross sections as a tool for factorization of measures and distribution of maximal invariants. Sankhya 48, 1-42. [14]
Subject index
action 3,14,43,46,47,48,49,54,74
  free 4
  induced 74
  instances 3,4,5,5-6,7,8,9,10,22,26,38,39,49,50,51,59,65,67,75,76,77,78,84,86,95,96,99,100,101,103,105,113,114,115,117,118,119,121,125-126,131
  left 10,32,52,55
  natural 11,36,52,53,108
  proper 13,36,45,49,117,118,121,131
  right 10,32
  transitive 3
  see also group; conjugation
additive effects model 76,85-87,94-95
analysis of variance see additive effects model
ancillarity principle 82
ancillary statistic 82,110
  instances 84,104
Bessel function 77,104,111
Blaschke-Petkantschin formula 59-60
boost 24,39
canonical
  parameter 129
  statistic 129
Cauchy distribution
  multivariate 79
chart 15
Cholesky decomposition 70,96,114,115,127
commuting mapping 48
composite transformation model 75,78,106,110,131
  standard 106,108
    balanced 107
concentration parameter 76,104,126
conditional
  distribution
    formulae 90-91
  independence 90
conditionality principle 110
conditioning 82
cone 49,123
configuration statistic 84
conical
  model 103-105,110-112
  surface model 100-103
conjugation 56
coordinate frame 45,122
coset
  left 11
  right 10
decomposition of measures 1,42,45,46,47,48,74,123
differentiable
  manifold 15,66,68,72,122,123,124,125
    see also manifold
  submanifold 54,123,126
differential 45,66,124
  r-form 71
    invariant 72,127
  of function 69
direction parameter 104,126
disintegration 1,2,42,53
  formulas 42,46,48
distribution constancy 90
Einstein summation convention 69,71,122
elliptical
  models 78-79,95-98
  MANOVA models 98-100
equivariant
  mapping 4,80,113
  statistic 80,84,88,89,100
exponential
  map 15,18,21
    instance 24
  model 129
    representation of 129
exterior calculus 1,66
exterior product 69
  and Jacobians 69
  and invariant measures 72
factorization of group 15,21,27
  left 11,52,72,82,91
    of GL(n) 38
    of SO(p) 22
    of SO↑(1,q) 23
    of SO↑(p,q) 26
  right 10
    of GL(n) 38
    of SO↑(p,q) 23
factorization of measures 1,46,48,66
  of Lebesgue measure 37,66,67-68
  of left invariant measure 52,121-122
  of lifted measure 42,48,49,50
    instances 51,52,65
  of right invariant measure 52
Fisher
  distribution 87-89,95
  information 125,126,127
forms see differential
gamma
  distribution 112
  function 128
geometric measure 45,53,54,122,123,125,126,127
  on GA(n) 56
  on GL(n) 56
  on PD(n) 58
  on S^{p-1} 62
  on T+(n) 57
G-invariant measure see measure
Grassman manifold see manifold
group 2,14
  action 3
  commutator C(n,H) 5,31
  compact 34,117
  connected 18
  factorization 15
  general affine GA(n) 4,56
    positive GA+(n) 4
  general linear GL(n) 3,15,18,19,34,38,56
    positive GL+(n) 4
  of isometries 51
  isotropy (or isotropic) 4,13,35,36,38,44,46,80,83,108
    instances 39,40,116
  location-scale 5,31,40,51
  opposite 14,37
  orthogonal O(p) 5,7,17,38
    special SO(p) 8,17,22,39
  positive multiplicative 4
  pseudo-orthogonal O(p,q) 7,16,19,26
    special SO(p,q) 16,39,49,65
    special, identity component SO↑(p,q) 7,17,23,26,39,63
  quotient 37
  representation 129
  special linear SL(n) 115,118
  special triangular ST(2) 118
  standard transformation 12,13,28,43,48,49,83,117
  sub- 10,52
    closed 10,16,117,121,127
    normal 37,47,91
    regular 30,80
    supplementary 49
  symmetric 2,6
  topological 2
  triangular T+(n) 34,38,114
    with unit diagonal T+1(n) 20
  unimodular 34
    instances 39,40,41,56,120
  see also factorization; Lie
Haar measure 32
  see also invariant measure (left; right)
Hankel function 111
homogeneous space 3
hyperbolic
  distributions see conical surface model
  length 8
hyperboloid model 24,104,110,126
  generalized 26,39,49,51,64
  unit 8,23,39,126
hypergeometric function of matrix argument 105
independence 94,98,102 see also conditional
index parameter 106,108,110
invariance
  characterization by 43,49-50,52,65-66
  subgroups 16,60-63
invariant measure 1,29,31,36,38,43,44,48,50,53,72,74,80,108,129
  construction 53,55,66,72
  and Jacobians 54-55,56,57,69
  instances 31,38,39,40,49,51,56,57,58,59,62,63,65,66,67,73,77,84,85,86,87,97,98,100,102,103,120,125,127,132
  on cone 65,102
  on cone surface 63
  on cosets 52
  on G(p,n) 59-60
  on GL(p) 56
  on gl(p,n)t 60
  on H^q 39,85
  on H^{p,q} 51,63,65,72-74
  on invariance subgroups 60-61
    orbits of - - 61-63
  on PD(n) 58-59
  on S^{p-1} 62
  under action of C(n,H) 31
  under action of GA+(1) 31,40,84,85
  left 32,37,52,55,59,120
    on GA(n) 57
    on GA+(1) 40,127
    on T+(n) 57
  left-right formulae 32
  right 32,46,49,52,55,120
    on GA(n) 57
    on GA+(1) 40,127
    on T+(n) 58
  see also exterior product; differential; relatively
Iwasawa decomposition 27
  of so(1,q) 25
Jacobi identity 17
Jacobian 45,54,69-70
Jordan normal form 118
Kronecker delta 123
LCD-space 12,13,50,83
Lie algebra 15,17,20,64
  of GL(k), i.e. gl(k) 18,19
  of O(p,q), i.e. o(p,q) 19-20
  of SL(n), i.e. sl(n) 118
  of SO(p), i.e. so(p) 22
  of SO↑(1,q), i.e. so(1,q) 23
  of ST(2), i.e. st(2) 118
  of T+1(k), i.e. t+1(k) 21
  subalgebra 17
    of gl(k) 19
Lie group 15,37,66,72,127
  factorization 21
  semisimple 27
  subgroup 15-16,19
    of GL(k) 19
Lie product 17
  instance 18
likelihood function 107,110,112
  marginal (function) 75,107,110
    instance 111
likelihood ratio statistic 94
locally compact 12
location-scale model 75,82,83-85,93-94
  see also group
Lorentz
  group 24
  transformation 24
manifold
  Grassman G(p,n) 9,39
  Riemannian 45,53,122
  Stiefel St(p,n) 9,39,98,105
  see also differentiable
marginal distribution
  formula 107
  see also likelihood
marginalization principle 110
maximal invariant
  function 4,79,113
    instances 51,67,104,105
  statistic 14,75,90,107,110
    instances 102,103,104,105,111,132
maximum likelihood estimator 82,94,128,131,132
  existence and uniqueness 85
mean direction 76
metric
  Riemannian 45,122,123,125
modular function 32,34,52,108,120
  in terms of Jacobians 55
  of GA(n) 57
  of T+(n) 34,58
  of location-scale group 40
modulator 30,31,33,47,49,53,54
  with multiplier 30,31,33,47-48,49,52
    existence 30,30-31
    instances 31,102
  with quasi-multiplier 44,54-55
    in terms of Jacobian 55
module see modular function
multiplier 29,30,31,33,36,46,49
  instances 31,32,47,50,52,59,65,86,87,100,102,121
normal
  multivariate - distribution 6,76,78,79,121,125,131
  multivariate - model 6-7,98
  see also group
normalizer 47
orbit 3
  types 106,116,119
    constant - type 7,13-14,119
      instances 7,8,9,118
    regular - type 30,46,48,53,55,80
  of invariance subgroup 61-63
  projection 3,43,45,102
  representative 4,7,14,48,79,84,88,116
  space 3,48
orbital decomposition 4,7,13,14,44,54,55
  instances 5,64-65,88,97,99,114,115,119,130
parametric statistics 1,74
parametrization (local) 15,45,122,123,124
  orthonormal 123,124,125
polar decomposition 66
proper
  action see action
  mapping 13,29,48,50,83
quasi-
  invariant measure 35,43,53,54,121
  multiplier 34-35,43,44,53,54,121
quotient
  measure 46,49
    characterization by invariance 49,65
    instances 50,51,52
  topology 3
Radon measure 12,28
regular measure 12,28
relatively invariant measure 29,30,31,46
  construction 31
  instances 31,32,47,50,52,59,65,86,100,102,121
  existence 30,36
  uniqueness 30,36
relativity theory 24
Riemannian
  geometry 53
  manifold see manifold
  metric 45
skew-symmetry 71
statistical inference 110
stereology 10,60
Stiefel manifold see manifold
Student distribution
  multivariate 79
submanifold 45,123,126
sufficiency principle 82
sufficient statistic 90,102
  minimal 82,105,111,129
transformation
  of densities 29,34
  standard - group see group
transformation model 74,78,112
  instances 105,106,131
  balanced 82,112
  exponential 2,75,128
  main theorem 89-91
  regularity conditions 83,89,91
  standard 75,80,89,129
    instances 85,87,89,95
  see also composite
translation
  left 10
  right 11
von Mises-Fisher
  model 76-78,87-89,95
  matrix model 105-106
wedge product see exterior
Wishart distribution 94,128
Notation index
a,aG 32 gx 3
s4 74 gK 7
gP 74
B(q) 24 9JL 29
~'~G 32 G 2
~ 28 GO 14
G(p,n) 9
C(n,H) 5 GA(n) 4
Cp(f,K) 76 GA+(n) 4
l( 29, 35 GL(n) 3
GL+ (n) 4
det 8 Gx 3
D 45 G 4 x
D,D G 10 G/K 7
A,A G 32 G\~ 3
" 3
Eix 45, 122 r 12
6,6H 10 r (g) 54
'Il 21
fir 69
'"(11) 29 HP,q 26
Hq 8
gl(k) 17 H\G 11
gl(p,n) 8 ':It 21
gl(p,n)t 9
gl+(2) 115 i 125
gl (2) 115 I p,q 16
gp 81 I 1 ,q 7
J 45 a "'/r 69
til 45
:t (~) 28 ~ 74
L 107 q 45
LCD 12
L 10 Rg 11 g
~ 67 * IR+ 4
m,ml( 30, 44 sign 69
Jl (f) 28 sl (n) 118 Jll( 30 seep) 22
Jl//3 46 se(l,q) 23
.M 75 st(2) 118
.M(~) 28 S (n) 58 Sp-1 22
Nq(f ,};) 76 St(p,n) 9
v 12 SL(n) 115, 118 J{ 78 SO(p,q) 16 p
SOl(p,q) 17
e(p) 20 sol (l,q) 7
e(p,q) 20 SO(q) 8
o (n) 5 ST(2) 118
O(p,q) 16 Y' (n) 6
0(1, q) 7 Y'(~) 2
P(q) 25 t+1 (n) 21
PD(n) 38 tr 105
11" (x) 3 T+(n) 34
T+1 (k) 20 1\ 69
TA+(P) 95 77
TMp 68 77
* TM P
68 < > 81
u(x) 4 - 64, 82
-x 5, 84
X+ 78
!: 2
!: 41-42 11"
z(x) 4
C ,CG 56
* (transposition)
* (product) 8
< > 5
II II 5
I I 8
[ , ] (element of GA(n)) 4
[ , ] (Lie multiplication) 17
81 21
18 28
0 29
et (product of measures) 37
et (tensor product) 76
Lecture Notes in Statistics
Vol. 44: D.L. McLeish, Christopher G. Small, The Theory and Applications of Statistical Inference Functions. 136 pages, 1987.
Vol. 45: J.K. Ghosh, Statistical Information and Likelihood. 384 pages, 1988.
Vol. 46: H.-G. Müller, Nonparametric Regression Analysis of Longitudinal Data. VI, 199 pages, 1988.
Vol. 47: A.J. Getson, F.C. Hsuan, {2}-Inverses and Their Statistical Application. VIII, 110 pages, 1988.
Vol. 48: G.L. Bretthorst, Bayesian Spectrum Analysis and Parameter Estimation. XII, 209 pages, 1988.
Vol. 49: S.L. Lauritzen, Extremal Families and Systems of Sufficient Statistics. XV, 268 pages, 1988.
Vol. 50: O.E. Barndorff-Nielsen, Parametric Statistical Models and Likelihood. VII, 276 pages, 1988.
Vol. 51: J. Hüsler, R.-D. Reiss (Eds.), Extreme Value Theory. Proceedings, 1987. X, 279 pages, 1989.
Vol. 52: P.K. Goel, T. Ramalingam, The Matching Methodology: Some Statistical Properties. VIII, 152 pages, 1989.
Vol. 53: B.C. Arnold, N. Balakrishnan, Relations, Bounds and Approximations for Order Statistics. IX, 173 pages, 1989.
Vol. 54: K. R Shah, B. K. Sinha, Theory of Optimal Designs. VIII, 171 pages. 1989.
Vol. 55: L. McDonald, B. Manly, J. Lockwood, J. Logan (Eds.), Estimation and Analysis of Insect Populations. Proceedings, 1988. XIV, 492 pages, 1989.
Vol. 56: J.K. Lindsey, The Analysis of Categorical Data Using GLIM. V, 168 pages. 1989.
Vol. 57: A. Decarli, B.J. Francis, R Gilchrist, G.U.H. Seeber (Eds.), Statistical Modelling. Proceedings, 1989. IX, 343 pages. 1989.
Vol. 58: O. E. Barndorff-Nielsen, P. Blaasild, P. S. Eriksen, Decomposition and Invariance of Measures, and Statistical Transformation Models. V, 147 pages. 1989.