ch6 ann and ga

7/31/2019 ch6 ann and ga

1/104

1

Neural Networks: Definition

Neural computing is the study of networks of adaptable nodes

which, through a process of learning from task examples, store

experiential knowledge and make it available for use.


2/104

2

What Are Neural Networks?

A computing model, inspired by the mammalian neural system,

composed of many simple, highly interconnected processing

units.

Neural network models are algorithms for cognitive tasks, such as learning and optimization, which are in a loose sense based on

concepts derived from research into the nature of the brain.


3/104

3

What Are Neural Networks?

A neural network model is a directed graph with the following properties:

A state variable $n_i$ is associated with each node $i$.

A real-valued weight $w_{ij}$ is associated with each link from node $i$ to node $j$.

A real-valued bias $\theta_i$ is associated with each node $i$.

A transfer function $f_i(n_j, w_{ij}, \theta_i)$ is defined for each node $i$, which determines the state of node $i$.


4/104

4

What Can ANN Do?

Biological

Modeling the retina

Modeling brain disorders (ADD)

Business

Evaluate probability of oil in geological formation

Identify and filter promotion and job applicants

Mine corporate databases for business rules

Financial

Assessing credit risk

Identify forgeries

Interpret handwritten forms

Predict portfolio and stock values


5/104

5

What Can ANN Do?

Manufacturing

Automated robot control systems

Control material flow

Optimize production lines

Quality inspection

Medical

Analyze speech in hearing aids

Diagnose and prescribe treatment by symptoms

Monitor surgery and recovery

Read X-rays and CAT/PET scans


6/104

6

What Can ANN Do?

Military

Classify radar and sonar signals

Target acquisition and tracking

Analyze intelligence inputs

Optimizing scarce resources

Signal processing

Adaptive Noise Canceling

Zip Code Reader

Speech Recognition


7/104

7

A Brief History

First concepts

Turing 1936

McCulloch & Pitts 1943

Hebb 1949

Early steps 1950s - 1960s

The perceptron

ADALINE and MADALINE

Excessive hype


8/104

8

A Brief History

Stunted growth 1969-1981

Perceptrons by Minsky and Papert

Continued work

Renewed interest

The Hopfield model 1982

Backpropagation rediscovered 1985 (first 1974 by

Werbos)

Radial Basis Functions - Broomhead & Lowe 1988


9/104

9

A Quick Word About The Brain


10/104

10

The Biological Neuron

[Figure: a biological neuron, with labels for the cell body, synapse, dendrites and axon]


11/104

11

Computers And The Brain

We do not understand the brain

The ANN model is only loosely based on the brain

The ANN model is metaphoric to the brain


12/104

12

Computers vs. Neural Networks

Von Neumann Machines         Neural Networks

Few strong processors        ~10^11 simple neurons

Serial processing            Parallel processing

Central control              No central control

10^-9 sec. cycle             10^-3 sec. cycle

Bit data                     Voltage data

Not fault tolerant           Very robust

Fast numeric operations      Slow numeric operations

Slow high-level operations   Fast high-level operations

Learning ?                   Learning !


13/104

13

Building Blocks Of The Model

The processing element

The connections

Learning methods


14/104

14

Processing Element Building Block

The basic building block of a neural network is the

processing element (or node or unit).

A generalised node embodies elements:

inputs(+bias)

weights

transfer function

combining function

activation function

output(s)


15/104

15

The function of a single node

The job of a processing element is to receive a number of

inputs (either from the external world or from other nodes

or from itself) and to distribute a single output (either to

the external world or to other nodes).


16/104

16

Some Input Functions

Weighted Summation

net = w1x1 + w2x2 + ... + wnxn + bias

where wi is the weight associated with the connection

between an input and the processing element


17/104

17

Some Input Functions

Multiplication (or Product)

net = w1x1 * w2x2 * ... * wnxn

similar to the weighted summation but the summation is

replaced by the product

Maximum, Minimum, Majority

net = max (wnxn)

net = min (wnxn)

net = 1 IF (wnxn) > 0 ELSE -1
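
The combining (input) functions listed on these two slides can be sketched in a few lines of Python. This is only an illustration: the function names are hypothetical, the bias defaults to zero, and the majority function is left out.

```python
import math

def weighted_sum(x, w, bias=0.0):
    """net = w1*x1 + w2*x2 + ... + wn*xn + bias"""
    return sum(wi * xi for wi, xi in zip(w, x)) + bias

def product(x, w):
    """net = (w1*x1) * (w2*x2) * ... * (wn*xn)"""
    return math.prod(wi * xi for wi, xi in zip(w, x))

def maximum(x, w):
    """net = largest weighted input"""
    return max(wi * xi for wi, xi in zip(w, x))

def minimum(x, w):
    """net = smallest weighted input"""
    return min(wi * xi for wi, xi in zip(w, x))
```

Each function takes the input vector and the weight vector of one processing element and returns the value that is then passed to the activation function.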


18/104

18

Some Activation Functions

Sigmoid

maps an input into a value between zero and one

Linear

where no transformation is applied to the outcome of the combining function

Tangent

similar to the sigmoid but the mapping is between -1 and

1

Step

where the transfer value equals 1 if the outcome of the combining function is greater than some threshold, otherwise it equals 0


19/104

19

Some Activation Functions


20/104

20

Closer Look At Transfer Functions

Unipolar

Sigmoid

Threshold

Bipolar

Sigmoid

Sign


21/104

21

The Connections

The connections are the only thing changing in neural

networks

Connections may be either inhibitory or excitatory

Connection strengths are expressed by weights


22/104

22

The role of the weights

Each input or node is connected to a processing element

Graphically this is represented by an arc

Each arc has a weight. The weight simply determines the

influence (or strength) of an input to a processing element

Neuro-computing is concerned with identification of the correct set of weights


23/104

23

An example of a single node

Assume a processing element receives 3 inputs: 1, 0.5, 0.3

If the combining function is the weighted summation and the

weights are: -0.2, 0.04, 2.35

then the result of the combining function is

net = 1(-0.2) + 0.5(0.04) + 0.3(2.35) = -0.2 + 0.02 + 0.705 = 0.525


24/104

24


If the activation function is

linear, f(x) = x, then the output is 0.525


25/104

25


If the activation function is

sigmoid, f(x) = 1 / (1 + exp(-x)), then the output is 1 / (1 + exp(-0.525)) = 0.628
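
The single node from this example can be evaluated directly. The sketch below assumes a zero bias, since the slides give none; it computes the weighted sum of the three inputs and then applies the linear and the sigmoid activation in turn.

```python
import math

inputs = [1.0, 0.5, 0.3]
weights = [-0.2, 0.04, 2.35]

# Combining function: weighted summation (bias assumed to be 0).
net = sum(w * x for w, x in zip(weights, inputs))

# Linear activation passes the net input through unchanged.
linear_output = net

# Sigmoid activation squashes the net input into (0, 1).
sigmoid_output = 1.0 / (1.0 + math.exp(-net))

print(round(net, 3), round(linear_output, 3), round(sigmoid_output, 3))
```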


26/104

26

Neural Networks Layers

NN can be constructed using a number of processing

elements

Rather than a chaotic construction it is generally preferable to build neural networks using layers

A neural network will have an input layer, an output layer and, in between, zero, one or more hidden layers


27/104

27

Neural Network Layers 2

Depending on where a processing element is placed, it is

categorised as an input, hidden or output processing

element

Typically, but not necessarily, each processing element in a layer has the same transfer function

A NN with a 4-3-2 configuration is a 2- or 3-layer NN (depending on whether the input layer is counted) with 4 input nodes, 3 hidden nodes and 2 output nodes


28/104

28

The Role of the Input Layer

An input processing element receives input from the external

world and simply sends the actual input to the processing

elements of the next layer


29/104

29

The Role of the Hidden Layer

A hidden processing element receives its input from the

nodes of the previous layer and the transformation of the

input is sent to the next layer

A hidden layer may be seen as a pre-processor


30/104

30

The Role of the Output Layer

An output processing element delivers to the external world the representation of the original input after the transformations have taken place


31/104

31

Connectivity Matters

A number of different networks can be constructed; they differ in terms of the connectivity pattern and the number of layers

Networks with no hidden layers are called single-layer networks

Networks with one or more hidden layers are called multi-layer networks

If all connections lead from input to output, the network is called a feed-forward network

If there are connections in the opposite direction, it is called a feedback or recurrent network


32/104

32

Artificial Neural Networks Models

[Figure: three network models: single-layer feedforward, multi-layer feedforward, and recurrent]


33/104

33

Calculations of a multi-layer feed-forward

neural network

[Figure: a multi-layer feed-forward network with nodes x1 to x5; weight and bias labels in the figure: +1, +1, 1.5, -1, 0.5, +1, +1, 0.5, +1]


34/104

34

Learning Laws

As we saw on the previous slide, the output with the current weights is wrong if we want to perform AND.

This brings us to the problem of finding the correct set of weights

The process of identifying the correct set of weights is called the learning process, and it is characterised by a learning law


35/104

35

Learning Laws 2

The purpose of a learning law is to locate the set of weights

which will give correct answers for all the inputs

The learning is achieved by employing an algorithm which iteratively changes the weights of the connections in

response to every set of inputs until the correct weights

have been located


36/104

36

Learning Laws 3

Most learning laws are based on Hebb's rule, which states that

if two units are simultaneously active, increase the strength of the connection between them

This rule is the basis for most learning laws used today (Kohonen learning, Boltzmann learning, Delta rule)


37/104

37

Some Learning Rules

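Hebbian learning rule: $\Delta w_{ij} = c\, f(w_i^t x)\, x_j$

Perceptron learning rule: $\Delta w_{ij} = c\,[d_i - \mathrm{sgn}(w_i^t x)]\, x_j$

Delta learning rule: $\Delta w_{ij} = c\,(d_i - o_i)\, f'(net_i)\, x_j$

Widrow-Hoff learning rule: $\Delta w_{ij} = c\,(d_i - w_i^t x)\, x_j$

where $c$ is the learning constant, $x$ the input vector, $w_i$ the weight vector of node $i$, $d_i$ its desired output and $o_i = f(net_i)$ its actual output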
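
The four learning rules named on this slide can be written as small functions that return the weight change for one node. This is a sketch in their usual textbook form, not necessarily the exact formulation intended by the slides; the function and parameter names (`c` for the learning constant, `d` for the desired output) are assumptions made for illustration.

```python
def dot(w, x):
    return sum(wi * xi for wi, xi in zip(w, x))

def sgn(v):
    return 1.0 if v > 0 else -1.0

def hebbian(w, x, c, f):
    """delta_w_j = c * f(w.x) * x_j (no teacher signal)."""
    out = f(dot(w, x))
    return [c * out * xj for xj in x]

def perceptron(w, x, d, c):
    """delta_w_j = c * (d - sgn(w.x)) * x_j"""
    err = d - sgn(dot(w, x))
    return [c * err * xj for xj in x]

def delta(w, x, d, c, f, f_prime):
    """delta_w_j = c * (d - f(net)) * f'(net) * x_j, with net = w.x"""
    net = dot(w, x)
    return [c * (d - f(net)) * f_prime(net) * xj for xj in x]

def widrow_hoff(w, x, d, c):
    """delta_w_j = c * (d - w.x) * x_j (delta rule with a linear node)."""
    err = d - dot(w, x)
    return [c * err * xj for xj in x]
```

In every case the new weights are obtained by adding the returned change to the current weight vector. Only the Hebbian rule is unsupervised; the other three need the desired output `d`.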


38/104

38

Learning Methods

Supervised approach

a neural network is given a set of inputs and also the

correct output


39/104

39

Learning Methods 2

Unsupervised approach

a neural network is given a set of inputs and no outputs.

The network attempts to generate its own classes


40/104

40

Learning Methods 3

Reinforcement approach

a neural network is given a set of inputs and no outputs.

The network generates an output and only then it is

told if the produced output was correct or not

Learn by doing


41/104

41

Single-Layer Perceptrons

Network architecture

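[Figure: a single-layer perceptron with inputs x1, x2, x3, weights w1, w2, w3 and bias weight w0]

y = signum(net) or y = step(net)

$net = \sum_{i=1}^{n} x_i w_i - \theta = \sum_{i=1}^{n} x_i w_i + w_0$, where $w_0 = -\theta$

$net = \sum_{i=0}^{n} x_i w_i$, where the sum now starts at $i = 0$ with $x_0 = 1$

Signum(net) = 1 if net > 0, else -1

Step(net) = 1 if net > 0, else 0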


42/104

42

Example I - The AND Function

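[Figure: a perceptron with inputs X1 and X2, weights W1 and W2, bias weight W0 and output O; values shown in the figure: 1, 1, 2]

1,1 ---> 1

rest ---> 0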


43/104

43

Single-Layer Perceptrons

If the response is correct, no modification takes place; otherwise the weights are adjusted by the perceptron learning rule:

$\Delta w_{ij} = c\,[d_i - \mathrm{sgn}(w_i^t x)]\, x_j$

An entire pass through all of the input training vectors is called an epoch. When such an entire pass of the training set has occurred without error, training is complete.
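
The training procedure described on this slide can be sketched as follows, here applied to the AND function. The sketch assumes bipolar targets (+1 / -1) to match the signum output, a bias input fixed at 1, zero initial weights and a learning constant of 0.5; none of these values come from the slides.

```python
def sgn(net):
    return 1 if net > 0 else -1

def train_perceptron(samples, c=0.5, max_epochs=100):
    """Return (weights, epochs_used); weights[0] is the bias weight w0."""
    n = len(samples[0][0])
    w = [0.0] * (n + 1)
    for epoch in range(1, max_epochs + 1):
        errors = 0
        for x, d in samples:
            xb = [1.0] + list(x)              # x0 = 1 carries the bias
            y = sgn(sum(wi * xi for wi, xi in zip(w, xb)))
            if y != d:                        # modify only on a wrong response
                errors += 1
                w = [wi + c * (d - y) * xi for wi, xi in zip(w, xb)]
        if errors == 0:                       # an epoch without error: done
            return w, epoch
    return w, None                            # did not converge in time

AND_SAMPLES = [((0, 0), -1), ((0, 1), -1), ((1, 0), -1), ((1, 1), 1)]
weights, epochs_used = train_perceptron(AND_SAMPLES)
print(weights, epochs_used)
```

`train_perceptron` returns `None` for the epoch count when the epoch limit is reached, which is what happens for data that is not linearly separable.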


44/104

44

Limitations

Perceptron networks have several limitations.

First, the output values of a perceptron can take on only one

of two values (True or False).

Second, perceptrons can only classify linearly separable sets of vectors. If a straight line or plane can be drawn to

separate the input vectors into their correct categories, the

input vectors are linearly separable and the perceptron will

find the solution. If the vectors are not linearly separable

learning will never reach a point where all vectors are

classified properly.

The most famous example is the boolean XOR problem.


45/104

45

The XOR problem

In the 1960s perceptrons created a great deal of interest, until M. Minsky and S. Papert ("Perceptrons", MIT Press, Cambridge, MA, 1969) showed that

single-layer perceptrons can only be used for toy problems, since they cannot represent a simple XOR function


46/104

46

The XOR problem 2

The task is to assign a binary input vector to class 0 if the vector has an even number of 1s, and to class 1 otherwise.

A two-input binary XOR truth table:

0 0 0

0 1 1

1 0 1

1 1 0


47/104

47

The XOR problem 3

Recall that the output of a perceptron is given as follows:

1 if the weighted input is greater than 0

0 otherwise

The first input of XOR is 0 0 with desired output as 0

hence the weighted input must be less than or equal to zero

in order to get the desired output

0*w1 + 0*w2 + 1*w0 <= 0

w0 <= 0


48/104

48

The XOR problem 4

The second input of XOR is 0 1 with desired output as 1

hence the weighted input must be greater than zero in

order to get the desired output

0*w1 + 1*w2 + 1*w0 > 0

w2 + w0 > 0


49/104

49

The XOR problem 5

The third input of XOR is 1 0 with desired output as 1

hence the weighted input must be greater than zero in

order to get the desired output

1*w1 + 0*w2 + 1*w0 > 0

w1 + w0 > 0


50/104

50

The XOR problem 6

The fourth input of XOR is 1 1 with desired output as 0

hence the weighted input must be less than or equal to zero

in order to get the desired output

1*w1 + 1*w2 + 1*w0 <= 0

w1 + w2 + w0 <= 0


51/104

51

The XOR problem 7

In summary, the perceptron must satisfy the following four inequalities:

w0 <= 0

w2 + w0 > 0

w1 + w0 > 0

w1 + w2 + w0 <= 0

The first inequality tells us that w0 must be less than or equal to zero. For the 2nd and 3rd to hold, w2 and w1 must therefore both be positive, with w2 > -w0 and w1 > -w0. Adding these gives w1 + w2 + w0 > -w0 >= 0, which contradicts the 4th inequality, which says that this sum must be negative or zero
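
The contradiction can also be illustrated numerically. The sketch below searches a coarse grid of weights for a step-output perceptron that reproduces a given truth table. The grid and the function names are arbitrary choices for illustration, and a grid search is of course not a proof; it only shows that AND is found immediately while XOR is not found at all.

```python
from itertools import product

def step(net):
    return 1 if net > 0 else 0

def find_weights(truth_table, grid):
    """Return the first (w0, w1, w2) on the grid that realises the table."""
    for w0, w1, w2 in product(grid, repeat=3):
        if all(step(w1 * x1 + w2 * x2 + w0) == target
               for (x1, x2), target in truth_table.items()):
            return (w0, w1, w2)
    return None

GRID = [i / 2 for i in range(-8, 9)]          # -4.0, -3.5, ..., 4.0
AND = {(0, 0): 0, (0, 1): 0, (1, 0): 0, (1, 1): 1}
XOR = {(0, 0): 0, (0, 1): 1, (1, 0): 1, (1, 1): 0}

print("AND:", find_weights(AND, GRID))
print("XOR:", find_weights(XOR, GRID))
```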


52/104

52

Linear Separability

For binary inputs and outputs using the step function the

output is 1 if the net input is positive and 0 if the net input

is negative

net_input = 0: for two-inputs this equation represents a

line

If there are weights so that all of the training input vectors for which the correct response is +1 lie on one side of

the decision line and all of the training input vectors for

which the correct response is 0 lie on the other side of

the boundary then the problem is linearly separable


53/104

53

Linear Separability


54/104

54

The XOR problem 8

The XOR problem is not linearly separable

We can not use a single-layer perceptron to construct a

straight line to partition the two dimensional input

space into two regions, each containing only data

points of the same class

[Figure: the four XOR input points plotted on X and Y axes running from 0 to 1, each labelled with its class (0 or 1)]


55/104

55

Multi-Layer Perceptrons

The lack of suitable training methods for multi-layer

perceptrons (MLPs) led to a waning of interest until the

reformulation of the backpropagation training method

Previous work used signum or step activation functions, which are nondifferentiable; now continuous activation

functions are employed


56/104

56

Multi-Layer Perceptrons 2

All nodes (or neurons) perform the same function on

incoming signals

a composite of the weighted sum and a differentiable nonlinear activation function, together known as the transfer function


57/104

57

Multi Layer Feedforward Networks

The layers that are neither input nor output are called hidden

layers

Hidden layers extract high order statistics and in a way provide an overall view of the input data

The output of each layer is used as input to the next layer

There is no theoretical limit on connections between non-neighboring layers


58/104

58

MLP Architecture 2-2-1

[Figure: a 2-2-1 MLP with an input level (x1, x2), an intermediate (hidden) level (h1, h2) and an output level (y)]
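
A 2-2-1 network of this shape is enough to represent XOR, which a single-layer perceptron cannot. The sketch below uses hand-picked weights and step activations purely for illustration; the weights are an assumption, not values from the slides. Hidden node h1 acts as OR, h2 as AND, and the output fires when h1 is on and h2 is off.

```python
def step(net):
    return 1 if net > 0 else 0

def mlp_2_2_1(x1, x2):
    h1 = step(1.0 * x1 + 1.0 * x2 - 0.5)    # OR of the inputs
    h2 = step(1.0 * x1 + 1.0 * x2 - 1.5)    # AND of the inputs
    return step(1.0 * h1 - 1.0 * h2 - 0.5)  # h1 AND NOT h2

for x1 in (0, 1):
    for x2 in (0, 1):
        print(x1, x2, mlp_2_2_1(x1, x2))
```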


59/104

59

Activation Functions

Logistic function

$f(net) = 1 / (1 + e^{-net})$

Hyperbolic tangent function

$f(net) = \tanh(net/2) = (1 - e^{-net}) / (1 + e^{-net}) = 2 / (1 + e^{-net}) - 1 = (e^{net/2} - e^{-net/2}) / (e^{net/2} + e^{-net/2})$

Identity function

$f(net) = net$

where net is the weighted input


60/104

60

Activation Functions 2

Logistic and hyperbolic tangent functions

approximate the step and signum functions respectively, but they provide smooth, non-zero derivatives with respect to the input signals

referred to as squashing functions since the inputs to these functions are squashed to the range [0,1] or [-1,1]

referred to as sigmoidal functions because of their S-

shaped curves

the hyperbolic is sometimes referred to as the bipolar

sigmoidal

the logistic is sometimes referred to as the binary

sigmoidal


61/104

61

Activation Functions Graphs

[Figure: plots of the logistic function and the hyperbolic tangent function]


62/104

62

Identity Activation Function

Identity function

it is usually employed for nodes of the output layer to approximate a continuous valued function not limited to [0,1] or [-1,1]

such nodes are referred to as the linear nodes

[Figure: plot of the identity function]


63/104

63

Binary and Bipolar Sigmoid Derivatives

Binary sigmoid: $f(net) = 1 / (1 + e^{-net})$

$f'(net) = f(net)\,[1 - f(net)]$

Bipolar sigmoid: $f(net) = 2 / (1 + e^{-net}) - 1$

$f'(net) = 0.5\,[1 + f(net)]\,[1 - f(net)]$
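
Both derivative formulas can be checked numerically. The sketch below implements the binary and bipolar sigmoids with their closed-form derivatives and compares them against a central finite difference; the function names and the step size `h` are illustrative choices.

```python
import math

def binary_sigmoid(net):
    return 1.0 / (1.0 + math.exp(-net))

def binary_sigmoid_prime(net):
    f = binary_sigmoid(net)
    return f * (1.0 - f)

def bipolar_sigmoid(net):
    return 2.0 / (1.0 + math.exp(-net)) - 1.0

def bipolar_sigmoid_prime(net):
    f = bipolar_sigmoid(net)
    return 0.5 * (1.0 + f) * (1.0 - f)

def numeric_derivative(f, net, h=1e-6):
    """Central finite-difference approximation of f'(net)."""
    return (f(net + h) - f(net - h)) / (2.0 * h)

for net in (-2.0, 0.0, 1.5):
    print(net,
          binary_sigmoid_prime(net) - numeric_derivative(binary_sigmoid, net),
          bipolar_sigmoid_prime(net) - numeric_derivative(bipolar_sigmoid, net))
```

The printed differences should be close to zero. Being able to express each derivative in terms of the function value itself is what makes these activations cheap to use in backpropagation.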


64/104

64

Learning

Learning target:

minimize the difference between actual outputs and target outputs

Learning rule:

Steepest descent (Back-propagation)

Conjugate gradient method

All optimization methods using first derivative

Derivative-free optimization


65/104

65

MLP and the backpropagation algorithm


66/104

66


67/104

67


68/104

68

MLP and the backpropagation algorithm

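[Figure: a three-layer network with input layer (x_k), hidden layer (h_i) and output layer (o_j), connected by weights w_ki and w_ij; the signal propagates forward and the error, measured against the desired output y_j, propagates backward]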


69/104

69

Backpropagation Algorithm

0 Initialise Weights

1 While Stopping condition is false, do steps 2 to 9


70/104

70

Backpropagation Algorithm 2

2 For each training pair, do steps 3 to 8

Feedforward pass

3 Each input unit receives an input signal and broadcasts this signal to all units in the layer above (the hidden units)

4 Each hidden unit sums its weighted input signals, applies its activation function to compute its output signal and sends this signal to all units in the layer above (output units)

5 Each output unit sums its weighted input signals and

applies its activation function to compute its output signal

End of Feedforward Pass


71/104

71


Backward Pass

6 Each output unit receives a target pattern corresponding to the input training pattern, computes its error information term, calculates its weight and bias correction term, and sends its error information term to units in the layer below

7 Each hidden unit sums its error information terms (from units in the layer above), multiplies by the derivative of its activation function to calculate its error information term, and calculates its weight and bias correction term

End of Backward pass


72/104

72


Updating Pass

8 Each output unit updates its bias and weights. Each

hidden unit updates its bias and weights.

End of Updating pass

9 Test stopping criterion
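
Steps 0 to 9 can be sketched for a network with one hidden layer and a single output unit. This is a minimal illustration under several assumptions that the slides leave open: logistic activations everywhere, per-pattern (online) updates, a fixed number of epochs as the stopping condition, and initial weights drawn uniformly from [-1, 1]. The function name and parameters are hypothetical.

```python
import math
import random

def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))

def train_backprop(samples, n_hidden=2, rate=0.5, epochs=2000, seed=0):
    """Return (hidden_weights, output_weights, error_per_epoch)."""
    rng = random.Random(seed)
    n_in = len(samples[0][0])
    # Step 0: initialise weights; the last entry of each row is the bias.
    w_hid = [[rng.uniform(-1, 1) for _ in range(n_in + 1)]
             for _ in range(n_hidden)]
    w_out = [rng.uniform(-1, 1) for _ in range(n_hidden + 1)]
    history = []
    for _ in range(epochs):                        # step 1
        total = 0.0
        for x, target in samples:                  # step 2
            # Steps 3-5: feedforward pass.
            xb = list(x) + [1.0]
            h = [sigmoid(sum(w * v for w, v in zip(row, xb))) for row in w_hid]
            hb = h + [1.0]
            o = sigmoid(sum(w * v for w, v in zip(w_out, hb)))
            total += 0.5 * (target - o) ** 2
            # Step 6: error term of the output unit.
            delta_o = (target - o) * o * (1.0 - o)
            # Step 7: error terms of the hidden units (old output weights).
            delta_h = [delta_o * w_out[j] * h[j] * (1.0 - h[j])
                       for j in range(n_hidden)]
            # Step 8: update output and hidden weights and biases.
            for j in range(n_hidden + 1):
                w_out[j] += rate * delta_o * hb[j]
            for j in range(n_hidden):
                for k in range(n_in + 1):
                    w_hid[j][k] += rate * delta_h[j] * xb[k]
        history.append(total)                      # step 9 would test this
    return w_hid, w_out, history

XOR_SAMPLES = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]
```

Calling `train_backprop(XOR_SAMPLES)` returns the trained weights and the summed squared error of each epoch. Whether XOR is actually learned depends on the initial weights, since steepest descent can settle in a local minimum.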


73/104

73



74/104

74

Problems

How to determine the architecture?

How to determine the parameters?

How to get global optima?

... ...


75/104

75

GA and ANN

Three levels:

connection weights: introduce an adaptive and global approach to training

architectures: adapt the topologies to different tasks without human intervention and thus provide an approach to automatic ANN design, as both ANN connection weights and structures can be evolved

learning rules: learning to learn, an adaptive process of automatic discovery of novel learning rules


76/104

76

Evolution of connection weights

Weight training in ANNs is usually formulated as

minimization of an error function, such as the mean

square error between target and actual outputs averaged

over all examples, by iteratively adjusting connection weights.

BP often gets trapped in a local minimum of the error

function and is incapable of finding a global minimum if the

error function is multimodal and/or nondifferentiable.

GA can be used effectively in the evolution to find a near-optimal set of connection weights globally without

computing gradient information.


77/104

77

Typical cycle of the evolution of the

connection weights

1 Decode each individual in the current generation into a set

of connection weights and construct a corresponding ANN

with the weights

2 Evaluate each ANN by computing its total mean square error between actual and target outputs. The fitness of an

individual is determined by the error. A regularization term

may be included in the fitness function to penalize large

weights.

3 Select parents for reproduction based on their fitness

4 Apply genetic operators, such as crossover and mutation,

to parents to generate offspring, which form the next

generation
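
The four-step cycle can be sketched for a fixed 2-2-1 network whose nine weights form a real-valued chromosome. Everything beyond the cycle itself is an assumption made for illustration: tournament selection, one-point crossover, Gaussian mutation, the population size, and the elitism step that copies the best individual unchanged into the next generation.

```python
import math
import random

XOR = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 0)]

def run_network(genes, x):
    """Step 1: decode 9 genes into a 2-2-1 network and compute its output."""
    sig = lambda v: 1.0 / (1.0 + math.exp(-v))
    h1 = sig(genes[0] * x[0] + genes[1] * x[1] + genes[2])
    h2 = sig(genes[3] * x[0] + genes[4] * x[1] + genes[5])
    return sig(genes[6] * h1 + genes[7] * h2 + genes[8])

def mse(genes, samples=XOR):
    """Step 2: mean square error; lower error means higher fitness."""
    return sum((t - run_network(genes, x)) ** 2 for x, t in samples) / len(samples)

def evolve(pop_size=30, generations=100, seed=0):
    rng = random.Random(seed)
    pop = [[rng.uniform(-1, 1) for _ in range(9)] for _ in range(pop_size)]
    best_errors = []
    for _ in range(generations):
        pop.sort(key=mse)
        best_errors.append(mse(pop[0]))
        children = [pop[0][:]]                    # elitism (an assumption)
        while len(children) < pop_size:
            # Step 3: select parents by 3-way tournament.
            p1 = min(rng.sample(pop, 3), key=mse)
            p2 = min(rng.sample(pop, 3), key=mse)
            # Step 4: one-point crossover followed by Gaussian mutation.
            cut = rng.randrange(1, 9)
            child = p1[:cut] + p2[cut:]
            child = [g + rng.gauss(0, 0.5) if rng.random() < 0.1 else g
                     for g in child]
            children.append(child)
        pop = children
    return min(pop, key=mse), best_errors
```

`evolve()` returns the best chromosome found and the best error of each generation. No gradient is computed anywhere, so the same loop would work with a nondifferentiable error function.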


78/104

78

Representation

Binary or real number

Put connection weights to the same node together. Nodes in ANN are in essence feature extractors and detectors. Separating inputs to the same node far apart would increase the difficulty of constructing useful feature detectors because they might be destroyed by crossover operators.

Permutation problem: the mapping from the representation to the actual ANN is many-to-one, since two ANNs that order their hidden nodes differently in their chromosomes will still be functionally equivalent. This makes the crossover operator very inefficient in producing good offspring.
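
The permutation problem is easy to reproduce. In the sketch below, two chromosomes describe the same 2-2-1 network with its hidden nodes listed in opposite order: the chromosomes differ, yet the networks compute the same function. The weight values and the chromosome layout are arbitrary illustrative choices.

```python
import math

def sig(v):
    return 1.0 / (1.0 + math.exp(-v))

def forward(hidden, output, x):
    """hidden: one (w1, w2, bias) tuple per hidden node.
    output: one weight per hidden node, followed by the output bias."""
    h = [sig(w1 * x[0] + w2 * x[1] + b) for w1, w2, b in hidden]
    return sig(sum(v * hj for v, hj in zip(output, h)) + output[-1])

# Network A, and network B with its two hidden nodes swapped.
hidden_a, output_a = [(1.0, -2.0, 0.5), (0.3, 0.8, -1.0)], (1.5, -0.7, 0.2)
hidden_b, output_b = [(0.3, 0.8, -1.0), (1.0, -2.0, 0.5)], (-0.7, 1.5, 0.2)

# Chromosomes: hidden-node weights grouped per node, then output weights.
chromosome_a = [g for node in hidden_a for g in node] + list(output_a)
chromosome_b = [g for node in hidden_b for g in node] + list(output_b)

print(chromosome_a != chromosome_b)
for x in [(0, 0), (0, 1), (1, 0), (1, 1)]:
    print(x, forward(hidden_a, output_a, x), forward(hidden_b, output_b, x))
```

Crossing `chromosome_a` with `chromosome_b` at the midpoint would give a child with two copies of one hidden node and none of the other, which is why crossover tends to be disruptive here.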


79/104

79


80/104

80

Comparison between GA and BP

GA can handle the global search problem better. It can be

used to train many different networks regardless of their

architecture and saves a lot of human efforts in

developing different training algorithm for different types of

ANN.

GA makes it easier to generate ANN with some special

characteristics.

GA is much less sensitive to initial conditions of training.

There is no clear winner in terms of the best training

algorithm.


81/104

81

Hybrid training

Combine GA's global search ability with local search's ability to fine tune. GA can be used to locate a good region in the space and then a local search procedure is used to find a near-optimal solution in this region.


82/104

82

The evolution of architecture

The architecture of an ANN includes its topological structure,

i.e., connectivity, and the transfer function of each node in

the ANN.

The architecture has significant impact on a network's information processing capabilities. Given a learning task, an ANN with only a few connections and linear nodes may not be able to perform the task at all due to its limited capability, while an ANN with a large number of connections and nonlinear nodes may overfit noise in the training data and fail to have good generalization ability.


83/104

83

Traditional way to design the architecture

There is no systematic way to design a near-optimal

architecture for a given task automatically.

A constructive algorithm starts with a minimal network (network with minimal number of hidden layers, nodes and connections) and adds new layers, nodes and connections when necessary during training.

A destructive algorithm starts with a maximal network (network with maximal number of hidden layers, nodes and connections) and deletes unnecessary layers, nodes and connections during training.

Such structural hill climbing methods are susceptible to

becoming trapped at structural local optima. They only

investigate restricted topological subsets rather than the

complete class of network architecture.


84/104

84

Typical cycle of the evolution of

architecture

1 Decode each individual in the current generation into an

architecture.

2 Train each ANN with the decoded architecture by a

predefined learning rule starting from different sets of random initial connection weights and learning rule

parameters.

3 Compute the fitness of each individual according to the

above training result and other performance criteria such

as the complexity of the architecture.

4 Select parents from the population based on their fitness.

5 Apply search operators to the parents and generate

offspring which form the next generation.


85/104

85

The direct encoding scheme

An N x N matrix C = (c(i,j)) can represent an ANN architecture with N nodes, where c(i,j) indicates presence or absence of the connection from node i to node j.

Such an encoding scheme can handle both feedforward and recurrent ANNs.
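
A direct encoding can be sketched as a 0/1 connectivity matrix whose rows are concatenated into the chromosome. The example below also checks whether a matrix describes a feedforward network by testing for directed cycles. The 5-node layout (two inputs, two hidden nodes, one output) and the helper names are assumptions for illustration.

```python
def is_feedforward(c):
    """True if connectivity matrix c (c[i][j] = 1: link i -> j) has no cycle."""
    n = len(c)
    indegree = [sum(c[i][j] for i in range(n)) for j in range(n)]
    ready = [j for j in range(n) if indegree[j] == 0]
    visited = 0
    while ready:                       # repeatedly remove nodes with no inputs
        i = ready.pop()
        visited += 1
        for j in range(n):
            if c[i][j]:
                indegree[j] -= 1
                if indegree[j] == 0:
                    ready.append(j)
    return visited == n                # nodes left over lie on a cycle

def to_chromosome(c):
    """Concatenate the rows of the matrix into one bit string."""
    return "".join(str(bit) for row in c for bit in row)

# Nodes 0, 1: inputs; nodes 2, 3: hidden; node 4: output.
feedforward = [
    [0, 0, 1, 1, 0],
    [0, 0, 1, 1, 0],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 1],
    [0, 0, 0, 0, 0],
]
recurrent = [row[:] for row in feedforward]
recurrent[4][2] = 1                    # feedback link from output to hidden

print(to_chromosome(feedforward), is_feedforward(feedforward))
print(to_chromosome(recurrent), is_feedforward(recurrent))
```

The chromosome length grows as N squared, which is the cost noted for large networks.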


86/104

86

A feedforward ANN


87/104

87

A recurrent ANN


88/104

88

Notes about direct encoding scheme

It is straightforward to implement.

Training error, training time, complexity can be used in the

fitness function

A large ANN would require a very large matrix and thus increase the computation time of the evolution. Domain

knowledge can be used to reduce the search space

The permutation problem still exists


89/104

89

The indirect encoding scheme

Only some characteristics of an architecture are encoded to

reduce the length of the chromosome. The details of

each connection in an ANN are either predefined according

to prior knowledge or specified by a set of deterministic

development rules.


90/104

90

Parametric representation

ANN architectures may be specified by a set of parameters

such as the number of hidden layers, the number of

hidden nodes in each layer, the number of connections

between two layers, etc.

In general the parametric representation method will be most

suitable when we know what kind of architectures we are

trying to find.


91/104

91

Example of pattern recognition

Input   Output    Input   Output

0000    00        0100    00

1100    00        1000    00

1001    01        0001    01

1101    01        0101    01

0010    11        1010    11

0110    11        1110    11

0011    10        0111    10

1011    10        1111    10

In fact the first two bits of the input are noise and the output

is the Gray code of the last two bits of the input.
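
The target mapping can be stated in two lines of code: ignore the first two input bits and Gray-code the last two. The sketch uses the standard binary-to-Gray conversion `b XOR (b >> 1)`; the function names are illustrative.

```python
def gray(bits):
    """Gray code of a bit string, computed as b XOR (b >> 1)."""
    b = int(bits, 2)
    return format(b ^ (b >> 1), "0{}b".format(len(bits)))

def target_output(pattern):
    """Expected output for a 4-bit input: Gray code of its last two bits."""
    return gray(pattern[-2:])

for pattern in ("0000", "1101", "0110", "1011"):
    print(pattern, target_output(pattern))
```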


92/104

92

Chromosome

We use a 16-bit chromosome

The first 2 bits stand for the learning rate: 0.5, 0.25, 0.125, 0.0625

The next 2 bits stand for the momentum: 0.9, 0.8, 0.7, 0.6

The next 2 bits stand for the range of the initial weights: 1, 0.5, 0.25, 0.125

The next 5 bits are used for the 1st hidden layer: the first bit indicates whether the layer is present and the other 4 bits give its number of hidden units.

The last 5 bits are used for the 2nd hidden layer: the first bit indicates whether the layer is present and the other 4 bits give its number of hidden units.
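
Decoding such a chromosome is a matter of slicing the bit string. The sketch below makes two assumptions that the slide does not state: each 2-bit field indexes the listed values in the order given (00 selects the first value), and the 4-bit unit count is stored as count minus one, so that 1 to 16 units can be expressed.

```python
LEARNING_RATES = [0.5, 0.25, 0.125, 0.0625]
MOMENTA = [0.9, 0.8, 0.7, 0.6]
WEIGHT_RANGES = [1.0, 0.5, 0.25, 0.125]

def decode_layer(bits5):
    """First bit: layer present? Remaining 4 bits: units - 1 (assumed)."""
    if bits5[0] == "0":
        return 0
    return int(bits5[1:], 2) + 1

def decode(chromosome):
    if len(chromosome) != 16 or set(chromosome) - {"0", "1"}:
        raise ValueError("expected a string of 16 bits")
    return {
        "learning_rate": LEARNING_RATES[int(chromosome[0:2], 2)],
        "momentum": MOMENTA[int(chromosome[2:4], 2)],
        "weight_range": WEIGHT_RANGES[int(chromosome[4:6], 2)],
        "hidden1_units": decode_layer(chromosome[6:11]),
        "hidden2_units": decode_layer(chromosome[11:16]),
    }

print(decode("0001101000010011"))
```

A hidden-layer size of 0 means the layer is absent, so a chromosome whose two presence bits are both 0 describes a network with no hidden layer.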


93/104

93

Evolution and result

Only the first 8 samples are used for evolution.

7 of these 8 samples are used for training the ANN and the remaining one is used to compute the fitness.

Finally we get a 4-1-4-2 ANN (structure and weights).

To check the final result we use the other 8 samples and compare with a 4-16-16-2 ANN trained by BP.


94/104

94

Developmental rule representation

Development rules, which are used to construct architectures,

are encoded in chromosomes.

A development rule is usually described by a recursive

equation or a production system.

How to get such a set of rules to construct an ANN? One

answer is to evolve them. We can encode the whole rule

set as an individual (Pittsburgh approach) or encode each

rule as an individual (Michigan approach)


95/104

95

Examples of some development rules


96/104

96

Development of an ANN architecture



97/104

97

Simultaneous evolution of architectures &

weights


98/104

98

Evolution of learning rules

An ANN training algorithm may have different performance

when applied to different architectures. The design of

training rules, more fundamentally the learning rules used

to adjust weights, depends on the type of architectures

under investigation. Different variants of the Hebbian

learning rule have been proposed to deal with different

architectures. It is desirable to develop an automatic and

systematic way to adapt the learning rule to an

architecture and the task to be performed. Designing a learning rule manually often implies that some

assumptions, which are not necessarily true in practice,

have to be made.



99/104

99

Typical cycle of the evolution of learning

rule

1 Decode each individual in the current generation into a

learning rule

2 Construct a set of ANNs with randomly generated

architectures and initial connection weights, and train them using the decoded learning rule.

3 Calculate the fitness of each individual according to the

average training result

4 Select parents from the current generation according to

their fitness

5 Apply search operators to parents to generate offspring

which form the new generation


100/104

100

Evolution of algorithm parameters

The adaptive adjustment of BP's parameters through evolution could be considered the first attempt at the evolution of learning rules.

Some researchers used a GA process to find parameters for BP, but the ANN's architecture was predefined. The parameters evolved in this case tend to be optimized towards the architecture rather than being generally applicable to learning.

Some researchers encoded BP's parameters in chromosomes together with the ANN's architecture.


101/104

101

Evolution of learning rules

The evolution of learning rules has to work on the dynamic

behavior of an ANN.

Trying to develop a universal representation scheme which can specify any kind of dynamic behavior is clearly impractical.

Two basic assumptions which have often been made on learning rules are: 1) weight updating depends only on local information such as the activation of the input node, the activation of the output node, the current connection weight, etc.; 2) the learning rule is the same for all connections in an ANN


102/104

102

Learning rule

A learning rule can be described by the following function

There are three major issues involved in the evolution of

learning rules: 1) determination of a subset of terms

described in the above equation; 2) representation of the

coefficients as chromosomes, and 3) the GA used to

evolve these chromosomes.


103/104

103

Other combination between GA and ANN

Evolution of input features: finding a near-optimal set of input

features to an ANN

ANN as fitness estimator: the time-consuming fitness

evaluation based on real systems is replaced by fast fitness evaluation based on ANN

Evolving ANN ensembles: combining different individuals in

the population to form an integrated system is expected to

produce better results.

A general framework for GA and ANN

