Linear Heteroassociator
David Wallace Croft
2005-02-16
This was an exercise that I worked through with the help of
Dr. Richard M. Golden
in his course Neural Net Mathematics.
Identities
Vector Norm Squared to Dot Product
Vector Derivative
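Written out, these are the standard identities (a sketch; the course's notation may differ, and the row-vector derivative convention is assumed here):

```latex
% Vector norm squared as a dot product
\|\mathbf{x}\|^2 = \mathbf{x}^{\mathsf T}\mathbf{x}

% Vector derivative of the squared norm (numerator/row-vector layout)
\frac{d}{d\mathbf{x}}\left(\mathbf{x}^{\mathsf T}\mathbf{x}\right) = 2\,\mathbf{x}^{\mathsf T}
```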
Linear Heteroassociator
For this exercise, we use a simple neural network with 3 inputs and 2 outputs.
The network will be trained with 4 stimulus patterns.
The 4 3x1 stimulus vectors make a 3x4 matrix.
The 4 2x1 response vectors make a 2x4 matrix.
Together, the response vectors form the combined 2x4 response matrix.
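With hypothetical symbols S for the stimulus matrix and O for the desired-response matrix (the original slides may use different letters), the setup can be written:

```latex
S = \left[\,\mathbf{s}_1\;\;\mathbf{s}_2\;\;\mathbf{s}_3\;\;\mathbf{s}_4\,\right] \in \mathbb{R}^{3\times 4},
\qquad
O = \left[\,\mathbf{o}_1\;\;\mathbf{o}_2\;\;\mathbf{o}_3\;\;\mathbf{o}_4\,\right] \in \mathbb{R}^{2\times 4}
```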
Objective Function
To train the weight matrix, we minimize an objective function.
For a given stimulus pattern (si),
the difference (di)
is the desired output (oi)
minus the actual response (ri).
Minimize the average square error.
Substitute di.
Use the dot product identity.
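Putting these steps together (a sketch, with l the objective, W the 2x3 weight matrix, and ri = W si):

```latex
l(W)
= \frac{1}{4}\sum_{i=1}^{4} \|\mathbf{d}_i\|^2
= \frac{1}{4}\sum_{i=1}^{4} \|\mathbf{o}_i - W\mathbf{s}_i\|^2
= \frac{1}{4}\sum_{i=1}^{4} \mathbf{d}_i^{\mathsf T}\mathbf{d}_i
```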
The 4 2x1 difference vectors form a 2x4 matrix.
Convert the matrix to a vector by stacking columns.
Replace the sum of 4 dot products with one big dot product.
Transpose the weight matrix and convert it to a vector.
Rewrite the objective function as a scalar function of a vector.
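In stacked form (assuming vec(·) stacks columns), the four differences become one 8x1 vector, the transposed weights become one 6x1 vector, and the objective is a scalar function of that weight vector:

```latex
\mathbf{d} = \begin{bmatrix}\mathbf{d}_1\\ \mathbf{d}_2\\ \mathbf{d}_3\\ \mathbf{d}_4\end{bmatrix}\in\mathbb{R}^{8},
\qquad
\mathbf{w} = \operatorname{vec}\!\left(W^{\mathsf T}\right)\in\mathbb{R}^{6},
\qquad
l(\mathbf{w}) = \frac{1}{4}\,\mathbf{d}^{\mathsf T}\mathbf{d}
```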
Gradient Descent
Imagine a blind man in a land with many hills and valleys.
He wants to get to the lowest point in the area.
With each step, he uses his staff to tap around himself
to determine the slope of the land at his current position.
He then takes a step downward.
He eventually reaches the bottom of a valley.
The gradient descent weight update rule with learning rate α.
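A sketch of that rule, in the stacked-weight notation above (the transpose appears because the derivative is taken as a row vector):

```latex
\mathbf{w}(t+1)
= \mathbf{w}(t) - \alpha\left[\frac{dl}{d\mathbf{w}}\bigg|_{\mathbf{w}(t)}\right]^{\mathsf T}
```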
Chain Rule
First Term
Pull the constant out of the derivative.
Use the derivative identity.
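A sketch of the first term, applying both steps to the stacked objective:

```latex
\frac{dl}{d\mathbf{d}}
= \frac{1}{4}\,\frac{d}{d\mathbf{d}}\left(\mathbf{d}^{\mathsf T}\mathbf{d}\right)
= \frac{1}{4}\cdot 2\,\mathbf{d}^{\mathsf T}
= \frac{1}{2}\,\mathbf{d}^{\mathsf T}
```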
Second Term
Response to a Single Stimulus Pattern
Combined Response Vector as a Function of the Weights
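A sketch of the second term and of the response expressions, under the definitions above (d = o - r for the stacked vectors, and ⊗ the Kronecker product):

```latex
% Second term: since d = o - r,
\frac{d\mathbf{d}}{d\mathbf{r}} = -\mathbf{I}_{8}

% Response to a single stimulus pattern, as a function of the weight vector
\mathbf{r}_i = W\mathbf{s}_i
= \left(\mathbf{I}_2 \otimes \mathbf{s}_i^{\mathsf T}\right)\operatorname{vec}\!\left(W^{\mathsf T}\right)
```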
Third Term
Third Term as an 8x6 Matrix
Third Term as a Matrix of Vectors
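Stacking the four per-pattern blocks gives the third term; each block is 2x6, so the whole derivative is the 8x6 matrix named above (a reconstruction consistent with the stated dimensions):

```latex
\frac{d\mathbf{r}}{d\mathbf{w}}
= \begin{bmatrix}
\mathbf{I}_2 \otimes \mathbf{s}_1^{\mathsf T}\\
\mathbf{I}_2 \otimes \mathbf{s}_2^{\mathsf T}\\
\mathbf{I}_2 \otimes \mathbf{s}_3^{\mathsf T}\\
\mathbf{I}_2 \otimes \mathbf{s}_4^{\mathsf T}
\end{bmatrix}
\in \mathbb{R}^{8\times 6}
```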
Combine the three terms.
Move the constants and drop the identity matrix.
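A sketch of the combined gradient, with the constant pulled out and the identity matrix absorbed into the sign:

```latex
\frac{dl}{d\mathbf{w}}
= \frac{dl}{d\mathbf{d}}\,\frac{d\mathbf{d}}{d\mathbf{r}}\,\frac{d\mathbf{r}}{d\mathbf{w}}
= \frac{1}{2}\,\mathbf{d}^{\mathsf T}\left(-\mathbf{I}_8\right)\frac{d\mathbf{r}}{d\mathbf{w}}
= -\frac{1}{2}\,\mathbf{d}^{\mathsf T}\,\frac{d\mathbf{r}}{d\mathbf{w}}
```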
The Weight Update Rule
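Substituting the gradient into the descent rule and folding the stacked vectors back into matrices gives the update W ← W + (α/2)(O − WS)Sᵀ. A minimal NumPy sketch of that rule, using hypothetical data (not the course's actual patterns), with the desired responses built from a known weight matrix so that an exact solution exists:

```python
import numpy as np

# Hypothetical training data: 4 stimulus vectors (3x1) as the
# columns of S, and 4 desired response vectors (2x1) as the columns of O.
S = np.array([[1., 0., 0., 1.],
              [0., 1., 0., 1.],
              [0., 0., 1., 1.]])      # 3x4 stimulus matrix
W_true = np.array([[1., 2., 3.],
                   [4., 5., 6.]])     # hypothetical target weights
O = W_true @ S                        # 2x4 desired-response matrix

W = np.zeros((2, 3))                  # 2x3 weight matrix, initially zero
alpha = 0.5                           # learning rate

for _ in range(200):
    D = O - W @ S                     # 2x4 matrix of difference vectors d_i
    W = W + (alpha / 2.0) * D @ S.T   # gradient descent weight update rule

print(np.max(np.abs(O - W @ S)))      # residual error shrinks toward zero
```

For these patterns the error contracts geometrically at each step, so after a few hundred iterations the learned weights match the target weights to machine precision.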
Links