In this tutorial we shall implement our first learning algorithm, namely a single neuron. The celebrated artificial neural networks (ANNs) are built up of numerous neurons, so this tutorial is the first step.
Marsland discusses the perceptron as a collection of individual neurons. Observe that the neurons are completely independent. Each one receives the same input vector, but produces its own scalar output. Combining several neurons into one perceptron gives us vector output.
In this problem we will implement only a single neuron with the perceptron training algorithm. Building a multi-neuron perceptron is a relatively simple extension, which we consider later.
Marsland Chapter 3 gives all the details you need to understand the neuron and the perceptron.
At the conceptual (mathematical) level, the neuron receives a real vector as input. The output is 0.0 or 1.0, which we also consider as a real number.
What data type should be used in Haskell to represent the input and the output from the neuron?
You may define your own data types or type aliases for input and output if you want to, but it is not necessary.
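For instance (a sketch only; the alias names are suggestions, not prescribed by the tutorial), plain lists of Double work well:

```haskell
-- Hypothetical type aliases: a plain list of Doubles for the input and
-- a single Double (restricted to 0.0 or 1.0) for the output.
type InputVector = [Double]
type OutputValue = Double
```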
Define a data type Neuron to represent a single neuron, recording all the weights.
Think through the following questions.
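One possible representation, sketched under the assumption that the bias weight is stored together with the input weights (the field name weights is my own choice):

```haskell
-- A neuron is fully described by its weight list.  The extra bias
-- weight (the one multiplied by the constant input -1 during recall)
-- is kept as the first element of the list.
newtype Neuron = Neuron { weights :: [Double] }
  deriving Show
```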
We need a function to create a new, pristine neuron. In a production system, this should be done randomly, but randomness is non-trivial, so we have to return to that later. For the time being, we are happy to initialise the neuron with small constant weights (say 0.01).
Define a function initNeuron :: Integer -> Neuron which returns a neuron with small non-zero weights (say 0.01). The integer input specifies the dimension of the input vector to be used with the neuron. Test the function in ghci.
Does the function give you what you expect?
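A minimal sketch, assuming the list-of-weights representation suggested earlier; the + 1 accounts for the extra bias weight:

```haskell
newtype Neuron = Neuron { weights :: [Double] }
  deriving Show

-- A neuron for dim inputs needs dim + 1 weights: one per input
-- component plus the bias weight.
initNeuron :: Integer -> Neuron
initNeuron dim = Neuron (replicate (fromIntegral dim + 1) 0.01)
```

In ghci, initNeuron 3 should print a neuron holding four weights of 0.01.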
In operation, the so-called recall, the neuron receives an input vector and produces a scalar output. We need the following function to do this:
recall :: Neuron -> InputVector -> OutputValue
The algorithm (formula) is defined by Marsland, p. 47. Remember to include the thresholding, so that the output is 0 or 1, and remember that we have one more weight than inputs, and this extra weight is multiplied by -1.
Define the recall function. Replace InputVector and OutputValue with the types you decided to use in Step 1.
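A sketch under the same assumptions as before: prepend the constant input -1 (paired with the bias weight), form the weighted sum, and threshold at zero:

```haskell
type InputVector = [Double]
type OutputValue = Double

newtype Neuron = Neuron { weights :: [Double] }
  deriving Show

recall :: Neuron -> InputVector -> OutputValue
recall (Neuron ws) xs =
  let activation = sum (zipWith (*) ws ((-1) : xs))  -- bias input is -1
  in if activation > 0 then 1.0 else 0.0
```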
We need to test the function. Start ghci and try the following:
recall (initNeuron 3) [ 1.0, 0.5, -1.0 ]
If you have used a datatype other than a list for the input vector, you need to change the test call appropriately.
Do you get the expected output?
Obviously, you do not learn all that you want to know from the above
test, but at least you get to check for type errors.
Develop your own test by manually defining a test neuron with other weights, and use that in lieu of initNeuron.
The training function receives an old neuron, together with an input vector and a corresponding output value, and it returns a new neuron, with updated weights. Additionally, it takes the training factor η. Thus we need a function with the following signature.
trainOne :: Double -> InputVector -> OutputValue -> Neuron -> Neuron
This function considers only one input/output pair and updates the neuron. The complete train function, which we consider later, will use trainOne to consider multiple input/output pairs and train the neuron iteratively.
Define the trainOne function. Replace InputVector and OutputValue with the types you decided to use in Step 1.
We need to test the function similarly to what we did in the previous step. Start ghci and try the following:
trainOne 0.5 [ 1.0, 0.5, -1.0 ] 1.0 (initNeuron 3)
Do you get the expected output?
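A sketch of the perceptron update from Marsland's chapter 3, assuming the representation above: every weight w moves to w - eta * (y - t) * x, where y is the current recall output, t the target, and x the matching component of the bias-augmented input:

```haskell
type InputVector = [Double]
type OutputValue = Double

newtype Neuron = Neuron { weights :: [Double] }
  deriving Show

recall :: Neuron -> InputVector -> OutputValue
recall (Neuron ws) xs =
  if sum (zipWith (*) ws ((-1) : xs)) > 0 then 1.0 else 0.0

-- Perceptron learning rule for a single input/target pair.
trainOne :: Double -> InputVector -> OutputValue -> Neuron -> Neuron
trainOne eta xs t n@(Neuron ws) =
  let y = recall n xs
  in Neuron (zipWith (\w x -> w - eta * (y - t) * x) ws ((-1) : xs))
```

Note that if the recall output already matches the target, y - t is zero and the neuron comes back with unchanged weights.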
Using the above trainOne function, we want to design a trainSet function which takes a list of input vectors instead of the single one, and trains the neuron on each input vector in turn. This gives the following signature:
trainSet :: Double -> [InputVector] -> [OutputValue] -> Neuron -> Neuron
Define the trainSet function. Replace InputVector and OutputValue with the types you decided to use in Step 1.
Test the function as you did with trainOne, but replace the input vector with a list of two input vectors (of your choice), each of length 3.
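One way to sketch trainSet is a left fold of trainOne over the paired inputs and targets, threading the updated neuron through (this assumes one target value per input vector, i.e. a [OutputValue] argument):

```haskell
type InputVector = [Double]
type OutputValue = Double

newtype Neuron = Neuron { weights :: [Double] }
  deriving Show

recall :: Neuron -> InputVector -> OutputValue
recall (Neuron ws) xs =
  if sum (zipWith (*) ws ((-1) : xs)) > 0 then 1.0 else 0.0

trainOne :: Double -> InputVector -> OutputValue -> Neuron -> Neuron
trainOne eta xs t n@(Neuron ws) =
  let y = recall n xs
  in Neuron (zipWith (\w x -> w - eta * (y - t) * x) ws ((-1) : xs))

-- Visit each input/target pair in turn, carrying the updated neuron.
trainSet :: Double -> [InputVector] -> [OutputValue] -> Neuron -> Neuron
trainSet eta xss ts n0 = foldl step n0 (zip xss ts)
  where step n (xs, t) = trainOne eta xs t n
```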
It is usually not sufficient to run the training once only; we often want to repeat the trainSet operation T times.
In other words, we want a function with the following signature:
train :: Integer -> Double -> [InputVector] -> [OutputValue] -> Neuron -> Neuron
The train function repeatedly applies trainSet T times. Each round uses the output neuron from the previous application as input.
Define the train function.
Replace InputVector and OutputValue with the types you decided to use in Step 1.
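train can be sketched as simple recursion on the counter, feeding each round's neuron into the next round (again assuming the [OutputValue] variant of trainSet):

```haskell
type InputVector = [Double]
type OutputValue = Double

newtype Neuron = Neuron { weights :: [Double] }
  deriving Show

recall :: Neuron -> InputVector -> OutputValue
recall (Neuron ws) xs =
  if sum (zipWith (*) ws ((-1) : xs)) > 0 then 1.0 else 0.0

trainOne :: Double -> InputVector -> OutputValue -> Neuron -> Neuron
trainOne eta xs t n@(Neuron ws) =
  let y = recall n xs
  in Neuron (zipWith (\w x -> w - eta * (y - t) * x) ws ((-1) : xs))

trainSet :: Double -> [InputVector] -> [OutputValue] -> Neuron -> Neuron
trainSet eta xss ts n0 = foldl (\n (xs, t) -> trainOne eta xs t n) n0 (zip xss ts)

-- Apply trainSet T times, threading the neuron from round to round.
train :: Integer -> Double -> [InputVector] -> [OutputValue] -> Neuron -> Neuron
train t eta xss ts n
  | t <= 0    = n
  | otherwise = train (t - 1) eta xss ts (trainSet eta xss ts n)
```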
Test the function using the same test data as you used for trainSet. Try both T=2 and T=5.
A simple test to devise is to take a simple function and try to predict whether it is positive or negative. Choose a couple of points and calculate the corresponding function values. This gives you input and output values to train your perceptron.
You may also want to test your perceptron on data which was not used for training. Choose a few other points; for each one you can easily calculate both the function value and recall n [x,y,z], where n is a trained neuron.
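As a concrete illustration (the function f below is a hypothetical stand-in, chosen only for this sketch): train the neuron to predict whether f x y z = x + y - z is positive.

```haskell
type InputVector = [Double]
type OutputValue = Double

newtype Neuron = Neuron { weights :: [Double] } deriving Show

initNeuron :: Integer -> Neuron
initNeuron dim = Neuron (replicate (fromIntegral dim + 1) 0.01)

recall :: Neuron -> InputVector -> OutputValue
recall (Neuron ws) xs =
  if sum (zipWith (*) ws ((-1) : xs)) > 0 then 1.0 else 0.0

trainOne :: Double -> InputVector -> OutputValue -> Neuron -> Neuron
trainOne eta xs t n@(Neuron ws) =
  let y = recall n xs
  in Neuron (zipWith (\w x -> w - eta * (y - t) * x) ws ((-1) : xs))

trainSet :: Double -> [InputVector] -> [OutputValue] -> Neuron -> Neuron
trainSet eta xss ts n0 = foldl (\n (xs, t) -> trainOne eta xs t n) n0 (zip xss ts)

train :: Integer -> Double -> [InputVector] -> [OutputValue] -> Neuron -> Neuron
train t eta xss ts n
  | t <= 0    = n
  | otherwise = train (t - 1) eta xss ts (trainSet eta xss ts n)

-- Hypothetical test function: is x + y - z positive?
f :: Double -> Double -> Double -> Double
f x y z = x + y - z

trainingInputs :: [InputVector]
trainingInputs = [[1, 1, 0], [0, 0, 1], [2, 0, 1], [0, 1, 2]]

-- Targets follow directly from the sign of f at each training point.
trainingTargets :: [OutputValue]
trainingTargets = map (\[x, y, z] -> if f x y z > 0 then 1.0 else 0.0) trainingInputs
```

After, say, train 10 0.1 trainingInputs trainingTargets (initNeuron 3), recalling the training points should reproduce the targets, and unseen points such as [3, 0, 0] can be checked against f in the same way.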
It is not necessary to solve this problem at this stage, but we will need it eventually. Please postpone it if you are behind schedule.
Define a data type Layer to represent a multi-neuron perceptron.
What data type can be used to hold a set of neurons?
The name `layer' will make sense when we advance to more complex neural networks. The perceptron consists of a single layer only, but other neural networks will be lists of layers.
Define a function initLayer to return a perceptron (layer) where all weights in all neurons are set to some small, constant, non-zero value. Include arguments so that the user can choose both the number of neurons in the layer and the number of inputs.
Define a function recallLayer which does a recall for each neuron in the layer and returns a list of output values.
Generalise each of the training functions trainOne, trainSet, and train for perceptrons.
The training functions for perceptrons have to apply the
corresponding training function for each neuron in the layer.
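A sketch of the layer, assuming it is simply a list of neurons; trainOneLayer is my own name for the generalised trainOne (each neuron is trained towards its own target value):

```haskell
type InputVector = [Double]
type OutputValue = Double

newtype Neuron = Neuron { weights :: [Double] } deriving Show

initNeuron :: Integer -> Neuron
initNeuron dim = Neuron (replicate (fromIntegral dim + 1) 0.01)

recall :: Neuron -> InputVector -> OutputValue
recall (Neuron ws) xs =
  if sum (zipWith (*) ws ((-1) : xs)) > 0 then 1.0 else 0.0

trainOne :: Double -> InputVector -> OutputValue -> Neuron -> Neuron
trainOne eta xs t n@(Neuron ws) =
  let y = recall n xs
  in Neuron (zipWith (\w x -> w - eta * (y - t) * x) ws ((-1) : xs))

-- A layer is just a list of independent neurons.
type Layer = [Neuron]

initLayer :: Integer -> Integer -> Layer
initLayer nNeurons dim = replicate (fromIntegral nNeurons) (initNeuron dim)

-- Recall every neuron on the same input; the outputs form a vector.
recallLayer :: Layer -> InputVector -> [OutputValue]
recallLayer layer xs = map (\n -> recall n xs) layer

-- Generalised trainOne: pair each neuron with its own target value.
trainOneLayer :: Double -> InputVector -> [OutputValue] -> Layer -> Layer
trainOneLayer eta xs ts layer = zipWith (trainOne eta xs) ts layer
```

The layer versions of trainSet and train follow the same pattern: fold trainOneLayer over the data, and iterate T times.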
You have just implemented your first classifier. Well done.
However, this prototype leaves much to be desired.
As you can see, we have to go back and learn some more techniques in Haskell. First of all, we will learn I/O in the next tutorial, to be able to read complete data sets from file, both for training and for testing.