5.9. Neural Networks for Classification
Thus far we have seen neural networks used in regression. This means that the network will output raw values. For example, we previously saw this network output the values 107 and 32.
A neural network used in classification works the same way but with a small modification: a softmax function is applied to the values produced by the output layer.
The softmax layer turns the output values into probabilities, that is, non-negative numbers that sum to 1. The way it does this is a little complicated, so let’s work through an example. Before we apply the softmax function, our neuron outputs are 107 and 32.
First, we take the exponential of each output.
Then we divide each exponential by the sum of all the exponentials.
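Written as a formula, the two steps above look as follows. The symbols are our own notation rather than anything defined earlier: z_1, ..., z_K stand for the K raw outputs and p_i for the probability assigned to class i. The second line fills in the two outputs from the example.

```latex
\[
p_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}}, \qquad i = 1, \dots, K
\]
\[
p_1 = \frac{e^{107}}{e^{107} + e^{32}}, \qquad
p_2 = \frac{e^{32}}{e^{107} + e^{32}}
\]
```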
We can use Python to calculate these outputs.
import numpy as np
# Exponentials of the two raw outputs
e107 = np.exp(107)
e32 = np.exp(32)
# Divide each exponential by the sum of the exponentials
print(e107 / (e107 + e32))
print(e32 / (e107 + e32))
Output
1.0
2.6786369618080778e-33
The results are a probability of approximately 1.0 for the first output neuron and approximately 2.68e-33 for the second.
Each output neuron will correspond to a class and we predict the class with the highest probability.
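As a minimal sketch, the softmax step and the "pick the highest probability" step can be wrapped up for any number of outputs. The function name `softmax` and the variable names are our own choices. Subtracting the largest output before exponentiating is an addition not described above: it leaves the probabilities unchanged but keeps `np.exp` from overflowing when the raw outputs are very large.

```python
import numpy as np

def softmax(z):
    """Turn a vector of raw output values into probabilities."""
    z = np.asarray(z, dtype=float)
    # Shifting by the maximum does not change the result,
    # but it avoids overflow in np.exp for large outputs.
    exps = np.exp(z - np.max(z))
    return exps / np.sum(exps)

outputs = [107, 32]                      # raw outputs from the example
probs = softmax(outputs)                 # one probability per output neuron
predicted_class = int(np.argmax(probs))  # index of the most probable class

print(probs)
print(predicted_class)
```

Here `predicted_class` is a zero-based index, so the first output neuron corresponds to index 0.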
Let’s look at a slightly more complicated but more realistic example where we try to classify an image of a digit. For simplicity, we only consider images of the handwritten digits 1, 2 and 3.
Here the input to our neural network is an image. We break the image up and feed each input neuron a pixel value. These values are propagated through the network and we make a prediction based on the highest probability. In this case, we predict class 1, i.e. the digit is a 1.
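The pipeline described here can be sketched roughly as below. Everything specific in it is an assumption made for illustration: the 8x8 image size, the single hidden layer of 16 neurons, the ReLU activation, and the random, untrained weights. Because the weights are not trained, the prediction is meaningless; the point is only to show how pixels flow through to a predicted digit.

```python
import numpy as np

rng = np.random.default_rng(0)

def softmax(z):
    # Shift by the maximum for numerical stability.
    exps = np.exp(z - np.max(z))
    return exps / np.sum(exps)

# Hypothetical 8x8 grayscale image with pixel values in [0, 1].
image = rng.random((8, 8))

# Break the image up: one input neuron per pixel.
x = image.reshape(-1)                    # shape (64,)

# Placeholder (untrained) weights: 64 inputs -> 16 hidden -> 3 outputs.
W1 = rng.normal(scale=0.1, size=(16, 64))
b1 = np.zeros(16)
W2 = rng.normal(scale=0.1, size=(3, 16))
b2 = np.zeros(3)

# Propagate the pixel values through the network.
hidden = np.maximum(0, W1 @ x + b1)      # ReLU activation (an assumption)
outputs = W2 @ hidden + b2               # one raw value per class
probs = softmax(outputs)                 # one probability per class

# Map the most probable class index to its digit label.
digits = [1, 2, 3]
prediction = digits[int(np.argmax(probs))]

print(probs)
print(prediction)
```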