Building a Neural Network For Classification

5.10. Building a Neural Network For Classification#

For this example we’ll use our pass/fail dataset. A 1 means the student passed the exam, and a 0 means they failed. pass_fail.csv.

import pandas as pd

data = pd.read_csv("pass_fail.csv")
print(data)
Output
Time Spent Studying (hours)  Exam Result
0                          4.5            1
1                          8.0            1
2                          1.5            0
3                          3.5            1
4                          5.5            1
5                          3.0            0
6                          6.5            1

Building a neural network for classification is very similar to building a neural network regression, the main differences is that instead of importing MLPRegressor we use MLPClassifier.

from sklearn.neural_network import MLPClassifier

Then when we make predictions, we can get the raw class predictions using .predict or we can obtain the probabilities using .predict_proba.

Here is a complete example.

from sklearn.neural_network import MLPClassifier
import pandas as pd
import numpy as np

data = pd.read_csv("pass_fail.csv")
x = data[["Time Spent Studying (hours)"]].to_numpy()
y = data["Exam Result"].to_numpy()

nn = MLPClassifier(
    hidden_layer_sizes=(2), max_iter=500, learning_rate_init=0.1, random_state=0
)
nn.fit(x, y)

x_test = np.array([[0], [3], [5]])
print(nn.predict_proba(x_test).round(4))  # Round to 4 decimal places
print(nn.predict(x_test))
Output
[[1.     0.    ]
[0.9839 0.0161]
[0.     1.    ]]
[0 0 1]

Let’s interpret the outputs.

The 3 test samples we provide are of students who have studied 0 hours, 3 hours and 5 hours respectively. The output probabilities are:

[[1.     0.    ]
[0.9839 0.0161]
[0.     1.    ]]
[0 0 1]

Here is a summary of what this means:

Student 1

0

1

0

Studetn 2

3

0.9839

0.0161

Student 3

5

0

1

Based on these probabilities students 1 and 2 are predicted to fail and student 3 is predicted to pass. This corresponds to the predictions we see:

[0 0 1]

Student 1

0

1

0

0

Student 2

3

0.9839

0.0161

0

Student 3

5

0

1

1

Code Example: Neural Network for Classification

We’ll use a smaller version of the MNIST dataset that only contains digits 1, 2 and 3. Your tasks is to build a KNN classification model and to use it to classify the 6 samples provided in the test dataset. These images are provided below.

../../_images/123_train.png

Note

Each image in the dataset has been flatted into a single row of pixels. For more details and examples refer to Extension: Image Data. from lesson “4. K-Nearest Neighbours and K-Means Clustering”

Instructions

  1. Using pandas, read the files 123_xtrain.csv, 123_ytrain.csv and 123_xtest.csv into DataFrames and then convert them to numpy arrays.

  2. Reshape the y training values using .reshape(-1, ) (this turns the y values into a 1D array)

  3. Using sklearn.neural_network, create a MLPClassifier

    • The neural network should have 3 hidden layers with 10, 20 and 10 neurons in these layers

    • Set the maximum number of iterations to 500

    • Set the learning rate to 0.001

    • Set random_state to 0

  4. Fit the data neural network to the training data

  5. Obtain the prediction probabilities of the test data, round this to 3 decimal places and print the results

  6. Print the predicted the class labels

Your output should look like this:

[[X.    X.    X.]
[X.XXX X.XXX X.XXX]
[X.    X.XXX X.XXX]
[X.XXX X.XXX X.XXX]
[X.    X.    X.   ]
[X.    X.    X.   ]]
[X X X X X X]
Solution

Solution is locked

Extension More Training

We have provided a larger training set for you to work with locally. Use this to try to build a better neural network. Things to consider:

  • The size of the network

  • The number of iterations

  • The learning rate

mnist_train_x_large.csv

mnist_train_y_large.csv

TIPS:

  • Start small and build up! If you start with a huge neural network it will take a long time to train so you will be slow to make improvements. We recommend starting with something small and continuously tweaking it to improve performance.

  • You may also want to consider writing a script to test a whole range of different hyperparameters (e.g different network sizes) and storing the results. That way you can leave experiments running over night and check the results in the morning.

  • Don’t forget that what we care about is how well the neural network does on unseen data. You might want to withhold some of the training data as validation data so you can see how well the model does on new data.