3.8. Extension: Interpreting The Output Graph

Let’s see how the training samples ‘flow’ through our classification tree.

../../_images/classification_tree_diagram_2.png

Let’s now focus on one of the nodes.

../../_images/classification_tree_diagram_2_crop.png
  • Ear Length (cm) <= 6.5 is the decision associated with the node.

  • samples = 5 means that 5 samples have ‘flowed’ through this node. This corresponds to the 5 lines passing through the node in our visualisation.

  • value = [1, 4, 0] tells you that 1 cat, 4 dogs and 0 rabbits ‘flowed’ through this node. Again, you can tell by the colours of the lines: there is one yellow line for the cat and 4 purple lines for the dogs. The counts are listed in the same order as the class labels, so here the list reads cats: 1, dogs: 4, rabbits: 0.

  • class = Dog means that the majority of the samples passing through this node belong to the dog class. If there is a tie between classes, the class with the lowest label number is selected. E.g. value = [4, 4, 0] has equal numbers of cats (label 0) and dogs (label 1), so the node defaults to cat.
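To make the relationship between samples, value and class concrete, here is a minimal sketch in plain Python. The helper describe_node is a hypothetical name introduced only for this illustration, not part of sklearn. It assumes the class order cat, dog, rabbit used above, and it mimics the tie rule by taking the first position that holds the largest count.

```python
# Class names in label order: index 0 = cat, 1 = dog, 2 = rabbit.
class_names = ["Cat", "Dog", "Rabbit"]


def describe_node(value):
    """Return (samples, class) for a node's per-class counts.

    Hypothetical helper for illustration only.
    """
    # 'samples' is the total number of training samples in the node.
    samples = sum(value)
    # list.index returns the FIRST position of the maximum, so ties
    # go to the lowest class label (NumPy's argmax behaves the same way).
    winner = value.index(max(value))
    return samples, class_names[winner]


print(describe_node([1, 4, 0]))  # (5, 'Dog')
print(describe_node([4, 4, 0]))  # (8, 'Cat'): tie resolved to the lowest label
```

The first call reproduces the node discussed above: 5 samples, majority class Dog. The second shows the tie case.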

Code Challenge: Extension: Building a Classification Tree

You have been provided with a CSV file called astronomy.csv, with data sourced from Wikipedia. The data contains the following columns:

  • Name

  • Mass (Earths)

  • Density (g/cm^3)

  • Class

The Class column values are 0: Moon, 1: Planet, 2: Dwarf planet.

We will use this data to build a classification tree that can classify astronomical objects as a moon, planet or dwarf planet.

Instructions

  1. Using pandas, read the file astronomy.csv into a DataFrame.

  2. Extract the 'Mass (Earths)' and 'Density (g/cm^3)' columns into the variable x.

  3. Extract the 'Class' column into the variable y.

  4. Convert both x and y to numpy arrays.

  5. Using sklearn, create a DecisionTreeClassifier model with max_depth set to 3 and fit it to the training data.

  6. Export the tree using export_graphviz, setting rounded=True and impurity=False.

  7. Save the tree as a png file.

Your figure should look like this:

../../_images/classification_tree_plot.png
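If you get stuck, the general shape of the workflow can be sketched as follows. This is one possible approach under stated assumptions, not a definitive answer. The rows below are made-up placeholder values standing in for astronomy.csv, so the resulting tree will not match the figure. The feature_names and class_names arguments are optional extras that make the output easier to read. The final rendering step is left as a comment because it assumes the graphviz Python package and the Graphviz binaries are installed.

```python
import io

import pandas as pd
from sklearn.tree import DecisionTreeClassifier, export_graphviz

# PLACEHOLDER rows (invented values, NOT real measurements) so the sketch
# runs without the file. In the challenge, use pd.read_csv("astronomy.csv").
csv_text = """Name,Mass (Earths),Density (g/cm^3),Class
Object A,0.01,3.3,0
Object B,0.02,1.9,0
Object C,1.0,5.5,1
Object D,300.0,1.3,1
Object E,0.002,1.9,2
Object F,0.003,2.5,2
"""
data = pd.read_csv(io.StringIO(csv_text))

# Features and labels, converted to numpy arrays.
feature_cols = ["Mass (Earths)", "Density (g/cm^3)"]
x = data[feature_cols].to_numpy()
y = data["Class"].to_numpy()

# Limit the tree to a depth of 3 and fit it to the training data.
model = DecisionTreeClassifier(max_depth=3)
model.fit(x, y)

# With out_file=None, export_graphviz returns the DOT source as a string.
dot_data = export_graphviz(
    model,
    out_file=None,
    feature_names=feature_cols,
    class_names=["Moon", "Planet", "Dwarf planet"],  # label order 0, 1, 2
    rounded=True,
    impurity=False,
)

# One way to save a PNG, assuming graphviz is installed:
# import graphviz
# graphviz.Source(dot_data).render("tree", format="png", cleanup=True)
```

Passing class_names in label order (0, 1, 2) matters: the class shown in each node is looked up by position in that list.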