Machine Learning Coding Tutorial 2. Visualizing a Decision Tree

In the previous tutorial, we used a decision tree as the classifier. A decision tree is a classifier that is easy to read and understand. In this tutorial, we are going to write a program to visualize the decision tree.

1. “Iris” Problem

We are going to write a program to solve a classic machine learning problem called “Iris”: identifying the type of a flower based on its measurements, namely the length and width of the petal and the length and width of the sepal.

https://en.wikipedia.org/wiki/Iris_flower_data_set

Sample Iris Data (measurements in cm)

Sepal length   Sepal width   Petal length   Petal width   Species
5.1            3.5           1.4            0.2           I. setosa
4.9            3.0           1.4            0.2           I. setosa
4.7            3.2           1.3            0.2           I. setosa
5.7            2.9           4.2            1.3           I. versicolor
6.3            3.3           6.0            2.5           I. virginica

The Iris data set includes three types of flowers. They are all species of Iris: Setosa, Versicolor and Virginica.

The Iris data is an array of arrays: each inner array holds the four measurements of one flower.
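To make the layout concrete, here is a small sketch in plain Python that uses the rows from the sample table above. The variable names `data`, `target` and `target_names`, and the integer encoding of the species, are assumptions for illustration. They mirror the way loaders such as scikit-learn's `load_iris` expose the data, but nothing here depends on that library.

```python
# Rows copied from the sample table above (all measurements in cm).
# Column order: sepal length, sepal width, petal length, petal width.
data = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
    [4.7, 3.2, 1.3, 0.2],
    [5.7, 2.9, 4.2, 1.3],
    [6.3, 3.3, 6.0, 2.5],
]

# One integer label per row. The 0/1/2 encoding is an assumption
# chosen for illustration.
target = [0, 0, 0, 1, 2]
target_names = ["setosa", "versicolor", "virginica"]

# Print each flower's measurements next to its species name.
for features, label in zip(data, target):
    print(features, "->", target_names[label])
```

The classifier only ever sees the inner arrays (the features) and the matching labels. The species names are just a lookup table for printing.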

2. Procedure

We will write the program in the following steps:

  1. Import Iris Dataset
  2. Create training and testing data
  3. Train a Classifier
  4. Predict label for a new flower
  5. Visualize the Decision Tree

3. Coding

Create a Python file named iris.py and enter the following code.

Please read the comments carefully to understand what each part of the code does.
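What follows is a minimal sketch of iris.py that walks through the five steps from the Procedure section. It assumes NumPy and scikit-learn are installed. The held-out rows 0, 50 and 100, the `random_state=0` setting, the output names iris.dot and iris.pdf, and the call to the Graphviz `dot` command are all illustrative choices and may differ from the listing this tutorial was written around.

```python
import shutil
import subprocess

import numpy as np
from sklearn import tree
from sklearn.datasets import load_iris

# 1. Import the Iris dataset (150 flowers, 4 measurements each).
iris = load_iris()

# 2. Create training and testing data.
# Hold out one flower of each species: rows 0, 50 and 100 are the first
# setosa, versicolor and virginica rows in scikit-learn's copy of the data.
test_idx = [0, 50, 100]
train_data = np.delete(iris.data, test_idx, axis=0)
train_target = np.delete(iris.target, test_idx)
test_data = iris.data[test_idx]
test_target = iris.target[test_idx]

# 3. Train a classifier on the remaining 147 flowers.
clf = tree.DecisionTreeClassifier(random_state=0)
clf.fit(train_data, train_target)

# 4. Predict labels for the held-out flowers and compare with the truth.
predicted = clf.predict(test_data)
print("expected: ", test_target)
print("predicted:", predicted)

# 5. Visualize the decision tree.
# With out_file=None, export_graphviz returns the tree as a DOT string.
dot_data = tree.export_graphviz(
    clf,
    out_file=None,
    feature_names=list(iris.feature_names),
    class_names=iris.target_names.tolist(),
    filled=True,
    rounded=True,
)
with open("iris.dot", "w") as f:
    f.write(dot_data)

# Converting DOT to PDF needs the Graphviz "dot" program on the PATH.
if shutil.which("dot"):
    subprocess.run(["dot", "-Tpdf", "iris.dot", "-o", "iris.pdf"], check=True)
    print("wrote iris.pdf")
else:
    print("Graphviz not found; wrote iris.dot only")
```

Holding out a few flowers before training matters: the classifier never sees them, so a correct prediction shows it learned something general and did not simply memorize the training rows.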

Run the program with the following command in Terminal (Mac) or Command Prompt (Windows), from the folder that contains iris.py:

    python iris.py

Does the program predict the correct label for the flower?

Can you find the generated PDF file?

Yep. The machine is clever lol.
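If no PDF shows up (for example, because Graphviz is not installed), the tree can still be inspected as plain text. The sketch below assumes a scikit-learn version that provides `sklearn.tree.export_text`. For brevity it trains on all 150 flowers instead of holding any out, so the tree may differ slightly from the one in the PDF.

```python
from sklearn.datasets import load_iris
from sklearn.tree import DecisionTreeClassifier, export_text

iris = load_iris()

# Train on the full dataset; random_state fixes tie-breaking between
# equally good splits so repeated runs print the same tree.
clf = DecisionTreeClassifier(random_state=0)
clf.fit(iris.data, iris.target)

# export_text renders the learned rules as an indented text outline.
text_tree = export_text(clf, feature_names=list(iris.feature_names))
print(text_tree)
```

Each indented line is a question about one measurement, and each leaf shows the class the tree predicts there. By default the classes are printed as their integer indices, which for this dataset are 0 = setosa, 1 = versicolor, 2 = virginica.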
