This project tackles multi-class image classification. I used the TensorFlow Hub model `mobilenet_v2_130_224`
to build and train a neural network that classifies dog breeds from images passed as input.
Some predictions made by the final model are shown below:
To do this, we'll be using data from the Kaggle dog breed identification competition. It consists of a collection of 10,000+ labelled images of 120 different dog breeds.
We're going to go through the following workflow:
- Get data ready (download from Kaggle, store, import).
- Prepare the data (turn into tensors and batches; create training, validation, and test sets).
- Choose and fit/train a model (TensorFlow Hub, `tf.keras.applications`, TensorBoard, EarlyStopping).
- Evaluate the model (making predictions, comparing them with the ground truth labels).
- Improve the model through experimentation
For preprocessing our data, we're going to use TensorFlow 2.x. The whole premise is to get our data into Tensors (arrays of numbers that can be processed on GPUs) and then let a machine learning model find patterns in them. For the model itself, we'll use transfer learning with a pretrained deep learning model from TensorFlow Hub.
To preprocess our images into Tensors, we're going to:
- Use TensorFlow to read the file and save it to a variable, `image`.
- Turn our `image` (a JPEG file) into Tensors.
- Normalize the `image` (from 0-255 to 0-1).
- Resize the `image` to be of shape (224, 224).
- Return the modified `image`.
A good place to read about this type of function is the TensorFlow documentation on loading images.
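The preprocessing steps listed above can be sketched as a single function. This is a minimal sketch, not necessarily the exact code used in this project: the names `process_image` and `IMG_SIZE` are assumptions chosen for illustration.

```python
import tensorflow as tf

IMG_SIZE = 224  # target height and width, matching the (224, 224) shape above


def process_image(image_path, img_size=IMG_SIZE):
    """Read a JPEG file and return a float32 tensor of shape (img_size, img_size, 3)."""
    # Read the raw file contents as a byte string
    image = tf.io.read_file(image_path)
    # Decode the JPEG bytes into a uint8 tensor with 3 colour channels (RGB)
    image = tf.image.decode_jpeg(image, channels=3)
    # Convert uint8 values in 0-255 to float32 values in 0-1
    image = tf.image.convert_image_dtype(image, tf.float32)
    # Resize to the input size the model expects
    image = tf.image.resize(image, size=[img_size, img_size])
    return image
```

Calling `process_image("some_dog.jpg")` (a hypothetical path) would return one image tensor ready to be batched.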
Dealing with 10,000+ images may take up more memory than your GPU has. Trying to compute on them all at once could result in an out-of-memory error. So it's more efficient to create smaller batches of your data and compute on one batch at a time.
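One way to build such batches is with `tf.data`. The sketch below is an assumption about how this could be done, not the project's exact code: `create_batches`, `load_fn`, and the batch size of 32 are illustrative choices (the text does not state a batch size). Images are loaded lazily, so only the current batch needs to be held in memory.

```python
import tensorflow as tf

BATCH_SIZE = 32  # assumed value for illustration


def create_batches(file_paths, labels, load_fn, batch_size=BATCH_SIZE, shuffle=False):
    """Pair file paths with labels, load each image with load_fn, and group into batches."""
    dataset = tf.data.Dataset.from_tensor_slices(
        (tf.constant(file_paths), tf.constant(labels))
    )
    if shuffle:
        # Shuffle the (path, label) pairs, which is cheaper than shuffling decoded images
        dataset = dataset.shuffle(buffer_size=len(file_paths))
    # Turn each file path into an image tensor, keeping its label alongside
    dataset = dataset.map(lambda path, label: (load_fn(path), label))
    return dataset.batch(batch_size)
```

Here `load_fn` stands for any function that maps a file path to an image tensor, such as an image preprocessing function. Shuffling is typically enabled for the training set only.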
In this project, we're using the `mobilenet_v2_130_224` model from TensorFlow Hub.
https://ai.googleblog.com/2018/04/mobilenetv2-next-generation-of-on.html
MobileNetV2 is a significant improvement over MobileNetV1 and pushes the state of the art for mobile visual recognition including classification, object detection and semantic segmentation. MobileNetV2 is released as part of TensorFlow-Slim Image Classification Library, or you can start exploring MobileNetV2 right away in Colaboratory. Alternately, you can download the notebook and explore it locally using Jupyter. MobileNetV2 is also available as modules on TF-Hub, and pretrained checkpoints can be found on github.
The first layer we use is the model from TensorFlow Hub (`hub.KerasLayer(MODEL_URL)`). This input layer takes in our images and finds patterns in them based on the patterns `mobilenet_v2_130_224` has found.
The next layer (`tf.keras.layers.Dense()`) is the output layer of our model. It brings all of the information discovered in the input layer together and outputs it in the shape we're after, 120 (the number of unique labels we have). The `activation="softmax"` parameter tells the output layer that we'd like to assign each of the 120 labels a probability between 0 and 1. The higher the value, the more the model believes the input image should have that label.
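To make the softmax step concrete, here is a small pure-Python sketch of the function itself. It is only an illustration of the math, using three made-up scores instead of 120; in the model, the Dense layer applies this internally.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that are each between 0 and 1 and sum to 1."""
    # Subtract the largest score first for numerical stability (does not change the result)
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Three made-up raw scores, one per (hypothetical) class
probs = softmax([2.0, 1.0, 0.1])
```

The class with the highest raw score also receives the highest probability, which is why the largest output value is read as the model's prediction.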
The following confusion matrix is the result of using our model to classify the validation dataset images. The model was trained on 9,000 images, and the validation dataset has 1,000 images.
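For readers unfamiliar with how a confusion matrix is built, the sketch below counts (true label, predicted label) pairs in pure Python. The function name, the three breed names, and the toy labels are assumptions for illustration only, not results from this project. Rows are true labels and columns are predicted labels, which is one common convention.

```python
def confusion_matrix(true_labels, predicted_labels, class_names):
    """Return a square matrix where entry [i][j] counts images of class i predicted as class j."""
    index = {name: i for i, name in enumerate(class_names)}
    matrix = [[0] * len(class_names) for _ in class_names]
    for true, predicted in zip(true_labels, predicted_labels):
        matrix[index[true]][index[predicted]] += 1
    return matrix


# Toy example with three classes (illustrative data, not project results)
classes = ["beagle", "boxer", "pug"]
y_true = ["beagle", "beagle", "boxer", "pug", "pug"]
y_pred = ["beagle", "boxer", "boxer", "pug", "beagle"]
matrix = confusion_matrix(y_true, y_pred, classes)

# Correct predictions lie on the diagonal, so accuracy is the diagonal sum over the total
accuracy = sum(matrix[i][i] for i in range(len(classes))) / len(y_true)
```

Off-diagonal entries show which breeds get confused with which.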
Another way to analyse the model's inference is to print the predicted class of an image along with its probability distribution. For each image, we show its original label (left), its predicted label (right), and the probability associated with the predicted label (how confident the neural network is in the predicted class).
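Reading the predicted label and its confidence out of a probability distribution can be sketched as follows. The helper names and the sample values are hypothetical; a real prediction from this model would contain 120 probabilities.

```python
def top_prediction(probabilities, class_names):
    """Return the most likely class name and its probability."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return class_names[best], probabilities[best]


def describe_prediction(true_label, probabilities, class_names):
    """Build a caption: original label on the left, predicted label and confidence on the right."""
    predicted, confidence = top_prediction(probabilities, class_names)
    return f"{true_label} | {predicted} ({confidence:.0%})"


# Made-up distribution over three classes
caption = describe_prediction("boxer", [0.1, 0.7, 0.2], ["beagle", "boxer", "pug"])
```

A caption like this can be used as the title of each plotted image.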
How to improve model accuracy:
- Trying another model from TensorFlow Hub - A different model could perform better on our dataset.
- Data augmentation - Take the training images and manipulate (crop, resize) or distort them (flip, rotate) to create even more training data for the model to learn from.
- Fine-tuning
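As an illustration of the data augmentation idea, here is a minimal sketch using `tf.image`. It was not part of the original experiments: the function name `augment`, the padding of 16 pixels, and the choice of a horizontal flip plus a random shifted crop are all assumptions.

```python
import tensorflow as tf

IMG_SIZE = 224  # image height and width used throughout this project
PAD = 16        # assumed padding, in pixels, used to create room for random shifts


def augment(image, label):
    """Return a randomly flipped and shifted copy of a (224, 224, 3) image; the label is unchanged."""
    # Randomly mirror the image left to right
    image = tf.image.random_flip_left_right(image)
    # Zero-pad the borders, then crop a random 224x224 window (a random shift)
    image = tf.image.resize_with_crop_or_pad(image, IMG_SIZE + 2 * PAD, IMG_SIZE + 2 * PAD)
    image = tf.image.random_crop(image, size=[IMG_SIZE, IMG_SIZE, 3])
    return image, label
```

With a `tf.data` pipeline, this would typically be applied to the training set only, before batching, for example `train_data.map(augment)` where `train_data` is a hypothetical unbatched dataset of (image, label) pairs.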