An Introduction to Transfer Learning

Machine Learning and Deep Learning are vast fields with applications ranging from image recognition and speech recognition to recommendation and association systems. Building a model from scratch requires a lot of storage and computing power, which may not always be available to us. We can also face situations where we have ideas for improving an existing model, but the complications of training the model from scratch once again prevent us from acting on them. To handle such situations, we can use the concept of Transfer Learning.

Transfer learning is a machine learning method in which a model developed for one task is reused as the starting point for a model on a second task. Pre-trained models serve as the starting point for computer vision and natural language processing tasks instead of models being developed from the very beginning. This sidesteps the large amount of computing and storage resources required to develop Deep Learning models. However, it should be noted that transfer learning only works when the features the model learned on the first task are general enough to carry over to the second.

In this article, we will see an implementation of transfer learning for face recognition using the MobileNet architecture. Pre-trained weights for MobileNet can be downloaded through Keras. There are other neural network architectures like VGG16, VGG19, ResNet50, and Inception V3, but MobileNet comes with its own set of advantages:

  1. It is lightweight and can be used for embedded and mobile applications.
  2. There are a reduced number of parameters.
  3. It is a low-latency neural network.

MobileNet also comes with a small disadvantage: somewhat reduced accuracy, the trade-off for its smaller size, fewer parameters, and emphasis on speed.

The dataset being used here is based on the 14 Celebrity dataset taken from Kaggle, one of the most popular websites for working with Machine Learning and Deep Learning problems and getting datasets. In addition to the data already present, I have added more images sourced from Google Images, bringing the total to 283 training images and 70 test images spread across 14 classes.

To implement MobileNet, we first download the weights of the pre-trained MobileNet architecture from Keras. In transfer learning, we freeze the previously trained layers and add new layers on top of the existing model to build a new, improved model tailored to our requirements.
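The download-and-freeze step can be sketched as follows. This is a minimal illustration, not the article's exact code; the 224x224 input size is an assumption based on MobileNet's standard configuration.

```python
# Download ImageNet-pretrained MobileNet without its 1000-class top,
# then freeze its layers so their weights are not updated during training.
from tensorflow.keras.applications import MobileNet

base_model = MobileNet(weights="imagenet",
                       include_top=False,          # drop the original classifier
                       input_shape=(224, 224, 3))  # assumed input size

for layer in base_model.layers:
    layer.trainable = False  # frozen: only the new head will be trained
```

With `include_top=False`, the model ends in a 7x7x1024 feature map rather than a class prediction, which is exactly the point where the new layers are attached.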

To add the new layers, we define a function named get_head that we can call later. It takes the MobileNet model and appends new layers: one pooling layer and four Dense layers, the last of which produces the prediction.

Next, we import the required modules for adding layers and call the function to attach the new head to the frozen base.
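One way to assemble the final model is sketched below. The structure (one pooling layer plus four Dense layers, the last for prediction) follows the article, but the hidden-layer sizes (1024/512/256) and the optimizer choice are illustrative assumptions.

```python
from tensorflow.keras.applications import MobileNet
from tensorflow.keras.layers import Dense, GlobalAveragePooling2D
from tensorflow.keras.models import Model

base_model = MobileNet(weights="imagenet", include_top=False,
                       input_shape=(224, 224, 3))
base_model.trainable = False  # keep the pre-trained layers frozen

def get_head(bottom_model):
    """One pooling layer plus four Dense layers; the final Dense layer
    predicts one of the 14 celebrity classes. Sizes are assumptions."""
    x = GlobalAveragePooling2D()(bottom_model.output)
    x = Dense(1024, activation="relu")(x)
    x = Dense(512, activation="relu")(x)
    x = Dense(256, activation="relu")(x)
    return Dense(14, activation="softmax")(x)

model = Model(inputs=base_model.input, outputs=get_head(base_model))
model.compile(optimizer="adam",
              loss="categorical_crossentropy",
              metrics=["accuracy"])
```

Only the head's weights are trainable, so each training step is far cheaper than training the full network.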

Image processing applications in Machine Learning and Deep Learning require a vast amount of training and testing data. If we are not able to collect enough data, we can use the ImageDataGenerator offered by Keras to expand our current image dataset. ImageDataGenerator is typically used to apply operations like rotations, horizontal flips, and vertical flips to images, bringing more variation into the training and testing data. This helps in building a good model.
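A small sketch of the augmentation step is shown below on a dummy batch (the rotation range and rescaling factor are assumptions, not the article's exact settings). In practice, the same generator would be pointed at image folders with `flow_from_directory`.

```python
import numpy as np
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Generator that randomly rotates and flips images, and rescales pixels
datagen = ImageDataGenerator(rescale=1.0 / 255,
                             rotation_range=20,
                             horizontal_flip=True,
                             vertical_flip=True)

# Dummy batch of four 224x224 RGB "images" to demonstrate the API
images = np.random.randint(0, 256, size=(4, 224, 224, 3)).astype("float32")
labels = np.eye(4, dtype="float32")

batch_x, batch_y = next(datagen.flow(images, labels, batch_size=4))
print(batch_x.shape)  # each epoch sees freshly transformed variants
```

Because the transforms are applied on the fly, the 283 training images effectively become a much larger and more varied training set.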

Now that the training and testing data are ready, we can proceed with model training. We can use checkpoints, early stopping, and callbacks to make our training process more efficient. The trained model is stored in the file face_recognition.h5. To test our model, we can import the trained weights from this file.
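The callback setup could look like the sketch below; the monitored metric and patience value are assumptions, and the `model.fit` call is commented out because it needs the generators and model built earlier.

```python
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint

callbacks = [
    # Save the best weights seen so far to the file named in the article
    ModelCheckpoint("face_recognition.h5",
                    monitor="val_loss",
                    save_best_only=True),
    # Stop training once validation loss stops improving
    EarlyStopping(monitor="val_loss", patience=3,
                  restore_best_weights=True),
]

# Hypothetical training call, assuming the generators defined earlier:
# model.fit(train_generator, validation_data=test_generator,
#           epochs=20, callbacks=callbacks)

# Later, the trained weights can be reloaded for testing:
# from tensorflow.keras.models import load_model
# model = load_model("face_recognition.h5")
```

Checkpointing means a crash or an early stop never loses the best model found so far.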

Now, we test our model on any given image. I have taken one image from the testing data. In the code, the final output is the image being tested, with the predicted class displayed on the image window as well. For displaying the image, we use the cv2 module in Python.


From the results above, we can see that the model doesn't always give the correct answer. Such errors are often due to the size and variation of the training dataset and the hyperparameters used. By varying hyperparameters such as batch size, and by increasing the volume and variation of the training data, we can build better models with higher accuracy.


References

  2. Research paper where the MobileNet architecture was published — A. G. Howard, M. Zhu, B. Chen, D. Kalenichenko, W. Wang, T. Weyand, M. Andreetto, and H. Adam. MobileNets: Efficient Convolutional Neural Networks for Mobile Vision Applications. arXiv:1704.04861, 2017.
  4. Original Dataset from Kaggle —
