Integrating ML/DL with DevOps
The integration of technologies like Machine Learning, Deep Learning, Artificial Intelligence, and DevOps can change the way product development and deployment cycles work in the future.
In today’s connected world, Machine Learning (ML), Artificial Intelligence (AI), and DevOps are shaping the way we learn, work, and connect. DevOps integrates the work of the Development and Operations teams, while ML, Deep Learning (DL), and AI turn data into models that can predict future trends from current data. Integrating these technologies can speed up the deployment of ML models to production on a large scale. Based on this concept, I have developed a small-scale system that demonstrates automating the training of machine learning models using Git, Jenkins, and Docker. The explanation below focuses on the automation of the ML and DL models, not on the installation of Git, Docker, and Jenkins.
The automation of the ML/DL models can be carried out in three broad steps:
- Build the ML and DL models.
- Create the Docker container images for training the respective ML and DL models using Dockerfiles.
- Set up Jenkins jobs to perform the following:
- Job 1: Pull content from the GitHub repository into the local host system.
- Job 2: Run the correct Docker container for each model.
- Job 3: Execute each model’s .py file in its respective container and record the accuracy. If the accuracy is below a set threshold, the model’s hyperparameters must be modified and the model retrained for better accuracy.
- Job 4: Monitor the Docker containers and send an error message if a container runs into an issue.
I have developed this setup on a Windows host machine, with RHEL8 running on Oracle VirtualBox. Both Jenkins and Docker are installed on the RHEL8 VM, and almost all of the automation is done on RHEL8 as well.
First, we create the machine learning models. To show variation, I have created two models: one implementing a neural network and the other implementing standard logistic regression. The code for the models and the corresponding datasets can be found in my GitHub repository. Git plays an important role here, as it forms the basis for integrating your local repository with the remote repository on GitHub. Once the models have been created, we can move on to creating the Docker containers on which to train them.
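The actual model code lives in the GitHub repository; as a rough illustration of the shape such a training script can take, here is a minimal logistic-regression sketch on synthetic data. The dataset, file name, and hyperparameters are stand-ins, not the values from the repository:

```python
# Hypothetical stand-in for the logistic-regression training script;
# the real code and datasets live in the GitHub repository.
import numpy as np

def train_logistic_regression(X, y, lr=0.1, epochs=500):
    """Train a logistic-regression model with plain gradient descent."""
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))  # sigmoid activation
        w -= lr * (X.T @ (p - y)) / len(y)      # gradient w.r.t. weights
        b -= lr * float(np.mean(p - y))         # gradient w.r.t. bias
    return w, b

def accuracy(X, y, w, b):
    """Fraction of samples classified correctly at a 0.5 threshold."""
    preds = 1.0 / (1.0 + np.exp(-(X @ w + b))) >= 0.5
    return float(np.mean(preds == y))

if __name__ == "__main__":
    # Toy, linearly separable data standing in for the real dataset
    rng = np.random.default_rng(0)
    X = rng.normal(size=(200, 2))
    y = (X[:, 0] + X[:, 1] > 0).astype(float)

    w, b = train_logistic_regression(X, y)
    acc = accuracy(X, y, w, b)

    # The training job later reads this file to decide whether to retrain
    with open("accuracy.txt", "w") as f:
        f.write(str(acc))
    print(acc)
```

Writing the final accuracy to a plain text file is what lets a shell script in Jenkins inspect the result without parsing any training logs.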
Next, we create Dockerfiles for the Docker containers. Many ready-made container images are available on Docker Hub (https://hub.docker.com/), but we need a more customized image for our purpose. For each Dockerfile, we do the following:
1. Create a workspace for each Docker container on the RHEL8 command line from the root user. For example:
# mkdir dl_docker
# cd dl_docker
# gedit Dockerfile
The following image shows the content we fill in the Dockerfile for DL models.
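Since the image is not reproduced here, the following is a rough sketch of what such a Dockerfile for the DL environment might contain; the base image, package names, and versions are assumptions, not the exact file from this setup:

```dockerfile
# Base OS for the deep-learning training environment (assumed)
FROM centos:latest

# Install Python 3 and the libraries the DL training script imports
RUN yum install -y python36
RUN pip3 install numpy pandas keras tensorflow

# Directory where the training files copied by Jenkins will live
WORKDIR /mlops

CMD ["/bin/bash"]
```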
2. After creating the Dockerfile, we build the container image with the command below. Note that <filepath> must point to the directory containing the Dockerfile: if you run the command from the same directory as the Dockerfile, <filepath> is simply `.`; if you are in any other directory, mention the full path of the Dockerfile’s directory.
# docker build -t <image_name>:<version> <filepath>
3. Once the container image has been successfully built, we can launch containers as and when we require them. We follow a similar process for the ML container as well. The Dockerfile contents for the ML models are given below.
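As the image is not reproduced here, a hedged sketch of the ML Dockerfile follows; again, the base image and package versions are assumptions:

```dockerfile
# Base OS for the classical-ML training environment (assumed)
FROM centos:latest

# Install Python 3 and the libraries needed for logistic regression
RUN yum install -y python36
RUN pip3 install numpy pandas scikit-learn

# Directory where the training files copied by Jenkins will live
WORKDIR /mlops

CMD ["/bin/bash"]
```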
Once the container images are created, we can start setting up the jobs in Jenkins. Jenkins runs on the RHEL8 VM with a private IP address that is not exposed to the outside network (Internet). To access Jenkins from other systems, we can use ngrok to temporarily expose the Jenkins address for a limited amount of time. Note that Jenkins must be given root privileges by modifying the existing /etc/sudoers file after its installation (for example, by adding a line such as `jenkins ALL=(ALL) NOPASSWD: ALL`). The jobs are initialized in Jenkins as shown below:
- Job 1 involves pulling content from the GitHub repository and copying it into the RHEL8 system. There are many ways to integrate Git and Jenkins, but all of them require the Git Plugin to be installed. After installing the Git Plugin, one method is to set up webhooks using the Jenkins URL and the GitHub repository URL. We set up Job 1 as shown in the following images.
In the fourth image, the shell script that will be executed once the Job is successfully built is given. The script runs in the bash shell of the host RHEL8 system.
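Since the images are not reproduced here, a sketch of the idea behind that build step follows; the /mlops path is an assumption used for illustration:

```shell
# Copy the repository contents pulled by the Git Plugin from the
# Jenkins workspace into a fixed directory on the RHEL8 host
sudo mkdir -p /mlops
sudo cp -rvf * /mlops/
```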
2. In Job 2, we launch the containers depending on the model that needs to be trained. One way to recognize the type of model is to look at the libraries being imported in the .py file. Based on the libraries, we can segregate the ML and DL model training files. Job 2 is executed only if Job 1 is successful. In Jenkins, Job 2 is called the downstream project of Job 1, while Job 1 is called the upstream project of Job 2. This is a simple implementation of Pipelines in Jenkins.
There are two shell scripts given here. Both will be executed when the Job is built.
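Since the images with the two scripts are not reproduced here, a hedged sketch of the idea follows; the image names, container names, training file name, and the /mlops path are assumptions:

```shell
# Decide which container to launch by inspecting the libraries
# imported in the training file, then start the matching environment
if sudo grep -qE "keras|tensorflow" /mlops/model.py
then
    # DL model: launch the deep-learning container
    sudo docker run -dit --name dl_env -v /mlops:/mlops dl_docker:v1
else
    # ML model: launch the classical-ML container
    sudo docker run -dit --name ml_env -v /mlops:/mlops ml_docker:v1
fi
```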
3. In Job 3, we train the models from the shell script itself. Job 3 can run only if Job 2 has finished its build successfully.
In the shell scripts, after training the models, the accuracy of the models is stored in a .txt file in the same directory as the .py files themselves.
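A sketch of the retraining logic Job 3 can implement is given below; the container name, file names, the 0.90 threshold, and the hyperparameter-tweaking script are assumptions for illustration (and `bc` must be installed on the host):

```shell
# Train the model inside its container; the training script writes
# the achieved accuracy to accuracy.txt in the shared directory
sudo docker exec dl_env python3 /mlops/model.py

# Retrain with modified hyperparameters until the accuracy
# crosses the threshold (assumed here to be 0.90)
while [ "$(echo "$(cat /mlops/accuracy.txt) < 0.90" | bc -l)" -eq 1 ]
do
    sudo docker exec dl_env python3 /mlops/tweak_model.py
done
```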
4. The final job is simply meant to monitor the containers after their launch, hence it can also be considered a downstream project of Job 2.
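A minimal monitoring step might look like the following sketch; the container name is an assumption, and exiting with a non-zero status is what marks the Jenkins job as failed:

```shell
# Fail the job if the training container is no longer running,
# so Jenkins can report the problem
if ! sudo docker ps | grep -q dl_env
then
    echo "Container dl_env is not running"
    exit 1
fi
```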
Once all the Jobs have been created, we can use the Build Pipeline view to visualize the links between the jobs. To use it, we need to install the Build Pipeline Plugin by going to Manage Jenkins -> Manage Plugins and installing it from the list of available plugins. Once the plugin has been installed, we can create a new View in the Build Pipeline style and choose Job 1 as the starting job. Once the view is created, we can see the following.
The Jobs above are shown in green because they have all been executed successfully, one after the other.
To start the build of the Jobs, we can push the changes of the local repository to the remote repository on GitHub. This will automatically trigger the build of Job 1. As each Job completes successfully, its respective downstream job will also commence its build, until all the jobs have been executed. The console outputs of one such execution of all jobs are shown below.
In all the console outputs above, you may have noticed the ‘Email being sent’ line. This is another feature offered by Jenkins: email notifications regarding the status of the Jobs. It can be set up by installing the Email Extension Plugin or by using the default email services offered by Jenkins. To send these emails, go to the Post-Build Actions of each job and add Email Notification (for the default email service) or Editable Email Notification (after installing the Email Extension Plugin).
This is the setup that can help us automate the training of ML and DL models using DevOps tools like Git, Docker, and Jenkins. In this setup, we have not only automated the training of the ML and DL models but have also modified the training files to achieve the desired accuracy by changing hyperparameters. This is just a small glimpse of how DevOps and Machine Learning can together revolutionize the digital world as we know it.