Problem: A person specialized in data science, DevOps, or ML may not be familiar with all of these technologies. This project removes that hurdle, so anyone can build a good ML model regardless of how well they know each of them.
Solution Overview: I have created multiple Jenkins jobs, each handling one of the tasks described below.
First, a Dockerfile installs all the libraries and tools required for running and training the ML model.
NOTE: Name the file Dockerfile, since that is the name docker build looks for by default.
Then build the image from the Dockerfile using the following syntax:
docker build -t <name>:<tag> <directory containing the Dockerfile>
JOB 1: Github observer: Using Poll SCM, it regularly checks the GitHub repository for code changes and copies the updated code into the workspace.
JOB 2: ml_env_launcher: This job runs if Job 1 builds successfully. Its task is to launch the ML environment built from the Dockerfile. I have also set up a remote script on it, which is used later.
JOB 3: Model_train: This job runs when Job 2 builds successfully. Its task is to train the model and compare the resulting accuracy with the desired accuracy, which is 95% in my case. If the model falls short of that accuracy, the pipeline moves to Job 5 to improve it; otherwise a success mail is sent.
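The decision Job 3 makes can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name next_job is hypothetical, accuracy is assumed to be a fraction between 0 and 1, and the returned strings simply reuse the job names from this post. The real logic lives in the repository's scripts.

```python
# Minimal sketch (assumption): pick the follow-up job from the trained accuracy.
DESIRED_ACCURACY = 0.95  # the 95% target mentioned in this post


def next_job(accuracy: float, desired: float = DESIRED_ACCURACY) -> str:
    """Return the name of the job that should run after training."""
    if not 0.0 <= accuracy <= 1.0:
        raise ValueError("accuracy must be a fraction between 0 and 1")
    # Target reached: send the success mail. Otherwise: try to improve the model.
    return "success_mail" if accuracy >= desired else "improve_accuracy"


if __name__ == "__main__":
    print(next_job(0.97))
    print(next_job(0.91))
```

An accuracy exactly equal to the target counts as success here; that boundary choice is also an assumption.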
Note: I have used remote scripts to move between jobs. If Job 3 fails, Job 4 runs and sends a failure message for model_train.py. To run a job after another job has failed, you have to install the Downstream-Ext plugin in Jenkins.
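For readers who have not used remote triggers before, here is a hedged sketch of how such a call could be built from Python. It assumes the job has "Trigger builds remotely" enabled with an authentication token; the helper names, the base URL, and the token value are placeholders, not values from this project. Depending on how Jenkins security is configured, the request may also need user credentials.

```python
from urllib.parse import quote, urlencode
from urllib.request import urlopen


def remote_trigger_url(base_url: str, job_name: str, token: str) -> str:
    """Build the Jenkins remote-trigger URL for a job (hypothetical helper)."""
    base = base_url.rstrip("/")
    return f"{base}/job/{quote(job_name)}/build?{urlencode({'token': token})}"


def trigger(url: str) -> int:
    """Send the trigger request and return the HTTP status code."""
    with urlopen(url, timeout=10) as response:
        return response.status


if __name__ == "__main__":
    # Placeholder values for illustration only; nothing is sent here.
    print(remote_trigger_url("http://localhost:8080", "improve_accuracy", "TOKEN"))
```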
JOB 4: model_train_failure_mail: Sends a mail to the developer if Job 3 fails.
NOTE: For sending mail I have used an SMTP server. Most importantly, to send e-mail this way you have to lower the security of the sender's e-mail account by allowing sign-in from less secure apps.
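A minimal sketch of such a mail script, using only Python's standard library, might look like the following. The function names, addresses, host, and port are assumptions made for illustration; the actual mail scripts are the ones in the repository.

```python
import smtplib
from email.message import EmailMessage


def build_mail(sender: str, recipient: str, subject: str, body: str) -> EmailMessage:
    """Assemble a plain-text e-mail message."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send_mail(msg: EmailMessage, host: str, port: int, user: str, password: str) -> None:
    """Send the message over SMTP with STARTTLS (host and port are assumptions)."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()              # upgrade the connection to TLS
        server.login(user, password)   # sender account credentials
        server.send_message(msg)
```

Keeping message construction separate from sending makes it easy to change the subject and body for the failure and success mails without touching the SMTP part.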
JOB 5: improve_accuracy: It is triggered to improve the accuracy of the model until the desired accuracy is reached. It also runs Job 6, which sends a failure mail to the developers if accuracy_improve.py fails to run because of an error. A remote script has been created in its build trigger so that this job can be accessed in Job 4.
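The "retrain until the target is reached" idea can be sketched as a simple loop. This is not the content of accuracy_improve.py; the train callback, the round counter, and the max_rounds cap are all assumptions added here so the loop cannot run forever.

```python
from typing import Callable, Optional, Tuple


def improve_until(
    train: Callable[[int], float],
    desired: float = 0.95,
    max_rounds: int = 5,  # assumption: a safety cap on retraining attempts
) -> Tuple[Optional[int], float]:
    """Retrain round by round until the desired accuracy is reached.

    train(round_no) is a hypothetical callback that retrains the model with
    adjusted settings (for example more epochs) and returns its accuracy.
    Returns (round that reached the target, accuracy), or (None, best accuracy).
    """
    best = 0.0
    for round_no in range(1, max_rounds + 1):
        accuracy = train(round_no)
        best = max(best, accuracy)
        if accuracy >= desired:
            return round_no, accuracy
    return None, best
```

Returning None when the cap is hit gives the calling job a clear signal to send a failure mail instead of a success mail.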
JOB 6: accuracy_improve_failure: Sends a mail to the developers if Job 5 fails.
JOB 7: success_mail: If all goes well and the model achieves the desired accuracy, this job sends a success mail to the client. A remote script has been created in its build trigger so that this job can be accessed in Job 4.
JOB 8: monitor-ml_env: This job monitors the ml_env container. If the container fails for any reason, it triggers Job 1 again using a remote script.
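One way to check whether the container is still up is to ask Docker for its state. The sketch below assumes the container is named ml_env and that the docker CLI is available on the Jenkins host; the function names and the injectable run parameter are hypothetical and only there to make the check easy to test.

```python
import subprocess
from typing import Callable, Sequence


def _run(cmd: Sequence[str]) -> str:
    """Run a command and return its standard output as text."""
    result = subprocess.run(cmd, capture_output=True, text=True, check=False)
    return result.stdout


def container_running(name: str, run: Callable[[Sequence[str]], str] = _run) -> bool:
    """Return True if Docker reports the named container as running."""
    out = run(["docker", "inspect", "-f", "{{.State.Running}}", name])
    # A missing container yields no "true" on stdout, so it counts as not running.
    return out.strip() == "true"


if __name__ == "__main__":
    # Fake runner for illustration, so no Docker installation is needed here.
    print(container_running("ml_env", run=lambda cmd: "true\n"))
```

When the check returns False, the monitoring job would then fire its remote script to restart the pipeline.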
That's all for this project. To view the project more clearly and have more control over the jobs, I have used the Build Pipeline plugin, which you can install from within Jenkins.
NOTE: All the Python scripts and model files are in my GitHub repository, and anyone can edit them to suit their requirements, especially the mail-sending files.
CONCLUSION: This project integrates Docker, Jenkins, and ML to save time and human effort on the basic tasks configured here, so the developer can focus more on the coding part. It is a good example of a fully automated multi-tier architecture.
Github Link: https://github.com/Apeksh742/ML_with_devops