MLOps Processes and Principles - The Complete Guide

Navdeep Singh Gill | 03 July 2023

MLOps Processes and its Implementation | Complete Overview

Introduction to ML Project Life Cycle

MLOps advocates automation and monitoring at every step of the ML life cycle. Machine learning project development is iterative: over the life of a model, we keep returning to each of these steps (except scoping) to improve the efficiency of the process.

  • For instance, we improve the data when new data comes in, or we engineer new features out of the existing data.
  • We iterate through the modeling process according to the model's performance in production.
  • Accordingly, the deployed model gets replaced with the best model developed during iteration.
  • This cycle continues with each iteration, but one should follow some best practices while iterating through the process. We will talk about these here.

MLOps Process for Continuous Delivery

Developing a machine learning model and deploying it can be fast and cheap, but maintaining it over time is difficult. Any team developing ML solutions must follow best practices to get the best out of its machine learning models. Doing so helps avoid “machine learning technical debt.”

The best practices to follow while developing ML solutions:

Data Validation

In an ML system, data is the most crucial part. Data that is not validated correctly may cause various issues in the model, so it is necessary to validate the input data that is fed to the pipeline. Otherwise, as the data science saying goes: garbage in, garbage out. Data must therefore be treated as a top priority in the ML system. It should be continuously monitored and validated at every execution of the ML pipeline.
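As a rough illustration of validating input data at every pipeline execution, the sketch below checks each record against a small schema before it reaches the model. The schema, the column names, and the helper functions are hypothetical and chosen only for this example; a real pipeline would likely use a dedicated validation tool.

```python
from numbers import Real

# Hypothetical schema: column name -> (expected type, min, max); None means "no bound".
SCHEMA = {
    "age": (Real, 0, 120),
    "income": (Real, 0, None),
    "country": (str, None, None),
}

def validate_row(row, schema=SCHEMA):
    """Return a list of human-readable problems found in one input record."""
    problems = []
    for column, (expected_type, low, high) in schema.items():
        if column not in row or row[column] is None:
            problems.append(f"{column}: missing value")
            continue
        value = row[column]
        # bool is a subclass of int, so reject it explicitly.
        if not isinstance(value, expected_type) or isinstance(value, bool):
            problems.append(f"{column}: expected {expected_type.__name__}")
            continue
        if low is not None and value < low:
            problems.append(f"{column}: {value} is below {low}")
        if high is not None and value > high:
            problems.append(f"{column}: {value} is above {high}")
    return problems

def validate_batch(rows, schema=SCHEMA):
    """Split a batch into valid rows and (index, problems) pairs for rejected rows."""
    valid, rejected = [], []
    for index, row in enumerate(rows):
        problems = validate_row(row, schema)
        if problems:
            rejected.append((index, problems))
        else:
            valid.append(row)
    return valid, rejected
```

Rejected rows are returned together with the reasons rather than silently dropped, so the failures themselves can be monitored over time.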

Experiment and track experiments

Machine learning is all about experimentation, and getting the best accuracy requires running experiments. These may involve trying out different combinations of code, preprocessing, training and evaluation methods, data, and hyperparameter settings. Each unique combination produces different metrics, which you need to track so that you can later compare experiments and see which combination performs better.
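The idea of recording each combination with its metrics can be sketched with a tiny in-memory tracker. `ExperimentLog` is a hypothetical helper written for this example, not the API of any tracking tool, and the parameter names and metric values are made up.

```python
import json

class ExperimentLog:
    """Minimal in-memory experiment tracker (hypothetical helper, not a library API)."""

    def __init__(self):
        self.runs = []

    def log_run(self, name, params, metrics):
        """Record one combination of settings together with its resulting metrics."""
        self.runs.append({"name": name, "params": dict(params), "metrics": dict(metrics)})

    def best_run(self, metric, higher_is_better=True):
        """Return the run with the best value for the given metric."""
        candidates = [run for run in self.runs if metric in run["metrics"]]
        if not candidates:
            raise ValueError(f"no run reports metric {metric!r}")
        chooser = max if higher_is_better else min
        return chooser(candidates, key=lambda run: run["metrics"][metric])

    def to_json(self):
        """Serialize all runs so they can be written to a file and compared later."""
        return json.dumps(self.runs, indent=2, sort_keys=True)

# Illustrative values only.
log = ExperimentLog()
log.log_run("baseline", {"lr": 0.1, "max_depth": 3}, {"accuracy": 0.81, "rmse": 0.42})
log.log_run("deeper", {"lr": 0.1, "max_depth": 6}, {"accuracy": 0.86, "rmse": 0.39})
best = log.best_run("accuracy")
```

Keeping the parameters next to the metrics is the important part: a metric without the settings that produced it cannot be compared or reproduced.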

Model validation across segments

The performance of machine learning models can degrade over time, and they need to be retrained to maintain good performance. Before a model is deployed into production, it needs to be validated. Model validation includes producing metrics (e.g., accuracy, precision, RMSE) on the test datasets to check that the model's performance meets the business objectives.

The model should also be validated on various data segments to ensure that it meets requirements on each of them. Otherwise, the model can be biased toward parts of the data; there have been several incidents in which a biased model performed inadequately for some users.
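Validation across segments can be sketched as computing the same metric once per segment instead of once overall. The function names and the idea of a single minimum accuracy threshold are assumptions made for this example.

```python
from collections import defaultdict

def accuracy_by_segment(y_true, y_pred, segments):
    """Return {segment: accuracy} so weak segments are visible, not averaged away."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for truth, prediction, segment in zip(y_true, y_pred, segments):
        total[segment] += 1
        correct[segment] += int(truth == prediction)
    return {segment: correct[segment] / total[segment] for segment in total}

def failing_segments(per_segment, minimum):
    """List the segments whose accuracy falls below the required minimum."""
    return sorted(s for s, acc in per_segment.items() if acc < minimum)
```

A model can look acceptable on the overall test set while one segment performs far worse, which is exactly the situation this per-segment view is meant to expose.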


Reproducibility

In machine learning, reproducibility means that every phase should produce the same results given the same input. This applies to data preprocessing, model training, and model deployment. It is challenging and requires tracking model artifacts such as code, data, algorithms, packages, and environment configuration.
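Two small habits that support reproducibility are sketched below: fixing random seeds, and fingerprinting the exact configuration and data a run used. Both helper functions are hypothetical, and the sketch covers only a small part of the artifacts listed above (it does not capture package versions or environment configuration).

```python
import hashlib
import json
import random

def fingerprint(config, data_rows):
    """Hash the configuration and data together to identify a run's exact inputs."""
    payload = json.dumps({"config": config, "data": data_rows}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def shuffled_split(data_rows, seed, train_fraction=0.8):
    """Shuffle with a private, seeded generator so the split is repeatable."""
    rng = random.Random(seed)
    rows = list(data_rows)
    rng.shuffle(rows)
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]
```

Two runs with the same fingerprint and the same seed start from identical inputs, so any difference in results has to come from somewhere else, such as the code or the environment.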

Monitoring predictive service performance

The practices mentioned above can help you deliver a robust ML model. In operations, different metrics need to be measured to evaluate the performance of the deployed model against business objectives. Users need good accuracy from the model, but they also need the service to respond as fast as possible and to be available all the time. Operational metrics to monitor include:

  • Latency: the response time of the service, measured in milliseconds.
  • Scalability: how much traffic the service can handle at the expected latency.
  • Service update: how much downtime is introduced while the service is updated.

For instance, a delay in any service can impact the user and cause a loss to the business.
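Latency monitoring can be sketched as timing each prediction call and summarizing the samples with percentiles. The helper names, the nearest-rank percentile method, and the choice of the 95th percentile as the value checked against a budget are assumptions made for this example.

```python
import time

def timed_call(function, *args):
    """Run function(*args) and return (result, latency in milliseconds)."""
    start = time.perf_counter()
    result = function(*args)
    return result, (time.perf_counter() - start) * 1000.0

def percentile(values, q):
    """Nearest-rank percentile for an integer q between 1 and 100."""
    if not values:
        raise ValueError("no latency samples recorded")
    ordered = sorted(values)
    # Integer ceiling of q * n / 100, avoiding floating-point rounding surprises.
    rank = max(1, -(-q * len(ordered) // 100))
    return ordered[rank - 1]

def latency_report(latencies_ms, budget_ms):
    """Summarize latencies and flag whether the 95th percentile meets the budget."""
    p95 = percentile(latencies_ms, 95)
    return {"p50": percentile(latencies_ms, 50), "p95": p95, "within_budget": p95 <= budget_ms}
```

Percentiles are used instead of the mean because a few very slow requests can hurt users while barely moving the average.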

Automate the process

Managing machine learning tasks manually becomes difficult and time-consuming once models get into production. Data preprocessing, model training and retraining, hyperparameter tuning, and model deployment can all be automated. If data drift or model drift occurs, or the performance of the model degrades, the model can be retrained automatically; the retraining just needs to be triggered. After the process is automated, the margin for error shrinks and more models can be deployed. An ML pipeline can be used to automate the process so that the model follows continuous training and continuous delivery.
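A retraining trigger of the kind described above could look roughly like the following. The drift signal here is a deliberately crude one (relative shift of a feature's mean), and the thresholds and function names are hypothetical; production systems typically use more robust statistical tests.

```python
def mean(values):
    return sum(values) / len(values)

def should_retrain(reference_feature, live_feature, live_accuracy,
                   max_mean_shift=0.2, min_accuracy=0.8):
    """Return (decision, reasons) based on a simple drift check and an accuracy floor."""
    reasons = []
    ref_mean = mean(reference_feature)
    # Relative shift of the live mean against the training-time mean (crude drift signal).
    shift = abs(mean(live_feature) - ref_mean) / (abs(ref_mean) or 1.0)
    if shift > max_mean_shift:
        reasons.append(f"data drift: mean shifted by {shift:.0%}")
    if live_accuracy < min_accuracy:
        reasons.append(f"performance: accuracy {live_accuracy:.2f} below {min_accuracy:.2f}")
    return bool(reasons), reasons
```

A scheduler or pipeline step can call such a check periodically and start the training pipeline only when the decision is true, which keeps retraining automatic but not wasteful.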


MLOps Best Practices Organizations should follow

The best practices for MLOps are described below:

Best Practices for Scope Management

Scoping means translating the project goals into machine learning development goals. For instance, the business team might ask us to develop a conversational AI agent for our website that answers users' FAQs. The development of an FAQ-answering agent is the business goal. Once this is clear, we need to define our development goal: a question-answering algorithm based on the existing FAQs.

Best Practices to follow while scoping

  • Understanding the Business Problem

This step is crucial even though it seems simple: without a proper understanding of the business problem, all development effort may be in vain. The development team needs to be on the same page as the business team (or the team handing out the problem). Understand the problem properly and clearly, and get that understanding verified with the stakeholders. Note: Do not proceed with the development plan until the problem is clear.

  • Brainstorming within the team

Once the problem is defined, the team should brainstorm and collect all ideas for solutions. The goal here is to think outside the box and explore every idea suggested by the team members.

  • Research About the problem

At this stage, we have clarity about the problem and ideas from the team. Now research the problem thoroughly on your own. The research should be solution-oriented, keeping in mind that we need to come up with a roadmap and an approach doc for the solution (both are elaborated below).

  • Define the Development plan concretely, aka “Roadmap.”

Once the problem is defined, one needs to come up with a roadmap, i.e., a visual representation of the flow for developing the solution to the problem. The roadmap should contain the following things:

  1. Proposed processes and steps to deliver the solution.
  2. Estimated time for each process, i.e., Timeline.
  3. Special remarks that should accompany each process. For example, some dependencies may need to be fulfilled, such as a data dependency on the data engineering team before the EDA process in the data preparation steps.
  4. Once the roadmap is developed, get it verified by the concerned person (in your case, it might be a Subcoach, Coach, etc.) and collect their inputs.
  5. The template can be found here.
  • Prepare Approach Doc
  1. Once the roadmap is clear, one needs to prepare an Approach Doc. This document describes the approach you will use to solve the business problem you are given. For example, if the business problem involves classification, the Approach Doc should state the initial algorithm(s) you are going to select, along with the implementation flow.
  2. The purpose of the Approach Doc is to give stakeholders visibility into our approach so that we can gain their confidence in the development process we are going to follow.
  3. An example template of the Approach Doc can be found here. Once the Approach Doc is prepared, get it verified and collect inputs from the stakeholders.

Best Practices for Successful Data Processing

Here, we will discuss the best practices while processing the data before the modeling stage.

Types of Data Problems

The data types for any machine learning problem can be divided into the below categories.

The above figure shows the datasets we can see while developing the ML solution for a business problem. Let’s see the best practices while handling both types.

Best Practices for Defining the Dataset for Structured Data

Here we will see the best practices for defining the dataset.

  • Information on each column: If the dataset is in tabular format and column descriptions are not provided, maximum effort should go into obtaining information on each column to remove ambiguity from the dataset. If the data is unstructured, metadata (information on each field of the dataset) should be requested from the team providing the dataset. It is solely the responsibility of the development team to obtain this information if it is not present.
  • A clear distinction between features and labels: The first important step in data processing is defining the dataset, i.e., for an ML problem, we should know which features (X) need to be considered and what the label (Y) should be. This is a prerequisite: if it is not clear, do not proceed to the other steps. For unstructured data, the labels must also be defined. For example, in an image classification problem, the images are the features, and the labels must be provided.
  • Consistency in labeling format for unstructured data: Unstructured data (text, image, audio) often has to be labeled manually, either by us or by labelers (anyone assigned the task of labeling the dataset). If more than one labeler works on the dataset, we must ensure a consistent labeling strategy. For instance, consider labeling images of smartphones as defective or not. One labeler has labeled a case as shown in figure 1, and another labeler has labeled a similar case as shown in figure 2. This inconsistency in labeling must be avoided by providing clear instructions to the labelers.

Best Practices while preprocessing the dataset

Remember this: “Always keep track of the dataset,” aka data versioning. Let's dive into the best practices for it.

  1. Use data versioning tools: To record which dataset version each experiment used, we should use data versioning tools like DVC.
  2. Text files for data versioning: If, for some reason, data versioning tools can't be used, use text files or Google Sheets to maintain records of the datasets used in the experiments. Maintaining these versioning records is the responsibility of the developer, who needs to be able to reproduce them when asked.
  3. Tracking and reproducible experiments: The main purpose of data versioning is that, when required, one can easily reproduce the experiments conducted with any version of the dataset. This is not possible if the dataset is never versioned.
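For the fallback case where a dedicated tool is not available, the text-file approach can be sketched as a manifest that links each experiment to a content hash of the dataset it used. The function names and the manifest layout are hypothetical; this illustrates the record-keeping idea and is not a replacement for a tool like DVC.

```python
import hashlib
import json

def dataset_version(rows):
    """Return a short content hash; any change to the rows yields a new version id."""
    digest = hashlib.sha256()
    for row in rows:
        digest.update(json.dumps(row, sort_keys=True).encode("utf-8"))
        digest.update(b"\n")
    return digest.hexdigest()[:12]

def record_experiment(manifest, experiment, rows, note=""):
    """Append one JSON line to the manifest linking an experiment to its dataset version."""
    entry = {"experiment": experiment, "dataset_version": dataset_version(rows), "note": note}
    manifest.append(json.dumps(entry, sort_keys=True))
    return entry
```

Here `manifest` is a plain list of lines that can be written to a text file. Because the version id is derived from the content, two experiments with the same id are known to have used identical data.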

Consistency in Data pipelines

  1. Make data pipelines consistent across development, testing, and production: It is tempting for ML developers to kick-start the development process without paying attention to data pipelines. As a result, for instance, the data preprocessing script used to train the model in the development stage can't be used in production or even during scoring. Always keep data pipelines consistent, which means you can use one pipeline everywhere for data processing.
  2. Fault tolerance of the production pipeline: Give these pipelines the ability to handle any exceptions that may occur while the model is deployed in production. For instance, one needs to handle the scenario in which one or more values are missing from the inference data (the data in production).
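Both points can be illustrated with a single preprocessing function that is shared by training and inference and that tolerates missing or malformed values. The feature names and default values below are hypothetical; in practice the defaults would be derived from the training data (for example, medians).

```python
import math

# Hypothetical fallback values, e.g. medians computed on the training data.
DEFAULTS = {"age": 35.0, "income": 42000.0}

def prepare_features(record, defaults=DEFAULTS):
    """Return (features, warnings); missing or invalid values fall back to defaults."""
    features, warnings = [], []
    for name, fallback in defaults.items():
        raw = record.get(name)
        try:
            value = float(raw)
            if math.isnan(value) or math.isinf(value):
                raise ValueError("not a finite number")
        except (TypeError, ValueError):
            warnings.append(f"{name}: invalid value {raw!r}, using default {fallback}")
            value = fallback
        features.append(value)
    return features, warnings
```

Calling the same `prepare_features` function in development, testing, and production keeps the pipelines consistent, and the returned warnings can be logged so that frequent fallbacks are noticed rather than hidden.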

Other Miscellaneous points to keep in mind for the data processing stage

  • Balanced train/val/test: The train/val/test sets should each represent the dataset. Let us understand this with an example: consider a dataset with 100 examples of smartphones, of which 30 are positive (defective) and the rest are negative:

Row 2 of the table shows how a split can be non-representative of the actual dataset, since every set should contain 30% samples from the positive class. Row 3 shows the correct way.
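A representative split for the 100-example dataset above can be sketched as a stratified split, which divides each class separately so that every set keeps the 30% positive share. The 60/20/20 split fractions and the function name are assumptions made for this example.

```python
import random
from collections import defaultdict

def stratified_split(labels, fractions=(0.6, 0.2, 0.2), seed=0):
    """Return index lists (train, val, test) that keep each class's share in every set."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    train, val, test = [], [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        n_train = round(len(indices) * fractions[0])
        n_val = round(len(indices) * fractions[1])
        train += indices[:n_train]
        val += indices[n_train:n_train + n_val]
        test += indices[n_train + n_val:]  # the remainder goes to the test set
    return train, val, test

# 100 smartphones: 30 positive (defective), 70 negative.
labels = [1] * 30 + [0] * 70
train, val, test = stratified_split(labels)
positives = [sum(labels[i] for i in part) for part in (train, val, test)]
```

With these assumed fractions, the 30 positives are divided 18/6/6 across sets of 60/20/20 examples, so each set is 30% positive.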

  • Prevent data leakage: Data leakage (or leaking) occurs when your training data contains information about the target, but similar data is not available when the model is used for prediction. This results in excellent performance on the training set (and potentially even on the validation data) but poor performance in production. To put it another way, leakage makes a model appear accurate until you start making decisions with it. See more here.
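One way to look for leakage before training is sketched below: flag features that will not exist at prediction time, and features that map one-to-one onto the target in the training data. This is a heuristic written for illustration; the function name, the column names in the usage, and the rule itself are assumptions, and a flagged column calls for manual review rather than automatic removal.

```python
def leakage_suspects(rows, target, available_at_prediction):
    """Flag feature columns that are unavailable at prediction time or mirror the target."""
    suspects = {}
    columns = [column for column in rows[0] if column != target]
    for column in columns:
        if column not in available_at_prediction:
            suspects[column] = "not available at prediction time"
            continue
        # Check whether every value of this feature maps to exactly one target value.
        mapping = {}
        consistent = True
        for row in rows:
            if mapping.setdefault(row[column], row[target]) != row[target]:
                consistent = False
                break
        repeats_values = len(mapping) < len(rows)  # ignore ID-like columns
        if consistent and repeats_values and len(set(mapping.values())) > 1:
            suspects[column] = "perfectly determines the target in training data"
    return suspects
```

For example, with a hypothetical churn dataset, a `days_since_cancel` column would be flagged as unavailable at prediction time, and a `refund_issued` column that always matches the `churned` label would be flagged as mirroring the target.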

Best practices for Data Modelling

The best practices of data modeling are described below:

Define Baseline and Benchmark the model

Once we reach the modeling stage, we need to set up a baseline against which to compare the performance of our model across experiments.

  1. Human-Level Performance (HLP) as a baseline: For unstructured data like images, human performance can be used to set the baseline accuracy for the model (if the data is small enough and you have labelers). For example, for the computer vision problem of detecting defects in smartphone images, a human can detect the defects in the smartphone screens, and the human's accuracy can then be compared with the model's.
  2. Quick implementation: The other commonly followed option is to quickly implement a basic algorithm and treat it as the baseline. Either way, a baseline is necessary.
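As one possible "quick implementation", the sketch below uses a majority-class predictor as the baseline. The choice of this particular baseline and the helper names are assumptions for illustration; any simple, fast model can play the same role.

```python
from collections import Counter

def majority_baseline(train_labels):
    """Return a predictor that always outputs the most common training label."""
    most_common_label = Counter(train_labels).most_common(1)[0][0]
    return lambda _features: most_common_label

def accuracy(predict, features, labels):
    """Fraction of examples for which predict(features) equals the label."""
    return sum(predict(x) == y for x, y in zip(features, labels)) / len(labels)

def beats_baseline(model_accuracy, baseline_accuracy, margin=0.0):
    """True only if the model improves on the baseline by more than the margin."""
    return model_accuracy > baseline_accuracy + margin
```

On a dataset that is 70% negative, this baseline already scores 0.7 accuracy, which shows why a model's accuracy means little until it is compared with a baseline.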

Model Versioning and Tracking

  1. Use model versioning tools: To record which model version each experiment used, we should use model versioning tools like MLflow.
  2. Text files for model versioning: If model versioning tools can't be used for some reason, use text files or Google Sheets to maintain records of the models used in the experiments.

Error Analysis once the Model is trained

Error analysis is the process of gaining visibility, once the model is trained, into where the model did not perform well. For example, a classification model might perform poorly on a particular class. Error analysis allows us to improve the model's performance and to audit it at every iteration. The process can be understood with the below diagram.

Let’s see the Best practices for the error analysis process.

  • Accuracy is not always the best metric; check out the confusion matrix:
    Always consider various evaluation metrics while evaluating the model's performance. The confusion matrix and classification report provide metrics such as precision, recall, and F1 score; consider these as well.
  • Brainstorm how things can go wrong with the model and test it:

- Performance on different subsets of the dataset (cross-validation).

- Performance on a rare class.

- Fairness and bias of the model (check out the fairness section).
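The point about accuracy and rare classes can be made concrete with a small binary-classification sketch that derives precision, recall, and F1 from confusion-matrix counts. The function names and the example data are made up for illustration.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for one class treated as positive."""
    counts = {"tp": 0, "fp": 0, "fn": 0, "tn": 0}
    for truth, prediction in zip(y_true, y_pred):
        if prediction == positive:
            counts["tp" if truth == positive else "fp"] += 1
        else:
            counts["fn" if truth == positive else "tn"] += 1
    return counts

def safe_ratio(numerator, denominator):
    """Divide, returning 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0

def classification_summary(y_true, y_pred, positive=1):
    """Return accuracy, precision, recall, and F1 for the positive class."""
    c = confusion_counts(y_true, y_pred, positive)
    precision = safe_ratio(c["tp"], c["tp"] + c["fp"])
    recall = safe_ratio(c["tp"], c["tp"] + c["fn"])
    return {
        "accuracy": safe_ratio(c["tp"] + c["tn"], sum(c.values())),
        "precision": precision,
        "recall": recall,
        "f1": safe_ratio(2 * precision * recall, precision + recall),
    }

# Rare-class example: 5 defective items out of 100, and a model that predicts
# "not defective" for every item.
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100
summary = classification_summary(y_true, y_pred)
```

In this example the accuracy is 0.95 while the recall on the defective class is 0.0, so the model never finds a single defect despite looking accurate.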


Use a Data-centric Approach not a Model-centric Approach

It is tempting for ML solution developers to use cutting-edge algorithms to solve the problem at hand. Still, a simple, explainable model trained on good data is generally preferable to a complex model trained on bad data.

Best practices for improving the dataset, i.e., following the data-centric approach:

  • Data augmentation for unstructured data: For unstructured data like image and audio data, augmentation is an excellent way to obtain more data, but keep these things in mind while performing it:
  1. Create more examples of the kind on which the algorithm showed poor performance in error analysis.
  2. If possible, check whether the baseline model performs well on the augmented dataset.
  • Feature engineering for structured data: It might not be possible to create new samples for structured data such as online user data, since new users cannot simply be added. For structured datasets, creating new features can be a great option to explore.
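Targeted augmentation, i.e., creating more examples only for the cases that error analysis found weak, can be sketched as follows. Images are represented as plain lists of pixel rows, a horizontal flip stands in for the augmentation, and the label names are hypothetical; the sketch assumes a flip does not change the label, which has to be checked for the task at hand.

```python
def horizontal_flip(image):
    """Mirror an image given as a list of pixel rows."""
    return [list(reversed(row)) for row in image]

def augment_weak_classes(samples, weak_labels):
    """Return the original (image, label) pairs plus flipped copies for weak labels only."""
    augmented = list(samples)
    for image, label in samples:
        if label in weak_labels:
            augmented.append((horizontal_flip(image), label))
    return augmented
```

Restricting augmentation to the weak labels keeps the extra data focused where the model needs it instead of inflating classes it already handles well.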

Developing Fair and Unbiased ML algorithms

This practice focuses on building fair and unbiased ML algorithms so that every end user served by our models in production has equal opportunities. This means users are not discriminated against based on race, sex, religion, socioeconomic status, or other categories. For example, a credit card approval application using an ML model at the backend may reject a person based on race if bias was not eliminated from the data. To avoid such unfair outcomes, follow the best practices regarding bias and fairness given below:

Analyze the data for biases: One should properly analyze the data to make sure there is no representational bias in the dataset. Representational bias means that one group of people has been left out for some reason, for example when the dataset used to train the models excludes darker skin tones. We have mentioned only representational bias here; other biases can be present in the ML workflow, and we need to reduce all of them. See the figure below and follow this link for more information. After following the above procedure, the model is ready to go to production. For deployment best practices, see the ModelOps best practices section.
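A first check for the representational bias described above is to measure each group's share of the dataset and, for a decision model, the rate of positive decisions per group. The function names, the minimum-share threshold, and the group labels are assumptions made for this sketch; it is a starting point for analysis, not a complete fairness audit.

```python
from collections import Counter, defaultdict

def group_shares(groups):
    """Return each group's share of the dataset."""
    counts = Counter(groups)
    return {group: count / len(groups) for group, count in counts.items()}

def underrepresented(groups, minimum_share, expected_groups=()):
    """List groups below the minimum share, including expected groups that are absent."""
    shares = group_shares(groups)
    for group in expected_groups:
        shares.setdefault(group, 0.0)
    return sorted(g for g, share in shares.items() if share < minimum_share)

def approval_rate_by_group(groups, decisions):
    """Return the fraction of positive decisions (1 = approved) for each group."""
    approved = defaultdict(int)
    total = defaultdict(int)
    for group, decision in zip(groups, decisions):
        total[group] += 1
        approved[group] += int(decision == 1)
    return {group: approved[group] / total[group] for group in total}
```

Passing `expected_groups` matters because a group that was left out of the dataset entirely would otherwise never show up in the counts.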


Now that the list of the most excellent MLOps tools is compiled, all you have to do is figure out how to put them to use in your setup. These tools make it easier to keep track of modifications and model performance, allowing us to focus on domain-specific tuning. The tools will continue to improve, with new functionality added to make life that much easier for data science teams handling the operational side of machine learning projects.