Overview of Solution Architecture for Building ML Platform

Feature storage

Data from the data warehouse is processed and stored in a feature repository, where data scientists can later use the extracted features to build machine learning models.
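As a rough illustration of the idea, the sketch below models a feature repository as an in-memory store keyed by entity ID. The class `FeatureStore`, the function `extract_features`, and the feature names are hypothetical and are not part of any specific product; a real platform would back this with a persistent store.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class FeatureStore:
    """Hypothetical in-memory feature repository keyed by entity ID."""
    _rows: Dict[str, Dict[str, float]] = field(default_factory=dict)

    def put(self, entity_id: str, features: Dict[str, float]) -> None:
        # Merge new feature values into whatever is already stored.
        self._rows.setdefault(entity_id, {}).update(features)

    def get(self, entity_id: str, names: List[str]) -> List[float]:
        # Return the requested features in the order asked for.
        row = self._rows[entity_id]
        return [row[name] for name in names]


def extract_features(order_amounts: List[float]) -> Dict[str, float]:
    # Illustrative transformation from raw warehouse rows to features.
    return {
        "order_count": float(len(order_amounts)),
        "avg_order_value": sum(order_amounts) / len(order_amounts),
    }


store = FeatureStore()
store.put("user_42", extract_features([20.0, 30.0, 40.0]))

# Features are computed once and can be read back by any later model.
training_row = store.get("user_42", ["order_count", "avg_order_value"])
```

Because extraction happens once at write time, every model that needs these features reads the same stored values instead of recomputing them.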

Model Building

Model building services consist of Jupyter notebooks, which make model building and data visualization easier.

Model Training

Models are trained in a distributed manner on specialized hardware such as TPUs for faster training. Training runs on multiple nodes simultaneously over big data.
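One common way to train on multiple nodes simultaneously is synchronous data parallelism: each node computes a gradient on its own shard of the data, and the gradients are averaged before the shared weights are updated. The sketch below simulates this in a single process for a one-parameter linear model. It is an assumption about how such training could work, not a description of this platform's internals, and the names `shard_gradient` and `train` are hypothetical.

```python
from typing import List, Tuple

Example = Tuple[float, float]  # (x, y)


def shard_gradient(w: float, shard: List[Example]) -> float:
    # Gradient of mean squared error for the model y = w * x on one shard.
    return sum(2.0 * (w * x - y) * x for x, y in shard) / len(shard)


def train(shards: List[List[Example]], steps: int = 100, lr: float = 0.01) -> float:
    w = 0.0
    for _ in range(steps):
        # On a real cluster each worker computes its gradient in parallel.
        grads = [shard_gradient(w, shard) for shard in shards]
        # Synchronous step: average the workers' gradients, then update.
        w -= lr * sum(grads) / len(grads)
    return w


# Synthetic data generated from y = 3x, split across two simulated workers.
data = [(float(x), 3.0 * x) for x in range(1, 9)]
shards = [data[0:4], data[4:8]]
w = train(shards)
```

With equally sized shards, averaging the per-shard gradients gives the same update as computing the gradient over the full dataset, which is why the work can be split across nodes without changing the result.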

Model Versioning

A solution generally involves multiple machine learning models, so the models are stored and versioned in a model repository.
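A minimal sketch of what versioned model storage could look like is shown below. `ModelRegistry` and its methods are hypothetical names chosen for illustration, and the "model" here is just a dictionary standing in for a serialized model artifact.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ModelRegistry:
    """Hypothetical model repository that assigns increasing version numbers."""
    _versions: Dict[str, List[Any]] = field(default_factory=dict)

    def register(self, name: str, model: Any) -> int:
        # Append the model and return its version (versions start at 1).
        versions = self._versions.setdefault(name, [])
        versions.append(model)
        return len(versions)

    def load(self, name: str, version: int) -> Any:
        # Fetch a specific, previously registered version.
        return self._versions[name][version - 1]

    def latest(self, name: str) -> int:
        # Highest version number registered under this name.
        return len(self._versions[name])


registry = ModelRegistry()
v1 = registry.register("churn", {"weights": [0.1]})
v2 = registry.register("churn", {"weights": [0.2]})
```

Keeping every version retrievable by number makes it possible to compare models side by side or roll back to an earlier one.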

Model Deployment

The deployment service deploys the built machine learning models and makes them available in different regions.
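The bookkeeping behind regional deployment can be sketched as a mapping from each region to the model version it currently serves. The class `DeploymentService` and the region names below are illustrative assumptions; an actual rollout would also involve building containers and updating the serving infrastructure.

```python
from typing import Dict, Iterable, Optional


class DeploymentService:
    """Hypothetical tracker of which model version each region serves."""

    def __init__(self, regions: Iterable[str]) -> None:
        self.regions = list(regions)
        # region -> {model name -> active version}
        self.active: Dict[str, Dict[str, int]] = {}

    def deploy(self, name: str, version: int,
               regions: Optional[Iterable[str]] = None) -> None:
        # Deploy to the given regions, or to every known region by default.
        targets = self.regions if regions is None else list(regions)
        for region in targets:
            if region not in self.regions:
                raise ValueError(f"unknown region: {region}")
            self.active.setdefault(region, {})[name] = version

    def serving(self, name: str) -> Dict[str, int]:
        # Report the active version of a model in each region that has it.
        return {
            region: versions[name]
            for region, versions in self.active.items()
            if name in versions
        }


service = DeploymentService(["us-central1", "europe-west1"])
service.deploy("churn", 1)                           # all regions
service.deploy("churn", 2, regions=["us-central1"])  # staged rollout
status = service.serving("churn")
```

Deploying a new version to a single region first, as in the last call, is one way to limit the impact of a bad model before a wider rollout.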

Model Validation

Once deployed, the models are put into production, where they are either made available to end users directly or compared through A/B testing. Each model is validated by its measured impact, and its performance can be tracked on a monitoring dashboard. Once the best model is selected, it is made available across all regions.
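The A/B testing step can be sketched as two pieces: assigning each user to a model variant deterministically, and then picking the variant with the best observed outcome. The functions `assign_variant` and `best_variant`, the `salt` value, and the sample outcomes are all hypothetical. The comparison here is a plain success rate; a production setup would normally add a statistical significance check before declaring a winner.

```python
import hashlib
from typing import Dict, List, Sequence


def assign_variant(user_id: str, variants: Sequence[str],
                   salt: str = "experiment-1") -> str:
    # Hash the user ID so the same user always sees the same variant.
    digest = hashlib.sha256(f"{salt}:{user_id}".encode("utf-8")).hexdigest()
    return variants[int(digest, 16) % len(variants)]


def best_variant(outcomes: Dict[str, List[int]]) -> str:
    # outcomes maps variant -> list of 0/1 results (1 = desired outcome).
    rates = {
        variant: sum(results) / len(results)
        for variant, results in outcomes.items()
        if results  # skip variants with no traffic yet
    }
    return max(rates, key=rates.get)


variants = ["model_a", "model_b"]
chosen = assign_variant("user_42", variants)

# Illustrative, made-up outcomes collected from production traffic.
winner = best_variant({
    "model_a": [1, 0, 0, 1],
    "model_b": [1, 1, 0, 1],
})
```

Hashing rather than random sampling keeps each user's experience stable across requests, which keeps the measured impact of each model cleaner.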

Technology stack:

  • Model building: Jupyter
  • Machine learning: TensorFlow, Keras, scikit-learn
  • Model training (distributed or standalone): Google Cloud TPU, Cloud Machine Learning Engine
  • Data warehouse: BigQuery
  • Data pipeline: Cloud Dataflow
  • Data visualization: Google Data Studio
  • Model versioning and serving: TensorFlow Serving
  • Deployment: Kubernetes, Docker

Impacts of Solution Architecture for Building ML Platform

  • Enabled more users to create machine learning based products
  • Reduced time and effort
  • Enabled easier model evaluations
  • Increased insight into machine learning models in production
  • Standardized environments for machine learning model development
  • Real-time view of model performance
  • Better business decisions based on deep insights from the user data
  • Reduced development time
  • Features are extracted once and reused, with no need to extract them again for each model