ML Design Process and Lifecycle Deployment | Notebook Use case

Dr. Jagreet Kaur Gill | 05 July 2023

Overview of Notebook Use Case

  • Requirement of Utopus: Whole pipeline (should cover all aspects such as data 
    ingestion, data visualization, and cleaning, feature engineering (as applicable), 
    model selection, model training, A/B testing, parameter tuning, and 
  • Training notebook
  • Testing scoring notebook
  • Scoring notebook
  • ML Lifecycles
  • ML Lifecycle with the tools
  • Model training
  • Model Publishing


ML-Specific Customer Example Criteria

  • Opportunity definition, including ROI methodology for the project:
  1. Power forecasting: wind power forecasting and solar power forecasting.
  2. Because this is a power forecasting problem, the threshold metric is nMAE (Normalised Mean Absolute Error).
  3. A penalty applies when nMAE exceeds 13%.
  4. The main goal is to develop and train a model that forecasts with nMAE below 13%.
  • A comprehensive articulation of the problem and why/how machine 
    learning can add value (i.e., automation efficiencies).
  1. As mentioned, the forecasting models must deliver an nMAE below 13%, 
    which is why machine learning has been used.
  • Demonstration of the ability and methodology to interact with 
    customer data engineers to gather and process all manners of input 
    data, from structured and queryable, to unstructured and streamed.
  • This includes discovery skills to help the customer determine if the 
    data are sufficient and relevant to solve the customers’ business 
  1. Two types of data have been used: weather data and measurement data.
  2. Weather data comes from different forecasting systems (such as ECMWF 
    and GFS). In total, weather data have been collected from these 
    systems: ECMWF, GFS, ICON, ICON-EU, ECMWF ENS Control, and GDPS15.
  • Identification of toolset, algorithms, and pre-trained models, if any:

Currently, ensemble modeling is used to train the model (to cater 
for all the weather sources); no pre-trained model is in use.

  • Model evaluation and performance criterion (minimal acceptable loss, 
    KPIs etc.) and its refresh strategy.
  1. nMAE is the main KPI used to evaluate performance.
  2. Different dashboards and notebooks are available to evaluate it.
  3. If the model's nMAE exceeds the required threshold, the model is re-trained; this is the refresh strategy.
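
As an illustration of the KPI above, nMAE can be sketched as the mean absolute error divided by a normalisation constant. The text does not say what the error is normalised by; the sketch below assumes installed capacity, a common choice in power forecasting. The function name `nmae_percent` and the sample values are hypothetical.

```python
from typing import Sequence


def nmae_percent(actual: Sequence[float], forecast: Sequence[float],
                 capacity: float) -> float:
    """Mean absolute error as a percentage of `capacity`.

    Normalising by installed capacity is an assumption, not something
    the article specifies.
    """
    if len(actual) != len(forecast) or len(actual) == 0:
        raise ValueError("actual and forecast must be non-empty and equally long")
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    mae = sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)
    return 100.0 * mae / capacity


# Hypothetical hourly power values (MW) for a 100 MW site.
actual = [40.0, 55.0, 60.0, 30.0]
forecast = [45.0, 50.0, 70.0, 20.0]

score = nmae_percent(actual, forecast, capacity=100.0)
meets_target = score < 13.0  # 13% target taken from the text
print(f"nMAE = {score:.2f}% (meets target: {meets_target})")
```

Here the absolute errors are 5, 5, 10, and 10 MW, so the mean absolute error is 7.5 MW and the nMAE is 7.5% of the assumed 100 MW capacity.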

Feature Engineering for Notebook

  • A dedicated notebook is used for feature engineering.
  • This notebook reports: the availability of data between two dates (which helps 
    identify gaps), outliers, and the correlation between features (to remove 
    bad-quality data).
  • After identifying these three things, we remove the insufficient-quality data and the periods that have gaps.
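
The three checks described above can be sketched with the Python standard library. The notebook itself is not shown in this article, so everything here is an assumption made for illustration: the function names, the sample values, the interquartile-range rule for outliers, and the Pearson coefficient for correlation.

```python
from datetime import datetime, timedelta
from math import sqrt
from statistics import quantiles


def find_gaps(timestamps, step):
    """Return (previous, next) pairs whose spacing is larger than `step`."""
    ordered = sorted(timestamps)
    return [(a, b) for a, b in zip(ordered, ordered[1:]) if b - a > step]


def iqr_outliers(values, k=1.5):
    """Return indices of values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _, q3 = quantiles(values, n=4)
    spread = q3 - q1
    low, high = q1 - k * spread, q3 + k * spread
    return [i for i, v in enumerate(values) if v < low or v > high]


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


# Hypothetical hourly samples: hour 3 is missing and one power value spikes.
start = datetime(2023, 1, 1)
times = [start + timedelta(hours=h) for h in (0, 1, 2, 4, 5, 6, 7, 8)]
wind_speed = [5.0, 5.5, 6.0, 6.5, 7.0, 7.5, 8.0, 8.5]
power = [10.0, 11.0, 12.0, 13.0, 14.0, 95.0, 16.0, 17.0]

gaps = find_gaps(times, step=timedelta(hours=1))
outlier_idx = iqr_outliers(power)
corr = pearson(wind_speed, power)

print("gaps:", gaps)
print("outlier indices:", outlier_idx)
print("correlation:", round(corr, 3))
```

Rows flagged by `find_gaps` or `iqr_outliers` would then be dropped before training, in line with the last step described above.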

ML Design Process - Metrics

  • Failure check: To find the cause of a failure, CloudWatch logs are 
    used (CloudWatch has been enabled).
  • Accuracy check: nMAE is the prime metric (KPI) used to check the accuracy of the models. Different dashboards and notebooks calculate it for specific models.

ML Design Process - Quality Control

  • Versioning and re-training pipelines are already in place to maintain the quality 
    of the forecast and models.
  • For versioning: a dedicated code repo and CI/CD pipeline are already in place.
  • For re-training: once nMAE rises beyond 13% (or beyond the 
    threshold value given by the customer), re-training can be done.
  • A dedicated team, the Model Operationalisation team, runs and maintains this 
    whole process; quality control of models is their main KRA.
  • Because Databricks (the tool used for training) does not support git actions for 
    entire repositories (only one file at a time), the workaround is to first 
    export our code from Databricks back to our local repo, and then perform the 
    desired git action from there.

Iterative Improvement of ML Models

  • As discussed above, a dedicated team, known as Model Operationalisation, 
    improves the models iteratively.
  • The objective of this team is to keep an eye on the quality of all models on the basis 
    of the defined KPI (nMAE).
  • If the quality starts dropping (which means nMAE is increasing), this team 
    re-initiates the process of re-training, testing, and deployment on staging, and 
    then replaces the model running on PROD.
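
The decision the team makes can be sketched as a small check over recent nMAE values. This is only an illustration under stated assumptions: the `patience` parameter (how many consecutive evaluations must breach the threshold) and the stage names are hypothetical and do not come from the team's actual pipeline.

```python
from typing import List, Sequence

# Stage order taken from the text; the identifiers themselves are hypothetical.
STAGES = ("re-train", "test", "deploy-staging", "replace-prod")


def should_retrain(recent_nmae: Sequence[float], threshold: float = 13.0,
                   patience: int = 1) -> bool:
    """True when the last `patience` nMAE values all exceed `threshold`."""
    if patience < 1:
        raise ValueError("patience must be at least 1")
    tail = list(recent_nmae)[-patience:]
    return len(tail) == patience and all(v > threshold for v in tail)


def plan_actions(recent_nmae: Sequence[float], **kwargs) -> List[str]:
    """Return the ordered stages to run, or an empty list if quality is fine."""
    return list(STAGES) if should_retrain(recent_nmae, **kwargs) else []


# Hypothetical daily nMAE values (percent) for one model.
history = [10.2, 11.8, 13.6, 14.1]
print(plan_actions(history, patience=2))
```

Requiring more than one consecutive breach is one way to avoid re-training on a single noisy day; with `patience=1` the check reduces to the plain threshold rule described in the text.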

Model Development Expertise

  • A dedicated team with the relevant expertise develops the models.
  • Combined or ensemble forecasting using several NWP-based (numerical weather 
    prediction) members is the current approach of leading renewable forecast vendors.
  • The same approach has been used to develop the model.
  • Reasons for using ensemble forecasting:
  1. Higher Accuracy: It is a well-proven fact that the combination of two or more NWPs 
    delivers better performance than any individual NWP.
  2. More Consistent Accuracy by Look-Ahead: Our current layout could mean an 
    accuracy drop between our short-term and day-ahead forecasts, since 
    the performance gap between GFS and ECMWF might be noticeable.
  3. Higher Reliability: The ensemble forecast is a “self-backed-up” process. One of the most common causes of forecast unavailability is missing or delayed NWP data. This dependence on the timeliness of external data can be efficiently reduced through the ensemble forecast: by combining different NWPs from different organizations, we ensure our forecast is issued even if any member is unavailable. In strict terms, we can continue to create forecasts unless all the ensemble members are missing simultaneously, which has a very low probability.
  4. Low Technical Debt: An intelligent combination of the different available data sources is a process that can be easily automated at the single-site level using algorithms with different levels of complexity (from simple weight optimization to sophisticated ML-based algorithms using additional inputs such as seasonality, lead time, weather regimes, etc.), with performance improvements to be expected even from the most straightforward approaches.
  5. Future applications for probabilistic forecasts.
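
The reliability argument above can be illustrated with the simplest form of combination, a weighted average that renormalises its weights over whichever members are available. This is a sketch under assumptions, not the production model: the weights, the member values, and the function name `ensemble_forecast` are hypothetical.

```python
from typing import Dict, Optional


def ensemble_forecast(member_forecasts: Dict[str, Optional[float]],
                      weights: Dict[str, float]) -> float:
    """Weighted average over the members that delivered a forecast.

    A missing member is represented by None; its weight is redistributed
    by renormalising over the remaining members.
    """
    available = {name: value for name, value in member_forecasts.items()
                 if value is not None and weights.get(name, 0.0) > 0.0}
    if not available:
        raise ValueError("all ensemble members are missing")
    total_weight = sum(weights[name] for name in available)
    weighted_sum = sum(weights[name] * value for name, value in available.items())
    return weighted_sum / total_weight


# Hypothetical weights and per-member power forecasts (MW).
weights = {"ECMWF": 0.5, "GFS": 0.3, "ICON": 0.2}

full = ensemble_forecast({"ECMWF": 40.0, "GFS": 50.0, "ICON": 45.0}, weights)
degraded = ensemble_forecast({"ECMWF": None, "GFS": 50.0, "ICON": 45.0}, weights)

print(f"all members: {full:.1f} MW, ECMWF missing: {degraded:.1f} MW")
```

A forecast is still issued when ECMWF is missing, and the function only fails when every member is unavailable at once, which mirrors the "self-backed-up" behaviour described in point 3.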

What are the points according to ML Lens?

The following points are organized according to the Machine Learning Lens:

Business Goal Identification

  • Steps in this phase:
  1. Understand business requirements.
  2. Form a business question.
  3. Review a project’s ML feasibility and data requirements.
  4. Evaluate the cost of data acquisition, training, inference, and wrong predictions.
  5. Review proven or published work in similar domains, if available.
  6. Determine key performance metrics, including acceptable errors.
  7. Define the machine learning task based on the business question.
  8. Identify critical, must-have features.
  9. Design small, focused POCs to validate all of the preceding.
  10. Evaluate if bringing in external data sources will improve model performance.
  11. Establish pathways to production.
  12. Consider new business processes that may come out of this 
  13. Align relevant stakeholders with this initiative.
  • The main goal is to develop a pipeline and train a model that can forecast with nMAE below 13% (or according to customer requirements).
  • It is a power forecasting problem, which is the main reason for using machine learning and deep learning.
  • For forecasting, two types of data are used: Weather Data from different sources and Measurement data.

ML Problem Identification

  • As mentioned above, this was a forecasting problem from day one, which is why ML has been used.
  • According to the forecast horizon, three categories of forecasting have been defined:
  1. Intraday: Includes very short-range weather forecasting (up to 12 hours description of weather parameters) and short-range weather forecasting (beyond 12 hours and up to 72 hours description of weather parameters).
  2. Day Ahead: Includes short-range weather forecasting (beyond 12 hours and up to 72 hours description of weather parameters) and medium-range weather forecasting (beyond 72 hours and up to 240 hours description of weather parameters).
  3. Long Term: Includes extended-range weather forecasting (beyond 10 days and up to 30 days description of the weather parameters, usually averaged and expressed as a departure from climate values for that period) and long-range forecasting.
  • The first phase used a single weather source for each forecasting capability (ECMWF for Day Ahead and GFS for Long Term).
  • To include all the available weather sources, ensemble modeling 
    has since been used.
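
The horizon definitions above can be written down as a small lookup, which makes the overlap between categories explicit (short-range forecasts appear in both Intraday and Day Ahead). This is a sketch: the function names are hypothetical, and treating everything beyond 30 days as long-range is an assumption, since the text gives no bounds for long-range forecasting.

```python
# Upper bounds in hours for each weather-forecast range, as given in the text.
WEATHER_RANGES = [
    ("very short-range", 12),   # up to 12 hours
    ("short-range", 72),        # beyond 12 and up to 72 hours
    ("medium-range", 240),      # beyond 72 and up to 240 hours
    ("extended-range", 720),    # beyond 10 days and up to 30 days
]

# Which weather ranges each forecasting category includes, as given in the text.
CATEGORY_RANGES = {
    "Intraday": {"very short-range", "short-range"},
    "Day Ahead": {"short-range", "medium-range"},
    "Long Term": {"extended-range", "long-range"},
}


def weather_range(lead_hours: float) -> str:
    """Map a forecast lead time in hours to its weather-forecast range."""
    if lead_hours < 0:
        raise ValueError("lead time must be non-negative")
    for name, upper_bound in WEATHER_RANGES:
        if lead_hours <= upper_bound:
            return name
    return "long-range"  # assumption: anything beyond 30 days


def categories_for(lead_hours: float) -> list:
    """Return the forecasting categories that cover this lead time."""
    name = weather_range(lead_hours)
    return sorted(cat for cat, ranges in CATEGORY_RANGES.items() if name in ranges)


for hours in (6, 48, 100, 24 * 20):
    print(hours, weather_range(hours), categories_for(hours))
```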

Model Development Life Cycle Phase

  1. A dedicated team develops the model.
  2. Required goals should be met --> a POC is in place.
  3. Model development approaches --> we need to derive model knowledge from state-of-the-art research.
  4. Development happens in the Dev environment, and testing is done in the QA environment.
  5. After testing, the code is pushed to the versioning and re-training pipelines that are already in place to maintain the quality of the forecast and models.
  6. For versioning: a dedicated code repo and CI/CD pipeline are already in place.
  7. We use the Databricks extension for Visual Studio to bring code into the Databricks environment.
  8. The Model Operationalisation (MO) team uses the code on Databricks to train the models.

Monitoring Life Cycle Phase

Currently, monitoring is done on two levels:

  • Failures of models:
  1. Alerts have been set up for failures, using AWS SNS.
  2. If a failure happens, CloudWatch is already enabled, and its logs can be consulted for troubleshooting.
  • Accuracy of models:
  1. Alerts have been set up for threshold breaches (based on nMAE). We are using AWS SNS to drop alerts on
  2. Different dashboards are there to monitor the accuracy; if nMAE goes beyond the threshold, re-training and redeployment can be initiated.
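
The two alert levels can be sketched as one function that publishes a message per triggered condition. The actual alerting code is not shown in this article, so the function name, message formats, and model identifier below are hypothetical. To keep the example runnable without AWS credentials, the publisher is passed in as a callable; the comment shows how it might be wired to SNS.

```python
from typing import Callable, List, Optional

NMAE_THRESHOLD = 13.0  # percent, taken from the text


def check_and_alert(model_id: str, run_failed: bool, nmae: Optional[float],
                    publish: Callable[[str, str], None],
                    threshold: float = NMAE_THRESHOLD) -> List[str]:
    """Publish one alert per triggered condition; return the subjects sent."""
    sent: List[str] = []
    if run_failed:
        subject = f"[FAILURE] {model_id}"
        publish(subject, f"Run failed for {model_id}; see the CloudWatch logs.")
        sent.append(subject)
    if nmae is not None and nmae > threshold:
        subject = f"[ACCURACY] {model_id}"
        publish(subject, f"nMAE {nmae:.1f}% exceeds the {threshold:.1f}% threshold.")
        sent.append(subject)
    return sent


# With boto3, `publish` could be wired roughly like this (the topic ARN is a
# placeholder, not a real value):
#   sns = boto3.client("sns")
#   publish = lambda subject, message: sns.publish(
#       TopicArn="<topic-arn>", Subject=subject, Message=message)

# Local stand-in publisher that just collects messages.
outbox = []
sent = check_and_alert("wind-site-1", run_failed=False, nmae=15.2,
                       publish=lambda subject, message: outbox.append((subject, message)))
print(sent, outbox)
```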

Extra Notes

  • Data Sources Types:
  1. Measurement data: turbine sensor data (live): Active Power, Available Power (FTP, API, customer Kafka streams -> AWS Kinesis streams)
  2. Weather data: S3 files (GRIB files)
  3. Data Collection & Data Processing
  4. Model Development / Training <=> Delta Lake
  5. The Data Engineering team ingests the data into Ingestion Tables (S3)
  6. Refined Tables: data cleaning, data transformation, deduplication, etc.
  7. Offline Feature Store Table: Fx Global Measurement Data Lake -> Model Development / Model Training
  • Scoring
  1. The Data Engineering team consumes the data from the Kinesis stream and stores 
    the features in the Online Feature Store, which supports scoring.


In summary, the machine learning model is developed and trained with its quality maintained through a process of re-training, testing, deployment on staging, and replacing the model running on PROD. Code is exported from Databricks back to the local repo, where the desired git action is performed, and the Model Operationalisation team oversees the process to keep the model well trained.
