Team MLOps Architecture
The Team's Architecture Mural:
Each numbered icon in the mural corresponds to one of the processes described below.
Components
The following explanations apply to the team’s MLOps architecture:
1. Model Development
Data scientists develop models locally.
2. Models pushed
All models pushed from the local machine are stored in MLflow's artefact store. The store itself is an S3 bucket that holds, for each model, a .pth file, evaluation metrics, hyperparameters (α, β, etc.) and a Uniform Resource Identifier (URI).
Data format: .pth
Data visibility: High
Frequency of data flow: On demand
3. Model commit
The chosen algorithm, its parameters, data, and model development code are finalised, checked into the main branch, and baselined.
Data format: .pth, .csv
4 - 6. Model Registry and Store
As models move through the MLflow store:
Data visibility: High
Frequency of data flow: On demand
4 - The best model is manually chosen for promotion and is initially tagged as None.
5 - Manual compliance checks are made, and the model is then tagged as Staging.
6 - The model is manually tagged as Production based on comparison / A/B testing.
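The manual promotion path in steps 4-6 can be sketched as a small transition check. The stage names (None, Staging, Production) are MLflow's built-in model-registry stages; the helper itself is a hypothetical illustration, not the team's code:

```python
# Sketch of the manual promotion path in steps 4-6 (hypothetical helper;
# the stage names match MLflow's built-in model-registry stages).
ALLOWED_TRANSITIONS = {
    "None": {"Staging"},         # step 5: after manual compliance checks
    "Staging": {"Production"},   # step 6: after comparison / A/B testing
    "Production": {"Archived"},  # retiring a superseded model
}

def can_promote(current_stage: str, target_stage: str) -> bool:
    """Return True if the manual promotion is one the flow above allows."""
    return target_stage in ALLOWED_TRANSITIONS.get(current_stage, set())
```

In MLflow itself, each promotion would be applied with `MlflowClient.transition_model_version_stage`.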
7. Manual push to model server
The model is manually pushed to the model server to serve in production.
Frequency of data flow: On demand
8. Prediction service
Prediction Service is ready to serve users.
Data format: URI
Data visibility: Low
Frequency of data flow: Real-time
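Assuming the URI in step 8 follows MLflow's registry scheme (`models:/<name>/<stage>`), the prediction service's model reference can be sketched as below; the model name is a hypothetical example:

```python
def production_model_uri(model_name: str) -> str:
    """Build an MLflow registry URI for the Production-stage version of a
    registered model (models:/<name>/<stage> is MLflow's URI scheme)."""
    return f"models:/{model_name}/Production"

# e.g. mlflow.pytorch.load_model(production_model_uri("demand-forecaster"))
```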
9. User input and predictions output
As users interact with the prediction service, predictions for their input are returned to them.
Data format: JSON
Data visibility: Med
Frequency of data flow: Real-time
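A minimal sketch of the JSON exchange in step 9; the field names are illustrative, since the actual schema is not documented here:

```python
import json

def handle_request(raw_body: str, model) -> str:
    """Parse a JSON prediction request and return a JSON response that
    echoes the input alongside the model's prediction (field names are
    hypothetical, not the team's actual schema)."""
    payload = json.loads(raw_body)
    features = payload["features"]
    return json.dumps({"features": features, "prediction": model(features)})
```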
10. User interaction data captured
User interaction data is captured and stored in a logs database in S3.
11. Model monitoring for decay
Model monitoring setup checks model and data for drift or decay.
12. Scheduler initiates the training process
13. DAG launches a Docker instance in which training runs
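The scheduled trigger in steps 12-13 can be sketched as a simple due-check. This is a stand-in for the actual scheduler (e.g. an Airflow DAG schedule), and the interval is an assumed value, not the team's real cadence:

```python
from datetime import datetime, timedelta

# Assumed cadence for illustration only, not the team's actual setting.
RETRAIN_INTERVAL = timedelta(days=7)

def training_due(last_run: datetime, now: datetime,
                 manual_retrigger: bool = False) -> bool:
    """The training DAG runs either on schedule or when a data scientist
    manually retriggers retraining (step 21)."""
    return manual_retrigger or (now - last_run) >= RETRAIN_INTERVAL
```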
14. Data retrieved from S3 bucket
The data for training, testing and evaluation is retrieved from an Amazon S3 bucket.
Data format: .csv
Data visibility: Low
Frequency of data flow: Scheduled, or on demand when the training process is manually retriggered
15. Data is preprocessed and transformed
Data format: .csv
Data visibility: Low
Frequency of data flow: Scheduled, or on demand when the training process is manually retriggered
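A minimal sketch of the kind of preprocessing step 15 performs, using the standard csv module; the column names and transformations are hypothetical:

```python
import csv
import io

def preprocess(raw_csv: str) -> list:
    """Read raw CSV rows, drop incomplete records, and cast numeric
    fields (the 'amount' column is illustrative only)."""
    rows = []
    for row in csv.DictReader(io.StringIO(raw_csv)):
        if not all(row.values()):
            continue  # drop rows with missing fields
        row["amount"] = float(row["amount"])
        rows.append(row)
    return rows
```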
16. Features stored
The output of preprocessing is stored in the feature store.
Data format: .csv.dvc
Data visibility: Low
Frequency of data flow: Scheduled, or on demand when the training process is manually retriggered
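The .csv.dvc files in the feature store are DVC pointer files: small text files recording a hash of the actual .csv, which lives in remote storage. A sketch of the core idea only; the real pointer format is YAML written by `dvc add`, not this helper:

```python
import hashlib

def pointer_for(csv_bytes: bytes) -> str:
    """Mimic the core idea of a .dvc pointer: version a content hash
    instead of the data itself (DVC uses MD5 hashes by default)."""
    return hashlib.md5(csv_bytes).hexdigest()
```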
17. Feature store fetch
Data from the feature store is fetched for training and evaluation.
Data format: .csv.dvc
Data visibility: High
Frequency of data flow: Scheduled, or on demand when the training process is manually retriggered
18. Model created
The model is created and trained, then evaluated against test data using the chosen metrics.
Data format: .csv
Data visibility: Low
Frequency of data flow: Based on scheduler
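Step 18's evaluation against test data can be sketched with a simple accuracy metric; the metrics the team actually logs are not specified here:

```python
def accuracy(predictions: list, labels: list) -> float:
    """Fraction of test-set predictions that match their labels
    (one illustrative evaluation metric among many)."""
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)
```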
19. Model version storing
Each model version is stored in the database within the model store
Data format: .pth
Data visibility: High
Frequency of data flow: Based on scheduler
20. Model monitor pull
The model monitor periodically pulls new data from S3 and inspects it for signs of drift or decay.
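One common drift check a monitor like the one in steps 11 and 20 might run is the Population Stability Index (PSI) between training-time and live feature distributions. A sketch under that assumption; the team's actual drift test is not documented here:

```python
import math

def psi(expected: list, actual: list) -> float:
    """Population Stability Index between two binned distributions
    (each list holds bin proportions summing to 1). Values above
    roughly 0.2 are commonly read as significant drift."""
    eps = 1e-6  # guard against empty bins
    return sum(
        (a - e) * math.log((a + eps) / (e + eps))
        for e, a in zip(expected, actual)
    )
```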
21. Potential retrigger
Upon identifying model decay, Data Scientists manually retrigger the retraining process.
22. Industry Data
Simulated industry data is supplied to the pipeline on demand.