⚡CI/CD for AI — Auto Retrain, Auto Test, Auto Reject, Auto Promote (EP5)
MLOps Series — How Real Companies Protect Production Models
🎯 Why This Episode Matters
In software engineering, CI/CD catches bugs instantly.
In machine learning, a model can get worse after retraining — and everything will still look “healthy.”
No crashes.
No errors.
No logs.
Just silently deteriorating predictions.
This is one of the biggest risks in production ML systems.
Episode 5 introduces the solution companies actually use:
CI/CD for AI.
A pipeline that:
retrains models automatically
evaluates them against the current production model
rejects bad models
promotes only the better model
updates production safely
This is the layer that keeps ML from breaking silently.
📌 What We Build in Episode 5
Our folder now looks like this:
artifacts_prod/              # Production model
    pipeline.pkl
artifacts_candidate/         # New model for testing
    pipeline.pkl
model/
    train_good_model.py
    train_bad_model.py
    evaluate_pipeline.py
    promote.py
api_v2.py
.github/workflows/
    mlops-cicd.yaml
This setup mirrors how real companies govern ML systems.
We will:
train a GOOD production model
train a BAD candidate model
compare both on the same dataset
reject the candidate
promote only the better one
automate everything with GitHub Actions
By the end of the episode, you’ll have a fully working model governance + CI/CD pipeline.
🟢 Training the GOOD Production Model
We start by training a reliable, high-performing baseline model and saving it into:
artifacts_prod/pipeline.pkl
This represents the “trusted” model currently in production.
This is the version our API is serving — and the version all future candidates must beat.
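Here is a minimal sketch of what train_good_model.py could look like, assuming a scikit-learn text-classification pipeline and a placeholder data/train.csv with text and label columns (the real dataset and model live in the episode code):

```python
# train_good_model.py -- illustrative sketch, assuming a scikit-learn
# text-classification pipeline; dataset path and columns are placeholders.
import joblib
import pandas as pd
from pathlib import Path
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

df = pd.read_csv("data/train.csv")            # hypothetical training data
X, y = df["text"], df["label"]

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer()),             # full vocabulary, clean preprocessing
    ("clf", LogisticRegression(max_iter=1000)),
])
pipeline.fit(X, y)

Path("artifacts_prod").mkdir(exist_ok=True)
joblib.dump(pipeline, "artifacts_prod/pipeline.pkl")
```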
🔴 Training the BAD Candidate Model
Next, we intentionally create a bad model.
We simulate what actually happens in real teams:
bad training data
noisy labels
truncated vocabulary
wrong preprocessing
reduced dataset
The model looks normal, trains without errors, and outputs predictions…
…but its accuracy is MUCH worse.
This is exactly how bad models slip through without proper safeguards.
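For illustration, a deliberately degraded train_bad_model.py might look like this sketch: a reduced dataset, flipped labels, and a truncated vocabulary (the exact sabotage used in the episode may differ):

```python
# train_bad_model.py -- illustrative sketch of a deliberately degraded candidate,
# assuming the same scikit-learn setup as the production model.
import joblib
import numpy as np
import pandas as pd
from pathlib import Path
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline

df = pd.read_csv("data/train.csv").sample(frac=0.2, random_state=42)  # reduced dataset

# Noisy labels: randomly flip ~30% of them (assumes binary 0/1 labels)
rng = np.random.default_rng(42)
flip = rng.random(len(df)) < 0.3
df.loc[flip, "label"] = 1 - df.loc[flip, "label"]

pipeline = Pipeline([
    ("tfidf", TfidfVectorizer(max_features=50)),   # truncated vocabulary
    ("clf", LogisticRegression(max_iter=1000)),
])
pipeline.fit(df["text"], df["label"])

Path("artifacts_candidate").mkdir(exist_ok=True)
joblib.dump(pipeline, "artifacts_candidate/pipeline.pkl")
```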
⚖️ Evaluating: Good vs Bad
Our evaluation script loads both models, tests them against the same dataset, and prints:
Production model accuracy: 0.xx
Candidate model accuracy: 0.xx
Reject new model
If the candidate model is worse, we reject it.
If it’s better, we promote it.
This is the core logic behind CI/CD for ML.
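A sketch of that comparison logic in evaluate_pipeline.py, assuming both pipelines are scored with scikit-learn's accuracy_score on a hypothetical held-out data/test.csv. The non-zero exit code is what lets CI auto-reject:

```python
# evaluate_pipeline.py -- illustrative sketch: compare candidate vs production
# on the same held-out dataset and fail the pipeline if the candidate is worse.
import sys
import joblib
import pandas as pd
from sklearn.metrics import accuracy_score

test = pd.read_csv("data/test.csv")           # hypothetical evaluation set
X, y = test["text"], test["label"]

prod = joblib.load("artifacts_prod/pipeline.pkl")
cand = joblib.load("artifacts_candidate/pipeline.pkl")

prod_acc = accuracy_score(y, prod.predict(X))
cand_acc = accuracy_score(y, cand.predict(X))
print(f"Production model accuracy: {prod_acc:.2f}")
print(f"Candidate model accuracy:  {cand_acc:.2f}")

if cand_acc <= prod_acc:
    print("Reject new model")
    sys.exit(1)        # non-zero exit fails the CI job -> auto-reject
print("Promote new model")
```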
🟢 Promotion Logic — Only When Better
If the candidate passes the quality gate, a simple script replaces the production pipeline:
artifacts_candidate/pipeline.pkl
→ artifacts_prod/pipeline.pkl
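The promotion step can be as small as this sketch of promote.py (a straight file copy; the episode's script may add logging or backups):

```python
# promote.py -- illustrative sketch: replace the production pipeline with the candidate.
import shutil
from pathlib import Path

Path("artifacts_prod").mkdir(exist_ok=True)
shutil.copyfile("artifacts_candidate/pipeline.pkl", "artifacts_prod/pipeline.pkl")
print("Candidate promoted to production.")
```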
This ensures our FastAPI service will automatically serve the updated model without any code changes.
Production always loads:
artifacts_prod/pipeline.pkl
Stable. Safe. Predictable.
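And a minimal sketch of how api_v2.py can serve whatever sits in artifacts_prod/ (the endpoint name and request schema here are illustrative, not the episode's exact API):

```python
# api_v2.py -- illustrative sketch of a FastAPI service that always serves
# the currently promoted pipeline from artifacts_prod/.
import joblib
from fastapi import FastAPI
from pydantic import BaseModel

app = FastAPI()
pipeline = joblib.load("artifacts_prod/pipeline.pkl")   # always the promoted model

class PredictRequest(BaseModel):
    text: str

@app.post("/predict")
def predict(req: PredictRequest):
    # .tolist() converts numpy types into JSON-serializable Python values
    return {"prediction": pipeline.predict([req.text]).tolist()[0]}
```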
🚀 GitHub Actions CI/CD for MLOps
Episode 5 introduces a fully automated CI/CD workflow:
installs Python
installs dependencies
trains candidate model
evaluates candidate vs production
auto-rejects bad candidates
auto-promotes better ones
commits the updated production model
Every push to main triggers the full loop.
This is EXACTLY how real ML teams prevent silent failures.
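A trimmed-down sketch of what mlops-cicd.yaml can look like. Step names, script paths, and the requirements file are assumptions here, not the episode's exact workflow:

```yaml
# .github/workflows/mlops-cicd.yaml -- illustrative sketch of the train -> evaluate
# -> promote loop; adapt paths and dependencies to your repo.
name: mlops-cicd
on:
  push:
    branches: [main]

permissions:
  contents: write   # needed to commit the promoted model back to the repo

jobs:
  train-evaluate-promote:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-python@v5
        with:
          python-version: "3.11"
      - run: pip install -r requirements.txt
      - run: python model/train_bad_model.py      # train the candidate under test
      - run: python model/evaluate_pipeline.py    # non-zero exit = auto-reject, stops here
      - run: python model/promote.py              # only runs if evaluation passed
      - name: Commit promoted model
        run: |
          git config user.name "github-actions"
          git config user.email "github-actions@users.noreply.github.com"
          git add artifacts_prod/pipeline.pkl
          git commit -m "Promote candidate model" || echo "No changes to commit"
          git push
```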
🧠 What EP5 Teaches You
The key idea:
In ML, new is not always better.
A retrained model can hurt performance, even if code and infrastructure look perfect.
Episode 5 teaches:
model governance
quality gates
automated evaluation
regression control
safe promotion
how CI/CD works for ML
how modern companies avoid silent model degradation
This is one of the most important real-world MLOps skills.
🚀 Coming Up in Episode 6
Episode 6 brings the next critical piece:
monitoring
drift detection
Prometheus / Grafana
alerts
real-time tracking of inputs + outputs
CI/CD protects deployments.
Monitoring protects everything after deployment.
🔗 Full Video + Code Access
🎥 Watch Episode 5: https://youtu.be/DlNzxFMXLic
📬 Code + Labs + Exercises:
https://learnwithdevopsengineer.beehiiv.com/subscribe
Subscribers get:
complete CI/CD pipeline
production-ready workflow
real incident labs
drift simulation scripts
MLflow examples
interview prep
all episode code bundles
💼 Need DevOps or MLOps Help?
If you’re building:
CI/CD pipelines
Docker + Jenkins
MLflow setups
FastAPI deployments
model governance workflows
monitoring + alerting
Kubernetes
cloud cost optimization
You can consult me directly.
Reply to this email or message me on YouTube/Instagram.
— Arbaz
📺 YouTube: Learn with DevOps Engineer
📬 Newsletter: learnwithdevopsengineer.beehiiv.com/subscribe
📸 Instagram: instagram.com/learnwithdevopsengineer