Blue-Green Deployment
Two identical environments with instant traffic switch for zero-downtime deployments.
Blue-Green deployment maintains two identical production environments. At any time, only one (say Blue) serves live traffic. You deploy the new version to the idle environment (Green), verify it, then switch the load balancer to route all traffic to Green instantly.
How It Works
The key is having two complete, independent environments. A load balancer or DNS sits in front and directs traffic to the active environment. Deployment happens in the inactive environment, so it has no impact on users. Once the new version is validated, the switch is typically a single load balancer rule change and takes effect almost immediately; a DNS-based switch is slower because clients cache records until the TTL expires.
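To make the switching model concrete, here is a minimal in-process sketch in Python. The `Router` class, the environment names, and the two handler functions are illustrative assumptions, not a real load balancer API; in practice the same idea is expressed as a load balancer rule or an Ingress backend.

```python
from typing import Callable, Dict

Handler = Callable[[str], str]


class Router:
    """Toy stand-in for a load balancer that fronts two environments."""

    def __init__(self, environments: Dict[str, Handler], active: str) -> None:
        if active not in environments:
            raise ValueError(f"unknown environment: {active}")
        self._environments = environments
        self._active = active
        self._previous = active

    @property
    def active(self) -> str:
        return self._active

    def handle(self, request: str) -> str:
        # Every request goes to whichever environment is currently active.
        return self._environments[self._active](request)

    def switch_to(self, name: str) -> None:
        # The cutover is a single pointer change; nothing is redeployed.
        if name not in self._environments:
            raise ValueError(f"unknown environment: {name}")
        self._previous, self._active = self._active, name

    def rollback(self) -> None:
        # Rolling back is the same operation in the other direction.
        self.switch_to(self._previous)


def blue(request: str) -> str:
    return f"v1 handled {request}"


def green(request: str) -> str:
    return f"v2 handled {request}"


router = Router({"blue": blue, "green": green}, active="blue")
```

Calling `router.switch_to("green")` sends all later requests to the new version, and `router.rollback()` restores the previous one. The idle environment stays deployed throughout, which is what makes rollback cheap.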
When to Use
- Zero-downtime deployments are mandatory
- Instant rollback is a regulatory or business requirement
- You can afford 2x infrastructure cost during deployment
- Database schema is backward-compatible across versions
When NOT to Use
- Database schema changes that break backward compatibility
- Budget constraints prevent running two full environments
- Very frequent deployments (multiple per hour)
- Stateful apps with in-memory sessions that cannot be shared
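The schema criteria in both lists come down to whether the old and new versions can share one database while either might be live. The following is a minimal sketch of an additive migration using Python's built-in `sqlite3` module. The `users` table, its columns, and the two `read_name_*` functions are hypothetical stand-ins for Blue and Green application code.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO users (name) VALUES ('Ada Lovelace')")


def read_name_blue(db: sqlite3.Connection, user_id: int) -> str:
    """Blue (old) code: only knows about the original `name` column."""
    row = db.execute("SELECT name FROM users WHERE id = ?", (user_id,)).fetchone()
    return row[0]


def read_name_green(db: sqlite3.Connection, user_id: int) -> str:
    """Green (new) code: prefers `display_name`, falls back to `name`."""
    row = db.execute(
        "SELECT COALESCE(display_name, name) FROM users WHERE id = ?", (user_id,)
    ).fetchone()
    return row[0]


# Expand: add a nullable column. Existing Blue queries are unaffected.
conn.execute("ALTER TABLE users ADD COLUMN display_name TEXT")
conn.execute("UPDATE users SET display_name = 'Ada' WHERE id = 1")

blue_result = read_name_blue(conn, 1)
green_result = read_name_green(conn, 1)

# Contract (dropping `name`) would only happen in a later release,
# once Blue is retired and rollback to it is no longer needed.
```

Because the change only adds a column, both versions keep working against the same database, so traffic can move in either direction. A change that renamed or dropped `name` in the same release would break Blue and make rollback unsafe.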
Real-World Examples
Capital One - Banking API
Capital One uses Blue-Green for their customer-facing banking API. Regulatory requirements demand instant rollback capability. Each deployment is validated in the Green environment with synthetic transactions before switching. Rollback has been exercised in under 30 seconds.
Transport for London - Fare Engine
TfL updates the fare calculation engine using Blue-Green to ensure zero disruption during peak commuting hours. The Green environment is validated with millions of fare calculations before the traffic switch, because inconsistent fares would damage public trust.
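Both examples validate Green with synthetic traffic before switching. The sketch below shows one way such a gate could look in Python; it is not how either organization implements it. The `Check` type, `run_checks`, `should_switch`, and the sample checks are hypothetical, and real checks would call the Green environment over the network.

```python
from typing import Callable, List, Tuple

# A check is a name plus a zero-argument callable that returns True on success.
Check = Tuple[str, Callable[[], bool]]


def run_checks(checks: List[Check]) -> List[str]:
    """Run every check and return the names of the ones that failed."""
    failures = []
    for name, check in checks:
        try:
            ok = check()
        except Exception:
            # A check that raises counts as a failure, not a crash of the gate.
            ok = False
        if not ok:
            failures.append(name)
    return failures


def should_switch(checks: List[Check]) -> bool:
    """Allow the traffic switch only if there are checks and all of them pass."""
    return bool(checks) and not run_checks(checks)


def broken_check() -> bool:
    # Simulates a synthetic transaction that hits an error in Green.
    raise RuntimeError("green returned HTTP 500")


passing = [("login", lambda: True), ("transfer", lambda: True)]
failing = [("login", lambda: True), ("transfer", broken_check)]
```

Here `should_switch(passing)` allows the cutover and `should_switch(failing)` blocks it. An empty check list is treated as a failure so that a misconfigured pipeline cannot switch traffic without validating anything.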
Step-by-Step Implementation
1. Define two Services for Blue and Green
apiVersion: v1
kind: Service
metadata:
  name: app-blue
spec:
  selector:
    app: myapp
    version: blue
  ports:
    - port: 80
      targetPort: 8080
---
apiVersion: v1
kind: Service
metadata:
  name: app-green
spec:
  selector:
    app: myapp
    version: green
  ports:
    - port: 80
      targetPort: 8080
2. Deploy new version to Green
apiVersion: apps/v1
kind: Deployment
metadata:
  name: myapp-green
spec:
  replicas: 3
  selector:
    matchLabels:
      app: myapp
      version: green
  template:
    metadata:
      labels:
        app: myapp
        version: green
    spec:
      containers:
        - name: myapp
          image: myregistry/myapp:2.0.0
          ports:
            - containerPort: 8080
          readinessProbe:
            httpGet:
              path: /healthz
              port: 8080
            initialDelaySeconds: 5
3. Switch traffic via Ingress
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: myapp-ingress
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: app-green  # Switch from app-blue to app-green
                port:
                  number: 80
4. Verify and roll back if needed
# Verify Green is healthy
kubectl get pods -l version=green
# If issues arise, switch back to Blue
kubectl patch ingress myapp-ingress --type='json' \
  -p='[{"op": "replace", "path": "/spec/rules/0/http/paths/0/backend/service/name", "value": "app-blue"}]'
Common Pitfalls
| Pitfall | Symptom | Fix |
|---|---|---|
| Database schema divergence | Rollback fails because Green schema is incompatible with Blue code | Use expand-and-contract migrations; keep schemas backward-compatible |
| Session state loss | Users logged out after switch | Use external session store (Redis) shared between environments |
| DNS propagation delay | Some users still hit the old environment after the switch | Switch at the load balancer instead of DNS, or set a low TTL |
| Forgetting to warm up Green | High latency immediately after switch | Run load tests against Green before switching traffic |
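The session-state row in the table can be illustrated with a small Python sketch. A plain dictionary stands in for the external store (the table suggests Redis), and the `App` class and its methods are hypothetical. The point is only the difference between per-environment and shared session storage.

```python
import uuid
from typing import Dict, Optional


class App:
    """One environment's app server, with sessions kept in a store it is given."""

    def __init__(self, version: str, sessions: Dict[str, str]) -> None:
        self.version = version
        self.sessions = sessions

    def login(self, user: str) -> str:
        token = uuid.uuid4().hex
        self.sessions[token] = user
        return token

    def whoami(self, token: str) -> Optional[str]:
        return self.sessions.get(token)


# In-memory sessions: each environment has its own private dict.
blue_local = App("blue", sessions={})
green_local = App("green", sessions={})
local_token = blue_local.login("alice")

# Shared store: both environments receive the same dict, standing in for
# an external store that both can reach.
shared_store: Dict[str, str] = {}
blue_shared = App("blue", sessions=shared_store)
green_shared = App("green", sessions=shared_store)
shared_token = blue_shared.login("alice")
```

With private stores, `green_local.whoami(local_token)` returns `None`, which is the "logged out after switch" symptom. With the shared store, `green_shared.whoami(shared_token)` still returns `"alice"` after traffic moves to Green.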