Site Reliability Engineering (SRE)
1. Introduction to Site Reliability Engineering
2. Core Principles of SRE
3. Service Level Management
4. Observability and Monitoring
5. Incident Management and On-Call
6. Toil Management and Automation
7. Change and Release Management
8. System Design for Reliability
9. SRE Organization and Culture
10. Advanced SRE Practices
Change and Release Management
Safe Change Management Principles
Change Approval Processes
Risk Assessment for Changes
Change Coordination
Change Freeze Periods
Progressive Delivery Techniques
Gradual Rollouts
Feature Flags
Canary Releases
Blue-Green Deployments
A/B Testing for Reliability
Continuous Integration and Delivery
CI/CD Pipeline Design
Automated Testing Gates
Deployment Automation
Pipeline Security
Artifact Management
Rollback and Recovery
Designing for Quick Reversals
Rollback Procedures
Monitoring for Rollback Triggers
Post-Rollback Analysis
Forward Fix vs Rollback Decisions