Course Outline
Introduction
- How SRE marries traditional IT and software development.
- The need for automation and observability
- The role of software engineers vs. system administrators.
- Site Reliability Engineers vs DevOps engineers.
Overview of an IT System
- System architecture, on-premise and in the cloud.
Overview of SRE Principles and Practices
- Infrastructure as Code.
- The role of containerization and orchestration (Docker, Kubernetes, etc.)
- Continuous Integration, Continuous Deployment and Continuous Delivery.
- Observability.
Evaluating an IT System
- Taking stock of the team and organizational resources.
- Mapping out the systems and processes.
- Estimating the potential impact of SRE.
- The role of the software engineering team.
- The role of the operations team.
- The role of management.
Maintaining the Reliability of a System
- Describing and measuring the desired reliability of a service.
- Understanding Service Level Objectives (SLOs)
- Understanding Service Level Indicators (SLIs) and Service Level Agreements (SLAs).
- Working with Error Budgets.
- Developing an SLO.
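
To make the relationship between SLIs, SLOs, and error budgets concrete, here is a minimal Python sketch for a request-based availability SLO. It is illustrative only and not taken from the course material: the function name `error_budget`, its parameters, and the sample figures are assumptions chosen for the example.

```python
def error_budget(slo_target: float, total_requests: int, failed_requests: int) -> dict:
    """Summarize a request-based availability SLO for one measurement window.

    slo_target is a fraction such as 0.999 (hypothetical "three nines" target).
    """
    if not 0 < slo_target < 1:
        raise ValueError("slo_target must be a fraction between 0 and 1")
    if total_requests <= 0:
        raise ValueError("total_requests must be positive")
    if not 0 <= failed_requests <= total_requests:
        raise ValueError("failed_requests must be between 0 and total_requests")

    # SLI: the measured fraction of successful requests.
    sli = (total_requests - failed_requests) / total_requests

    # Error budget: how many failures the SLO tolerates in this window.
    allowed_failures = (1 - slo_target) * total_requests

    return {
        "sli": sli,
        "slo_met": sli >= slo_target,
        "allowed_failures": allowed_failures,
        # A negative value means the budget has been overspent.
        "remaining_failures": allowed_failures - failed_requests,
    }


if __name__ == "__main__":
    # Hypothetical window: one million requests, 400 of them failed.
    report = error_budget(slo_target=0.999, total_requests=1_000_000, failed_requests=400)
    for key, value in report.items():
        print(f"{key}: {value}")
```

In this sketch the error budget is simply the complement of the SLO applied to the request volume. A team could treat the remaining budget as a signal for how much release risk it can still take on in the current window.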
Optimizing System Administration
- Setting up a development environment
- Evaluating SRE tools
- Prioritizing tasks for automation.
- Writing software.
Deploying "Infrastructure as Code"
- Testing and iterating code
- Making a system anti-fragile
- Learning from failure
Monitoring a System
- Observing system performance.
- SRE tools and techniques.
The Future of SRE
Summary and Conclusion
Requirements
- A general understanding of IT infrastructure.
- A general idea of the software development process.
- Programming or scripting experience in any language.
Audience
- Developers
- System administrators
- Software Architects
- DevOps engineers
- IT Managers
Testimonials (7)
How detailed subjects are explained with real world examples
Brian Hlabane - African Bank
Course - Site Reliability Engineering (SRE) Fundamentals
Full coverage of the material scope, real-life examples, clear explanations, and suitable for individuals with minimal experience
Monika - Capgemini
Course - Site Reliability Engineering (SRE) Fundamentals
Ways to connect theory with practice. A theoretical introduction followed by tasks is the ideal combination :-)
Mariusz Zawadzki - Capgemini
Course - Site Reliability Engineering (SRE) Fundamentals
Interesting practical exercises (simulation improved by the group throughout the entire course), real-life examples, inserts/digressions
Krzysztof - Capgemini
Course - Site Reliability Engineering (SRE) Fundamentals
Group Tasks
Krzysztof - Capgemini
Course - Site Reliability Engineering (SRE) Fundamentals
She is an expert in the area and provides really nice training. The material and training were really a mix of examples, discussion and
Peter Tutka - Deutsche Telekom IT & Telecommunications Slovakia s.r.o.
Course - Site Reliability Engineering (SRE) Fundamentals
A view of SRE/DevOps from a more business/theoretical point of view. Most helpful for people who already have the practical view.