SRE

This page and its sub-pages contain my notes from studying the two Google SRE books:

Site Reliability Engineering: How Google Runs Production Systems: https://sre.google/sre-book/table-of-contents/

The Site Reliability Workbook: Practical Ways to Implement SRE: https://sre.google/workbook/table-of-contents/

Pre-DevOps

Operations challenges: Running production systems is complex and context-dependent, yet enterprises often treat operations as a cost center.

DevOps

  • Principles: CALMS (Culture, Automation, Lean, Measurement, Sharing)

  • Focus: Collaboration, continuous improvement, no silos.

  • Key Ideas:

    • Accidents are normal and expected.

    • Gradual, small changes preferred.

    • Culture matters more than tooling: a good culture can work around poor tooling, but rarely the reverse.

    • Measurement crucial for improvement.

SRE

  • Definition: A concrete implementation of the DevOps philosophy, with prescriptive practices for running reliable services.

  • Principles:

    • Operations is a software problem.

    • Manage by Service Level Objectives (SLOs).

    • Minimize toil; automate where possible.

    • Wisdom of production informs design.

    • Reduce cost of failure to enhance development speed.

    • Share ownership with developers.

    • Unified tooling across roles.
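
To make "manage by SLOs" concrete, here is a minimal sketch of checking a request-based SLO. It assumes the service level indicator (SLI) is the ratio of good events to total events; the names `SloReport` and `evaluate_slo` and the 99.9% target are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SloReport:
    sli: float      # measured ratio of good events to total events
    target: float   # SLO target, e.g. 0.999 for "99.9% of requests succeed"
    met: bool       # True when the measured SLI is at or above the target


def evaluate_slo(good_events: int, total_events: int, target: float) -> SloReport:
    """Compare a measured SLI against an SLO target (illustrative sketch)."""
    if total_events <= 0:
        raise ValueError("total_events must be positive")
    if not 0 <= good_events <= total_events:
        raise ValueError("good_events must be between 0 and total_events")
    if not 0.0 < target <= 1.0:
        raise ValueError("target must be in the range (0, 1]")
    sli = good_events / total_events
    return SloReport(sli=sli, target=target, met=sli >= target)


# Example: 99,950 successful requests out of 100,000 against a 99.9% target.
report = evaluate_slo(good_events=99_950, total_events=100_000, target=0.999)
```

The point of the sketch is that the decision ("are we reliable enough?") comes from a measured number compared against an agreed target, not from opinion.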

Comparison

  • Similarities:

    • Acceptance of change.

    • Collaboration and shared ownership.

    • Small, continuous changes.

    • Importance of measurement and blameless postmortems.

    • Holistic approach to improvement.

  • Differences:

    • DevOps: Broader and culture-focused; says little about how to manage a service in detail.

    • SRE: Service-specific, structured around detailed principles like SLOs and error budgets.
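
An error budget is the unreliability an SLO permits: 1 minus the SLO target. The sketch below turns that into minutes of allowed downtime, assuming a time-based availability SLO measured over a 30-day window; the function names and the window length are hypothetical choices for illustration.

```python
def error_budget_minutes(slo_target: float, window_days: int = 30) -> float:
    """Minutes of downtime the SLO allows over the window (illustrative sketch)."""
    if not 0.0 < slo_target < 1.0:
        raise ValueError("slo_target must be strictly between 0 and 1")
    if window_days <= 0:
        raise ValueError("window_days must be positive")
    window_minutes = window_days * 24 * 60
    return (1.0 - slo_target) * window_minutes


def budget_remaining(slo_target: float, downtime_minutes: float,
                     window_days: int = 30) -> float:
    """Fraction of the error budget still unspent; negative means overspent."""
    budget = error_budget_minutes(slo_target, window_days)
    return 1.0 - downtime_minutes / budget


# A 99.9% SLO over 30 days allows about 43.2 minutes of downtime.
allowed = error_budget_minutes(0.999)
# After 21.6 minutes of downtime, roughly half of the budget is left.
left = budget_remaining(0.999, downtime_minutes=21.6)
```

A team might use a value like `left` to decide whether to keep shipping features or to prioritize reliability work once the budget runs low.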
