# Service Level Terminology

Effective service management requires understanding which behaviors matter and how to measure them.

Service Level Indicators (SLIs), Service Level Objectives (SLOs), and Service Level Agreements (SLAs) help define and deliver the desired level of service.

Metrics should guide appropriate actions when issues arise, ensuring the service remains healthy.

***

## **Service Level Terminology**

* **SLI (Service Level Indicator):** A quantitative measure of some aspect of service performance (e.g., latency, error rate, throughput, availability).
* **SLO (Service Level Objective):** A target value or range for an SLI, specifying the desired level of service performance.
* **SLA (Service Level Agreement):** A contract with users that spells out the consequences if SLOs aren’t met (e.g., financial penalties).

***

## **Service Level Indicators**

* SLIs are specific metrics that indicate service health (e.g., request latency, error rate).
* Availability, the fraction of time or requests for which the service is usable, is a common SLI and is often quoted in "nines" (e.g., 99.9% availability is "three nines").
* Some SLIs may only be proxies for actual user experience (e.g., server-side latency vs. client-side latency).
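
As a rough illustration of the availability SLI above, the sketch below computes availability as the ratio of successful to total requests and converts a target into the downtime it permits. The function names, the request counts, and the 30-day window are assumptions chosen for the example, not part of any particular monitoring tool.

```python
def availability(successful_requests: int, total_requests: int) -> float:
    """Return the fraction of requests served successfully (0.0 to 1.0)."""
    if total_requests == 0:
        # Assumption: with no traffic, report full availability
        # instead of dividing by zero.
        return 1.0
    return successful_requests / total_requests


def allowed_downtime_minutes(target: float, window_days: int) -> float:
    """Minutes of downtime that an availability target permits in a window."""
    if not 0.0 <= target <= 1.0:
        raise ValueError("target must be a fraction between 0 and 1")
    return (1.0 - target) * window_days * 24 * 60


# Hypothetical counts: 999,500 successes out of 1,000,000 requests.
sli = availability(999_500, 1_000_000)
print(f"availability: {sli:.2%}")  # availability: 99.95%

# "Three nines" over an assumed 30-day window.
print(f"{allowed_downtime_minutes(0.999, 30):.1f} minutes")  # 43.2 minutes
```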

***

## **Service Level Objectives**

* SLOs set expectations for service performance; when users know what level of service to expect, unfounded complaints become less likely.
* Example: a latency SLO might require that average request latency stay below 100 ms.
* Choosing SLOs is complex and should reflect both user expectations and system capabilities.
* Higher load often increases latency, so SLOs should account for this relationship.
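
The latency example above can be sketched as a simple check of measured latencies against a threshold. The `LatencySLO` class and the sample values are hypothetical and exist only to illustrate the idea; the two sample sets loosely mimic a quiet period and a period of higher load.

```python
from dataclasses import dataclass
from statistics import mean


@dataclass(frozen=True)
class LatencySLO:
    """Hypothetical SLO: average latency must stay below a threshold."""
    name: str
    threshold_ms: float

    def is_met(self, latencies_ms: list[float]) -> bool:
        if not latencies_ms:
            raise ValueError("need at least one latency sample")
        return mean(latencies_ms) < self.threshold_ms


slo = LatencySLO(name="average request latency", threshold_ms=100.0)

# Made-up samples: latency tends to rise as load increases.
quiet_period = [40.0, 55.0, 62.0, 48.0]
peak_period = [90.0, 130.0, 155.0, 120.0]

print(slo.is_met(quiet_period))  # True  (mean is 51.25 ms)
print(slo.is_met(peak_period))   # False (mean is 123.75 ms)
```

Evaluating the same objective separately for quiet and peak periods is one way to see how load affects whether the target is realistic.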

***

## **Service Level Agreements**

* SLAs are formal agreements between the service provider and users, typically involving penalties for unmet SLOs.
* SLAs are driven mainly by business and product decisions; SREs focus on meeting SLOs, which in turn avoids triggering SLA penalties.

***

## **Indicators in Practice**

Focus on a handful of meaningful SLIs that matter to users, such as:

* **User-facing systems:** Availability, latency, throughput.
* **Storage systems:** Latency, availability, durability.
* **Big data systems:** Throughput, end-to-end latency.
* **All systems:** Correctness (accuracy of returned data).

***

## **Collecting and Aggregating Indicators**

* Metrics can be collected server-side or client-side, depending on the aspect of user experience being measured.
* Aggregating metrics (e.g., average latency) can obscure important details, such as tail latencies.
* Percentiles (e.g., 99th percentile latency) offer a clearer view of performance extremes.
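
To make the difference between averages and percentiles concrete, here is a minimal sketch using the nearest-rank method (one of several percentile definitions). The `percentile` helper and the latency samples are invented for illustration: 98 fast requests and 2 very slow ones.

```python
import math
from statistics import mean


def percentile(samples: list[float], pct: float) -> float:
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not samples:
        raise ValueError("need at least one sample")
    if not 0 < pct <= 100:
        raise ValueError("pct must be in (0, 100]")
    ordered = sorted(samples)
    rank = math.ceil(pct * len(ordered) / 100)  # 1-based rank
    return ordered[rank - 1]


# Made-up data: most requests take 20 ms, a few take 2 seconds.
latencies_ms = [20.0] * 98 + [2000.0] * 2

print(mean(latencies_ms))            # 59.6   (hides the slow requests)
print(percentile(latencies_ms, 50))  # 20.0   (typical request)
print(percentile(latencies_ms, 99))  # 2000.0 (tail latency)
```

The average of 59.6 ms describes no actual request, while the 50th and 99th percentiles together show both the typical case and the slow tail.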

***

## **Best Practices**

* Use percentiles rather than averages to capture the distribution of performance, especially for latency.
* Standardize SLIs across services to simplify monitoring and ensure consistency.
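
One possible way to standardize SLIs is to keep a shared definition that every service reuses, so that each team does not invent its own. The sketch below is only an assumption about how that might look; the field names (`aggregation`, `window_seconds`, `measured_at`) and the service names are hypothetical.

```python
from dataclasses import asdict, dataclass


@dataclass(frozen=True)
class SLIDefinition:
    """Hypothetical shared template for an SLI."""
    name: str
    unit: str
    aggregation: str     # e.g. "p99" or "ratio"
    window_seconds: int  # how long each measurement window lasts
    measured_at: str     # "server" or "client"


# A single latency definition reused by every service.
STANDARD_LATENCY = SLIDefinition(
    name="request_latency",
    unit="ms",
    aggregation="p99",
    window_seconds=60,
    measured_at="server",
)


def sli_for(service: str, template: SLIDefinition) -> dict:
    """Attach a service name to the shared definition."""
    return {"service": service, **asdict(template)}


for service in ("checkout", "search"):
    print(sli_for(service, STANDARD_LATENCY))
```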

***

## **Conclusion**

* Defining and managing SLIs, SLOs, and SLAs is key to delivering a reliable service.
* Prioritize the metrics that matter most to users and that align with your system's goals.

