
SLA-SLO-SLI and DevOps metrics

[PLACEHOLDER]

Companies need metrics that help them stay in business by making sure they meet their customers' expectations. The name of the game is higher customer satisfaction, earned by winning customers' trust and loyalty. To do so, you want to provide good products and services. You therefore need ways to monitor performance, drive continuous improvement and deliver the quality consumers expect in a highly competitive market.

Photos from AlphaTradeZone via Pexels and Spacejoy via Unsplash

SLAs, SLOs and SLIs are a good way to achieve the above. They allow clients and vendors to be on the same page when it comes to expected system performance.

If we go one level deeper, vendors and providers work on NFRs (Non-Functional Requirements) when building their solutions. NFRs define the quality attributes of a system. I bring them up because they provide the foundation for the SLA-SLO-SLI definitions. By aligning the two, organizations can effectively manage and monitor the performance and reliability of their systems. NFRs are technically focused, while SLAs are customer focused.

List of NFR groups:

  • Performance
  • Scalability
  • Portability
  • Compatibility
  • Availability
  • Reliability
  • Usability

An example showcasing the relationship between an NFR and an SLA can be found a few paragraphs below, in the section “An example of aligning SLAs and NFR”. Jump to it if you like, or continue reading until you get there.

These agreements and metrics give SRE and DevOps teams a clear understanding of the performance objectives required for user satisfaction. They keep the focus on reliability, stay customer-centric and aligned with business goals, and therefore push for continuous improvement.


Ok, coming back to SLAs-SLOs-SLIs… what are they?

  • SLA: Service Level Agreement. The agreement between the service provider and the customer that outlines the expected level of service, including specific performance and reliability targets.
  • SLO: Service Level Objective. The specific, measurable targets the service must hit to honor the SLA.
  • SLI: Service Level Indicator. The actual metrics used to measure the performance and reliability of the system.

In other words, the SLIs are the numbers that will be used to demonstrate that you have met the promises (SLOs) you have agreed upon within the contract/agreement (SLA) with your customer.

It is important to highlight that not every trackable metric should be an SLI.
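To make the SLI-SLO-SLA relationship concrete, here is a minimal Python sketch. It assumes an availability SLI measured as the share of successful requests in a time window. The function names, the 99.9% target and the request counts are illustrative assumptions, not values taken from any real agreement.

```python
def availability_sli(successful_requests: int, total_requests: int) -> float:
    """SLI: fraction of requests served successfully in the measured window."""
    if total_requests == 0:
        # Assumption: a window with no traffic counts as fully available.
        return 1.0
    return successful_requests / total_requests


def slo_met(sli: float, slo_target: float) -> bool:
    """SLO check: the measured indicator must reach the agreed target."""
    return sli >= slo_target


if __name__ == "__main__":
    # Hypothetical monthly numbers for a service with a 99.9% availability SLO.
    sli = availability_sli(successful_requests=999_500, total_requests=1_000_000)
    print(f"Availability SLI: {sli:.4%}")
    print("SLO met" if slo_met(sli, slo_target=0.999) else "SLO missed")
```

The SLA would then state what happens when the SLO is missed, for example a service credit; that part is contractual rather than something the code decides.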

An example of aligning SLAs and NFR

Let us assume you have the following NFR: 

The CMS database must be backed up on a daily basis. 

This can be a way to align the SLAs with the above NFR:

Database Service Level Agreement (SLA)
Backup and Recovery: Full backups will be performed daily, and incremental backups will be performed hourly. If a failure occurs, the database will be restored to the last backup within thirty (30) minutes.
SLO: A full backup runs daily and an incremental backup runs hourly. Data is restored within 30 minutes of the failure.
SLI: Backup success rate, tracked via monitoring logs.
Backup metrics: backup success rate, backup duration, backup size, restore time and restore success rate.
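As a rough illustration of how these SLIs could be computed, the Python sketch below assumes backup runs have already been parsed from the monitoring logs into simple records. The `BackupRun` type, its fields and the sample data are hypothetical; only the 30-minute restore limit comes from the example above.

```python
from dataclasses import dataclass
from datetime import timedelta


@dataclass(frozen=True)
class BackupRun:
    """One backup attempt, as it might be extracted from monitoring logs."""
    kind: str  # "full" or "incremental"
    succeeded: bool
    duration: timedelta


def backup_success_rate(runs: list[BackupRun]) -> float:
    """SLI: share of backup runs that completed successfully."""
    if not runs:
        return 0.0  # Assumption: no recorded runs means nothing succeeded.
    return sum(run.succeeded for run in runs) / len(runs)


def restore_within_slo(
    restore_time: timedelta, limit: timedelta = timedelta(minutes=30)
) -> bool:
    """SLO check: the restore finished within the agreed 30 minutes."""
    return restore_time <= limit


if __name__ == "__main__":
    # Hypothetical sample: one full backup and three incrementals, one failed.
    runs = [
        BackupRun("full", True, timedelta(minutes=42)),
        BackupRun("incremental", True, timedelta(minutes=3)),
        BackupRun("incremental", False, timedelta(minutes=1)),
        BackupRun("incremental", True, timedelta(minutes=4)),
    ]
    print(f"Backup success rate: {backup_success_rate(runs):.0%}")
    print("Restore within SLO:", restore_within_slo(timedelta(minutes=25)))
```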

DevOps metrics

With these metrics you can track the technical capabilities and team processes around your software development pipeline. Being able to monitor and identify issues lets you address them efficiently and effectively, and discover new automation opportunities that make your operations more proactive. If you think about it, this all relates to the principles of continuous improvement, which are central to the DevOps culture.

There are many DevOps metrics, and I encourage you to research them in the many great articles available online. Like many of those articles, I will highlight just four key ones (I recommend following the "breadcrumbs" left in this article and reading more about them):

  1. Lead time for changes
  2. Deployment frequency
  3. Mean time to recovery (MTTR)
  4. Change failure rate
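The four metrics above can be sketched in Python as follows. This is a simplified illustration, assuming each deployment is recorded with its commit time, its deploy time and, when it caused a failure, the time service was restored. The `Deployment` type and its fields are hypothetical, and averaging is just one possible aggregation (some teams prefer the median).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass(frozen=True)
class Deployment:
    """One production deployment (hypothetical record layout)."""
    committed_at: datetime
    deployed_at: datetime
    failed: bool = False
    recovered_at: Optional[datetime] = None


def lead_time_for_changes(deployments: list[Deployment]) -> timedelta:
    """Average time from commit to running in production."""
    if not deployments:
        raise ValueError("at least one deployment is required")
    total = sum((d.deployed_at - d.committed_at for d in deployments), timedelta())
    return total / len(deployments)


def deployment_frequency(deployments: list[Deployment], period_days: int) -> float:
    """Average number of deployments per day over the period."""
    return len(deployments) / period_days


def change_failure_rate(deployments: list[Deployment]) -> float:
    """Share of deployments that caused a failure in production."""
    if not deployments:
        raise ValueError("at least one deployment is required")
    return sum(d.failed for d in deployments) / len(deployments)


def mean_time_to_recovery(deployments: list[Deployment]) -> Optional[timedelta]:
    """Average time from a failed deployment to recovery, or None if no failures."""
    outages = [
        d.recovered_at - d.deployed_at
        for d in deployments
        if d.failed and d.recovered_at is not None
    ]
    if not outages:
        return None
    return sum(outages, timedelta()) / len(outages)
```

Feeding these functions with records exported from a CI/CD tool and an incident tracker would give a first, rough dashboard; how a "failure" and a "recovery" are defined is a team decision that the code does not settle.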

Differences between SLAs and DevOps Metrics

SLAs and DevOps metrics both look after the quality and performance of products and services.
That said, they serve different purposes. SLAs set clear expectations and accountability, making sure providers meet the standards and promises agreed upon.

DevOps metrics, on the other hand, help teams improve processes and deliver software effectively and with the quality expected. 

It is common for SRE teams to leverage SLA-SLO-SLIs to ensure reliability metrics are met, and the systems are performing as intended. 

DevOps teams, in turn, focus on leveraging DevOps metrics to measure the success of development and operations processes and to identify areas for improvement.
