Humanitec’s DevOps Benchmarking Study: how to overcome the DevOps mountain of tears

Kaspar von Gruenberg
CEO @ Humanitec

Humanitec surveyed over 1,800 engineering organizations across the EU and US to learn more about what makes a high-performing DevOps setup and what low-performing teams can do to overcome the “DevOps mountain of tears”.

In November 2021, Kaspar von Gruenberg, CEO of Humanitec, presented his findings and observations in a comprehensive webinar, aiming to answer the question on everyone’s mind: what sets top-performing DevOps setups apart from the rest?

The DevOps mountain of tears

When analyzing this huge dataset, Humanitec assigned each participant a DevOps Maturity Score (DMS), compiled from their answers to questions built around industry-accepted metrics and concepts.

DevOps score

With a median score of 53 and the vast majority of participants falling into the mid-performance bracket, this graph illustrates the “DevOps mountain of tears”. Many teams in this mid-performance bracket find themselves stuck there, and that is the problem Humanitec set out to address with this study.

Tech and architecture

The first area covered by the survey is tech and architecture, and overall the results here weren’t surprising. The vast majority of top performers run loosely coupled application architectures, a setup that accounts for over 58% of the architectures reported in the study.

The architecture of your applications

Kaspar states “in some situations, however, monolithic architecture makes more sense, so it’s not a good idea to try and find a general rule for that”. With that in mind, seeing monolithic architecture make up the vast majority of low-performance setups is roughly in line with the current state of the industry. 

Degree of containerization

Similarly, the degree of containerization can depend on the situation, but containerization is heavily adopted by top performers: 82% of all teams reported that they are either fully containerized, currently migrating, or planning to migrate soon. With low-performing teams far more likely to report no plans to migrate at all, it’s clear that containerized services are a key part of high-performance DevOps setups.

Example: Do you store your infrastructure configurations in a version control system?

Likewise, the vast majority of top-performing teams follow the Infrastructure as Code (IaC) approach, while the majority of low performers do not.
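
As a rough illustration of what that looks like in practice, here is a minimal IaC sketch using Pulumi’s Python SDK; the tool, the resource, and the names are purely illustrative and not drawn from the study.

# infra/__main__.py: a minimal Infrastructure-as-Code sketch using Pulumi's
# Python SDK. The tool and the resource are illustrative only; the point is
# that the desired infrastructure is described in a file that lives in version
# control and gets reviewed like any other code. Applying it requires the
# Pulumi CLI ("pulumi up") and cloud credentials.
import pulumi
import pulumi_aws as aws

# Declare a storage bucket as code instead of creating it by hand in a console.
artifact_bucket = aws.s3.Bucket(
    "build-artifacts",
    tags={"team": "platform", "managed-by": "pulumi"},
)

# Export the bucket name so other stacks and scripts can reference it.
pulumi.export("artifact_bucket_name", artifact_bucket.id)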

With all of that being said, Kaspar cautions against tying technical setup too closely to DevOps performance. “You can ship things really fast with a monolithic architecture if that’s how you have things set up. Containerization can make things easier, and configuration as code is becoming more standard, but it’s not the predominant variable that will impact your performance metrics”.

Performance metrics

Tech and infrastructure do, however, correlate strongly with an engineering team’s performance metrics. In the survey, Humanitec covered four key DevOps metrics: Deployment Frequency, Lead Time, Mean Time to Repair (MTTR), and Change Failure Rate, all well known from the book Accelerate by Nicole Forsgren, Jez Humble, and Gene Kim.

Deployment frequency

High- and top-performing teams overwhelmingly deploy on-demand or, at the very least, several times a day, with the majority of low performers deploying monthly. The top teams are shipping much faster, which has a compounding effect on an organization’s productivity across the board. 

Lead time

This is also evident when comparing lead times between top and low performers. Kaspar says:

“The lead time is minutes for an engineer in a top-performing environment, vs up to a month for an engineer in a low performing environment. You can build that direct correlation between deployment velocity and revenue growth of a company.”

MTTR

The top-performing teams also have a significantly lower MTTR, that is, the average time it takes to recover from a product or system failure. Meanwhile, the vast majority of low performers need anywhere between a day and a week to get back on track. That’s a lot of lost revenue, regardless of the industry these engineering teams work in.

Change Failure Rate

Finally, 21% of all low performers have to roll back over 15% of all changes, which is a startling number when the cost of revisiting “faulty” code is considered. 
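
To make these four metrics concrete, here is a minimal sketch of how a team might compute them from its own delivery records. The record structure and the sample figures are assumptions made for this example, not data from the study.

# An illustrative sketch of computing the four DevOps metrics from a team's
# own delivery records. Field names and sample figures are assumptions, not
# data from the Humanitec study.
from datetime import datetime, timedelta

# Each deployment: when the change was committed, when it reached production,
# and whether it caused a failure in production.
deployments = [
    {"committed": datetime(2021, 11, 1, 9, 0), "deployed": datetime(2021, 11, 1, 9, 40), "failed": False},
    {"committed": datetime(2021, 11, 1, 13, 0), "deployed": datetime(2021, 11, 1, 14, 5), "failed": True},
    {"committed": datetime(2021, 11, 2, 10, 0), "deployed": datetime(2021, 11, 2, 10, 30), "failed": False},
]

# Each production incident: when it started and when service was restored.
incidents = [
    {"started": datetime(2021, 11, 1, 14, 10), "resolved": datetime(2021, 11, 1, 15, 0)},
]

window = timedelta(days=2)  # observation window covered by the records

# Deployment frequency: how often changes reach production.
deployment_frequency = len(deployments) / window.days

# Lead time: average time from commit to running in production, in minutes.
lead_time_min = sum((d["deployed"] - d["committed"]).total_seconds() / 60 for d in deployments) / len(deployments)

# MTTR: average time from failure to recovery, in minutes.
mttr_min = sum((i["resolved"] - i["started"]).total_seconds() / 60 for i in incidents) / len(incidents)

# Change failure rate: share of deployments that caused a failure in production.
change_failure_rate = sum(d["failed"] for d in deployments) / len(deployments)

print(f"Deployment frequency: {deployment_frequency:.1f}/day")
print(f"Lead time: {lead_time_min:.0f} min")
print(f"MTTR: {mttr_min:.0f} min")
print(f"Change failure rate: {change_failure_rate:.0%}")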

So what can teams learn from this? Kaspar says, again, not much. “This just shows the interplay between metrics and DevOps performance. It’s naive to assume these metrics give us an indication of what we can do better in DevOps”.

What these metrics do show, however, is a positive correlation between the DevOps Maturity Score established earlier and an engineering team’s performance.

Team setup and culture

The final section of the survey covered how DevOps and developer experience are managed within the organization, and this is where the main difference between the top performers and those struggling with the “DevOps mountain of tears” becomes apparent.

Nearly all of the top-performing teams reported that their organization operates with full developer self-service and very little dependence on operations. The “you build it, you run it” philosophy, however, is far less prevalent within low-performing teams.

Kaspar says:

“The reality is that modern delivery systems are super complex. We’re now living in a world where you have very complex architectures that you operate with dozens of different tools that have to serve hundreds of thousands of users.”

In these situations where developer self-service may not be fully feasible, many low-performing teams end up using a hybrid “shadow operations” approach where senior developers are constantly helping mid-tier or junior developers with DevOps tasks. 

It’s at this point that organizations have to consider developer experience, and this also correlates with the difference in performance levels. 

We as an organization are constantly striving to improve developer experience

The study found that almost all of the top-performing teams reported that their organization invested heavily in developer experience. 

Overall, it’s clear that developer self-service and investment in developer experience are key to helping teams move beyond the low and mid-performance brackets and reach the higher tiers of DevOps.

Scaling the mountain

So, how can engineering teams use this data to conquer the DevOps mountain of tears? 

Unfortunately, the answer is that it depends on each team’s setup, organization, and project. What this study does show, however, is that there are a few key things teams need to take into account to improve their DevOps performance.

“My base assumption is that developers are really clever people,” says Kaspar. “All of the individual processes are completely feasible if developers throw time and cognitive load into managing them. But how much are they willing to allocate? What is their acceptance level?”

There’s a positive relationship between complexity and cognitive load. While engineers are paid to deliver business logic, organizations need to ask how much cognitive load engineers can reasonably carry on top of their core specialization.

Once again, this looks different for every team, and balancing self-service and abstraction is vital to improving the developer experience. If there aren’t enough abstractions, engineering teams can end up taking on too much cognitive load. On the other hand, too many abstractions leave developers feeling restricted and, in the worst case scenario, actively working to circumvent the abstractions that are limiting their work. 

Similarly, golden paths help strike that cognitive load balance while still allowing developers to circumvent abstractions if need be. But, Kaspar adds, they need to be paths, not cages. “Abstractions should be developed for the most junior member of the team, but it has to be something you can circumvent if you’re a senior and comfortable doing that”.

What this means is that organizations shouldn’t stop developers from interacting with low-level code if that’s what is needed, but golden paths should be put into place to abstract that away from developers who don’t want that additional cognitive load. 
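
As a rough sketch of this idea (not Humanitec’s implementation, and with purely hypothetical names and defaults), a golden path can be as simple as a deployment helper whose defaults cover the common case, with explicit overrides as the escape hatch:

# An illustrative "golden path with an escape hatch". The function, fields,
# and defaults are hypothetical, not taken from the study or from any product:
# the paved road covers the common case, while explicit overrides let senior
# engineers drop down to low-level settings when they need to.
from dataclasses import dataclass

@dataclass
class DeploymentSpec:
    service: str
    image: str
    # Golden-path defaults chosen by the platform team.
    replicas: int = 2
    cpu: str = "250m"
    memory: str = "256Mi"

def deploy(service: str, image: str, **overrides) -> DeploymentSpec:
    """Most teams call this with just a service name and an image; anything
    else is an explicit, visible step off the paved road."""
    spec = DeploymentSpec(service=service, image=image)
    for key, value in overrides.items():
        if not hasattr(spec, key):
            raise ValueError(f"unknown override: {key}")
        setattr(spec, key, value)
    return spec

# A junior developer stays on the golden path: two arguments, sensible defaults.
checkout = deploy("checkout", "registry.example.com/checkout:1.4.2")

# A senior developer deliberately steps off the path where the workload needs it.
search = deploy("search", "registry.example.com/search:2.0.0", replicas=6, memory="2Gi")

print(checkout)
print(search)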

Establishing an Internal Developer Platform, or IDP, is the most common way to implement golden paths as part of a DevOps setup. But, as with any other product, an IDP has to be managed in a way that brings real value to the teams that use it.

By treating the IDP like any other product the organization builds, the platform team focuses on what its customers, the engineering teams within the organization, actually need, rather than on what the platform team itself thinks is a useful feature.

High-performing teams prioritize developer experience

The main thing that differentiates high-performing teams from the rest is that they prioritize developer experience and full developer self-service. Because DevOps can, and should, look different in each organization depending on its goals and what works best for it, it’s less about the tools you provide for your teams and more about how efficiently they can work with those tools within your software delivery pipeline.

Similarly, establishing a platform team to build an IDP is hugely beneficial to DevOps performance. The vast majority of top-performing teams had a platform team to manage abstractions and industry best practices like configuration and infrastructure as code. Most importantly, these platform teams communicated directly with engineering teams to determine the right abstraction balance.

If you haven’t watched the webinar yet, you can find it here. Or, if you want to dive deeper, you can read the full study here.