7 steps to DevOps hell

The truth is... DevOps doesn’t scale to enterprise-sized setups (the breaking point is somewhere beyond roughly 50-80 devs). Lucky for us, there are clear signals that tell you how far into the DevOps trap your organization has fallen. Welcome to the steps to DevOps hell.

Luca Galante

Core contributor @ Platform Engineering

Published on

October 31, 2023

Maybe it’s just me. Maybe I have simply been working on this stuff too long and repeating these things too often. But I really do find it funny that an article with this title can still be relevant. That people still need to hear that DevOps is not the solution to their problems and in fact it might very well be what got them into this mess in the first place.

Yet, very few engineering organizations are even (painfully) aware of the challenges created by blindly following a traditional approach to DevOps and trying to make it work at an enterprise scale. And what’s even crazier, most organizations are running fast down this staircase to DevOps hell and don’t even realize it. They still think that because Airbnb said they were doing DevOps 10 years ago, that’s still what the cool kids are doing and that’s the way to attract the best talent, move fast, be super agile and make executives happy.

The truth is, DevOps doesn’t scale to enterprise-sized setups (the breaking point is somewhere beyond roughly 50-80 devs), particularly in the cloud native era. And there are clear signals that tell you how far into the DevOps trap your org has fallen. Welcome to the steps to DevOps hell.

Step 1: Freedom for devs

It all starts with the decision that your organization should become more “agile”.

You should give more power to devs. After all this is the original promise of DevOps: higher innovation speed and more reliability, while shipping a better product for your customers with a shorter lead time. Get higher deployment frequency, lower your change failure rate, speed up your mean time to recovery. Fantastic. Everyone is enthusiastic.
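To make those three numbers concrete, here is a minimal sketch of how they could be computed from a deployment log. The `Deployment` record, its fields, and the `delivery_metrics` helper are hypothetical names made up for illustration, and the definitions are deliberately simplified (recovery time is measured from the failed deployment to the moment service was restored).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Deployment:
    # Hypothetical record shape; a real pipeline would export something richer.
    deployed_at: datetime
    failed: bool = False
    recovered_at: Optional[datetime] = None  # set only once a failed deployment was restored


def delivery_metrics(deployments: list[Deployment], period_days: int) -> dict:
    """Simplified deployment frequency, change failure rate and mean time to recovery."""
    failures = [d for d in deployments if d.failed]
    recoveries = [
        d.recovered_at - d.deployed_at
        for d in failures
        if d.recovered_at is not None
    ]
    return {
        # Deployment frequency: deployments per day over the observed period.
        "deploys_per_day": len(deployments) / period_days,
        # Change failure rate: share of deployments that caused a failure.
        "change_failure_rate": len(failures) / len(deployments) if deployments else 0.0,
        # Mean time to recovery: average duration from failure to restoration.
        "mean_time_to_recovery": (
            sum(recoveries, timedelta()) / len(recoveries) if recoveries else None
        ),
    }


# Made-up sample data: four deployments over two days, one of which failed
# and was restored 90 minutes later.
start = datetime(2023, 10, 30, 9, 0)
log = [
    Deployment(start),
    Deployment(start + timedelta(hours=4)),
    Deployment(
        start + timedelta(days=1),
        failed=True,
        recovered_at=start + timedelta(days=1, minutes=90),
    ),
    Deployment(start + timedelta(days=1, hours=5)),
]
metrics = delivery_metrics(log, period_days=2)
print(metrics)
```

Tracking these per team over time is one way to check whether the promised speed and reliability are actually showing up.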

Step 2: Do everything at once

You can’t wait for this bright future and you decide to change everything at once to become state of the art as quickly as possible. You replace your old monolith with a microservice architecture, move everything to the cloud. Everything gets containerized and deployed to your brand new Kubernetes clusters, across multiple providers. Finally we are catching up!

Step 3: Everybody knows everything (or should)

“You build it, you run it” is the new mantra. Every dev, whether a new hire or an existing team member, is now expected to learn and understand your tooling and pipeline in detail. They are supposed to master your increasingly complex toolchain to deploy their apps and services. But not everyone has the required knowledge or frankly the interest in figuring it all out. Things start breaking and your smartest and most experienced engineers will jump in to support more junior developers. They will start doing what we call shadow operations.

Step 4: Every team chooses their own tooling

Management promised more autonomy to teams without setting clear guidelines or actually facilitating it. So every team goes off and picks the latest and shiniest open source tool and starts creating their own automations and custom scripts. A few people start saying that this might end up in a big mess. The first few senior engineers doing shadow ops start burning out.

Step 5: You decide to centralize DevOps

Productivity is starting to drop. Your most experienced and valuable backend engineers are spending 70% of their time doing shadow ops, without actually being able to ship new features. Management decides to centralize DevOps, restructuring and growing the old Ops team. The centralized Ops team starts fighting with distributed shadow ops and individual team standards. Big chaos.

Step 6: Centralized DevOps teams build their own tooling and integrations

Your centralized DevOps team is asked to find the best in class setup among the 500 shades of deploying a service that each team has designed. It’s a lot of work and the team continues to grow, becoming more expensive. They build more and more custom tooling and integrations and try to roll it out across more teams, which ends up in a bigger and bigger mess.

Step 7: You decide to do platform engineering!

As the months or in some cases years go by, teams become more aware of the necessity for a platform approach. "Team Topologies," "golden paths," "paved roads," and "Internal Developer Platforms" become common buzzwords within the organization.

Platform teams are formed, but the question of scale remains. And here's the hard truth: with Ops/Dev ratios as skewed as 1:8, the road to redemption is a pretty long one.

What’s next?

Congratulations, you pulled yourself out of DevOps hell! Now you can change your title to Platform Engineer and finally go play with the cool kids on the block, shipping some fancy Internal Developer Platform. Or was that too fast?

Maybe. Moving from a traditional DevOps approach to platform engineering is not as straightforward as it might seem. Simply renaming your DevOps team to a platform team won’t do the trick (you’d be surprised how often that is all that actually happens).

The key change that needs to happen is a mindset shift. You need to go from a teaching mindset, where DevOps engineers add new tech and infra and try to teach devs how to use them (which doesn’t scale), to a product mindset, where the platform team ships the Internal Developer Platform as a product. This means listening to your internal customers (your devs) and iterating on the golden paths of your platform, finding the right level of abstraction for your engineers, without removing the necessary context.

If not, you’ll simply trade your quickly forgotten DevOps hell for a new kind: platform engineering hell.