How to deliver predictably with Platform Analytics

Ralf Huuck
Founder & CEO @ Logilica

In a talk session at PlatformCon in June 2022, Ralf Huuck, the founder and CEO of Logilica, discusses Engineering Analytics for platform predictability.

Huuck explains how Platform Analytics can deliver stability and predictability within product-delivery systems and defines what it takes to integrate all silos within a company's delivery platform effectively. He suggests that users can utilize data-based analytics to avoid bottlenecks, receive real-time alerts, and engineer platforms for more efficiency, stability, strategic organization, and predictable outcomes.

Let’s explore the key takeaways in case you can’t watch the conference talk.

A brief guide on how platform analytics can help deliver products on time

Recent global supply-chain bottlenecks have exposed the urgent need for accuracy and predictability throughout product pipelines. Companies such as Amazon and Alibaba already invest enormous effort in predictability for package tracking and delivery.

In this world, most things that can go wrong usually do. As the adage goes, we're only as strong as our weakest link. How can one ensure robust predictability in product delivery pipelines? The key to achieving more stable delivery streams is data-driven measurements through platform analytics.

Huuck explains that modern platform engineering identifies and removes bottlenecks by measuring DORA metrics and other key performance indicators (KPIs), with the underlying data accessed through application programming interfaces (APIs). This data-analytics overlay in modern platform engineering allows for successful tracking and strategic conflict resolution, introducing delivery systems that can be more predictable and stable than ever before.

Product delivery platforms: markers & solutions

To start, business development and operations should ensure that their business goals align with the product delivery system. One common goal for modern DevOps and agile CI/CD processes is to have short iteration cycles. Short cycles allow you to

  • Ship your products or services to your clients quickly.
  • React to your customers' feedback in real-time.
  • Adjust the product as needed.

Ideally, you already have the infrastructure to optimize your cycles to deliver products at scale while maintaining consistent quality. If you build the infrastructure well, you only have to build it once, Huuck points out. After that, you can rely on it over and over again to roll out future products or other development team goals.

Steps in the software delivery process

“There are quite a few steps between us and the customers,” Huuck states. There’s the planning cycle, JIRA tickets, code creation, packaging it all up, security testing and release processes, and more. The goal is to streamline the processes so they’re fast and reliable. This makes your team happy because there aren’t an overwhelming number of bottlenecks, and it makes your customers happy because you’re consistently delivering the right things on time. 

As you can see, there are already a significant number of steps in a single delivery pipeline. Each step requires a wide range of different tools that all need to interact with each other so your pipeline is as fluid and efficient as possible.

More tools mean more places to get stuck. There are many stages in which things can go wrong, such as

  • Your ticketing system,
  • Your repository,
  • The build system,
  • The security checking tools, and
  • Release tools.

Huuck highlights that developers may run into issues with specifications not being well described, teams veering away from the initial plan, hiccups in development, the build server falling over, or security concerns.

“In the product management and the engineering management space,” he explains, “you want to stay on top of all this, especially now that we are working in a much more distributed fashion. You want to see what’s going on without logging on to each individual tool.”

The solution, according to Huuck? 

Top-tier businesses need a bespoke engineering platform tailored to provide the necessary data to maintain maximum efficiency. The right engineering analytics platform gives developers the tools they need to evaluate how the different teams and product pipelines are performing and identify where the bottlenecks or inefficiencies exist.

The new trend is to develop engineering analytics platforms that can digest and filter the data “to give you a radar-level view of your whole organization in terms of engineering and product delivery.”

Strategic system integration

Huuck states that you can typically get the data you need from the tools you already have. “Most modern tools already provide you with APIs and GraphQL, so you can relatively easily get that data.”
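As a concrete illustration of pulling delivery data from an existing tool, the sketch below builds a GraphQL query against the GitHub API and derives open-to-merge cycle times from the response. The repository fields, query shape, and sample payload are illustrative assumptions, not part of the talk; a real analytics platform would fetch this over HTTPS with an auth token.

```python
from datetime import datetime

# Hypothetical GraphQL query against the GitHub API (field names assumed
# for illustration); a real client would POST this with an auth token.
PR_QUERY = """
query($owner: String!, $name: String!) {
  repository(owner: $owner, name: $name) {
    pullRequests(last: 50, states: MERGED) {
      nodes { number createdAt mergedAt }
    }
  }
}
"""

def pr_cycle_times(response: dict) -> dict:
    """Map PR number -> open-to-merge time in hours from a GraphQL response."""
    nodes = response["data"]["repository"]["pullRequests"]["nodes"]
    times = {}
    for pr in nodes:
        opened = datetime.fromisoformat(pr["createdAt"].replace("Z", "+00:00"))
        merged = datetime.fromisoformat(pr["mergedAt"].replace("Z", "+00:00"))
        times[pr["number"]] = (merged - opened).total_seconds() / 3600
    return times

# Made-up sample response, shaped like the query above.
sample = {
    "data": {"repository": {"pullRequests": {"nodes": [
        {"number": 42, "createdAt": "2022-06-01T09:00:00Z",
         "mergedAt": "2022-06-02T09:00:00Z"},
    ]}}}
}
print(pr_cycle_times(sample))  # {42: 24.0}
```

The point is not the specific API: once each tool's events are reduced to timestamps like these, the same aggregation works across the ticketing system, build system, and release tools.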

He suggests using your entire range of feedback and analytics to build a strategic platform that offers insights such as 

  • The velocity of your delivery pipeline,
  • Your throughput or release cadence, and
  • Potential risk areas for burnout or bottlenecks.

It's essential to track the KPIs in these areas using APIs. As Huuck puts it, "Engineering is a team sport and delivery, especially, is more about the flow - from product creation to shipping it out of the door." Essentially, smart platform analytics should measure and calibrate your “flow.”

Platform flow: three markers

Huuck identifies three markers most commonly aligned with measurements of success for delivery systems: 

01. Velocity. The faster a business can move, the faster it can respond to market demands or client feedback.

02. Risks. There are always risks. Even velocity can come with risk because security or quality can suffer if businesses move too fast.

03. Throughput. The final throughput numbers will suffer if we move slowly to allow for superior quality.

Huuck compares these markers to rivers that flow from the source into the ocean. Business leaders need to care about their flow and measure these high-level metrics frequently to find a good balance that matches business needs and the profile of the teams.


Aside from different tools, there are also different stakeholders at different levels of the organization who will be looking at slightly different velocity metrics. It’s essential to build these analytics to provide insights for the entire organization.

Velocity means different things to different team members. Each stakeholder’s idea of velocity will vary based on their perspective, and each perspective is valid, and it’s vital to recognize this.

For a software engineer, velocity revolves around how long it takes to work through their PR from the moment they open it until it’s reviewed and merged in. For those working on the DevOps platform side, velocity could be about how long an average build takes, how many build servers they need, or how well they can parallelize a process.

For the security team, velocity could focus on how long a scan takes or whether the scan holds up the release process. Meanwhile, product managers are looking at high-level KPIs, DORA metrics, and issue lead times, so velocity for them is all about how long it takes to complete projects and hit release milestones.

And of course, velocity for the whole thing is the release cycle: how long it takes from inception to getting it out the door. 

Risk & warning signals

Huuck and his team analyzed code from around 30,000 developers to map out how long pull requests take for engineering teams, on average. His example observed four steps:

  1. Development
  2. Response
  3. Review
  4. Integration

Risk leaves calling cards: early warning signals. Analyzing the time spent in each phase of the cycle helps identify these warning signs.
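The four phases above can be timed directly from event timestamps. This is a minimal sketch; the event names (`first_commit`, `pr_opened`, and so on) are assumptions about what an analytics pipeline would extract from Git and the PR tool, and the example numbers are invented to mirror the warning signs Huuck describes.

```python
from datetime import datetime

# Each phase is the interval between two events (names assumed for illustration).
PHASES = [
    ("development", "first_commit", "pr_opened"),
    ("response",    "pr_opened",    "first_review"),
    ("review",      "first_review", "approved"),
    ("integration", "approved",     "merged"),
]

def phase_hours(events: dict) -> dict:
    """Hours spent in each delivery phase, given ISO-8601 event timestamps."""
    ts = {name: datetime.fromisoformat(stamp) for name, stamp in events.items()}
    return {phase: (ts[end] - ts[start]).total_seconds() / 3600
            for phase, start, end in PHASES}

example = {
    "first_commit": "2022-06-01T09:00",
    "pr_opened":    "2022-06-03T09:00",  # long development: feature too large?
    "first_review": "2022-06-05T09:00",  # ~2 days waiting: a warning sign
    "approved":     "2022-06-05T12:00",  # review itself is quick
    "merged":       "2022-06-06T12:00",  # slow integration: possible QA issue
}
print(phase_hours(example))
# {'development': 48.0, 'response': 48.0, 'review': 3.0, 'integration': 24.0}
```

Once every PR is reduced to a dictionary like this, averaging per phase across the team surfaces exactly the patterns discussed below.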

In the example Huuck shared, the items spent a lot of time in development, which lasts from the start of coding until the response to the pull request. This can indicate that the features are too large.

Huuck pointed out that the response phase lasting nearly two days was a significant warning sign. He suggested that companies with a long response time should evaluate whether they have enough people or if their team is spread out across multiple time zones.

The review phase took the least amount of time in this example. However, the integration stage took longer than it should have. Huuck suggested that this could imply a QA issue.

By breaking all of this information down by phase, teams can pinpoint areas that need a boost in velocity. This may lead teams to consider smaller pull requests, ask for more reviewers, or redistribute their work for a smoother pipeline.

Though the granular data necessary for noting trends is often very hard to see, data analytics provide an opportunity to delve into individual cogs to make the machine operate better as a whole. 

DORA metrics

If you're on the DevOps platform-delivery side, you'll likely want to use DORA metrics. DORA (DevOps Research and Assessment) is the Google-backed research program whose metrics analyze deployment frequency, lead time for changes, time to restore service, and change failure rate.

DORA helps identify the three flow markers:

  • Deployment frequency can be used to measure throughput
  • Lead time for change and time to restore can be used to measure velocity
  • Change failure rate can be used to measure risk
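The mapping above can be made concrete with a small computation over deployment records. This is a hedged sketch: the record fields and the sample numbers are illustrative assumptions, not data from the talk.

```python
from datetime import date

# Made-up deployment log: one record per deployment.
deployments = [
    {"day": date(2022, 6, 1), "lead_time_h": 20, "failed": False},
    {"day": date(2022, 6, 2), "lead_time_h": 30, "failed": True},
    {"day": date(2022, 6, 3), "lead_time_h": 10, "failed": False},
    {"day": date(2022, 6, 3), "lead_time_h": 16, "failed": False},
]

def dora_summary(deps, period_days):
    """Reduce a deployment log to the three flow markers."""
    n = len(deps)
    return {
        # Throughput: deployments per day over the period.
        "deployment_frequency": n / period_days,
        # Velocity: average lead time for changes, in hours.
        "lead_time_h": sum(d["lead_time_h"] for d in deps) / n,
        # Risk: share of deployments that caused a failure.
        "change_failure_rate": sum(d["failed"] for d in deps) / n,
    }

print(dora_summary(deployments, period_days=4))
# {'deployment_frequency': 1.0, 'lead_time_h': 19.0, 'change_failure_rate': 0.25}
```

A fourth DORA metric, time to restore service, would slot into the velocity marker the same way once incident records are available.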

Delivering analytics

“Pulling this all together,” Huuck says, “you can actually define success criteria and deliver insights to the right stakeholders.” 

For example, the engineering team is looking at velocity:

  • How long does it take on average?
  • What are some of the outliers?
  • How do we track it over time?
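The three questions above reduce to simple statistics over per-PR cycle times. The sketch below flags outliers as values more than two standard deviations above the mean; the threshold and sample numbers are illustrative assumptions.

```python
import statistics

# Invented per-PR cycle times in hours; one PR took far longer than the rest.
cycle_hours = [20, 22, 19, 80, 21, 23, 18]

def velocity_report(samples):
    """Average cycle time plus outliers (> mean + 2 population stdevs)."""
    mean = statistics.mean(samples)
    stdev = statistics.pstdev(samples)
    outliers = [s for s in samples if s > mean + 2 * stdev]
    return {"average_h": round(mean, 1), "outliers": outliers}

print(velocity_report(cycle_hours))  # {'average_h': 29.0, 'outliers': [80]}
```

Running the same report week over week gives the "track it over time" view: if averages stay elevated after a release instead of normalizing, that is the tuning signal Huuck describes next.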

Typically, things can get more stressful during a release; then afterward, things should normalize. If things are not normalizing over time, it’s often an indication that things are not well-tuned, and teams may need to improve communication.

Using DORA metrics to track teams over time enables managers to see how the team is performing. Is it an elite team that can ship more than once a day, or does it take a week or a couple of months? How often were there issues, and how long did it take to resolve them? What proportion of the process do these issues consume?

Growing maturely: cross silos & link to evidence

Mapping these metrics over time clearly indicates how teams are improving and pinpoints areas for growth. As teams continue to analyze these patterns, they can build organizational maturity, develop more predictive models, and analyze long-term trends.

Using a variety of metrics over time enables managers to craft a detailed map of their system that enables them to make their delivery pipeline more predictable. Growing maturely and sustainably rests on "being able to apply the right action at the right activity," Huuck states. With the right tools, teams can better integrate their back-end and front-end teams and scale responsibly.

“One word of caution, one thing that I think is quite important, whenever we introduce data into organizations,” Huuck warns: not every piece of data is “really perfect, but I think it’s quite important to note the trends.”

Lastly, Huuck says all data approaches need team buy-in. He emphasizes that an open team culture is extremely important, and recommends sharing the data openly and setting milestones and goals as a team rather than having them dictated from the top down.

"The more data you have," Huuck summarizes, "the more you can connect it up in your much more mature organization, the better your chain of evidence." Then, it’s up to the team to bring all of these connections together to create a more efficient engineering organization.

In sum: Grow responsibly, predictably.

Watch the full conference talk here.