Bringing together FinOps, DevOps, and platform engineering

Discover how to achieve cloud cost efficiency through the power of platforms, by combining proven and efficient engineering practices.
Ajay Chankramath
Head of Platform Engineering at Thoughtworks North America
Published on
November 8, 2023

Cloud cost optimization is core to most digital native organizations today. This article focuses on combining proven and efficient engineering practices around product development and delivery, to achieve cloud cost efficiency through the power of platforms.

Core definitions

DevOps is a cultural paradigm driven by a set of activities practiced by your whole engineering organization, instead of a subset of those practices executed by a team on behalf of your organization.

Similarly, platform engineering comprises the technologies, patterns, techniques, and governance processes required to provide a secure, scalable, seamless, and automated path to production. That includes enabling easier infrastructure and access provisioning and more effective compliance. It also involves cutting workload management complexity in an observable manner that reduces friction and developers’ cognitive load.

FinOps, as defined by the FinOps Foundation and practiced by most organizations adopting cloud, is a set of practices and principles that can help optimize costs with the specific intention of increasing the value generated by these cloud investments.

Premise of the problem

For you to realize the promise of improving cloud costs through appropriate DevOps practices, you need to start with five fundamental areas of focus. Succeeding in these focus areas makes your solution strictly platform-centric.

  1. Cognitive load. Reduce cognitive load on developers so they can focus on building customer-visible product functionalities, instead of how to build them.
  2. Efficiency. Improve operations efficiency so the capabilities built for the customer use cloud resources in the best way possible.
  3. Agility. Increase agility to pivot between building the right product capabilities without compromising quality or costs.
  4. Replaceability. Be ready to replace components, which is key to success in a fast-growing third-party ecosystem.
  5. Composability. Focus on composability as you pick and choose the components needed to create your contextual engineering ecosystem.

The illustration below gives a detailed, though not comprehensive, view of how these focus areas come together:

Figure 1: Basic Tenets of Platform Thinking Mapped to Potential FinOps Themes

DevOps lifecycle

Creating an Internal Developer Platform (IDP) does not immediately remove the need for existing embedded knowledge of delivering and operating production systems.

Once you have an IDP, the next step is to figure out how to operationalize it. Operationalization primarily requires an owner and a process.

The traditional way to do this is to create a dedicated team, often called a DevOps team, and have them use the platform’s capabilities to perform these functions as a proxy for the development teams. However, this model is an anti-pattern because it conflicts with the basic accepted tenets of platforms: it is not self-serve, and it creates additional levels of abstraction and communication channels between the DevOps team and the developers.

A better approach would be to use the concept of Technical Product Management to understand the true requirements and address them as directly as possible through the platform operating model. Platform Enablement may be required to assist teams in making the most of the development and delivery platform when they are ready to use it.

Pictured below is the popular continuous delivery lifecycle for most software development efforts.

Figure 2: Traditional DevOps lifecycle

The steps are fairly self-explanatory. Starting with planning, the lifecycle moves through coding, building, testing, releasing, installing (if applicable), and then tracking and operating. Within this context, let’s look at the key characteristics and how they compare under both models.

Characteristics | Stand-Alone DevOps model | Platform Enabled model
Knowledge Sharing | Very low | High
Technical Capabilities | Low | High
Tech Debt Pay Off | Low / Medium | Medium
Customer Impact “Future” | Low | Medium / High
Knowledge Flight Risk | High | Low
Developer Effectiveness | Low | High
New Capabilities | High | Low
Supportability | Low | High
Supports Shift Left | Low | Medium / High

In both models, the steps of your development lifecycle remain the same. The true expanded lifecycle in a cloud-native platform engineering world would look like the diagram below. As I’ll discuss later, these steps can potentially be optimized while tying together FinOps and DevOps.

Figure 3: Expanded DevOps lifecycle in a Cloud Native Environment

Now that we’ve looked at the DevOps lifecycle, let’s talk briefly about the other side of the coin: the FinOps lifecycle.

FinOps Lifecycle

In its simplest form, the FinOps approach can be mapped to four Rs:

  1. Report. Develop a higher level of reporting beyond what the CSPs provide.
  2. Recommend. Identify the areas of improvement and provide targeted recommendations.
  3. Remediate. Implement these recommendations automatically.
  4. Retain. Ensure the organization sustains the wins you’ve achieved, so that they become part of its DNA.

Figure 4: 4 Rs of FinOps

Report

Many tools in the market promise to solve your cloud cost optimization problems. Most of these tools are excellent at telling you what the problems are. However, solving the identified problems is far more complex, and you can’t rely on a simple licensed-tool approach. In this context, it’s worth further exploring the space to understand the problems and how your organization should approach them.

The reporting space is the most advanced for off-the-shelf solutions. These tools typically consume your existing usage and utilization data directly from the cloud provider reports. They also use other internal sources such as CMDB, ServiceNow, or any of your resource tracking systems to come up with a more granular approach to providing you with data insights.

Figure 5: Maturity levels in the FinOps lifecycle (Courtesy: FinOps.org) 

While it’s tempting for organizations to lean on the existing reports, I recommend you look at available tools and decide which is best for your organization based on your unique context.

Recommend

Once you have detailed reports, you’ll want recommendations based on the well-architected framework (WAF) that most major cloud service providers publish. Let’s look at four classes of recommendations you might need.

1. Cloud usage patterns

Wherever you are on your cloud journey, you should implement cloud usage patterns. This class covers recommendations around unused or underutilized resources, tagging opportunities, rightsizing, and resource mapping.
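To make this class of recommendation concrete, here is a minimal sketch of a usage review that flags untagged and underutilized resources. The tag policy, the 10% CPU threshold, and the `Resource` structure are all assumptions made for illustration; in practice the inventory would come from your cloud provider’s reports or your resource tracking system.

```python
from dataclasses import dataclass, field

# Hypothetical tag policy and utilization threshold; tune both to your context.
REQUIRED_TAGS = {"owner", "cost-center", "environment"}
UNDERUTILIZED_CPU_PCT = 10.0


@dataclass
class Resource:
    resource_id: str
    avg_cpu_pct: float  # average CPU utilization over the review window
    tags: dict = field(default_factory=dict)


def review(resources):
    """Return (resource_id, finding) pairs for untagged or underutilized resources."""
    findings = []
    for r in resources:
        missing = REQUIRED_TAGS - set(r.tags)
        if missing:
            findings.append((r.resource_id, "missing tags: " + ", ".join(sorted(missing))))
        if r.avg_cpu_pct < UNDERUTILIZED_CPU_PCT:
            findings.append((r.resource_id, "underutilized: consider rightsizing or removal"))
    return findings


# Illustrative inventory, not real data.
inventory = [
    Resource("vm-001", 3.2, {"owner": "team-a", "cost-center": "1234", "environment": "dev"}),
    Resource("vm-002", 55.0, {"owner": "team-b"}),
]
for resource_id, finding in review(inventory):
    print(resource_id, "->", finding)
```

A single CPU threshold is a deliberately crude signal. A real review would usually look at memory, network, and a longer observation window before recommending a change.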

2. Containerized workloads

Most of the clients I talk to these days are interested in the next level of cost allocation, usage, and utilization. As more and more of their workloads are containerized, any potential wastage is masked by the veneer of container orchestration. This is where tools such as Kubecost™ and Komodor™ come into consideration.

These tools provide similar optimization recommendations at the container level and identify opportunities at the node level.

Figure 6: A Typical Kubecost™ recommendation report
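The idea of container-level cost allocation can be sketched in a few lines. The example below splits a node’s hourly cost across pods in proportion to their CPU requests and reports the unrequested remainder as idle. This is a simplified illustration under assumed numbers, not the allocation model of any particular tool, and the function name and inputs are hypothetical.

```python
def allocate_node_cost(node_hourly_cost, node_cpu_cores, pod_cpu_requests):
    """Split a node's hourly cost across pods in proportion to requested CPU.

    Capacity that no pod requested is reported as "idle", which is the
    wastage that container orchestration tends to hide.
    """
    cost_per_core = node_hourly_cost / node_cpu_cores
    allocation = {pod: cores * cost_per_core for pod, cores in pod_cpu_requests.items()}
    allocation["idle"] = node_hourly_cost - sum(allocation.values())
    return allocation


# Hypothetical node: 8 cores at $0.40/hour, with pods requesting 5 cores in total.
costs = allocate_node_cost(0.40, 8, {"checkout": 2.0, "search": 2.5, "batch": 0.5})
for name, cost in costs.items():
    print(f"{name}: ${cost:.3f}/hour")
```

In this example, three of the eight cores are unrequested, so more than a third of the node’s cost shows up as idle. That is the kind of node-level opportunity these tools surface.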

3. Rate optimization 

Rate optimization is one of the lowest-hanging fruits in cloud cost optimization. It can help with matters like:

  1. Identifying and using spot discounts, which take advantage of unused capacity in the cloud
  2. Commitment-based discounts
  3. Sustained usage discounts
  4. Private pricing, a cost-saving mechanism offered by cloud vendors for enterprise customers with large spends

Figure 7: A Typical ProsperOps™ recommendation report
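The arithmetic behind rate optimization is simple, which is part of why it is low-hanging fruit. The sketch below compares the monthly cost of the same steady workload under three pricing options. The on-demand rate and the discount percentages are hypothetical placeholders; actual rates vary by provider, region, instance type, and contract.

```python
HOURS_PER_MONTH = 730  # common approximation for one month

# Hypothetical on-demand rate and discount levels, for illustration only.
ON_DEMAND_RATE = 0.10  # dollars per instance-hour
DISCOUNTS = {"on-demand": 0.0, "commitment": 0.30, "spot": 0.60}


def monthly_cost(instances, pricing):
    """Monthly cost for a number of always-on instances under a pricing option."""
    rate = ON_DEMAND_RATE * (1.0 - DISCOUNTS[pricing])
    return instances * HOURS_PER_MONTH * rate


baseline = monthly_cost(10, "on-demand")
for pricing in DISCOUNTS:
    cost = monthly_cost(10, pricing)
    print(f"{pricing}: ${cost:,.2f}/month, saves ${baseline - cost:,.2f}")
```

The comparison ignores the trade-offs that come with each option: spot capacity can be reclaimed by the provider, and commitments carry risk if usage drops.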

4. Cloud carbon optimization

A fourth class of reporting tools helps track and optimize your cloud carbon footprint. With most countries trying to reduce their carbon footprint, I expect these tools to become more pervasive. Most of them are built around an open source initiative called Cloud Carbon Footprint (CCF), which uses an API-based approach to estimate energy usage and carbon emissions from your cloud usage.

Figure 8: A Typical CCF™ recommendation report
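To show the general shape of such an estimate, here is a minimal sketch that converts compute usage into energy and then into emissions. It assumes a common form of the calculation: usage multiplied by an energy coefficient, a data center power usage effectiveness (PUE) factor, and a grid emissions factor. The function name and every coefficient value are hypothetical placeholders, not published figures.

```python
def estimate_compute_emissions(vcpu_hours, avg_utilization, min_watts, max_watts,
                               pue, grid_kg_co2e_per_kwh):
    """Estimate operational emissions (kg CO2e) for compute usage.

    Average power per vCPU is interpolated between idle (min_watts) and
    full-load (max_watts) draw based on average utilization.
    """
    avg_watts = min_watts + avg_utilization * (max_watts - min_watts)
    energy_kwh = avg_watts * vcpu_hours / 1000.0
    return energy_kwh * pue * grid_kg_co2e_per_kwh


# All inputs below are illustrative assumptions.
emissions = estimate_compute_emissions(
    vcpu_hours=10_000,
    avg_utilization=0.5,
    min_watts=1.0,
    max_watts=4.0,
    pue=1.2,
    grid_kg_co2e_per_kwh=0.4,
)
print(f"Estimated emissions: {emissions:.1f} kg CO2e")
```

Because the grid emissions factor differs by region, the same workload can have a noticeably different footprint depending on where it runs.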

Remediate

While reporting and recommendation can be done fairly easily with the tools mentioned above, building a proper remediation approach requires you to think of this problem in a platform-centric manner. The FinOps platform capabilities you need to build can be categorized into five classes:

  1. Core capabilities such as resource tagging, right-sizing, and resource mapping.
  2. Metrics-related capabilities that can handle usage, commitment, utilization, and sustainability.
  3. Alerting and notifications around anomalies, custom reports, forecasts, and budgets.
  4. Policies-related capabilities that can handle commitments, data retention, and various financial constraints.
  5. Automated governance capabilities around compliance that can be built in pipelines.

All of these capabilities require the backbone of an existing or customized observability platform and tooling.
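As one example of the alerting class, the sketch below flags an anomalous day of spend and projects whether the month will end over budget. It uses a simple z-score test and a linear run-rate forecast. Both are deliberately basic assumptions; a production platform would more likely rely on the anomaly detection in your observability tooling. The function names, the threshold, and the data are hypothetical.

```python
from statistics import mean, stdev


def detect_anomaly(history, today, threshold=3.0):
    """Flag today's spend when it is more than `threshold` standard
    deviations above the historical daily mean."""
    if len(history) < 2:
        return False  # not enough data to judge
    mu, sigma = mean(history), stdev(history)
    if sigma == 0:
        return today > mu
    return (today - mu) / sigma > threshold


def forecast_over_budget(month_to_date, day_of_month, days_in_month, budget):
    """Linear run-rate forecast: project month-end spend from the daily average so far."""
    projected = month_to_date / day_of_month * days_in_month
    return projected > budget, projected


# Illustrative daily spend in dollars, not real data.
history = [100.0, 104.0, 98.0, 101.0, 97.0, 103.0, 99.0]
print("anomaly today:", detect_anomaly(history, 140.0))

over, projected = forecast_over_budget(6000.0, 10, 30, 15000.0)
print(f"projected month-end spend: ${projected:,.0f}, over budget: {over}")
```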

Figure 9: A Notional View of a FinOps Remediation Platform

Retain

The fourth step addresses the age-old question of how to sustain the wins you realize in the third step of the four Rs process. Here’s the approach I typically use to solve this problem.

1. Understanding the audience

Understanding different personas and conducting appropriate outreach is key to retaining learnings and wins. Even within the development community, you want to be intentional about various roles and responsibilities. An old-style RACI goes a long way toward making sure that all stakeholders in the development lifecycle, including developers, testers, platform engineers, the finance team, and product owners, understand their roles.

2. Identifying and delivering the appropriate training

I recommend having clear learning paths: series of outcome-based training sessions that make sense when stitched together in a particular way. These learning paths must be based on the types of cloud consumption. Here are a couple of examples of learning paths:

  • Cloud resource usage for application developers
  • Cloud optimization for scalability for SREs

3. Extrinsic incentives for the developers

Humans respond to the right incentives. On the flip side, the wrong incentives can badly hurt your outcomes. Where possible, you should use data to create incentives that help your engineers. There are several publicly available maturity assessment models to understand your organization’s current capabilities and how to improve them. Using these models will identify specific areas for your engineers to focus their efforts.

4. A platform product operating model

Perhaps one of the most overlooked but impactful activities is to build remediation capabilities with a product mindset. The product mindset ensures there is a long-term vision and evolution of product thinking as more new core capabilities are built.

Figure 10: Summary of Organizational Aggregation Activities

Tying FinOps into your DevOps lifecycle

Now that you understand how to manage the four Rs in this context, we can look at how various FinOps activities affect the DevOps lifecycle of your product development (SDLC).

As illustrated below, each step of your development lifecycle has intentional activities you can incorporate as part of your path to production. This ensures that you fully integrate Continuous FinOps into your way of working instead of treating it as an afterthought.

Figure 11: FinOps Mapped to your SDLC

Specifically, you’re incorporating FinOps actions into each DevOps step, as mapped below.

Steps | Platform Approach | DevOps Action | FinOps Action
Planning for your capability | Identify the requirements | Identify abstracted out reusable capability | Capabilities built should be aware of the cloud vendor resources
Defining the capability | Technical Product Management | Simulate value of the abstracted out reusable capability to help prioritize | Integrate the report of the cloud resource usage to your observability platform
Designing and implementing | A self-serve, API-driven, low friction approach to having the capabilities | Test and provide feedback through an automated infrastructure availability | Right size and tag the resources appropriately with clear mapping from application layer to the infra layer
Delivery of the capabilities | Automated approach to using the capabilities and light-weight governance that goes along with it | Use of self-healing to address more problems automatically increasing adoption with reduced cognitive load | A feedback cycle that continues to make the operational processes and adoption better by focusing on the areas of cost bleed

The biggest benefits of tying FinOps into your DevOps lifecycle are the standardization you provide your developers and the governance that ensures compliance at each point of change in the lifecycle. This helps improve the predictability of the cost of planning, conceptualizing, and developing your product.
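Governance at each point of change can be as lightweight as a check that runs in your delivery pipeline before infrastructure changes are applied. The sketch below is one possible shape for such a gate. It verifies that every planned resource maps back to an application and owner through tags, and that the estimated cost of the change stays within a budget. The plan format, tag names, and `check_change` function are hypothetical; a real gate would read the output of your infrastructure tooling and a cost estimator.

```python
# Hypothetical tags that map each infra resource back to the application layer.
REQUIRED_TAGS = ("application", "owner", "cost-center")


def check_change(planned_resources, monthly_budget):
    """Return a list of policy violations for a planned infrastructure change."""
    violations = []
    total = 0.0
    for res in planned_resources:
        total += res.get("estimated_monthly_cost", 0.0)
        missing = [t for t in REQUIRED_TAGS if t not in res.get("tags", {})]
        if missing:
            violations.append(f"{res['name']}: missing tags {missing}")
    if total > monthly_budget:
        violations.append(f"estimated cost ${total:.2f} exceeds budget ${monthly_budget:.2f}")
    return violations


# Illustrative plan, not real data.
plan = [
    {"name": "api-db", "estimated_monthly_cost": 300.0,
     "tags": {"application": "checkout", "owner": "team-a", "cost-center": "1234"}},
    {"name": "cache", "estimated_monthly_cost": 250.0,
     "tags": {"application": "checkout", "owner": "team-a"}},
]
violations = check_change(plan, monthly_budget=500.0)
for violation in violations:
    print("POLICY VIOLATION:", violation)
```

In a pipeline, a non-empty list of violations would fail the stage, so developers get the feedback at the point of change instead of on next month’s bill.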

Conclusion

Taking a platform-centric approach to integrating FinOps and DevOps is the only way to successfully implement best practices around improving your engineering organization’s efficiency, agility, and productivity. This also improves customer experience, as you’ll be able to provide higher quality products faster and at better price points.           

Once you build your FinOps remediation platform in a product-centric manner based on the upstream dependencies of reporting and recommendation tools, you’ll need to ensure the wins are retained. This will remove the need for you to come back and address your FinOps problems periodically. Instead, it becomes your way of life, and therein lies your success in managing cloud costs through your engineering practices.
