The cognitive load crisis: Why tool proliferation is killing velocity
Your platform engineers are arguing about tools again. One wants to lean into Pulumi because "real languages and tests make infra sane," while another is doubling down on Terraform modules. Your org just finished migrating from Terraform Cloud into a new Terraform setup.
In theory, both are right: Pulumi offers expressiveness and better unit-testing, Terraform gives you a mature ecosystem and a standardized state model. In practice, teams are stuck in context-switching between patterns, re-learning IaC idioms per project, and opening tickets just to understand which path they're "supposed" to use.
The real problem isn't Pulumi versus Terraform. The real problem is that nobody is treating your internal platform and its golden path for infrastructure as a product designed to reduce cognitive load for developers.
This pattern repeats across every capability in your stack. Most organizations (Netflix, Amazon, Google, Microsoft, Adobe, Spotify, Airbnb) already own mature DevSecOps toolchains: GitHub, Jenkins, Terraform, Kubernetes, a constellation of observability and security tools. Yet developers still wait on Jira tickets, duplicate YAML, and fight configuration drift because the experience is fragmented.
When developers spend more brainpower juggling tools than solving problems, cognitive load quietly erodes velocity and quality. This article shows how a platform-as-a-product mindset, backed by executive sponsorship and the double diamond framework, turns your internal platform into a strategic engine for delivery, not just another cost center.
Now add AI to this equation. With the explosion of code copilots, test generators, incident assistants, and autonomous agents, platform teams face a critical question:
Are you building AI to help developers, or building platforms to run AI workloads?
Both need the base internal developer platform (IDP), but they require fundamentally different product investments, governance models, and success metrics. This article provides a practical framework for platform PMs to navigate this split.
The double diamond framework for platform product management
The double diamond structures platform PM work into two phases: Discover & define (owning the problem space), and Develop & deliver (owning the solution space).
Discover & define: Owning the problem space
The first diamond starts with divergence: understanding the real problems developers face across the organization. Then it converges on a clear strategy and roadmap.
Customer research & internal landscape: The platform PM maps current SDLC journeys across teams: how code moves from idea to production, and where delays, handoffs, and incidents concentrate. This relies on real research methods:
- Developer interviews and shadowing: Observing how teams provision infrastructure, debug pipelines, and navigate incidents.
- Incident postmortems: Mining patterns in outages, security events, and config drift.
- Ticket and dashboard analysis: Identifying repetitive support requests and bottlenecks in lead time or deployment frequency.
The goal is to surface friction that developers have normalized, the "we always do it this way" tax that quietly erodes velocity and quality.
Adjacent teams discovery: To avoid building a platform that fights the rest of the organization, the PM brings in security, FinOps, SRE, compliance, and architecture early. This is where guardrails for cost, reliability, and risk are defined upfront, turning policies into platform requirements rather than after-the-fact gates. For example:
- Security: "All CI/CD pipelines must scan for secrets and vulnerabilities before merge."
- FinOps: "Default resource quotas and auto-shutdown for non-prod environments."
- SRE: "All production services must expose health checks and structured logs."
These constraints become features of the golden path, not obstacles developers work around.
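The guardrails above can be read as checks that the golden path runs automatically. The following is a minimal sketch, assuming a hypothetical pipeline description expressed as a plain dictionary; the keys (`steps`, `environment`, `auto_shutdown`, `health_check`) and step names are invented for illustration and do not correspond to any specific CI system.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical pipeline description; all field names are assumptions.
Pipeline = dict


@dataclass(frozen=True)
class Guardrail:
    owner: str                          # team that defined the policy
    description: str                    # human-readable requirement
    check: Callable[[Pipeline], bool]   # returns True when the pipeline complies


GUARDRAILS = [
    Guardrail(
        "Security",
        "Scan for secrets and vulnerabilities before merge",
        lambda p: {"secret-scan", "vulnerability-scan"} <= set(p.get("steps", [])),
    ),
    Guardrail(
        "FinOps",
        "Auto-shutdown enabled for non-prod environments",
        lambda p: p.get("environment") == "prod" or p.get("auto_shutdown", False),
    ),
    Guardrail(
        "SRE",
        "Production services expose health checks",
        lambda p: p.get("environment") != "prod" or p.get("health_check", False),
    ),
]


def violations(pipeline: Pipeline) -> list[str]:
    """Return one message per guardrail the pipeline does not satisfy."""
    return [f"{g.owner}: {g.description}" for g in GUARDRAILS if not g.check(pipeline)]


example = {"steps": ["build", "secret-scan"], "environment": "staging", "auto_shutdown": True}
print(violations(example))
```

Here the staging pipeline passes the FinOps and SRE checks but is flagged by Security because it lacks a vulnerability scan. Keeping each guardrail tagged with its owning team makes it clear who to talk to when a check fails.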
Platform product strategy (vision, mission, north star): Using these insights, the PM defines a platform vision, for example, "A self-service golden path from commit to production for all customer-facing services," backed by a mission, principles, and one or two north-star metrics. Common north-star metrics:
- Lead time for changes (DORA metric): Time from commit to production.
- Developer satisfaction (DSAT/NPS): Quarterly survey of platform users.
- Mean time to provision: Time from "I need a new service" to "I can deploy code."
This narrative is what secures C-level sponsorship and keeps the platform focused on outcomes, not tool adoption counts.
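To make the lead-time metric concrete, here is a small sketch that computes the median time from commit to production. The timestamps are made-up sample data; in practice they would come from your SCM and deployment systems, and the choice of median over mean is an assumption of this example.

```python
from datetime import datetime
from statistics import median

# Hypothetical (commit_time, deployed_time) pairs for three changes.
changes = [
    (datetime(2026, 1, 5, 9, 0), datetime(2026, 1, 5, 15, 0)),   # 6 hours
    (datetime(2026, 1, 6, 10, 0), datetime(2026, 1, 7, 10, 0)),  # 24 hours
    (datetime(2026, 1, 8, 8, 0), datetime(2026, 1, 8, 20, 0)),   # 12 hours
]


def lead_time_hours(commit: datetime, deployed: datetime) -> float:
    """Hours elapsed between a commit and its production deployment."""
    return (deployed - commit).total_seconds() / 3600


def median_lead_time_hours(changes) -> float:
    """Median lead time across all changes, in hours."""
    return median(lead_time_hours(c, d) for c, d in changes)


print(median_lead_time_hours(changes))  # 12.0
```

A median is less sensitive than a mean to the occasional change that sits unreleased for weeks, which is why it is used in this sketch.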
Product roadmap (capability stack and operating model): The PM translates strategy into a phased roadmap across key dimensions:
- Infrastructure abstractions: Self-service templates for databases, caches, queues, and object storage.
- Security and compliance defaults: Pre-configured pipelines with scanning, secrets management, and policy checks.
- Observability patterns: Standardized logs, metrics, traces, and dashboards across all services.
- Supported languages and runtimes: Go, Python, Java, and Node.js with vetted libraries and frameworks.
- Workload types: Web apps, APIs, batch ETL, IoT, ML training and inference, and compliance versus non-compliance workloads.
- RACI between platform and app teams: Who owns what, from provisioning to incident response.
Each increment has explicit success criteria, such as "Reduce time to provision a new service from weeks to hours" or "Standardize logs and tracing for 80% of services."
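A success criterion like "Standardize logs and tracing for 80% of services" is only useful if it can be checked. The sketch below assumes a hypothetical service inventory (service names and capability flags are invented) and computes the share of services that meet every listed capability.

```python
# Hypothetical service inventory; names and flags are illustrative only.
services = {
    "checkout": {"structured_logs": True, "tracing": True},
    "search": {"structured_logs": True, "tracing": True},
    "billing": {"structured_logs": True, "tracing": False},
    "profile": {"structured_logs": True, "tracing": True},
}


def coverage(services: dict, capabilities: list[str]) -> float:
    """Fraction of services that have every capability in the list."""
    compliant = sum(
        all(flags.get(cap, False) for cap in capabilities)
        for flags in services.values()
    )
    return compliant / len(services)


def criterion_met(services: dict, capabilities: list[str], target: float) -> bool:
    """True when coverage reaches the target share (for example 0.80)."""
    return coverage(services, capabilities) >= target


print(coverage(services, ["structured_logs", "tracing"]))             # 0.75
print(criterion_met(services, ["structured_logs", "tracing"], 0.80))  # False
```

With three of four services compliant, the increment sits at 75% and has not yet met its 80% target.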
Platform adoption, customer feedback, and KPIs: To close the first diamond, the PM advocates for setting up:
- An advisory group: Representative teams across product lines who provide ongoing feedback.
- A measurement framework: A blend of DORA metrics (lead time, deployment frequency, MTTR, change fail rate), platform reliability (SLA, incident count), cost efficiency (spend per workload), and developer satisfaction.
This turns platform usage into a continuous feedback loop rather than a one-time launch event.
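As an illustration of the DORA part of that measurement framework, the sketch below derives deployment frequency, change fail rate, and MTTR from a list of deployment records. The `Deployment` record shape and the sample history are assumptions for this example, not a standard schema.

```python
from dataclasses import dataclass


@dataclass
class Deployment:
    failed: bool                   # did this change cause a production failure?
    restore_minutes: float = 0.0   # time to restore service, if it failed


def dora_summary(deployments: list[Deployment], days: int) -> dict:
    """Summarize three DORA metrics over an observation window of `days`."""
    total = len(deployments)
    failures = [d for d in deployments if d.failed]
    return {
        "deployment_frequency_per_day": total / days,
        "change_fail_rate": len(failures) / total if total else 0.0,
        "mttr_minutes": (
            sum(d.restore_minutes for d in failures) / len(failures) if failures else 0.0
        ),
    }


# Hypothetical history: 10 deployments in 5 days, 2 of which failed.
history = [Deployment(False), Deployment(True, 30), Deployment(False), Deployment(True, 90)]
history += [Deployment(False)] * 6
summary = dora_summary(history, days=5)
print(summary)
```

For this sample the summary reports two deployments per day, a 20% change fail rate, and a 60-minute MTTR. Reliability, cost, and satisfaction signals would be tracked alongside these rather than folded into a single score.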
Develop & deliver: Owning the solution space
The second diamond is about exploring solution options, then converging on a scalable, adopted platform that developers actually use.
Quarterly planning with engineering leaders: The platform PM works with engineering managers and enterprise architects to turn roadmap themes into quarter-sized projects that ship meaningful slices of business value. Examples:
- A standard "new service" template with CI/CD, observability, and security baked in.
- A secure pipeline for Python services with automated testing and deployment.
- Migration tooling to move legacy apps from manual deployments to the platform.
Solution design: Each slice is designed as a golden path: an opinionated, end-to-end workflow that makes the right thing the easy thing, rather than just a library or script.
Migrations, champions, and pilots: Early adopters are hand-picked: teams with high change rates, painful legacy workflows, or strong appetite for improvement become pilot customers. The PM uses these pilots to:
- Harden templates and validate security, FinOps, and reliability policies.
- Build reference stories and case studies that make later migrations easier to sell.
- Identify gaps in documentation, support, and self-service tooling.
Champions from pilot teams become evangelists who help onboard the next wave.
Beta launch and portfolio mix: Once patterns are validated, the platform rolls out via the Internal Developer Portal with:
- Clear documentation: Runbooks, tutorials, and FAQs.
- SLAs and support channels: Response times for questions and incidents.
- Usage analytics: Tracking adoption, time-to-provision, and support ticket volume.
Platform capacity is deliberately split to balance innovation and operations:
- 50% on new capabilities, optimizations, and innovations: New workload types, language support, tooling integrations
- 20% on reliability and operational excellence: Incident reduction, platform uptime, performance tuning
- 30% on migrating legacy workloads: Pulling more of the estate onto standardized rails
This mix ensures the platform keeps improving while expanding its footprint.
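The split above translates directly into planning arithmetic. This is a minimal sketch, assuming capacity is measured in engineer-weeks and using a hypothetical team of 10 engineers over a 12-week quarter.

```python
# Capacity shares from the portfolio mix, as whole percentages.
CAPACITY_SPLIT = {
    "new_capabilities": 50,
    "reliability": 20,
    "legacy_migration": 30,
}


def allocate(engineer_weeks: float) -> dict:
    """Split total engineer-weeks across the three investment areas."""
    if sum(CAPACITY_SPLIT.values()) != 100:
        raise ValueError("capacity shares must add up to 100%")
    return {area: engineer_weeks * pct / 100 for area, pct in CAPACITY_SPLIT.items()}


# Hypothetical team: 10 engineers for a 12-week quarter = 120 engineer-weeks.
plan = allocate(10 * 12)
print(plan)  # {'new_capabilities': 60.0, 'reliability': 24.0, 'legacy_migration': 36.0}
```

Making the split explicit in planning numbers helps when a quarter's commitments drift, for example when migration work silently eats into the reliability budget.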
Continuous improvement as a product loop: Post-launch, the platform PM feeds usage analytics, developer feedback, incident data, and cost reports back into the roadmap. The PM:
- Trims underused features to reduce maintenance burden.
- Invests in friction hotspots revealed by support tickets and surveys.
- Evolves golden paths in lockstep with product teams' needs.
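One way to turn that loop into a recurring review is a simple triage over usage and support data. The sketch below is illustrative only: the feature names, the numbers, and both thresholds are assumptions that a real team would tune to its own context.

```python
# Hypothetical usage analytics per platform feature.
features = {
    "service-template": {"monthly_users": 120, "support_tickets": 4},
    "legacy-vm-provisioner": {"monthly_users": 3, "support_tickets": 1},
    "python-pipeline": {"monthly_users": 80, "support_tickets": 35},
}


def triage(features: dict, min_users: int = 10, max_tickets_per_user: float = 0.25) -> dict:
    """Sort features into candidates to trim and friction hotspots to invest in."""
    trim, invest = [], []
    for name, stats in features.items():
        users = stats["monthly_users"]
        if users < min_users:
            trim.append(name)      # underused: maintenance may outweigh value
        elif stats["support_tickets"] / users > max_tickets_per_user:
            invest.append(name)    # well used but generating friction
    return {"trim": trim, "invest": invest}


print(triage(features))
# {'trim': ['legacy-vm-provisioner'], 'invest': ['python-pipeline']}
```

The output is a starting point for a conversation with the advisory group, not an automatic decision: a rarely used feature may still be critical for one regulated workload.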
The platform becomes a living product, not a static infrastructure layer.
The AI inflection point: Two fundamentally different stacks
AI is now reshaping every stage of the SDLC, from planning to coding, testing, deployment, and operations. But there are two distinct ways AI enters the platform, each requiring its own capability stack and product strategy:
- AI-Enabled SDLC: AI agents help developers build and operate "normal" software (web apps, APIs, batch jobs)
- AI-Native Platform: AI agents and LLM apps are the product being built and deployed
Let's break down what the platform PM needs to curate for each.

Stack 1: AI-enabled SDLC (AI helps build software)
Use case: Your platform still supports conventional services (microservices, web apps, data pipelines), but AI copilots sit inside the SDLC to boost developer productivity.
Core foundation: Your existing IDP
The base IDP remains unchanged: SCM, CI/CD, IaC, Kubernetes, observability, security, and the internal developer portal. These are your foundation. AI doesn't replace it; AI augments it.
AI-Assisted SDLC capabilities you curate on top

AI in discover & define
AI can also amplify the platform PM's research. It can mine code, tickets, and incident reports to surface hidden patterns in technical debt and operational hotspots, and cluster developer feedback and portal usage data to reveal where golden paths are working and where developers fall back to bespoke pipelines. This accelerates discovery but doesn't replace human judgment.
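To show the shape of that analysis, the sketch below groups developer feedback by theme. It uses plain keyword matching as a deliberately simple stand-in for the clustering an AI model would perform; the theme names, keywords, and feedback comments are all invented for illustration.

```python
import re
from collections import Counter

# Hypothetical themes and keywords; a real system would learn groupings from data.
THEMES = {
    "provisioning": {"provision", "provisioning", "terraform", "database"},
    "pipelines": {"pipeline", "pipelines", "build", "deploy"},
    "observability": {"logs", "dashboard", "tracing", "alerts"},
}


def tag_themes(comment: str) -> list[str]:
    """Return the sorted themes whose keywords appear as whole words in the comment."""
    words = set(re.findall(r"[a-z]+", comment.lower()))
    return sorted(theme for theme, keywords in THEMES.items() if words & keywords)


feedback = [
    "Provisioning a database still needs a ticket",
    "Our pipeline build takes 40 minutes",
    "I copy the old pipeline because the template lacks tracing",
    "Terraform modules differ per project",
]

theme_counts = Counter(theme for comment in feedback for theme in tag_themes(comment))
print(theme_counts.most_common())
```

The counts point the PM toward where to look first; reading the underlying comments and talking to the teams behind them remains a human task.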
Human-in-the-loop, not automation-only
Thoughtful platform product leaders keep humans in the loop: developers review AI-generated changes, architects approve new patterns, and PMs use AI insights as inputs, not decisions. This preserves trust while unlocking speed and scale.
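A human-in-the-loop rule can be expressed as a merge gate. This is a minimal sketch under stated assumptions: the `ProposedChange` record, the `ai_generated` flag, and the one-approval default are hypothetical and not tied to any particular SCM or review tool.

```python
from dataclasses import dataclass, field


@dataclass
class ProposedChange:
    title: str
    ai_generated: bool
    human_approvers: set[str] = field(default_factory=set)


def ready_to_merge(change: ProposedChange, required_approvals: int = 1) -> bool:
    """AI-generated changes need at least `required_approvals` human reviewers."""
    if not change.ai_generated:
        # Assumption: human-authored changes follow the normal review flow,
        # which is outside the scope of this sketch.
        return True
    return len(change.human_approvers) >= required_approvals


change = ProposedChange("Bump base image", ai_generated=True)
print(ready_to_merge(change))   # False: no human has reviewed it yet

change.human_approvers.add("alice")
print(ready_to_merge(change))   # True: one human approval recorded
```

Recording who approved, rather than just a count, keeps an audit trail that supports the trust this section describes.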
Positioning: "Platforms for developers, with agents inside the SDLC."
Stack 2: AI-native platform (AI is the software)
Use case: The product itself is agentic AI: customer-facing or internal agents that plan, call tools, and act in production (AI-powered support bots, autonomous data analysts, code generation services, AI SDRs).
This requires a fundamentally different capability stack.

Positioning: "Platforms for agentic AI, where the agents are the primary workloads."
The platform PM's decision framework
Start with the use case:
Both stacks share the base IDP
Your investment in SCM, CI/CD, IaC, observability, security, and the developer portal doesn't change. AI stacks extend, not replace, the foundation.
Why the platform PM role is now critical
As organizations push for faster delivery, lower cost, and responsible AI adoption, a dedicated platform product manager becomes the keystone role. A platform PM:
- Treats the platform as a product with customers (developers), outcomes (velocity, quality, cost), and a roadmap, not as a cost center.
- Uses the double diamond to systematically move from fuzzy developer pain to adopted, measurable solutions.
- Curates two distinct AI capability stacks with clarity about which problems each solves.
- Integrates AI into the SDLC in a way that is safe, observable, and clearly tied to developer productivity and business impact.
- Keeps humans in the loop: developers review AI-generated changes, architects approve new patterns, and PMs use AI insights as inputs, not decisions.
For engineering leaders, this is the leverage point: empower a platform PM with clear ownership, metrics, and executive mandate, and the platform stops being "the place where tools live." It becomes the system that lets every developer focus on what matters most, shipping resilient, high-quality products that move the business forward.
Conclusion: Platforms are products, not projects
The best internal developer platforms are not built by accumulating tools. They are designed, measured, and evolved as products that reduce cognitive load, accelerate delivery, and make the right thing the easy thing.
In 2026, with AI reshaping the SDLC and agentic workloads becoming first-class products, the platform PM is no longer a nice-to-have. It's the role that determines whether your organization's platform becomes a strategic engine for innovation or just another layer developers work around.
The question is not whether to treat your platform as a product. It's whether you can afford not to.









