The conversation around AI in software development has reached a critical inflection point. While conferences buzz with promises of 10x code generation, enterprises in regulated industries face a fundamentally different challenge: how to safely operationalize autonomous AI agents at scale while maintaining compliance and governance standards.
Main insights
- Code generation is not the problem - governance, safety, and ROI measurement are the real challenges when operationalizing AI agents
- Regulated industries need infrastructure-first governance with workspace-level controls, privilege separation, and centralized configuration
- A unified LLM gateway and separation between control plane and execution layer are essential architectural patterns for safe agent deployment
- Productivity gains extend beyond software engineers to business analysts and non-technical users building internal applications
Sam Barlien, head of ecosystem for the Platform Engineering Community, led this discussion with Eric Paulsen, Field CTO for Coder in the EMEA region. Eric brings five years of experience helping Global 2000 companies accelerate AI adoption through secure, scalable developer platforms.
You can watch the full discussion here if you missed it: Operationalizing AI Agents in Regulated Industries
The real AI challenge: Governance, not generation
After attending major conferences including NVIDIA GTC, RSA, SREcon, and QCon London, Sam observed a consistent pattern: "Every single one of those conferences, almost every booth, every talk, every conversation was around AI... but at a deeper level, AI agents. Conversations around how do we make the most of them, how do we get the most code from them."
Yet when speaking with enterprises and practitioners, a different reality emerges. "Code generation is not the problem," Sam emphasized. "When we talk about operationalizing an AI agent, getting an AI agent to deliver value, to deliver ROI or productivity, it's never about how much code can we generate."
The challenge becomes clear when you consider the fundamental difference between human developers and AI agents. Human developers have agency - they can make judgment calls about whether an action is appropriate. AI agents lack this capability. As Sam illustrated with a stark example: "If you ask it, 'Hey, this service is popping up with errors. Make sure it doesn't pop up with errors anymore,' and it thinks, 'Okay, what's the best way to do that? Just delete the service.' Problem solved."
This lack of contextual judgment creates significant risks in production environments, especially for regulated industries where compliance violations can result in substantial penalties.
Understanding the levels of agentic development
To frame the discussion, the speakers introduced a maturity model for agentic development based on the level of human involvement:
- Level 1 - Human in the loop: AI helps type but humans confirm and execute every action
- Level 2 - Human on the loop: Agents generate pull requests for human approval
- Level 3 - Human as orchestrator: Agents execute multi-step workflows and deploy lower-risk changes, orchestrated by humans
- Level 4 - Human outside the loop: Systems of agents initiate and promote changes autonomously
This framework helps teams understand where they are and what governance requirements change as they move up the maturity ladder. Most organizations today operate at Level 1 or 2, with Level 3 representing the sweet spot for regulated industries seeking to balance automation with control.
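As a rough illustration (my own hypothetical encoding, not something presented in the talk), the four levels can be expressed as an approval policy that decides which agent actions still require human sign-off; the action categories here ("edit", "merge", "low_risk_deploy", "high_risk_deploy") are assumptions made for the sketch:

```python
from enum import IntEnum

class AgentMaturity(IntEnum):
    """Levels of agentic development, by degree of human involvement."""
    HUMAN_IN_THE_LOOP = 1    # human confirms and executes every action
    HUMAN_ON_THE_LOOP = 2    # agent generates PRs; human approves them
    HUMAN_ORCHESTRATOR = 3   # agent runs multi-step workflows, deploys low-risk changes
    HUMAN_OUTSIDE_LOOP = 4   # systems of agents act autonomously

def requires_human_approval(level: AgentMaturity, action: str) -> bool:
    """Return True if a human must sign off before `action` executes."""
    if level == AgentMaturity.HUMAN_IN_THE_LOOP:
        return True                                   # everything is confirmed
    if level == AgentMaturity.HUMAN_ON_THE_LOOP:
        return action in ("merge", "low_risk_deploy", "high_risk_deploy")
    if level == AgentMaturity.HUMAN_ORCHESTRATOR:
        return action == "high_risk_deploy"           # only higher-risk changes gated
    return False                                      # fully autonomous
```

A policy like this makes the governance delta between levels explicit: moving from Level 2 to Level 3 is precisely the decision to stop gating merges and lower-risk deploys.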
Why vendor-level controls are not enough
When asked why built-in controls from tools like GitHub Copilot or Cursor are insufficient for regulated environments, Eric explained the fundamental limitation: "Claude Code and Cursor and Codex are just process-level pieces of software. You take that binary, you install it inside of an environment and you run it in your terminal or call the API to execute the function."
The real governance challenge requires a platform-level perspective. "What you really need to think about is full control over the software supply chain in terms of the tooling dependencies and installations that are required to go into some of these projects," Eric noted. This includes:
- Container runtime controls and security policies
- Network isolation and traffic monitoring
- Data center and region placement for GDPR compliance
- Vulnerability remediation processes and patch management
- Centralized secret management and credential rotation
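The controls above can be treated as declarative policy rather than per-tool settings. A minimal sketch of what a workspace-level policy object might look like (all field names and the EU-residency check are illustrative assumptions, not a real product schema):

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class WorkspacePolicy:
    """Hypothetical platform-level policy applied to every agent workspace."""
    base_image: str                    # pinned, vulnerability-scanned container image
    allowed_egress: tuple = ()         # allow-listed hosts (network isolation)
    region: str = "eu-west-1"          # data residency, e.g. for GDPR
    secrets_backend: str = "vault"     # credentials injected at runtime, never baked in

def validate(policy: WorkspacePolicy) -> list:
    """Return a list of policy violations (empty list means compliant)."""
    problems = []
    if not policy.base_image:
        problems.append("no base image pinned")
    if "*" in policy.allowed_egress:
        problems.append("wildcard egress defeats network isolation")
    if not policy.region.startswith("eu"):
        problems.append("workspace placed outside EU region")
    return problems
```

The point of the sketch is that these controls live in the platform, outside any individual AI tool, so they apply identically whether the workspace runs Claude Code, Cursor, or Codex.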
Eric shared a telling anecdote: "I've hopped on a phone with a customer that basically said, 'Hey, if you're self-hosted, we don't even need to talk about security on this call because we already know the benefits and the governance implications of self-hosting a platform.'"
This highlights a critical gap: while SaaS AI tools offer convenience, they often cannot meet the stringent requirements of regulated industries that need complete control over data residency, access patterns, and audit trails.
Essential architectural patterns for safe agent deployment
The discussion revealed two critical architectural components for operationalizing AI agents safely:
Unified LLM gateway
Rather than an organization distributing hundreds or thousands of API keys, a unified LLM gateway provides:
- Centralized authentication to multiple LLM providers (Azure OpenAI, AWS Bedrock, Google Vertex AI)
- Comprehensive observability and audit logging of all prompts and responses
- Token spend tracking and cost control with budget enforcement
- Reduced attack surface through consolidated access points
- Policy enforcement for content filtering and compliance
Players in this space include LiteLLM, Kong, and Coder's own solution, among others. The gateway acts as a critical control point for monitoring AI usage patterns and ensuring compliance with organizational policies.
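To make the pattern concrete, here is a deliberately simplified, single-process sketch of what a gateway does: one entry point in front of multiple providers, an audit log of every prompt and response, and token-based budget enforcement. This is illustrative only; real gateways such as LiteLLM run as a proxy service, and the flat per-token price and provider callables below are assumptions of the sketch:

```python
import time
from dataclasses import dataclass

@dataclass
class GatewayConfig:
    monthly_budget_usd: float = 100.0
    price_per_1k_tokens: float = 0.002   # illustrative flat rate

class LLMGateway:
    """Toy in-process gateway: one access point, audit trail, budget check."""
    def __init__(self, config: GatewayConfig, providers: dict):
        self.config = config
        self.providers = providers       # name -> callable(prompt) -> (text, tokens)
        self.audit_log = []
        self.tokens_used = 0

    def spend_usd(self) -> float:
        return self.tokens_used / 1000 * self.config.price_per_1k_tokens

    def complete(self, provider: str, user: str, prompt: str) -> str:
        if self.spend_usd() >= self.config.monthly_budget_usd:
            raise RuntimeError("budget exceeded")          # cost control
        text, tokens = self.providers[provider](prompt)    # route to provider
        self.tokens_used += tokens
        self.audit_log.append({                            # observability
            "ts": time.time(), "user": user, "provider": provider,
            "prompt": prompt, "response": text, "tokens": tokens,
        })
        return text
```

Because every request flows through `complete`, the gateway is also the natural place to attach content filtering and per-team policy checks before a prompt ever reaches a provider.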
Separation of control plane and execution layer
Eric described an emerging architectural pattern that addresses the limitations of simple process-level isolation: "You centralize the agent loop itself in a governed control plane and then what you do is you use the workspace as the execution layer."
In this model:
- The control plane handles all communication to LLMs and maintains conversation context
- The control plane manages all API keys and secrets centrally with proper rotation
- Workspaces perform file reads and writes but have no direct LLM access
- Workspaces contain no API keys or sensitive credentials
- All agent actions are logged and auditable through the control plane
"We think about solutions that came to market over the last 10 years like HashiCorp Vault where you centralized all your secrets in a trusted entity," Eric explained. "We're starting to see that architectural formation come to play with centralizing the agent inner loop, the skills that are trusted and blessed within the enterprise, and then of course the secrets that it has access to, but then separating the agent binary itself from the execution layer."
This separation provides multiple benefits: improved security posture, centralized monitoring, and the ability to update agent capabilities without touching individual workspaces.
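The separation can be sketched in a few lines: the control plane owns the key and the LLM call, while the workspace exposes only file I/O. Everything here is a hypothetical illustration of the pattern (the `llm` callable stands in for a real client), not any vendor's implementation:

```python
import pathlib

class Workspace:
    """Execution layer: file reads and writes only; no LLM access, no secrets."""
    def __init__(self, root: str):
        self.root = pathlib.Path(root)

    def read(self, rel: str) -> str:
        return (self.root / rel).read_text()

    def write(self, rel: str, content: str) -> None:
        path = self.root / rel
        path.parent.mkdir(parents=True, exist_ok=True)
        path.write_text(content)

class ControlPlane:
    """Governed agent loop: owns the API key, the LLM calls, and the audit trail."""
    def __init__(self, api_key: str, llm):
        self._api_key = api_key   # never leaves the control plane
        self._llm = llm           # callable(prompt) -> text; stand-in for a real client
        self.audit = []

    def run_task(self, workspace: Workspace, task: str, target: str) -> None:
        source = workspace.read(target)                       # workspace does the I/O
        new_content = self._llm(f"{task}\n---\n{source}")     # control plane talks to the LLM
        workspace.write(target, new_content)
        self.audit.append({"task": task, "file": target})     # centralized auditability
```

Note that `Workspace` has no reference to the key or the model at all: compromising a workspace yields file access within that workspace, nothing more.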
Governance as competitive advantage
For regulated industries, governance is not just a compliance checkbox - it's a competitive differentiator. Eric noted that financial services companies "are competing on speed and how fast they can move within that governance framework."
The traditional tension between software-as-a-service speed and self-hosted control is being resolved through modern platform engineering approaches. "Once it is in place, you can really move fast and keep that governance layer fully in control," Eric explained.
This shift is particularly important given the rise of "shadow AI" - teams adopting AI tools without official approval because blanket bans are unrealistic. Research from the Platform Engineering Community found that roughly 15% of teams had been given blanket prohibitions on AI use because security teams were not ready to govern it properly.
Organizations that can provide governed AI capabilities gain a significant advantage in attracting and retaining talent while maintaining compliance standards.
Expanding ROI beyond software engineers
One of the most interesting insights came from a case study of a large German automotive manufacturer building an "AI innovation hub" on top of an orchestration compute layer with hosted environments.
"Yes, of course, this is for software engineers and data scientists to spin up environments and do their work as well as have agents execute on that work," Eric explained. "But they're also starting to see non-technical users, business analysts, and other people who want to automate day-to-day tasks in their roles or for their teams."
Eric compared the result to a Lovable-style experience: "you can go into lovable.dev, you spin up an application, it gives you a hosted URL and you can remix it or create it as an artifact and share it with other people."
The productivity gains extend far beyond traditional code generation metrics. However, Eric acknowledged a critical gap: "The big question that I've seen across my customers is how do we measure this - what are the metrics that we take to the organization and report back up to leadership as it relates to the return on investment of these AI systems."
This measurement challenge represents a significant opportunity for platform teams to establish frameworks that capture the broader impact of AI enablement across the organization.
The path forward: Start small, prove value, expand
Rather than attempting to "AI-ify everything" with a monolithic approach, the speakers recommended applying platform engineering principles to AI adoption:
- Identify specific, high-value use cases with clear success criteria
- Start small with controlled experiments in safe environments
- Prove value with measurable outcomes and user feedback
- Expand gradually based on learnings and organizational readiness
Sam cautioned against the temptation to immediately replace entire systems: "Don't immediately go and be like, 'We're going to replace all of our ServiceNow-related flows with some autonomous agent.' That's probably not going to work."
The key is creating safe environments for experimentation that allow teams to learn what works while maintaining the governance standards required by regulated industries. This approach builds organizational confidence while generating the learnings needed to scale responsibly.
If you enjoyed this, you can find more great insights and events from our Platform Engineering Community here.
For more comprehensive guidance, check out the Platform Engineering Certified Architect Course and learn best practices from industry experts.
Key takeaways
- Governance is the foundation, not an afterthought: For regulated industries, robust governance frameworks are table stakes for AI adoption. Without infrastructure-first controls including unified LLM gateways, centralized configuration, and separated control planes, teams cannot safely move beyond basic AI assistance to autonomous agents.
- Architecture matters more than vendor features: Built-in controls from AI tools are insufficient for enterprise needs. You need platform-level governance covering the entire software supply chain, container runtime, network isolation, and data residency. This requires treating AI agent deployment as a platform engineering challenge, not just a tool procurement decision.
- ROI extends beyond code generation: The real productivity gains come from enabling both technical and non-technical users to build internal applications and automate workflows. However, the industry still lacks mature frameworks for measuring this broader impact, making it critical to define success metrics early in your AI journey.
- Start with experimentation in safe environments: Rather than blanket bans or unrestricted access, create governed cloud development environments where teams can safely experiment with different agent capabilities and maturity levels. This approach builds organizational confidence while generating the learnings needed to scale responsibly.