AI is hitting modern organizations like a gold rush, triggering a mad dash where the winners aren't just those who show up, but those who can extract value in the fastest, smartest, and most reliable way. And as we’re witnessing what was formerly known as Data and Business Intelligence (BI) become the central nervous system of this new normal, we’re having to rethink the very foundation of it all: the AI-first Internal Developer Platform (IDP).
We’ve been at the epicenter of the cultural and technological shift that happened on the software layer: first, DevOps morphing into the platform engineer who treats the toolchain as a product. Then, the creation of specialized roles within platform engineering, such as security platform engineer, DevEx platform engineer, and infrastructure platform engineer. Now, that same revolution is radiating through the data landscape and finding its place in platform engineering. The birth of the data platform engineer (or DPE for short, as I like to call it) is in full swing.
Now, you’ve certainly heard of data engineers, and you’ve heard of software engineers for data platforms. So what is a data platform engineer? In short, the DPE is a specialized role focused on treating data capabilities and AI/ML model deployment as products delivered via an Internal Developer Platform (IDP). The core mission is to build, maintain, and improve the components of the IDP that focus on three things: defining and streamlining efficient data architectures, managing and maintaining data sources across premises and platforms, and enabling AI/ML models and workflows.
Unlike traditional data roles, the DPE operates within a platform engineering mindset, emphasizing self-service, standardization, and collaboration across different end-user profiles. It’s about bringing all the benefits and lessons of platform engineering to the world of data (and vice versa). Who are these end users? Application developers integrating AI features, data scientists building models, business units automating workflows, and, last but not least, executives hungry for data-driven insights. Serving such a varied audience requires shifting focus from just building data pipelines to designing intuitive, reliable interfaces that actually make those capabilities usable.
While the DPE owns the data and AI/ML layers of the platform, they don’t operate in a vacuum. They work with other platform engineers, particularly infrastructure, security and DevEx platform engineers, to deliver a cohesive and robust platform.
Core technical responsibilities of the data platform engineer
The DPE’s job is all about turning data into something teams can actually use, making it easy to find and access, easy to trust, and ready to plug into whatever they need, such as:
- Classical business applications, with devs in mind as primary ‘customers’
- Business intelligence and analytics use cases for business, marketing and product analysts, to drive business goals (operational performance, customer behavior, market trends, etc.)
- Strategic analysis and business forecasting
- Training of small and large scale models by data scientists and ML/AI engineers
- AI/ML applications, with a broad range of end user profiles in mind.

The way the DPE role touches platform engineering is best understood if we zoom out, look at the reference architecture of an AI-ready IDP, and walk through each area of responsibility.
As this is one of the first pieces on data platform engineering, it’s important we break things down in as much detail as possible. Here we see the following areas of focus:
Data architecture and strategy
- Designing scalable and maintainable data architectures (e.g., data lakes, warehouses, data mesh principles) that support diverse analytical and AI/ML needs.
- Defining data modeling standards and best practices.
- Collaborating with leadership (like the Head of Platform Engineering or Platform Product Manager) to align the data platform strategy with overarching business and AI objectives.
Data management and governance (data-specific focus)
- Implementing and managing systems for data quality monitoring, validation, and remediation.
- Establishing robust data lineage tracking and metadata management, including maintaining catalogs of data assets and AI projects.
- Defining data access patterns and policies (e.g., who needs access to which datasets for specific purposes), collaborating with security platform engineers (SPEs) for enforcement through platform mechanisms.
- Ensuring data handling within pipelines and storage complies with relevant regulations (e.g., GDPR, CCPA), working alongside security platform engineers who manage overall compliance frameworks.
- Managing data lifecycle policies (retention, archival, deletion).
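To make the data quality validation responsibility above a little more tangible, here is a minimal Python sketch of rule-based validation. Everything in it is an assumption for illustration: the `Rule` type, the `validate` helper, and the example rules for a hypothetical "orders" dataset are not taken from any specific tool, and a real platform would more likely rely on a dedicated data quality framework.

```python
from dataclasses import dataclass
from typing import Callable

Record = dict[str, object]


@dataclass(frozen=True)
class Rule:
    """A named data quality check (hypothetical type, for illustration only)."""
    name: str
    check: Callable[[Record], bool]


# Hypothetical rules for an illustrative "orders" dataset.
RULES = [
    Rule("order_id_present", lambda r: bool(r.get("order_id"))),
    Rule(
        "amount_non_negative",
        lambda r: isinstance(r.get("amount"), (int, float)) and r["amount"] >= 0,
    ),
]


def validate(records: list[Record], rules: list[Rule]):
    """Split records into those passing every rule and those failing at least one.

    Failed records are returned together with the names of the rules they broke,
    so they can be routed to a remediation step instead of being silently dropped.
    """
    passed: list[Record] = []
    failed: list[tuple[Record, list[str]]] = []
    for record in records:
        broken = [rule.name for rule in rules if not rule.check(record)]
        if broken:
            failed.append((record, broken))
        else:
            passed.append(record)
    return passed, failed
```

Keeping the names of the broken rules alongside each rejected record is one possible design choice: it gives the remediation side enough context to fix the data at its source.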
AI/ML enablement and MLOps
- Building and managing the platform components that support the machine learning lifecycle (MLOps), such as feature stores, model registries, and tools for model training and deployment orchestration.
- Optimizing data pipelines specifically for AI/ML workflows, ensuring data is efficiently processed and prepared for model training and inference.
- Managing the access controls that govern which data resources models can reach.
- Overseeing relationships with third-party model suppliers and managing associated licensing.
- Enabling the infrastructure required for AI workloads (like GPU access) by defining requirements for the infrastructure platform engineer (IPE).
Data pipelines and processing
- Designing, building, operating, and optimizing scalable and reliable ETL (Extract, Transform, Load) and ELT (Extract, Load, Transform) data pipelines for data movement, transformation, and integration.
- Enabling and managing real-time data streaming capabilities (using technologies like Kafka, Flink, Pulsar) for timely data availability.
- Ensuring data processing logic aligns with defined business rules and data quality standards.
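As a rough illustration of the ETL pattern described above, the following Python sketch extracts rows from CSV text, transforms them while applying a simple business rule, and loads the result into an in-memory sink. The column names (`customer`, `amount`), the rule that amounts must be non-negative numbers, and the dictionary used as a "warehouse" are all assumptions made for the example. A production pipeline would run on an orchestrator and write to real storage.

```python
import csv
import io


def extract(raw_csv: str) -> list[dict[str, str]]:
    """Extract: parse CSV text into a list of row dictionaries."""
    return list(csv.DictReader(io.StringIO(raw_csv)))


def transform(rows: list[dict[str, str]]) -> list[dict[str, object]]:
    """Transform: normalize customer names and drop rows that break the rule.

    Assumed business rule: every row needs a customer and a non-negative
    numeric amount.
    """
    cleaned = []
    for row in rows:
        customer = (row.get("customer") or "").strip().lower()
        try:
            amount = float(row.get("amount") or "")
        except ValueError:
            continue  # amount is missing or not a number
        if not customer or amount < 0:
            continue
        cleaned.append({"customer": customer, "amount": amount})
    return cleaned


def load(rows: list[dict[str, object]], sink: dict[str, float]) -> None:
    """Load: aggregate the amount per customer into the sink."""
    for row in rows:
        sink[row["customer"]] = sink.get(row["customer"], 0.0) + row["amount"]


if __name__ == "__main__":
    raw = "customer,amount\nAlice ,10.5\nbob,abc\nalice,4.5\ncarol,-1\n"
    warehouse: dict[str, float] = {}
    load(transform(extract(raw)), warehouse)
    print(warehouse)
```

In an ELT variant, the raw rows would be loaded first and the transformation would run inside the target system. The three stages stay the same and only their order changes.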
Data observability
- Implementing monitoring and alerting for data pipelines, tracking data freshness, volume, and quality.
- Providing insights into data usage, access patterns, and pipeline performance to support optimization and troubleshooting.
- Collaborating with reliability platform engineers (RPEs) to integrate data observability into the overall platform monitoring strategy.
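Freshness and volume tracking, as mentioned above, can be sketched in a few lines of Python. The `DatasetStats` type, the `check_dataset` function, and the thresholds are hypothetical names and values chosen for this example. In practice these checks would feed whatever alerting system the platform already uses.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class DatasetStats:
    """Snapshot of a dataset's latest load (hypothetical type, for illustration)."""
    name: str
    last_loaded_at: datetime
    row_count: int


def check_dataset(
    stats: DatasetStats,
    now: datetime,
    max_age: timedelta,
    min_rows: int,
) -> list[str]:
    """Return alert messages for stale or unexpectedly small datasets."""
    alerts = []
    age = now - stats.last_loaded_at
    if age > max_age:
        alerts.append(f"{stats.name}: stale, last loaded {age} ago")
    if stats.row_count < min_rows:
        alerts.append(
            f"{stats.name}: low volume, {stats.row_count} rows "
            f"(expected at least {min_rows})"
        )
    return alerts


if __name__ == "__main__":
    now = datetime(2024, 1, 2, 12, 0, tzinfo=timezone.utc)
    orders = DatasetStats(
        name="orders",
        last_loaded_at=datetime(2024, 1, 1, 6, 0, tzinfo=timezone.utc),
        row_count=50,
    )
    for alert in check_dataset(orders, now, timedelta(hours=24), min_rows=100):
        print(alert)
```

Passing `now` in as a parameter, instead of reading the clock inside the function, keeps the check deterministic and easy to test.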
Interfaces for data capabilities
- Consistent with the platform engineering approach, the DPE works with DevEx platform engineers to provide self-service interfaces for their users. These interfaces (ranging from APIs and CLIs for technical users, to UIs and low-code tools for business users) allow self-service access to data discovery, processing, model interaction, and analytics capabilities along clear golden paths.
A word of warning
As AI/ML initiatives take center stage, the heat is on DPEs like never before. That is why, for anyone working in the space or looking to integrate DPEs into their platform engineering initiative, I recommend keeping the scope of your business objectives laser-focused and making those expectations crystal clear from day one. The key trick? List all possible requests from the different user personas, and sort them by frequency and impact. If you’ve identified golden paths that people are using over and over again, consider automating them. This way, you avoid scope creep and instead focus on setting clear expectations and delivering ROI fast.
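To make the "sort by frequency and impact" step concrete, here is a small Python sketch. The `Request` type, the 1 to 5 impact scale, and the scoring formula (frequency multiplied by impact) are my own assumptions for illustration. Any ranking that reflects both dimensions would serve the same purpose.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Request:
    """A recurring request from a user persona (hypothetical type)."""
    persona: str
    description: str
    monthly_frequency: int  # how often the request comes in per month
    impact: int  # assumed scale: 1 (low) to 5 (high)


def prioritize(requests: list[Request]) -> list[Request]:
    """Order requests by an assumed score of frequency times impact, highest first."""
    return sorted(
        requests,
        key=lambda r: r.monthly_frequency * r.impact,
        reverse=True,
    )


if __name__ == "__main__":
    backlog = [
        Request("executive", "ad hoc dashboard", 5, 5),
        Request("data scientist", "feature store access", 40, 3),
        Request("application developer", "new streaming topic", 25, 4),
    ]
    for request in prioritize(backlog):
        print(request.persona, "-", request.description)
```

The requests that end up at the top of this list are the natural candidates for automated golden paths.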
But most importantly of all, the role and scope of DPEs should not be considered in isolation. They're a fundamental piece of the platform engineering puzzle. If you're having conversations about platform engineering without DPEs at the table, you're setting yourself up for failure. Whether it's debates about platform as a product, or dissecting the most common platform engineering fallacies, DPEs need to be right in the thick of it.
Conclusion
It’s clear now that a company's edge is no longer about simply having AI tools. It’s about how quickly, reliably, and efficiently it can embed increasingly powerful AI tooling and practices into its everyday workflows. The DPE role is absolutely critical for this. DPEs are the backbone that delivers reliable, scalable data and AI/ML capabilities through your IDP.
Ignore them at your own risk.
If you want to become a platform engineer or ensure your team is ready for the Platform Engineering future, take a look at our Platform Engineering Certification courses. We will be covering Data Platform Engineering as well in our modules.