Platform engineering

Courtney Kissler: How to build internal platforms for the enterprise

Courtney Kissler
CTO @ Zulily

Talk Transcript

Christoph  00:16

And we are live. Welcome, everybody, to today's meetup. I'm happy to hand it over to you, Courtney. It's great to have you here, and I'm really looking forward to your learnings along your journey on how to build internal platforms and how to make them work. So the floor is yours.

From Nordstrom to Starbucks to Nike to Zulily

Courtney  04:34

Well, thank you. I'm very excited to be here, and I'll just start going through this. I'm going to share a little bit about my own personal journey and then some very specific examples around leveraging platforms at some of the companies that I've worked at. All right, super quick. I've been in the tech industry for quite some time, and I really started in infrastructure engineering. So I have an operations background and worked at a couple of startups. Then I moved to Nordstrom, where I spent 14 years. I started in security engineering, then moved around in infrastructure and operations, and kept getting closer to the customer. My final role at Nordstrom was leading what we called our customer-facing engineering teams: mobile development, retail technology, personalization, loyalty. After 14 years I decided to go to Starbucks and get a global opportunity, leading retail technology there, which essentially was global POS across the entire store footprint at Starbucks. Very different and exciting. From there, I went to Nike, where the role that I started in was leading digital platform engineering, so near and dear to our topic today. That included something very similar to what I was leading at Nordstrom, namely our consumer data platform, our content and digital asset management platform, identity, user services, inventory, order management, and a bunch of different technologies across the Nike digital ecosystem. Then I moved into global supply chain and logistics, and my final role there was leading our ERP implementation. Now I'm at Zulily, where I started in January as CTO. It's a great company that has been around for about 11 years in all-digital retail, and I'm in the process of learning what we need to focus on. Some of it is very relevant to the topic today: we are looking at our architecture, our platforms, and our APIs, and figuring out what's the right amount of standardization and platform approach to take.
I'm super passionate about learning and creating a dynamic learning organization, generative culture, psychological safety, people are the number one asset in any organization, leaders should care and show up in that way. And I strive to be a lifelong learner. And then for fun, I love to sing. I am terrible at it, but I really enjoy it. So I tend to sing some karaoke when I get a chance. Alright, so super quick.  Every organization is different. And so one thing I've learned throughout, you know, Nordstrom, Starbucks, Nike and Zulily is that every organization, every culture is different. But there are common patterns and anti patterns. And I'll talk a little bit about that in more detail. And I'm a big believer in the first 90 days framework. So whenever I take a new role, I use that framework to really take the time to understand the situation that I'm in and ask a lot of questions so that I can learn.  

Learnings from the 2021 State of DevOps Report

Okay, here we go. I know, this is a lot of words, but I wanted to pull out the content and not repackage it. But for the 2021 State of DevOps Report, there was so much content around why platforms matter. And whether it's platform teams or platforms themselves. And I think the main takeaway here that I wanted to talk about, as I lead into the examples, is treating your platform as a product, which I think takes a mindset change. I don't think there's always been this focus on treating platforms in that manner. And it's extremely critical, I think, in order to achieve success and sustain success. And another key data point that I think is super important from the State of DevOps Report, and then I pulled some content from Dr. Ron Westrum, the Westrum model, this is why I talked about generative culture. I'm a big believer in that. Not only is your platform approach important, but how teams interact is critical. How does information flow? How do teams work together? And even if you take a platform as a product approach, if there's no focus on team interaction, and how the information flows throughout your organization, you'll only gain limited success. And so I'm a big believer in this as well. Ways of working, how do you really understand the flow of value and how do you optimize for teams to have the least amount of friction in the work that they're doing.  

Optimize for speed, not for cost

Okay, I like to tell this story because this was a big aha moment for me in my journey, and it led to a lot of focus on platforms. Most organizations treat technology or IT as a cost center, and often the focus is on how you can get as much efficiency as possible out of technology. When I was at Nordstrom, in 2012, we had this big initiative where we said: digital is our growth channel, so we need to stop focusing on cost and start focusing on speed. Now, about speed, and this is also in the State of DevOps Report: if you're doing speed right, you're not compromising stability and quality. I want to say that out of the gate. But I think sometimes when organizations start to focus on a platform approach, they focus on consolidation, cost savings, and how to make this an efficient platform, versus grounding it in delivering value, high quality, and speed, and optimizing for the developer experience. So I try to really focus on this: if you do speed right, you can get cost and efficiency. If you make cost and efficiency the leading focus, you're not going to get speed.
So I'm going to tell some stories from Nordstrom. The first story is about a team that was trailblazing when we started to shift our mindset: our infrastructure organization. Many of you have probably been in this scenario where, prior to going to cloud, everything was in an on-prem data center. Provisioning lead time could be months, and often was. You didn't have the flexibility or the scale, and it often required sending a ticket to a shared service team and then waiting for your infrastructure to be provisioned. So our infrastructure team took a step back and said, we want to set a North Star of self-service, essentially a lead time of minutes versus months. How might we do that? They did a value stream map, which I'm also very passionate about, to understand where the bottlenecks were in the existing provisioning process, and they designed for the persona of developers: our customer is a developer. It really was a huge unlock for the teams to understand where they needed to focus in order to enable that. A lot of it turned into self-service APIs, and really cloud as a platform, and how we could provision services in a more effective way.
Another example is our customer mobile organization. This was essentially our iPhone app, because that was what our customers primarily used at the time. We were not moving fast enough. In fact, when we looked at our delivery, it was twice a year, which is an eternity in digital. So the team went through and said, how can we create platforms and APIs so that we can go faster? One of the things they focused on, which I thought was great, was: let's never have an actual person log into production. No one is hands-on-keyboard pushing code to production. It created this great opportunity for us to think differently about code deployments, because up until that point we had a release management team, and you submitted a ticket and then you waited. And then it went to Apple's App Store, which we still had to wait for, but we were basically in a position where we said, we don't want technology to be in the way. Let's create these self-service APIs and underlying capabilities so that we never have to log into production.
Now, earlier I talked about why ways of working and interactions matter. This was also something this team focused on. They said, we have a lot of handoffs with a lot of different teams that are part of our value stream, and that's not working great. We don't have great feedback loops. We send things to our quality organization, and then maybe weeks later we hear whether we had a defect or not. Quality should be owned in the team, and we should own production incidents. All of this probably sounds like, of course you would do that. But back then it wasn't really happening. DevOps was out in the industry, but a lot of organizations were still not practicing true "you build it, you run it, you own it." So we did a lot of breaking down silos and setting outcomes that allowed the team to be focused on stability and resilience, as well as feature delivery.
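To make the value stream mapping idea from this section concrete, here is a minimal sketch in Python. The step names and hour figures are invented for illustration and are not data from Nordstrom; the talk does not describe the team's actual map. The sketch only shows the arithmetic a value stream map usually surfaces: total lead time, flow efficiency (hands-on time divided by lead time), and the step with the longest wait.

```python
from dataclasses import dataclass
from typing import Dict, List, Union


@dataclass
class Step:
    """One step in a (hypothetical) provisioning value stream."""
    name: str
    process_hours: float  # time someone is actively working on the request
    wait_hours: float     # time the request sits in a queue before this step


def summarize_value_stream(steps: List[Step]) -> Dict[str, Union[float, str]]:
    """Return lead time, flow efficiency, and the step with the longest wait."""
    if not steps:
        raise ValueError("need at least one step")
    total_process = sum(s.process_hours for s in steps)
    total_wait = sum(s.wait_hours for s in steps)
    lead_time = total_process + total_wait
    if lead_time <= 0:
        raise ValueError("lead time must be positive")
    bottleneck = max(steps, key=lambda s: s.wait_hours)
    return {
        "lead_time_hours": lead_time,
        "flow_efficiency": total_process / lead_time,
        "longest_wait_step": bottleneck.name,
    }


if __name__ == "__main__":
    # Illustrative numbers only: a ticket-driven provisioning flow.
    provisioning = [
        Step("Submit ticket", process_hours=1, wait_hours=40),
        Step("Approval", process_hours=2, wait_hours=120),
        Step("Provision server", process_hours=6, wait_hours=80),
        Step("Configure network", process_hours=1, wait_hours=0),
    ]
    print(summarize_value_stream(provisioning))
```

In a map like this, most of the lead time is waiting rather than working, which is the kind of finding that points toward self-service APIs rather than faster manual steps.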

Platform as a product: NPS and adoption metrics

Okay, I'm going to tell a story from Nike. And I found this, which I truly believe is true: just because you build a platform doesn't mean people are going to consume it. It has to solve a need, it has to be compelling, and you need to treat it like a product. If you deliver a product to a customer, you don't let it go stale. You're constantly improving and iterating on it. Same thing with platforms.
This story will probably sound familiar. There was an effort underway, and a charter delivered to a platform team, to essentially consolidate all deployment pipelines. The drivers behind it were really saving money and getting everybody to the same standard. Senior leaders, myself included, were asked to mandate adoption: just say everyone's going to move to the pipeline, and you have until this date. Instead, what we shifted to was asking: do we know if this platform is meeting our developers' needs? I think adoption is one way to know. If no one's adopting your platform, that's probably an indicator. But how do you really get to the why behind the lack of adoption? One thing that we weren't doing, and this is where the product mindset comes in, was capturing NPS for the pipeline: would you refer a friend or colleague to this pipeline? We learned so much just from asking that simple question, because team members weren't always elevating exactly why, what the friction was, or where the gaps in the feature set were. So that created a really great platform.
Then we started talking about NPS and adoption in our monthly business reviews. I'm also a believer that if leaders don't signal the value of an activity, it's less likely to get traction. So we would ask: how are we doing? Have we closed this feature gap for the payments team? Are they in a position now where they can adopt it? Do they need help? Do they need prioritization and air cover to do the work to move to the new pipeline? We identified a ton of gaps, and feedback loops improved, because now the product managers for the platform were super connected to their customer. We saw all this momentum gained from that. The senior leaders' role was then really to prioritize the work and the capacity in the platform team to address the feature gaps, versus just moving on to whatever the next thing was. It was, I think, a great example of an organization recognizing that we need to treat our platforms as a product. And it's okay if we're not there yet; we need to continue to invest. And once we do get adoption to a place that feels right, we're not done. You're never done. You have to continuously invest and understand what might be the next thing that you're going to need to deliver for that community.
Okay, so related: I believe that a platform team's role is to design for joy. What are you doing to minimize burden for your customers? I believe those experiences include the processes. If you have a platform but getting access to it is super burdensome, then you haven't designed for joy. It should be easy. I talked about speed to onboarding, or deploying on day one: if your platform is set up in a way that minimizes friction, you can achieve those outcomes.
I believe making problems visible matters. In that example from Nike, and at Nordstrom as well, we didn't really know where the problems were until we made them visible. Once you make them visible, you can go after them. I included this link to a blog post that I really like to leverage, because I think a lot of organizations index on velocity, which I think matters. But I also think viscosity matters. If you start to understand the dependencies between teams, and this goes to information flow and team interaction, then you can understand: do we need to build an API? Do we need to build a feature on the platform to minimize or eliminate this dependency? Having transparency into dependencies is really critical.
I also believe that you need to right-size the amount of guardrails, discipline, and rigor that you put into an environment, because if everything becomes highly bureaucratic, teams are not going to feel like they're really set up for success. I used to work with somebody at Nordstrom, he was my business partner, and he used to say, "Our role, Courtney, is to create high curbs and then wide boulevards for teams." So we'll define the curbs, and then give the team all the space they need to do whatever they need to do to achieve the outcome. I talked about NPS and deployments on day one.
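As an illustration of the two signals discussed in this section, NPS and adoption, here is a minimal Python sketch. It assumes the standard Net Promoter Score convention (answers on a 0 to 10 scale, 9 and 10 counted as promoters, 0 through 6 as detractors); the talk does not say how Nike scored its survey, and the function names and sample numbers are hypothetical.

```python
from typing import Iterable


def net_promoter_score(scores: Iterable[int]) -> float:
    """Return NPS in the range -100..100 from 0-10 survey answers.

    Assumes the standard convention: 9-10 are promoters, 0-6 are
    detractors, 7-8 are passives and only count toward the total.
    """
    answers = list(scores)
    if not answers:
        raise ValueError("need at least one survey response")
    if any(a < 0 or a > 10 for a in answers):
        raise ValueError("answers must be between 0 and 10")
    promoters = sum(1 for a in answers if a >= 9)
    detractors = sum(1 for a in answers if a <= 6)
    return 100.0 * (promoters - detractors) / len(answers)


def adoption_rate(teams_on_platform: int, total_teams: int) -> float:
    """Return the share of teams (0.0-1.0) that have adopted the platform."""
    if total_teams <= 0:
        raise ValueError("total_teams must be positive")
    if not 0 <= teams_on_platform <= total_teams:
        raise ValueError("teams_on_platform must be between 0 and total_teams")
    return teams_on_platform / total_teams


if __name__ == "__main__":
    # Made-up answers to "would you recommend this pipeline to a colleague?"
    responses = [10, 9, 9, 8, 7, 6, 4, 10, 3, 9]
    print("Pipeline NPS:", net_promoter_score(responses))
    print("Adoption rate:", adoption_rate(teams_on_platform=12, total_teams=40))
```

The numbers alone do not explain the why. As described above, the value came from following up on low scores to find the specific friction and feature gaps.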

Start small and learn before you scale

The other thing we did at Nike is we started an initiative called Ship 365. This was in response to what I consider to be an anti-pattern and risk management theater: when you say, freeze all changes. We're going to freeze changes during this critical time period, often a holiday. In reality that is suboptimizing, versus focusing on and solving for the why: why can we not ship 365 days a year, any time of day? It should be a business decision not to deploy, rather than a technology reason. So that was a big unlock for us too, to really understand what's keeping our teams from feeling confident in deploying any day of the year, any time of the year.
One other piece of advice that I've learned throughout my career, and I continue to see this happen in the industry, is that everyone gets very hung up on the definition of a platform, wanting to define every single platform in the organization and make sure that everybody's operating in the exact same way. I'm a believer in starting small: learn before you scale. As long as the focus is on treating platforms as products, I think you end up in a really good spot. And, again, focus on the developer experience.
All right. This is a busy slide, but it has a lot of the inspiration that I've leveraged throughout my career. I just keep adding to it, and at some point I probably need to do something different. But I wanted to share where I've gotten a lot of my external inspiration, because I feel like being insular is not a good approach. There are so many people in the community who are either going through the same journey or have gone through it already and can share what they've learned. Over the years, I've collected quite a few different resources that I've leveraged. And that was all that I had. I'd love to take questions, if there are any.

Christoph  24:09

Good, thank you so much. I do know some of the books you're recommending here. Great. We've seen some of the faces in earlier meetups. So cool. Yeah, we have two questions, and I'm pretty sure we get more once we start diving into this. So the first question is from John, make problems visible? How is this done? Seems like at some larger companies, engineers know there are problems, but higher ups don't know the pain. So changes are not made.

How to make problems visible?

Courtney  24:43

Ah, this is such a great question. I actually took the slide out, but I have a ton of passion around what I call the senior leadership evolution that is required for making problems visible. It's exactly what you're describing, John: often senior leaders, and frankly even middle managers, can be very disconnected from reality. I talk about honoring reality and surfacing reality, and that leaders need to be knowledgeable enough about where the problems are to be helpful: not to micromanage, but to genuinely be there to unblock. My favorite technique is value stream mapping. I think it's a way to bring teams together and put the facts on the table about what is in the way. Senior leaders need to be engaged, and not just in words but in action. If I say, "It's super important for us to make problems visible, I support us doing a value stream map, go for it," and then I never show up, never engage, and don't ask any questions, then I'm not really demonstrating my support for problem solving. And then teams need the space to truly problem solve. If you identify something and it requires us to make a trade-off, maybe we have to slow something down, that is a senior leadership commitment to problem solving. I talk a lot about lean when I talk about leadership evolution too, and I love using the term gemba: leaders need to go and see, not go and tell. Go and ask questions and understand where the problems are. When I do the first 90 days, it's a lot about listening, so that's another mechanism. Value stream mapping can be really challenging, especially in a remote situation, so there are also listening tours, roundtables, and connecting with engineers, technical product managers, and anyone who's close to the work, so I can understand reality. Because if I don't know where the real problems are, and I talk about the burden a lot, I'm not going to be able to help in any sort of meaningful way.
So I hope that helps, I think it really requires senior leaders to be actively engaged in what's really going on at the team level. And then committing to and you know, you ask, like changes aren't made committing to those changes, but also in a similar way that we expect our teams to do. We're gonna try something, we might not get it right, then we're gonna learn from it. And I'm going to continuously help with us getting to a better place, and not just doing a one time change and then walking away.

How to change culture?

Christoph  27:38

And then probably also providing the psychological safety you mentioned earlier to allow people to do that. Great, great. Cool. So Nigel Simpson, first of all, great to have you here, Nigel. Always good. Remember our last meetups, your experiences demonstrate cultural change. What would your advice be to others in large enterprises that are locked into a command and control culture? How do they change the culture? Is it always top down?

Courtney  28:08

In my experience, it needs to be both. I've been in scenarios with bottom-up culture change, and I've seen it in action. I've seen what I'll call limited progress, great progress, but then it breaks down because the top-down is not there. They're just really not on board. So I truly believe that you need both. One of the ways I've seen it be successful is shifting away from being output-focused to being outcome-focused. It helps if you can get leaders to set clear outcomes for the teams and create the right environment where there are shared outcomes, because some of the culture and the command and control comes from: I'm focused on one thing, you're focused on another thing. Not bad intent, but we're sending different signals into the organization. If we share an outcome, then you're really creating an environment where teams are going to come together in order to make something happen. Now, you also have to be mindful of doing that in a way that doesn't create a high tax across the organization. If you're collaborating and you need a lot of people-to-people interaction in order to deliver that outcome, that's another area where leaders can say, wait, what could we do to minimize the human interaction and create an environment where teams can self-serve and get what they need, to minimize that burden? So I truly believe it takes both. But I also believe in creating that generative culture where leaders need to be creating psychological safety. The best way I've seen to do that is to create outcomes, and then create what I call a system of accountability: how do you continuously understand the health of those outcomes and demonstrate your commitment to shifting away from command and control?

Christoph  30:32

Cool. Thanks. Great answer. So another question from Shula. I think we also talked earlier, and I think you're also building a platform, if I'm not mistaken. "Thank you for the presentation, Courtney. What were some of the biggest barriers your teams encountered in getting to a self-service internal platform? Personally, I've seen a good identity provider and proper role- or attribute-based management being the largest blocker to self-service."

Courtney  31:03

Yeah, I mean, that has definitely been a barrier. Because if you ... in a lot of cases, there are certain controls that need to exist inside these self service platforms. And if you don't invest in that early, you end up paying for it later. So finding the right end, you know, I love the use of identity, you know, identity provider. I also believe that if you can build in things like observability patterns, things that can minimize, and I don't know if they're barriers or just friction later, that it's super, super important. So I think it goes back to saying, you almost have to go slow to go fast. So make sure that you're doing enough of that barrier reduction out of the gate. But also, this is another thing that I've seen happen with platform approaches where sometimes teams want to wait until they have everything perfect, before they allow anyone to consume the platform, versus take a team or two figure out their needs, build in as much as you can to make that a relevant experience, and then iterate and learn as you go. And so yes, I agree that having some of what I'll call it I don't like using this term anymore, because it feels very old school, but like, non functional requirements, like how are you building things into the platform that people don't have to worry about? And you don't end up in a situation where you got everyone like, if you continue to use the identity example, like everyone's building their own AWS account, or everyone's building their own way to get in access to the system? And then over time, that becomes friction.

How small can we start?

Christoph  33:07

Yeah. Not having a beta customer is almost always a death sentence. And I think he's right with that. And that's, I mean, that's the difficult way how you would develop any other product, right? But why would you treat your internal platform any different from that? Cool. David, is asking, and I really like this question, how small can we start? I'm at a small company, and we are already seeing the need to organize the chaos. Does this platform product approach work as a way to generate culture as we grow?

Courtney  33:42

Yeah, great question. I don't know the size of your organization. But for me, and the organization I'm in now I would consider to be smaller, especially compared to the organizations that I've been at before. And I'll just share kind of what my version of starting small has been. We have specific use cases in our environment that are candidates for a platform approach. Most of our teams need to, let me backup. There are certain use cases where it takes multiple teams in order to deliver that use case or that outcome. And today, our integration patterns are not standard. So everybody can kind of do whatever they want. And it was all for the right reasons. So just super quick to say that, like, every decision that gets made has the context of the moment. So it wasn't wrong. It's just reality. And now what we're learning is some amount of standardization would actually speed up our teams. And here's the best part around the culture part. Our teams are asking for it. So rather than a leader at my level, coming in and saying, we are going to implement standards, and everyone's going to follow these API versioning standards and contracts will look like this. It's like, I'm not doing that. The teams through the listening tours and understanding the burden have elevated, we need standards. So now we're on a journey where we've started small, because we've said, we're not going to boil the ocean, we're not going to like create standards that apply to the entire technology organization, we are going to take two teams, and they are working in different tech stacks. But they need to work together. So what might it look like for them to leverage a platform to speed up their delivery. And once we solve for that, we're going to learn a lot. And then we can decide what applies to the other teams. In some cases, maybe it doesn't. 
So then we need to look at it and say, what other use cases exist in our environment that will create the opportunity for us to enhance the platform, and maybe introduce different standards that need to exist for these other teams? So I like to start with one to two teams, and I prefer to have it grounded in a use case or an outcome. For example, I believe in the DORA metrics that come from the book Accelerate. If you're looking at deployment frequency, change failure rate, lead time for changes, and mean time to restore service, you're going to get an indicator of where you might want to start first. Because if teams are having a hard time with frequency of deployment, or every time they deploy a change they have to roll it back, that could be an area where you say, this might be a candidate for us to create a platform for them to improve those outcomes.
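For readers who want to see how the four DORA metrics mentioned above could be computed, here is a minimal Python sketch. The `Deployment` record and its field names are assumptions made for illustration, not a schema from the talk or from any particular tool, and real measurements would come from your own deployment and incident data.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from statistics import mean, median
from typing import Dict, List, Optional


@dataclass
class Deployment:
    """A hypothetical deployment record; field names are illustrative."""
    committed_at: datetime                  # when the change was committed
    deployed_at: datetime                   # when it reached production
    failed: bool = False                    # did it cause a failure in production?
    restored_at: Optional[datetime] = None  # when service was restored, if it failed


def dora_metrics(deployments: List[Deployment], period_days: int) -> Dict[str, Optional[float]]:
    """Compute the four DORA metrics for one team over a period."""
    if not deployments or period_days <= 0:
        raise ValueError("need at least one deployment and a positive period")
    lead_times = [(d.deployed_at - d.committed_at).total_seconds() for d in deployments]
    failures = [d for d in deployments if d.failed]
    restore_times = [
        (d.restored_at - d.deployed_at).total_seconds()
        for d in failures
        if d.restored_at is not None
    ]
    return {
        "deployments_per_day": len(deployments) / period_days,
        "median_lead_time_hours": median(lead_times) / 3600,
        "change_failure_rate": len(failures) / len(deployments),
        # None when there were no restored failures in the period.
        "mean_time_to_restore_hours": mean(restore_times) / 3600 if restore_times else None,
    }


if __name__ == "__main__":
    start = datetime(2024, 1, 1, 9, 0)
    sample = [
        Deployment(start, start + timedelta(hours=2)),
        Deployment(start, start + timedelta(hours=4), failed=True,
                   restored_at=start + timedelta(hours=5)),
        Deployment(start, start + timedelta(hours=6)),
        Deployment(start, start + timedelta(hours=8), failed=True,
                   restored_at=start + timedelta(hours=11)),
    ]
    print(dora_metrics(sample, period_days=2))
```

Comparing these numbers across teams is one way to find the first one or two teams where a platform investment is most likely to improve outcomes.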

Christoph  36:58

Great, great answer. So you're on that journey yourself. In your new role. Great. Cool. I think those were the questions. Great questions. Thanks. It's also good to see so many people staying for the whole time. Thank you so much, Courtney, that was great. Thanks for giving us some insights into how you think about platforms and how you actually managed to build them and how you foster them. I think it's a lot about mindset. And so yeah, thanks, everybody for joining. And hopefully see you at the next meetup.