
Platform for HyperGrowth - Overview


For this 16th episode of Access Control Podcast, a podcast providing practical security advice for startups, Teleport Developer Relations Engineer Ben Arent chats with Lukáš Hrachovina. Lukas is an Engineering Manager on the Cloud Engineering team at Productboard. Productboard was founded in 2014 and is a product management system that helps organizations get the right products to market faster. Lukas has been at the company for two years and has helped keep systems stable as Productboard has entered hypergrowth. This episode dives into how to plan, build, and execute a platform team that supports a growing organization while keeping systems as secure as possible.

Key topics on Access Control Podcast: Episode 16 - Platform for HyperGrowth

  • Productboard is a product management system that lets organizations get the right products to market faster.
  • Productboard is transitioning to Backstage, a tool from Spotify, to build a developer portal, and has moved its documentation there as well.
  • To securely offboard engineers, Productboard uses a variety of tools including Okta and Vault.
  • Productboard chose Teleport to improve its security as its engineering team grew; because the company works with customer data, it's critical to protect that data and prevent any possible leaks.

Expanding your knowledge on Access Control Podcast: Episode 16 - Platform for HyperGrowth

Transcript

Ben: 00:00:00.266 Welcome to Access Control, a podcast providing practical security advice for start-ups. Advice from people who've been there. Each episode, we'll interview a leader in their field and learn best practices and practical tips for securing your org. Lukas is an engineering manager for the cloud engineering team at Productboard. Productboard was founded in 2014 and is a product management system that lets organizations get the right products to market faster. Lukas has been at the company two years and helps to keep systems stable as Productboard has entered hyper-growth. Today we'll dive into how to plan, build, and execute a platform team to help support a growing organization while keeping your systems as secure as possible. Lukas, thanks for joining today.

Lukas: 00:00:42.002 Happy to be here.

Introducing Productboard

Ben: 00:00:43.049 So to start, can you tell me a little bit about what Productboard is?

Lukas: 00:00:46.339 As you said, Productboard is a tool. It's a product management platform. It helps get the right product to market faster by understanding what your customers need and prioritizing off of that, basically. And you can also align everyone within your organization around a road map built off those features that are requested by your customers. So basically, it provides you with one tool where you can aggregate feedback from all the different sources, prioritize what's important, and create a road map off of those priorities. That's pretty much it.

Ben: 00:01:32.032 And is it targeted for start-ups or enterprises or mid-sized companies?

Lukas: 00:01:36.025 All of the above, really. Basically, we have sort of different plans, some targeted at start-ups, all the way up to, basically, enterprise-level customers.

Ben: 00:01:46.043 And it's offered as software as a service?

Lukas: 00:01:48.527 Yes, it's software as a service. Yes, that's correct.

Ben: 00:01:52.236 And so I think today we dive into sort of your platform team. You're obviously in charge of running that software as a service platform for Productboard.

Lukas: 00:02:00.885 I'm part of the cloud engineering team, which is, in turn, part of a platform tribe. A tribe is sort of an overarching, I guess, organizational unit, where all the misfits go. Basically, all the misfit teams that don't work directly on the product, but whose customers are engineers, so to speak.

Ben: 00:02:33.254 Yeah. Your internal teams.

Lukas: 00:02:34.910 Yes, that's right.

The history of engineering at Productboard

Ben: 00:02:35.945 The company's been around for about eight years. Can you tell me a little bit about the history of the engineering before and during your time there?

Lukas: 00:02:43.233 So I can't speak too much on the before, but basically, from what I've been told, the company grew at a slow and steady pace from 2014 all the way until 2016, '17, when the growth really started to kick off. So basically, I joined at a very, very good time, halfway through 2020. So I've been with Productboard for two years. That was after our series B, where we basically entered the hyper-growth phase. It had started shortly before that, then COVID came, and everybody sort of froze in their tracks, and then it kicked up again. To give it some context, I think my employee number is 220-something. So at the time I joined, there were like 150 people in the company. At this point, there are more than 500 of us, with our engineering group growing in proportion to the rest of the company.

Ben: 00:03:44.675 I'm guessing, too, Productboard probably really helps people who went to working from home during COVID to stay organized, and helps keep companies and organizations on a clear trajectory regarding their road map.

Lukas: 00:03:56.396 Yes, actually. As you rightly pointed out, COVID and working from home didn't really have a bad impact on our business. Maybe slightly the other way. Obviously not to the extent of the likes of Zoom and others, but it didn't impact our business too harshly.

Ben: 00:04:13.531 So as part of scale, it's mostly around keeping uptime and availability, which is one of the most important parts of your company.

Lukas: 00:04:21.211 Yes. Also, with our engineering group, our engineering organization, we're basically maintaining our own CI. We run our own CI, on-premise so to speak, and in the cloud obviously. But we run it ourselves. This also becomes a problem at scale at some point, when your repositories get really big and your automated test suites start using more and more resources, and so on and so forth. So the scale problem is manyfold.

Ben: 00:04:59.086 It's kind of keeping people's developer velocity going. I guess as you have more engineers join, more tests, more PRs, there's more tests running so you need to keep everything churning along quickly.

Lukas: 00:05:08.481 That's correct. Yes. Also, one other fairly recent change to our organization: I think it was around a year ago when we decided that we wanted to create an engineering presence in North America. Up to that point, Productboard had engineers predominantly in Europe, but we already had an office in Vancouver, mostly people from customer success and sales, and we decided that it'd be a good idea to have some engineering presence in Vancouver as well. This obviously created a whole lot of different problems, because suddenly there is a nine-hour time difference between the Czech Republic, where most of our engineers are, and the Pacific Coast. Also, you have to invest in knowledge sharing, which is something I think we'll get to later down the line. But yeah, that was one of the scaling problems as well.

Ben: 00:06:04.476 And are you in Canada or are you in the Czech Republic?

Lukas: 00:06:07.274 Funnily enough, I'm Czech. I was born in Prague but I've been in Canada for the past couple of months.

Ben: 00:06:13.352 Okay. So you got to see a bit of both offices, I guess, and different cultures.

Lukas: 00:06:16.349 Yes, that's correct. Yes.

Migrating from a self-managed kOps installation to EKS

Ben: 00:06:17.929 Back in March, I think, on your engineering blog, you mentioned that the infrastructure team was responsible for migrating a self-managed Kubernetes installation to EKS as part of, I guess, developer velocity. Can you tell me a few reasons why you made this migration?

Lukas: 00:06:32.073 Well, if you go and read the blog, there is this little section titled The Outage, which is exactly what you'd think it is. Maybe a little bit of a backstory. Originally (this was long before my time), Productboard ran off of Heroku, a platform as a service, and at some point it was no longer fitting our needs very well, so it was decided that we would migrate to Kubernetes. And back in the day, EKS was still in its infancy, I guess. It had only recently been released to the public. So it was decided that we would go with kOps, a self-managed solution for running Kubernetes. It ran fairly smoothly for a long time, but along the way, we lost some of the colleagues who were part of the original team that set up our kOps deployment. At some point, we were left with, basically, a platform that you don't really want to touch because you don't exactly know how it works. As such, we were already planning to migrate to EKS in the future because it seemed a logical choice: all of the problems that we'd had with EKS were no longer there. It was running a newer version of Kubernetes. It was more stable, and so on and so forth. Then we experienced this outage in September 2021, with a downtime of approximately five hours across the whole of production, and I think that was the straw that broke the camel's back; we decided to fast-track our transition to EKS.

Ben: 00:08:24.847 Yeah. And as a team, what are your thoughts on building it yourself versus buying the solution? How do you find the difference between a managed Kubernetes service and kOps?

Lukas: 00:08:34.708 I think Kubernetes is hard enough that you don't have to feel, I guess, inferior when you decide not to manage it yourself. I think a lot of the complexity comes from elsewhere. It comes with scaling it; it comes with how you deploy your resources, and so on and so forth. And those are the things that I think have a direct impact on your experience. On your stability. On developer experience. And they don't necessarily have anything to do with whether you manage your control plane or not. From our perspective, it was logical, and I would dare say the correct decision, to move to a managed Kubernetes platform.

Ben: 00:09:23.321 And there's still, obviously, a lot of day-to-day support that you need to do to make sure that it fits well within your organization. I think you mentioned you have your own CI/CD. Can you just touch a little about you how you deploy at Productboard?

Lukas: 00:09:37.076 Our applications are deployed using Argo CD. We're currently undergoing a migration from, basically, self-hosted GitHub runners to self-hosted GitHub Actions, but the principle is still the same. We're running different classes of runners, which are spun up according to our needs. It's all webhook-based, basically. A pull request comes in; a runner of the predefined instance or pod size is spun up, executes the job, and returns the output. That's pretty much it.
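As a sketch, a workflow targeting on-demand self-hosted runners of a given class might look like this. The labels and job names here are hypothetical, not Productboard's actual configuration:

```yaml
# Hypothetical GitHub Actions workflow using self-hosted runners.
# The "large" label is an assumed runner class; real labels depend on
# how the runner fleet is registered with GitHub.
name: test
on: pull_request

jobs:
  unit-tests:
    # The job is picked up by a self-hosted runner carrying these labels;
    # an autoscaler can spin up a pod of the matching size per webhook.
    runs-on: [self-hosted, linux, large]
    steps:
      - uses: actions/checkout@v3
      - run: make test
```

The webhook-driven flow Lukas describes would then be handled by whatever autoscaler registers and tears down runners with those labels.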

Improving engineer performance

Ben: 00:10:17.069 So as you move to more managed solutions, your team is able to work on higher-level problems. Can you talk about how the platform team has helped increase the output and performance of your end engineers at Productboard?

Lukas: 00:10:33.266 One thing that came out of getting rid of kOps relates to how we use Terraform to manage our cloud resources. kOps generates Terraform code for you as well, to create cloud resources like security groups, networking, and so on and so forth. Up to that point, we kept it in a separate repository. Since we no longer use kOps, we were able to merge it. Sometime further down the line, we added permissions management into a single repository as well. So right now, we have only one repository where pretty much all our Terraform code resides, which has really improved the flow. It's only one place you have to look for requests, do PR reviews, and so on and so forth. So that's one thing. Since our cloud team is fairly small (currently there are four of us, soon to be five) and we have around 100 engineers at Productboard, we're really big on developer enablement, on enabling our devs to do as much themselves as they can. In the past, someone who wanted to create a new microservice would submit a ticket to us saying, "Hey, we need this RDS database or this Aurora database created, and we need to have this access," and so on. Now it's: "Here's your Terraform repository. This is the module you're supposed to use. Here is an example. Do it. Submit a PR. I will have a look at it. If it all checks out, we'll just approve it," and that's it. I think that's one of the things that improved.
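To illustrate the self-service flow, a developer's PR against a shared Terraform repository might add a module call along these lines. The module path, inputs, and names are hypothetical, not Productboard's actual module:

```hcl
# Hypothetical self-service database request: a developer adds this block
# in a PR, and the platform team reviews and approves it.
module "orders_db" {
  source = "../../modules/aurora-postgres" # assumed shared module path

  name           = "orders"
  engine_version = "13.7"
  instance_class = "db.r6g.large"
  instance_count = 2

  # Network placement and access rules are constrained inside the module,
  # so reviewers only need to check the service-specific inputs.
  vpc_id     = var.vpc_id
  subnet_ids = var.private_subnet_ids
}
```

The point of the pattern is that the module encapsulates the security-sensitive decisions, so a PR review is a lightweight second pair of eyes rather than a ticket queue.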

Lukas: 00:12:14.766 Also, since we were the ones who built this solution, we were able to focus on, I guess, disaster recovery. Naturally, coming off an outage like that, we invested pretty heavily into that. So at this point, we know pretty well how things work, and we're able to focus on improving our time to recovery. At this point, our entire Kubernetes cluster is very well documented. It's defined almost 100%, I would say like 98%, in code. There are very few manual steps. So the time to get from nothing to a working Kubernetes cluster that you're able to deploy applications to has improved drastically.

Ben: 00:13:07.344 And I guess EKS probably helps with that too. As long as you have the right resources, it's easier not to worry about the underlying management plane or the nodes.

Gotchas of the EKS stack learned along the way

Lukas: 00:13:15.580 Yes, that's correct. You don't have to worry about managing cluster nodes and so on and so forth. There are some gotchas that you learn along the way, but other than that, it's loads better in terms of the experience of managing it.

Ben: 00:13:31.158 Can you give an example gotcha?

Lukas: 00:13:33.292 Basically, EKS comes with sort of two node types. You can have self-managed nodes or managed nodes. The difference is fairly subtle, but we're using managed nodes. That means you are allowed to use only Amazon Linux, and there are a handful of other requirements. You can give them an instance profile that the node assumes, but it's all a bit finicky. Also, quite honestly, the official Terraform module isn't the best, so we really had to learn the hard way, I guess.
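For reference, a managed node group is typically declared in Terraform with the `aws_eks_node_group` resource. A minimal sketch, with illustrative names and sizes rather than Productboard's real setup:

```hcl
# Minimal managed node group: EKS provisions and manages the underlying
# EC2 instances, but the AMI must come from a fixed set of supported types.
resource "aws_eks_node_group" "default" {
  cluster_name    = "production"          # assumed cluster name
  node_group_name = "default"
  node_role_arn   = aws_iam_role.node.arn # IAM role the nodes assume
  subnet_ids      = var.private_subnet_ids
  ami_type        = "AL2_x86_64"          # one of the supported AMI types
  instance_types  = ["m5.xlarge"]

  scaling_config {
    min_size     = 3
    desired_size = 3
    max_size     = 6
  }
}
```

The `ami_type` constraint is the restriction Lukas mentions: with managed nodes you pick from Amazon-supplied images rather than bringing arbitrary AMIs.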

Ben: 00:14:12.643 Yeah, I can imagine. It can sometimes be hard to know, with the different machine types, what's supported, and especially in which region, too; they can differ as well. And I think one thing you mentioned is that you moved to a monorepo setup. Can you talk a little bit about why a monorepo and how that's helped your team?

Lukas: 00:14:28.281 Well, it's a monorepo in terms of the infrastructure only, really. In the past, we had three repositories: one where all the kOps stuff was, one where everything else Terraform-related was, and we used a tool called [IAME?] for our IAM management. Along the way, we felt it was only natural to merge these into one, because our Terraform platform is fairly robust. We're using Terragrunt. We're using Atlantis, which is a tool that basically allows you to run Terraform from your GitHub pull request window, which is fairly handy because it removes the need to submit multiple pull requests when your Terraform run fails, and so on and so forth. Since we have this fairly robust ecosystem built on top of our Terraform repository, it was only natural to merge it all into one.
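A repo-level `atlantis.yaml` for a Terraform/Terragrunt monorepo commonly looks something like this. The directories and workflow name are hypothetical:

```yaml
# Hypothetical repo config for Atlantis in a Terraform/Terragrunt monorepo.
version: 3
projects:
  - name: production
    dir: environments/production       # assumed directory layout
    workflow: terragrunt               # custom workflow defined server-side
    autoplan:
      when_modified: ["*.hcl", "*.tf"] # re-plan when these files change in a PR
    # Require an approving review before `atlantis apply` is allowed.
    apply_requirements: [approved, mergeable]
```

With this in place, Atlantis comments the plan output on the pull request and engineers apply from the PR window, which is the flow Lukas describes.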

Storing institutional knowledge

Ben: 00:15:30.709 Yeah. And as far as consolidation goes, another thing that you mentioned in your blog post was that after the main lead had left the company, institutional knowledge had been lost. In a similar way to how you've created the monorepo, how do you think about storing institutional knowledge now within the team?

Lukas: 00:15:47.899 Really good question. I think we've matured in this regard over the years, and it starts all the way at the planning level. Maybe to give some context: our platform tribe's quarterly planning is basically shifted one month behind our product teams, so that we're able to react to their needs better. So our Q1 starts not in January but in February, and so on and so forth. These are the building blocks for our quarterly planning objectives, which are fairly well-defined. They're agreed upon across the whole platform tribe. They always have a DRI, a directly responsible individual, who's not necessarily expected to be the one delivering the entirety of the objective; they're the person in charge of knowing where we move next and, if we're stuck, how we're going to circumnavigate our problems, and so on and so forth. And this DRI is different for different objectives, so it prevents knowledge silos. I think with our team becoming more senior, more mature, this has had a very positive impact on knowledge sharing. Also, these days we've started to run game days, which are these tiny little D&D simulations, where someone is the game master who comes up with the scenario. It's usually some kind of problem that we have to solve. There are preselected roles, which I think facilitates a real-life response to an emergency.

Ben: 00:17:36.233 Are they a couple of hours, half-day, how do you sort of structure them?

Lukas: 00:17:40.298 A couple of hours. We try to keep it within reasonable bounds. You introduce the problem. Roles are assigned. Then the actual problem-solving happens. You time-box it. Because if you're not able to solve it within the predefined amount of time, people obviously are lacking the knowledge and the resources and there is no point in continuing any further. That's the general plan. And then there's always some retrospective basically saying, "Hey, these are the lessons learned."

Ben: 00:18:14.478 Can you give an example topic for a recent game day?

Lukas: 00:18:17.572 We are running HashiCorp's Vault for storing our secrets, application secrets for the most part. And we simulated a Vault cluster outage. We found out that while our documentation is pretty good, one thing we're lacking is this: you have these different recovery scenarios. Basically, your [inaudible] cluster failed, or you lost your KMS key, so your Vault cluster cannot unseal itself. One thing that we were missing is symptoms, these key symptoms that can help you, when you're under stress, easily differentiate between the different problems, saying, "Hey, this is certainly this case and I don't have to worry about any of the others." So maybe that's one of the lessons learned from the last time, and we added it to our documentation.

Ben: 00:19:18.631 And I guess there's a [inaudible] stage prior to a runbook. It's like: if you get to work at about 3:00 in the morning and you see this outage, check for these things first, because it may not be the one thing you're trying to fix.

Lukas: 00:19:29.078 That's correct, yes. Exactly.

Ben: 00:19:30.708 And then what do you do in regards to documentation? Do you have a wiki or runbooks, or do you have it in GitHub? How do you think about organizing and structuring that?

Lukas: 00:19:41.207 It started off in GitHub. It didn't really work out, so later we transitioned to InfraDocs: basically, MkDocs built off a GitHub repository. And these days, our entire organization is transitioning to Backstage, which is a tool from Spotify, I believe. What we're building is a developer portal, basically a single source of truth where you have various integrations, and that's where we moved our documentation as well. For a very long time, we were in this sort of hellish state where some things were in Notion, which is something hated with a passion by all engineers, but it was there for some reasons, and some things were in MkDocs and in GitHub and so on and so forth. I think at this point, I can say with a clear conscience that everything is in Backstage, in one place.

How to securely offboard engineers

Ben: 00:20:39.288 Nice. I'll definitely have to check it out in the show notes. Just to close out this story: as people leave, you also lose institutional knowledge. That raises another problem for hyper-growth companies. You're bringing on lots of new engineers, but you're also offboarding them, and often they have access to secrets, API keys, maybe SSH keys on production hosts. There's all this onboarding procedure, but there's also offboarding. What do you do to clear up these secrets or credentials from former employees?

Lukas: 00:21:09.598 While our churn has been very, very low in terms of employees, we obviously have to deal with these things as well. And I think our setup is pretty good. We're using Okta as our main auth provider, our main SSO. All of our resources are behind a VPN, which you can get through using your Okta sign-in. So that covers most things. We're using Vault to store application tokens and all this sensitive information, so the amount of directly shared passwords and SSH keys is virtually none. There are some shared password vaults, but it's mostly very, very low-impact stuff that either wouldn't work without a VPN or is just totally safe stuff. We don't really do SSH anywhere, so that's a big help; we don't have to worry about it. In terms of all the other software that we use, software as a service, Datadog and so on and so forth, basically places that you can get to without a VPN, it's mostly Okta. Our IT team has this tool called Torii, and it basically tracks all the online accounts that we have in our company and which people have access to them. Once someone leaves, there is a person responsible for each account, and they get notified saying, "Hey, we need you to remove this person from this account," say, Datadog.

Ben: 00:22:51.570 It's like sometimes there's a manual clean-up, yeah, of accounts.

Lukas: 00:22:54.456 Yes. Yes. But it's usually only one or two items per person that you have to clean up yourself manually. Other than that, it's mostly automated, as I said. So it's manageable.

Why Productboard chose Teleport

Ben: 00:23:07.815 And this is a unique thing: since Access Control is run by Teleport, you guys are also in the unique position of being a recent Teleport customer. What were the reasons for picking Teleport for accessing your infrastructure?

Lukas: 00:23:22.293 So basically, as our engineering grew, and since we work with customer data, it's a paramount priority to protect it and prevent any possible leaks, and so on and so forth. Everyone who has access to production is obviously under an NDA and has gone through a background check, and so on and so forth. But we felt it was necessary to add another layer, to have a proper audit capability and preferably session recording for basically anyone who would access our production environment manually, without a second pair of eyes, so to speak. Because most of our changes are done through automated steps, through CI, through pull requests, which are approved, naturally, and so on and so forth. But there are use cases, whether it's responding to an emergency or hotfixing something, which are rare, but we're in IT; they do happen. So for those reasons, as the number of our engineers grew, we felt it was really important to have a tool like Teleport in place, to basically improve our security.

Ben: 00:24:37.434 And you're using this primarily for, basically, Kubernetes access for short-lived kubeconfigs?

Lukas: 00:24:42.537 That's correct. For Kubernetes access and for database access as well.

Ben: 00:24:48.239 Teleport can help with your mapping of users. How do you think about the internal mapping of Kubernetes users? Do you define different roles and users? I'm assuming everyone in your team is almost like a system:masters-level user, or do you slice and dice roles at a much finer level?

Lukas: 00:25:07.977 Since basically, anyone who has access to production has a business justification for it, we don't have to do that fine of a differentiation between our engineers. There are people who have read-only access, so that's one group. There are people who have very, very privileged access, usually administrators. Then there's everyone in between who has basically read-write access, that's pretty much it in our use case.
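As a sketch, the three tiers Lukas describes map naturally onto Teleport roles. A read-only tier might look roughly like this; the labels, group names, and database users here are hypothetical, not Productboard's configuration:

```yaml
# Hypothetical Teleport role for the read-only tier.
kind: role
version: v5
metadata:
  name: read-only
spec:
  allow:
    # Allow access to all registered Kubernetes clusters...
    kubernetes_labels:
      '*': '*'
    # ...but map the user onto Kubernetes' built-in read-only RBAC group.
    kubernetes_groups: ["view"]
    # Database access is limited to a read-only database user.
    db_labels:
      env: production
    db_users: ["readonly"]
    db_names: ["*"]
```

The read-write and administrator tiers would be separate roles granting broader `kubernetes_groups` and `db_users`, assigned via the identity provider's group mapping.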

Ben: 00:25:38.351 And so I think you sort of touched on this a bit. All of these sort of auditing tools are always important for certain compliance regimes, like SOC 2. I think Productboard were certified in 2020. Can you tell me about how things have changed in the last two years and sort of how your team stays in compliance?

Lukas: 00:25:56.773 How things changed? Basically, we were continuously responding to the feedback from the reviewers of our SOC 2 compliance and gradually implementing changes along the way. And it sort of forces you to, because imagine you're supposed to rotate your SSH keys every three months and then you realize you don't really have to do SSH anymore, so you just get rid of it, and then it's one less thing to worry about. In terms of how the process works, we do our internal quarterly reviews in collaboration with our other engineering teams and especially with our security team, where we basically provide the required information in terms of capacity planning and how our infrastructure has changed, and so on and so forth. We use tools like AWS Security Hub (we run on AWS, which I guess is worth mentioning), GuardDuty, and Lacework. We scan our images before they're uploaded to ECR. And obviously there are the processes we talked about earlier, like those game days, basically BCDR exercises, and documentation, and so on and so forth. All these, I guess, help us stay secure and keep improving.

Ben: 00:27:19.815 Yeah, the controls. For AWS specifically, do you do anything with IAM roles to monitor people accessing the AWS Management Console?

Lukas: 00:27:28.573 There are very, very few people who access the AWS console directly. But we use AWS IAM to authenticate our users to Kubernetes; that's fairly standard with EKS. And we do have monitoring. We have, obviously, monitoring on our root accounts, which should almost never be used unless you're dealing with billing and so on and so forth. Basically, they should never be used for operational purposes. We are currently onboarding to using Teleport for AWS console access, and the AWS CLI, for the most part. I think that's it. All our roles are managed in our Terraform code.
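The IAM-to-Kubernetes mapping on EKS is typically configured in the `aws-auth` ConfigMap. A minimal sketch, with a hypothetical role ARN and group:

```yaml
# Standard EKS aws-auth ConfigMap; the role ARN and group are illustrative.
apiVersion: v1
kind: ConfigMap
metadata:
  name: aws-auth
  namespace: kube-system
data:
  mapRoles: |
    # Anyone assuming this IAM role is mapped to the "developers" group,
    # whose permissions are then defined by Kubernetes RBAC bindings.
    - rolearn: arn:aws:iam::111122223333:role/eks-developer
      username: "{{SessionName}}"
      groups:
        - developers
```

Keeping this mapping, and the IAM roles it references, in Terraform is what makes the review gate Lukas mentions enforceable.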

Ben: 00:28:13.710 Yeah. I guess having infrastructure as code for IAM really helps because you've tested it and linted it, and you don't have to worry about someone creating a role in the UI and granting star-star to the wrong database, which could easily happen.

Lukas: 00:28:26.791 Exactly. Basically, nobody in the whole organization has permissions to do this directly without a second pair of eyes. So basically, there's no way to circumnavigate the restrictions in place.

Security issues faced and how they’re resolved

Ben: 00:28:39.116 And with all of this in place, you have good visibility, I guess, into what's possibly happening if there are any security issues. If there are any security issues, what sort of things keep you up at night, and how do you plan on resolving them?

Lukas: 00:28:53.059 I don't think there is anything serious keeping us up at night. Of those things that we're aware of, we're actively trying to resolve them, and it's probably not really a good idea to disclose them publicly. Maybe one of the security issues we're facing right now, I guess, is making the experience as smooth as possible. Because I think part of the journey is adopting Teleport into your engineers' lives; for infra people, it's just another tool and we're fairly used to it, so it's easily digestible, I guess. It may not be like that for everyone. So one thing that we're working on is providing a layer of abstraction on top of our current tooling, on top of Teleport and on top of Vault, which is something that we call PBCTL, or PB "cuttle"; it would be a CLI abstraction on top of these tools. So suppose you're a back-end engineer and you want to check some database. At this point, you have to be well aware of what environment the database runs in: whether it's production, which means you have to use Teleport to get there, or staging, which means you have to go through Vault for ephemeral access. And it's all a bit much, so I think one of the ways to improve adoption and make everyone's lives easier is to build some clear guidance and some abstraction on top of it.

Ben: 00:30:30.815 Yes, really not getting in the way of developers' productivity if they can just use the terminal for their tooling. They much prefer that to having to dive into an Okta SSO page for your various tools.

Lukas: 00:30:41.930 Exactly.

How Productboard works with other developers and orgs

Ben: 00:30:42.639 And then as a platform team, can you tell me how you work with the other developers and organizations? I think that was a great example of the internal tool building. And you mentioned you plan one month behind the product teams. What are some other initiatives that have been successful?

Lukas: 00:30:56.727 I think working with Kubernetes resources is something that's notoriously difficult for people who don't have sufficient knowledge, I would say. This came originally as a request from our own engineers saying, "Hey, basically, we need your help working with Helm charts." And we thought, "Hey, this is a really good point, so let's set up some clear guidance, some blueprints, I guess, for our two major platforms, which are Kotlin services and Ruby services, and plan how we can improve this in the future." The holy grail would be to have everything in Backstage, where, since it's a developer portal, a developer would come and create all the necessary infrastructure for a service from there, much like a lightweight Heroku or any other platform as a service. So basically, building a platform for our own developers: I think that's what's in store for us in the future. Something that we're actively working on.

Ben: 00:32:08.946 So as you grow your engineering team and keep on expanding, you build tools that really empower them to get their job done. That's the platform for hyper-growth.

Lukas: 00:32:18.752 Exactly.

Advice for developing a platform team to support hypergrowth

Ben: 00:32:19.762 As we wrap things up here, what would be your advice for teams considering developing a platform team to do the same thing as you have at Productboard?

Lukas: 00:32:28.395 I reckon at some point, you can spot the signs: there are huge bottlenecks, and knowledge is pooled in very small circles. And I think that's the right time to start thinking about whether there is some space to create a platform team or tribe or whatever, that would basically work for the company itself; not necessarily creating any value themselves but, in turn, empowering others. For us, this mark came at around, I guess, 50 engineers. That's where I think it starts to really become a need.

Ben: 00:33:05.821 Yeah, definitely. Everywhere I've worked, CI/CD always breaks at like 20, 50, 100 people, and you have to rebuild it. Same with deployment pipelines. Everything, I guess, gets more advanced. Do you have any tips for when to optimize for different stages?

Lukas: 00:33:18.498 Actually, I think the idea of a platform came way back when it was only one team, called front-end platform. And since Productboard is fairly front-end-heavy, we came up with this idea of sharing resources across all the front-end teams. I'm not a front-end developer myself, so take it with a grain of salt, but basically, there were libraries to be shared and some sort of predefined components that it was a good idea to just maintain, by a dedicated team, in a single repository, in a single space. Since this approach was very successful, I think that's what sparked the idea of creating a platform tribe that would encompass this whole philosophy. So right now, not only do we have something resembling a front-end platform; we also have a dedicated developer experience team, and so on and so forth. In terms of when to stop optimizing, that was your question, wasn't it?

Ben: 00:34:23.600 I guess, do you want to early optimize? I guess at different stages, there's different problems to solve for. Maybe I actually don't know what my closing question is.

Lukas: 00:34:31.920 Yeah. Does anyone ever preemptively optimize? I think that's actually a pitfall. You should look into the future, but not somewhere you're not even heading, I guess. It's good to have a pretty clear outlook, two to four quarters ahead, of where you want to be. That's typically how our planning goes; we try to plan for the next three to four quarters in advance. I think it's not actually about being certain that this will happen in Q4 2023, because you obviously have no idea what's going to happen. But in terms of knowing what resources you'll need, in terms of team staffing, and basically where you want to go, that's pretty advantageous.

Ben: 00:35:20.148 I think that's a great way to wrap it up. I really enjoyed this conversation. I think it's great that everything is so self-service. And I guess in many ways, you kind of get out of the way of the developers, so it's almost like you're not even there, but they have the platform superpower to keep on growing the organization. So thanks, Lukas. Do you have any last closing thoughts?

Lukas: 00:35:41.045 Yeah, I think it's a very typical thought that you're doing your job right as an Infra engineer when basically nobody needs you ever.

Ben: 00:35:52.104 Yes. Perfect. I think it's a great way to end. [music] This podcast is brought to you by Teleport. Teleport is the easiest, most secure way to access all your infrastructure. The open-source Teleport Access Platform consolidates connectivity, authentication, authorization, and auditing into a single platform. By consolidating all aspects of infrastructure access, Teleport reduces attack surface area, cuts operational overhead, easily enforces compliance, and improves engineering productivity. Learn more at goteleport.com or find us on GitHub, github.com/gravitational/teleport.
