Authoring K8s Manifests

Note: This is an internal blog post that I wrote at our company. When I interact with people in the tech community they’re often curious about how different teams think about these problems more broadly, so I thought I’d include this. The audience was internal Basis employees, so some of the references may not make sense.

There are no right solutions

As humans we’re obsessed with not making the wrong choice. Everything from where you go to school to whether you should order the chicken or the steak is besieged by the weight of making “the wrong” choice. But that framing suggests that right and wrong are absolutes, as if you could plug in all the variables of a given situation and arrive at a conclusive answer. This couldn’t be further from the truth. Not in life and definitely not in engineering.

Choices are about trade-offs. Depending on what you’re optimizing for, one set of trade-offs seems more practical than another. For example, investing your savings is a good idea, but the vehicles you use to invest differ based on your goals. If you need the money soon, a money market account offers flexibility but at the expense of good returns. The stock market might offer higher returns but at the risk of losing some of the principal. Do you need the money in 2 years or 20 years? How much do you need it to grow? How quickly?

The economist Thomas Sowell famously said “There are no solutions, there are only trade-offs; and you try to get the best trade-off you can get, that’s all you can hope for.”

This statement holds true in software engineering as well.

Imperative vs Declarative Manifest Authoring

When it comes to Kubernetes manifests, there really is only one method of applying those manifests and that’s using a declarative model. We tell Kubernetes what it is we want the final state to look like (via the manifests) and we rely on Kubernetes to figure out the best way to get us to that state.

With Kubernetes all roads lead to a declarative document being applied to the cluster, but how we author those manifests can take on an imperative bent if we want, using various template engines like Helm, Jsonnet or the now defunct Ksonnet. But templating languages provide a power and flexibility that allows us to do things we probably shouldn’t do, given our past experiences. Templating opens the door to impeding some of the goals we have for the Kubernetes project and the experience we’re specifically optimizing for. I’d prefer to stay away from templating layers as much as possible and be explicit in our manifest declarations.
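To make the contrast concrete, here’s a minimal sketch of the kind of plain, explicit manifest I mean (the application name, labels and image are made up for illustration). Everything a reader needs is stated directly, where a Helm chart would typically swap values like the replica count and image tag for {{ .Values.* }} placeholders that get resolved at deploy time:

```yaml
# A plain declarative manifest: no template placeholders, nothing resolved at deploy time
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-api                # hypothetical application name
spec:
  replicas: 3
  selector:
    matchLabels:
      app: example-api
  template:
    metadata:
      labels:
        app: example-api
    spec:
      containers:
        - name: example-api
          # explicit image and tag, readable at a glance
          image: registry.example.com/example-api:1.4.2
```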

What are we optimizing for?

In order to really evaluate the tools we’ve got to discuss what it is we’re optimizing for. These optimizations are in part due to past experiences with infrastructure tools as well as acknowledgements of the new reality we’ll be living in with this shared responsibility model for infrastructure.

Easy to read manifests to increase developer involvement

With the move to Kubernetes we’re looking to get developers more involved with the infrastructure that runs their applications. There won’t be a complete migration of ownership to development teams, but we do anticipate more involvement from more people. The team that works on infrastructure now is only 6 people. The development org is over 40 people. That said, the reality is that many of these developers will only look at the infrastructure side of things 4 or 5 times a year. When they do look at it, we want that code to be optimized for reading rather than writing. The manifests should be clear and easy to follow.

This will require us to violate some principles and practices like code reuse and DRY, but after years of managing infrastructure code we find that, more often than not, each case requires enough customization that the number of parameters and inputs needed to make code actually reusable balloons quickly and becomes unwieldy. Between our goals and the realities of infrastructure code reuse, using clear and plain manifest definitions is a better choice for us. We don’t currently have the organizational discipline to be able to reject certain customizations of an RDS instance. And honestly, rejecting a customization request because we don’t have the time to modify the module/template doesn’t feel like a satisfying path forward.

A single deployment tool usable outside the cluster

Because of the application awareness our current orchestration code has, we end up with multiple deployment code bases that are all fronted by a common interface (Marvin, the chatbot). Even with Marvin serving as an abstraction layer, you can see chinks in the facade as different deployment commands have slightly different syntax and/or feature support. In the Kubernetes world we want to rely on a single deploy tool that tries to keep things as basic as kubectl apply when possible. Keeping the deploy tool as basic as possible will hopefully allow us to leverage the same tool in local development environments. In order to achieve this goal, we’ll need to standardize on how manifests are provided to the deployment tool.
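As a sketch of what “as basic as kubectl apply” could look like (the directory layout and file names here are illustrative, not a final design), the deploy tool would point kubectl apply -k at a directory whose kustomization.yaml lists everything it needs, so the same invocation works on a laptop and in the pipeline:

```yaml
# kustomization.yaml at the root of an application's manifest directory (illustrative)
# Applied the same way everywhere: kubectl apply -k <this directory>
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
  - service.yaml
  - configmap.yaml
```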

There is a caveat to this however. The goal of a single method to apply manifests is separate and distinct from how the manifests are authored. One could theoretically author the manifests with a template tool like Helm and then provide the final output to the deploy tool. This would violate another goal of easy-to-read manifests, but I wanted to call out that it could be done. Having some dynamic preprocessor that runs ahead of the deploy tool and commits the final version of the manifest to the application repository could be a feasible solution.

Avoiding lots of runtime parameters

Another issue that we see in today’s infrastructure is that our deploy tool requires quite a bit of runtime information. A lot of this runtime information happens under the hood, so while the user isn’t required to provide it, Marvin infers a lot of information based on what the user does provide. For example, when a user provides the name “staging0x” as the environment, Marvin then recognizes that he needs to switch to the production account vs the preproduction account. He knows there’s a separate set of Consul servers that need to be used. He knows the name of the VPC that it needs to be created in as well as the Class definition of the architecture. (Class definitions are our way to scope the sizing requirements of the environment. So a class of “production” would give you one sizing and count of infrastructure, while a class of “integration” or “demo” will give you another)

This becomes problematic when we’re troubleshooting things in the environment. For example, if you want to manually run terraform apply or even a terraform destroy, many times you have to look at previously run commands to get a sense of what some of the required values are. In some cases, like during a Terraform upgrade, you might need to know precisely what was provided at runtime for the environment in order to continue to properly manage the infrastructure. This has definitely complicated the upgrades of components that are long lived, especially areas where state is stored. (Databases and ElastiCache for example)

Much of the need for this comes from the technical debt incurred when we attempted to create reusable modules for various components. Each reusable module would create a sort of bubble effect, where input parameters for a module at level 3 in the stack would necessitate that we ask for that value at level 1 so that we can propagate it down. As we added support for a new parameter to support a specific use case, it would have the potential to impact all of the other pieces that use it. (Some of this is caused by limitations of the HCL language that Terraform uses)

Nevertheless, when we use templating tools we open the door to code reuse as well as levels of inference that make the manifest harder to read. (I acknowledge that putting “code reuse” in a negative context seems odd) This kind of code reuse in particular tends to be the genesis of parameterization that ultimately bubbles its way up the stack. Perhaps not on day one, but by day 200 it seems almost too tempting to resist.

As an organization, we’re relatively immature as it relates to this shared responsibility model for infrastructure. A lot of the techniques that could mitigate my concerns haven’t been battle tested in the company. After some time running in this environment and getting used to the developer and operations interactions my stance may soften, but for day one it is a little bit too much to add additional processes to circumvent the shortcomings.

Easily repeated environment creation

In our internal infrastructure as code (IaC) testing we would often have situations where coordinating infrastructure changes that needed to be coupled with code changes was a bit of a disaster. Terraform would be versioned in one repository and SaltStack code in another, but the two changes would need to be tested together. This required either a lot of coordination or a ton of manual test environment setup. To deal with the issue more long-term we started to include a branch parameter on all environment creation commands, so that you could specify a custom SaltStack server, a specific Terraform branch and a specific SaltStack branch. The catch was that you had to ensure these parameters were enacted all the way down the pipeline. The complexity this created is one of the reasons I’ve been leaning towards having the infrastructure code and the application code exist in the same repository.

Having the two together also allows us to hardcode information so that when we deploy a branch, we’re getting a matching set of IaC and application code by setting the image tag in the manifest to match the image that was built. (There are definite implementation details to work out on this) This avoids the issue of infrastructure code being written for the expectations of version 3.0 of the application code, but then suddenly being provided with version 2.0 of the application code and things breaking.
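One possible implementation, assuming we lean on Kustomize’s image transformer (the names and tag below are made up): after building, CI could run kustomize edit set image and commit a kustomization.yaml that pins the tag to the image it just produced, so the manifest in the branch always points at the matching build:

```yaml
# kustomization.yaml (illustrative) after CI pins the tag of the image built from this branch
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - deployment.yaml
images:
  - name: example-api                          # image name as referenced in deployment.yaml
    newName: registry.example.com/example-api
    newTag: "1.4.2-featurebranch.12"           # matches the tag the build just produced
```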

We see this when we’re upgrading core components that are defined at the infrastructure layer or when we roll out new application environment requirements, like AuthDB. When AuthDB rolled out, it required new infrastructure, but only for versions of the software that were built off the AuthDB branch. It resulted in us spinning up AuthDB infrastructure whether you needed it or not, prolonging and sometimes breaking the creation process for environments that didn’t need AuthDB.

Assuming we can get over a few implementation hurdles, this is a worthwhile goal. It will create a few headaches for sure. How do we make a small infrastructure change in an emergency (like a replica count) without triggering an entire CI build process? How do we ensure OPS is involved with changes to the /infrastructure directory? All things we’ll need to solve for.

Using Kustomize Exclusively

The mixture of goals and philosophies has landed us on using Kustomize exclusively in the environment. Along with that we’d like to adopt many of the Kustomize philosophies around templating, versioning and manifest management.

While Helm has become a popular method for packaging Kubernetes applications, we’ve avoided authoring Helm charts in order to minimize not just the tools, but also the number of philosophies at work in the environment. By using Kustomize exclusively, we acknowledge that some things will be incredibly easy and some things will be more difficult than they need to be. But that trade-off is part of adhering to an ideology consistently. Some of those trade-offs are established in the Kubernetes team’s Eschewed Features document. Again, this isn’t to say one approach is right and one is wrong. The folks at Helm are serving the needs of many operators. But the Kustomize approach aligns more closely with the ProdOps worldview of running infrastructure.
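For a sense of what that looks like in practice, here’s a rough sketch of the base-and-overlay model Kustomize encourages (the paths, names and values are illustrative). The overlay stays small and explicit, patching only what differs for that environment, and everything lives in the repository rather than in runtime parameters:

```yaml
# overlays/staging/kustomization.yaml (illustrative)
apiVersion: kustomize.config.k8s.io/v1beta1
kind: Kustomization
resources:
  - ../../base                  # the shared, plainly written manifests
patches:
  - path: replica-count.yaml    # a small strategic-merge patch checked into the repo
```

```yaml
# overlays/staging/replica-count.yaml (illustrative) -- the entire patch, readable at a glance
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-api
spec:
  replicas: 2
```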

We’re looking to leverage Kustomize so that we:

  • Don’t require preprocessing of manifests outside of the Kustomize commands
  • Are as explicit as possible in manifest definitions, making it easy for people who aren’t in the code base often to read them and get up to speed
  • Can easily recreate environments without needing to store or remember the runtime parameters that were passed
  • Minimize the number of tools used in the deployment pipeline

I’m not saying it’s the right choice. But for ProdOps it’s the preferred choice. Some pain will definitely follow.

Jeffery Smith @darkandnerdy