A recipe for a bespoke on-prem Kubernetes cluster


So you want to build yourself a Kubernetes cluster? You have your reasons. Some may want to utilize the hardware they own, some may not fully trust these fancy cloud services, and others simply want to have a choice and build themselves a hybrid solution.
There are a couple of products available that I’ve reviewed, but you’ve decided to build a platform from scratch. And again, there are a myriad of reasons why it might be a good idea and also many that would convince you it’s not worth your precious time. In this article, I will focus on providing a list of things to consider when starting a project building a Kubernetes-based platform using only the most popular open source components.

Target groups

Before we jump into the technicalities, I want to describe three target groups that are referred to in the sections below.

  • Startups (SUP) - very small companies or the ones with basic needs; their focus is on using the basic Kubernetes API and facilitating services around it
  • Medium businesses (MBU) - medium companies which want to leverage Kubernetes to boost their growth and innovation; their focus is on building a scalable platform that is also easy to maintain and extend
  • Enterprises (ENT) - big companies with even bigger needs, scale, many policies, and regulations; they are the most demanding and are focused on repeatability, security, and scalability (in terms of the growing number of developers and teams working on their platform)

All these groups have different needs and thus they should build their platform in a slightly different way with different solutions applied to particular areas. I will refer to them using their abbreviations or as ALL when referring to all of them.

Installation

When to apply: Mandatory for ALL
Purpose: To have a robust and automated way of managing your cluster(s)

When deciding on installing Kubernetes without using any available distribution you have a fairly limited choice of installers.
You can try using kubeadm directly or use the more generic kubespray. The latter will help you not only install, but also maintain your cluster (upgrades, node replacement, cluster configuration management).
Both of these are universal and are unaware of how cluster nodes are provisioned. If you wish to use an automated solution that would also handle provisioning cluster nodes then Metal3 could be something you might want to try. It’s still in the alpha stage, but it looks promising.

If you want a better and more cloud-native way of managing your clusters that would enable easy scaling, then you may want to try the ClusterAPI project. It supports multiple cloud providers, but it can also be used in on-prem environments with the aforementioned Metal3, vSphere, or OpenStack.

One more thing worth noting here: the operating system used by cluster nodes. Since the future of CentOS seems unclear, Ubuntu becomes the main building block for bespoke Kubernetes clusters. Some may want to choose a slim alternative that has replaced CoreOS - Flatcar Linux.

Cluster autoscaler

When to apply: Highly recommended for ENT, optional for others
Purpose: Automatically scale your platform up and down

If you choose ClusterAPI or your cluster uses some API in another way to manage cluster nodes (e.g. vSphere, OpenStack, etc.) then you should also use the cluster autoscaler component. It is almost a mandatory feature for ENT but it can also be useful for MBU organizations. By forcing nodes to be ephemeral entities that can be easily replaced/removed/added, you decrease the maintenance costs.

Network CNI plugin

When to apply: Mandatory for ALL
Purpose: Connect containers with optional additional features such as encryption

The networking plugin is one of the decisions that need to be taken prudently, as it cannot be easily changed afterward.
To make things brief I would shorten the list to two plugins - Calico or Cilium. Calico is older and maybe a little bit more mature, but Cilium looks very promising and utilizes Linux Kernel BPF. For a more detailed comparison I would suggest reading this review of multiple plugins.
Choose wisely and avoid CNI without NetworkPolicy support - having a Kubernetes cluster without the possibility to implement firewall rules is a bad idea. Both Calico and Cilium support encryption, which is a nice thing to have, but Cilium is able to encrypt all the traffic (Calico encrypts only pod-to-pod).
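To illustrate the kind of firewall rule a NetworkPolicy-capable CNI makes possible, here is a minimal sketch of a default-deny policy for a single namespace. The manifest is built as a Python dictionary and printed as JSON, which kubectl accepts alongside YAML. The namespace name `team-a` and the policy name are assumptions made for the example.

```python
import json


def default_deny_policy(namespace):
    """Build a NetworkPolicy that blocks all ingress and egress in a namespace."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "NetworkPolicy",
        "metadata": {"name": "default-deny-all", "namespace": namespace},
        "spec": {
            # An empty selector matches every pod in the namespace.
            "podSelector": {},
            # Both directions are listed without any allow rules,
            # so all traffic to and from the selected pods is denied.
            "policyTypes": ["Ingress", "Egress"],
        },
    }


# "team-a" is a hypothetical namespace used only for illustration.
policy = default_deny_policy("team-a")
print(json.dumps(policy, indent=2))
```

The output can be piped to `kubectl apply -f -`. More permissive policies can then be added per application, but they only take effect when the chosen CNI plugin actually enforces NetworkPolicy.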

Ingress controller

When to apply: Mandatory for ALL
Purpose: Provide an easy and flexible way to expose web applications with optional advanced features

Ingress is a component that can be easily swapped out when the cluster is running. Actually, you can have multiple Ingress controllers by leveraging IngressClass introduced in Kubernetes 1.18.
A comprehensive comparison can be found here, but I would limit it to a select few controllers depending on your needs.
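As a rough illustration of how IngressClass lets several controllers coexist, the sketch below builds a basic Ingress that names its controller through `spec.ingressClassName`. It assumes the `networking.k8s.io/v1` Ingress API (Kubernetes 1.19 or newer); the host, Service name, and class name `nginx` are placeholders, not values from any real cluster.

```python
import json


def basic_ingress(name, host, service, port, ingress_class):
    """Build an Ingress that routes all paths of one host to one Service."""
    return {
        "apiVersion": "networking.k8s.io/v1",
        "kind": "Ingress",
        "metadata": {"name": name},
        "spec": {
            # Selects which installed Ingress controller handles this object.
            "ingressClassName": ingress_class,
            "rules": [
                {
                    "host": host,
                    "http": {
                        "paths": [
                            {
                                "path": "/",
                                "pathType": "Prefix",
                                "backend": {
                                    "service": {
                                        "name": service,
                                        "port": {"number": port},
                                    }
                                },
                            }
                        ]
                    },
                }
            ],
        },
    }


# All argument values below are hypothetical.
ingress = basic_ingress("web", "web.example.com", "web-svc", 80, "nginx")
print(json.dumps(ingress, indent=2))
```

Because only fields from the core Ingress API are used, the same object should work with any conformant controller. Switching controllers then means changing the class name only.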

For those looking for compatibility with other Kubernetes clusters (e.g. hybrid solution), I would suggest starting with the most mature and battle-tested controller - nginx ingress controller. The reason is simple - you need only basic features described in Ingress API that have to be implemented by every Ingress controller. That should cover 90% of cases, especially for the SUP group.

If more features are required (such as sophisticated HTTP routing, authentication, authorization, etc.) then the following options are the most promising:

  • Contour - it’s the only CNCF project that is in the Incubating maturity level group. And it’s based on Envoy which is the most flexible proxy available out there.
  • Ambassador - has nice features, but many of them are available in the paid version. And yes - it also uses Envoy.
  • HAProxy from HAProxy Technologies - for those who are familiar with HAProxy and want to leverage it to provide a robust Ingress controller
  • Traefik - they have an awesome logo and if you’ve been using it for some Docker load-balancing then you may find it really useful for Ingress as well

Monitoring

When to apply: Mandatory for ALL (unless an existing monitoring solution compatible with Kubernetes exists)
Purpose: Provide insights on cluster state for operations teams

There is one king here - just use Prometheus. Probably the best approach would be using an operator that would install Grafana alongside some predefined dashboards.

Logging

When to apply: Mandatory for ALL (unless an existing central logging solution is already in place)
Purpose: Provide insights on cluster state for operations teams

It’s quite similar to monitoring - the majority of solutions are based on Elasticsearch, Fluentd and Kibana. This suite has broad community support and many problems have been solved and described thoroughly in many posts on the web. ALL should have a logging solution for their platforms and the easiest way to implement it is to use an operator like this one or a Helm Chart like this based on Open Distro (it’s an equivalent of Elasticsearch with a more lenient, open source license).

Tracing

When to apply: Optional for ALL
Purpose: Provide insights and additional metrics useful for application troubleshooting and performance tuning

Tracing is a feature that will be highly coveted in really big and complex environments. That’s why ENT organizations should adopt it and the best way is to implement it using Jaeger. It’s one of the graduated CNCF projects, which only makes it more appealing, as it has proven to be not only highly popular but also to have a healthy community around it.
Implementation requires some work on the application’s part, but the service itself can be easily installed and maintained using this operator.

Backup

When to apply: Mandatory for ENT, optional for the rest
Purpose: Apply the “Redundancy is not a backup solution” approach

ALL should remember that redundancy is not a backup solution. Although with a properly implemented GitOps solution, where each change of the cluster state goes through a dedicated git repository, the disaster recovery can be simplified, in many cases, it’s not enough. For those who plan to use persistent storage, I would recommend implementing Velero.

Storage

When to apply: For ALL if stateful applications are planned to be used
Purpose: Provide flexible storage for stateful applications and services

The easiest use case of Kubernetes is stateless applications that don’t need any storage for keeping their state. Most microservices use some external service (such as databases) that can be deployed outside of a cluster.
If persistent storage is required it can still be provided using already existing solutions from outside a Kubernetes cluster. There are some drawbacks (i.e. the need to provision persistent volumes manually, less reliability and flexibility) in many of them and that’s why keeping storage inside a cluster can be a viable and efficient alternative.
I would limit the choices for such storage to the following projects:

Rook is the most popular and when properly implemented (e.g. deployed on a dedicated cluster or on a dedicated node pool with monitoring, alerting, etc.) can be a great way of providing storage for any kind of workloads, including even production databases (although this topic is still controversial and we all need time to get accustomed to this way of running them).

Security

This part is crucial for organizations that are focused on providing secure platforms for the most sensitive parts of their systems.

Non-root containers

When to apply: Mandatory for ENT and probably MBU
Purpose: Reduce the risk of vulnerabilities in applications, or in the operating system they use, being exploited

OpenShift made a very brave and good decision by providing a default setting that forbids running containers under the root account. I think this setting should also be implemented by ALL organizations that want to increase the security level of workloads running on their Kubernetes clusters.
It is quite easy to achieve by enabling the PodSecurityPolicy admission controller and applying proper rules (note that PodSecurityPolicy was deprecated in Kubernetes 1.21 and removed in 1.25; on newer clusters the built-in Pod Security Admission controller fills this role). It’s not even an external project, but it’s a low-hanging fruit that should be mandatory to implement for larger organizations. This, however, has consequences for which images can be used on a platform. Most ”official” images available on Docker Hub run as root, but I can see this changing, and hopefully the trend will continue.
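Whatever admission mechanism enforces the rule, the setting being checked is the `runAsNonRoot` field of a pod's or container's `securityContext`. The sketch below is a plain Python check written for illustration, not part of any Kubernetes component. It reports whether every container in a pod manifest is required to run as a non-root user, taking into account that a container-level setting overrides the pod-level one. The pod and image names are hypothetical.

```python
def runs_as_non_root(pod):
    """Return True only if every container is forced to run as non-root."""
    spec = pod.get("spec", {})
    # Pod-level default; containers inherit it unless they override it.
    pod_default = spec.get("securityContext", {}).get("runAsNonRoot", False)
    containers = spec.get("containers", []) + spec.get("initContainers", [])
    return bool(containers) and all(
        c.get("securityContext", {}).get("runAsNonRoot", pod_default) is True
        for c in containers
    )


# Hypothetical pod that declares the restriction at pod level.
hardened_pod = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "web"},
    "spec": {
        "securityContext": {"runAsNonRoot": True, "runAsUser": 10001},
        "containers": [{"name": "web", "image": "registry.example.com/web:1.0"}],
    },
}

print(runs_as_non_root(hardened_pod))
```

With `runAsNonRoot` set, the kubelet refuses to start a container whose image would run as UID 0, which is why images that assume root need to be rebuilt or replaced.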

Enforcing policies with OpenPolicyAgent

When to apply: Mandatory for ENT, optional for others
Purpose: Enforce security and internal policies

Many organizations produce tons of security policies written down in some documents. They are often enforced by processes and audited yearly or even less often. In many cases, they aren’t adjusted to the real world and were created mostly to meet some requirements instead of protecting the platform and ensuring that best security practices are in place. It’s time to start enforcing these policies on the API level and that’s where OpenPolicyAgent comes into play. It’s probably not required for small organizations, but it’s definitely mandatory for larger ones where risks are much higher. In such organizations, properly configured rules may:

  • prevent pulling images from untrusted container registries
  • prevent pulling images outside of a list of allowed container images
  • enforce the use of specific labels describing a project and its owner
  • enforce best practices that may have an impact on the platform reliability (e.g. defining resource requests and limits, use of liveness and readiness probes)
  • granularly restrict the use of the platform’s API (Kubernetes RBAC can’t be used to specify exceptions)
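To make the rules above more concrete, here is a minimal sketch of the checks such a policy could perform on a pod before it is admitted. It is plain Python written for illustration only: OpenPolicyAgent policies are actually written in its own Rego language, and this code does not use any OPA API. The registry name and the label names `project` and `owner` are assumptions.

```python
ALLOWED_REGISTRIES = {"registry.example.com"}  # hypothetical trusted registry
REQUIRED_LABELS = ("project", "owner")         # hypothetical label names


def image_registry(image):
    """Extract the registry host from an image reference (Docker Hub if absent)."""
    first, _, rest = image.partition("/")
    if rest and ("." in first or ":" in first or first == "localhost"):
        return first
    return "docker.io"


def violations(pod):
    """Return a list of human-readable policy violations for a pod manifest."""
    found = []
    labels = pod.get("metadata", {}).get("labels", {})
    for label in REQUIRED_LABELS:
        if label not in labels:
            found.append("missing label: " + label)
    for c in pod.get("spec", {}).get("containers", []):
        name = c.get("name", "?")
        if image_registry(c.get("image", "")) not in ALLOWED_REGISTRIES:
            found.append(name + ": image from untrusted registry")
        if not c.get("resources", {}).get("limits"):
            found.append(name + ": no resource limits")
        for probe in ("livenessProbe", "readinessProbe"):
            if probe not in c:
                found.append(name + ": no " + probe)
    return found


# Hypothetical pod that breaks several rules at once.
bad_pod = {
    "metadata": {"name": "web", "labels": {"project": "shop"}},
    "spec": {"containers": [{"name": "web", "image": "nginx:1.25"}]},
}
for problem in violations(bad_pod):
    print(problem)
```

In a real deployment the equivalent logic would run inside an admission webhook, so a non-empty list of violations would cause the API server to reject the request.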

Authentication

When to apply: Mandatory for ALL, for some SUP it may be optional
Purpose: Provide a way for users to authenticate to the platform and be authorized to use it

This is actually a mandatory component for all organizations. One thing that may surprise many is how Kubernetes treats authentication and how it relies on external sources for providing information on users. This means almost unlimited flexibility and at the same time adds even more work and requires a few decisions to be made.
To make it short - you probably want something like Dex that acts as a proxy to your real Identity Provider (Dex supports many of these, including LDAP, SAML 2.0, and most popular OIDC providers). To make it easier to use you can add Gangway. It’s a pair of projects that are often used together.

You may find Keycloak as an alternative that is more powerful, but at the same time is also more complex and difficult to configure.

Better secret management

When to apply: Mandatory for ENT
Purpose: Provide a better and more secure way of handling confidential information on the platform

For smaller projects and organizations, encrypting Secrets in the repo where they are stored should be sufficient. Tools such as git-crypt, git-secret or SOPS do a great job in securing these objects. I especially recommend the last one - SOPS is very universal and combined with GPG can be used to create a very robust solution.
For larger organizations, I would recommend implementing HashiCorp Vault, which can be easily integrated with any Kubernetes cluster. It requires a bit of work, so using it for small clusters with few applications makes little sense. For those who have dozens or even hundreds of credentials or other confidential data to store, Vault can make life easier. It offers auditing, built-in versioning, seamless integration and, as its killer feature, dynamic secrets. By implementing access to external services (e.g. various cloud providers, LDAP, RabbitMQ, ssh and database servers) using credentials created on demand with a short lifetime, you set a different level of security for your platform.

Security audits

When to apply: Mandatory for ENT and MBU
Purpose: Get more information on potential security breaches

When handling a big environment, especially one that needs to be compliant with some security standards, providing a way to report suspicious activity is one of the most important requirements. Setting up auditing for Kubernetes is quite easy and it can even be enhanced with more granular information on specific events generated not by API components, but by containers running on a cluster. The project that brings these additional features is Falco. It’s really amazing how powerful this tool is - it uses the Linux kernel’s internal API to trace all activity of a container such as access to files, sending or receiving network traffic, access to Kubernetes API, and much more. The built-in rules already provide some useful information, but they need to be adjusted for specific needs to get rid of false positives and to trigger alerts when unusual activity is discovered on the cluster.

Container images security scanning

When to apply: Mandatory for ALL
Purpose: Prevent containers with known critical vulnerabilities from running

The platform security mostly comes down to vulnerabilities in the containers running on it. That’s why it is so important to ensure that the images used to run these containers are scanned for the most critical vulnerabilities. This can be achieved in two ways - one is by scanning the images on a container registry and the other is by including an additional step in the CI/CD pipeline used for the deployment.

It’s worth considering keeping container images outside of the cluster and relying on existing container registries such as Docker Hub, Amazon ECR, Google GCR or Azure ACR. Yes - even when building an on-prem environment, sometimes it is just easier to use a service from a public cloud provider. It is especially beneficial for smaller organizations that don’t want to invest too much time in building a container registry but still want to provide a proper level of security and reliability.

There is one major player in the on-prem container registries market that should be considered when building such a service. It’s Harbor which has plenty of features, including security scanning, mirroring of other registries, and replication that allows adding more nines to its availability SLO. Harbor has a built-in Trivy scanner that works pretty well and is able to find vulnerabilities on the operating system level and also in the application packages.

Trivy can also be used as a standalone tool in a CI/CD pipeline to scan the container image built by one of the stages. A one-line Trivy command might protect you from serious trouble, as many can be surprised by the number of critical vulnerabilities that exist even in the official Docker images.
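A minimal sketch of such a pipeline step follows, assuming the `trivy` binary is installed on the build agent. It uses Trivy's `--severity` flag to limit the report to critical findings and `--exit-code 1` to make the scan fail when any are found. The image reference is a placeholder, and the helper function names are made up for this example.

```python
import subprocess


def trivy_command(image, severity="CRITICAL"):
    """Build the argument list for: trivy image --exit-code 1 --severity ... IMAGE"""
    return ["trivy", "image", "--exit-code", "1", "--severity", severity, image]


def image_passes_scan(image):
    """Run the scan; a non-zero exit code means findings (or a scan error)."""
    return subprocess.run(trivy_command(image)).returncode == 0


# Hypothetical image name; only the command is printed here, nothing is executed.
print(" ".join(trivy_command("registry.example.com/team/app:1.0")))
```

In a pipeline, calling `image_passes_scan` after the build stage and aborting on `False` keeps images with critical vulnerabilities from reaching the registry or the cluster.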

Extra addons

There are some interesting addons that extend the basic features of Kubernetes.

User-friendly interface

When to apply: Mandatory for ENT and MBU
Purpose: Allow less experienced users to use the platform

Who doesn’t like a nice GUI that helps to get a quick overview of what’s going on with your cluster and applications running on it? Even I crave such interfaces and I spend most of my time in my command line or with my editor. These interfaces when designed properly can speed up the process of administration and just make the work with the Kubernetes environment much more pleasant.
The ”official” Kubernetes dashboard project is very basic and it’s not the tool that I would recommend for beginners, as it may actually scare people off instead of drawing them to Kubernetes.
I still believe that OpenShift’s web console is one of the best, but unfortunately it cannot be easily installed with any Kubernetes cluster. If it was possible then it would definitely be my first choice.
Octant looks like an interesting project that is extensible and there are already useful plugins available (e.g. Aqua Security Starboard). It’s more of a platform than a simple web console, as it actually doesn’t run inside a cluster, but on a workstation. The other contestant in the UI category is Lens. It’s also a standalone application. It works pretty well and shows nice graphs when Prometheus is installed on the cluster.

Service mesh

When to apply: Optional for ALL
Purpose: Enable more advanced traffic management, more security and flexibility for the applications running on the platform

Before any project name appears here there’s a fundamental question that needs to be asked - do you really need a service mesh for your applications? I wouldn’t recommend it for organizations which are just starting their journey with cloud native workloads. Having an additional layer can make the not-so-trivial management of containers even more complex and difficult. Maybe you want to use a service mesh only to encrypt traffic? Consider a proper CNI plugin that would bring this feature transparently. Maybe advanced deployment seems like a good idea, but did you know that even the basic Nginx Ingress controller supports canary releases? Introduce a service mesh only when you really need a specific feature (e.g. multi-cluster communication, traffic policy, circuit breakers, etc.). Most readers would probably be better off without a service mesh, and for those prepared for the additional effort related to increased complexity the choice is limited to a few solutions.
The first and most obvious one is Istio. The other that I can recommend is Consul Connect from HashiCorp. The former is the most popular one and is often provided as an add-on in the Kubernetes services in the cloud. The latter seems to be much simpler and easier to use. It’s also a part of Consul and together they enable creation and management of multi-cluster environments.

External DNS

When to apply: Optional for ALL, recommended for dynamic environments
Purpose: Decrease the operational work involved with managing new DNS entries

Smaller environments will probably not need many DNS records for external access via load balancer or ingress services. For larger and more dynamic ones, having a dedicated service managing these DNS records may save a lot of time. This service is external-dns and it can be configured to manage DNS records on most DNS services available in the cloud and also on traditional DNS servers such as BIND. This addon works best with the next one, which adds TLS certificates to your web applications.
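As a sketch of how a record is requested, external-dns can read the desired name from the `external-dns.alpha.kubernetes.io/hostname` annotation on a Service. The helper below adds that annotation to a Service manifest without modifying the original object. The Service definition and the hostname `web.example.com` are assumptions for the example.

```python
import copy
import json

HOSTNAME_ANNOTATION = "external-dns.alpha.kubernetes.io/hostname"


def with_dns_name(service, hostname):
    """Return a copy of a Service manifest annotated with the desired DNS name."""
    annotated = copy.deepcopy(service)
    metadata = annotated.setdefault("metadata", {})
    metadata.setdefault("annotations", {})[HOSTNAME_ANNOTATION] = hostname
    return annotated


# Hypothetical Service exposed through a load balancer.
service = {
    "apiVersion": "v1",
    "kind": "Service",
    "metadata": {"name": "web"},
    "spec": {
        "type": "LoadBalancer",
        "selector": {"app": "web"},
        "ports": [{"port": 443, "targetPort": 8443}],
    },
}

print(json.dumps(with_dns_name(service, "web.example.com"), indent=2))
```

Once the Service receives an external address, external-dns is expected to create or update the matching record at the configured DNS provider.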

Cert-manager

When to apply: Optional for ALL, recommended for dynamic environments
Purpose: Get trusted SSL certificates for free!

Do you still want to pay for your SSL/TLS certificates? Thanks to Let’s Encrypt you don’t need to. But free certificates are just one of Let’s Encrypt’s features. Use of Let’s Encrypt has been growing rapidly over the past few years, and the reason it should at least be considered as a part of a modern Kubernetes platform is how easy it is to automate. There’s a dedicated operator called cert-manager that makes the whole process of requesting and refreshing certificates very quick and transparent to applications. Having trusted certificates saves a lot of time and trouble for those who manage many web services exposed externally, including test environments. Just ask anyone who has had to inject custom certificate authority certificates into dozens of places just to make all the components talk to each other. And cert-manager can be used for internal Kubernetes components as well. It’s one of my favourite addons and I hope many will appreciate it as much as I do.
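The automation boils down to two objects, sketched below with the `cert-manager.io/v1` API: a ClusterIssuer that points at the Let’s Encrypt ACME endpoint and a Certificate that asks for a certificate to be stored in a Secret. The issuer name, e-mail address, ingress class, and DNS name are assumptions for the example.

```python
import json

LETS_ENCRYPT_PROD = "https://acme-v02.api.letsencrypt.org/directory"


def cluster_issuer(name, email, ingress_class):
    """Build an ACME ClusterIssuer that solves HTTP-01 challenges via an Ingress."""
    return {
        "apiVersion": "cert-manager.io/v1",
        "kind": "ClusterIssuer",
        "metadata": {"name": name},
        "spec": {
            "acme": {
                "server": LETS_ENCRYPT_PROD,
                "email": email,
                # Secret in which the ACME account key is stored.
                "privateKeySecretRef": {"name": name + "-account-key"},
                "solvers": [{"http01": {"ingress": {"class": ingress_class}}}],
            }
        },
    }


def certificate(name, namespace, dns_names, issuer):
    """Build a Certificate whose key pair ends up in the Secret '<name>-tls'."""
    return {
        "apiVersion": "cert-manager.io/v1",
        "kind": "Certificate",
        "metadata": {"name": name, "namespace": namespace},
        "spec": {
            "secretName": name + "-tls",
            "dnsNames": list(dns_names),
            "issuerRef": {"name": issuer, "kind": "ClusterIssuer"},
        },
    }


# All names below are hypothetical.
issuer = cluster_issuer("letsencrypt-prod", "ops@example.com", "nginx")
cert = certificate("web", "default", ["web.example.com"], "letsencrypt-prod")
print(json.dumps([issuer, cert], indent=2))
```

HTTP-01 validation requires the host to be reachable from the internet, so clusters that are purely internal would more likely need a DNS-01 solver instead.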

Additional cluster metrics

When to apply: Mandatory for ALL
Purpose: Get more insights and enable autoscaling

There are two additional components that should be installed on clusters used in production. They are metrics-server and kube-state-metrics. The first is required for the internal autoscaler (HorizontalPodAutoscaler) to work, as metrics-server exposes metrics gathered from various cluster components. The second, kube-state-metrics, exposes metrics on the state of Kubernetes objects. I can’t imagine working with a production cluster that lacks these features and all the events that should be a part of standard security review processes and alerting systems.
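To show what metrics-server enables, the sketch below builds a HorizontalPodAutoscaler that targets average CPU utilization and includes the basic scaling formula described in the Kubernetes documentation: desired replicas = ceil(current replicas × current metric / target metric). It assumes the `autoscaling/v2` API (stable since Kubernetes 1.23). The Deployment name and the numbers are placeholders, and the formula deliberately ignores the controller's tolerance and stabilization behaviour.

```python
import json
import math


def cpu_autoscaler(deployment, min_replicas, max_replicas, target_utilization):
    """Build an HPA that scales a Deployment on average CPU utilization."""
    return {
        "apiVersion": "autoscaling/v2",
        "kind": "HorizontalPodAutoscaler",
        "metadata": {"name": deployment},
        "spec": {
            "scaleTargetRef": {
                "apiVersion": "apps/v1",
                "kind": "Deployment",
                "name": deployment,
            },
            "minReplicas": min_replicas,
            "maxReplicas": max_replicas,
            "metrics": [
                {
                    "type": "Resource",
                    "resource": {
                        "name": "cpu",
                        "target": {
                            "type": "Utilization",
                            "averageUtilization": target_utilization,
                        },
                    },
                }
            ],
        },
    }


def desired_replicas(current_replicas, current_utilization, target_utilization):
    """Simplified HPA formula, without tolerance or min/max clamping."""
    return math.ceil(current_replicas * current_utilization / target_utilization)


# Hypothetical workload: 3 replicas at 90% CPU with a 60% target.
print(json.dumps(cpu_autoscaler("web", 2, 10, 60), indent=2))
print(desired_replicas(3, 90, 60))
```

Without metrics-server there is no CPU utilization figure to feed into this calculation, so the autoscaler has nothing to act on.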

GitOps management

When to apply: Optional for ALL, recommended for ENT
Purpose: Decrease the operational work involved with cluster management

It is not that popular yet, but cluster and environment management is going to be an important topic, especially for larger organizations where there are dozens of clusters, namespaces and hundreds of developers working on them. Management techniques involving git repositories as a source of truth are known as GitOps and they leverage the declarative nature of Kubernetes. It looks like ArgoCD has become a major player in this area and installing it on the cluster may bring many benefits for teams responsible for maintenance, but also for security of the whole platform.
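As a sketch of what this looks like in practice, ArgoCD is driven by Application objects that tie a path in a git repository to a destination cluster and namespace. The example below builds one using the `argoproj.io/v1alpha1` API with automated sync enabled. The repository URL, path, and namespaces are assumptions for the example.

```python
import json


def argocd_application(name, repo_url, path, namespace, revision="HEAD"):
    """Build an ArgoCD Application that keeps a namespace in sync with git."""
    return {
        "apiVersion": "argoproj.io/v1alpha1",
        "kind": "Application",
        # Assumes ArgoCD itself is installed in the "argocd" namespace.
        "metadata": {"name": name, "namespace": "argocd"},
        "spec": {
            "project": "default",
            "source": {
                "repoURL": repo_url,
                "path": path,
                "targetRevision": revision,
            },
            "destination": {
                # In-cluster API server address, i.e. the cluster ArgoCD runs in.
                "server": "https://kubernetes.default.svc",
                "namespace": namespace,
            },
            "syncPolicy": {
                # prune removes objects deleted from git;
                # selfHeal reverts manual changes made in the cluster.
                "automated": {"prune": True, "selfHeal": True}
            },
        },
    }


# Hypothetical repository and paths.
app = argocd_application(
    "platform-addons",
    "https://git.example.com/platform/cluster-config.git",
    "addons/production",
    "platform",
)
print(json.dumps(app, indent=2))
```

With automated sync, the git repository stays the source of truth, so every change to the cluster leaves a reviewable commit behind.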

Conclusion

The aforementioned projects do not even begin to exhaust the subject of the solutions available for Kubernetes. This list merely shows how many possibilities are out there, how rich the Kubernetes ecosystem is, and finally how quickly it evolves.
For some it may also be surprising how many features required for running production workloads are missing from standard Kubernetes. Even the multiple versions of Kubernetes-as-a-Service available on major cloud platforms are missing most of these features, let alone the clusters that are built from scratch for on-prem environments. It shows how difficult this process of building a bespoke Kubernetes platform can become, but at the same time those who manage to put it all together can be assured that their creation will bring their organization to the next level of automation, reliability, security and flexibility.
For the rest there’s another and easier path - using a Kubernetes-based product that has most of these features built-in.
