Self-hosted K8s – Bare metal vs private cloud

Kubernetes began as an internal Google project, an orchestration system for “containers” — OS-level virtualization instances of larger applications, usually stripped down to just a single service or two to allow for quick redeployment and low footprint.

It comes as no surprise that initially Kubernetes, despite Google open-sourcing it in 2014, got introduced to the public as something a cloud computing provider could offer you within their ecosystem. The agreed-upon conventions and interfaces for rolling it yourself were there, but actually deploying a Kubernetes cluster on your own meant figuring out how to marry those interfaces to whatever hardware, operating system and networking setup you had available.

Google was joined by competitors such as Amazon in offering Kubernetes among their cloud products, which we now know was a good bet. Bare-metal options, however, lagged behind. Thankfully, the subsequent growth of the sector as a whole, with Kubernetes becoming a de facto standard on the market, generated a lot of interest in hosting an independently deployed cluster, and today there are reliable ways to do that.

Some things a cloud will do better for you, and some it’ll do worse. When choosing between a bare-metal Kubernetes deployment and a cloud one, consider the scale of your cluster, resource requirements and workload patterns. To make it easier, let’s look at the overall consequences for your operational flow.

Infrastructure management overhead

The main benefit of the cloud is the abstraction of everything beneath Kubernetes. You get your cluster as a service, with no need to maintain its lower-level aspects. When a RAM stick goes bad, or a whole datacenter gets knocked off the network, the cloud provider has the redundancy in place for you to not even notice anything going wrong, much less have to replace dead hardware yourself. That saves you the time and effort of dealing with physical infrastructure.

That’s also where the cost comes from, though. The dynamic nature of public cloud deployments usually involves a lot of creative billing, and there’s no guarantee that you actually need the extra flash you pay for.

In bare-metal deployments, you have direct control over the underlying hardware and networking components. This means you can choose and configure the servers, storage, and network equipment according to your specific needs. You have full visibility into the hardware and can optimize it for performance, security, and cost-efficiency.

Scalability

Public cloud lets you be basically unrestricted in terms of scale. Assuming the application code is reliable enough, you are free to increase your processing rate tenfold or a hundredfold, and you’ll get the resources you ask for. A bare-metal setup limits you to the hardware you actually have on hand. So the question becomes: how much do your resource requirements vary over time?

Cloud providers have vast pools of resources and can rapidly provision additional virtual machines or containers to scale your Kubernetes cluster. They handle the underlying infrastructure and distribute the workload across their data centers, ensuring efficient resource utilization. This makes cloud deployments highly suitable for workloads with fluctuating demand or unpredictable resource requirements.
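One rough way to answer the "how varied is your workload" question is to compare peak demand to average demand over a representative period. The sketch below assumes you already have usage samples (say, CPU cores in use per hour) from your own monitoring. The function name and the sample numbers are made up for illustration.

```python
from statistics import mean


def peak_to_average(samples):
    """Return peak demand divided by mean demand for a series of usage samples."""
    if not samples:
        raise ValueError("need at least one sample")
    average = mean(samples)
    if average <= 0:
        raise ValueError("average demand must be positive")
    return max(samples) / average


# Hypothetical hourly CPU-core usage for two workloads.
steady = [8, 8, 9, 8, 8, 9]
bursty = [2, 2, 2, 2, 2, 20]

print(f"steady: {peak_to_average(steady):.2f}")
print(f"bursty: {peak_to_average(bursty):.2f}")
```

A ratio close to 1 suggests that hardware sized for the average will rarely sit idle. A high ratio means you would be buying servers for a peak that rarely happens, which is where cloud elasticity tends to pay off.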

When you have a predictable, continuous load, it might be much more cost-effective to purchase or lease your own physical servers and host them in a properly maintained datacenter. There’s a barrier of initial investment, plus additional costs such as network and power, or colocation fees, so this requires some careful planning.
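To make that planning a bit more concrete, here is a minimal break-even sketch. It assumes a steady load, a one-off hardware purchase, and flat monthly bills on both sides. All the figures are placeholders, not real prices, and the function name is hypothetical.

```python
import math


def break_even_months(upfront_cost, monthly_running_cost, monthly_cloud_cost):
    """Months after which owned hardware has cost no more than the cloud bill.

    Returns None when running your own servers costs as much per month as
    the cloud (or more), because the upfront investment is never recovered.
    """
    monthly_saving = monthly_cloud_cost - monthly_running_cost
    if monthly_saving <= 0:
        return None
    return math.ceil(upfront_cost / monthly_saving)


# Placeholder numbers: servers bought for 60000, colocation plus power plus
# network at 1500 a month, versus a cloud bill of 4000 a month.
months = break_even_months(60000, 1500, 4000)
print(f"break-even after {months} months")
```

With these placeholder inputs the saving is 2500 a month, so the hardware pays for itself after 24 months. A real estimate would also account for staff time, hardware refresh cycles and any cloud discounts you could negotiate.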

Availability and reliability

A cloud, by its definition, is supposed to be spread out. And that’s what you get when using a public cloud Kubernetes service: your provider has a vast network of points of presence across the planet, and at every level of the network there’s extra redundancy that’s supposed to handle all kinds of problems, from a power outage to a regional network collapse. There are load balancers, out-of-the-box failover scenarios, data replication, offsite backups… Providers also have dedicated teams focused on maintaining infrastructure uptime, monitoring, and quickly responding to any issues.

This kind of scale is out of reach for a lot of businesses, but it might be excessive, too. Things like network and power redundancy are easily achievable with the help of your local datacenter. And a lot of other things, such as failover and backups, can be solved with custom scripts at the operating system level.

It's important to assess the criticality of your workload and the level of availability and reliability required. Cloud deployments provide robust built-in features and infrastructure reliability, while bare-metal deployments offer the possibility for tailored high availability setups.
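When assessing how much redundancy you actually need, some back-of-the-envelope arithmetic helps. The sketch below uses the textbook formula for redundant components and assumes failures are independent, which real hardware sharing a rack or a power feed often is not, so treat the result as an optimistic upper bound. The 99% figure is an arbitrary example.

```python
def parallel_availability(single, replicas):
    """Availability of a group that stays up while at least one replica is up.

    Assumes replica failures are independent of each other.
    """
    if not 0 <= single <= 1:
        raise ValueError("availability must be between 0 and 1")
    if replicas < 1:
        raise ValueError("need at least one replica")
    return 1 - (1 - single) ** replicas


def downtime_hours_per_year(availability):
    """Expected hours of downtime in a 365-day year."""
    return (1 - availability) * 365 * 24


for replicas in (1, 2, 3):
    a = parallel_availability(0.99, replicas)
    print(f"{replicas} replica(s): {a:.6f}, "
          f"about {downtime_hours_per_year(a):.2f} hours down per year")
```

Going from one 99%-available server to two takes the expected downtime from about 87.6 hours a year to under an hour, on paper. The third replica buys much less, which is one way to see why cloud-grade redundancy can be excessive for a smaller workload.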

Network performance

Cloud environments are rather complex because of the variety of customer needs they have to serve. On the one hand, that gives you balanced performance regardless of the workload. On the other hand, your traffic passes through an endless forest of physical and virtual hardware, not necessarily getting the best outcome you hope for. Unsurprisingly, there’s an option to pay more to give your data preferential treatment.

When you host Kubernetes yourself, though? Well, initially you have no network at all. But you have the freedom to choose whichever network provider delivers the best results and prices for the connection you actually need.

You have full control over the networking setup and can optimize it for your specific needs. You can fine-tune network configurations, prioritize traffic, and utilize high-performance networking hardware to achieve optimal network performance. Most likely, you’ll be discussing this with datacenters. And if you want to make your Kubernetes workloads globally available, you have a choice of CDN providers instead of being forced to use whatever your cloud offers.

Security and compliance

Again, it’s a tailored approach versus one that fits a wide range of customers. A cloud provider has a dedicated set of engineers assessing security threats, monitoring bulletins, applying hotfixes and so on. A security breach damages a provider’s public image, so with competition in the field getting heavier there’s an increased focus on this. If you’re reading about a critical Kubernetes bug online, chances are the major cloud providers have already patched it.

It sounds good, until you need to implement security measures that go beyond the Kubernetes concepts. With bare-metal deployments you can implement whatever you need, such as firewalls, intrusion detection systems, and data encryption at the hardware and network levels. You have complete control over access mechanisms and can implement security practices aligned with industry standards or compliance regulations.

There’s also the question of application security, as in making sure your nginx deployment doesn’t have a privilege escalation bug. That is best handled by you separately, although some cloud providers might offer to help with it (you guessed it, for more money).

Conclusion

Bare-metal deployments typically require more effort and expertise in terms of operational overhead. You are responsible for tasks such as hardware provisioning, network configuration, and system updates. Managing and maintaining the infrastructure can be time-consuming and may require specialized knowledge and skills.

In a bare-metal environment, you need to handle tasks like capacity planning, hardware lifecycle management, and troubleshooting. You have the flexibility to customize and fine-tune the infrastructure, but it also means you have to invest more time and effort in managing the day-to-day operations.
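As a taste of what capacity planning looks like in practice, here is a minimal sizing sketch. It assumes CPU is the limiting resource, that you want some headroom on every node, and that you keep at least one spare node for failures and maintenance. The parameter names and default values are illustrative assumptions, not recommendations.

```python
import math


def nodes_needed(peak_cpu_cores, cores_per_node, headroom=0.3, spare_nodes=1):
    """Estimate how many worker nodes to buy for a given peak CPU demand.

    headroom is the fraction of each node deliberately left unused;
    spare_nodes are extra machines kept for failures and maintenance.
    """
    if not 0 <= headroom < 1:
        raise ValueError("headroom must be in [0, 1)")
    usable_cores = cores_per_node * (1 - headroom)
    return math.ceil(peak_cpu_cores / usable_cores) + spare_nodes


# Hypothetical cluster: 100 cores at peak, 32-core servers.
print(nodes_needed(100, 32))
```

With these inputs each server contributes 22.4 usable cores, so five servers cover the peak and the spare brings the total to six. Memory, storage and network would need the same treatment in a real plan.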

Cloud deployments, on the other hand, abstract away much of the infrastructure management, reducing the operational overhead. Cloud providers handle tasks like hardware provisioning, networking, and maintenance. They automate many of the operational aspects, such as scaling, load balancing, and infrastructure monitoring, making it easier to get started with Kubernetes.

By using a cloud deployment, you can focus on managing the Kubernetes applications and workloads rather than the underlying infrastructure. Managed Kubernetes services take care of many operational aspects, including cluster management, software updates, and security patches, leaving you free to concentrate on application development and deployment.

In the end, the choice comes down to estimating resource consumption patterns for your workloads, both short and long term. Going bare-metal might seem like a riskier, costlier choice compared to outsourcing the cluster to a public cloud provider, but in the long run it can be both cheaper and a better fit for your needs.