Background

I run a 4-node K8s cluster for my side projects, and there is one anti-pattern in its setup:

All 4 nodes are public nodes.

Putting every node in a public subnet makes the cluster very easy to scale: every node has a public IP, so nodes from different cloud providers can connect to each other easily.

Why is it not ideal to make all K8s nodes public?

The reason is security, especially for a self-managed k8s cluster (like mine).

The risk is that the instance running the database is exposed to the internet, since all 4 nodes are in a public subnet. This is true even though the instance is protected by a firewall and the database YAML config stops the database from being visible on the internet.
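The database YAML config mentioned above is not shown in this post. As a rough sketch of one common way to keep a database reachable only from inside the cluster, a Service of type ClusterIP could look like the following. The Service name, the pod labels, and the MySQL port are all assumptions made for illustration.

```yaml
# Sketch only: the name, labels, and port are assumptions, not the actual config.
apiVersion: v1
kind: Service
metadata:
  name: database            # hypothetical Service name
spec:
  type: ClusterIP           # default Service type; gets a cluster-internal IP only
  selector:
    app: database           # assumed label on the database pods
  ports:
  - port: 3306              # assumes a MySQL-compatible database
    targetPort: 3306
```

A ClusterIP Service opens no port on the node itself, but it does nothing about the node's own public IP, which is why the subnet layout still matters.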

How to make it better?

Have public nodes and private nodes in the same cluster.

In a simple 3-tier web application (Web Client => Web Server => Database), we usually hide the DB tier in a private subnet to provide an extra layer of protection against random people connecting to the DB.

Ideally, the database should run in a private subnet, hidden from the internet but still visible to the web server tier, as shown in the diagram above.

So we can turn our cluster into a hybrid one, with some nodes private and some nodes public.

BTW, cloud-provider-managed databases such as AWS Aurora or AWS DynamoDB are still recommended over running a DB in a self-managed k8s cluster.

So now we have a hybrid k8s cluster: 2 nodes in a public subnet and 2 nodes in a private subnet.

The cluster's kubeconfig will look like this:


apiVersion: v1
clusters:
- cluster:
    certificate-authority-data: x
    server: https://200.200.200.200:6443  # public ip address
  name: default
contexts:
- context:
    cluster: default
    user: default
  name: default
current-context: default
kind: Config
preferences: {}
users:
- name: default
  user:
    client-certificate-data: y
    client-key-data: z

Even though the server address is a public IP, that node can still talk to the private subnet through its LAN IP.

A bastion host and a NAT gateway must be configured before setting this up, so that the private subnet can reach the internet and the public and private nodes can communicate.

Label the nodes as public (node 1 and node 2) or private (node 3 and node 4) to make deployment easier:

kubectl label nodes node-1 isPrivate=false
kubectl label nodes node-2 isPrivate=false

kubectl label nodes node-3 isPrivate=true
kubectl label nodes node-4 isPrivate=true

Now we can easily deploy a service to the public nodes through nodeSelector.

For example, in this WordPress deployment config:


apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress
  labels:
    app: wordpress
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wordpress
      tier: frontend
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: wordpress
        tier: frontend
    spec:
      nodeSelector:
        kubernetes.io/arch: arm64
        isPrivate: 'false'  # label values are case-sensitive; must match isPrivate=false
      containers:
      - image: wordpress:5.8.2-php8.1-apache
        name: wordpress
        resources:
          limits:
            memory: "512Mi"
        ports:
        - containerPort: 80
          name: wordpress

At the same time, for the DB behind the WordPress app:


      nodeSelector:
        kubernetes.io/arch: arm64
        isPrivate: 'true'  # label values are case-sensitive; must match isPrivate=true
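The post only shows the nodeSelector fragment for the database. For context, here is a minimal sketch of how that fragment might sit inside a full Deployment. The Deployment name, the tier label, the mysql:8.0 image, and the mysql-pass Secret are assumptions for illustration, and persistent storage is left out to keep the example short.

```yaml
# Sketch only: name, labels, image, and Secret are assumptions for illustration.
apiVersion: apps/v1
kind: Deployment
metadata:
  name: wordpress-mysql            # hypothetical name
  labels:
    app: wordpress
spec:
  replicas: 1
  selector:
    matchLabels:
      app: wordpress
      tier: mysql                  # assumed tier label for the database pods
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: wordpress
        tier: mysql
    spec:
      nodeSelector:
        kubernetes.io/arch: arm64
        isPrivate: 'true'          # only nodes labeled isPrivate=true are eligible
      containers:
      - image: mysql:8.0           # assumed image; the post does not name the database
        name: mysql
        env:
        - name: MYSQL_ROOT_PASSWORD
          valueFrom:
            secretKeyRef:
              name: mysql-pass     # hypothetical Secret, created separately
              key: password
        ports:
        - containerPort: 3306
          name: mysql
        # No volume is defined here, so data is lost when the pod is recreated.
```

If no node carries both labels, the pod stays in Pending, which is a quick way to notice a mislabeled node.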

If you want to know more about how to do this in AWS EKS or with another managed K8s provider, there are a few guides from the cloud providers.

In Summary

If cost is not a big concern, a fully provider-managed k8s cluster is recommended. It usually costs more, but it does save you time.
It is even better to put all nodes in a private subnet and use a load balancer to distribute traffic to the private nodes.
This solution is for a self-managed, low-budget k8s cluster like the one I currently have. The cluster does not use any cloud provider services other than compute and storage, which keeps it independent of any specific cloud provider, and I do not pay for a load balancer etc.
