Kubewarden, a recently accepted CNCF sandbox project, published its first stable release on June 22nd, 2022, which makes this a perfect time to take a quick look at it.
What is Kubewarden?
In short: Kubewarden is an admission controller for Kubernetes (K8s) that aims to replace the now-deprecated Pod Security Policies and to unify the current ecosystem by supporting both flavors of Rego policies (those used by Open Policy Agent and those used by OPA Gatekeeper).
However, Kubewarden aims to accomplish much more, e.g. by allowing admission policies to be written in more or less any programming language that compiles to WebAssembly (Wasm). We won’t go into the details of all of Kubewarden’s features (consult the official documentation if you are interested in those), but instead give a short introduction to admission controllers in general and set up a lightweight lab environment for Kubewarden.
TLDR: admission controllers
An admission controller is a piece of code that intercepts requests to the Kubernetes API server prior to persistence of the object, but after the request is authenticated and authorized.
– from the Kubernetes Documentation
In a nutshell: admission controllers decide whether a Kubernetes object (e.g. a pod) gets to live on or not. Kubernetes ships with a range of default controllers (e.g. to enforce resource quotas on pods), but you are free to install further controllers into your cluster. In general, there are two (or three, depending on how you count) types of admission controllers: validating and mutating ones (and ones that are both at the same time). The difference is simply that mutating controllers may change the original request into a compliant one (e.g. if a pod requests risky capabilities, an admission controller might just drop these transparently).
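For admission controllers that are installed into the cluster (as Kubewarden is), this plugs into the API server via webhooks: the API server wraps the object in an AdmissionReview request and the controller answers with an AdmissionReview response. As a rough, generic illustration (not specific to Kubewarden), a validating webhook that denies a request returns something along these lines:
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "response": {
    "uid": "<uid copied from the incoming request>",
    "allowed": false,
    "status": {
      "code": 403,
      "message": "privileged containers are not allowed"
    }
  }
}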
Playing field
The first thing we need is a working Kubernetes cluster. We will not set up a production-ready K8s cluster, as that is material for another series of blog posts, but rather a minimal working example that is easy to use and light on resources.
There is a whole range of solutions for our use case by now; some of the better known ones are minikube, kind, MicroK8s and K3s. The following should be easily adaptable to any of the mentioned options, so feel free to use whatever you like best (see minikube vs. kind vs. K3s for a nice comparison). We will go with K3s, as it is lightweight and allows us to spin up multiple independent clusters on a single development machine.
Install the dependencies
Install Docker
We run our lab on an Ubuntu server. Either follow the official install instructions for Ubuntu or use the steps below (alternatively, follow the instructions for your distro).
# Docker's GPG key goes into /etc/apt/keyrings (create the directory in case it does not exist yet)
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install docker-ce docker-ce-cli containerd.io docker-compose-plugin
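A quick smoke test confirms that the Docker engine is up and running (the hello-world image is only used for this check):
sudo docker run --rm hello-world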
Install kubectl
We use kubectl to manage the resources in our cluster. You can install kubectl either by downloading the binary from GitHub or by adding the Kubernetes repository to apt:
curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo gpg --dearmor -o /etc/apt/keyrings/google.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/google.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" |\
sudo tee -a /etc/apt/sources.list.d/kubernetes.list
sudo apt update
sudo apt install kubectl
Install Helm
Helm is the package manager for K8s. Helm apps come bundled as charts, which can be stored in any OCI registry. A chart contains all the resources needed to deploy a given app, and Helm can be used to install, remove, or upgrade all resources within a chart at once. Helm can be installed by downloading the binary from GitHub or by adding the Helm repository to apt:
curl https://baltocdn.com/helm/signing.asc | sudo gpg --dearmor -o /usr/share/keyrings/helm.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" |\
sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt update
sudo apt install helm
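Both CLI tools can be quickly sanity-checked before we move on:
kubectl version --client
helm version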
Install k3d
Now that we have prepared our system, the rest is pretty much straightforward. First, we install k3d (note the difference: k3d stands for K3s in Docker). There is a slew of options for installing k3d, but we went with one of the easiest for us (note, however, that you might get pre-release versions when installing it like that):
go install github.com/k3d-io/k3d/[email protected]
Primer: K3s vs. k3d
K3s is a minimal Kubernetes distribution from Rancher Labs. It’s a single binary meant to be used in IoT and CI environments. The goal of K3s is to be more lightweight than standard K8s while remaining fully compliant with it. This is achieved, for example, by replacing complex components with less complex alternatives (e.g. etcd with SQLite), as well as by removing some optional third-party additions.
k3d builds upon that foundation, but is not officially affiliated with K3s or Rancher Labs. k3d is more or less a simple wrapper around the official Docker images of K3s. k3d makes it easy to run multiple instances of K3s within Docker on a single machine, which makes it perfect for local development.
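For example, spinning up and tearing down an additional multi-node cluster next to an existing one only takes a couple of commands (the name demo is just a placeholder here):
k3d cluster create demo --agents 2   # one server node plus two agent nodes
k3d cluster list                     # list all clusters managed by k3d
k3d cluster delete demo              # remove the cluster again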
With k3d installed, we can test our lab environment:
$ k3d cluster create kubewarden
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-kubewarden'
INFO[0000] Created image volume k3d-kubewarden-images
INFO[0000] Starting new tools node...
INFO[0001] Creating node 'k3d-kubewarden-server-0'
INFO[0002] Starting Node 'k3d-kubewarden-tools'
INFO[0003] Creating LoadBalancer 'k3d-kubewarden-serverlb'
INFO[0004] Using the k3d-tools node to gather environment information
INFO[0006] HostIP: using network gateway 172.18.0.1 address
INFO[0006] Starting cluster 'kubewarden'
INFO[0006] Starting servers...
INFO[0007] Starting Node 'k3d-kubewarden-server-0'
INFO[0013] All agents already running.
INFO[0013] Starting helpers...
INFO[0014] Starting Node 'k3d-kubewarden-serverlb'
INFO[0021] Injecting records for hostAliases (incl. host.k3d.internal) and for 2 network members into CoreDNS configmap...
INFO[0024] Cluster 'kubewarden' created successfully!
INFO[0024] You can now use it like this:
kubectl cluster-info
Now that the cluster has been created, we can check its status using kubectl cluster-info:
$ kubectl cluster-info
Kubernetes control plane is running at https://0.0.0.0:45163
CoreDNS is running at https://0.0.0.0:45163/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Metrics-server is running at https://0.0.0.0:45163/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
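By default, k3d merges the access credentials into our kubeconfig and switches the current context, which is why kubectl talks to the new cluster right away. A look at the nodes shows the single K3s server running as a Docker container:
kubectl get nodes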
Restrict Docker from exposing arbitrary ports on your server
We have one problem now. After starting the K3s cluster, the Kubernetes API server is exposed to the internet on a randomly chosen port:
$ netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN 28107/systemd-resol
tcp 0 0 0.0.0.0:40501 0.0.0.0:* LISTEN 50353/docker-proxy
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 28080/sshd: /usr/sb
tcp6 0 0 :::22 :::* LISTEN 28080/sshd: /usr/sb
udp 0 0 127.0.0.53:53 0.0.0.0:* 28107/systemd-resol
As this is a lab environment, it should definitely not be exposed to the web. Keep in mind that this is a general thing to be aware of: if you bind a port to your host with Docker, Docker exposes the port on 0.0.0.0:
$ docker run -d -p 8080:80 nginx
583844ed8df90170f3b20230d207e037bf6c239c92145048322170196e9c2a32
$ netstat -tulpn
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN -
tcp6 0 0 :::22 :::* LISTEN -
tcp6 0 0 :::8080 :::* LISTEN -
udp 0 0 127.0.0.53:53 0.0.0.0:* -
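If you only want to avoid this for a single container, you can bind the published port explicitly to the loopback interface instead; the firewall setup described below is the more general safeguard, though:
docker run -d -p 127.0.0.1:8080:80 nginx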
Thus, when working with Docker, it might be a good idea to enable a firewall that prevents exposing ports accidentally. Below is a short introduction to how you can do this with ufw:
ufw allow ssh # If you are connected via ssh to a server and don't want to lock yourself out.
ufw enable
Yet there is another nice quirk of Docker waiting for you: Docker will circumvent ufw by directly writing iptables rules, which is of course great and definitely expected behavior. To learn more about this behavior and how to configure ufw so that Docker respects your rules, you should read https://github.com/chaifeng/ufw-docker#tldr.
The quintessence is that you need to add the following rules to /etc/ufw/after.rules:
# BEGIN UFW AND DOCKER
*filter
:ufw-user-forward - [0:0]
:ufw-docker-logging-deny - [0:0]
:DOCKER-USER - [0:0]
-A DOCKER-USER -j ufw-user-forward
-A DOCKER-USER -j RETURN -s 10.0.0.0/8
-A DOCKER-USER -j RETURN -s 172.16.0.0/12
-A DOCKER-USER -j RETURN -s 192.168.0.0/16
-A DOCKER-USER -p udp -m udp --sport 53 --dport 1024:65535 -j RETURN
-A DOCKER-USER -j ufw-docker-logging-deny -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -d 192.168.0.0/16
-A DOCKER-USER -j ufw-docker-logging-deny -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -d 10.0.0.0/8
-A DOCKER-USER -j ufw-docker-logging-deny -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -d 172.16.0.0/12
-A DOCKER-USER -j ufw-docker-logging-deny -p udp -m udp --dport 0:32767 -d 192.168.0.0/16
-A DOCKER-USER -j ufw-docker-logging-deny -p udp -m udp --dport 0:32767 -d 10.0.0.0/8
-A DOCKER-USER -j ufw-docker-logging-deny -p udp -m udp --dport 0:32767 -d 172.16.0.0/12
-A DOCKER-USER -j RETURN
-A ufw-docker-logging-deny -m limit --limit 3/min --limit-burst 10 -j LOG --log-prefix "[UFW DOCKER BLOCK] "
-A ufw-docker-logging-deny -j DROP
COMMIT
# END UFW AND DOCKER
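After saving the file, reload ufw so that the new rules take effect. Should you later want to deliberately expose a container port, the ufw-docker approach linked above expects you to allow it with a route rule (port 8080 here is just the example port from the nginx container above):
sudo ufw reload
sudo ufw route allow proto tcp from any to any port 8080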
Install Kubewarden
Finally, we are ready to install Kubewarden. We mostly follow the official documentation now, so feel free to have a look at it as well.
Install cert-manager
Currently Kubewarden depends on cert-manager to manage TLS certificates, which we need to install first. This is easy, however, since we already installed Helm:
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install \
cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--version v1.8.2 \
--set installCRDs=true
Install Kubewarden
Installing Kubewarden is pretty much the same process as installing cert-manager:
helm repo add kubewarden https://charts.kubewarden.io
helm repo update
helm install --wait -n kubewarden --create-namespace kubewarden-crds kubewarden/kubewarden-crds
helm install --wait -n kubewarden kubewarden-controller kubewarden/kubewarden-controller
helm install --wait -n kubewarden kubewarden-defaults kubewarden/kubewarden-defaults
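The --wait flags make Helm block until each release is ready, but it does not hurt to take a look at the deployed pods; you should see the kubewarden-controller as well as a policy-server pod created by the kubewarden-defaults chart:
kubectl get pods -n kubewarden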
Test Kubewarden
Now we should be up and running. Still, it is always good to check whether everything works correctly. Therefore, we enable one of Kubewarden’s default policies and see if the admission controller prevents the creation of non-compliant resources.
As many people working with K8s know, allowing privileged pods is generally a bad idea, as such a pod is able to compromise the whole cluster.
Primer: of privileged pods and the docker group
So we know that it is bad to allow privileged pods, but how bad is it really? Well, see for yourself:
We will try to escape from a privileged pod and get root on the host system. First, we create a new cluster with k3d. Note that the argument --host-pid-mode is necessary here, since k3d understandably disables this by default:
$ k3d cluster create vulnerable --host-pid-mode
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-vulnerable'
INFO[0000] Created image volume k3d-vulnerable-images
INFO[0000] Starting new tools node...
INFO[0001] Creating node 'k3d-vulnerable-server-0'
INFO[0001] Starting Node 'k3d-vulnerable-tools'
INFO[0002] Creating LoadBalancer 'k3d-vulnerable-serverlb'
INFO[0003] Using the k3d-tools node to gather environment information
INFO[0005] HostIP: using network gateway 172.19.0.1 address
INFO[0005] Starting cluster 'vulnerable'
INFO[0005] Starting servers...
INFO[0005] Starting Node 'k3d-vulnerable-server-0'
INFO[0012] All agents already running.
INFO[0012] Starting helpers...
INFO[0012] Starting Node 'k3d-vulnerable-serverlb'
INFO[0019] Injecting records for hostAliases (incl. host.k3d.internal) and for 2 network members into CoreDNS configmap...
INFO[0022] Cluster 'vulnerable' created successfully!
INFO[0022] You can now use it like this:
kubectl cluster-info
$ kubectl cluster-info
Kubernetes control plane is running at https://0.0.0.0:35345
CoreDNS is running at https://0.0.0.0:35345/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Metrics-server is running at https://0.0.0.0:35345/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Now that our vulnerable cluster is up and running, we can try to spawn a privileged pod to get root on the system (credits to @IanColdwater for this polished one-liner):
$ kubectl run h0nk --rm -it --image alpine --privileged --overrides '{"spec":{"hostPID":true}}' --command nsenter -- --mount=/proc/1/ns/mnt
If you don't see a command prompt, try pressing enter.
# id
uid=0(root) gid=0(root) groups=0(root),1(daemon),2(bin),3(sys),4(adm),6(disk),10(uucp),11,20(dialout),26(tape),27(sudo)
# mkdir /secret && touch /secret/key.txt
# exit
Session ended, resume using 'kubectl attach h0nk -c h0nk -i -t' command when the pod is running
pod "h0nk" deleted
To verify that our container escape was successful, we check whether the key.txt file did indeed get created on the host system:
$ ls -l /secret/key.txt
-rw-r--r-- 1 root root 0 Jul 9 10:30 /secret/key.txt
Keep in mind that the whole process can be performed by any user able to spawn privileged pods within the cluster or any user with access to the Docker daemon.
This is also one of the reasons we did not perform Docker’s optional post-installation steps to add our user to the docker group. Even though Docker warns that “the docker group grants privileges equivalent to the root user”, we want to emphasize that this is not sudo-style root, but direct, unadulterated root that bypasses many established security mechanisms.
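If you want to convince yourself of that, something along the lines of the following one-liner (run by a member of the docker group, without sudo) drops you into a root shell on the host, simply by mounting the host’s root filesystem into a container and chrooting into it:
docker run --rm -it -v /:/host alpine chroot /host /bin/sh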
Back in our kubewarden cluster (make sure to switch your kubectl context back if you also created the vulnerable cluster), we first check that we can indeed create pods with the privileged flag set to true:
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: privileged-pod
spec:
  containers:
  - name: nginx
    image: nginx:latest
    securityContext:
      privileged: true
EOF
pod/privileged-pod created
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
privileged-pod 0/1 ContainerCreating 0 7s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
privileged-pod 0/1 ContainerCreating 0 11s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
privileged-pod 1/1 Running 0 13s
As we have seen, K3s (and K8s in general) does not restrict the creation of privileged pods by default. We will now look at how we can use Kubewarden to prevent the creation of such pods.
Kubewarden admission policies are defined via ClusterAdmissionPolicy custom resources (backed by a custom resource definition, or CRD, installed earlier via the kubewarden-crds chart). We will not go into the details of the specification of Kubewarden’s CRDs; note, however, that the most important part of the spec below is the module, which references a precompiled Wasm module (see Kubewarden’s Policy Hub for a list of all available modules) and contains the whole admission logic:
$ kubectl apply -f - <<EOF
apiVersion: policies.kubewarden.io/v1alpha2
kind: ClusterAdmissionPolicy
metadata:
  name: privileged-pods
spec:
  module: registry://ghcr.io/kubewarden/policies/pod-privileged:v0.1.9
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    resources: ["pods"]
    operations:
    - CREATE
    - UPDATE
  mutating: false
EOF
clusteradmissionpolicy.policies.kubewarden.io/privileged-pods created
$ # The next command will return STATUS "pending" as long as the policy is provisioning.
$ kubectl get clusteradmissionpolicy.policies.kubewarden.io/privileged-pods
NAME POLICY SERVER MUTATING MODE OBSERVED MODE STATUS
privileged-pods default false protect protect active
$ kubectl get validatingwebhookconfigurations.admissionregistration.k8s.io -l kubewarden
NAME WEBHOOKS AGE
clusterwide-privileged-pods 1 20s
Now that our policy is loaded into Kubernetes, we should be able to test whether the admission controller applies it correctly. For that, we try to provision two pods, one unprivileged and one privileged:
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: unprivileged-pod
spec:
  containers:
  - name: nginx
    image: nginx:latest
EOF
pod/unprivileged-pod created
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: privileged-pod
spec:
  containers:
  - name: nginx
    image: nginx:latest
    securityContext:
      privileged: true
EOF
Error from server: error when creating "STDIN": admission webhook "clusterwide-privileged-pods.kubewarden.admission" denied the request: User 'system:admin' cannot schedule privileged containers
As expected, Kubewarden blocked the creation of the privileged pod and raised an error.
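As a side note, the MODE and OBSERVED MODE columns in the kubectl get clusteradmissionpolicy output above hint at another feature: according to the Kubewarden documentation, a policy can also run in monitor mode, where violations are only logged instead of rejected. A minimal sketch of such a policy (deployed under a new name here, purely as an illustration) would look like this:
kubectl apply -f - <<EOF
apiVersion: policies.kubewarden.io/v1alpha2
kind: ClusterAdmissionPolicy
metadata:
  name: privileged-pods-monitor
spec:
  mode: monitor
  module: registry://ghcr.io/kubewarden/policies/pod-privileged:v0.1.9
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    resources: ["pods"]
    operations:
    - CREATE
    - UPDATE
  mutating: false
EOF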
End of part one
We have set up a lab environment consisting of a Kubernetes cluster running in Docker (based on k3d/K3s) and showed how to use Helm (the package manager for K8s) to install Kubewarden, an admission controller for K8s, into our cluster.
Afterwards, we deployed one of Kubewarden’s provided ClusterAdmissionPolicies to prevent the creation of privileged pods in our cluster. At this point, you should be able to deploy any of the policies from Kubewarden’s Policy Hub to block undesired object creation in Kubernetes.
In part two, we will look at custom policies, how Kubewarden handles Rego, the differences between OPA and OPA Gatekeeper, and whether Kubewarden can solve the problems these two somewhat competing systems raise.
As a final step, you can stop the cluster we created for the lab to save your precious system resources:
k3d cluster stop kubewarden
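The vulnerable cluster from the privileged-pod primer is still around as well; since we will not need it again, we can remove it completely:
k3d cluster delete vulnerable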