Kubewarden, a recently accepted CNCF sandbox project, published its first stable release on June 22nd, 2022, which makes this a perfect time to take a quick look at it.
What is Kubewarden?
In short: Kubewarden is an admission controller for Kubernetes (K8s) that aims to replace the now-deprecated Pod Security Policies and to unify the current ecosystem by supporting both flavors of Rego policies (those used by Open Policy Agent and those used by OPA Gatekeeper).
However, Kubewarden aims to accomplish much more, e.g. by allowing admission policies to be written in more or less any programming language that compiles to WebAssembly (Wasm). We won’t go into the details of all of Kubewarden’s features (consult the official documentation if you are interested in those), but instead give a short introduction to admission controllers in general and set up a lightweight lab environment for Kubewarden.
TLDR: admission controllers
An admission controller is a piece of code that intercepts requests to the Kubernetes API server prior to persistence of the object, but after the request is authenticated and authorized.
– from the Kubernetes Documentation
In a nutshell: admission controllers decide whether a Kubernetes object (e.g. a pod) gets to live on or not. Kubernetes ships with a range of default controllers (e.g. to enforce resource quotas on pods), but you are free to install further controllers into your cluster. In general, there are two (or three, depending on how you count) types of admission controllers: validating and mutating ones (and ones that are both at the same time). The difference is simply that mutating controllers may change the original request into a compliant one (e.g. if a pod requests risky capabilities, an admission controller might just drop these transparently).
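For admission controllers that are installed into the cluster (as Kubewarden is), this plugs into the API server via webhooks: the API server wraps the object in an AdmissionReview request and the controller answers with an AdmissionReview response. As a rough, generic illustration (not specific to Kubewarden), a validating webhook that denies a request returns something along these lines:
{
  "apiVersion": "admission.k8s.io/v1",
  "kind": "AdmissionReview",
  "response": {
    "uid": "<uid copied from the incoming request>",
    "allowed": false,
    "status": {
      "code": 403,
      "message": "privileged containers are not allowed"
    }
  }
}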
Playing field
The first thing we need is a working Kubernetes cluster. We will not set up a production-ready K8s cluster, as that is material for another series of blog posts, but rather a minimal working example that is easy to use and light on resources.
There is a whole range of solutions for our use case by now; some of the better known ones are minikube, kind, MicroK8s and K3s. The following should be easily adaptable to any of the mentioned options, so feel free to use whatever you like best (see minikube vs. kind vs. K3s for a nice comparison). We will go with K3s, as it is lightweight and allows us to spin up multiple independent clusters on a single development machine.
Install the dependencies
Install Docker
We run our lab on an Ubuntu server. Either follow the official install instructions for Ubuntu or use the steps below (alternatively, follow the instructions for your distro).
# Docker's GPG key goes into /etc/apt/keyrings (create the directory in case it does not exist yet)
sudo mkdir -p /etc/apt/keyrings
curl -fsSL https://download.docker.com/linux/ubuntu/gpg | sudo gpg --dearmor -o /etc/apt/keyrings/docker.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/docker.gpg] https://download.docker.com/linux/ubuntu \
$(lsb_release -cs) stable" | sudo tee /etc/apt/sources.list.d/docker.list > /dev/null
sudo apt update
sudo apt install docker-ce docker-ce-cli containerd.io docker-compose-plugin
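A quick smoke test confirms that the Docker engine is up and running (the hello-world image is only used for this check):
sudo docker run --rm hello-world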
Install kubectl
We use kubectl to manage the resources in our cluster. You can install kubectl either by downloading the binary from GitHub or by adding the Kubernetes repository to apt:
curl -fsSL https://packages.cloud.google.com/apt/doc/apt-key.gpg | sudo gpg --dearmor -o /etc/apt/keyrings/google.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/etc/apt/keyrings/google.gpg] https://apt.kubernetes.io/ kubernetes-xenial main" |\
sudo tee -a /etc/apt/sources.list.d/kubernetes.list
sudo apt update
sudo apt install kubectl
Install Helm
Helm is the package manager for K8s. Helm apps come bundled as charts, which can be stored in any OCI registry. A chart contains all the resources needed to deploy a given app, and Helm can be used to install, remove, or upgrade all resources within a chart at once. Helm can be installed by downloading the binary from GitHub or by adding the Helm repository to apt:
curl https://baltocdn.com/helm/signing.asc | sudo gpg --dearmor -o /usr/share/keyrings/helm.gpg
echo \
"deb [arch=$(dpkg --print-architecture) signed-by=/usr/share/keyrings/helm.gpg] https://baltocdn.com/helm/stable/debian/ all main" |\
sudo tee /etc/apt/sources.list.d/helm-stable-debian.list
sudo apt update
sudo apt install helm
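Both CLI tools can be quickly sanity-checked before we move on:
kubectl version --client
helm version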
Install k3d
Now that we have prepared our system, the rest is pretty much straightforward. First, we install k3d (note the difference: k3d stands for K3s in Docker). There is a slew of options for installing k3d, but we went with one of the easiest for us (note, however, that you might get pre-release versions when installing it like that):
go install github.com/k3d-io/k3d/[email protected]
Primer: K3s vs. k3d
K3s is a minimal Kubernetes distribution from Rancher Labs. It’s a single binary meant to be used in IoT and CI environments. The goal of K3s is to be more lightweight than standard K8s while remaining fully compliant with it. This is achieved, for example, by replacing complex components with less complex alternatives (e.g. etcd with SQLite), as well as by removing some optional third-party additions.
k3d builds upon that foundation, but is not officially affiliated with K3s or Rancher Labs. k3d is more or less a simple wrapper around the official Docker images of K3s. k3d makes it easy to run multiple instances of K3s within Docker on a single machine, which makes it perfect for local development.
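For example, spinning up and tearing down an additional multi-node cluster next to an existing one only takes a couple of commands (the name demo is just a placeholder here):
k3d cluster create demo --agents 2   # one server node plus two agent nodes
k3d cluster list                     # list all clusters managed by k3d
k3d cluster delete demo              # remove the cluster again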
With k3d installed, we can test our lab environment:
$ k3d cluster create kubewarden
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-kubewarden'
INFO[0000] Created image volume k3d-kubewarden-images
INFO[0000] Starting new tools node...
INFO[0001] Creating node 'k3d-kubewarden-server-0'
INFO[0002] Starting Node 'k3d-kubewarden-tools'
INFO[0003] Creating LoadBalancer 'k3d-kubewarden-serverlb'
INFO[0004] Using the k3d-tools node to gather environment information
INFO[0006] HostIP: using network gateway 172.18.0.1 address
INFO[0006] Starting cluster 'kubewarden'
INFO[0006] Starting servers...
INFO[0007] Starting Node 'k3d-kubewarden-server-0'
INFO[0013] All agents already running.
INFO[0013] Starting helpers...
INFO[0014] Starting Node 'k3d-kubewarden-serverlb'
INFO[0021] Injecting records for hostAliases (incl. host.k3d.internal) and for 2 network members into CoreDNS configmap...
INFO[0024] Cluster 'kubewarden' created successfully!
INFO[0024] You can now use it like this:
kubectl cluster-info
Now that the cluster has been created, we can check its status using kubectl cluster-info:
$ kubectl cluster-info
Kubernetes control plane is running at https://0.0.0.0:45163
CoreDNS is running at https://0.0.0.0:45163/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Metrics-server is running at https://0.0.0.0:45163/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
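By default, k3d merges the access credentials into our kubeconfig and switches the current context, which is why kubectl talks to the new cluster right away. A look at the nodes shows the single K3s server running as a Docker container:
kubectl get nodes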
Restrict Docker from exposing arbitrary ports on your server
We have one problem now. After starting the K3s cluster, the Kubernetes API server is exposed to the internet on a randomly chosen port:
$ netstat -tulpn
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN 28107/systemd-resol
tcp 0 0 0.0.0.0:40501 0.0.0.0:* LISTEN 50353/docker-proxy
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN 28080/sshd: /usr/sb
tcp6 0 0 :::22 :::* LISTEN 28080/sshd: /usr/sb
udp 0 0 127.0.0.53:53 0.0.0.0:* 28107/systemd-resol
As this is a lab environment, it should definitely not be exposed to the web. Keep in mind that this is a general thing to be aware of: if you bind a port to your host with Docker, Docker exposes the port on 0.0.0.0:
$ docker run -d -p 8080:80 nginx
583844ed8df90170f3b20230d207e037bf6c239c92145048322170196e9c2a32
$ netstat -tulpn
(Not all processes could be identified, non-owned process info
will not be shown, you would have to be root to see it all.)
Active Internet connections (only servers)
Proto Recv-Q Send-Q Local Address Foreign Address State PID/Program name
tcp 0 0 0.0.0.0:22 0.0.0.0:* LISTEN -
tcp 0 0 0.0.0.0:8080 0.0.0.0:* LISTEN -
tcp 0 0 127.0.0.53:53 0.0.0.0:* LISTEN -
tcp6 0 0 :::22 :::* LISTEN -
tcp6 0 0 :::8080 :::* LISTEN -
udp 0 0 127.0.0.53:53 0.0.0.0:* -
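If you only want to avoid this for a single container, you can bind the published port explicitly to the loopback interface instead; the firewall setup described below is the more general safeguard, though:
docker run -d -p 127.0.0.1:8080:80 nginx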
Thus, when working with Docker, it might be a good idea to enable a firewall that prevents exposing ports accidentally. Below is a short introduction to how you can do this with ufw:
ufw allow ssh # If you are connected via ssh to a server and don't want to lock yourself out.
ufw enable
Yet there is another nice quirk of Docker waiting for you: Docker will circumvent ufw by directly writing iptables rules, which is of course great and definitely expected behavior. To learn more about this behavior and how to configure ufw so that Docker respects your rules, you should read https://github.com/chaifeng/ufw-docker#tldr.
The quintessence is that you need to add the following rules to /etc/ufw/after.rules:
# BEGIN UFW AND DOCKER
*filter
:ufw-user-forward - [0:0]
:ufw-docker-logging-deny - [0:0]
:DOCKER-USER - [0:0]
-A DOCKER-USER -j ufw-user-forward
-A DOCKER-USER -j RETURN -s 10.0.0.0/8
-A DOCKER-USER -j RETURN -s 172.16.0.0/12
-A DOCKER-USER -j RETURN -s 192.168.0.0/16
-A DOCKER-USER -p udp -m udp --sport 53 --dport 1024:65535 -j RETURN
-A DOCKER-USER -j ufw-docker-logging-deny -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -d 192.168.0.0/16
-A DOCKER-USER -j ufw-docker-logging-deny -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -d 10.0.0.0/8
-A DOCKER-USER -j ufw-docker-logging-deny -p tcp -m tcp --tcp-flags FIN,SYN,RST,ACK SYN -d 172.16.0.0/12
-A DOCKER-USER -j ufw-docker-logging-deny -p udp -m udp --dport 0:32767 -d 192.168.0.0/16
-A DOCKER-USER -j ufw-docker-logging-deny -p udp -m udp --dport 0:32767 -d 10.0.0.0/8
-A DOCKER-USER -j ufw-docker-logging-deny -p udp -m udp --dport 0:32767 -d 172.16.0.0/12
-A DOCKER-USER -j RETURN
-A ufw-docker-logging-deny -m limit --limit 3/min --limit-burst 10 -j LOG --log-prefix "[UFW DOCKER BLOCK] "
-A ufw-docker-logging-deny -j DROP
COMMIT
# END UFW AND DOCKER
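After saving the file, reload ufw so that the new rules take effect. Should you later want to deliberately expose a container port, the ufw-docker approach linked above expects you to allow it with a route rule (port 8080 here is just the example port from the nginx container above):
sudo ufw reload
sudo ufw route allow proto tcp from any to any port 8080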
Install Kubewarden
Finally, we are ready to install Kubewarden. We mostly follow the official documentation now, so feel free to have a look at it as well.
Install cert-manager
Currently Kubewarden depends on cert-manager to manage TLS certificates, which we need to install first. This is easy, however, since we already installed Helm:
helm repo add jetstack https://charts.jetstack.io
helm repo update
helm install \
cert-manager jetstack/cert-manager \
--namespace cert-manager \
--create-namespace \
--version v1.8.2 \
--set installCRDs=true
Install Kubewarden
Installing Kubewarden is pretty much the same process as installing cert-manager:
helm repo add kubewarden https://charts.kubewarden.io
helm repo update
helm install --wait -n kubewarden --create-namespace kubewarden-crds kubewarden/kubewarden-crds
helm install --wait -n kubewarden kubewarden-controller kubewarden/kubewarden-controller
helm install --wait -n kubewarden kubewarden-defaults kubewarden/kubewarden-defaults
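The --wait flags make Helm block until each release is ready, but it does not hurt to take a look at the deployed pods; you should see the kubewarden-controller as well as a policy-server pod created by the kubewarden-defaults chart:
kubectl get pods -n kubewarden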
Test Kubewarden
Now we should be up and running. Still, it is always good to check whether everything works correctly. Therefore, we enable one of Kubewarden’s default policies and see if the admission controller prevents the creation of non-compliant resources.
As many people working with K8s know, allowing privileged pods is generally a bad idea, as such a pod is able to compromise the whole cluster.
Primer: of privileged pods and the docker group
So we know that it is bad to allow privileged pods, but how bad is it really? Well, see for yourself:
We will try to escape from a privileged pod and get root on the host system. First, we create a new cluster with k3d. Note that the argument --host-pid-mode is necessary here, since k3d understandably disables this by default:
$ k3d cluster create vulnerable --host-pid-mode
INFO[0000] Prep: Network
INFO[0000] Created network 'k3d-vulnerable'
INFO[0000] Created image volume k3d-vulnerable-images
INFO[0000] Starting new tools node...
INFO[0001] Creating node 'k3d-vulnerable-server-0'
INFO[0001] Starting Node 'k3d-vulnerable-tools'
INFO[0002] Creating LoadBalancer 'k3d-vulnerable-serverlb'
INFO[0003] Using the k3d-tools node to gather environment information
INFO[0005] HostIP: using network gateway 172.19.0.1 address
INFO[0005] Starting cluster 'vulnerable'
INFO[0005] Starting servers...
INFO[0005] Starting Node 'k3d-vulnerable-server-0'
INFO[0012] All agents already running.
INFO[0012] Starting helpers...
INFO[0012] Starting Node 'k3d-vulnerable-serverlb'
INFO[0019] Injecting records for hostAliases (incl. host.k3d.internal) and for 2 network members into CoreDNS configmap...
INFO[0022] Cluster 'vulnerable' created successfully!
INFO[0022] You can now use it like this:
kubectl cluster-info
$ kubectl cluster-info
Kubernetes control plane is running at https://0.0.0.0:35345
CoreDNS is running at https://0.0.0.0:35345/api/v1/namespaces/kube-system/services/kube-dns:dns/proxy
Metrics-server is running at https://0.0.0.0:35345/api/v1/namespaces/kube-system/services/https:metrics-server:/proxy
To further debug and diagnose cluster problems, use 'kubectl cluster-info dump'.
Now that our vulnerable cluster is up and running, we can try to spawn a privileged pod to get root on the system (credits to @IanColdwater for this polished one-liner):
$ kubectl run h0nk --rm -it --image alpine --privileged --overrides '{"spec":{"hostPID":true}}' --command nsenter -- --mount=/proc/1/ns/mnt
If you don't see a command prompt, try pressing enter.
# id
uid=0(root) gid=0(root) groups=0(root),1(daemon),2(bin),3(sys),4(adm),6(disk),10(uucp),11,20(dialout),26(tape),27(sudo)
# mkdir /secret && touch /secret/key.txt
# exit
Session ended, resume using 'kubectl attach h0nk -c h0nk -i -t' command when the pod is running
pod "h0nk" deleted
To verify that our container escape was successful, we check whether the key.txt file did indeed get created on the host system:
$ ls -l /secret/key.txt
-rw-r--r-- 1 root root 0 Jul 9 10:30 /secret/key.txt
Keep in mind that the whole process can be performed by any user able to spawn privileged pods within the cluster or any user with access to the Docker daemon.
This is also one of the reasons we did not perform Docker’s optional post-installation steps to add our user to the docker group. Even though Docker warns that “the docker group grants privileges equivalent to the root user”, we want to emphasize that this is not sudo-style root, but direct, unadulterated root that bypasses many established security mechanisms.
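If you want to convince yourself of that, something along the lines of the following one-liner (run by a member of the docker group, without sudo) drops you into a root shell on the host, simply by mounting the host’s root filesystem into a container and chrooting into it:
docker run --rm -it -v /:/host alpine chroot /host /bin/sh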
Back in our kubewarden cluster (make sure to switch your kubectl context back if you also created the vulnerable cluster), we first check that we can indeed create pods with the privileged flag set to true:
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: privileged-pod
spec:
  containers:
  - name: nginx
    image: nginx:latest
    securityContext:
      privileged: true
EOF
pod/privileged-pod created
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
privileged-pod 0/1 ContainerCreating 0 7s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
privileged-pod 0/1 ContainerCreating 0 11s
$ kubectl get pods
NAME READY STATUS RESTARTS AGE
privileged-pod 1/1 Running 0 13s
As we have seen, K3s (and K8s in general) does not restrict the creation of privileged pods by default. We will now look at how we can use Kubewarden to prevent the creation of such pods.
Kubewarden admission policies are defined via ClusterAdmissionPolicy custom resources (backed by a custom resource definition, or CRD, installed earlier via the kubewarden-crds chart). We will not go into the details of the specification of Kubewarden’s CRDs; note, however, that the most important part of the spec below is the module, which references a precompiled Wasm module (see Kubewarden’s Policy Hub for a list of all available modules) and contains the whole admission logic:
$ kubectl apply -f - <<EOF
apiVersion: policies.kubewarden.io/v1alpha2
kind: ClusterAdmissionPolicy
metadata:
  name: privileged-pods
spec:
  module: registry://ghcr.io/kubewarden/policies/pod-privileged:v0.1.9
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    resources: ["pods"]
    operations:
    - CREATE
    - UPDATE
  mutating: false
EOF
clusteradmissionpolicy.policies.kubewarden.io/privileged-pods created
$ # The next command will return STATUS "pending" as long as the policy is provisioning.
$ kubectl get clusteradmissionpolicy.policies.kubewarden.io/privileged-pods
NAME POLICY SERVER MUTATING MODE OBSERVED MODE STATUS
privileged-pods default false protect protect active
$ kubectl get validatingwebhookconfigurations.admissionregistration.k8s.io -l kubewarden
NAME WEBHOOKS AGE
clusterwide-privileged-pods 1 20s
Now that our policy is loaded into Kubernetes, we should be able to test whether the admission controller applies it correctly. For that, we try to provision two pods, one unprivileged and one privileged:
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: unprivileged-pod
spec:
  containers:
  - name: nginx
    image: nginx:latest
EOF
pod/unprivileged-pod created
$ kubectl apply -f - <<EOF
apiVersion: v1
kind: Pod
metadata:
  name: privileged-pod
spec:
  containers:
  - name: nginx
    image: nginx:latest
    securityContext:
      privileged: true
EOF
Error from server: error when creating "STDIN": admission webhook "clusterwide-privileged-pods.kubewarden.admission" denied the request: User 'system:admin' cannot schedule privileged containers
As expected, Kubewarden blocked the creation of the privileged pod and raised an error.
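As a side note, the MODE and OBSERVED MODE columns in the kubectl get clusteradmissionpolicy output above hint at another feature: according to the Kubewarden documentation, a policy can also run in monitor mode, where violations are only logged instead of rejected. A minimal sketch of such a policy (deployed under a new name here, purely as an illustration) would look like this:
kubectl apply -f - <<EOF
apiVersion: policies.kubewarden.io/v1alpha2
kind: ClusterAdmissionPolicy
metadata:
  name: privileged-pods-monitor
spec:
  mode: monitor
  module: registry://ghcr.io/kubewarden/policies/pod-privileged:v0.1.9
  rules:
  - apiGroups: [""]
    apiVersions: ["v1"]
    resources: ["pods"]
    operations:
    - CREATE
    - UPDATE
  mutating: false
EOF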
End of part one
We have set up a lab environment consisting of a Kubernetes cluster running in Docker (based on k3d/K3s) and showed how to use Helm (the package manager for K8s) to install Kubewarden, an admission controller for K8s, into our cluster.
Afterwards, we deployed one of Kubewarden’s provided ClusterAdmissionPolicies to prevent the creation of privileged pods in our cluster. At this point, you should be able to deploy any of the policies from Kubewarden’s Policy Hub to block undesired object creation in Kubernetes.
In part two, we will look at custom policies, how Kubewarden handles Rego, the differences between OPA and OPA Gatekeeper, and whether Kubewarden can solve the problems these two somewhat competing systems raise.
As a final step, you can stop the cluster we created for the lab to save your precious system resources:
k3d cluster stop kubewarden
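The vulnerable cluster from the privileged-pod primer is still around as well; since we will not need it again, we can remove it completely:
k3d cluster delete vulnerable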