RKE2 Provisioning and Hardening with CIS Profile
RKE2 (Rancher Kubernetes Engine 2) is Rancher's next-generation, enterprise-ready Kubernetes distribution, designed with security and compliance at its core. It is fully CNCF-conformant and optimized for CIS Benchmark and FIPS 140-2 compliance, making it well suited to secure production environments.
This post walks you through installing and hardening an RKE2 cluster with Cilium and eBPF — suitable for production, compliance evaluation, or a secure proof of concept.
Prerequisites
| IP Address | Hostname | Operating System |
|---|---|---|
| 172.100.0.5 | master01.ajinf.id | Ubuntu 24.04 |
| 172.100.0.9 | master02.ajinf.id | Ubuntu 24.04 |
| 172.100.0.12 | master03.ajinf.id | Ubuntu 24.04 |
| 172.100.0.13 | worker01.ajinf.id | Ubuntu 24.04 |
Pre-Installation
- Update & upgrade system
## Run on all nodes
sudo apt update && sudo apt -y upgrade && sudo reboot
- Change hostname
## Run on all nodes
hostnamectl set-hostname <master/worker>.<domain>
- Disable automatic system upgrades for stability
sudo systemctl disable --now apt-daily-upgrade.timer apt-daily.timer
systemctl list-timers
- Add auto-completion
echo "source /usr/share/bash-completion/bash_completion" | tee -a ~/.bashrc
source ~/.bashrc
- Set timezone
timedatectl set-timezone Asia/Jakarta
- NTP Client
## execute on all nodes
sudo apt install -y chrony
vi /etc/chrony/chrony.conf
---
# Use internal NTP servers for primary synchronization
server 172.100.0.4 iburst prefer
# If you have a second internal NTP server
server 172.100.0.3 iburst
# Log files location.
logdir /var/log/chrony
log measurements statistics tracking
# Allow NTP client access from specific internal networks (if this instance also acts as a server)
allow 10.20.30.0/24
## Verify
systemctl restart chrony
timedatectl
chronyc sources -v
chronyc tracking

- Set DNS Resolver
vi /etc/systemd/resolved.conf
---
[Resolve]
DNS=172.100.0.1 172.100.0.4
DNSStubListener=no
Apply the DNS configuration on the nodes
systemctl restart systemd-resolved.service
- Map hostname
cat <<EOF | sudo tee -a /etc/hosts
172.100.0.5 master01-ajinf-id master01.ajinf.id master1
172.100.0.9 master02-ajinf-id master02.ajinf.id master2
172.100.0.12 master03-ajinf-id master03.ajinf.id master3
172.100.0.13 worker01-ajinf-id worker01.ajinf.id worker1
EOF
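Duplicate aliases in /etc/hosts are an easy copy-paste mistake in blocks like the one above. A quick sketch (run against a scratch copy, not the real file) flags any alias that appears on more than one line:

```shell
# Write a sample hosts fragment to a temp file (illustrative entries only).
tmp=$(mktemp)
cat <<'EOF' > "$tmp"
172.100.0.5 master01-ajinf-id master01.ajinf.id master1
172.100.0.9 master02-ajinf-id master02.ajinf.id master2
172.100.0.12 master03-ajinf-id master03.ajinf.id master3
172.100.0.13 worker01-ajinf-id worker01.ajinf.id worker1
EOF

# Print every alias column, then report any alias used on more than one line.
# Silence means no duplicates.
awk '{for (i = 2; i <= NF; i++) print $i}' "$tmp" | sort | uniq -d
rm -f "$tmp"
```

Point the same pipeline at /etc/hosts itself once the entries are in place.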
- Disable swap
sudo sed -i '/ swap / s/^/#/' /etc/fstab
sudo swapoff -a
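The sed expression above comments out any fstab line containing a swap entry while leaving other mounts untouched. A dry run on a scratch copy (sample lines, hypothetical paths) shows the effect before touching the real file:

```shell
# Dry run against a scratch copy of fstab (sample content, not the real file).
tmp=$(mktemp)
cat <<'EOF' > "$tmp"
UUID=abcd-1234 / ext4 defaults 0 1
/swap.img none swap sw 0 0
EOF

# Same expression as above; no sudo needed on a temp file.
sed -i '/ swap / s/^/#/' "$tmp"
cat "$tmp"   # the swap line is now prefixed with '#', the root mount is unchanged
rm -f "$tmp"
```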
- Update system config
echo "net.ipv4.ip_nonlocal_bind=1" | sudo tee /etc/sysctl.d/ip_nonlocal_bind.conf
cat <<EOF | sudo tee /etc/modules-load.d/containerd.conf
overlay
br_netfilter
EOF
cat <<EOF | sudo tee /etc/sysctl.d/99-kubernetes-cri.conf
net.bridge.bridge-nf-call-iptables = 1
net.ipv4.ip_forward = 1
net.bridge.bridge-nf-call-ip6tables = 1
EOF
cat <<EOF | sudo tee /etc/sysctl.d/60-keepalived.conf
net.ipv4.conf.all.arp_ignore = 1
net.ipv4.conf.all.arp_announce = 2
net.ipv4.conf.default.arp_ignore = 1
net.ipv4.conf.default.arp_announce = 2
net.ipv4.conf.wg0.arp_ignore = 1
net.ipv4.conf.wg0.arp_announce = 2
EOF
sudo modprobe overlay && sudo modprobe br_netfilter
sudo sysctl --system
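Once the modules are loaded and the sysctl files applied, the settings can be read back from /proc. This check prints the current values rather than assuming them (the bridge keys only exist after br_netfilter is loaded):

```shell
# Read back the CRI-related kernel settings; expect 1 for each after the
# modprobe and sysctl steps above.
for key in net/ipv4/ip_forward net/bridge/bridge-nf-call-iptables net/bridge/bridge-nf-call-ip6tables; do
  if [ -r "/proc/sys/$key" ]; then
    printf '%s = %s\n' "$key" "$(cat "/proc/sys/$key")"
  else
    printf '%s unavailable (module not loaded?)\n' "$key"
  fi
done
```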
Install RKE2 cluster
Installing master
- Install rke2 server
curl -sfL https://get.rke2.io | INSTALL_RKE2_CHANNEL=stable INSTALL_RKE2_TYPE=server sh -
- Create config file
mkdir -p /etc/rancher/rke2
Configuration for master01.ajinf.id
sudo tee /etc/rancher/rke2/config.yaml << 'EOF'
write-kubeconfig-mode: "0600"
node-ip: 172.100.0.5
node-name: master01.ajinf.id
cluster-domain: rke2-production
token: rke@password
tls-san:
- 127.0.0.1
- 192.168.10.70
- 172.100.0.5
- master01.ajinf.id
- master02.ajinf.id
- master03.ajinf.id
### Used for Monitoring
etcd-expose-metrics: "true"
### CNI & eBPF
cni: cilium
disable-kube-proxy: true
### ETCD Snapshot and Limit to 2GB
etcd-arg:
- quota-backend-bytes=2048000000
etcd-snapshot-schedule-cron: "0 3 * * *"
etcd-snapshot-retention: 10
### Audit RKE2
kube-apiserver-arg:
- "bind-address=0.0.0.0"
- "audit-log-maxbackup=15"
- "audit-log-path=/var/log/rke2/audit.log"
- "audit-log-maxage=40"
- "audit-log-maxsize=150"
kube-scheduler-arg:
- "bind-address=0.0.0.0"
EOF
Enable and start rke2 server on master01.ajinf.id
systemctl enable --now rke2-server
- Installing and configuring Cilium CNI
sudo sh -c 'cat <<EOF > /var/lib/rancher/rke2/server/manifests/rke2-cilium-config.yaml
apiVersion: helm.cattle.io/v1
kind: HelmChartConfig
metadata:
  name: rke2-cilium
  namespace: kube-system
spec:
  valuesContent: |-
    kubeProxyReplacement: true
    k8sServiceHost: "localhost"
    k8sServicePort: "6443"
    ipam:
      mode: kubernetes
    bandwidthManager:
      enabled: true
    bpf:
      masquerade: true
    tunnelProtocol: vxlan
    operator:
      replicas: 2
EOF'
- Join master02.ajinf.id to the cluster
sudo tee /etc/rancher/rke2/config.yaml << 'EOF'
server: https://master01.ajinf.id:9345
write-kubeconfig-mode: "0600"
node-ip: 172.100.0.9
node-name: master02.ajinf.id
cluster-domain: rke2-production
token: rke@password
tls-san:
- 127.0.0.1
- 192.168.100.172
- 172.100.0.9
- master01.ajinf.id
- master02.ajinf.id
- master03.ajinf.id
### Used for Monitoring
etcd-expose-metrics: "true"
### CNI & eBPF
cni: cilium
disable-kube-proxy: true
### ETCD Snapshot and Limit to 2GB
etcd-arg:
- quota-backend-bytes=2048000000
etcd-snapshot-schedule-cron: "0 3 * * *"
etcd-snapshot-retention: 10
### Audit RKE2
kube-apiserver-arg:
- "bind-address=0.0.0.0"
- "audit-log-maxbackup=15"
- "audit-log-path=/var/log/rke2/audit.log"
- "audit-log-maxage=40"
- "audit-log-maxsize=150"
kube-scheduler-arg:
- "bind-address=0.0.0.0"
EOF
Join master03.ajinf.id to the cluster
sudo tee /etc/rancher/rke2/config.yaml << 'EOF'
server: https://master01.ajinf.id:9345
write-kubeconfig-mode: "0600"
node-ip: 172.100.0.12
node-name: master03.ajinf.id
cluster-domain: rke2-production
token: rke@password
tls-san:
- 127.0.0.1
- 10.20.30.49
- 172.100.0.12
- master01.ajinf.id
- master02.ajinf.id
- master03.ajinf.id
### Used for Monitoring
etcd-expose-metrics: "true"
### CNI & eBPF
cni: cilium
disable-kube-proxy: true
### ETCD Snapshot and Limit to 2GB
etcd-arg:
- quota-backend-bytes=2048000000
etcd-snapshot-schedule-cron: "0 3 * * *"
etcd-snapshot-retention: 10
### Audit RKE2
kube-apiserver-arg:
- "bind-address=0.0.0.0"
- "audit-log-maxbackup=15"
- "audit-log-path=/var/log/rke2/audit.log"
- "audit-log-maxage=40"
- "audit-log-maxsize=150"
kube-scheduler-arg:
- "bind-address=0.0.0.0"
EOF
Enable and start the rke2 server on the master02 and master03 nodes
systemctl enable --now rke2-server
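The three server configs above differ only in node-ip, node-name, and one tls-san entry. A small generator script (a hypothetical helper; the template below is trimmed to the fields that vary, so merge in the etcd, audit, and CNI settings you need) avoids copy-paste drift between nodes:

```shell
#!/bin/bash
# Generate a per-node RKE2 server config from one template (illustrative).
# Usage: gen_config <node-ip> <node-name> <output-file> [server-url]
gen_config() {
  node_ip=$1; node_name=$2; out=$3; server=${4:-}
  {
    # master01 bootstraps the cluster (no "server:" line); joiners point at it.
    if [ -n "$server" ]; then printf 'server: %s\n' "$server"; fi
    cat <<EOF
write-kubeconfig-mode: "0600"
node-ip: ${node_ip}
node-name: ${node_name}
token: rke@password
cni: cilium
disable-kube-proxy: true
tls-san:
- 127.0.0.1
- ${node_ip}
- ${node_name}
EOF
  } > "$out"
}

gen_config 172.100.0.5 master01.ajinf.id /tmp/config-master01.yaml
gen_config 172.100.0.9 master02.ajinf.id /tmp/config-master02.yaml https://master01.ajinf.id:9345
```

Copy each generated file to /etc/rancher/rke2/config.yaml on the matching node before starting rke2-server.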
- Add kubeconfig to all master nodes
cat <<'EOF' >> ~/.bashrc
export PATH=$PATH:/var/lib/rancher/rke2/bin
export KUBECONFIG=/etc/rancher/rke2/rke2.yaml
EOF
source ~/.bashrc


Setup Rancher Dashboard
- Add rancher repository
helm repo add rancher-prime https://charts.rancher.com/server-charts/prime
helm repo update
helm install rancher rancher-prime/rancher \
--create-namespace \
--namespace cattle-system \
--set hostname=rancher.<your_domain> \
--set bootstrapPassword=rke@password \
--set replicas=1 \
--set ingress.tls.source=secret \
--set ingress.tls.secretName=<your_secret> \
--set ingress.ingressClassName=nginx
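The same install can be driven from a values file instead of repeated --set flags, which is easier to keep in version control. A sketch with placeholder hostname and secret name (substitute your own):

```yaml
# rancher-values.yaml (placeholders - substitute your own domain and secret)
hostname: rancher.example.com
bootstrapPassword: rke@password
replicas: 1
ingress:
  ingressClassName: nginx
  tls:
    source: secret
    secretName: tls-rancher-ingress
```

Then install with: helm install rancher rancher-prime/rancher --create-namespace --namespace cattle-system -f rancher-values.yaml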
Apply CIS Profile
- Set CIS kernel parameters on all nodes
Note: protect-kernel-defaults is exposed as a top-level flag for RKE2. If you have set profile to cis-1.XX and protect-kernel-defaults to false explicitly, RKE2 will exit with an error.
sudo cp -f /usr/local/share/rke2/rke2-cis-sysctl.conf /etc/sysctl.d/60-rke2-cis.conf
sudo systemctl restart systemd-sysctl
sudo sysctl -p /usr/local/share/rke2/rke2-cis-sysctl.conf
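The RKE2 CIS sysctl file sets a handful of kernel hardening parameters; the exact contents can vary by RKE2 version, so read the values back rather than assuming them:

```shell
# Read back hardening-related keys; adjust the list to match the contents of
# your rke2-cis-sysctl.conf (they can differ across RKE2 versions).
for key in vm.overcommit_memory kernel.panic kernel.panic_on_oops; do
  printf '%s = %s\n' "$key" "$(sysctl -n "$key" 2>/dev/null || echo unavailable)"
done
```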

- Create the etcd user on master nodes
sudo useradd -r -c "etcd user" -s /sbin/nologin -M etcd -U
sudo chown -R etcd:etcd /var/lib/rancher/rke2/server/db/etcd/
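A quick check confirms the user exists and owns the data directory (the path assumes a default RKE2 install):

```shell
# Verify the etcd system user; prints UID info or a warning if missing.
if id etcd >/dev/null 2>&1; then
  id etcd
  ls -ld /var/lib/rancher/rke2/server/db/etcd 2>/dev/null \
    || echo "etcd data dir not found (is rke2-server running on this node?)"
else
  echo "etcd user missing - rerun the useradd command above"
fi
```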

- Edit the RKE2 configuration with the CIS profile and add Pod Security Admission.
We need to exempt a set of namespaces from the Pod Security Admission policy; otherwise it prevents Rancher components from running. See the Rancher documentation: https://ranchermanager.docs.rancher.com/how-to-guides/new-user-guides/authentication-permissions-and-global-configuration/psa-config-templates#exempting-required-rancher-namespaces
## execute on all master nodes
cat > /etc/rancher/rke2/rke2-pss.yaml << 'EOF'
apiVersion: apiserver.config.k8s.io/v1
kind: AdmissionConfiguration
plugins:
- name: PodSecurity
  configuration:
    apiVersion: pod-security.admission.config.k8s.io/v1beta1
    kind: PodSecurityConfiguration
    defaults:
      enforce: "restricted"
      enforce-version: "latest"
      audit: "restricted"
      audit-version: "latest"
      warn: "restricted"
      warn-version: "latest"
    exemptions:
      usernames: []
      runtimeClasses: []
      namespaces:
      - calico-apiserver
      - calico-system
      - cattle-alerting
      - cattle-csp-adapter-system
      - cattle-epinio-system
      - cattle-externalip-system
      - cattle-fleet-local-system
      - cattle-fleet-system
      - cattle-gatekeeper-system
      - cattle-global-data
      - cattle-global-nt
      - cattle-impersonation-system
      - cattle-istio
      - cattle-istio-system
      - cattle-logging
      - cattle-logging-system
      - cattle-monitoring-system
      - cattle-neuvector-system
      - cattle-prometheus
      - cattle-sriov-system
      - cattle-system
      - cattle-ui-plugin-system
      - cattle-windows-gmsa-system
      - cert-manager
      - cis-operator-system
      - compliance-operator-system
      - fleet-default
      - ingress-nginx
      - istio-system
      - kube-node-lease
      - kube-public
      - kube-system
      - longhorn-system
      - rancher-alerting-drivers
      - security-scan
      - tigera-operator
EOF
vi /etc/rancher/rke2/config.yaml
---
...
profile: "cis"
pod-security-admission-config-file: /etc/rancher/rke2/rke2-pss.yaml
...
- Apply configuration by restarting RKE2 server
systemctl restart rke2-server
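To confirm the restricted profile is actually enforced, try creating a pod that violates it in a namespace that is not on the exemption list; the API server should reject it. A minimal test manifest (names here are illustrative):

```yaml
# pod-priv-test.yaml - intentionally violates the "restricted" profile
apiVersion: v1
kind: Pod
metadata:
  name: priv-test
  namespace: default        # not in the exemption list above
spec:
  containers:
  - name: shell
    image: busybox:latest
    command: ["sleep", "3600"]
    securityContext:
      privileged: true      # forbidden by the restricted profile
```

Applying this with kubectl apply -f pod-priv-test.yaml should fail with an error naming the violated controls (privileged, allowPrivilegeEscalation, capabilities, seccompProfile).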
- Configure default Service Account
Kubernetes provides a default service account which is used by cluster workloads where no specific service account is assigned to the pod. Where access to the Kubernetes API from a pod is required, a specific service account should be created for that pod, and rights granted to that service account. The default service account should be configured such that it does not provide a service account token and does not have any explicit rights assignments.
cat <<'EOF' > /etc/rancher/rke2/account_update.yaml
apiVersion: v1
kind: ServiceAccount
metadata:
  name: default
automountServiceAccountToken: false
EOF
cat << 'EOF' > /etc/rancher/rke2/account_update.sh
#!/bin/bash -e
for namespace in $(kubectl get namespaces -A -o=jsonpath="{.items[*]['metadata.name']}"); do
  echo -n "Patching namespace $namespace - "
  kubectl patch serviceaccount default -n ${namespace} -p "$(cat /etc/rancher/rke2/account_update.yaml)"
done
EOF
---
sudo chmod +x /etc/rancher/rke2/account_update.sh
bash /etc/rancher/rke2/account_update.sh
Scan the RKE2 Cluster
- Installing Rancher Compliance
Go to Apps > Charts, search for “Rancher Compliance”, and click Install
Create a cluster scan with the rke2-cis profile

Here is the scan output:

Total of 130 checks: 80 pass, 40 warn, 5 fail
Deploy pods as non root
- The RKE2 PSA restricts root pods, so we must run pods as non-root users
cat << EOF | kubectl apply -f -
apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx-app
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-app
  template:
    metadata:
      labels:
        app: nginx-app
    spec:
      securityContext:
        runAsNonRoot: true
        runAsUser: 101
        runAsGroup: 101
        seccompProfile: # seccomp at the POD level
          type: RuntimeDefault
      containers:
      - name: app
        image: nginxinc/nginx-unprivileged:latest
        ports:
        - containerPort: 8080
        securityContext:
          allowPrivilegeEscalation: false
          capabilities:
            drop:
            - ALL
          seccompProfile: # and/or at the CONTAINER level
            type: RuntimeDefault
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-service
spec:
  selector:
    app: nginx-app
  ports:
  - protocol: TCP
    port: 8080
    targetPort: 8080
---
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: nginx-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
spec:
  ingressClassName: nginx
  rules:
  - host: nginx.k8s.ajinf.id # change to your domain
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: nginx-service
            port:
              number: 8080
EOF

| Security Measure | CIS Benchmark Compliance/Rationale |
|---|---|
| runAsNonRoot: true | Critical: Ensures the container process cannot execute as the root user (UID 0), preventing container escapes from gaining root access to the host kernel. |
| runAsUser: 101 & runAsGroup: 101 | Principle of Least Privilege (PoLP): Explicitly specifies a non-root, low-privileged user (101 is common for Nginx). This is enforced by the nginxinc/nginx-unprivileged image. |
| allowPrivilegeEscalation: false | CIS 5.2.5: Prevents a process inside the container from gaining greater privileges than its parent process. This is a crucial defense against container break-out exploits. |
| capabilities: drop: [ALL] | CIS 5.2.8: Removes all default Linux capabilities (e.g., NET_RAW, SYS_ADMIN). The application is then granted only the bare minimum required capabilities, drastically reducing the kernel attack surface. |
| seccompProfile: type: RuntimeDefault | CIS 5.2.9 (Strongly Recommended): Enables Seccomp (Secure Computing), which filters the system calls a container can make to the kernel. RuntimeDefault is a strong, general-purpose filter provided by the container runtime (containerd in RKE2). |
- CIS hardening restricts cross-namespace communication by default, so we have to create a Network Policy that allows cross-namespace traffic between the ingress controller and other pods
cat << EOF | kubectl apply -f -
apiVersion: cilium.io/v2
kind: CiliumClusterwideNetworkPolicy
metadata:
  name: allow-cross-namespace
spec:
  description: "Allow traffic between all namespaces"
  endpointSelector: {} # all pods
  ingress:
  - fromEndpoints:
    - {} # allow from all pods
EOF
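The policy above re-opens all cross-namespace traffic, which trades away much of the isolation the hardening bought. A narrower per-namespace alternative admits only the ingress controller (this sketch assumes RKE2's bundled controller runs in kube-system with the label shown — verify both in your cluster):

```yaml
apiVersion: cilium.io/v2
kind: CiliumNetworkPolicy
metadata:
  name: allow-from-ingress
  namespace: default          # the namespace hosting your workloads
spec:
  description: "Allow only the ingress controller to reach pods here"
  endpointSelector: {}        # all pods in this namespace
  ingress:
  - fromEndpoints:
    - matchLabels:
        # namespace and labels of the ingress controller pods (adjust as needed)
        k8s:io.kubernetes.pod.namespace: kube-system
        app.kubernetes.io/name: rke2-ingress-nginx
```

Note that any Cilium policy with an ingress section puts the selected pods into default-deny for peers it does not list, so add further rules (e.g., for same-namespace traffic) as required.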
While these security settings are excellent, they introduce trade-offs in deployment and troubleshooting.
| Factor | Plus (+) Security/Stability | Minus (-) Operational Maintenance |
|---|---|---|
| runAsNonRoot: true | Prevents host-level root access, isolating the blast radius of a vulnerability. | Requires the container image to be designed correctly (e.g., proper file permissions, no reliance on installing packages post-startup). |
| capabilities: drop: [ALL] | Severely limits what an attacker can do by blocking privileged kernel actions. | Difficult troubleshooting: if the application needs a specific capability (e.g., NET_BIND_SERVICE for ports < 1024), the pod will crash with a cryptic “Permission denied” error, requiring tedious trial and error. |
| seccompProfile | Provides robust defense against zero-day kernel exploits by restricting system calls. | Debugging complexity: if an application relies on a syscall not in the RuntimeDefault profile, the application will silently crash or fail, making root-cause analysis difficult without kernel-level tracing. |
| CIS Profile Enforcement | Ensures every workload meets a minimum bar of security (your restricted PSA profile). | Workload friction: prevents un-hardened, off-the-shelf public images (like older Nginx/busybox images) from running unless they are rebuilt to be non-root. |
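For the NET_BIND_SERVICE case in the table, the capability can be added back individually while everything else stays dropped; the restricted PSA profile permits this, as NET_BIND_SERVICE is the only capability it allows under capabilities.add. A container-level snippet (illustrative):

```yaml
securityContext:
  allowPrivilegeEscalation: false
  capabilities:
    drop:
    - ALL
    add:
    - NET_BIND_SERVICE   # the one capability this workload needs for ports < 1024
  seccompProfile:
    type: RuntimeDefault
```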
- Verify by accessing the pods and the web UI
