Unattended security updates on Kubernetes nodes, without drama
The standard advice for keeping a Linux server patched is to turn on
unattended-upgrades and walk away. On a single VM that runs one workload,
that works. On a Kubernetes cluster, the same configuration installs a
midnight time-bomb: nodes apply a kernel update, write
/var/run/reboot-required, and then either reboot themselves out from under
the kubelet at an unpredictable moment, or — worse — never reboot, sit on
the unpatched kernel for weeks, and produce a kube-bench audit finding
nobody can explain.
This post is the three-piece setup that gets nodes patched on schedule without taking the cluster down. I run it on my Hetzner RKE2 cluster (three control-plane, two worker) and it has run unattended for months. The same shape works on managed clusters, k3s, kubeadm — anywhere the nodes are Linux boxes you own.
The three layers that need to cooperate
flowchart TB
subgraph A[1. Package layer — on every node]
UU[unattended-upgrades<br/>apt installs, security only]
NR["/var/run/reboot-required<br/>(kernel / libc / systemd)"]
UU --> NR
end
subgraph B[2. Reboot orchestration — inside the cluster]
K[kured DaemonSet<br/>watches sentinel, takes lock]
K --> CD[cordon + drain]
CD --> R[reboot via systemctl]
end
subgraph C[3. Workload resilience — per workload]
PDB[PodDisruptionBudget]
REP[multiple replicas]
end
NR --> K
PDB -.-> CD
The package layer installs and stages. The reboot layer notices and coordinates. The workload layer survives the drain. Each layer is short to configure on its own; the trouble is what happens when one of them is missing.
Layer 1: unattended-upgrades, security-only, reboot-aware
Install once on every node:
sudo apt install -y unattended-upgrades apt-listchanges
sudo dpkg-reconfigure -plow unattended-upgrades
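A quick sanity check after the reconfigure step can save a confused week later. This is a minimal sketch, assuming dpkg-reconfigure wrote the usual /etc/apt/apt.conf.d/20auto-upgrades file on a Debian or Ubuntu node; the function name uu_enabled is my own and not part of any package.

```shell
# Sketch: confirm that the periodic unattended-upgrade run is switched on.
# Assumes the Debian/Ubuntu default location; pass another path to override.
uu_enabled() {
    # The line we expect looks like: APT::Periodic::Unattended-Upgrade "1";
    # A value of "0" means the periodic run is disabled.
    grep -Eq '^[[:space:]]*APT::Periodic::Unattended-Upgrade[[:space:]]+"[1-9][0-9]*";' \
        "${1:-/etc/apt/apt.conf.d/20auto-upgrades}" 2>/dev/null
}

if uu_enabled; then
    echo "unattended-upgrades: periodic run enabled"
else
    echo "unattended-upgrades: periodic run NOT enabled (or config file missing)"
fi
```

If the second message shows up on a node you thought was configured, re-run the dpkg-reconfigure step before looking anywhere else.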
The defaults are conservative — security pocket only — which is what you
want for a node. The bits that need editing live in
/etc/apt/apt.conf.d/50unattended-upgrades. The two changes that matter:
// Mail a report to a real mailbox whenever a run changes something. "only-on-error" is quieter and also fine.
Unattended-Upgrade::Mail "ops@example.com";
Unattended-Upgrade::MailReport "on-change";
// Do NOT auto-reboot from here. kured handles reboots in the cluster.
Unattended-Upgrade::Automatic-Reboot "false";
That last line is the key difference between a single VM and a Kubernetes
node. On a single VM, Automatic-Reboot "true" is reasonable. On a
clustered node, letting each box decide on its own when to reboot
produces a sawtooth of unscheduled drains. We let unattended-upgrades
write /var/run/reboot-required and stop there. The actual reboot is
someone else’s job.
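To see what a node is waiting on, it helps to look at the sentinel directly. The sketch below assumes the Debian/Ubuntu convention of a companion reboot-required.pkgs file listing the packages that asked for the reboot; that file may be absent on some images, and the function name reboot_pending is only illustrative.

```shell
# Sketch: report whether a node is waiting for a reboot, and why.
# The sentinel path is the one from the post; the ".pkgs" companion file
# is an assumption about the distribution and is treated as optional.
reboot_pending() {
    sentinel="${1:-/var/run/reboot-required}"
    if [ ! -e "$sentinel" ]; then
        echo "no reboot pending"
        return 1
    fi
    echo "reboot pending"
    if [ -r "$sentinel.pkgs" ]; then
        # One package name per line; de-duplicate for readability.
        sort -u "$sentinel.pkgs" | sed 's/^/  requested by: /'
    fi
    return 0
}

reboot_pending || true
```

The function only reads files, so it is safe to run on a node at any time.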
For comfort, enable verbose logging so that a post-mortem a week after the fact is still possible.
Unattended-Upgrade::Verbose "true";
Logs land in /var/log/unattended-upgrades/unattended-upgrades.log. Ship
them to your log stack like any other systemd service log; they’re worth
keeping for at least the retention period of your audit policy.
Layer 2: kured, the cluster’s reboot daemon
kured is a tiny DaemonSet that runs on every node and polls
/var/run/reboot-required. When the file exists, kured takes a
cluster-wide lock, cordons and drains the node, reboots it via
systemctl reboot, and uncordons it once it is back. The lock serializes
reboots so you never have two nodes draining at once.
Install via Helm:
helm repo add kubereboot https://kubereboot.github.io/charts
helm install kured kubereboot/kured \
--namespace kube-system \
--set configuration.period=1h \
--set configuration.startTime=02:00 \
--set configuration.endTime=05:00 \
--set configuration.timeZone="Europe/Berlin" \
--set configuration.rebootDays='{mo,tu,we,th,fr}'
A few of those knobs deserve commentary:
- startTime/endTime: define a reboot window. Outside the window, kured will see the sentinel file but won’t act on it. Pick a window that doesn’t overlap with traffic peaks. I use 02:00–05:00 local; nobody complains.
- rebootDays: skip weekends. The cost of “two days of unpatched kernel” is much less than the cost of “the on-call person paged on a Saturday because a kured-drained pod hit a PDB and got stuck.”
- period: how often kured polls the sentinel. 1 h is sensible. Setting it lower won’t make patching faster (unattended-upgrades already controls when the sentinel appears), but it will shorten the lag between the sentinel appearing and the node being cordoned.
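The window semantics are easy to get wrong by an hour, so here is a small sketch of the check as I understand it. It is not kured's code: the helper names are hypothetical, the window is assumed to sit inside one day (no wrap past midnight), and treating the end time as exclusive is my assumption.

```shell
# Sketch: is a given local time inside a same-day reboot window?
# Times are zero-padded HH:MM; they are converted to minutes since midnight.
to_minutes() {
    h=${1%%:*}; m=${1##*:}
    # Strip one leading zero so "08" is not read as an invalid octal number.
    h=${h#0}; m=${m#0}
    echo $(( ${h:-0} * 60 + ${m:-0} ))
}

in_window() {
    # $1 = now, $2 = window start (inclusive), $3 = window end (exclusive)
    now=$(to_minutes "$1"); start=$(to_minutes "$2"); end=$(to_minutes "$3")
    [ "$now" -ge "$start" ] && [ "$now" -lt "$end" ]
}

# Example with the 02:00 to 05:00 window from the Helm values.
if in_window "$(date +%H:%M)" 02:00 05:00; then
    echo "inside reboot window"
else
    echo "outside reboot window; the sentinel would be ignored for now"
fi
```

Note that date reports the shell's local time, which is only the same as kured's view if the node's time zone matches the configured timeZone.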
Optional but recommended: a webhook for notifications.
configuration:
  notifyUrl: "slack://hooks.slack.com/services/T0.../B0.../xxx"
  messageTemplateDrain: "Draining {{ .NodeID }} for reboot"
  messageTemplateReboot: "Rebooting {{ .NodeID }}"
  messageTemplateUncordon: "{{ .NodeID }} back online"
When a node reboots silently and someone sees it in kubectl get nodes,
the first reaction is “incident”. Three messages in a Slack channel
turn it into “ok, expected”.
Layer 3: PodDisruptionBudgets on everything important
kubectl drain honors PDBs. That means if you don’t have one, kured will
happily evict your only replica of CoreDNS and your cluster will spend
sixty seconds resolving nothing. If you do have one but it’s misconfigured
(minAvailable: 1 on a workload with replicas: 1), kured will block
forever waiting for a slot it can never get.
The discipline is short:
- Anything with replicas >= 2 gets PodDisruptionBudget { minAvailable: 1 }.
- Anything with replicas: 1 and no high-availability story gets a PDB with maxUnavailable: 1, so the drain succeeds even if there’s only one pod (the pod gets evicted and comes back on another node; there’s a small blip, but no hang).
- StatefulSets with persistent storage that’s node-bound: extra care. These often need kured’s blockingPodSelector setting (the --blocking-pod-selector flag), which holds off the reboot for as long as matching pods are running on the node.
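The replica and PDB rules above come down to simple arithmetic. The sketch below covers only the integer (non-percentage) case and is a simplified model of what the ALLOWED DISRUPTIONS column reports, not the controller's actual code; both function names are hypothetical.

```shell
# Sketch: how many voluntary evictions a PDB permits right now.
clamp_zero() {
    if [ "$1" -lt 0 ]; then echo 0; else echo "$1"; fi
}

allowed_min_available() {
    # $1 = healthy pods, $2 = minAvailable
    clamp_zero $(( $1 - $2 ))
}

allowed_max_unavailable() {
    # $1 = desired replicas, $2 = healthy pods, $3 = maxUnavailable
    clamp_zero $(( $3 - ($1 - $2) ))
}

allowed_min_available 2 1      # replicas: 2, minAvailable: 1   -> 1, drain proceeds
allowed_min_available 1 1      # replicas: 1, minAvailable: 1   -> 0, drain blocks
allowed_max_unavailable 1 1 1  # replicas: 1, maxUnavailable: 1 -> 1, drain proceeds
```

The middle case is the one that hangs a drain: a zero here means the eviction is refused for as long as nothing else changes.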
A real PDB:
apiVersion: policy/v1
kind: PodDisruptionBudget
metadata:
name: coredns
namespace: kube-system
spec:
minAvailable: 1
selector:
matchLabels:
k8s-app: kube-dns
The reason the PDB lives in the workload, not in kured: kured doesn’t need to know about your application. The drain mechanism asks the API server to evict pods one at a time; the API server checks PDBs; if the PDB would be violated, the eviction fails; the drain retries. Standard Kubernetes mechanics.
The end-to-end flow
Once the three layers are in place, a routine kernel patch looks like this:
- 03:00 local time: APT timer fires on each node, unattended-upgrades runs, downloads security patches, installs them, writes /var/run/reboot-required if a reboot-requiring package was touched.
- Sometime in the next hour: kured on each node polls and notices the sentinel. It checks the reboot window (yes, 03:xx, in window) and tries to acquire the cluster lock.
- One node at a time: kured wins the lock on (say) worker-1, cordons it, runs kubectl drain honoring PDBs, waits for pods to reschedule elsewhere, then systemctl reboot.
- Node back, uncordon: kured detects the node is Ready again, uncordons it, releases the lock.
- Next node: kured on worker-2 acquires the lock, repeats.
- By 05:00 (or earlier): every reboot-requiring node is patched and back in service. Nodes that didn’t need a reboot are untouched.
Total disruption: per-workload, on the order of the drain duration (10–60 seconds for a stateless workload) per node. From the outside, the cluster is fully available the whole time.
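For intuition, this is roughly the per-node sequence you would otherwise run by hand. It is a dry-run sketch that only prints commands; it is not how kured is implemented. The ssh access, the timeouts, and the drain flags are my assumptions, and the node names are the example ones from the flow above.

```shell
# Sketch: the manual equivalent of one patch-and-reboot cycle, as a dry run.
# Nothing is executed; each step is printed so it can be reviewed first.
patch_node_dry_run() {
    node=$1
    echo "kubectl cordon $node"
    echo "kubectl drain $node --ignore-daemonsets --delete-emptydir-data --timeout=10m"
    echo "ssh $node sudo systemctl reboot"
    # Caveat: a real script should first wait for the node to go NotReady,
    # otherwise this wait can return before the reboot has even started.
    echo "kubectl wait --for=condition=Ready node/$node --timeout=15m"
    echo "kubectl uncordon $node"
}

# One node at a time, which is the property the cluster lock provides.
for node in worker-1 worker-2; do
    patch_node_dry_run "$node"
done
```

Seeing the steps written out makes it clearer where PDBs bite: only the drain step can be refused by the API server.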
Failure modes I’ve actually hit
PDB blocks the drain forever. Symptom: the node sits cordoned for
hours, never reboots, eventually the maintenance window closes and kured
gives up until tomorrow. Diagnosis: kubectl get pdb -A and look for
anything with ALLOWED DISRUPTIONS: 0. The fix is to find the workload
with too few replicas to satisfy its own PDB. Common offenders: legacy
single-replica Deployments without an HA story.
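Scanning the PDB table by eye gets old on a cluster with many namespaces. A small filter helps; this sketch assumes the input has been reduced to three columns (namespace, name, allowed disruptions), and the sample rows are made up for illustration.

```shell
# Sketch: list PDBs that currently allow zero disruptions.
# Reads whitespace-separated "namespace name allowed" rows on stdin.
blocked_pdbs() {
    # NF >= 3 skips blank or short lines, which awk would otherwise treat as 0.
    awk 'NF >= 3 && $3 == 0 { print $1 "/" $2 }'
}

# Usage against a live cluster (not run here):
#   kubectl get pdb -A --no-headers \
#     -o custom-columns=NS:.metadata.namespace,NAME:.metadata.name,ALLOWED:.status.disruptionsAllowed \
#     | blocked_pdbs

# Made-up sample rows, to show the shape of the input and output.
blocked_pdbs <<'EOF'
kube-system   coredns      1
default       legacy-api   0
EOF
```

Anything this prints is a candidate for more replicas or a looser budget before the next reboot window.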
Unattended-upgrades has a config-file conflict. Symptom: nothing patches anymore. apt is wedged on a prompt nobody can see because there’s no TTY. The fix is to pre-answer the prompt in the unattended config:
Dpkg::Options { "--force-confdef"; "--force-confold"; };
--force-confdef tells dpkg to take its default action without prompting
whenever it has one; --force-confold keeps your modified config in the
cases where it doesn’t. Together they handle the common cases without
blocking. The corner-case (you modified a config that the new version
also rewrites significantly) still needs a human, but at least the apt
run completes.
Kured runs but the node doesn’t actually reboot. Symptom: cordon
happens, drain succeeds, then nothing. The systemctl call returned but
the node stays up. This is rare and usually means systemctl reboot is
blocked by a stuck service. The fix is upstream — find and fix the
service — but the symptom in kured logs is “reboot called, still alive
after 30s, will check again next period”.
Mail not delivered. Symptom: silence. The first failure becomes the
seventh failure becomes “I should look at the patching” three months
later. The fix is to test the mail path on day one (apt-listchanges
will mail a digest on the first run if mail works at all), and to also
treat the kured Slack channel as a passive health signal — if it’s
silent for two weeks, mail probably broke, not “no patches needed.”
What I’d remember
- Don’t let nodes reboot themselves. Set Automatic-Reboot "false" in unattended-upgrades; let kured coordinate.
- Pick a reboot window. Skip weekends. Pages on Saturday are the worst reason to discover a misconfigured PDB.
- Every workload that matters has a PDB. CoreDNS, ingress-nginx, cert-manager, your own services. Without it, kured will quietly take the cluster down one node at a time.
- The drain-blocking-forever case is the loudest failure mode and the most informative one — fix the PDB, fix the replica count, and the rest of the system works.
- Test the mail and Slack channels on day one. Silence is the worst signal in this whole stack.
It’s not a complicated setup, once it’s set up. The trick is recognizing that “auto-update” is three layers, not one, and that each layer has a different owner.