I recently needed a better way to handle rollouts of new deployments in Kubernetes for some applications. In certain situations, an update to a deployment would stop a pod that was in the middle of processing a request. These failed requests can be retried, and none of our systems should expect everything to work 100% of the time, but stopping a pod that needed only a few more seconds to complete its work is a bit of a naive approach.
After Googling for a better approach, I was gleeful to find Pod Lifecycle Handlers, a way to hook handlers into the lifecycle events of a pod. For me this meant I could control the manner in which my more delicate application pods are terminated.
Lifecycle Hooks
The postStart hook is triggered immediately after the container is created.
There is no guarantee that it will run before the container's ENTRYPOINT, as
both are triggered at the same time. If the postStart handler takes too long
or hangs, the container will not reach the running state.
The preStop hook is triggered immediately before the container is terminated.
It blocks termination until the preStop handler completes or the pod's grace
period ends, which defaults to 30s. This allows a more graceful shutdown and
reduces the number of errors seen during a rollout.
The handlers can be either commands, as the following example shows, or HTTP requests.
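apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-demo
spec:
  containers:
    - name: lifecycle-demo-container
      image: my-app
      lifecycle:
        postStart:
          exec:
            # Fetch the latest config and save it as config.yaml.
            # -o names the output file; -O combined with a shell redirect
            # would leave config.yaml empty.
            command: ["/bin/sh", "-c", "curl -fsS -o config.yaml http://example.com/config.yaml.latest"]
        preStop:
          exec:
            # Run the application's shutdown procedure before termination.
            command: ["/bin/sh", "-c", "/shutdown.sh"]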
This example shows a simple use of these hooks: pulling some configuration
from a remote source and triggering a script for a shutdown procedure. The pod
will stay alive until the command in the preStop handler exits (or
terminationGracePeriodSeconds is hit). If you need the pod to stop
immediately, you can call kubectl delete pod <pod_name> --grace-period=0 --force,
which terminates the pod with extreme prejudice.
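The example above only shows the exec form, so here is a minimal sketch of the HTTP form of a preStop handler, with the grace period set explicitly. The pod name, the /shutdown path, port 8080 and the 60 second value are all assumptions for illustration; the application would need to actually serve that endpoint for this to do anything useful.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-http-demo          # hypothetical name
spec:
  # Upper bound for the preStop handler plus the container's own shutdown.
  # 60 is an illustrative value; the default is 30.
  terminationGracePeriodSeconds: 60
  containers:
    - name: lifecycle-http-demo-container
      image: my-app
      lifecycle:
        preStop:
          httpGet:
            path: /shutdown          # hypothetical endpoint served by the app
            port: 8080               # hypothetical port
```

Raising terminationGracePeriodSeconds only makes sense if the shutdown work can genuinely take longer than the default; otherwise it just slows down rollouts when a handler hangs.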
Debugging Handlers
You can debug the handlers by running kubectl describe pod <pod_name>, which
displays the events for the pod.
Events:
FirstSeen LastSeen Count From SubobjectPath Type Reason Message
--------- -------- ----- ---- ------------- -------- ------ -------
1m 1m 1 {default-scheduler } Normal Scheduled Successfully assigned test-1730497541-cq1d2 to gke-test-cluster-default-pool-a07e5d30-siqd
1m 1m 1 {kubelet gke-test-cluster-default-pool-a07e5d30-siqd} spec.containers{main} Normal Pulling pulling image "test:1.0"
1m 1m 1 {kubelet gke-test-cluster-default-pool-a07e5d30-siqd} spec.containers{main} Normal Created Created container with docker id 5c6a256a2567; Security:[seccomp=unconfined]
1m 1m 1 {kubelet gke-test-cluster-default-pool-a07e5d30-siqd} spec.containers{main} Normal Pulled Successfully pulled image "test:1.0"
1m 1m 1 {kubelet gke-test-cluster-default-pool-a07e5d30-siqd} spec.containers{main} Normal Started Started container with docker id 5c6a256a2567
38s 38s 1 {kubelet gke-test-cluster-default-pool-a07e5d30-siqd} spec.containers{main} Normal Killing Killing container with docker id 5c6a256a2567: PostStart handler: Error executing in Docker Container: 1
37s 37s 1 {kubelet gke-test-cluster-default-pool-a07e5d30-siqd} spec.containers{main} Normal Killing Killing container with docker id 8df9fdfd7054: PostStart handler: Error executing in Docker Container: 1
38s 37s 2 {kubelet gke-test-cluster-default-pool-a07e5d30-siqd} Warning FailedSync Error syncing pod, skipping: failed to "StartContainer" for "main" with RunContainerError: "PostStart handler: Error executing in Docker Container: 1"
Other notes
- The handlers are intended to be triggered at least once, but may be triggered multiple times in [certain events](https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/#hook-delivery-guarantees).
- When a pod enters the terminating state, it is removed from the endpoints list for a service.
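Because removal from the endpoints list takes a moment to reach everything that routes traffic, one common pattern is to have the preStop handler simply wait before the container is stopped. This is a rough sketch rather than a recommendation: the pod name and the 5 second sleep are assumptions, and it presumes the image ships /bin/sh and sleep.

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: lifecycle-drain-demo         # hypothetical name
spec:
  containers:
    - name: lifecycle-drain-demo-container
      image: my-app
      lifecycle:
        preStop:
          exec:
            # Give the endpoint removal a few seconds to propagate before
            # the container is stopped. 5 is an illustrative value and must
            # stay below the pod's grace period (30s by default).
            command: ["/bin/sh", "-c", "sleep 5"]
```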
As always, I appreciate any feedback. If you want to reach out, I'm
@neuralsandwich on Twitter and most other places.
References:
- https://kubernetes.io/docs/tasks/configure-pod-container/attach-handler-lifecycle-event/
- https://kubernetes.io/docs/concepts/containers/container-lifecycle-hooks/
- https://kubernetes.io/docs/concepts/workloads/pods/pod/#termination-of-pods