NGINX/PHP-FPM graceful shutdown and 502 errors

We have a PHP application running with Kubernetes in pods with two dedicated containers — NGINX и PHP-FPM.

The problem is that during downscaling clients get 502 errors. E.g. when a pod is stopping, its containers can not correctly close existing connections.

So, in this post, we will take a closer look at the pods’ termination process in general, and NGINX and PHP-FPM containers in particular.

Testing will be performed on the AWS Elastic Kubernetes Service by the Yandex.Tank utility.

Ingress resource will create an AWS Application Load Balancer with the AWS ALB Ingress Controller.

Для управления контейнерами на Kubernetes WorkerNodes испольузется Docker.

Pod Lifecycle — Termination of Pods

So, let’s take an overview of the pods’ stopping and termination process.

Basically, a pod is a set of processes running on a Kubernetes WorkerNode, which are stopped by standard IPC (Inter-Process Communicationsignals.

To give the pod the ability to finish all its operations, a container runtime at first ties softly stop it (graceful shutdown) by sending a SIGTERM signal to a PID 1 in each container of this pod (see docker stop). Also, a cluster starts counting a grace period before force kill this pod by sending a SIGKILL signal.

The SIGTERM can be overridden by using the STOPSIGNAL in an image used to spin up a container.

Thus, the whole flow of the pod’s deleting is (actually, the part below is a kinda copy of the official documentation):

  1. a user issues a kubectl delete pod or kubectl scale deployment command which triggers the flow and the cluster start countdown of the grace period with the default value set to the 30 second
  2. the API server of the cluster updates the pod’s status — from the Running state, it becomes the Terminating (see Container states). A kubelet on the WorkerNode where this pod is running, receives this status update and start the pod’s termination process:
  3. if a container(s) in the pod have a preStop hook – kubelet will run it. If the hook is still running the default 30 seconds on the grace period – another 2 seconds will be added. The grace period can be set with the terminationGracePeriodSeconds
  4. when a preStop hook is finished, a kubelet will send a notification to the Docker runtime to stop containers related to the pod. The Docker daemon will send the SIGTERM signal to a process with the PID 1 in each container. Containers will get the signal in random order.
  5. simultaneously with the beginning of the graceful shutdown — Kubernetes Control Plane (its kube-controller-manager) will remove the pod from the endpoints (see Kubernetes – Endpoints) and a corresponding Service will stop sending traffic to this pod
  6. after the grace period countdown is finished, a kubelet will start force shutdown – Docker will send the SIGKILL signal to all remaining processes in all containers of the pod which can not be ignored and those process will be immediately terminated without change to correctly finish their operations
  7. kubelet triggers deletion of the pod from the API server
  8. API server deletes a record about this pod from the etcd
Image for post

Actually, there are two issues:

  1. the NGINX and PHP-FPM perceives the SIGTERM signal as a force как “brutal murder” and will finish their processes immediately , и завершают работу немедленно, without concern about existing connections (see Controlling nginx and php-fpm(8) – Linux man page)
  2. the 2 and 3 steps — sending the SIGTERM and an endpoint deletion – are performed at the same time. Still, an Ingress Service will update its data about endpoints not momentarily and a pod can be killed before then an Ingress will stop sending traffic to it causing 502 error for clients as the pod can not accept new connections

E.g. if we have a connection to an NGINX server, the NGINX master process during the fast shutdown will just drop this connection and our client will receive the 502 error, see the Avoiding dropped connections in nginx containers with “STOPSIGNAL SIGQUIT”.

NGINX STOPSIGNAL and 502

Okay, now we got some understanding of how it’s going — let’s try to reproduce the first issue with NGINX.

The example below is taken from the post above and will be deployed to a Kubernetes cluster.

Prepare a Dockerfile:

FROM nginx

RUN echo 'server {\n\
listen 80 default_server;\n\
location / {\n\
proxy_pass http://httpbin.org/delay/10;\n\
}\n\
}' > /etc/nginx/conf.d/default.conf

CMD ["nginx", "-g", "daemon off;"]

Here NGINX will proxy_pass a request to the http://httpbin.org which will respond with a 10 seconds delay to emulate a PHP-backend.

Build an image and push it to a repository:

$ docker build -t setevoy/nginx-sigterm .
$ docker push setevoy/nginx-sigterm

Now, add a Deployment manifest to spin up 10 pods from this image.

Here is the full file with a Namespace, Service, and Ingress, in the following part of this post, will add only updated parts of the manifest:

---
apiVersion: v1
kind: Namespace
metadata:
name: test-namespace
---
apiVersion: apps/v1
kind: Deployment
metadata:
name: test-deployment
namespace: test-namespace
labels:
app: test
spec:
replicas: 10
selector:
matchLabels:
app: test
template:
metadata:
labels:
app: test
spec:
containers:
- name: web
image: setevoy/nginx-sigterm
ports:
- containerPort: 80
resources:
requests:
cpu: 100m
memory: 100Mi
readinessProbe:
tcpSocket:
port: 80
---
apiVersion: v1
kind: Service
metadata:
name: test-svc
namespace: test-namespace
spec:
type: NodePort
selector:
app: test
ports:
- protocol: TCP
port: 80
targetPort: 80
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
name: test-ingress
namespace: test-namespace
annotations:
kubernetes.io/ingress.class: alb
alb.ingress.kubernetes.io/scheme: internet-facing
alb.ingress.kubernetes.io/listen-ports: '[{"HTTP": 80}]'
spec:
rules:
- http:
paths:
- backend:
serviceName: test-svc
servicePort: 80

Deploy it:

$ kubectl apply -f test-deployment.yaml
namespace/test-namespace created
deployment.apps/test-deployment created
service/test-svc created
ingress.extensions/test-ingress created

Check the Ingress:

$ curl -I aadca942-testnamespace-tes-5874–698012771.us-east-2.elb.amazonaws.com
HTTP/1.1 200 OK

And we have 10 pods running:

$ kubectl -n test-namespace get pod
NAME READY STATUS RESTARTS AGE
test-deployment-ccb7ff8b6–2d6gn 1/1 Running 0 26s
test-deployment-ccb7ff8b6–4scxc 1/1 Running 0 35s
test-deployment-ccb7ff8b6–8b2cj 1/1 Running 0 35s
test-deployment-ccb7ff8b6-bvzgz 1/1 Running 0 35s
test-deployment-ccb7ff8b6-db6jj 1/1 Running 0 35s
test-deployment-ccb7ff8b6-h9zsm 1/1 Running 0 20s
test-deployment-ccb7ff8b6-n5rhz 1/1 Running 0 23s
test-deployment-ccb7ff8b6-smpjd 1/1 Running 0 23s
test-deployment-ccb7ff8b6-x5dc2 1/1 Running 0 35s
test-deployment-ccb7ff8b6-zlqxs 1/1 Running 0 25s

Prepare a load.yaml for the Yandex.Tank:

phantom:
address: aadca942-testnamespace-tes-5874-698012771.us-east-2.elb.amazonaws.com
header_http: "1.1"
headers:
- "[Host: aadca942-testnamespace-tes-5874-698012771.us-east-2.elb.amazonaws.com]"
uris:
- /
load_profile:
load_type: rps
schedule: const(100,30m)
ssl: false
console:
enabled: true
telegraf:
enabled: false
package: yandextank.plugins.Telegraf
config: monitoring.xml

Here, we will perform 1 request per second to pods behind our Ingress.

Run tests:

Image for post

All good so far.

Now, scale down the Deployment to only one pod:

$ kubectl -n test-namespace scale deploy test-deployment — replicas=1
deployment.apps/test-deployment scaled

Pods became Terminating:

$ kubectl -n test-namespace get pod
NAME READY STATUS RESTARTS AGE
test-deployment-647ddf455–67gv8 1/1 Terminating 0 4m15s
test-deployment-647ddf455–6wmcq 1/1 Terminating 0 4m15s
test-deployment-647ddf455-cjvj6 1/1 Terminating 0 4m15s
test-deployment-647ddf455-dh7pc 1/1 Terminating 0 4m15s
test-deployment-647ddf455-dvh7g 1/1 Terminating 0 4m15s
test-deployment-647ddf455-gpwc6 1/1 Terminating 0 4m15s
test-deployment-647ddf455-nbgkn 1/1 Terminating 0 4m15s
test-deployment-647ddf455-tm27p 1/1 Running 0 26m

And we got our 502 errors:

Image for post

Next, update the Dockerfile — add the STOPSIGNAL SIGQUIT:

FROM nginx

RUN echo 'server {\n\
listen 80 default_server;\n\
location / {\n\
proxy_pass http://httpbin.org/delay/10;\n\
}\n\
}' > /etc/nginx/conf.d/default.conf

STOPSIGNAL SIGQUIT

CMD ["nginx", "-g", "daemon off;"]

Build, push:

$ docker build -t setevoy/nginx-sigquit .
docker push setevoy/nginx-sigquit

Update the Deployment with the new image:

...
spec:
containers:
- name: web
image: setevoy/nginx-sigquit
ports:
- containerPort: 80
...

Redeploy, and check again.

Run tests:

Image for post

Scale down the deployment again:

$ kubectl -n test-namespace scale deploy test-deployment — replicas=1
deployment.apps/test-deployment scaled

And no errors this time:

Image for post

Great!

Traffic, preStop, and sleep

But still, if repeat tests few times we still can get some 502 errors:

Image for post

This time most likely we are facing the second issue — endpoints update is performed at the same time when the SIGTERM Is sent.

Let’s add a preStop hook with the sleep to give some time to update endpoints and our Ingress, so after the cluster will receive a request to stop a pod, a kubelet on a WorkerNode will wait for 5 seconds before sending the SIGTERM:

...
spec:
containers:
- name: web
image: setevoy/nginx-sigquit
ports:
- containerPort: 80
lifecycle:
preStop:
exec:
command: ["/bin/sleep","5"]
...

Repeat tests — and now everything is fine

Our PHP-FPM had no such issue as its image was initially built with the STOPSIGNAL SIGQUIT.

Other possible solutions

And of course, during debugging I’ve tried some other approaches to mitigate the issue.

See links at the end of this post and here I’ll describe them in short terms.

preStop and nginx -s quit

One of the solutions was to add a preStop hook which will send QUIT to NGINX:

lifecycle:
preStop:
exec:
command:
- /usr/sbin/nginx
- -s
- quit

Or:

...
lifecycle:
preStop:
exec:
command:
- /bin/sh
- -SIGQUIT
- 1
....

But it didn’t help. Not sure why as the idea seems to be correct — instead of waiting for the TERM from Kubernetes/Docker – we gracefully stopping the NGINX master process by sending QUIT.

You can also run the strace utility check which signal is really received by the NGINX.

NGINX + PHP-FPM, supervisord, and stopsignal

Our application is running in two containers in one pod, but during the debugging, I’ve also tried to use a single container with both NGINX and PHP-FPM, for example, trafex/alpine-nginx-php7.

There I’ve tried to add to stopsignal to the supervisor.conf for both NGINX and PHP-FPM with the QUIT value, but this also didn’t help although the idea also seems to be correct.

Still, one can try this way.

PHP-FPM, and process_control_timeout

In the Graceful shutdown in Kubernetes is not always trivial and on the Stackoveflow in the Nginx / PHP FPM graceful stop (SIGQUIT): not so graceful question is a note that FPM’s master process is killed before its child and this can lead to the 502 as well.

Not our current case, but pay your attention to the process_control_timeout.

NGINX, HTTP, and keep-alive session

Also, it can be a good idea to use the [Connection: close] header – then the client will close its connection right after a request is finished and this can decrease 502 errors count.

But anyway they will persist if NGINX will get the SIGTERM during processing a request.

See the HTTP persistent connection.

Deploying Kubernetes on bare metal with Rancher 2.0

Contents

  • Install Rancher server
  • Create a Kubernetes cluster
  • Add Kubernetes nodes
  • Install StorageOS as the Kubernetes storage class
  • Understand Nginx Ingress in Rancher

Install Rancher

Create a VM with Docker and Docker Compose installed and install Rancher 2.0 with docker compose:

  • Rancher docker-compose file: docker-compose.yaml
  • Run these commands to install Rancher with docker compose:
    • git clone https://github.com/polinchw/rancher-docker-compose
    • cd rancher-docker-compose
    • docker-compose up -d

Create your Kubernetes cluster with Rancher

Install a custom Kubernetes cluster with Rancher. Use the ‘Custom’ cluster.

Cluster!

Add Kubernetes nodes and join the Kubernetes cluster

Run the following commands on all the VMs that your Kubernetes cluster will run on. The final docker command will have the VM join the new Kubernetes cluster.

Replace the –server and –token with your Rancher server and cluster token.

#!/bin/bash

#sudo apt update
#sudo apt -y dist-upgrade

#Ubuntu (Docker install)
#sudo apt -y install docker.io

sudo apt -y install linux-image-extra-$(uname -r)

#Debian 9 (Docker install)
#sudo apt -y install apt-transport-https ca-certificates curl gnupg2 software-properties-common
#curl -fsSL https://download.docker.com/linux/debian/gpg | sudo apt-key add -
#sudo add-apt-repository "deb [arch=amd64] https://download.docker.com/linux/debian $(lsb_release -cs) stable"
#sudo apt update
#sudo apt -y install docker-ce

sudo mkdir -p /etc/systemd/system/docker.service.d/
sudo cat <<EOF > /etc/systemd/system/docker.service.d/mount_propagation_flags.conf
[Service]
MountFlags=shared
EOF

sudo systemctl daemon-reload
sudo systemctl restart docker.service

#This is dependent on your Rancher server
sudo docker run -d --privileged --restart=unless-stopped --net=host -v /etc/kubernetes:/etc/kubernetes -v /var/run:/var/run rancher/rancher-agent:v2.1.0-rc9 --server https://75.77.159.159 --token rb8k8kkqw55jqnqbbf4ssdjqtw6hndhfxxcghgv8257kx4p6qsqq55 --ca-checksum 641b2888ce3f1091d20149a495d10457154428f440475b42291b6af1b6c0dd06 --etcd --controlplane --worker

Download the kub config file for the cluster

Helloservice!

After you download the kub config file you can use it by running this command:

export KUBECONFIG=$HOME/.kube/rancher-config

Install Helm on the cluster

git clone https://github.com/polinchw/set-up-tiller

cd set-up-tiller

chmod u+x set-up-tiller.sh

./set-up-tiller.sh

helm init --service-account tiller


Install StorageOS Helm Chart

helm repo add storageos https://charts.storageos.com
helm install --name storageos --namespace storageos-operator --version 1.1.3 storageos/storageoscluster-operator

Add the Storage OS Secret

apiVersion: v1
kind: Secret
metadata:
  name: storageos-api
  namespace: default
  labels:
    app: storageos
type: kubernetes.io/storageos
data:
  # echo -n '<secret>' | base64
  apiUsername: c3RvcmFnZW9z
  apiPassword: c3RvcmFnZW9z


Add the StorageOSCluster

apiVersion: storageos.com/v1
kind: StorageOSCluster
metadata:
  name: example-storageos
  namespace: default
spec:
  secretRefName: storageos-api
  secretRefNamespace: default
  csi:
    enable: true


Set StorageOS as the default storage class

kubectl patch storageclass fast -p ‘{“metadata”: {“annotations”:{“storageclass.kubernetes.io/is-default-class”:”true”}}}’

Using the default Nginx Igress

Rancher automatically installs the nginx ingress controller on all the nodes in the cluster.
If you are able to expose one of the VMs in the cluster to the outside world with a public IP then you can connect to the ingress based services on ports 80 and 443.

Any app you want to be accessible through the default nginx ingress must be added to the ‘default’ project in Rancher.

Tutorial: creazione di un cluster con un’attività Fargate utilizzando la CLI di Amazon ECS

Questo tutorial mostra come configurare un cluster e distribuire un servizio con attività che utilizzano il tipo di lancio Fargate.

Prerequisiti

Verifica i seguenti requisiti preliminari:

Fase 1: Crea il ruolo IAM per l’esecuzione dell’attività

L’agente del container Amazon ECS effettua chiamate all’API di AWS per tuo conto, pertanto richiede una policy e un ruolo IAM che consentano al servizio di stabilire che l’agente appartiene a te. Questo ruolo IAM viene definito un ruolo IAM di esecuzione delle attività. Se disponi già di un ruolo per l’esecuzione delle attività pronto per essere utilizzato, puoi ignorare questa fase. Per ulteriori informazioni, consulta Ruolo IAM per l’esecuzione di attività Amazon ECS.

Per creare il ruolo IAM per l’esecuzione delle attività utilizzando AWS CLI

  1. Crea un file denominato task-execution-assume-role.json con i seguenti contenuti:{ "Version": "2012-10-17", "Statement": [ { "Sid": "", "Effect": "Allow", "Principal": { "Service": "ecs-tasks.amazonaws.com" }, "Action": "sts:AssumeRole" } ] }
  2. Crea il ruolo per l’esecuzione delle attivitàaws iam --region us-west-2 create-role --role-name ecsTaskExecutionRole --assume-role-policy-document file://task-execution-assume-role.json
  3. Collega la policy relativa al ruolo per l’esecuzione delle attività:aws iam --region us-west-2 attach-role-policy --role-name ecsTaskExecutionRole --policy-arn arn:aws:iam::aws:policy/service-role/AmazonECSTaskExecutionRolePolicy

Fase 2: Configura la CLI di Amazon ECS

Per poter effettuare richieste API a tuo nome, la CLI di Amazon ECS necessita delle credenziali, che può estrarre da variabili di ambiente, da un profilo AWS o da un profilo Amazon ECS. Per ulteriori informazioni, consulta Configurazione della CLI di Amazon ECS.

Per creare una configurazione della CLI di Amazon ECS

  1. Crea una configurazione cluster, che definisce la regione AWS, i prefissi di creazione delle risorse e il nome del cluster da utilizzare con la CLI di Amazon ECS:ecs-cli configure --cluster tutorial --default-launch-type FARGATE --config-name tutorial --region us-west-2
  2. Crea un profilo CLI utilizzando l’ID chiave di accesso e la chiave segreta:ecs-cli configure profile --access-key AWS_ACCESS_KEY_ID --secret-key AWS_SECRET_ACCESS_KEY --profile-name tutorial-profile

Fase 3: creare un cluster e configurare il gruppo di sicurezza

Per creare un cluster ECS e un gruppo di sicurezza

  1. Creare un cluster Amazon ECS con il comando ecs-cli up. Poiché nella configurazione del cluster hai specificato Fargate come tipo di lancio predefinito, il comando crea un cluster vuoto e un VPC configurato con due sottoreti pubbliche.ecs-cli up --cluster-config tutorial --ecs-profile tutorial-profilePossono essere necessari alcuni minuti perché le risorse vengano create e il comando venga completato. L’output di questo comando contiene gli ID di sottorete e i VPC creati. Prendi nota di questi ID poiché verranno utilizzati in un secondo momento.
  2. Utilizzando l’AWS CLI, recupera l’ID del gruppo di sicurezza predefinito per il VPC. Utilizza l’ID VPC dell’output precedente:aws ec2 describe-security-groups --filters Name=vpc-id,Values=VPC_ID --region us-west-2L’output di questo comando contiene l’ID del gruppo di sicurezza, utilizzato nella fase successiva.
  3. Tramite l’AWS CLI, aggiungi una regola del gruppo di sicurezza per consentire l’accesso in entrata sulla porta 80:aws ec2 authorize-security-group-ingress --group-id security_group_id --protocol tcp --port 80 --cidr 0.0.0.0/0 --region us-west-2

Fase 4: Crea un file Compose

In questa fase dovrai generare un semplice file Docker Compose che crea un’applicazione Web PHP. Attualmente, la CLI di Amazon ECS supporta le versione 1, 2 e 3 della sintassi del file di Docker Compose. Questo tutorial utilizza Docker Compose v3.

Di seguito è riportato il file di Compose, che puoi denominare docker-compose.yml. Il container web espone la porta 80 per il traffico in entrata al server Web, oltre a configurare il posizionamento dei log di container nel gruppo di log CloudWatch creato in precedenza. Questa riportata è la best practice per le attività Fargate.

version: '3'
services:
  web:
    image: amazon/amazon-ecs-sample
    ports:
      - "80:80"
    logging:
      driver: awslogs
      options: 
        awslogs-group: tutorial
        awslogs-region: us-west-2
        awslogs-stream-prefix: web

Nota

Se il tuo account contiene già un gruppo di log CloudWatch Logs denominato tutorial nella regione us-west-2, scegli un nome univoco in modo che l’interfaccia a riga di comando ECS crei un nuovo gruppo di log per questo tutorial.

Oltre alle informazioni del file di Docker Compose, dovrai specificare alcuni parametri specifici di Amazon ECS necessari per il servizio. Utilizzando gli ID per VPC, sottorete e gruppo di sicurezza ottenuti nel passo precedente, crea un file denominato ecs-params.yml con il seguente contenuto:

version: 1
task_definition:
  task_execution_role: ecsTaskExecutionRole
  ecs_network_mode: awsvpc
  task_size:
    mem_limit: 0.5GB
    cpu_limit: 256
run_params:
  network_configuration:
    awsvpc_configuration:
      subnets:
        - "subnet ID 1"
        - "subnet ID 2"
      security_groups:
        - "security group ID"
      assign_public_ip: ENABLED

Fase 5: Distribuisci il file Compose a un cluster

Dopo aver creato il file Compose, puoi distribuirlo al cluster con il comando ecs-cli compose service up. Per impostazione predefinita, il comando cerca i file denominati docker-compose.yml ed ecs-params.yml nella directory corrente; puoi specificare un altro file di Docker Compose con l’opzione --file e un altro file ECS Params con l’opzione --ecs-params. Per impostazione predefinita, le risorse create da questo comando contengono la directory corrente nel titolo, ma puoi sostituire questo valore con l’opzione --project-name. L’opzione --create-log-groups crea i gruppi di log CloudWatch per i log di container.

ecs-cli compose --project-name tutorial service up --create-log-groups --cluster-config tutorial --ecs-profile tutorial-profile

Fase 6: Visualizza i container in esecuzione su un cluster

Dopo aver distribuito il file Compose, puoi visualizzare i container in esecuzione nel servizio con il comando ecs-cli compose service ps.

ecs-cli compose --project-name tutorial service ps --cluster-config tutorial --ecs-profile tutorial-profile

Output:

Name                                           State    Ports                     TaskDefinition  Health
tutorial/0c2862e6e39e4eff92ca3e4f843c5b9a/web  RUNNING  34.222.202.55:80->80/tcp  tutorial:1      UNKNOWN

Nell’esempio precedente, dal file Compose puoi visualizzare sia il container web, sia l’indirizzo IP e la porta del server Web. Se il browser Web fa riferimento a tale indirizzo, viene visualizzata l’applicazione Web PHP. Nell’output è riportato anche il valore task-id del container. Copia l’ID attività, che ti servirà nella fase successiva.

Fase 7: Visualizza i log di container

Visualizza i log per l’attività:

ecs-cli logs --task-id 0c2862e6e39e4eff92ca3e4f843c5b9a --follow --cluster-config tutorial --ecs-profile tutorial-profile

Nota

L’opzione --follow indica alla CLI di Amazon ECS di eseguire continuamente il polling per i log.

Fase 8: Dimensiona le attività sul cluster

Con il comando ecs-cli compose service scale puoi ampliare il numero di attività per aumentare il numero di istanze dell’applicazione. In questo esempio, il conteggio in esecuzione dell’applicazione viene portato a due.

ecs-cli compose --project-name tutorial service scale 2 --cluster-config tutorial --ecs-profile tutorial-profile

Ora nel cluster saranno presenti due container in più:

ecs-cli compose --project-name tutorial service ps --cluster-config tutorial --ecs-profile tutorial-profile

Output:

Name                                           State    Ports                      TaskDefinition  Health
tutorial/0c2862e6e39e4eff92ca3e4f843c5b9a/web  RUNNING  34.222.202.55:80->80/tcp   tutorial:1      UNKNOWN
tutorial/d9fbbc931d2e47ae928fcf433041648f/web  RUNNING  34.220.230.191:80->80/tcp  tutorial:1      UNKNOWN

Fase 9: visualizzare l’applicazione Web

Inserisci l’indirizzo IP dell’attività nel browser Web per visualizzare una pagina Web contenente l’applicazione Web Simple PHP App (App PHP semplice).

Fase 10: Elimina

Al termine di questo tutorial, dovrai eliminare le risorse in modo che non comportino ulteriori addebiti. Per prima cosa, elimina il servizio: in questo modo i container esistenti verranno interrotti e non tenteranno di eseguire altre attività.

ecs-cli compose --project-name tutorial service down --cluster-config tutorial --ecs-profile tutorial-profile

Ora arresta il cluster per eliminare le risorse create in precedenza con il comando ecs-cli up.

ecs-cli down --force --cluster-config tutorial --ecs-profile tutorial-profile

Ready-to-use commands and tips for kubectl

Image for post

Kubectl is the most important Kubernetes command-line tool that allows you to run commands against clusters. We at Flant internally share our knowledge of using it via formal wiki-like instructions as well as Slack messages (we also have a handy and smart search engine in place — but that’s a whole different story…). Over the years, we have accumulated a large number of various kubectl tips and tricks. Now, we’ve decided to share some of our cheat sheets with a wider community.

I am sure our readers might be familiar with many of them. But still, I hope you will learn something new and, thereby, improve your productivity.

NB: While some of the commands & techniques listed below were compiled by our engineers, others were found on the Web. In the latter case, we checked them thoroughly and found them useful.

Well, let’s get started!

Getting lists of pods and nodes

1. I guess you are all aware of how to get a list of pods across all Kubernetes namespaces using the --all-namespaces flag. Many people are so used to it that they have not noticed the emergence of its shorter version, -A (it exists since at least Kubernetes 1.15).

2. How do you find all non-running pods (i.e., with a state other than Running)?

kubectl get pods -A --field-selector=status.phase!=Running | grep -v Complete
Image for post

By the way, examining the --field-selector flag more closely (see the relevant documentation) might be a good general recommendation.

3. Here is how you can get the list of nodes and their memory size:

kubectl get no -o json | \
jq -r '.items | sort_by(.status.capacity.memory)[]|[.metadata.name,.status.capacity.memory]| @tsv'
Image for post

4. Getting the list of nodes and the number of pods running on them:

kubectl get po -o json --all-namespaces | \
jq '.items | group_by(.spec.nodeName) | map({"nodeName": .[0].spec.nodeName, "count": length}) | sort_by(.count)'
Image for post

5. Sometimes, DaemonSet does not schedule a pod on a node for whatever reason. Manually searching for them is a tedious task, so here is a mini-script to get a list of such nodes:

ns=my-namespace
pod_template=my-pod
kubectl get node | grep -v \"$(kubectl -n ${ns} get pod --all-namespaces -o wide | fgrep ${pod_template} | awk '{print $8}' | xargs -n 1 echo -n "\|" | sed 's/[[:space:]]*//g')\"

6. This is how you can use kubectl top to get a list of pods that eat up CPU and memory resources:

# cpu
kubectl top pods -A | sort --reverse --key 3 --numeric
# memory
kubectl top pods -A | sort --reverse --key 4 --numeric

7. Sorting the list of pods (in this case, by the number of restarts):

kubectl get pods --sort-by=.status.containerStatuses[0].restartCount
Image for post

Of course, you can sort them by other fields, too (see PodStatus and ContainerStatus for details).

Getting other data

1. When tuning the Ingress resource, we inevitably go down to the service itself and then search for pods based on its selector. I used to look for this selector in the service manifest, but later switched to the -o wide flag:

kubectl -n jaeger get svc -o wide
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE SELECTORjaeger-cassandra ClusterIP None <none> 9042/TCP 77d app=cassandracluster,cassandracluster=jaeger-cassandra,cluster=jaeger-cassandra

As you can see, in this case, we get the selector used by our service to find the appropriate pods.

2. Here is how you can easily print limits and requests of each pod:

kubectl get pods -n my-namespace -o=custom-columns='NAME:spec.containers[*].name,MEMREQ:spec.containers[*].resources.requests.memory,MEMLIM:spec.containers[*].resources.limits.memory,CPUREQ:spec.containers[*].resources.requests.cpu,CPULIM:spec.containers[*].resources.limits.cpu'
Image for post

3. The kubectl run command (as well as createapplypatch) has a great feature that allows you to see the expected changes without actually applying them — the --dry-run flag. When it is used with -o yaml, this command outputs the manifest of the required object. For example:

kubectl run test --image=grafana/grafana --dry-run -o yamlapiVersion: apps/v1
kind: Deployment
metadata:
creationTimestamp: null
labels:
run: test
name: test
spec:
replicas: 1
selector:
matchLabels:
run: test
strategy: {}
template:
metadata:
creationTimestamp: null
labels:
run: test
spec:
containers:
- image: grafana/grafana
name: test
resources: {}
status: {}

All you have to do now is to save it to a file, delete a couple of system/unnecessary fields, et voila.

NB: Please note that the kubectl run behavior has been changed in Kubernetes v1.18 (now, it generates Pods instead of Deployments). You can find a great summary on this issue here.

4. Getting a description of the manifest of a given resource:

kubectl explain hpaKIND:     HorizontalPodAutoscaler
VERSION: autoscaling/v1DESCRIPTION:
configuration of a horizontal pod autoscaler.FIELDS:
apiVersion <string>
APIVersion defines the versioned schema of this representation of an
object. Servers should convert recognized schemas to the latest internal
value, and may reject unrecognized values. More info:
https://git.k8s.io/community/contributors/devel/api-conventions.md#resourceskind <string>
Kind is a string value representing the REST resource this object
represents. Servers may infer this from the endpoint the client submits
requests to. Cannot be updated. In CamelCase. More info:
https://git.k8s.io/community/contributors/devel/api-conventions.md#types-kindsmetadata <Object>
Standard object metadata. More info:
https://git.k8s.io/community/contributors/devel/api-conventions.md#metadataspec <Object>
behaviour of autoscaler. More info:
https://git.k8s.io/community/contributors/devel/api-conventions.md#spec-and-status.status <Object>
current information about the autoscaler.

Well, that is a piece of extensive and very helpful information, I must say.

Networking

1. Here is how you can get internal IP addresses of cluster nodes:

kubectl get nodes -o json | \
jq -r '.items[].status.addresses[]? | select (.type == "InternalIP") | .address' | \
paste -sd "\n" -
Image for post

2. And this way, you can print all services and their respective nodePorts:

kubectl get --all-namespaces svc -o json | \
jq -r '.items[] | [.metadata.name,([.spec.ports[].nodePort | tostring ] | join("|"))]| @tsv'

3. In situations where there are problems with the CNI (for example, with Flannel), you have to check the routes to identify the problem pod. Pod subnets that are used in the cluster can be very helpful in this task:

kubectl get nodes -o jsonpath='{.items[*].spec.podCIDR}' | tr " " "\n"
Image for post

Logs

1. Print logs with a human-readable timestamp (if it is not set):

kubectl -n my-namespace logs -f my-pod --timestamps2020-07-08T14:01:59.581788788Z fail: Microsoft.EntityFrameworkCore.Query[10100]

Logs look so much better now, don’t they?

2. You do not have to wait until the entire log of the pod’s container is printed out — just use --tail:

kubectl -n my-namespace logs -f my-pod --tail=50

3. Here is how you can print all the logs from all containers of a pod:

kubectl -n my-namespace logs -f my-pod --all-containers

4. Getting logs from all pods using a label to filter:

kubectl -n my-namespace logs -f -l app=nginx

5. Getting logs of the “previous” container (for example, if it has crashed):

kubectl -n my-namespace logs my-pod --previous

Other quick actions

1. Here is how you can quickly copy secrets from one namespace to another:

kubectl get secrets -o json --namespace namespace-old | \
jq '.items[].metadata.namespace = "namespace-new"' | \
kubectl create-f -

2. Run these two commands to create a self-signed certificate for testing:

openssl req -x509 -nodes -days 365 -newkey rsa:2048 -keyout tls.key -out tls.crt -subj "/CN=grafana.mysite.ru/O=MyOrganization"
kubectl -n myapp create secret tls selfsecret --key tls.key --cert tls.crt

Helpful links on the topic

In lieu of conclusion — here is a small list of similar publications and cheat sheets’ collections we’ve found online:

Image for post

Deploying WordPress and MySQL with Persistent Volumes in Kubernetes

This tutorial shows you how to deploy a WordPress site and a MySQL database using Minikube. Both applications use PersistentVolumes and PersistentVolumeClaims to store data.

PersistentVolume (PV) is a piece of storage in the cluster that has been manually provisioned by an administrator, or dynamically provisioned by Kubernetes using a StorageClass. A PersistentVolumeClaim (PVC) is a request for storage by a user that can be fulfilled by a PV. PersistentVolumes and PersistentVolumeClaims are independent from Pod lifecycles and preserve data through restarting, rescheduling, and even deleting Pods.Warning: This deployment is not suitable for production use cases, as it uses single instance WordPress and MySQL Pods. Consider using WordPress Helm Chart to deploy WordPress in production.Note: The files provided in this tutorial are using GA Deployment APIs and are specific to kubernetes version 1.9 and later. If you wish to use this tutorial with an earlier version of Kubernetes, please update the API version appropriately, or reference earlier versions of this tutorial.

Objectives

  • Create PersistentVolumeClaims and PersistentVolumes
  • Create a kustomization.yaml with
    • a Secret generator
    • MySQL resource configs
    • WordPress resource configs
  • Apply the kustomization directory by kubectl apply -k ./
  • Clean up

Before you begin

You need to have a Kubernetes cluster, and the kubectl command-line tool must be configured to communicate with your cluster. If you do not already have a cluster, you can create one by using Minikube, or you can use one of these Kubernetes playgrounds:

To check the version, enter kubectl version. The example shown on this page works with kubectl 1.14 and above.

Download the following configuration files:

  1. mysql-deployment.yaml
  2. wordpress-deployment.yaml

Create PersistentVolumeClaims and PersistentVolumes

MySQL and WordPress each require a PersistentVolume to store data. Their PersistentVolumeClaims will be created at the deployment step.

Many cluster environments have a default StorageClass installed. When a StorageClass is not specified in the PersistentVolumeClaim, the cluster’s default StorageClass is used instead.

When a PersistentVolumeClaim is created, a PersistentVolume is dynamically provisioned based on the StorageClass configuration.Warning: In local clusters, the default StorageClass uses the hostPath provisioner. hostPath volumes are only suitable for development and testing. With hostPath volumes, your data lives in /tmp on the node the Pod is scheduled onto and does not move between nodes. If a Pod dies and gets scheduled to another node in the cluster, or the node is rebooted, the data is lost.Note: If you are bringing up a cluster that needs to use the hostPath provisioner, the --enable-hostpath-provisioner flag must be set in the controller-manager component.Note: If you have a Kubernetes cluster running on Google Kubernetes Engine, please follow this guide.

Create a kustomization.yaml

Add a Secret generator

Secret is an object that stores a piece of sensitive data like a password or key. Since 1.14, kubectl supports the management of Kubernetes objects using a kustomization file. You can create a Secret by generators in kustomization.yaml.

Add a Secret generator in kustomization.yaml from the following command. You will need to replace YOUR_PASSWORD with the password you want to use.

cat <<EOF >./kustomization.yaml
secretGenerator:
- name: mysql-pass
  literals:
  - password=YOUR_PASSWORD
EOF

Add resource configs for MySQL and WordPress

The following manifest describes a single-instance MySQL Deployment. The MySQL container mounts the PersistentVolume at /var/lib/mysql. The MYSQL_ROOT_PASSWORD environment variable sets the database password from the Secret.

application/wordpress/mysql-deployment.yaml 
apiVersion: v1 kind: Service metadata: name: wordpress labels: app: wordpress spec: ports: - port: 80 selector: app: wordpress tier: frontend type: LoadBalancer --- apiVersion: v1 kind: PersistentVolumeClaim metadata: name: wp-pv-claim labels: app: wordpress spec: accessModes: - ReadWriteOnce resources: requests: storage: 20Gi --- apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2 kind: Deployment metadata: name: wordpress labels: app: wordpress spec: selector: matchLabels: app: wordpress tier: frontend strategy: type: Recreate template: metadata: labels: app: wordpress tier: frontend spec: containers: - image: wordpress:4.8-apache name: wordpress env: - name: WORDPRESS_DB_HOST value: wordpress-mysql - name: WORDPRESS_DB_PASSWORD valueFrom: secretKeyRef: name: mysql-pass key: password ports: - containerPort: 80 name: wordpress volumeMounts: - name: wordpress-persistent-storage mountPath: /var/www/html volumes: - name: wordpress-persistent-storage persistentVolumeClaim: claimName: wp-pv-claim

The following manifest describes a single-instance WordPress Deployment. The WordPress container mounts the PersistentVolume at /var/www/html for website data files. The WORDPRESS_DB_HOST environment variable sets the name of the MySQL Service defined above, and WordPress will access the database by Service. The WORDPRESS_DB_PASSWORD environment variable sets the database password from the Secret kustomize generated.

application/wordpress/wordpress-deployment.yaml 
apiVersion: v1 kind: Service metadata: name: wordpress labels: app: wordpress spec: ports: - port: 80 selector: app: wordpress tier: frontend type: LoadBalancer --- apiVersion: v1 kind: PersistentVolumeClaim metadata: name: wp-pv-claim labels: app: wordpress spec: accessModes: - ReadWriteOnce resources: requests: storage: 20Gi --- apiVersion: apps/v1 # for versions before 1.9.0 use apps/v1beta2 kind: Deployment metadata: name: wordpress labels: app: wordpress spec: selector: matchLabels: app: wordpress tier: frontend strategy: type: Recreate template: metadata: labels: app: wordpress tier: frontend spec: containers: - image: wordpress:4.8-apache name: wordpress env: - name: WORDPRESS_DB_HOST value: wordpress-mysql - name: WORDPRESS_DB_PASSWORD valueFrom: secretKeyRef: name: mysql-pass key: password ports: - containerPort: 80 name: wordpress volumeMounts: - name: wordpress-persistent-storage mountPath: /var/www/html volumes: - name: wordpress-persistent-storage persistentVolumeClaim: claimName: wp-pv-claim
  1. Download the MySQL deployment configuration file.curl -LO https://k8s.io/examples/application/wordpress/mysql-deployment.yaml
  2. Download the WordPress configuration file.curl -LO https://k8s.io/examples/application/wordpress/wordpress-deployment.yaml
  3. Add them to kustomization.yaml file.
cat <<EOF >>./kustomization.yaml
resources:
  - mysql-deployment.yaml
  - wordpress-deployment.yaml
EOF

Apply and Verify

The kustomization.yaml contains all the resources for deploying a WordPress site and a MySQL database. You can apply the directory by

kubectl apply -k ./

Now you can verify that all objects exist.

  1. Verify that the Secret exists by running the following command:kubectl get secrets The response should be like this:NAME TYPE DATA AGE mysql-pass-c57bb4t7mf Opaque 1 9s
  2. Verify that a PersistentVolume got dynamically provisioned.kubectl get pvc Note: It can take up to a few minutes for the PVs to be provisioned and bound.The response should be like this:NAME STATUS VOLUME CAPACITY ACCESS MODES STORAGECLASS AGE mysql-pv-claim Bound pvc-8cbd7b2e-4044-11e9-b2bb-42010a800002 20Gi RWO standard 77s wp-pv-claim Bound pvc-8cd0df54-4044-11e9-b2bb-42010a800002 20Gi RWO standard 77s
  3. Verify that the Pod is running by running the following command:kubectl get pods Note: It can take up to a few minutes for the Pod’s Status to be RUNNING.The response should be like this:NAME READY STATUS RESTARTS AGE wordpress-mysql-1894417608-x5dzt 1/1 Running 0 40s
  4. Verify that the Service is running by running the following command:kubectl get services wordpress The response should be like this:NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE wordpress LoadBalancer 10.0.0.89 <pending> 80:32406/TCP 4m Note: Minikube can only expose Services through NodePort. The EXTERNAL-IP is always pending.
  5. Run the following command to get the IP Address for the WordPress Service:minikube service wordpress --url The response should be like this:http://1.2.3.4:32406
  6. Copy the IP address, and load the page in your browser to view your site.You should see the WordPress set up page similar to the following screenshot.wordpress-init

Warning: Do not leave your WordPress installation on this page. If another user finds it, they can set up a website on your instance and use it to serve malicious content.

Either install WordPress by creating a username and password or delete your instance.

Cleaning up

  1. Run the following command to delete your Secret, Deployments, Services and PersistentVolumeClaims:kubectl delete -k ./

What’s next

What is CI/CD

Our space is about mobile CI/CD but, let’s start with the basics in this article before going into the details of why mobile CI/CD is actually different than other CI/CD flows in our future articles.

Image for post

What is Continuous Integration?

Continuous Integration (CI) is the building process for every code pushed to a repository automatically.

This is of course a very simplified definition and CI contains a number of different components, which are like the building blocks that come together to become a workflow.

A CI workflow may include operations like compiling the app, static code reviews and unit/UI testing with automated tools.

There are a number of different tools for CI/CD, some of the most popular ones are Jenkins, Travis and CircleCI, but mobile CI/CD requires specialization and in our space, we will be talking more about mobile CI/CD tools like Appcircle and Bitrise.


What is Continuous Delivery (or Deployment)?

Continuous Delivery (CD) is the delivery of the application for use with the components like the generation of environment-specific versions, codesigning, version management and deployment to a certain provider.

For mobile apps, the deployment can be done to a testing platform like TestFlight (for end user testing on actual devices) or Appcircle (for actual or virtual device testing).

Similarly, the release version of the app can be deployed to App Store and Google Play for B2C apps or to an enterprise app store for B2E apps.


Finally, what is a CI and CD pipeline?

With all CI/CD components are in place and connected together in an automated manner, they form an application pipeline. You just push your code and the code is processed through the pipeline.

CI is mainly responsible for processing the contents of the pipeline and CD is mainly responsible for directing the contents of the pipeline in the right direction.

In a well-functioning application pipeline, you don’t see the actual contents, it just flows by itself, but you have a full visibility on what is flowing and to where just like an actual pipeline.