- kubectl - BASH autocompletion
- Install k3s
- Namespaces and resource limits
- Persistent volumes (StorageClass - dynamic provisioning)
- Ingress controller
- Cert-Manager (references ingress controller)
- HELM charts
- Kubernetes in action
kubectl - BASH autocompletion
For current shell only:
source <(kubectl completion bash)
Persistent:
echo "source <(kubectl completion bash)" >> ~/.bashrc
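Note: the echo above appends unconditionally on every run. A small idempotent variant (a sketch; the HOOK variable name is ours) only adds the line if it is not there yet:

```shell
# Append the completion hook to ~/.bashrc only if it is not already present,
# so re-running the setup stays idempotent.
HOOK='source <(kubectl completion bash)'
touch ~/.bashrc
grep -qxF "$HOOK" ~/.bashrc || echo "$HOOK" >> ~/.bashrc
```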
Install k3s
On premises
curl -sfL https://get.k3s.io | sh -
If desired, set a memory consumption limit for the systemd unit like so:
root#> mkdir /etc/systemd/system/k3s.service.d
root#> vi /etc/systemd/system/k3s.service.d/limits.conf
[Service]
MemoryMax=1024M
root#> systemctl daemon-reload
root#> systemctl restart k3s
root#> systemctl status k3s
k3s.service - Lightweight Kubernetes
Loaded: loaded (/etc/systemd/system/k3s.service; enabled; vendor preset: enabled)
Drop-In: /etc/systemd/system/k3s.service.d
└─limits.conf
Active: active (running) since Thu 2020-11-26 10:46:26 CET; 13min ago
Docs: https://k3s.io
Process: 9618 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
Process: 9619 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
Main PID: 9620 (k3s-server)
Tasks: 229
Memory: 510.6M (max: 1.0G)
CGroup: /system.slice/k3s.service
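`systemctl show k3s -p MemoryMax` prints the applied limit in bytes; a tiny helper (the mem_mib name is ours) makes that human-readable without depending on a running unit:

```shell
# mem_mib: convert a byte count, as printed by `systemctl show k3s -p MemoryMax`,
# to MiB. Pure arithmetic, so it can be tried without a running k3s unit.
mem_mib() { awk -v b="$1" 'BEGIN { printf "%.1f MiB\n", b / 1048576 }'; }

mem_mib 1073741824   # prints "1024.0 MiB"
```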
Upstream DNS-resolver
Docs: https://rancher.com/docs/rancher/v2.x/en/troubleshooting/dns/
Default: 8.8.8.8 => does not resolve local domains!
- Create a local /etc/resolv.k3s.conf pointing to the IP address of your DNS resolver (127.0.0.1 does not work!)
- vi /etc/systemd/system/k3s.service:
[...]
ExecStart=/usr/local/bin/k3s \
server [...] --resolv-conf /etc/resolv.k3s.conf \
- Re-load systemd config:
systemctl daemon-reload
- Re-start k3s:
systemctl restart k3s.service
- Re-deploy coredns pods:
kubectl -n kube-system delete pod name-of-coredns-pods
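A sketch of the resolver file itself; 192.168.1.1 is a placeholder for your LAN resolver, and the file is built locally first, then installed:

```shell
# Build the dedicated resolver file for k3s. DNS_IP is a placeholder --
# use your real LAN resolver here; 127.0.0.1 does not work for the pods.
DNS_IP=192.168.1.1
printf 'nameserver %s\n' "$DNS_IP" > resolv.k3s.conf
# sudo cp resolv.k3s.conf /etc/resolv.k3s.conf
```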
Change NodePort range to 1 - 65535
- vi /etc/systemd/system/k3s.service:
[...]
ExecStart=/usr/local/bin/k3s \
server [...] --kube-apiserver-arg service-node-port-range=1-65535 \
- Re-load systemd config:
systemctl daemon-reload
- Re-start k3s:
systemctl restart k3s.service
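A variant worth considering (a sketch, same idea as the limits.conf drop-in above): keep the extra flags in a systemd drop-in instead of editing k3s.service, which the k3s installer may overwrite on upgrade. Adjust the server arguments to match your existing ExecStart:

```shell
# Write the drop-in locally, then install it. Note the empty ExecStart= line:
# systemd requires it to clear the old command before setting a new one.
cat > override.conf <<'EOF'
[Service]
ExecStart=
ExecStart=/usr/local/bin/k3s server --kube-apiserver-arg service-node-port-range=1-65535
EOF
# sudo install -D override.conf /etc/systemd/system/k3s.service.d/override.conf
# sudo systemctl daemon-reload && sudo systemctl restart k3s
```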
On Docker with K3d
K3d is a wrapper that deploys a K3s cluster (masters and workers) directly on Docker, without needing a virtual machine for each node (master/worker).
- Prerequisites: a local docker installation without user-namespaces enabled.
- Warning: K3d deploys privileged containers!
curl -s https://raw.githubusercontent.com/rancher/k3d/main/install.sh | bash
Create a K3s cluster without Traefik and without the metrics-server:
k3d cluster create cluster1 \
--agents 2 \
--k3s-server-arg '--disable=traefik' \
--k3s-server-arg '--disable=metrics-server' \
--k3s-server-arg '--kube-apiserver-arg=service-node-port-range=1-65535'
If you encounter helm throwing errors like this one:
Error: Kubernetes cluster unreachable
... just do:
$ kubectl config view --raw > ~/kubeconfig-k3d.yaml
$ export KUBECONFIG=~/kubeconfig-k3d.yaml
Namespaces and resource limits
kubectl apply -f https://gitea.zwackl.de/dominik/k3s/raw/branch/master/namespaces_limits.yaml
Persistent Volumes (StorageClass - dynamic provisioning)
Read more about AccessModes
Rancher Local
https://rancher.com/docs/k3s/latest/en/storage/
Only supports AccessMode: ReadWriteOnce (RWO)
Longhorn (distributed in local cluster)
- Requirements: https://longhorn.io/docs/0.8.0/install/requirements/
- Debian:
apt install open-iscsi
- Install: https://rancher.com/docs/k3s/latest/en/storage/
NFS
For testing purposes and simplicity you may use the following NFS container image:
mkdir -p
docker run -d --name nfs-server \
--net=host \
--privileged \
-v /data/docker/nfs-server/data/:/nfsshare \
-e SHARED_DIRECTORY=/nfsshare \
itsthenetwork/nfs-server-alpine:latest
All Nodes need to have the NFS-client package (Ubuntu: nfs-common) installed
helm repo add ckotzbauer https://ckotzbauer.github.io/helm-charts
helm install my-nfs-client-provisioner --set nfs.server=<nfs-server/ip-addr> --set nfs.path=</data/nfs> ckotzbauer/nfs-client-provisioner
Check if NFS StorageClass is available:
$ kubectl get sc
NAME PROVISIONER RECLAIMPOLICY VOLUMEBINDINGMODE ALLOWVOLUMEEXPANSION AGE
local-path (default) rancher.io/local-path Delete WaitForFirstConsumer false 101d
nfs-client cluster.local/my-nfs-client-provisioner Delete Immediate true 172m
Now you can use nfs-client as StorageClass like so:
apiVersion: apps/v1
kind: StatefulSet
[...]
volumeClaimTemplates:
- metadata:
name: nfs-backend
spec:
accessModes: [ "ReadWriteMany" ]
storageClassName: "nfs-client"
resources:
requests:
storage: 32Mi
or so:
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
name: nfs-pvc-1
namespace: <blubb>
spec:
storageClassName: "nfs-client"
accessModes:
- ReadWriteMany
resources:
requests:
storage: 32Mi
Ingress controller
Disable Traefik-ingress
edit /etc/systemd/system/k3s.service:
[...]
ExecStart=/usr/local/bin/k3s \
server --disable traefik --resolv-conf /etc/resolv.conf \
[...]
Finally systemctl daemon-reload and systemctl restart k3s
Enable K8s own NGINX-ingress with OCSP stapling
Installation
This is the helm chart of the K8s own nginx ingress controller: https://kubernetes.github.io/ingress-nginx/deploy/#using-helm
kubectl create ns ingress-nginx
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install my-release ingress-nginx/ingress-nginx -n ingress-nginx
kubectl -n ingress-nginx get all:
NAME READY STATUS RESTARTS AGE
pod/svclb-my-release-ingress-nginx-controller-m6gxl 2/2 Running 0 110s
pod/my-release-ingress-nginx-controller-695774d99c-t794f 1/1 Running 0 110s
NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
service/my-release-ingress-nginx-controller-admission ClusterIP 10.43.116.191 <none> 443/TCP 110s
service/my-release-ingress-nginx-controller LoadBalancer 10.43.55.41 192.168.178.116 80:31110/TCP,443:31476/TCP 110s
NAME DESIRED CURRENT READY UP-TO-DATE AVAILABLE NODE SELECTOR AGE
daemonset.apps/svclb-my-release-ingress-nginx-controller 1 1 1 1 1 <none> 110s
NAME READY UP-TO-DATE AVAILABLE AGE
deployment.apps/my-release-ingress-nginx-controller 1/1 1 1 110s
NAME DESIRED CURRENT READY AGE
replicaset.apps/my-release-ingress-nginx-controller-695774d99c 1 1 1 110s
As the nginx ingress controller is hungry for memory, let's reduce the number of worker processes to 1:
kubectl -n ingress-nginx edit configmap my-release-ingress-nginx-controller
apiVersion: v1
<<<ADD BEGIN>>>
data:
enable-ocsp: "true"
worker-processes: "1"
<<<ADD END>>>
kind: ConfigMap
[...]
Finally the deployment needs to be restarted:
kubectl -n ingress-nginx rollout restart deployment my-release-ingress-nginx-controller
If you are facing deployment problems like the following one
Error: UPGRADE FAILED: cannot patch "gitea-ingress-staging" with kind Ingress: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://my-release-ingress-nginx-controller-admission.ingress-nginx.svc:443/networking/v1beta1/ingresses?timeout=10s: context deadline exceeded
A possible fix: kubectl -n ingress-nginx delete ValidatingWebhookConfiguration my-release-ingress-nginx-admission
Cert-Manager (references ingress controller)
Installation
Docs: https://hub.helm.sh/charts/jetstack/cert-manager
helm repo add jetstack https://charts.jetstack.io
helm repo update
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.0.2/cert-manager.crds.yaml
kubectl create namespace cert-manager
helm install cert-manager --namespace cert-manager jetstack/cert-manager
kubectl -n cert-manager get all
Let's Encrypt issuer
Docs: https://cert-manager.io/docs/tutorials/acme/ingress/#step-6-configure-let-s-encrypt-issuer
ClusterIssuers are a resource type similar to Issuers. They are specified in exactly the same way,
but they do not belong to a single namespace and can be referenced by Certificate resources from
multiple different namespaces.
lets-encrypt-cluster-issuers.yaml:
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-staging-issuer
spec:
acme:
# You must replace this email address with your own.
# Let's Encrypt will use this to contact you about expiring
# certificates, and issues related to your account.
email: user@example.com
server: https://acme-staging-v02.api.letsencrypt.org/directory
privateKeySecretRef:
# Secret resource that will be used to store the account's private key.
name: letsencrypt-staging-account-key
# Add a single challenge solver, HTTP01 using nginx
solvers:
- http01:
ingress:
class: nginx
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
name: letsencrypt-prod-issuer
spec:
acme:
# The ACME server URL
server: https://acme-v02.api.letsencrypt.org/directory
# Email address used for ACME registration
email: user@example.com
# Name of a secret used to store the ACME account private key
privateKeySecretRef:
name: letsencrypt-prod-account-key
# Enable the HTTP-01 challenge provider
solvers:
- http01:
ingress:
class: nginx
kubectl apply -f lets-encrypt-cluster-issuers.yaml
Deploying a LE-certificate
All you need is an Ingress resource of class nginx which references a ClusterIssuer (letsencrypt-prod-issuer) resource:
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
namespace: <stage>
name: some-ingress-name
annotations:
# use the shared ingress-nginx
kubernetes.io/ingress.class: "nginx"
cert-manager.io/cluster-issuer: "letsencrypt-prod-issuer"
spec:
tls:
- hosts:
- some-certificate.name.san
secretName: target-certificate-secret-name
rules:
- host: some-certificate.name.san
http:
paths:
- path: /
backend:
serviceName: some-target-service
servicePort: some-target-service-port
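Note: the manifest above uses networking.k8s.io/v1beta1, which was removed in Kubernetes 1.22. On newer clusters the same Ingress (same hypothetical names and placeholders; the port number 80 stands in for some-target-service-port) would look like this sketch:

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  namespace: <stage>
  name: some-ingress-name
  annotations:
    cert-manager.io/cluster-issuer: "letsencrypt-prod-issuer"
spec:
  # replaces the kubernetes.io/ingress.class annotation
  ingressClassName: nginx
  tls:
  - hosts:
    - some-certificate.name.san
    secretName: target-certificate-secret-name
  rules:
  - host: some-certificate.name.san
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: some-target-service
            port:
              number: 80   # stand-in for some-target-service-port
```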
Troubleshooting
Docs: https://cert-manager.io/docs/faq/acme/
ClusterIssuers are cluster-scoped (they do not belong to a namespace):
kubectl get clusterissuer
kubectl describe clusterissuer <object>
All other ingress-specific cert-manager resources live in specific namespaces:
kubectl -n <stage> get certificaterequest
kubectl -n <stage> describe certificaterequest <object>
kubectl -n <stage> get certificate
kubectl -n <stage> describe certificate <object>
kubectl -n <stage> get secret
kubectl -n <stage> describe secret <object>
kubectl -n <stage> get challenge
kubectl -n <stage> describe challenge <object>
After a successful setup, perform a TLS test: https://www.ssllabs.com/ssltest/index.html
HELM charts
Docs:
Prerequisites:
- running kubernetes installation
- kubectl with ENV[KUBECONFIG] pointing to appropriate config file
- helm
Create a chart
helm create helm-test
~/kubernetes/helm$ tree helm-test/
helm-test/
├── charts
├── Chart.yaml
├── templates
│ ├── deployment.yaml
│ ├── _helpers.tpl
│ ├── hpa.yaml
│ ├── ingress.yaml
│ ├── NOTES.txt
│ ├── serviceaccount.yaml
│ ├── service.yaml
│ └── tests
│ └── test-connection.yaml
└── values.yaml
Install local chart without packaging
helm install helm-test-dev helm-test/ --set image.tag=latest --debug --wait
or just a dry-run:
helm install helm-test-dev helm-test/ --set image.tag=latest --debug --dry-run
--wait: Waits until all Pods are in a ready state, PVCs are bound, Deployments have their minimum (Desired minus maxUnavailable)
number of Pods ready and Services have an IP address (and Ingress, if a LoadBalancer) before marking the release as successful.
It will wait for as long as the --timeout value; if the timeout is reached, the release will be marked as FAILED. Note: in
scenarios where a Deployment has replicas set to 1 and maxUnavailable is not set to 0 as part of the rolling update strategy,
--wait returns as ready as soon as the minimum number of Pods is in ready condition.
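The "Desired minus maxUnavailable" threshold above is plain arithmetic; a throwaway sketch (the min_ready helper is ours, not a helm feature):

```shell
# min_ready: the smallest number of ready pods with which `helm upgrade --wait`
# considers a Deployment ready (replicas minus maxUnavailable).
min_ready() { echo $(( $1 - $2 )); }

min_ready 3 1   # prints 2
min_ready 1 1   # prints 0 -- the replicas=1 caveat from the note above
```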
List deployed helm charts
~/kubernetes/helm$ helm list
NAME NAMESPACE REVISION UPDATED STATUS CHART APP VERSION
helm-test-dev default 4 2020-08-27 12:30:38.98457042 +0200 CEST deployed helm-test-0.1.0 1.16.0
Upgrade local chart without packaging
~/kubernetes/helm$ helm upgrade helm-test-dev helm-test/ --set image.tag=latest --wait --timeout 60s
Release "helm-test-dev" has been upgraded. Happy Helming!
NAME: helm-test-dev
LAST DEPLOYED: Thu Aug 27 12:47:09 2020
NAMESPACE: default
STATUS: deployed
REVISION: 7
NOTES:
1. Get the application URL by running these commands:
export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=helm-test,app.kubernetes.io/instance=helm-test-dev" -o jsonpath="{.items[0].metadata.name}")
echo "Visit http://127.0.0.1:8080 to use your application"
kubectl --namespace default port-forward $POD_NAME 8080:80
helm upgrade [...] --wait is synchronous and exits with 0 on success, otherwise with >0 on failure. By default helm upgrade waits for 5 minutes; this can be changed with the --timeout flag. This makes it usable for CI/CD deployments with Jenkins.
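A minimal sketch of such a pipeline gate. deploy() is a stand-in so the control flow runs anywhere; the release/chart names in the comment mirror the examples above:

```shell
# Sketch of a CI gate around a synchronous helm upgrade. In a real Jenkins
# stage the body of deploy() would be something like:
#   helm upgrade helm-test-dev helm-test/ --install --wait --timeout 120s
deploy() { return "${DEPLOY_RC:-0}"; }

if deploy; then
  echo "release healthy"
else
  echo "deploy failed"   # a real pipeline could `helm rollback` here
  exit 1
fi
```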
Get status of deployed chart
~/kubernetes/helm$ helm status helm-test-dev
NAME: helm-test-dev
LAST DEPLOYED: Thu Aug 27 12:47:09 2020
NAMESPACE: default
STATUS: deployed
REVISION: 7
NOTES:
1. Get the application URL by running these commands:
export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=helm-test,app.kubernetes.io/instance=helm-test-dev" -o jsonpath="{.items[0].metadata.name}")
echo "Visit http://127.0.0.1:8080 to use your application"
kubectl --namespace default port-forward $POD_NAME 8080:80
Get deployment history
~/kubernetes/helm$ helm history helm-test-dev
REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION
10 Thu Aug 27 12:56:33 2020 failed helm-test-0.1.0 1.16.0 Upgrade "helm-test-dev" failed: timed out waiting for the condition
11 Thu Aug 27 13:08:34 2020 superseded helm-test-0.1.0 1.16.0 Upgrade complete
12 Thu Aug 27 13:09:59 2020 superseded helm-test-0.1.0 1.16.0 Upgrade complete
13 Thu Aug 27 13:10:24 2020 superseded helm-test-0.1.0 1.16.0 Rollback to 11
14 Thu Aug 27 13:23:22 2020 failed helm-test-0.1.1 blubb Upgrade "helm-test-dev" failed: timed out waiting for the condition
15 Thu Aug 27 13:26:43 2020 pending-upgrade helm-test-0.1.1 blubb Preparing upgrade
16 Thu Aug 27 13:27:12 2020 superseded helm-test-0.1.1 blubb Upgrade complete
17 Thu Aug 27 14:32:32 2020 superseded helm-test-0.1.1 Upgrade complete
18 Thu Aug 27 14:33:58 2020 superseded helm-test-0.1.1 Upgrade complete
19 Thu Aug 27 14:36:49 2020 failed helm-test-0.1.1 cosmetics Upgrade "helm-test-dev" failed: timed out waiting for the condition
Rollback
helm rollback helm-test-dev 18 --wait
~/kubernetes/helm$ helm history helm-test-dev
REVISION UPDATED STATUS CHART APP VERSION DESCRIPTION
10 Thu Aug 27 12:56:33 2020 failed helm-test-0.1.0 1.16.0 Upgrade "helm-test-dev" failed: timed out waiting for the condition
11 Thu Aug 27 13:08:34 2020 superseded helm-test-0.1.0 1.16.0 Upgrade complete
12 Thu Aug 27 13:09:59 2020 superseded helm-test-0.1.0 1.16.0 Upgrade complete
13 Thu Aug 27 13:10:24 2020 superseded helm-test-0.1.0 1.16.0 Rollback to 11
14 Thu Aug 27 13:23:22 2020 failed helm-test-0.1.1 blubb Upgrade "helm-test-dev" failed: timed out waiting for the condition
15 Thu Aug 27 13:26:43 2020 pending-upgrade helm-test-0.1.1 blubb Preparing upgrade
16 Thu Aug 27 13:27:12 2020 superseded helm-test-0.1.1 blubb Upgrade complete
17 Thu Aug 27 14:32:32 2020 superseded helm-test-0.1.1 Upgrade complete
18 Thu Aug 27 14:33:58 2020 superseded helm-test-0.1.1 Upgrade complete
19 Thu Aug 27 14:36:49 2020 failed helm-test-0.1.1 cosmetics Upgrade "helm-test-dev" failed: timed out waiting for the condition
20 Thu Aug 27 14:37:36 2020 deployed helm-test-0.1.1 Rollback to 18
~/kubernetes/helm$ helm status helm-test-dev
NAME: helm-test-dev
LAST DEPLOYED: Thu Aug 27 14:37:36 2020
NAMESPACE: default
STATUS: deployed
REVISION: 20
NOTES:
1. Get the application URL by running these commands:
export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=helm-test,app.kubernetes.io/instance=helm-test-dev" -o jsonpath="{.items[0].metadata.name}")
echo "Visit http://127.0.0.1:8080 to use your application"
kubectl --namespace default port-forward $POD_NAME 8080:80
Kubernetes in action
Running DaemonSets on hostPort
- Docs: https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/
- Good article: https://medium.com/stakater/k8s-deployments-vs-statefulsets-vs-daemonsets-60582f0c62d4
In this case no Service networking needs to be configured.
This setup is suitable for legacy scenarios where static IP addresses are required and a NodePort service is not an alternative:
kind: DaemonSet
apiVersion: apps/v1
metadata:
name: netcat-daemonset
labels:
app: netcat-daemonset
spec:
selector:
matchLabels:
app: netcat-daemonset
template:
metadata:
labels:
app: netcat-daemonset
spec:
containers:
- command:
- nc
- -lk
- -p
- "23456"
- -v
- -e
- /bin/true
env:
- name: DEMO_GREETING
value: Hello from the environment
image: dockreg-zdf.int.zwackl.de/alpine/latest/amd64:prod
imagePullPolicy: Always
name: netcat-daemonset
ports:
- containerPort: 23456
hostPort: 23456
protocol: TCP
resources:
limits:
cpu: 500m
memory: 64Mi
requests:
cpu: 50m
memory: 32Mi
restartPolicy: Always
securityContext: {}
terminationGracePeriodSeconds: 30
updateStrategy:
rollingUpdate:
maxUnavailable: 1
type: RollingUpdate
Running StatefulSet with NFS storage
- https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
- NFS dynamic volume provisioning deployed
Be careful: StatefulSets are designed for stateful applications (like databases). To avoid split-brain scenarios StatefulSets behave as statically as possible. If a node goes down, the StatefulSet controller will not reschedule its pods to other functioning nodes; automatic rescheduling only happens for stateless Deployments! In this case you need to force the rescheduling by hand like this:
kubectl delete pod web-1 --grace-period=0 --force
More details on this can be found here
If you want DaemonSet-like Node-affinity on StatefulSets then read this
---
apiVersion: v1
kind: Service
metadata:
name: nginx
labels:
app: nginx
spec:
ports:
- port: 80
name: web
clusterIP: None
selector:
app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
name: web
spec:
selector:
matchLabels:
app: nginx
serviceName: "nginx"
replicas: 2
template:
metadata:
labels:
app: nginx
spec:
terminationGracePeriodSeconds: 10
containers:
- name: nginx
image: nginx:alpine
ports:
- containerPort: 80
name: web
volumeMounts:
- name: nfs-backend
mountPath: /nfs-backend
volumeClaimTemplates:
- metadata:
name: nfs-backend
spec:
accessModes: [ "ReadWriteMany" ]
storageClassName: "nfs-client"
resources:
requests:
storage: 32Mi
Services
Client-IP transparency and loadbalancing
apiVersion: v1
kind: Service
[...]
spec:
type: NodePort
externalTrafficPolicy: <<Local|Cluster>>
[...]
externalTrafficPolicy: Cluster (default) spreads the incoming traffic evenly over all pods. To achieve this the client IP address must be source-NATted, so it is not visible to the pods.
externalTrafficPolicy: Local preserves the original client IP address, which is visible to the pods. In either case (DaemonSet or StatefulSet) traffic stays on the node that received it. With a StatefulSet, if more than one pod of a ReplicaSet is scheduled on the same node, the workload gets balanced over all pods on that node.
Session affinity/persistence
apiVersion: v1
kind: Service
[...]
spec:
type: NodePort
sessionAffinity: <<ClientIP|None>>
sessionAffinityConfig:
clientIP:
timeoutSeconds: 10
[...]
Session persistence is only possible based on the client IP address (sessionAffinity: ClientIP).
What happens if a node goes down?
If a node goes down, Kubernetes marks this node as NotReady, but does nothing else:
$ kubectl get node
NAME STATUS ROLES AGE VERSION
k3s-node2 Ready <none> 103d v1.19.5+k3s2
k3s-master Ready master 103d v1.19.5+k3s2
k3s-node1 NotReady <none> 103d v1.19.5+k3s2
$ kubectl get pod
NAME READY STATUS RESTARTS AGE
ds-test-5mlkt 1/1 Running 14 28h
my-nfs-client-provisioner-57ff8c84c7-p75ck 1/1 Running 0 31m
web-1 1/1 Running 0 26m
web-2 1/1 Running 0 26m
ds-test-c6xx8 1/1 Running 0 18m
ds-test-w45dv 1/1 Running 5 28h
Kubernetes has something like a --pod-eviction-timeout, which is a grace period (default: 5 minutes) for deleting pods on failed nodes. This timeout is useful to keep pods on nodes which are rebooted for maintenance. So, first of all, nothing happens to the pods on failed nodes until the pod eviction timeout is exceeded. After that, Kubernetes reschedules stateless Deployments to working nodes. DaemonSets as well as StatefulSets will not be rescheduled to other nodes at all.
Docs: https://kubernetes.io/docs/concepts/scheduling-eviction/eviction-policy/
Keep your cluster balanced
Kubernetes primarily takes care of high availability, not of an even distribution of pods across nodes. This project could be a solution! Pod/node balance is not an issue for DaemonSets.
Node maintenance
Mark a node for maintenance:
$ kubectl drain k3s-node2 --ignore-daemonsets
$ kubectl get node
NAME STATUS ROLES AGE VERSION
k3s-node1 Ready <none> 105d v1.19.5+k3s2
k3s-master Ready master 105d v1.19.5+k3s2
k3s-node2 Ready,SchedulingDisabled <none> 105d v1.19.5+k3s
All Deployment as well as StatefulSet pods have been rescheduled on remaining nodes. DaemonSet pods were not touched! Node maintenance can be performed now.
To bring the maintained node back into the cluster:
$ kubectl uncordon k3s-node2
node/k3s-node2 uncordoned