* [kubectl - BASH autocompletion](#kubectl-bash-autocompletion)
* [Install k3s](#install-k3s)
* [Configure upstream DNS-resolver](#upstream-dns-resolver)
* [Change NodePort range](#nodeport-range)
* [Namespaces and resource limits](#namespaces-limits)
* [Persistent volumes (StorageClass - dynamic provisioning)](#pv)
  * [Rancher Local](#pv-local)
  * [Rancher Longhorn - distributed in local cluster](#pv-longhorn)
  * [NFS](#pv-nfs)
* [Ingress controller](#ingress-controller)
  * [Disable Traefik-ingress](#disable-traefik-ingress)
  * [Enable NGINX-ingress with OCSP stapling](#enable-nginx-ingress)
    * [Installation](#install-nginx-ingress)
* [Cert-Manager (references ingress controller)](#cert-manager)
  * [Installation](#cert-manager-install)
  * [Let's Encrypt issuer](#cert-manager-le-issuer)
  * [Deploying a LE-certificate](#cert-manager-ingress)
  * [Troubleshooting](#cert-manager-troubleshooting)
* [Keep your cluster balanced](#keep-cluster-balanced)
* [HELM charts](#helm)
  * [Create a chart](#helm-create)
  * [Install local chart without packaging](#helm-install-without-packaging)
  * [List deployed helm charts](#helm-list)
  * [Upgrade local chart without packaging](#helm-upgrade)
  * [Get status of deployed chart](#helm-status)
  * [Get deployment history](#helm-history)
  * [Rollback](#helm-rollback)
* [Kubernetes in action](#kubernetes-in-action)
  * [Running DaemonSets on `hostPort`](#running-daemonsets)
  * [Running StatefulSet with NFS storage](#running-statefulset-nfs)
  * [What happens if a node goes down?](#what-happens-node-down)
  * [Dealing with disruptions](#disruptions)

# kubectl - BASH autocompletion <a name="user-content-kubectl-bash-autocompletion"></a>

For current shell only:
```
source <(kubectl completion bash)
```

Persistent:
```
echo "source <(kubectl completion bash)" >> ~/.bashrc
```

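If you also use a short alias such as `k`, completion can be wired up for the alias as well. A minimal sketch (the alias name `k` is just an example; `__start_kubectl` is the completion function generated by `kubectl completion bash`):
```
echo "alias k=kubectl" >> ~/.bashrc
echo "complete -o default -F __start_kubectl k" >> ~/.bashrc
```
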
# Install k3s <a name="user-content-install-k3s"></a>

https://k3s.io/:
```
curl -sfL https://get.k3s.io | sh -
```

If desired, set a memory limit on the systemd unit like so:
```
root#> mkdir /etc/systemd/system/k3s.service.d
root#> vi /etc/systemd/system/k3s.service.d/limits.conf
[Service]
MemoryMax=1024M

root#> systemctl daemon-reload
root#> systemctl restart k3s

root#> systemctl status k3s
k3s.service - Lightweight Kubernetes
   Loaded: loaded (/etc/systemd/system/k3s.service; enabled; vendor preset: enabled)
  Drop-In: /etc/systemd/system/k3s.service.d
           └─limits.conf
   Active: active (running) since Thu 2020-11-26 10:46:26 CET; 13min ago
     Docs: https://k3s.io
  Process: 9618 ExecStartPre=/sbin/modprobe br_netfilter (code=exited, status=0/SUCCESS)
  Process: 9619 ExecStartPre=/sbin/modprobe overlay (code=exited, status=0/SUCCESS)
 Main PID: 9620 (k3s-server)
    Tasks: 229
   Memory: 510.6M (max: 1.0G)
   CGroup: /system.slice/k3s.service
```

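To add worker nodes to the cluster, the same install script can be run in agent mode. A sketch, assuming the server is reachable as `k3s-master` and the join token has been read from `/var/lib/rancher/k3s/server/node-token` on the server (hostname and token are placeholders for your environment):
```
# on the server: read the join token
root#> cat /var/lib/rancher/k3s/server/node-token

# on each worker node: install k3s in agent mode
curl -sfL https://get.k3s.io | K3S_URL=https://k3s-master:6443 K3S_TOKEN=<node-token> sh -
```
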
# Upstream DNS-resolver <a name="user-content-upstream-dns-resolver"></a>

Docs: https://rancher.com/docs/rancher/v2.x/en/troubleshooting/dns/

Default: 8.8.8.8 => does not resolve local domains!

1. Create a local /etc/resolv.k3s.conf pointing to the IP of your DNS resolver (127.0.0.1 **does not work!**) - see the sketch after this list
2. vi /etc/systemd/system/k3s.service:
```
[...]
ExecStart=/usr/local/bin/k3s \
    server [...] --resolv-conf /etc/resolv.k3s.conf \
```
3. Re-load systemd config: `systemctl daemon-reload`
4. Re-start k3s: `systemctl restart k3s.service`
5. Re-deploy coredns-pods: `kubectl -n kube-system delete pod name-of-coredns-pods`

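A minimal sketch of `/etc/resolv.k3s.conf` (the resolver IP `192.168.178.1` and the search domain are placeholders; use the address of your own LAN DNS server):
```
nameserver 192.168.178.1
search int.example.lan
```
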
# Change NodePort range to 1 - 65535 <a name="user-content-nodeport-range"></a>

1. vi /etc/systemd/system/k3s.service:
```
[...]
ExecStart=/usr/local/bin/k3s \
    server [...] --kube-apiserver-arg service-node-port-range=1-65535 \
```
2. Re-load systemd config: `systemctl daemon-reload`
3. Re-start k3s: `systemctl restart k3s.service`

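With the extended range, a Service can now expose a privileged port directly on every node. A minimal sketch (service name, namespace, selector and the ports are made-up values for illustration):
```
apiVersion: v1
kind: Service
metadata:
  name: some-service
  namespace: <stage>
spec:
  type: NodePort
  selector:
    app: some-app
  ports:
    - port: 443
      targetPort: 8443
      nodePort: 443
```
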
# Namespaces and resource limits <a name="user-content-namespaces-limits"></a>

```
kubectl apply -f https://gitea.zwackl.de/dominik/k3s/raw/branch/master/namespaces_limits.yaml
```

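The referenced file is specific to this cluster. As an illustration of what such a manifest typically contains (namespace name and limit values below are made up, not the contents of the actual file), a namespace plus a `LimitRange` could look like this:
```
apiVersion: v1
kind: Namespace
metadata:
  name: dev
---
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: dev
spec:
  limits:
    - type: Container
      default:
        cpu: 500m
        memory: 128Mi
      defaultRequest:
        cpu: 50m
        memory: 32Mi
```
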
# Persistent Volumes (StorageClass - dynamic provisioning) <a name="user-content-pv"></a>

## Rancher Local <a name="user-content-pv-local"></a>

https://rancher.com/docs/k3s/latest/en/storage/

## Longhorn (distributed in local cluster) <a name="user-content-pv-longhorn"></a>

* Requirements: https://longhorn.io/docs/0.8.0/install/requirements/
* Debian: `apt install open-iscsi`
* Install: https://rancher.com/docs/k3s/latest/en/storage/

## NFS <a name="user-content-pv-nfs"></a>

If you want to use NFS-based storage, deploy the NFS client provisioner via Helm.

**All nodes need to have the NFS client package (Ubuntu: `nfs-common`) installed.**
```
helm3 repo add ckotzbauer https://ckotzbauer.github.io/helm-charts
helm3 install my-nfs-client-provisioner --set nfs.server=<nfs-server/ip-addr> --set nfs.path=</data/nfs> ckotzbauer/nfs-client-provisioner
```
Check if the NFS *StorageClass* is available:
```
$ kubectl get sc
NAME                   PROVISIONER                                 RECLAIMPOLICY   VOLUMEBINDINGMODE      ALLOWVOLUMEEXPANSION   AGE
local-path (default)   rancher.io/local-path                       Delete          WaitForFirstConsumer   false                  101d
nfs-client             cluster.local/my-nfs-client-provisioner     Delete          Immediate              true                   172m
```
Now you can use `nfs-client` as StorageClass, either in a StatefulSet's volume claim template:
```
apiVersion: apps/v1
kind: StatefulSet
[...]
  volumeClaimTemplates:
    - metadata:
        name: nfs-backend
      spec:
        accessModes: [ "ReadWriteMany" ]
        storageClassName: "nfs-client"
        resources:
          requests:
            storage: 32Mi
```
or in a standalone PersistentVolumeClaim:
```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: nfs-pvc-1
  namespace: <blubb>
spec:
  storageClassName: "nfs-client"
  accessModes:
    - ReadWriteMany
  resources:
    requests:
      storage: 32Mi
```

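A standalone PVC like `nfs-pvc-1` can then be mounted into any pod in the same namespace. A minimal sketch (pod name, image and mount path are arbitrary examples):
```
apiVersion: v1
kind: Pod
metadata:
  name: nfs-test-pod
  namespace: <blubb>
spec:
  containers:
    - name: shell
      image: alpine:latest
      command: ["sleep", "3600"]
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: nfs-pvc-1
```
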
# Ingress controller <a name="user-content-ingress-controller"></a>

## Disable Traefik-ingress <a name="user-content-disable-traefik-ingress"></a>

Edit /etc/systemd/system/k3s.service:
```
[...]
ExecStart=/usr/local/bin/k3s \
    server --disable traefik --resolv-conf /etc/resolv.conf \
[...]
```
Finally `systemctl daemon-reload` and `systemctl restart k3s`.

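Alternatively, Traefik can be disabled already at install time by passing the flag through the install script's `INSTALL_K3S_EXEC` environment variable, a sketch:
```
curl -sfL https://get.k3s.io | INSTALL_K3S_EXEC="--disable traefik" sh -
```
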
## Enable K8s own NGINX-ingress with OCSP stapling <a name="user-content-enable-nginx-ingress"></a>

### Installation <a name="user-content-install-nginx-ingress"></a>

This is the Helm chart of the Kubernetes project's own NGINX ingress controller:
https://kubernetes.github.io/ingress-nginx/deploy/#using-helm

```
kubectl create ns ingress-nginx
helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
helm install my-release ingress-nginx/ingress-nginx -n ingress-nginx
```

`kubectl -n ingress-nginx get all`:
```
NAME                                                        READY   STATUS    RESTARTS   AGE
pod/svclb-my-release-ingress-nginx-controller-m6gxl         2/2     Running   0          110s
pod/my-release-ingress-nginx-controller-695774d99c-t794f    1/1     Running   0          110s

NAME                                                     TYPE           CLUSTER-IP      EXTERNAL-IP       PORT(S)                      AGE
service/my-release-ingress-nginx-controller-admission    ClusterIP      10.43.116.191   <none>            443/TCP                      110s
service/my-release-ingress-nginx-controller              LoadBalancer   10.43.55.41     192.168.178.116   80:31110/TCP,443:31476/TCP   110s

NAME                                                         DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/svclb-my-release-ingress-nginx-controller     1         1         1       1            1           <none>          110s

NAME                                                  READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/my-release-ingress-nginx-controller   1/1     1            1           110s

NAME                                                             DESIRED   CURRENT   READY   AGE
replicaset.apps/my-release-ingress-nginx-controller-695774d99c   1         1         1       110s
```
As the NGINX ingress controller is hungry for memory, let's reduce the number of worker processes to 1; while editing the ConfigMap, also enable OCSP stapling:
```
kubectl -n ingress-nginx edit configmap my-release-ingress-nginx-controller

apiVersion: v1
<<<ADD BEGIN>>>
data:
  enable-ocsp: "true"
  worker-processes: "1"
<<<ADD END>>>
kind: ConfigMap
[...]
```
Finally the deployment needs to be restarted:

`kubectl -n ingress-nginx rollout restart deployment my-release-ingress-nginx-controller`

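Once a certificate with an OCSP responder is served through the ingress (see the Cert-Manager section below), stapling can be verified from any client. A sketch using `openssl` (replace `some-certificate.name.san` with one of your ingress hosts):
```
openssl s_client -connect some-certificate.name.san:443 -servername some-certificate.name.san -status </dev/null 2>/dev/null | grep -A 3 "OCSP response"
```
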
**If you are facing deployment problems like the following:**
```
Error: UPGRADE FAILED: cannot patch "gitea-ingress-staging" with kind Ingress: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://my-release-ingress-nginx-controller-admission.ingress-nginx.svc:443/networking/v1beta1/ingresses?timeout=10s: context deadline exceeded
```
A possible fix: `kubectl -n ingress-nginx delete ValidatingWebhookConfiguration my-release-ingress-nginx-admission`

# Cert-Manager (references ingress controller) <a name="user-content-cert-manager"></a>

## Installation <a name="user-content-cert-manager-install"></a>

Docs: https://hub.helm.sh/charts/jetstack/cert-manager
```
helm repo add jetstack https://charts.jetstack.io
helm repo update
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.0.2/cert-manager.crds.yaml
kubectl create namespace cert-manager
helm install cert-manager --namespace cert-manager jetstack/cert-manager
kubectl -n cert-manager get all
```

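Before wiring up Let's Encrypt, the installation can be smoke-tested with a self-signed Issuer and a test Certificate. A sketch (the names and the `cert-manager-test` namespace are arbitrary; delete everything afterwards):
```
apiVersion: v1
kind: Namespace
metadata:
  name: cert-manager-test
---
apiVersion: cert-manager.io/v1
kind: Issuer
metadata:
  name: test-selfsigned
  namespace: cert-manager-test
spec:
  selfSigned: {}
---
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: selfsigned-cert
  namespace: cert-manager-test
spec:
  dnsNames:
    - example.com
  secretName: selfsigned-cert-tls
  issuerRef:
    name: test-selfsigned
```
Then check `kubectl -n cert-manager-test describe certificate selfsigned-cert` for a `Ready` condition and clean up with `kubectl delete ns cert-manager-test`.
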
## Let's Encrypt issuer <a name="user-content-cert-manager-le-issuer"></a>

Docs: https://cert-manager.io/docs/tutorials/acme/ingress/#step-6-configure-let-s-encrypt-issuer
```
ClusterIssuers are a resource type similar to Issuers. They are specified in exactly the same way,
but they do not belong to a single namespace and can be referenced by Certificate resources from
multiple different namespaces.
```

lets-encrypt-cluster-issuers.yaml:
```
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-staging-issuer
spec:
  acme:
    # You must replace this email address with your own.
    # Let's Encrypt will use this to contact you about expiring
    # certificates, and issues related to your account.
    email: user@example.com
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      # Secret resource that will be used to store the account's private key.
      name: letsencrypt-staging-account-key
    # Add a single challenge solver, HTTP01 using nginx
    solvers:
      - http01:
          ingress:
            class: nginx
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-prod-issuer
spec:
  acme:
    # The ACME server URL
    server: https://acme-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: user@example.com
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    # Enable the HTTP-01 challenge provider
    solvers:
      - http01:
          ingress:
            class: nginx
```
`kubectl apply -f lets-encrypt-cluster-issuers.yaml`

## Deploying a LE-certificate <a name="user-content-cert-manager-ingress"></a>

All you need is an `Ingress` resource of class `nginx` which references a ClusterIssuer (`letsencrypt-prod-issuer`) resource:
```
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  namespace: <stage>
  name: some-ingress-name
  annotations:
    # use the shared ingress-nginx
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-prod-issuer"
spec:
  tls:
    - hosts:
        - some-certificate.name.san
      secretName: target-certificate-secret-name
  rules:
    - host: some-certificate.name.san
      http:
        paths:
          - path: /
            backend:
              serviceName: some-target-service
              servicePort: some-target-service-port
```

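`networking.k8s.io/v1beta1` is deprecated and removed in Kubernetes 1.22, so on newer clusters the same ingress has to use the `networking.k8s.io/v1` schema. A sketch with the same placeholder names as above (note the changed `backend` structure and the required `pathType`):
```
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  namespace: <stage>
  name: some-ingress-name
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-prod-issuer"
spec:
  tls:
    - hosts:
        - some-certificate.name.san
      secretName: target-certificate-secret-name
  rules:
    - host: some-certificate.name.san
      http:
        paths:
          - path: /
            pathType: Prefix
            backend:
              service:
                name: some-target-service
                port:
                  number: <some-target-service-port>
```
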
## Troubleshooting <a name="user-content-cert-manager-troubleshooting"></a>

Docs: https://cert-manager.io/docs/faq/acme/

ClusterIssuers are cluster-scoped, so no namespace is needed:
```
kubectl get clusterissuer
kubectl describe clusterissuer <object>
```
All other ingress-specific cert-manager resources run in the `<stage>`-specific namespaces:
```
kubectl -n <stage> get certificaterequest
kubectl -n <stage> describe certificaterequest <object>
kubectl -n <stage> get certificate
kubectl -n <stage> describe certificate <object>
kubectl -n <stage> get secret
kubectl -n <stage> describe secret <object>
kubectl -n <stage> get challenge
kubectl -n <stage> describe challenge <object>
```

After a successful setup, perform a TLS test: `https://www.ssllabs.com/ssltest/index.html`

# Keep your cluster balanced <a name="user-content-keep-cluster-balanced"></a>

Kubernetes primarily takes care of high availability, not of an even distribution of pods across nodes. For *stateless deployments*, [this](https://itnext.io/keep-you-kubernetes-cluster-balanced-the-secret-to-high-availability-17edf60d9cb7) project could be a solution. Pod/node balance is not an issue for *DaemonSets*, which run one pod per node anyway.

# HELM charts <a name="user-content-helm"></a>

Docs:
* https://helm.sh/docs/intro/using_helm/

Prerequisites:
* a running Kubernetes installation
* kubectl with `KUBECONFIG` pointing to the appropriate config file
* helm

## Create a chart <a name="user-content-helm-create"></a>

`helm create helm-test`

```
~/kubernetes/helm$ tree helm-test/
helm-test/
├── charts
├── Chart.yaml
├── templates
│   ├── deployment.yaml
│   ├── _helpers.tpl
│   ├── hpa.yaml
│   ├── ingress.yaml
│   ├── NOTES.txt
│   ├── serviceaccount.yaml
│   ├── service.yaml
│   └── tests
│       └── test-connection.yaml
└── values.yaml
```

## Install local chart without packaging <a name="user-content-helm-install-without-packaging"></a>

`helm install helm-test-dev helm-test/ --set image.tag=latest --debug --wait`

or just a *dry-run*:

`helm install helm-test-dev helm-test/ --set image.tag=latest --debug --dry-run`

```
--wait: Waits until all Pods are in a ready state, PVCs are bound, Deployments have minimum (Desired minus maxUnavailable)
Pods in ready state and Services have an IP address (and Ingress if a LoadBalancer) before marking the release as successful.
It will wait for as long as the --timeout value. If timeout is reached, the release will be marked as FAILED. Note: In
scenarios where Deployment has replicas set to 1 and maxUnavailable is not set to 0 as part of rolling update strategy,
--wait will return as ready as soon as it has satisfied the minimum Pods in ready condition.
```

## List deployed helm charts <a name="user-content-helm-list"></a>

```
~/kubernetes/helm$ helm list
NAME            NAMESPACE   REVISION   UPDATED                                   STATUS     CHART             APP VERSION
helm-test-dev   default     4          2020-08-27 12:30:38.98457042 +0200 CEST   deployed   helm-test-0.1.0   1.16.0
```

## Upgrade local chart without packaging <a name="user-content-helm-upgrade"></a>

```
~/kubernetes/helm$ helm upgrade helm-test-dev helm-test/ --set image.tag=latest --wait --timeout 60s
Release "helm-test-dev" has been upgraded. Happy Helming!
NAME: helm-test-dev
LAST DEPLOYED: Thu Aug 27 12:47:09 2020
NAMESPACE: default
STATUS: deployed
REVISION: 7
NOTES:
1. Get the application URL by running these commands:
  export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=helm-test,app.kubernetes.io/instance=helm-test-dev" -o jsonpath="{.items[0].metadata.name}")
  echo "Visit http://127.0.0.1:8080 to use your application"
  kubectl --namespace default port-forward $POD_NAME 8080:80
```
`helm upgrade [...] --wait` is synchronous: it exits with 0 on success and >0 on failure. Without `--timeout`, `--wait` waits for the default of 5 minutes; setting `--timeout` (as above) shortens or extends this. This makes it usable for CI/CD deployments, e.g. with Jenkins.

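A sketch of how the exit code can drive a CI/CD pipeline step (release and chart names as above, timeout chosen arbitrarily):
```
#!/bin/sh
# deploy and roll back automatically if the upgrade does not become ready in time
if ! helm upgrade helm-test-dev helm-test/ --set image.tag=latest --wait --timeout 60s; then
  echo "Upgrade failed - rolling back"
  helm rollback helm-test-dev --wait
  exit 1
fi
```
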
## Get status of deployed chart <a name="user-content-helm-status"></a>

```
~/kubernetes/helm$ helm status helm-test-dev
NAME: helm-test-dev
LAST DEPLOYED: Thu Aug 27 12:47:09 2020
NAMESPACE: default
STATUS: deployed
REVISION: 7
NOTES:
1. Get the application URL by running these commands:
  export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=helm-test,app.kubernetes.io/instance=helm-test-dev" -o jsonpath="{.items[0].metadata.name}")
  echo "Visit http://127.0.0.1:8080 to use your application"
  kubectl --namespace default port-forward $POD_NAME 8080:80
```

## Get deployment history <a name="user-content-helm-history"></a>

```
~/kubernetes/helm$ helm history helm-test-dev
REVISION   UPDATED                    STATUS            CHART             APP VERSION   DESCRIPTION
10         Thu Aug 27 12:56:33 2020   failed            helm-test-0.1.0   1.16.0        Upgrade "helm-test-dev" failed: timed out waiting for the condition
11         Thu Aug 27 13:08:34 2020   superseded        helm-test-0.1.0   1.16.0        Upgrade complete
12         Thu Aug 27 13:09:59 2020   superseded        helm-test-0.1.0   1.16.0        Upgrade complete
13         Thu Aug 27 13:10:24 2020   superseded        helm-test-0.1.0   1.16.0        Rollback to 11
14         Thu Aug 27 13:23:22 2020   failed            helm-test-0.1.1   blubb         Upgrade "helm-test-dev" failed: timed out waiting for the condition
15         Thu Aug 27 13:26:43 2020   pending-upgrade   helm-test-0.1.1   blubb         Preparing upgrade
16         Thu Aug 27 13:27:12 2020   superseded        helm-test-0.1.1   blubb         Upgrade complete
17         Thu Aug 27 14:32:32 2020   superseded        helm-test-0.1.1                 Upgrade complete
18         Thu Aug 27 14:33:58 2020   superseded        helm-test-0.1.1                 Upgrade complete
19         Thu Aug 27 14:36:49 2020   failed            helm-test-0.1.1   cosmetics     Upgrade "helm-test-dev" failed: timed out waiting for the condition
```

## Rollback <a name="user-content-helm-rollback"></a>

`helm rollback helm-test-dev 18 --wait`
```
~/kubernetes/helm$ helm history helm-test-dev
REVISION   UPDATED                    STATUS            CHART             APP VERSION   DESCRIPTION
10         Thu Aug 27 12:56:33 2020   failed            helm-test-0.1.0   1.16.0        Upgrade "helm-test-dev" failed: timed out waiting for the condition
11         Thu Aug 27 13:08:34 2020   superseded        helm-test-0.1.0   1.16.0        Upgrade complete
12         Thu Aug 27 13:09:59 2020   superseded        helm-test-0.1.0   1.16.0        Upgrade complete
13         Thu Aug 27 13:10:24 2020   superseded        helm-test-0.1.0   1.16.0        Rollback to 11
14         Thu Aug 27 13:23:22 2020   failed            helm-test-0.1.1   blubb         Upgrade "helm-test-dev" failed: timed out waiting for the condition
15         Thu Aug 27 13:26:43 2020   pending-upgrade   helm-test-0.1.1   blubb         Preparing upgrade
16         Thu Aug 27 13:27:12 2020   superseded        helm-test-0.1.1   blubb         Upgrade complete
17         Thu Aug 27 14:32:32 2020   superseded        helm-test-0.1.1                 Upgrade complete
18         Thu Aug 27 14:33:58 2020   superseded        helm-test-0.1.1                 Upgrade complete
19         Thu Aug 27 14:36:49 2020   failed            helm-test-0.1.1   cosmetics     Upgrade "helm-test-dev" failed: timed out waiting for the condition
20         Thu Aug 27 14:37:36 2020   deployed          helm-test-0.1.1                 Rollback to 18
```
```
~/kubernetes/helm$ helm status helm-test-dev
NAME: helm-test-dev
LAST DEPLOYED: Thu Aug 27 14:37:36 2020
NAMESPACE: default
STATUS: deployed
REVISION: 20
NOTES:
1. Get the application URL by running these commands:
  export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=helm-test,app.kubernetes.io/instance=helm-test-dev" -o jsonpath="{.items[0].metadata.name}")
  echo "Visit http://127.0.0.1:8080 to use your application"
  kubectl --namespace default port-forward $POD_NAME 8080:80
```

# Kubernetes in action <a name="user-content-kubernetes-in-action"></a>

## Running DaemonSets on `hostPort` <a name="user-content-running-daemonsets"></a>

* Docs: https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/
* Good article: https://medium.com/stakater/k8s-deployments-vs-statefulsets-vs-daemonsets-60582f0c62d4

With `hostPort`, no Service has to be configured; the pod is reachable directly on every node's IP.

This setup is suitable for legacy scenarios where static IP addresses are required and a NodePort service is not an alternative:
```
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: netcat-daemonset
  labels:
    app: netcat-daemonset
spec:
  selector:
    matchLabels:
      app: netcat-daemonset
  template:
    metadata:
      labels:
        app: netcat-daemonset
    spec:
      containers:
        - command:
            - nc
            - -lk
            - -p
            - "23456"
            - -v
            - -e
            - /bin/true
          env:
            - name: DEMO_GREETING
              value: Hello from the environment
          image: dockreg-zdf.int.zwackl.de/alpine/latest/amd64:prod
          imagePullPolicy: Always
          name: netcat-daemonset
          ports:
            - containerPort: 23456
              hostPort: 23456
              protocol: TCP
          resources:
            limits:
              cpu: 500m
              memory: 64Mi
            requests:
              cpu: 50m
              memory: 32Mi
      restartPolicy: Always
      securityContext: {}
      terminationGracePeriodSeconds: 30
  updateStrategy:
    rollingUpdate:
      maxUnavailable: 1
    type: RollingUpdate
```

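To check that the DaemonSet actually answers on every node, the `hostPort` can be probed from any machine in the network. A sketch (replace `<node-ip>` with the IP of one of your nodes):
```
nc -vz <node-ip> 23456
```
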
## Running StatefulSet with NFS storage <a name="user-content-running-statefulset-nfs"></a>

* https://kubernetes.io/docs/concepts/workloads/controllers/statefulset/
* [NFS dynamic volume provisioning deployed](#pv-nfs)

**Be careful:** *StatefulSets* are designed for stateful applications (like databases). To avoid split-brain scenarios, StatefulSets behave as statically as possible. If a node goes down, the StatefulSet controller will not reschedule the pods to other functioning nodes - only stateless *Deployments* are rescheduled automatically. In this case you need to force the rescheduling by hand like this:

`kubectl delete pod web-1 --grace-period=0 --force`

More details on this can be found [here](https://kubernetes.io/docs/tasks/run-application/force-delete-stateful-set-pod/)

If you want DaemonSet-like node affinity on StatefulSets, read [this](https://medium.com/@johnjjung/building-a-kubernetes-daemonstatefulset-30ad0592d8cb)
```
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
  labels:
    app: nginx
spec:
  ports:
    - port: 80
      name: web
  clusterIP: None
  selector:
    app: nginx
---
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: web
spec:
  selector:
    matchLabels:
      app: nginx
  serviceName: "nginx"
  replicas: 2
  template:
    metadata:
      labels:
        app: nginx
    spec:
      terminationGracePeriodSeconds: 10
      containers:
        - name: nginx
          image: nginx:alpine
          ports:
            - containerPort: 80
              name: web
          volumeMounts:
            - name: nfs-backend
              mountPath: /nfs-backend
  volumeClaimTemplates:
    - metadata:
        name: nfs-backend
      spec:
        accessModes: [ "ReadWriteMany" ]
        storageClassName: "nfs-client"
        resources:
          requests:
            storage: 32Mi
```

## What happens if a node goes down? <a name="user-content-what-happens-node-down"></a>

If a node goes down, Kubernetes marks this node as *NotReady*, but nothing else:
```
$ kubectl get node
NAME         STATUS     ROLES    AGE    VERSION
k3s-node2    Ready      <none>   103d   v1.19.5+k3s2
k3s-master   Ready      master   103d   v1.19.5+k3s2
k3s-node1    NotReady   <none>   103d   v1.19.5+k3s2

$ kubectl get pod
NAME                                         READY   STATUS    RESTARTS   AGE
ds-test-5mlkt                                1/1     Running   14         28h
my-nfs-client-provisioner-57ff8c84c7-p75ck   1/1     Running   0          31m
web-1                                        1/1     Running   0          26m
web-2                                        1/1     Running   0          26m
ds-test-c6xx8                                1/1     Running   0          18m
ds-test-w45dv                                1/1     Running   5          28h
```
Kubernetes has a *pod eviction timeout* (`--pod-eviction-timeout`), a grace period (**default: 5 minutes**) before pods on failed nodes are deleted. This timeout is useful to keep pods on nodes that are only rebooted for maintenance. So, first of all, nothing happens to the pods on a failed node until the *pod eviction timeout* is exceeded. Once it is, Kubernetes reschedules *stateless Deployments* onto working nodes; *DaemonSets* and *StatefulSets* will not be rescheduled onto other nodes at all.

Docs: https://kubernetes.io/docs/concepts/scheduling-eviction/eviction-policy/

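On clusters using taint-based eviction, the per-pod waiting time can be tuned via tolerations in the pod template instead of the global flag. A sketch of such a fragment (the 60-second value is an arbitrary example; the default toleration added by the admission controller is 300 seconds):
```
spec:
  tolerations:
    - key: "node.kubernetes.io/not-ready"
      operator: "Exists"
      effect: "NoExecute"
      tolerationSeconds: 60
    - key: "node.kubernetes.io/unreachable"
      operator: "Exists"
      effect: "NoExecute"
      tolerationSeconds: 60
```
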
## Dealing with disruptions <a name="user-content-disruptions"></a>

* https://kubernetes.io/docs/concepts/workloads/pods/disruptions/
* https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/
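
For voluntary disruptions (e.g. `kubectl drain` during node maintenance), a `PodDisruptionBudget` keeps a minimum number of replicas running. A minimal sketch for the nginx StatefulSet from the previous section (the `minAvailable` value is an arbitrary example):
```
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: web-pdb
spec:
  minAvailable: 1
  selector:
    matchLabels:
      app: nginx
```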