* [kubectl - BASH autocompletion](#kubectl-bash-autocompletion)
* [Install k3s](#install-k3s)
    * [On premises/IaaS](#install-k3s-on-premises)
        * [Configure upstream DNS-resolver](#upstream-dns-resolver)
        * [Change NodePort range](#nodeport-range)
        * [Install Canal as NetworkPolicy controller](#canal)
        * [Clustering](#clustering)
    * [On Docker with k3d](#install-k3s-on-docker-k3d)
* [Namespaces and resource limits](#namespaces-limits)
* [Persistent volumes (StorageClass - dynamic provisioning)](#pv)
    * [Rancher Local (k3s default)](#pv-local)
    * [Rancher Longhorn (distributed in local cluster) - MY FAVOURITE :-)](#pv-longhorn)
        * [Custom StorageClass](#pv-longhorn-custom-storageclass)
        * [Volume backups with S3 (compatible) storage](#pv-longhorn-s3-backup)
* [Ingress controller](#ingress-controller)
    * [Disable Traefik-ingress](#disable-traefik-ingress)
    * [Enable NGINX-ingress with OCSP stapling](#enable-nginx-ingress)
        * [Installation](#install-nginx-ingress)
* [Cert-Manager (references ingress controller)](#cert-manager)
    * [Installation](#cert-manager-install)
    * [Cluster-internal CA issuer](#cert-manager-cluster-ca-issuer)
    * [Let's Encrypt (HTTP-01/DNS-01) issuer](#cert-manager-le-issuer)
    * [Deploying a LE-certificate with ingress](#cert-manager-ingress)
    * [Deploying a LE-certificate by CRD](#cert-manager-crd)
    * [Troubleshooting](#cert-manager-troubleshooting)
* [Cluster monitoring](#cluster-monitoring)
    * [Log correlation with Loki-stack](#loki-stack)
    * [Metrics with Prometheus-stack + Grafana](#prometheus-grafana)
* [HELM charts](#helm)
    * [Create a chart](#helm-create)
    * [Install local chart without packaging](#helm-install-without-packaging)
    * [List deployed helm charts](#helm-list)
    * [Upgrade local chart without packaging](#helm-upgrade)
    * [Get status of deployed chart](#helm-status)
    * [Get deployment history](#helm-history)
    * [Rollback](#helm-rollback)
* [Kubernetes in action](#kubernetes-in-action)
    * [Running DaemonSets with `hostNetwork: true`](#running-daemonsets)
    * [Services](#services)
        * [Client-IP transparency and loadbalancing](#services-client-ip-transparency)
        * [Session affinity/persistence](#services-session-persistence)
    * [Keeping the cluster balanced](#keep-cluster-balanced)
    * [Node maintenance](#node-maintenance)
    * [What happens if a node goes down?](#what-happens-node-down)
    * [Dealing with disruptions](#disruptions)
* [Troubleshooting](#troubleshooting)
    * [Deleting a stuck namespace](#ts-delete-stuck-namespace)
    * [Deleting stuck CRDs](#ts-delete-stuck-crd)

# kubectl - BASH autocompletion <a name="kubectl-bash-autocompletion"></a>

For the current shell only:
```
source <(kubectl completion bash)
```
Persistent:
```
echo "source <(kubectl completion bash)" >> ~/.bashrc
```

# Install k3s <a name="install-k3s"></a>

## On premises/IaaS <a name="install-k3s-on-premises"></a>

https://k3s.io/:
```
curl -sfL https://get.k3s.io | sh -
```

### Upstream DNS-resolver <a name="upstream-dns-resolver"></a>

Docs: https://rancher.com/docs/rancher/v2.x/en/troubleshooting/dns/

Default: 8.8.8.8 => does not resolve local domains!

1. Create a local /etc/resolv.k3s.conf pointing at the IP of your DNS-resolver (127.0.0.1 **does not work!**) - see the sketch after this list
2. vi /etc/systemd/system/k3s.service:
```
[...]
ExecStart=/usr/local/bin/k3s \
    server [...] --resolv-conf /etc/resolv.k3s.conf \
```
3. Reload the systemd config: `systemctl daemon-reload`
4. Restart k3s: `systemctl restart k3s.service`
5. Re-deploy the coredns pods: `kubectl -n kube-system delete pod name-of-coredns-pods`

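A minimal sketch of what `/etc/resolv.k3s.conf` could look like; the resolver address `192.168.1.10` and the search domain `int.example.org` are placeholders for your environment:
```
# /etc/resolv.k3s.conf - example only, adjust to your environment
nameserver 192.168.1.10
search int.example.org
```
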
### Change NodePort range to 1 - 65535 <a name="nodeport-range"></a>

1. vi /etc/systemd/system/k3s.service:
```
[...]
ExecStart=/usr/local/bin/k3s \
    server [...] --kube-apiserver-arg service-node-port-range=1-65535 \
```
2. Reload the systemd config: `systemctl daemon-reload`
3. Restart k3s: `systemctl restart k3s.service`

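With the extended range, a Service can expose a privileged port directly as a NodePort. A minimal sketch (service name, selector and ports are placeholders):
```
apiVersion: v1
kind: Service
metadata:
  name: some-service
spec:
  type: NodePort
  selector:
    app: some-app
  ports:
  - port: 443
    targetPort: 8443
    nodePort: 443   # only possible after extending service-node-port-range
```
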
### Install Canal as NetworkPolicy controller <a name="canal"></a>

1. Download the Canal yaml manifest: `wget https://docs.projectcalico.org/manifests/canal.yaml -O canal.yaml`
1. Find and enable (uncomment) the env variable `CALICO_IPV4POOL_CIDR`
1. Set the value of `CALICO_IPV4POOL_CIDR` to `10.42.0.0/16` (or your value of `--cluster-cidr` - k3s defaults to `10.42.0.0/16`)
1. Apply the manifest: `kubectl apply -f canal.yaml`
1. Wait a moment and then check whether canal was installed successfully
```
kubectl -n kube-system get pod | grep canal
canal-drmrl    2/2     Running   0          1m
[...]
```

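Once Canal is running, you can verify that NetworkPolicies are actually enforced, e.g. with a default-deny policy for ingress traffic in a test namespace (a sketch - the namespace name is a placeholder):
```
apiVersion: networking.k8s.io/v1
kind: NetworkPolicy
metadata:
  name: default-deny-ingress
  namespace: some-test-namespace
spec:
  podSelector: {}        # selects all pods in the namespace
  policyTypes:
  - Ingress              # no ingress rules defined => all inbound traffic is denied
```
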
### Clustering <a name="clustering"></a>

If you want to build a K3s-cluster, the default networking model is *overlay@VXLAN*. In this case make sure that

* all of your nodes can reach (ping) each other over the underlying network (local, routed/vpn). This is required for the overlay network to work properly. VXLAN spans a meshed network over all K3s-nodes.
* if your nodes are spread over public networks (like the internet), use a VPN (like IPsec or OpenVPN) to secure the traffic between the nodes. **VXLAN uses plain UDP for transport!**
* if your nodes are connected through VPN, `flannel` (the overlay network daemon) should explicitly communicate via the VPN network interface instead of the public network interface. The following settings should be made on the nodes:
```
/etc/systemd/system/k3s-agent.service:

[...]
ExecStartPre=sleep 60
ExecStart=/usr/local/bin/k3s \
    agent \
        --flannel-iface <name-of-vpn-interface> \
```
* if your public/external nodes are connected through VPN and you have configured [canal](https://github.com/projectcalico/canal) to manage NetworkPolicies, you will need to edit the node config and change the public IP-addresses (in this example: `1.2.3.4`) of your nodes to the internal VPN-IPs (in this example: `172.16.1.2`). Otherwise canal will bypass the VPN and route VXLAN traffic through the public IP addresses:
```
kubectl edit node <external-node-01>

apiVersion: v1
kind: Node
metadata:
  annotations:
    alpha.kubernetes.io/provided-node-ip: 172.16.1.2
    [...]
    flannel.alpha.coreos.com/backend-data: '{"VtepMAC":"ce:09:ce:de:4d:36"}'
    flannel.alpha.coreos.com/backend-type: vxlan
    flannel.alpha.coreos.com/kube-subnet-manager: "true"
>> DEL >>    flannel.alpha.coreos.com/public-ip: 1.2.3.4
>> ADD >>    flannel.alpha.coreos.com/public-ip: 172.16.1.2
    [...]
```

## On Docker with K3d <a name="install-k3s-on-docker-k3d"></a>

K3d is a lightweight wrapper that deploys a K3s cluster (masters and workers) directly on Docker, without the need for a virtual machine per node (master/worker).

* Prerequisites: a local docker installation **without user-namespaces enabled**.
* **Warning**: K3d deploys privileged containers!

https://k3d.io/:
```
curl -s https://raw.githubusercontent.com/rancher/k3d/main/install.sh | bash
```
Create a K3s cluster without `traefik`:
```
k3d cluster create cluster1 \
    --agents 2 \
    --k3s-server-arg '--disable=traefik' \
    --k3s-server-arg '--kube-apiserver-arg=service-node-port-range=1-65535'
```
If you encounter `helm` throwing errors like this one:
```
Error: Kubernetes cluster unreachable
```
... just do:
```
$ kubectl config view --raw > ~/kubeconfig-k3d.yaml
$ export KUBECONFIG=~/kubeconfig-k3d.yaml
```
If you need to change the upstream DNS-resolver:
```
kubectl -n kube-system edit configmap coredns
```
Find the line containing
```
forward . /etc/resolv.conf
```
and change the content to
```
forward . ipaddr.of.your.dns-resolver
```
Finally re-deploy the CoreDNS deployment with:

`kubectl -n kube-system rollout restart deployment coredns`

**Note:** If you restart the cluster (`k3d cluster stop your-cluster` and `k3d cluster start your-cluster`), the changes will be gone!

# Namespaces and resource limits <a name="namespaces-limits"></a>

```
kubectl apply -f https://gitea.zwackl.de/dominik/k3s/raw/branch/master/namespaces_limits.yaml
```

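The referenced manifest is not reproduced here. As a rough sketch of what such a manifest typically contains (the namespace name and all values are placeholders, not taken from the file above), a namespace with a LimitRange and a ResourceQuota could look like this:
```
apiVersion: v1
kind: Namespace
metadata:
  name: staging
---
apiVersion: v1
kind: LimitRange
metadata:
  name: default-limits
  namespace: staging
spec:
  limits:
  - type: Container
    default:            # limits applied to containers without explicit limits
      cpu: 500m
      memory: 256Mi
    defaultRequest:     # requests applied to containers without explicit requests
      cpu: 100m
      memory: 128Mi
---
apiVersion: v1
kind: ResourceQuota
metadata:
  name: default-quota
  namespace: staging
spec:
  hard:
    requests.cpu: "2"
    requests.memory: 2Gi
    limits.cpu: "4"
    limits.memory: 4Gi
```
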
# Persistent Volumes (StorageClass - dynamic provisioning) <a name="pv"></a>

Read more about [AccessModes](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes)

## Rancher Local (k3s default) <a name="pv-local"></a>

https://rancher.com/docs/k3s/latest/en/storage/

Only supports *AccessMode*: ReadWriteOnce (RWO)

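A minimal PVC sketch against the k3s default `local-path` StorageClass (claim name and size are placeholders):
```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: some-claim
spec:
  accessModes:
  - ReadWriteOnce          # the only mode supported by the local-path provisioner
  storageClassName: local-path
  resources:
    requests:
      storage: 1Gi
```
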
## Rancher Longhorn (distributed in local cluster) - MY FAVOURITE :-) <a name="pv-longhorn"></a>

* Requirements: https://longhorn.io/docs/0.8.0/install/requirements/
    * Debian/Ubuntu: `apt install open-iscsi`
    * Ubuntu: uninstall multipathd as it can interfere with iscsid: `apt purge multipath-tools`
* Install: https://rancher.com/docs/k3s/latest/en/storage/

### Custom StorageClass <a name="pv-longhorn-custom-storageclass"></a>

The following StorageClass `longhorn-2r` defines 2 `replicas`, no `dataLocality` and EXT4 as the filesystem:
```
allowVolumeExpansion: true
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  annotations:
    longhorn.io/last-applied-configmap: |
      kind: StorageClass
      apiVersion: storage.k8s.io/v1
      metadata:
        name: longhorn-2r
        annotations:
          storageclass.kubernetes.io/is-default-class: "true"
      provisioner: driver.longhorn.io
      allowVolumeExpansion: true
      reclaimPolicy: "Delete"
      volumeBindingMode: Immediate
      parameters:
        numberOfReplicas: "2"
        staleReplicaTimeout: "30"
        fromBackup: ""
        fsType: "ext4"
        dataLocality: "disabled"
    storageclass.kubernetes.io/is-default-class: "true"
  name: longhorn-2r
parameters:
  fromBackup: ""
  fsType: ext4
  numberOfReplicas: "2"
  staleReplicaTimeout: "30"
  dataLocality: "disabled"
provisioner: driver.longhorn.io
reclaimPolicy: Delete
volumeBindingMode: Immediate
```

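A PVC consuming this StorageClass might look like this (a sketch; claim name, namespace and size are placeholders):
```
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: some-longhorn-claim
  namespace: staging
spec:
  accessModes:
  - ReadWriteOnce
  storageClassName: longhorn-2r   # the custom class defined above
  resources:
    requests:
      storage: 2Gi
```
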
### Volume backups with S3 (compatible) storage <a name="pv-longhorn-s3-backup"></a>

If you do not want to expose your volume backups to a public cloud (e.g. AWS), you need to provide a local S3 storage server. This can easily be done with [minio](https://longhorn.io/docs/1.2.2/snapshots-and-backups/backup-and-restore/set-backup-target/#set-up-a-local-testing-backupstore):
```
apiVersion: v1
kind: Secret
metadata:
  name: s3-backup-secret
  namespace: longhorn-system
type: Opaque
data:
  AWS_ACCESS_KEY_ID: bG9uZ2hvcm4= # Base64: <THE access key ID>
  AWS_SECRET_ACCESS_KEY: Qmx1YmIxMjM0IQ== # Base64: <THE access key secret>
  AWS_ENDPOINTS: aHR0cHM6Ly95b3VyLnMzLmVuZHBvaW50 # Base64: <https://your.s3.endpoint>
```
After creating the secret, point Longhorn at the bucket in *Settings → General*: set *Backup Target* (e.g. `s3://your-bucket@your-region/`) and *Backup Target Credential Secret* to `s3-backup-secret`.

# Ingress controller <a name="ingress-controller"></a>

## Disable Traefik-ingress <a name="disable-traefik-ingress"></a>

edit /etc/systemd/system/k3s.service:
```
[...]
ExecStart=/usr/local/bin/k3s \
    server [...] --disable traefik \
[...]
```
Finally `systemctl daemon-reload` and `systemctl restart k3s`

## Enable K8s own NGINX-ingress with OCSP stapling <a name="enable-nginx-ingress"></a>

### Installation <a name="install-nginx-ingress"></a>

This is the helm chart of the Kubernetes community's own nginx ingress controller:
https://kubernetes.github.io/ingress-nginx/deploy/#using-helm

```
$ kubectl create ns ingress-nginx
$ helm repo add ingress-nginx https://kubernetes.github.io/ingress-nginx
$ helm -n ingress-nginx install ingress-nginx ingress-nginx/ingress-nginx
```
```
$ kubectl -n ingress-nginx get all

NAME                                            READY   STATUS    RESTARTS   AGE
pod/svclb-nginx-ingress-controller-m6gxl        2/2     Running   0          110s
pod/nginx-ingress-controller-695774d99c-t794f   1/1     Running   0          110s

NAME                                         TYPE           CLUSTER-IP      EXTERNAL-IP       PORT(S)                      AGE
service/nginx-ingress-controller-admission   ClusterIP      10.43.116.191   <none>            443/TCP                      110s
service/nginx-ingress-controller             LoadBalancer   10.43.55.41     192.168.178.116   80:31110/TCP,443:31476/TCP   110s

NAME                                            DESIRED   CURRENT   READY   UP-TO-DATE   AVAILABLE   NODE SELECTOR   AGE
daemonset.apps/svclb-nginx-ingress-controller   1         1         1       1            1           <none>          110s

NAME                                       READY   UP-TO-DATE   AVAILABLE   AGE
deployment.apps/nginx-ingress-controller   1/1     1            1           110s

NAME                                                  DESIRED   CURRENT   READY   AGE
replicaset.apps/nginx-ingress-controller-695774d99c   1         1         1       110s
```
The nginx ingress global configuration can be modified as follows:
```
$ kubectl -n ingress-nginx edit configmap ingress-nginx-controller

apiVersion: v1
<<<ADD BEGIN>>>
data:
  enable-ocsp: "true"
  use-gzip: "true"
  worker-processes: "1"
<<<ADD END>>>
kind: ConfigMap
[...]
```
Finally the deployment needs to be restarted:

`kubectl -n ingress-nginx rollout restart deployment ingress-nginx-controller`

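Alternatively, the same ConfigMap entries can be set at install/upgrade time through the chart's `controller.config` values. A sketch, assuming a local values file named `ingress-nginx-values.yaml` (hypothetical name) passed via `helm -n ingress-nginx upgrade --install ingress-nginx ingress-nginx/ingress-nginx -f ingress-nginx-values.yaml`:
```
# ingress-nginx-values.yaml - these keys end up in the controller ConfigMap
controller:
  config:
    enable-ocsp: "true"
    use-gzip: "true"
    worker-processes: "1"
```
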
**If you are facing deployment problems like the following one**
```
Error: UPGRADE FAILED: cannot patch "gitea-ingress-staging" with kind Ingress: Internal error occurred: failed calling webhook "validate.nginx.ingress.kubernetes.io": Post https://nginx-ingress-controller-admission.ingress-nginx.svc:443/networking/v1beta1/ingresses?timeout=10s: context deadline exceeded
```
a possible fix is: `kubectl -n ingress-nginx delete ValidatingWebhookConfiguration ingress-nginx-admission`

# Cert-Manager (references ingress controller) <a name="cert-manager"></a>

## Installation <a name="cert-manager-install"></a>

Docs: https://hub.helm.sh/charts/jetstack/cert-manager

**Note on split-horizon DNS**: If you are planning to use DNS-01 validation in a [split-horizon-DNS](https://en.wikipedia.org/wiki/Split-horizon_DNS) setup, you will need to specify an external DNS-resolver (Google, Cloudflare or your ISP's resolver) instead of your internal upstream DNS-resolver for the DNS self-checks! Read [this](https://cert-manager.io/docs/configuration/acme/dns01/#setting-nameservers-for-dns01-self-check) for further details.
```
helm repo add jetstack https://charts.jetstack.io
helm repo update
kubectl apply --validate=false -f https://github.com/jetstack/cert-manager/releases/download/v1.0.2/cert-manager.crds.yaml
kubectl create namespace cert-manager
helm install cert-manager --namespace cert-manager --set 'extraArgs={--dns01-recursive-nameservers-only,--dns01-recursive-nameservers=8.8.8.8:53\,1.1.1.1:53}' jetstack/cert-manager
kubectl -n cert-manager get all
```

## Cluster-internal CA Issuer <a name="cert-manager-cluster-ca-issuer"></a>

Docs: https://cert-manager.io/docs/configuration/ca/

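The docs above boil down to a Secret holding the CA keypair plus a `ClusterIssuer` of type `ca`. A minimal sketch (secret and issuer names are placeholders; the secret must contain the CA certificate and key as `tls.crt`/`tls.key` in the cert-manager namespace):
```
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: internal-ca-issuer
spec:
  ca:
    # Secret in the cert-manager namespace containing the CA's tls.crt/tls.key
    secretName: internal-ca-keypair
```
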
## Let's Encrypt (HTTP-01/DNS-01) issuer <a name="cert-manager-le-issuer"></a>

Docs: https://cert-manager.io/docs/tutorials/acme/ingress/#step-6-configure-let-s-encrypt-issuer
```
ClusterIssuers are a resource type similar to Issuers. They are specified in exactly the same way,
but they do not belong to a single namespace and can be referenced by Certificate resources from
multiple different namespaces.
```

lets-encrypt-cluster-issuers.yaml:
```
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-http01-staging-issuer
spec:
  acme:
    # You must replace this email address with your own.
    # Let's Encrypt will use this to contact you about expiring
    # certificates, and issues related to your account.
    email: user@example.com
    server: https://acme-staging-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      # Secret resource that will be used to store the account's private key.
      name: letsencrypt-staging-account-key
    # Add a single challenge solver, HTTP01 using nginx
    solvers:
    - http01:
        ingress:
          class: nginx
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-http01-prod-issuer
spec:
  acme:
    # The ACME server URL
    server: https://acme-v02.api.letsencrypt.org/directory
    # Email address used for ACME registration
    email: user@example.com
    # Name of a secret used to store the ACME account private key
    privateKeySecretRef:
      name: letsencrypt-prod-account-key
    # Enable the HTTP-01 challenge provider
    solvers:
    - http01:
        ingress:
          class: nginx
---
apiVersion: v1
kind: Secret
metadata:
  name: tsig-dyn-update-secret
  namespace: cert-manager
type: Opaque
data:
  key: BASE64 encoded of the BASE64 encoded (double-base64) TSIG-key
---
apiVersion: cert-manager.io/v1
kind: ClusterIssuer
metadata:
  name: letsencrypt-dns01-prod-issuer
spec:
  acme:
    email: user@example.com
    server: https://acme-v02.api.letsencrypt.org/directory
    privateKeySecretRef:
      # Secret resource that will be used to store the account's private key.
      name: letsencrypt-dns01-account-key
    # Add a single challenge solver, DNS01 using RFC2136 dynamic updates
    solvers:
    - dns01:
        rfc2136:
          nameserver: ip_address_of_your_authoritative_nameserver:nameserver_port
          tsigKeyName: name_of_tsig_key_in_your_authoritative_nameserver
          tsigAlgorithm: HMACSHA512
          tsigSecretSecretRef:
            name: tsig-dyn-update-secret
            key: key
      selector:
        dnsZones:
        - 'int.example.org'
```
`kubectl apply -f lets-encrypt-cluster-issuers.yaml`

## Deploying a LE-certificate with ingress <a name="cert-manager-ingress"></a>

All you need is an `Ingress` resource of class `nginx` which references a ClusterIssuer (e.g. `letsencrypt-http01-prod-issuer`).

HTTP-01 solver (`cert-manager.io/cluster-issuer: "letsencrypt-http01-prod-issuer"`):
```
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  namespace: <stage>
  name: some-ingress-name
  annotations:
    # use the shared ingress-nginx
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-http01-prod-issuer"
spec:
  tls:
  - hosts:
    - some-certificate.name.san
    secretName: target-certificate-secret-name
  rules:
  - host: some-certificate.name.san
    http:
      paths:
      - path: /
        backend:
          serviceName: some-target-service
          servicePort: some-target-service-port
```
DNS-01 solver (`cert-manager.io/cluster-issuer: "letsencrypt-dns01-prod-issuer"`):
```
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  namespace: <stage>
  name: some-ingress-name
  annotations:
    # use the shared ingress-nginx
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-dns01-prod-issuer"
spec:
  tls:
  - hosts:
    - some-certificate.name.san
    secretName: target-certificate-secret-name
  rules:
  - host: some-certificate.name.san
    http:
      paths:
      - path: /
        backend:
          serviceName: some-target-service
          servicePort: some-target-service-port
```

## Deploying a LE-certificate by CRD <a name="cert-manager-crd"></a>

All you need is a `Certificate` custom resource (based on cert-manager's CRDs) like this one:
```
apiVersion: cert-manager.io/v1
kind: Certificate
metadata:
  name: some-certificate
  namespace: staging
spec:
  # Secret names are always required.
  secretName: some-secret

  duration: 2160h # 90d
  renewBefore: 360h # 15d
  # The use of the common name field has been deprecated since 2000 and is
  # discouraged from being used.
  commonName: some.fully.qualified.domain.name
  isCA: false
  privateKey:
    algorithm: RSA
    encoding: PKCS1
    size: 4096
  usages:
    - server auth
    - client auth
  # At least one of a DNS Name, URI, or IP address is required.
  dnsNames:
    - some.fully.qualified.domain.name
  # Issuer references are always required.
  issuerRef:
    name: <your-favourite-cluster-issuer>
    # We can reference ClusterIssuers by changing the kind here.
    # The default value is Issuer (i.e. a locally namespaced Issuer)
    kind: ClusterIssuer
```
After the certificate has been issued, you can mount the resulting secret as a volume within a deployment:
```
apiVersion: apps/v1
kind: Deployment
metadata:
  labels:
    app: nginx-ssl
  name: nginx-ssl
  namespace: staging
spec:
  replicas: 1
  selector:
    matchLabels:
      app: nginx-ssl
  strategy:
    type: Recreate
  template:
    metadata:
      labels:
        app: nginx-ssl
    spec:
      volumes:
      - name: nginx-ssl-volume
        secret:
          secretName: some-secret
      containers:
      - image: nginx
        name: nginx-ssl
        volumeMounts:
        - mountPath: "/etc/nginx/ssl"
          name: nginx-ssl-volume
          readOnly: true
        ports:
        - containerPort: 80
      restartPolicy: Always
```

## Troubleshooting <a name="cert-manager-troubleshooting"></a>

Docs: https://cert-manager.io/docs/faq/acme/

ClusterIssuers are *visible* in all namespaces:
```
kubectl get clusterissuer
kubectl describe clusterissuer <object>
```
All other ingress-specific cert-manager resources live in the respective <stage> namespaces:
```
kubectl -n <stage> get certificaterequest
kubectl -n <stage> describe certificaterequest <object>
kubectl -n <stage> get certificate
kubectl -n <stage> describe certificate <object>
kubectl -n <stage> get secret
kubectl -n <stage> describe secret <object>
kubectl -n <stage> get challenge
kubectl -n <stage> describe challenge <object>
```

After a successful setup perform a TLS-test:
* https://testssl.sh/ (`apt install testssl.sh`)
* https://www.ssllabs.com/ssltest/index.html

# Cluster monitoring <a name="cluster-monitoring"></a>

Create a namespace for monitoring:
```
kubectl create ns monitoring
```

## Log correlation with Loki-stack <a name="loki-stack"></a>

Docs: https://github.com/grafana/helm-charts/tree/main/charts/loki-stack
```
helm repo add grafana https://grafana.github.io/helm-charts
helm repo update
```
Download the values file for the loki-stack helm chart and replace the macros with the corresponding values:
* %PVC_STORAGECLASS% -> your storageclass for persistent storage
* %PVC_STORAGE_SIZE% -> size of persistent storage, e.g. 4Gi
```
wget https://gitea.zwackl.de/dominik/k3s/raw/branch/master/loki-stack-values.yaml
```
Install loki-stack:
```
helm -n monitoring upgrade --install -f loki-stack-values.yaml loki-stack grafana/loki-stack
```
Grafana will be installed together with the Prometheus-stack below.

## Metrics with Prometheus-stack + Grafana <a name="prometheus-grafana"></a>

```
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
```
Download the values file for the prometheus-stack helm chart and replace the macros with the corresponding values:
* %ADMIN_PASSWORD% -> Grafana admin password
* %SERVICE_FQDN% -> Service FQDN
* %MASTER_NODE_IPV4_ADDR% -> IPv4 address of your cluster master node
* %PVC_STORAGECLASS% -> your storageclass for persistent storage
* %PVC_STORAGE_SIZE% -> size of persistent storage, e.g. 4Gi
* %PVC_STORAGE_SIZE_GRAFANA% -> size of persistent storage of grafana, e.g. 1Gi
* %SMTP_HOST%
* %SMTP_USER%
* %SMTP_PASSWORD%
* %SMTP_SENDER_ADDRESS%
* %SMTP_FROM_HEADER%
```
wget https://gitea.zwackl.de/dominik/k3s/raw/branch/master/prom-stack-values.yaml
```
Install prometheus-stack:
```
helm -n monitoring upgrade --install -f prom-stack-values.yaml prom-stack prometheus-community/kube-prometheus-stack
```
Access the Grafana web UI via port-forwarding at http://localhost:8080 (or configure an ingress instance):
```
kubectl -n monitoring port-forward service/prom-stack-grafana 8080:80
```

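A sketch of such an ingress instance, reusing the nginx ingress controller and the HTTP-01 ClusterIssuer from above (hostname and TLS secret name are placeholders; the API version matches the other ingress examples in this document):
```
apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  namespace: monitoring
  name: grafana-ingress
  annotations:
    kubernetes.io/ingress.class: "nginx"
    cert-manager.io/cluster-issuer: "letsencrypt-http01-prod-issuer"
spec:
  tls:
  - hosts:
    - grafana.example.org
    secretName: grafana-tls
  rules:
  - host: grafana.example.org
    http:
      paths:
      - path: /
        backend:
          serviceName: prom-stack-grafana   # the service used in the port-forward above
          servicePort: 80
```
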
# HELM charts <a name="helm"></a>

Docs:
* https://helm.sh/docs/intro/using_helm/

## Create a chart <a name="helm-create"></a>

`helm create helm-test`

```
~/kubernetes/helm$ tree helm-test/
helm-test/
├── charts
├── Chart.yaml
├── templates
│   ├── deployment.yaml
│   ├── _helpers.tpl
│   ├── hpa.yaml
│   ├── ingress.yaml
│   ├── NOTES.txt
│   ├── serviceaccount.yaml
│   ├── service.yaml
│   └── tests
│       └── test-connection.yaml
└── values.yaml
```

## Install local chart without packaging <a name="helm-install-without-packaging"></a>

`helm install helm-test-dev helm-test/ --set image.tag=latest --debug --wait`

or just a *dry-run*:

`helm install helm-test-dev helm-test/ --set image.tag=latest --debug --dry-run`

```
--wait: Waits until all Pods are in a ready state, PVCs are bound, Deployments have minimum (Desired minus maxUnavailable)
Pods in ready state and Services have an IP address (and Ingress if a LoadBalancer) before marking the release as
successful. It will wait for as long as the --timeout value. If timeout is reached, the release will be marked as FAILED.
Note: In scenarios where Deployment has replicas set to 1 and maxUnavailable is not set to 0 as part of rolling update
strategy, --wait will return as ready as it has satisfied the minimum Pod in ready condition.
```

## List deployed helm charts <a name="helm-list"></a>

```
~/kubernetes/helm$ helm list
NAME            NAMESPACE   REVISION   UPDATED                                   STATUS     CHART             APP VERSION
helm-test-dev   default     4          2020-08-27 12:30:38.98457042 +0200 CEST   deployed   helm-test-0.1.0   1.16.0
```

## Upgrade local chart without packaging <a name="helm-upgrade"></a>

```
~/kubernetes/helm$ helm upgrade helm-test-dev helm-test/ --set image.tag=latest --wait --timeout 60s
Release "helm-test-dev" has been upgraded. Happy Helming!
NAME: helm-test-dev
LAST DEPLOYED: Thu Aug 27 12:47:09 2020
NAMESPACE: default
STATUS: deployed
REVISION: 7
NOTES:
1. Get the application URL by running these commands:
  export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=helm-test,app.kubernetes.io/instance=helm-test-dev" -o jsonpath="{.items[0].metadata.name}")
  echo "Visit http://127.0.0.1:8080 to use your application"
  kubectl --namespace default port-forward $POD_NAME 8080:80
```
`helm upgrade [...] --wait` is synchronous: it exits with 0 on success and with a code >0 on failure. By default it waits for up to 5 minutes; the `--timeout` flag (default 5m) adjusts this. This makes it well suited for CI/CD deployments with Jenkins.

## Get status of deployed chart <a name="helm-status"></a>

```
~/kubernetes/helm$ helm status helm-test-dev
NAME: helm-test-dev
LAST DEPLOYED: Thu Aug 27 12:47:09 2020
NAMESPACE: default
STATUS: deployed
REVISION: 7
NOTES:
1. Get the application URL by running these commands:
  export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=helm-test,app.kubernetes.io/instance=helm-test-dev" -o jsonpath="{.items[0].metadata.name}")
  echo "Visit http://127.0.0.1:8080 to use your application"
  kubectl --namespace default port-forward $POD_NAME 8080:80
```

## Get deployment history <a name="helm-history"></a>

```
~/kubernetes/helm$ helm history helm-test-dev
REVISION   UPDATED                    STATUS            CHART             APP VERSION   DESCRIPTION
10         Thu Aug 27 12:56:33 2020   failed            helm-test-0.1.0   1.16.0        Upgrade "helm-test-dev" failed: timed out waiting for the condition
11         Thu Aug 27 13:08:34 2020   superseded        helm-test-0.1.0   1.16.0        Upgrade complete
12         Thu Aug 27 13:09:59 2020   superseded        helm-test-0.1.0   1.16.0        Upgrade complete
13         Thu Aug 27 13:10:24 2020   superseded        helm-test-0.1.0   1.16.0        Rollback to 11
14         Thu Aug 27 13:23:22 2020   failed            helm-test-0.1.1   blubb         Upgrade "helm-test-dev" failed: timed out waiting for the condition
15         Thu Aug 27 13:26:43 2020   pending-upgrade   helm-test-0.1.1   blubb         Preparing upgrade
16         Thu Aug 27 13:27:12 2020   superseded        helm-test-0.1.1   blubb         Upgrade complete
17         Thu Aug 27 14:32:32 2020   superseded        helm-test-0.1.1                 Upgrade complete
18         Thu Aug 27 14:33:58 2020   superseded        helm-test-0.1.1                 Upgrade complete
19         Thu Aug 27 14:36:49 2020   failed            helm-test-0.1.1   cosmetics     Upgrade "helm-test-dev" failed: timed out waiting for the condition
```

## Rollback <a name="helm-rollback"></a>

`helm rollback helm-test-dev 18 --wait`
```
~/kubernetes/helm$ helm history helm-test-dev
REVISION   UPDATED                    STATUS            CHART             APP VERSION   DESCRIPTION
10         Thu Aug 27 12:56:33 2020   failed            helm-test-0.1.0   1.16.0        Upgrade "helm-test-dev" failed: timed out waiting for the condition
11         Thu Aug 27 13:08:34 2020   superseded        helm-test-0.1.0   1.16.0        Upgrade complete
12         Thu Aug 27 13:09:59 2020   superseded        helm-test-0.1.0   1.16.0        Upgrade complete
13         Thu Aug 27 13:10:24 2020   superseded        helm-test-0.1.0   1.16.0        Rollback to 11
14         Thu Aug 27 13:23:22 2020   failed            helm-test-0.1.1   blubb         Upgrade "helm-test-dev" failed: timed out waiting for the condition
15         Thu Aug 27 13:26:43 2020   pending-upgrade   helm-test-0.1.1   blubb         Preparing upgrade
16         Thu Aug 27 13:27:12 2020   superseded        helm-test-0.1.1   blubb         Upgrade complete
17         Thu Aug 27 14:32:32 2020   superseded        helm-test-0.1.1                 Upgrade complete
18         Thu Aug 27 14:33:58 2020   superseded        helm-test-0.1.1                 Upgrade complete
19         Thu Aug 27 14:36:49 2020   failed            helm-test-0.1.1   cosmetics     Upgrade "helm-test-dev" failed: timed out waiting for the condition
20         Thu Aug 27 14:37:36 2020   deployed          helm-test-0.1.1                 Rollback to 18
```
```
~/kubernetes/helm$ helm status helm-test-dev
NAME: helm-test-dev
LAST DEPLOYED: Thu Aug 27 14:37:36 2020
NAMESPACE: default
STATUS: deployed
REVISION: 20
NOTES:
1. Get the application URL by running these commands:
  export POD_NAME=$(kubectl get pods --namespace default -l "app.kubernetes.io/name=helm-test,app.kubernetes.io/instance=helm-test-dev" -o jsonpath="{.items[0].metadata.name}")
  echo "Visit http://127.0.0.1:8080 to use your application"
  kubectl --namespace default port-forward $POD_NAME 8080:80
```

# Kubernetes in action <a name="kubernetes-in-action"></a>

## Running DaemonSets with `hostNetwork: true` <a name="running-daemonsets"></a>

* [Docs: DaemonSet](https://kubernetes.io/docs/concepts/workloads/controllers/daemonset/)
* [(Security) hints on using `hostNetwork`](https://kubernetes.io/docs/concepts/policy/pod-security-policy/#host-namespaces)
* [Pod's DNS policy](https://kubernetes.io/docs/concepts/services-networking/dns-pod-service/#pod-s-dns-policy)

This setup is suitable for scenarios where the kubernetes nodes are already running with a dual IP stack (IPv4 and IPv6) and the Pod needs IPv6 too, but k3s was deployed in IPv4-only mode. In this case the Pod can be deployed in the network namespace of the kubernetes node.
```
kind: DaemonSet
apiVersion: apps/v1
metadata:
  name: netcat-daemonset
  labels:
    app: netcat-daemonset
spec:
  selector:
    matchLabels:
      app: netcat-daemonset
  template:
    metadata:
      labels:
        app: netcat-daemonset
    spec:
      hostNetwork: true
      dnsPolicy: ClusterFirstWithHostNet
      restartPolicy: Always
      terminationGracePeriodSeconds: 10
      containers:
      - name: alpine-netcat-daemonset
        image: alpine
        imagePullPolicy: IfNotPresent
        command: ["nc", "-lk", "-p", "23456", "-v", "-e", "/bin/true"]
```

## Services <a name="services"></a>

### Client-IP transparency and loadbalancing <a name="services-client-ip-transparency"></a>

```
apiVersion: v1
kind: Service
[...]
spec:
  type: NodePort
  externalTrafficPolicy: <<Local|Cluster>>
[...]
```
`externalTrafficPolicy: Cluster` (default) spreads the incoming traffic evenly over all pods. To achieve this the client IP-address must be source-NATed and is therefore not *visible* to the PODs.

`externalTrafficPolicy: Local` preserves the original client IP-address, which is visible to the PODs. In any case (`DaemonSet` or `StatefulSet`) traffic remains on the node that receives it. In case of a `StatefulSet`, if more than one POD of a `ReplicaSet` is scheduled on the same node, the workload gets balanced over all PODs on that node.

### Session affinity/persistence <a name="services-session-persistence"></a>

```
apiVersion: v1
kind: Service
[...]
spec:
  type: NodePort
  sessionAffinity: <<ClientIP|None>>
  sessionAffinityConfig:
    clientIP:
      timeoutSeconds: 10
[...]
```
Session persistence (`None` by default) is only supported based on the client IP-address. Cookie-stickiness or stickiness on any other/higher layer is not supported yet.

## What happens if a node goes down? <a name="what-happens-node-down"></a>

If a node goes down, kubernetes marks this node as *NotReady*, but nothing else happens until the [Pod tolerations](https://kubernetes.io/docs/concepts/scheduling-eviction/taint-and-toleration) kick in, which are configured by default with a *timeout* of `300s` (5 minutes!).
```
$ kubectl get node
NAME         STATUS     ROLES    AGE    VERSION
k3s-node2    Ready      <none>   103d   v1.19.5+k3s2
k3s-master   Ready      master   103d   v1.19.5+k3s2
k3s-node1    NotReady   <none>   103d   v1.19.5+k3s2

$ kubectl get pod
NAME            READY   STATUS    RESTARTS   AGE
ds-test-5mlkt   1/1     Running   14         28h
web-1           1/1     Running   0          26m
web-2           1/1     Running   0          26m
ds-test-c6xx8   1/1     Running   0          18m
ds-test-w45dv   1/1     Running   5          28h
```
Pod tolerations can be determined like this:
```
$ kubectl -n <namespace> describe pod <pod-name>

[...]
Tolerations:     node.kubernetes.io/not-ready:NoExecute op=Exists for 300s
                 node.kubernetes.io/unreachable:NoExecute op=Exists for 300s
[...]
```

To react faster to node failures, the Pod tolerations can be adjusted within the Pod template as follows:
```
kind: Deployment or StatefulSet
apiVersion: apps/v1
metadata:
  [...]
spec:
  [...]
  template:
    [...]
    spec:
      tolerations:
      - key: "node.kubernetes.io/unreachable"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 30
      - key: "node.kubernetes.io/not-ready"
        operator: "Exists"
        effect: "NoExecute"
        tolerationSeconds: 30
      [...]
```

## Keeping the cluster balanced <a name="keep-cluster-balanced"></a>

In the first place, Kubernetes takes care of high availability, but not of a well balanced distribution of pods per node.

In case of a `Deployment` or `StatefulSet` a [`topologySpreadConstraint`](https://kubernetes.io/docs/concepts/workloads/pods/pod-topology-spread-constraints/) needs to be specified:
```
kind: Deployment or StatefulSet
apiVersion: apps/v1
metadata:
  [...]
spec:
  [...]
  template:
    [...]
    spec:
      # Prevent scheduling more than one pod per node
      topologySpreadConstraints:
      - labelSelector:
          matchLabels:
            app: the-app
        maxSkew: 1
        topologyKey: kubernetes.io/hostname
        whenUnsatisfiable: DoNotSchedule
      [...]
```
`DaemonSet` workloads do not support `topologySpreadConstraints` at all.

## Node maintenance <a name="node-maintenance"></a>

*Mark* a node for maintenance:
```
$ kubectl drain k3s-node2 --ignore-daemonsets

$ kubectl get node
NAME         STATUS                     ROLES    AGE    VERSION
k3s-node1    Ready                      <none>   105d   v1.19.5+k3s2
k3s-master   Ready                      master   105d   v1.19.5+k3s2
k3s-node2    Ready,SchedulingDisabled   <none>   105d   v1.19.5+k3s
```
All Deployment as well as StatefulSet pods are rescheduled on the remaining nodes. DaemonSet pods are not touched! Node maintenance can be performed now.

To bring the maintained node back into the cluster:
```
$ kubectl uncordon k3s-node2
node/k3s-node2 uncordoned
```

## Dealing with disruptions <a name="disruptions"></a>

* https://kubernetes.io/docs/concepts/workloads/pods/disruptions/
* https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/

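The docs above revolve around PodDisruptionBudgets. As a minimal sketch (app label, namespace and numbers are placeholders; `policy/v1beta1` matches the k8s 1.19 clusters shown above), a PDB that keeps at least one replica available during voluntary disruptions such as `kubectl drain`:
```
apiVersion: policy/v1beta1
kind: PodDisruptionBudget
metadata:
  name: the-app-pdb
  namespace: staging
spec:
  minAvailable: 1          # eviction/drain is blocked if it would drop below this
  selector:
    matchLabels:
      app: the-app
```
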
# Troubleshooting <a name="troubleshooting"></a>

## Deleting a stuck namespace <a name="ts-delete-stuck-namespace"></a>

```
kubectl get namespace "stuck-namespace" -o json \
  | tr -d "\n" | sed "s/\"finalizers\": \[[^]]\+\]/\"finalizers\": []/" \
  | kubectl replace --raw /api/v1/namespaces/stuck-namespace/finalize -f -
```

## Deleting stuck CRDs <a name="ts-delete-stuck-crd"></a>

https://github.com/kubernetes/kubernetes/issues/60538#issuecomment-369099998