A very useful kubectl cheatsheet: https://medium.com/faun/kubectl-commands-cheatsheet-43ce8f13adfb

Tips & tricks: https://hackernoon.com/top-10-kubernetes-tips-and-tricks-27528c2d0222

Configure bash completion: echo "source <(kubectl completion bash)" >> ~/.bashrc

List all pods with their QoS class: kubectl get pods --all-namespaces -o custom-columns=NAME:.metadata.name,NAMESPACE:.metadata.namespace,QOS-CLASS:.status.qosClass

etcd guides:

Example hardware configurations: https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/hardware.md#example-hardware-configurations

CPUs: https://github.com/etcd-io/etcd/blob/master/Documentation/op-guide/hardware.md#cpus

DataDog setup

Check agent status: https://docs.datadoghq.com/agent/guide/agent-commands/?tab=agentv6#agent-status-and-information

When the Datadog agent cannot detect the kubelet and the dashboard is empty, see: DataDog/integrations-core#2582

More generic agent troubleshooting: https://docs.datadoghq.com/agent/troubleshooting/

Manage datadog monitors using code: https://github.com/trueaccord/DogPush

Resource requests and limits

https://www.replex.io/blog/kubernetes-in-production-readiness-checklist-and-best-practices-for-resource-management

https://www.magalix.com/blog/kubernetes-resource-requests-and-limits-101

https://cloud.google.com/blog/products/gcp/kubernetes-best-practices-resource-requests-and-limits

Pod Disruption Budgets, number 6 here: https://hackernoon.com/top-10-kubernetes-tips-and-tricks-27528c2d0222
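A minimal sketch of creating a Pod Disruption Budget from the command line. The helper name create_pdb and the example values (web-pdb, app=web, 2) are hypothetical and not taken from the linked article; only the `kubectl create poddisruptionbudget` subcommand itself is standard kubectl.

```shell
# create_pdb NAME SELECTOR MIN_AVAILABLE
# Hypothetical helper: creates a PodDisruptionBudget that keeps at least
# MIN_AVAILABLE pods matching SELECTOR running during voluntary disruptions
# (for example a node drain).
create_pdb() {
  kubectl create poddisruptionbudget "$1" --selector="$2" --min-available="$3"
}

# Example (assumed names): keep at least 2 pods labelled app=web available.
# create_pdb web-pdb app=web 2
```

The budget only guards against voluntary evictions; it does not protect pods from node failures.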

Monitoring

As explained in the section about container metrics, some statistics reported by Docker should also be monitored, as they provide deeper (and more accurate) insights. The CPU throttling metric is a great example: it represents the number of times a container hit its specified CPU limit.
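One way to look at CPU throttling without a monitoring agent is to read the container's cgroup cpu.stat file, which exposes the nr_periods and nr_throttled counters. The sketch below is an assumption-laden example, not part of the quoted article: the helper name throttle_ratio is made up, and the file location depends on the cgroup version and how the host mounts it (commonly /sys/fs/cgroup/cpu.stat on cgroup v2 or /sys/fs/cgroup/cpu/cpu.stat on cgroup v1, as seen from inside the container).

```shell
# throttle_ratio FILE
# Hypothetical helper: prints the percentage of CFS enforcement periods in
# which the cgroup was throttled, based on the nr_periods and nr_throttled
# counters found in a cgroup cpu.stat file.
throttle_ratio() {
  awk '
    $1 == "nr_periods"   { periods = $2 }
    $1 == "nr_throttled" { throttled = $2 }
    END {
      if (periods > 0) printf "%.1f\n", 100 * throttled / periods
      else print "0.0"   # no periods recorded (for example, no CPU limit set)
    }
  ' "$1"
}

# Example (path is an assumption, see above):
# throttle_ratio /sys/fs/cgroup/cpu.stat
```

A ratio that stays high over time suggests the CPU limit is too low for the workload.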

About nodes and resources: https://kubernetes.io/docs/tasks/administer-cluster/out-of-resource/

How to change eviction values: https://medium.com/faun/kubelet-pod-the-node-was-low-on-resource-diskpressure-384f590892f5
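Before changing eviction values it can help to see which nodes currently report disk pressure. A small sketch, assuming kubectl access to the cluster; the helper name node_disk_pressure is hypothetical. The thresholds themselves are kubelet settings (for example its eviction-hard option) and are covered in the links above.

```shell
# node_disk_pressure
# Hypothetical helper: prints each node name followed by the status of its
# DiskPressure condition (True, False or Unknown), tab-separated.
node_disk_pressure() {
  kubectl get nodes -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.status.conditions[?(@.type=="DiskPressure")].status}{"\n"}{end}'
}

# Example:
# node_disk_pressure
```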

Kubernetes metrics in production: https://www.replex.io/blog/kubernetes-in-production-the-ultimate-guide-to-monitoring-resource-metrics

Kubernetes varia

Draining nodes: https://kubernetes.io/docs/tasks/administer-cluster/safely-drain-node/

Taints and tolerations: https://kubernetes.io/docs/concepts/configuration/taint-and-toleration/
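A rough sketch of the usual maintenance commands behind those two links. The helper names and the example node, key and value are hypothetical; the kubectl subcommands (drain, uncordon, taint) are standard, but check the flags against the kubectl version in use.

```shell
# drain_for_maintenance NODE
# Marks NODE unschedulable and evicts its pods; DaemonSet-managed pods are
# left in place because they cannot be rescheduled elsewhere.
drain_for_maintenance() {
  kubectl drain "$1" --ignore-daemonsets
}

# return_to_service NODE
# Makes NODE schedulable again once maintenance is done.
return_to_service() {
  kubectl uncordon "$1"
}

# dedicate_node NODE KEY VALUE
# Taints NODE so that only pods tolerating KEY=VALUE are scheduled onto it.
dedicate_node() {
  kubectl taint nodes "$1" "$2=$3:NoSchedule"
}

# Example (assumed names):
# drain_for_maintenance node-1
# return_to_service node-1
# dedicate_node node-1 dedicated infra
```

Draining respects Pod Disruption Budgets, so a drain can block until enough replacement pods are ready elsewhere.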

alexchiri/k8s-maintenance.md
