In Episode 3 we wrote five YAML files to deploy one application. The kube-prometheus-stack chart, which gives us Prometheus, Grafana, Alertmanager, node-exporter, and kube-state-metrics, would require 20+ manifests written by hand. We're not doing that. This is where Helm enters the picture.
This is the companion article to Episode 4 of the Kubernetes on Raspberry Pi series. We deploy full cluster monitoring using Prometheus and Grafana, and work through the Talos-specific issues that come up along the way.
All configs are in the kubernetes-series GitHub repo under video-04-helm-prometheus-grafana/.
What Is Helm?
Helm is Kubernetes' package manager, like apt or brew but for cluster apps. A chart is a package of Kubernetes manifests, templated and versioned. A repository is a collection of charts. A release is a deployed instance of a chart, and values are configuration overrides you provide at install time.
Install Helm and add the Prometheus community repository:
curl -fsSL https://raw.githubusercontent.com/helm/helm/main/scripts/get-helm-3 | bash
helm version
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update
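Before writing any overrides, it's worth confirming the chart is visible and skimming its defaults. These are standard Helm subcommands; the chart name comes from the repo we just added:

```shell
# Confirm the chart is available after the repo update
helm search repo prometheus-community/kube-prometheus-stack

# Dump the chart's full default values -- this is the reference
# for anything you might want to override in values.yaml
helm show values prometheus-community/kube-prometheus-stack | less
```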
Configuring the Install
Rather than accepting all defaults, we customize a few things in a values.yaml file:
# values.yaml
grafana:
  adminPassword: "your-secure-password"
  service:
    type: ClusterIP
prometheus:
  prometheusSpec:
    retention: 30d
    storageSpec: {}
prometheus-node-exporter:
  hostRootFsMount:
    enabled: false
The hostRootFsMount: false setting is critical for Talos. More on that shortly.
helm install kube-prometheus-stack prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--create-namespace \
--values values.yaml
Watch the resources get created:
kubectl get all -n monitoring
One command, 20+ resources. That's the value of Helm.
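Helm also tracks the release itself, which is handy for checking what you deployed and with which overrides. A couple of standard commands:

```shell
# Release status: chart version, revision number, deployed state
helm status kube-prometheus-stack -n monitoring

# Show only the values we supplied (add --all for the merged result)
helm get values kube-prometheus-stack -n monitoring
```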
Talos PSS Troubleshooting
This is where Talos starts pushing back. Pod Security Standards are enforced at the API server level, and several monitoring components need capabilities that Talos blocks by default.
node-exporter
After install, you may notice node-exporter pods aren't running:
kubectl get daemonset kube-prometheus-stack-prometheus-node-exporter -n monitoring
# DESIRED: 6, CURRENT: 0
Describing the DaemonSet reveals the problem:
violates PodSecurity "baseline:latest": host namespaces (hostNetwork=true, hostPID=true)...
node-exporter needs deep host access (hostPID, hostNetwork) to collect node metrics. We address this with two fixes. The hostRootFsMount: false value we already added to values.yaml handles Talos's read-only root filesystem conflict. For the PSS restriction, add monitoring to the namespace exemptions in the Talos machine config:
export EDITOR=nano
talosctl edit machineconfig --nodes <control-plane-ip>
Find the exemptions section and add monitoring:
exemptions:
  namespaces:
    - kube-system
    - monitoring
Reboot the control plane:
talosctl reboot --nodes <control-plane-ip>
All 6 node-exporter pods should appear after reboot.
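To confirm, list the node-exporter pods directly. The label selector below is the one the chart applies by default (assumed from the node-exporter subchart's conventions):

```shell
# Expect one Running pod per node, spread across all 6 nodes
kubectl get pods -n monitoring \
  -l app.kubernetes.io/name=prometheus-node-exporter -o wide
```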
kube-scheduler and kube-controller-manager
Prometheus targets for kube-scheduler and kube-controller-manager will show connection refused. Talos binds these components to 127.0.0.1 by default, making them unreachable from other pods. Patch them to bind on all interfaces:
cluster:
  scheduler:
    extraArgs:
      bind-address: 0.0.0.0
  controllerManager:
    extraArgs:
      bind-address: 0.0.0.0
talosctl patch machineconfig --nodes <control-plane-ip> --patch @scheduler-patch.yaml
talosctl reboot --nodes <control-plane-ip>
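A quick way to verify the new bind address, without waiting for Prometheus to re-scrape, is a one-off curl pod inside the cluster. The ports below are the upstream defaults (10259 for kube-scheduler, 10257 for kube-controller-manager; both serve metrics over HTTPS):

```shell
# Should return metrics text (or at least an HTTP response) instead
# of "connection refused" once the patch and reboot have applied
kubectl run curl-test --rm -it --image=curlimages/curl --restart=Never -- \
  curl -sk https://<control-plane-ip>:10259/metrics | head
```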
kube-proxy
kube-proxy has the same bind address issue, but it runs on all 6 nodes. Patch all nodes at once, then delete the DaemonSet to force regeneration:
cluster:
  proxy:
    extraArgs:
      metrics-bind-address: 0.0.0.0:10249
talosctl patch machineconfig \
--nodes <node1>,<node2>,<node3>,<node4>,<node5>,<node6> \
--patch @proxy-patch.yaml
kubectl delete daemonset kube-proxy -n kube-system
Note: kube-proxy is a DaemonSet managed by Talos. Unlike control plane components, it doesn't pick up changes on reboot. Talos regenerates it when it detects the DaemonSet is missing.
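After deleting the DaemonSet, confirm Talos recreated it and that the new flag is present in the pod spec (the exact field holding the flags may differ; checking the full YAML is the safe fallback):

```shell
# Talos should regenerate the DaemonSet within a few seconds
kubectl get daemonset kube-proxy -n kube-system

# Look for metrics-bind-address=0.0.0.0:10249 in the container spec
kubectl get daemonset kube-proxy -n kube-system -o yaml | grep metrics-bind-address
```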
Accessing Grafana
Port-forward to Grafana:
kubectl port-forward svc/kube-prometheus-stack-grafana 3000:80 -n monitoring
Open http://localhost:3000 and log in with admin and the password from your values file. Explore the pre-built dashboards: cluster overview, per-node CPU and memory, per-pod metrics. The cluster stops being a black box.
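If the browser shows an error, check the port-forward first. Grafana exposes a health endpoint that works without authentication:

```shell
# With the port-forward running in another terminal:
# the response should include "database": "ok"
curl -s http://localhost:3000/api/health
```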
Two Helm commands worth memorizing for ongoing maintenance:
helm list -n monitoring # see what's installed
helm upgrade kube-prometheus-stack prometheus-community/kube-prometheus-stack \
--namespace monitoring \
--values values.yaml # apply config changes
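A third command worth knowing: if an upgrade breaks something, Helm keeps each revision and can roll back to the previous one in a single step:

```shell
# See every revision of the release with its chart version and status
helm history kube-prometheus-stack -n monitoring

# Roll back to the previous revision (append a revision number to target one)
helm rollback kube-prometheus-stack -n monitoring
```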
What's Next
Grafana is accessible via port-forward, which is a temporary shortcut and not a real solution. In Episode 5 we add MetalLB and Traefik so every service gets a real URL with no port numbers.