Below is a detailed overview of how to collect metrics at different layers in a Kubernetes cluster—covering container-level, pod-level, node-level, and overall cluster-level metrics. We’ll focus on the most common, open-source approaches, although there are many commercial or cloud-specific variants that work similarly.
1. Collecting Container and Pod Metrics
cAdvisor (Container Advisor)
- What It Is: A daemon that collects resource usage and performance characteristics of running containers.
- Where It Runs: Typically embedded inside the Kubernetes kubelet process on each node.
- Metrics Collected: CPU usage, memory usage, network I/O, filesystem I/O per container and pod.
- Accessing Metrics: Exposed on the kubelet endpoint, e.g. `https://<node-ip>:10250/metrics/cadvisor` (secure port) or `http://<node-ip>:10255/metrics/cadvisor` (the legacy read-only port, disabled by default on recent clusters).
Kubernetes uses cAdvisor under the hood, so you usually don’t install it separately—it’s already built into the kubelet. These cAdvisor metrics are then scraped by a metrics collector (e.g., Prometheus).
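For a sense of what these look like, here are a few representative cAdvisor metric families in Prometheus exposition format (the metric names are real; the label values are illustrative):

```text
container_cpu_usage_seconds_total{namespace="default",pod="web-0",container="app"} 1234.56
container_memory_working_set_bytes{namespace="default",pod="web-0",container="app"} 134217728
container_network_receive_bytes_total{namespace="default",pod="web-0"} 987654
```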
Prometheus Scraping
- Prometheus is commonly used to scrape cAdvisor metrics.
- How It Works:
- You install the Prometheus Operator or a standalone Prometheus instance in the cluster.
- You configure Prometheus to scrape the kubelet’s cAdvisor metrics endpoint (and other endpoints).
- Collected Data: CPU, memory, disk, network usage for containers/pods.
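As a sketch, a hand-written Prometheus scrape configuration for the kubelet's cAdvisor endpoint might look like the following. It assumes Prometheus runs in-cluster with a service account allowed to reach node metrics; the Prometheus Operator (covered below) generates an equivalent config for you:

```yaml
scrape_configs:
  - job_name: kubelet-cadvisor
    scheme: https
    metrics_path: /metrics/cadvisor
    kubernetes_sd_configs:
      - role: node                 # discover every node's kubelet (port 10250 by default)
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
      insecure_skip_verify: true   # assumption: kubelet serving certs are not signed by the cluster CA
```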
Metrics Server (For HPA)
- What It Is: A lightweight, cluster-wide aggregator of resource usage data.
- Primary Use Case: Used by Kubernetes’ Horizontal Pod Autoscaler (HPA) to scale workloads based on CPU/memory usage.
- Data Source: Fetches metrics from the kubelets (cAdvisor), then makes them available via the `metrics.k8s.io` API; this is also what powers `kubectl top nodes` and `kubectl top pods`.
- Limitations: Designed for autoscaling, not for long-term storage or advanced analytics.
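To make the HPA use case concrete, here is a minimal autoscaler that relies on Metrics Server data (the Deployment name `web` is hypothetical):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: web-hpa            # hypothetical name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: web              # hypothetical Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70   # scale out when average CPU crosses 70%
```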
2. Collecting Node Metrics
Kubelet’s /metrics Endpoint
- What It Is: The kubelet itself exposes node metrics (e.g., CPU/memory usage of the node, runtime stats).
- Where to Find:
  - `https://<node-ip>:10250/metrics` (secure endpoint)
  - `http://<node-ip>:10255/metrics` (legacy read-only endpoint, disabled by default on recent clusters)
  - Also reachable through the API server proxy, e.g. `kubectl get --raw "/api/v1/nodes/<node-name>/proxy/metrics"`
- Collected Data: Node-wide CPU usage, memory usage, runtime container stats (via cAdvisor integration).
Node Exporter (Prometheus)
- What It Is: A Prometheus exporter that collects Linux system-level metrics.
- How to Deploy: Typically deployed as a DaemonSet so that every node runs a Node Exporter container.
- Collected Data: CPU, memory, disk usage, file system stats, network, etc., at the node level.
- Scraping: Prometheus scrapes the Node Exporter endpoints, adding those metrics to the time-series database.
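In practice most people install Node Exporter via a Helm chart (it is bundled in kube-prometheus-stack, covered below), but a stripped-down DaemonSet sketch looks roughly like this (the image tag is an assumption; pin whatever version you actually run):

```yaml
apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: node-exporter
  namespace: monitoring
spec:
  selector:
    matchLabels:
      app: node-exporter
  template:
    metadata:
      labels:
        app: node-exporter
    spec:
      hostNetwork: true            # expose port 9100 on the node's own IP
      hostPID: true
      containers:
        - name: node-exporter
          image: quay.io/prometheus/node-exporter:v1.8.1   # assumed version tag
          args:
            - --path.rootfs=/host  # read host filesystem stats via the mount below
          ports:
            - containerPort: 9100
          volumeMounts:
            - name: rootfs
              mountPath: /host
              readOnly: true
      volumes:
        - name: rootfs
          hostPath:
            path: /
```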
3. Collecting Cluster Metrics & State
kube-state-metrics (KSM)
- What It Is: A component that listens to the Kubernetes API and generates metrics about cluster objects.
- Examples of Metrics:
- Number of desired/available replicas in Deployments, DaemonSets, StatefulSets
- Pod status, job status, node status
- Resource quotas, limits, requests
- How to Deploy: Install via Helm chart or YAML manifest. Usually deployed as a single Deployment, which listens to the API server.
- Scraping: Prometheus scrapes the `/metrics` endpoint of kube-state-metrics to retrieve cluster-level metrics.
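A few representative kube-state-metrics series (the metric names are real families; the label values are illustrative):

```text
kube_deployment_status_replicas_available{namespace="default",deployment="web"} 3
kube_pod_status_phase{namespace="default",pod="web-0",phase="Running"} 1
kube_node_status_condition{node="node-1",condition="Ready",status="true"} 1
```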
Control Plane Metrics (API Server, Scheduler, Controller Manager)
- API Server: Exposes metrics on the secure port, `:6443/metrics`.
- Scheduler: Exposes metrics on its own port (`:10259/metrics` on current versions; the old insecure `:10251` has been removed).
- Controller Manager: Exposes metrics on `:10257/metrics` on current versions (formerly the insecure `:10252`).
- Scraping: Configure Prometheus to scrape these endpoints; this often requires RBAC and service discovery settings to allow secure scraping.
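For example, the API server is commonly scraped through the `kubernetes` Service in the `default` namespace. A hand-written stanza might look like this (the Operator's ServiceMonitors replace this kind of config):

```yaml
scrape_configs:
  - job_name: kube-apiserver
    scheme: https
    kubernetes_sd_configs:
      - role: endpoints
    bearer_token_file: /var/run/secrets/kubernetes.io/serviceaccount/token
    tls_config:
      ca_file: /var/run/secrets/kubernetes.io/serviceaccount/ca.crt
    relabel_configs:
      # keep only the default/kubernetes Service's https endpoints
      - source_labels: [__meta_kubernetes_namespace, __meta_kubernetes_service_name, __meta_kubernetes_endpoint_port_name]
        action: keep
        regex: default;kubernetes;https
```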
4. Putting It All Together with Prometheus
A common and recommended way to gather Kubernetes metrics at all levels (containers, pods, nodes, and cluster objects) is:
- Prometheus Operator:
- Manages Prometheus and Alertmanager instances through CRDs (Prometheus, Alertmanager, ServiceMonitor, PodMonitor).
- Automatically discovers Kubernetes services (including kubelet, cAdvisor, kube-state-metrics, Node Exporter) based on labels or annotations.
- Components to Install:
- Prometheus (for scraping all metrics).
- Node Exporter (usually as a DaemonSet).
- kube-state-metrics (as a Deployment).
- (Optional) Metrics Server (for HPA functionality).
- Scrape Configurations:
  - ServiceMonitor and PodMonitor CRDs tell Prometheus which endpoints to scrape and on which ports.
  - For example, a ServiceMonitor might point at the kubelet's port 10250 for cAdvisor data (see the sketch after this list).
- Storage & Retention:
- Prometheus has an internal time-series database.
- For longer-term storage or large-scale clusters, use Thanos, Cortex, or Mimir to extend Prometheus’ capabilities.
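Here is a ServiceMonitor sketch for the kubelet's cAdvisor endpoint. kube-prometheus-stack ships a more complete version; the label selector below assumes the kubelet Service that the Prometheus Operator creates in `kube-system`:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: kubelet-cadvisor
  namespace: monitoring
spec:
  endpoints:
    - port: https-metrics          # named port on the kubelet Service (10250)
      scheme: https
      path: /metrics/cadvisor
      bearerTokenFile: /var/run/secrets/kubernetes.io/serviceaccount/token
      tlsConfig:
        insecureSkipVerify: true   # assumption: kubelet certs not signed by the cluster CA
  namespaceSelector:
    matchNames: [kube-system]
  selector:
    matchLabels:
      app.kubernetes.io/name: kubelet   # assumes the Operator-managed kubelet Service
```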
5. Visualization & Dashboards
Once you have Prometheus collecting container, pod, node, and cluster metrics, you can visualize them:
- Grafana:
- Very common with Prometheus.
- Community dashboards for Kubernetes are available out of the box (cluster overview, node metrics, pod resource usage, etc.).
- Additional dashboards available for kube-state-metrics, cAdvisor, Node Exporter, etc.
- Splunk Observability, Elastic Stack, Datadog, etc.:
- You can forward Prometheus data (via OpenTelemetry Collector or Prometheus Remote Write) to these platforms.
- Each platform typically provides dashboards and alerting for Kubernetes metrics.
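As an example of forwarding, a Prometheus `remote_write` stanza looks like this (the endpoint URL and credentials are placeholders; each vendor documents its own):

```yaml
remote_write:
  - url: https://metrics.example.com/api/v1/write   # placeholder endpoint
    basic_auth:
      username: prom-forwarder                      # placeholder credentials
      password_file: /etc/prometheus/secrets/remote-write-password
```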
6. Example Deployment Steps (Prometheus Stack)
Here’s a simplified example workflow using Helm:
- Add Repo & Install Prometheus Stack:
```bash
helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
helm repo update

# Install the kube-prometheus-stack (includes Prometheus, Alertmanager, Grafana,
# Node Exporter, kube-state-metrics, etc.)
helm install my-prom-stack prometheus-community/kube-prometheus-stack \
  --namespace monitoring --create-namespace
```
- Confirm Pods:
```bash
kubectl get pods -n monitoring
```

You should see pods like:
- Prometheus server
- Alertmanager
- Node Exporter (DaemonSet)
- kube-state-metrics
- Grafana
- Access Grafana:
  - By default, the Helm chart creates a `Service` for Grafana.
  - You can port-forward to it and log in, or expose it through an Ingress:

```bash
kubectl port-forward svc/my-prom-stack-grafana 3000:80 -n monitoring
```

Then open http://localhost:3000.
- Dashboards:
- Grafana has built-in “Kubernetes / Compute Resources” dashboards when using the kube-prometheus-stack.
- You can also import community dashboards from Grafana.com.
7. Additional Best Practices
- RBAC & Security:
  - Secure access to kubelet metrics (`/metrics/cadvisor`).
  - Use TLS where needed, along with appropriate certificates.
  - Restrict who can query your metrics endpoints.
- Limit Over-Collection:
- High-frequency scraping can lead to large data volumes and performance overhead.
- Consider adjusting scrape intervals or sampling strategies.
- Resource Requests & Limits:
- Ensure the Prometheus server has enough CPU/memory to handle the ingestion load.
- Tune retention time, storage volume size, and any remote-write configuration (see the Prometheus CRD sketch after this list).
- High Availability:
- Run multiple Prometheus replicas if you need HA.
- Tools like Thanos or Cortex can replicate data across multiple Prometheus instances.
- Extend with Logs & Tracing:
- For a full observability stack, add log aggregation (e.g., Fluentd, Loki, Splunk) and distributed tracing (e.g., Jaeger, OpenTelemetry).
- This helps correlate metrics with logs and traces for faster root cause analysis.
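Several of these knobs (scrape interval, retention, resources, replicas) live on the Prometheus Operator's `Prometheus` CRD. A sketch with assumed values:

```yaml
apiVersion: monitoring.coreos.com/v1
kind: Prometheus
metadata:
  name: my-prom
  namespace: monitoring
spec:
  replicas: 2             # HA pair; deduplicate downstream with Thanos or similar
  scrapeInterval: 30s     # lower frequency = less data volume
  retention: 15d          # local TSDB retention before data is dropped
  resources:
    requests:
      cpu: "1"
      memory: 4Gi
```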
In Summary
- cAdvisor (built into the kubelet) collects container-level CPU, memory, network, and filesystem metrics.
- Node Exporter provides OS-level metrics from each node.
- kube-state-metrics exposes cluster resource and object metrics.
- Metrics Server is essential for the Horizontal Pod Autoscaler.
- Prometheus (scraping) + Grafana (dashboards) is the most common open-source solution.
By installing these components (often packaged together with the Prometheus Operator or the kube-prometheus-stack Helm chart), you’ll have a comprehensive view of container, pod, node, and cluster-level metrics in Kubernetes, all accessible for visualization and alerting.