Below is an overview of how Prometheus scraping works, how Prometheus discovers (“finds”) targets in Kubernetes or other environments, and how it retrieves metrics from those targets.
1. Prometheus Scraping Fundamentals
- Pull-Based Model
- Prometheus uses a pull model: it periodically sends HTTP requests (scrapes) to endpoints (targets) that expose metrics in a plaintext or OpenMetrics format.
- By default, metrics are served at a path like http://<host>:<port>/metrics.
- Prometheus Configuration (prometheus.yml)
- Prometheus is configured through a YAML file (often named prometheus.yml).
- This config includes a scrape_configs section with one or more jobs. Each job defines how Prometheus discovers targets and where it scrapes them from.
Example snippet of a scrape_config:
scrape_configs:
  - job_name: 'example-service'
    kubernetes_sd_configs:
      - role: service
    relabel_configs:
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: true
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        regex: (.+):(?:\d+);(\d+)
        replacement: $1:$2
        target_label: __address__
- job_name: The name of the scrape job.
- kubernetes_sd_configs: Uses Kubernetes service discovery to find services.
- relabel_configs: Filters or transforms discovered targets to the correct address/port/path for scraping.
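- Scrape Interval
- Each job can set its own scrape_interval. A job that does not set one inherits the global scrape_interval, which defaults to 1 minute (many example configs set it to 15 seconds). Prometheus scrapes each discovered target at that interval.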
- No “Metrics Files”
- Prometheus doesn’t fetch “metrics files” in the sense of logs on disk. It sends HTTP GET requests to the target’s /metrics (or another path) endpoint, which returns the metrics in text format (the Prometheus exposition format).
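As a rough sketch of how the interval settings described above fit together, the fragment below sets a global interval and overrides it for one job. The job names, target addresses, and the 30s/10s values are illustrative assumptions, not recommendations.

```yaml
global:
  scrape_interval: 30s      # applies to every job that does not set its own
  scrape_timeout: 10s       # must not exceed the scrape interval

scrape_configs:
  - job_name: 'default-interval-job'   # hypothetical job; inherits the global 30s
    static_configs:
      - targets: ['192.168.1.10:9100']

  - job_name: 'fast-job'               # hypothetical job; overrides the global values
    scrape_interval: 10s
    scrape_timeout: 5s
    static_configs:
      - targets: ['192.168.1.11:9100']
```

Keeping the override at job level means only the targets that need finer resolution pay the extra scrape cost.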
2. How Prometheus Finds Targets
A. Static Configuration (Non-Kubernetes)
For basic setups (e.g., dev or PoC), you can hardcode targets:
scrape_configs:
- job_name: 'static_example'
static_configs:
- targets: ['192.168.1.10:9100', '192.168.1.11:9100']
Prometheus will scrape each of those targets on the specified port at the default /metrics path (set metrics_path on the job to change it).
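If a statically configured target serves metrics somewhere other than /metrics, the job can say so. The following is a minimal sketch; the job name, address, path, and the env label are assumptions chosen for illustration.

```yaml
scrape_configs:
  - job_name: 'static_custom'            # hypothetical job name
    metrics_path: /actuator/prometheus   # assumed path; defaults to /metrics when omitted
    scheme: http                         # the default; use https for TLS endpoints
    static_configs:
      - targets: ['192.168.1.12:8080']   # assumed address
        labels:
          env: dev                       # extra label attached to every series from these targets
```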
B. Service Discovery (Kubernetes, EC2, Consul, etc.)
- Kubernetes Service Discovery
- In a Kubernetes cluster, Prometheus can use the Kubernetes API to dynamically discover pods/services/endpoints.
- Common approaches:
- role: service: Discover services.
- role: pod: Discover pods directly.
- Service Monitors / Pod Monitors if using the Prometheus Operator.
- Annotations in Kubernetes
- A common pattern is to annotate Services or Pods:
prometheus.io/scrape: "true"
prometheus.io/port: "8080"
prometheus.io/path: "/metrics"
- These annotations are a convention rather than a built-in feature: they only take effect when relabel_configs reference them, for example by keeping only those targets that have prometheus.io/scrape set to "true".
- Other Service Discovery
- Prometheus also supports EC2, Azure, GCE, Consul, etc.
- Each discovery mechanism has its own config block (ec2_sd_configs:, consul_sd_configs:, etc.).
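To make the non-Kubernetes discovery blocks more concrete, here is a hedged sketch of two jobs, one using Consul and one using EC2. The Consul address, service name, region, and the instance_name label are all assumptions; only the block and field names come from the Prometheus configuration format.

```yaml
scrape_configs:
  - job_name: 'consul-services'
    consul_sd_configs:
      - server: 'consul.example.internal:8500'  # assumed Consul address
        services: ['web']                        # assumed service name; omit to discover all services

  - job_name: 'ec2-node-exporter'
    ec2_sd_configs:
      - region: us-east-1   # assumed region; credentials come from the usual AWS sources
        port: 9100          # port used to build __address__ for each instance
    relabel_configs:
      # Copy the EC2 "Name" tag into a regular label (the label name is an assumption).
      - source_labels: [__meta_ec2_tag_Name]
        target_label: instance_name
```

As with Kubernetes discovery, the __meta_* labels exist only during relabeling, so anything worth keeping has to be copied into a regular label.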
3. Scraping Flow in Kubernetes
- Prometheus Queries the K8s API
- Using the credentials provided (often via in-cluster config if you run Prometheus inside K8s), Prometheus queries the Kubernetes API to list Pods, Services, or Endpoints.
- Relabeling
- The discovered targets have metadata (like labels, annotations).
- Via relabel_configs, Prometheus transforms or filters this metadata to determine the final scrape endpoint (i.e., IP:port and path).
- HTTP GET to /metrics
- On each scrape interval, Prometheus sends an HTTP GET request to each valid target.
- The target returns a plaintext metrics payload with one line per sample (like node_cpu_seconds_total{cpu="0"} 1000).
- Prometheus Ingests & Stores
- Prometheus parses the returned data and stores the time series in its internal TSDB (time-series database).
4. Verifying Which Targets Are Scraped
- Prometheus Web UI
- Access the Prometheus web UI (e.g., http://<prometheus-host>:9090).
- Go to Status -> Targets.
- You’ll see a list of all targets Prometheus is currently scraping, their job name, last scrape time, and scrape status.
- Debugging Discovery
- In the web UI, go to Status -> Service Discovery.
- This shows you the raw data returned by the service discovery mechanism (like the Kubernetes API) before relabeling.
- You can see which pods/services are being discovered and how they are labeled.
5. How to “Find the Target Node and Get the Metrics”
- Kubernetes (Node)
- If you want node metrics, you often run Node Exporter as a DaemonSet.
- This exporter runs on each node (so each node is a target).
- The node exporter typically exposes metrics on port 9100 at the /metrics path.
- Alternatively, you can scrape kubelet’s cAdvisor endpoint to get container-level metrics.
- Pods and Services
- If your microservice is instrumented with a Prometheus client library and you expose /metrics, Prometheus can discover and scrape that endpoint.
- The underlying node is “found” automatically via the K8s service discovery logic (the node IP or Pod IP).
- You can see in the Prometheus “Targets” page exactly which IP and port it’s scraping.
- Raw “Metrics File”
- Technically, you can fetch the raw metrics text from any target by doing curl http://<target-ip>:<port>/metrics.
- This is not stored as a file on the node by default. It’s generated dynamically when you hit the /metrics endpoint.
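One way to turn each Kubernetes node into a Node Exporter target is sketched below. It assumes the exporter is reachable on the node's own address at port 9100 (for example via hostNetwork or a hostPort), which the text above suggests but does not guarantee; the job name and the node label are likewise illustrative.

```yaml
scrape_configs:
  - job_name: 'node-exporter'          # hypothetical job name
    kubernetes_sd_configs:
      - role: node
    relabel_configs:
      # role: node yields <node-address>:<kubelet-port>; swap in the exporter port.
      - source_labels: [__address__]
        action: replace
        regex: (.+):\d+
        replacement: $1:9100           # assumed Node Exporter port
        target_label: __address__
      # Expose the Kubernetes node name as a regular label.
      - source_labels: [__meta_kubernetes_node_name]
        target_label: node
```

If the exporter is only reachable through its pod IP, discovering it with role: pod or role: endpoints is the more natural fit.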
6. Example (Kubernetes Service Annotation)
Let’s say you have a Service manifest like:
apiVersion: v1
kind: Service
metadata:
  name: my-app
  namespace: default
  annotations:
    prometheus.io/scrape: "true"
    prometheus.io/port: "8080"
    prometheus.io/path: "/metrics"
spec:
  selector:
    app: my-app
  ports:
    - port: 8080
      targetPort: 8080
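- Prometheus sees the annotation prometheus.io/scrape: "true" and keeps the target.
- With role: endpoints, it scrapes each pod behind that Service on port 8080 at the /metrics path. With role: service (as in the earlier snippet), it scrapes the Service address itself, so each scrape may land on a different pod.
- You can verify this in the Prometheus web UI under Status -> Targets (look for my-app).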
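For per-pod scraping of an annotated Service like the one above, a job along these lines is a common starting point. It is a sketch rather than the only way to do it: the job name and the service and namespace target labels are assumptions, and the annotation-based rules work only because the relabeling references them.

```yaml
scrape_configs:
  - job_name: 'kubernetes-service-endpoints'   # hypothetical job name
    kubernetes_sd_configs:
      - role: endpoints                        # one target per endpoint address and port
    relabel_configs:
      # Keep only endpoints whose Service carries prometheus.io/scrape: "true".
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_scrape]
        action: keep
        regex: "true"
      # Use prometheus.io/path as the metrics path when it is set.
      - source_labels: [__meta_kubernetes_service_annotation_prometheus_io_path]
        action: replace
        target_label: __metrics_path__
        regex: (.+)
      # Rewrite the address to use the port from prometheus.io/port.
      - source_labels: [__address__, __meta_kubernetes_service_annotation_prometheus_io_port]
        action: replace
        regex: ([^:]+)(?::\d+)?;(\d+)
        replacement: $1:$2
        target_label: __address__
      # Carry namespace and Service name over as regular labels.
      - source_labels: [__meta_kubernetes_namespace]
        target_label: namespace
      - source_labels: [__meta_kubernetes_service_name]
        target_label: service
```

The address regex here also matches addresses that have no port, which is slightly more permissive than the pattern in the earlier snippet.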
7. Key Takeaways
- Prometheus “Scrape Configs” define how targets are found and how often they are scraped.
- Kubernetes Service Discovery (and optional annotations) automates target discovery in a cluster.
- Prometheus Doesn’t Pull Metrics Files from the file system; it makes HTTP GET requests to the /metrics endpoint each scrape interval.
- Check the Prometheus UI under Status -> Targets or Status -> Service Discovery to see what endpoints are being scraped and how they’re labeled.
Once your scrape_configs are set up properly in prometheus.yml (or you use ServiceMonitor/PodMonitor objects with the Prometheus Operator), Prometheus will automatically find the node, pod, and service endpoints in Kubernetes and scrape the metrics they expose.
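For the Prometheus Operator route, a ServiceMonitor might look roughly like the sketch below. It rests on several assumptions that go beyond the Service manifest shown earlier: that the Service carries the label app: my-app, that its port is named http, and that the Operator's Prometheus resource is configured to select this ServiceMonitor.

```yaml
apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  name: my-app
  namespace: default
spec:
  selector:
    matchLabels:
      app: my-app        # assumed label on the Service itself (not its pod selector)
  endpoints:
    - port: http         # assumed name of the Service port; ServiceMonitors reference ports by name
      path: /metrics
      interval: 30s      # illustrative value
```

With this approach the Operator generates the scrape configuration, so the prometheus.io/* annotations are not needed.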