Physical Node Monitoring
Learning how to monitor the physical nodes of your Kubernetes cluster is critical when running Arrikto EKF in production. Monitoring your physical nodes lets you validate that EKF performs as expected, while it also helps you detect and troubleshoot issues in a timely manner.
Inspecting the performance and status of your physical nodes is key to keeping all the components of your EKF installation healthy and functional. The Rok Monitoring Stack increases system observability by collecting and visualizing both hardware and OS metrics from your physical nodes. This helps you maintain high levels of performance and availability.
Overview
Introduction
The Rok Monitoring Stack uses the Prometheus Node Exporter to collect machine
metrics and serve them at the /metrics HTTP endpoint. By default, the
Prometheus Node Exporter enables a large variety of collectors that cover
different areas of the underlying operating system and hardware, such as:
- Machine Specs
- CPU
- Memory
- Disk
- Filesystem
- Network
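To make the /metrics endpoint more concrete, the following is a minimal Python sketch of how a client might read the Prometheus text exposition format that the Node Exporter serves. The sample text, its values, and the helper name parse_metrics are illustrative assumptions, not part of EKF or Rok. The parser is deliberately simplified: it ignores optional timestamps and does not handle label values that contain commas or closing braces.

```python
def parse_metrics(text):
    """Parse Prometheus text exposition lines into (name, labels, value) tuples.

    Simplified sketch: assumes label values contain no commas or braces.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and "# HELP" / "# TYPE" comments
        labels = {}
        if "{" in line:
            # Form: name{key="value",...} value
            name, rest = line.split("{", 1)
            label_str, _, tail = rest.rpartition("}")
            for pair in label_str.split(","):
                if pair:
                    key, _, val = pair.partition("=")
                    labels[key.strip()] = val.strip().strip('"')
        else:
            # Form: name value
            name, _, tail = line.partition(" ")
        value = float(tail.split()[0])
        samples.append((name, labels, value))
    return samples


# Illustrative sample only; real output contains many more series.
SAMPLE = """\
# HELP node_cpu_seconds_total Seconds the CPUs spent in each mode.
# TYPE node_cpu_seconds_total counter
node_cpu_seconds_total{cpu="0",mode="idle"} 12345.67
node_memory_MemAvailable_bytes 8.123e+09
"""

for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

In practice you would rarely parse this output by hand, since Rok Prometheus scrapes the endpoint for you. The sketch is only meant to show what a scrape actually reads.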
The metrics that the Prometheus Node Exporter exposes can be used for real-time monitoring, debugging, and performance testing. The Prometheus Node Exporter does not persist its metrics on its own; metrics are reset whenever it restarts.
To persist these metrics on Kubernetes, the Rok Monitoring Stack creates a
ServiceMonitor custom resource in the namespace where Rok is deployed. This
resource configures Rok Prometheus to periodically pull metrics from the
Prometheus Node Exporter and save them in its time-series database.
Note
By default, Rok Prometheus retains metrics for three days.
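Because metrics older than the retention period are no longer available, any historical query should stay inside that window. The Python sketch below clamps a requested start time to the three-day default and builds a URL for the standard Prometheus query_range HTTP API. The base URL is a hypothetical placeholder, since the address of Rok Prometheus depends on your deployment. The helper names are illustrative, and node_load1 is used only as an example Node Exporter metric.

```python
from datetime import datetime, timedelta, timezone
from urllib.parse import urlencode

# Default retention stated in this guide; adjust if your deployment differs.
RETENTION = timedelta(days=3)


def clamp_start(start, now, retention=RETENTION):
    """Return the later of `start` and the oldest instant still retained."""
    oldest = now - retention
    return max(start, oldest)


def query_range_url(base_url, promql, start, end, step="60s"):
    """Build a URL for the Prometheus /api/v1/query_range endpoint."""
    params = urlencode({
        "query": promql,
        "start": start.timestamp(),
        "end": end.timestamp(),
        "step": step,
    })
    return f"{base_url}/api/v1/query_range?{params}"


now = datetime.now(timezone.utc)
requested_start = now - timedelta(days=7)   # asks for more than is retained
start = clamp_start(requested_start, now)   # trimmed to the last three days

# Hypothetical address; replace with wherever Rok Prometheus is reachable.
print(query_range_url("http://prometheus.example:9090", "node_load1", start, now))
```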
Metrics
The Prometheus Node Exporter exposes metrics under the following prefixes:
- Go application metrics, under the go_ prefix
- Prometheus metric handler metrics, under the promhttp_ prefix
- Node metrics, under the node_ prefix, e.g., node_cpu_, node_disk_, node_filesystem_, etc.
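As a small illustration of the prefix scheme above, the following Python sketch groups metric names by their family prefix. The function names and the "other" bucket are assumptions made for this example. The metric names in the sample list are common Node Exporter series, used here only as input.

```python
from collections import defaultdict

# Prefixes listed in this section.
FAMILIES = ("go_", "promhttp_", "node_")


def family_of(metric_name):
    """Return the matching prefix for a metric name, or "other" if none match."""
    for prefix in FAMILIES:
        if metric_name.startswith(prefix):
            return prefix
    return "other"


def group_by_family(names):
    """Group metric names by prefix, preserving input order within each group."""
    groups = defaultdict(list)
    for name in names:
        groups[family_of(name)].append(name)
    return dict(groups)


names = [
    "go_goroutines",
    "promhttp_metric_handler_requests_total",
    "node_cpu_seconds_total",
    "node_filesystem_avail_bytes",
    "process_cpu_seconds_total",
]

for family, members in group_by_family(names).items():
    print(family, members)
```

A grouping like this can help when deciding which series matter for node health (the node_ family) and which describe the exporter process itself.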
Rok Prometheus collects and stores all metrics exposed by the Prometheus Node Exporter, while Rok Grafana provides a wide variety of dashboards that query for and visualize metrics collected from physical nodes.