Physical Node Monitoring
Monitoring the physical nodes of your Kubernetes cluster is critical when running Arrikto EKF in production. It lets you validate that EKF performs as expected and helps you detect and troubleshoot issues in a timely manner.
Inspecting the performance and status of your physical nodes is key to keeping all the components of your EKF installation healthy and functional. The Rok Monitoring Stack increases system observability by collecting and visualizing both hardware and OS metrics from your physical nodes. This helps you maintain high levels of performance and availability.
The Rok Monitoring Stack uses the Prometheus Node Exporter to collect machine
metrics and serve them at the
/metrics HTTP endpoint. By default, the
Prometheus Node Exporter enables a large variety of collectors that cover
different areas of the underlying operating system and hardware, such as:
- Machine Specs
The metrics that the Prometheus Node Exporter exposes can be used for real-time monitoring, debugging, and performance testing. The Prometheus Node Exporter does not persist metrics on its own, so its metrics are reset whenever it restarts.
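As a rough illustration of what the /metrics endpoint returns, the following sketch parses a few lines of the Prometheus text exposition format. It assumes samples carry no trailing timestamp, and the URL in `fetch_metrics` is hypothetical (9100 is the upstream Node Exporter default port, which your installation may not expose directly).

```python
from urllib.request import urlopen


def fetch_metrics(url="http://localhost:9100/metrics"):
    """Fetch the raw exposition text. The URL is a hypothetical example."""
    with urlopen(url, timeout=5) as response:
        return response.read().decode("utf-8")


def parse_metrics(text):
    """Parse exposition lines into (name, labels, value) tuples.

    Assumes each sample line is 'name{labels} value' with no timestamp.
    """
    samples = []
    for line in text.splitlines():
        line = line.strip()
        # Skip blank lines and '# HELP' / '# TYPE' comment lines.
        if not line or line.startswith("#"):
            continue
        # The value is everything after the last space on the line.
        series, _, value = line.rpartition(" ")
        # The metric name ends where the optional label set begins.
        name, _, rest = series.partition("{")
        labels = rest.rstrip("}")
        samples.append((name, labels, float(value)))
    return samples


SAMPLE = """\
# HELP node_load1 1m load average.
# TYPE node_load1 gauge
node_load1 0.42
node_cpu_seconds_total{cpu="0",mode="idle"} 1234.5
"""

for name, labels, value in parse_metrics(SAMPLE):
    print(name, labels, value)
```

To run this against a live exporter, pass the output of `fetch_metrics()` to `parse_metrics()` instead of `SAMPLE`.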
To persist node metrics on Kubernetes, the Rok Monitoring Stack creates a
ServiceMonitor custom resource in the namespace where Rok is deployed to
configure Rok Prometheus to periodically pull metrics from the Prometheus Node
Exporter and save them in its time-series database.
By default, Rok Prometheus retains metrics for three days.
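To give a sense of what such a ServiceMonitor might look like, here is a minimal sketch that builds one as a Python dictionary and prints it as JSON, which kubectl also accepts. The resource name, the `rok` namespace, the label selector, the port name, and the scrape interval are all assumptions for illustration. They are not taken from the manifests that the Rok Monitoring Stack actually ships.

```python
import json


def build_service_monitor(namespace, app_label, port_name, interval="30s"):
    """Build a ServiceMonitor manifest. All argument values are illustrative."""
    return {
        "apiVersion": "monitoring.coreos.com/v1",
        "kind": "ServiceMonitor",
        "metadata": {
            "name": "node-exporter",  # hypothetical resource name
            "namespace": namespace,
        },
        "spec": {
            # Select the Service that fronts the Node Exporter Pods.
            "selector": {
                "matchLabels": {"app.kubernetes.io/name": app_label},
            },
            # Tell Prometheus which Service port and path to scrape.
            "endpoints": [
                {"port": port_name, "path": "/metrics", "interval": interval},
            ],
        },
    }


# Assumed values: namespace "rok", label "node-exporter", port name "metrics".
manifest = build_service_monitor("rok", "node-exporter", "metrics")
print(json.dumps(manifest, indent=2))
```

Inspect the ServiceMonitor in your own cluster for the real selector and endpoint settings before relying on any of these values.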
The Prometheus Node Exporter exposes metrics under the following prefixes:
- Go application metrics, under the
- Prometheus metric handler metrics, under the
- Node metrics under the
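The grouping above can be sketched as a small helper that counts metric names per family. The prefixes `go_`, `promhttp_`, and `node_` are an assumption based on the upstream Prometheus Node Exporter's naming conventions, and the sample names are only examples of metrics the upstream exporter is known to expose.

```python
from collections import Counter

# Assumed prefixes, following upstream Node Exporter conventions.
PREFIXES = ("go_", "promhttp_", "node_")


def count_by_prefix(names):
    """Count metric names per known prefix; unmatched names go to 'other'."""
    counts = Counter()
    for name in names:
        family = next((p for p in PREFIXES if name.startswith(p)), "other")
        counts[family] += 1
    return dict(counts)


sample_names = [
    "go_goroutines",
    "promhttp_metric_handler_requests_total",
    "node_load1",
    "node_memory_MemAvailable_bytes",
    "process_cpu_seconds_total",
]
print(count_by_prefix(sample_names))
```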
Rok Prometheus collects and stores all metrics exposed by the Prometheus Node Exporter, while Rok Grafana provides a wide variety of dashboards that query for and visualize metrics collected from physical nodes.
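Dashboards of this kind ultimately issue PromQL queries against Prometheus. As a hedged sketch, the snippet below builds a request URL for the standard Prometheus HTTP API instant-query endpoint (`/api/v1/query`). The base URL is hypothetical, and the PromQL expression is one common way to estimate per-node CPU utilization, not necessarily the query that the Rok Grafana dashboards use.

```python
from urllib.parse import urlencode

# A common PromQL expression for CPU utilization (percent) per instance.
CPU_UTILIZATION = (
    '100 * (1 - avg by (instance) '
    '(rate(node_cpu_seconds_total{mode="idle"}[5m])))'
)


def build_query_url(base_url, promql):
    """Return the instant-query URL for a PromQL expression."""
    query_string = urlencode({"query": promql})
    return f"{base_url.rstrip('/')}/api/v1/query?{query_string}"


# Hypothetical address; replace it with how you reach Rok Prometheus.
url = build_query_url("http://rok-prometheus.example:9090/", CPU_UTILIZATION)
print(url)
```

Issuing a GET request to the resulting URL returns a JSON document with the query result, provided the Prometheus endpoint is reachable from where you run the script.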