Update tech_docs/prometheus.md
@@ -1,3 +1,294 @@
To add Grafana to your setup, we need to extend the `docker-compose.yml` file and configure Grafana to use Prometheus as a data source. Here are the steps:

### Step 1: Extend docker-compose.yml to Include Grafana

Add the Grafana service to your `docker-compose.yml` file:

```yaml
version: '3.8'

services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      # Scrape configuration and alerting rules are mounted from the host.
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./alert.rules:/etc/prometheus/alert.rules
    networks:
      - monitoring

  # No published port: other containers reach it over the shared network.
  node_exporter:
    image: prom/node-exporter:latest
    container_name: node_exporter
    networks:
      - monitoring

  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"
    networks:
      - monitoring
    volumes:
      # Named volume so dashboards and settings survive container recreation.
      - grafana-storage:/var/lib/grafana

networks:
  monitoring:
    driver: bridge

volumes:
  grafana-storage:
```

### Step 2: Restart Docker Services

Recreate the stack so that the new Grafana service starts alongside Prometheus and Node Exporter:

```bash
docker-compose down
docker-compose up -d
```

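After the restart it can help to confirm that both web services answer before moving on. The following is a minimal sketch, not part of the original setup: `wait_for_url` is a hypothetical helper name, and the retry count and delay are arbitrary defaults. It polls Prometheus's readiness endpoint (`/-/ready`) and Grafana's health endpoint (`/api/health`) with `curl`.

```bash
# Poll a URL until it answers with a successful status, or give up.
# Usage: wait_for_url URL [ATTEMPTS] [DELAY_SECONDS]
# (wait_for_url is a hypothetical helper; the defaults are assumptions.)
wait_for_url() {
  local url=$1
  local attempts=${2:-30}
  local delay=${3:-2}
  local i=1
  while [ "$i" -le "$attempts" ]; do
    # -f makes curl exit non-zero on HTTP errors; -s hides progress output.
    if curl -fs -o /dev/null "$url"; then
      echo "ready: $url"
      return 0
    fi
    i=$((i + 1))
    sleep "$delay"
  done
  echo "timed out: $url" >&2
  return 1
}

# Example, run after `docker-compose up -d`:
#   docker-compose ps
#   wait_for_url http://localhost:9090/-/ready    # Prometheus readiness
#   wait_for_url http://localhost:3000/api/health # Grafana health
```
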
### Step 3: Configure Grafana

1. **Access Grafana**:
   Open your web browser, go to `http://localhost:3000`, and log in with Grafana's default credentials (`admin/admin`). Grafana asks you to set a new password on first login.

2. **Add Prometheus Data Source**:
   - Go to **Configuration > Data Sources > Add data source** (recent Grafana versions list this under **Connections > Data sources**).
   - Select **Prometheus**.
   - Set the URL to `http://prometheus:9090` and save. The hostname `prometheus` is the Compose service name, which resolves inside the `monitoring` network.

3. **Create a Dashboard**:
   - Create a new dashboard.
   - Add panels to visualize metrics from Node Exporter, such as CPU, memory, and disk usage.

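The data source can also be created without the UI through Grafana's HTTP API (`POST /api/datasources`). This is a sketch under a few assumptions: Grafana is reachable at `http://localhost:3000`, the `admin/admin` credentials are still in place (they will not be if you changed the password at first login), and the function names and the data source name `Prometheus` are chosen here for illustration.

```bash
# Assumed defaults; override them through the environment if yours differ.
GRAFANA_URL=${GRAFANA_URL:-http://localhost:3000}
GRAFANA_AUTH=${GRAFANA_AUTH:-admin:admin}

# Print the JSON body describing the Prometheus data source.
datasource_payload() {
  cat <<'EOF'
{
  "name": "Prometheus",
  "type": "prometheus",
  "access": "proxy",
  "url": "http://prometheus:9090",
  "isDefault": true
}
EOF
}

# Send the payload to Grafana; curl reads the request body from stdin (@-).
create_prometheus_datasource() {
  datasource_payload | curl -fsS -u "$GRAFANA_AUTH" \
    -H 'Content-Type: application/json' \
    -X POST "$GRAFANA_URL/api/datasources" \
    --data-binary @-
}

# Example:
#   create_prometheus_datasource
```

The `access` value `proxy` means the Grafana server, not the browser, contacts Prometheus, which is why the in-network hostname `prometheus` works here.
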
### Example PromQL Queries for Grafana Panels

- **CPU Usage**:
  ```promql
  100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
  ```

- **Memory Usage**:
  ```promql
  (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100
  ```

- **Disk Usage**:
  ```promql
  100 - ((node_filesystem_avail_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"} / node_filesystem_size_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"}) * 100)
  ```

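Before pasting a query into a Grafana panel, it can be tried directly against Prometheus's HTTP API (`/api/v1/query`). The sketch below assumes Prometheus is published on `http://localhost:9090`, as in the Compose file; `prom_query` is a hypothetical helper name.

```bash
# Assumed default; override through the environment if yours differs.
PROM_URL=${PROM_URL:-http://localhost:9090}

# Run an instant PromQL query and print the raw JSON response.
prom_query() {
  # -G sends the url-encoded data as a GET query string.
  curl -fsS -G "$PROM_URL/api/v1/query" --data-urlencode "query=$1"
}

# Examples:
#   prom_query 'node_load1'
#   prom_query '(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100'
```

An empty `result` array in the response usually means the metric is not being scraped, which is worth checking before debugging the panel itself.
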
### Step 4: Verify the Setup

1. **Check Grafana Dashboard**:
   Open Grafana at `http://localhost:3000` and verify that you can see the metrics from your Linux systems.

2. **Check Prometheus Targets**:
   Open Prometheus at `http://localhost:9090/targets` and confirm that every target is being scraped; each one should show the state `UP`.

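The target check can also be scripted against Prometheus's `/api/v1/targets` endpoint. This is a rough sketch that counts healthy targets with `grep` instead of a JSON parser; it assumes the compact JSON that Prometheus emits (no space after the colon), and `count_targets_up` is a hypothetical helper name.

```bash
# Count targets reporting "health":"up" in a /api/v1/targets response
# read from stdin. Text matching is a shortcut, not a real JSON parse.
count_targets_up() {
  grep -o '"health":"up"' | wc -l | tr -d ' '
}

# Example:
#   curl -fsS http://localhost:9090/api/v1/targets | count_targets_up
```

Comparing the printed number with the number of scrape targets you configured gives a quick pass/fail signal for scripts or cron jobs.
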
### Summary

By adding Grafana to your Docker Compose setup and configuring it to use Prometheus as a data source, you can build dashboards that visualize metrics from your Linux systems, with Prometheus collecting the data and Grafana displaying it.

---

### Key Metrics and KPIs to Monitor

1. **CPU Usage**
2. **Memory Usage**
3. **Disk Usage**
4. **Network Traffic**
5. **System Load**
6. **Uptime**
7. **Temperature (if available)**

### Step-by-Step Guide to Create a Grafana Dashboard

#### Step 1: Access Grafana

Open your web browser and go to `http://localhost:3000` (Grafana default credentials: `admin/admin`).

#### Step 2: Add Prometheus Data Source

1. **Configuration > Data Sources > Add data source**
2. **Select Prometheus**
3. **Set the URL to `http://prometheus:9090` and save**

#### Step 3: Create a New Dashboard

1. **Dashboard > New Dashboard > Add a New Panel**

#### Step 4: Add Panels with PromQL Queries

Here are the important metrics and their corresponding PromQL queries:

1. **CPU Usage**

   - **Panel Title:** CPU Usage (%)
   - **PromQL Query:**
     ```promql
     100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
     ```

2. **Memory Usage**

   - **Panel Title:** Memory Usage (%)
   - **PromQL Query:**
     ```promql
     (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100
     ```

3. **Disk Usage**

   - **Panel Title:** Disk Usage (%)
   - **PromQL Query:**
     ```promql
     100 - ((node_filesystem_avail_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"} / node_filesystem_size_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"}) * 100)
     ```

4. **Network Traffic**

   - **Panel Title:** Network Inbound Traffic (Bytes/s)
   - **PromQL Query:**
     ```promql
     rate(node_network_receive_bytes_total[5m])
     ```

   - **Panel Title:** Network Outbound Traffic (Bytes/s)
   - **PromQL Query:**
     ```promql
     rate(node_network_transmit_bytes_total[5m])
     ```

5. **System Load**

   - **Panel Title:** System Load (1m)
   - **PromQL Query:**
     ```promql
     node_load1
     ```

6. **Uptime**

   - **Panel Title:** System Uptime (seconds)
   - **PromQL Query:**
     ```promql
     node_time_seconds - node_boot_time_seconds
     ```

7. **Temperature (if available)**

   - **Panel Title:** CPU Temperature (°C)
   - **PromQL Query:**
     ```promql
     node_hwmon_temp_celsius
     ```

### Example Panel Configurations

#### CPU Usage Panel

1. **Add a new panel**.
2. **Set the title to "CPU Usage (%)"**.
3. **Enter the PromQL query**:
   ```promql
   100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
   ```
4. **Set visualization type to "Graph"** (called "Time series" in recent Grafana versions).
5. **Customize the visualization settings (e.g., set the y-axis unit to percent, 0-100)**.

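To see what the CPU query computes, the arithmetic can be worked through by hand for a single core. The sketch below uses made-up sample values (an idle counter that grows from 1000 s to 1013.5 s over a 15 s interval); `cpu_usage_percent` is a hypothetical helper, and the real query additionally averages this value across all cores of an instance.

```bash
# Usage percent for one core from two samples of its idle counter.
# $1, $2: idle counter in seconds at the first and second sample
# $3:     seconds elapsed between the two samples
cpu_usage_percent() {
  awk -v a="$1" -v b="$2" -v dt="$3" \
    'BEGIN { printf "%.1f\n", 100 - ((b - a) / dt) * 100 }'
}

# 13.5 of 15 seconds were idle, so the core was busy 10% of the time.
cpu_usage_percent 1000 1013.5 15   # prints 10.0
```

This mirrors how `irate` works: it takes the last two samples in the range and divides the counter increase by the time between them.
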
#### Memory Usage Panel

1. **Add a new panel**.
2. **Set the title to "Memory Usage (%)"**.
3. **Enter the PromQL query**:
   ```promql
   (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100
   ```
4. **Set visualization type to "Graph"**.
5. **Customize the visualization settings (e.g., set y-axis unit to percentage)**.

#### Disk Usage Panel

1. **Add a new panel**.
2. **Set the title to "Disk Usage (%)"**.
3. **Enter the PromQL query**:
   ```promql
   100 - ((node_filesystem_avail_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"} / node_filesystem_size_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"}) * 100)
   ```
4. **Set visualization type to "Graph"**.
5. **Customize the visualization settings (e.g., set y-axis unit to percentage)**.

#### Network Traffic Panels

**Inbound Traffic**

1. **Add a new panel**.
2. **Set the title to "Network Inbound Traffic (Bytes/s)"**.
3. **Enter the PromQL query**:
   ```promql
   rate(node_network_receive_bytes_total[5m])
   ```
4. **Set visualization type to "Graph"**.
5. **Customize the visualization settings (e.g., set y-axis unit to bytes/sec)**.

**Outbound Traffic**

1. **Add a new panel**.
2. **Set the title to "Network Outbound Traffic (Bytes/s)"**.
3. **Enter the PromQL query**:
   ```promql
   rate(node_network_transmit_bytes_total[5m])
   ```
4. **Set visualization type to "Graph"**.
5. **Customize the visualization settings (e.g., set y-axis unit to bytes/sec)**.

#### System Load Panel

1. **Add a new panel**.
2. **Set the title to "System Load (1m)"**.
3. **Enter the PromQL query**:
   ```promql
   node_load1
   ```
4. **Set visualization type to "Graph"**.
5. **Customize the visualization settings (e.g., set y-axis unit to none)**.

#### Uptime Panel

1. **Add a new panel**.
2. **Set the title to "System Uptime (seconds)"**.
3. **Enter the PromQL query**:
   ```promql
   node_time_seconds - node_boot_time_seconds
   ```
4. **Set visualization type to "Stat"**.
5. **Customize the visualization settings (e.g., set the unit to seconds)**.

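The uptime query returns a raw number of seconds, which is hard to read at a glance. Setting the panel unit to seconds already makes Grafana render it in a friendlier form; outside Grafana, the same conversion is plain integer arithmetic. `format_uptime` below is a hypothetical helper, shown only as an illustration of that arithmetic.

```bash
# Convert a number of seconds into days, hours, minutes, and seconds.
format_uptime() {
  # Drop any fractional part, since PromQL values are floats.
  local s=${1%.*}
  printf '%dd %dh %dm %ds\n' \
    $((s / 86400)) $((s % 86400 / 3600)) $((s % 3600 / 60)) $((s % 60))
}

format_uptime 93784   # prints "1d 2h 3m 4s"
```
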
#### Temperature Panel (if available)

1. **Add a new panel**.
2. **Set the title to "CPU Temperature (°C)"**.
3. **Enter the PromQL query**:
   ```promql
   node_hwmon_temp_celsius
   ```
4. **Set visualization type to "Graph"**.
5. **Customize the visualization settings (e.g., set y-axis unit to degrees Celsius)**.

### Summary

With these panels in place, you have a single Grafana dashboard that displays the key metrics and KPIs for your Linux systems and shows the performance and health of your infrastructure at a glance.

---

### Directory Structure

We'll organize the directories under `/volume1/docker/prometheus` as follows: