the_information_nexus/tech_docs/prometheus.md
2024-05-29 19:14:01 +00:00

To add Grafana to your setup, we need to extend the `docker-compose.yml` file and configure Grafana to use Prometheus as a data source. Here are the steps:
### Step 1: Extend docker-compose.yml to Include Grafana
Add the Grafana service to your `docker-compose.yml` file:
```yaml
version: '3.8'
services:
  prometheus:
    image: prom/prometheus:latest
    container_name: prometheus
    ports:
      - "9090:9090"
    volumes:
      - ./prometheus.yml:/etc/prometheus/prometheus.yml
      - ./alert.rules:/etc/prometheus/alert.rules
    networks:
      - monitoring
  node_exporter:
    image: prom/node-exporter:latest
    container_name: node_exporter
    networks:
      - monitoring
  grafana:
    image: grafana/grafana:latest
    container_name: grafana
    ports:
      - "3000:3000"
    networks:
      - monitoring
    volumes:
      - grafana-storage:/var/lib/grafana
networks:
  monitoring:
    driver: bridge
volumes:
  grafana-storage:
```
### Step 2: Restart Docker Services
Restart your Docker services to include Grafana:
```bash
docker-compose down
docker-compose up -d
```
### Step 3: Configure Grafana
1. **Access Grafana**:
   Open your web browser and go to `http://localhost:3000` (Grafana default credentials: `admin/admin`).
2. **Add Prometheus Data Source**:
   - Go to **Configuration > Data Sources > Add data source**.
   - Select **Prometheus**.
   - Set the URL to `http://prometheus:9090` (the Compose service name, which resolves because both containers share the `monitoring` network) and save.
3. **Create a Dashboard**:
   - Create a new dashboard.
   - Add new panels to visualize metrics from Node Exporter, such as CPU usage, memory usage, and disk usage.
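The data source can also be declared in a provisioning file instead of being added through the UI. The following is a minimal sketch, assuming a hypothetical host directory `./grafana/provisioning/datasources` that is mounted into the Grafana container at `/etc/grafana/provisioning/datasources`. Neither the directory nor the mount is part of the Compose file above.

```yaml
# ./grafana/provisioning/datasources/prometheus.yml (hypothetical path)
apiVersion: 1
datasources:
  - name: Prometheus
    type: prometheus
    access: proxy                 # Grafana's backend issues the queries
    url: http://prometheus:9090   # Compose service name on the monitoring network
    isDefault: true
```

Grafana reads provisioning files at startup, so the container needs a restart after the file is added or changed.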
### Example PromQL Queries for Grafana Panels
- **CPU Usage**:
```promql
100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
```
- **Memory Usage**:
```promql
(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100
```
- **Disk Usage**:
```promql
100 - ((node_filesystem_avail_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"} / node_filesystem_size_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"}) * 100)
```
### Step 4: Verify the Setup
1. **Check Grafana Dashboard**:
   Open Grafana at `http://localhost:3000` and verify that you can see the metrics from your Linux systems.
2. **Check Prometheus Targets**:
   Open Prometheus at `http://localhost:9090/targets` to ensure all targets are being scraped correctly.
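The manual checks can be complemented by a container health check. This is a sketch only: it assumes the `prom/prometheus` image ships a BusyBox `wget`, which is worth confirming for the image tag you run. The `/-/healthy` path is Prometheus's built-in health endpoint.

```yaml
services:
  prometheus:
    healthcheck:
      # Succeeds when Prometheus answers on its health endpoint
      test: ["CMD", "wget", "-q", "--spider", "http://localhost:9090/-/healthy"]
      interval: 30s
      timeout: 5s
      retries: 3
```

With this fragment merged into the `prometheus` service, `docker-compose ps` reports the container's health state alongside its run state.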
### Summary
By adding Grafana to your Docker Compose setup and configuring it to use Prometheus as a data source, you can create powerful dashboards to visualize metrics from your Linux systems. This provides a comprehensive monitoring solution using Prometheus and Grafana. If you have any questions or need further assistance, feel free to ask!

---
### Key Metrics and KPIs to Monitor
1. **CPU Usage**
2. **Memory Usage**
3. **Disk Usage**
4. **Network Traffic**
5. **System Load**
6. **Uptime**
7. **Temperature (if available)**
### Step-by-Step Guide to Create a Grafana Dashboard
#### Step 1: Access Grafana
Open your web browser and go to `http://localhost:3000` (Grafana default credentials: `admin/admin`).
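Because `admin/admin` is well known, it may be preferable to set the initial admin credentials in the Compose file. A minimal sketch using Grafana's `GF_SECURITY_ADMIN_USER` and `GF_SECURITY_ADMIN_PASSWORD` environment variables; the password shown is a placeholder:

```yaml
services:
  grafana:
    environment:
      - GF_SECURITY_ADMIN_USER=admin
      - GF_SECURITY_ADMIN_PASSWORD=change-me   # placeholder, replace with your own secret
```

These values are applied when Grafana initializes its database for the first time, so an existing `/var/lib/grafana` volume keeps whatever password was set earlier.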
#### Step 2: Add Prometheus Data Source
1. **Configuration > Data Sources > Add data source**
2. **Select Prometheus**
3. **Set the URL to `http://prometheus:9090` and save**
#### Step 3: Create a New Dashboard
1. **Dashboard > New Dashboard > Add a New Panel**
#### Step 4: Add Panels with PromQL Queries
Here are the important metrics and their corresponding PromQL queries:
1. **CPU Usage**
- **Panel Title:** CPU Usage (%)
- **PromQL Query:**
```promql
100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
```
2. **Memory Usage**
- **Panel Title:** Memory Usage (%)
- **PromQL Query:**
```promql
(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100
```
3. **Disk Usage**
- **Panel Title:** Disk Usage (%)
- **PromQL Query:**
```promql
100 - ((node_filesystem_avail_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"} / node_filesystem_size_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"}) * 100)
```
4. **Network Traffic**
- **Panel Title:** Network Inbound Traffic (Bytes/s)
- **PromQL Query:**
```promql
rate(node_network_receive_bytes_total[5m])
```
- **Panel Title:** Network Outbound Traffic (Bytes/s)
- **PromQL Query:**
```promql
rate(node_network_transmit_bytes_total[5m])
```
5. **System Load**
- **Panel Title:** System Load (1m)
- **PromQL Query:**
```promql
node_load1
```
6. **Uptime**
- **Panel Title:** System Uptime (seconds)
- **PromQL Query:**
```promql
node_time_seconds - node_boot_time_seconds
```
7. **Temperature (if available)**
- **Panel Title:** CPU Temperature (°C)
- **PromQL Query:**
```promql
node_hwmon_temp_celsius
```
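If the same expressions back several panels, they can be precomputed as Prometheus recording rules so the dashboard queries a short series name. A sketch under assumptions: the rule names are hypothetical, and the file must be listed under `rule_files` in `prometheus.yml`, which the configurations in this document do not yet do.

```yaml
groups:
  - name: node_dashboard_rules
    rules:
      # CPU usage in percent per instance, same expression as the panel above
      - record: instance:node_cpu_usage:percent
        expr: 100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
      # Memory usage in percent, same expression as the panel above
      - record: instance:node_memory_usage:percent
        expr: (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100
```

A panel would then query `instance:node_cpu_usage:percent` directly.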
### Example Panel Configurations
#### CPU Usage Panel
1. **Add a new panel**.
2. **Set the title to "CPU Usage (%)"**.
3. **Enter the PromQL query**:
```promql
100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
```
4. **Set visualization type to "Graph"**.
5. **Customize the visualization settings (e.g., set y-axis unit to percentage)**.
#### Memory Usage Panel
1. **Add a new panel**.
2. **Set the title to "Memory Usage (%)"**.
3. **Enter the PromQL query**:
```promql
(node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100
```
4. **Set visualization type to "Graph"**.
5. **Customize the visualization settings (e.g., set y-axis unit to percentage)**.
#### Disk Usage Panel
1. **Add a new panel**.
2. **Set the title to "Disk Usage (%)"**.
3. **Enter the PromQL query**:
```promql
100 - ((node_filesystem_avail_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"} / node_filesystem_size_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"}) * 100)
```
4. **Set visualization type to "Graph"**.
5. **Customize the visualization settings (e.g., set y-axis unit to percentage)**.
#### Network Traffic Panels
**Inbound Traffic**
1. **Add a new panel**.
2. **Set the title to "Network Inbound Traffic (Bytes/s)"**.
3. **Enter the PromQL query**:
```promql
rate(node_network_receive_bytes_total[5m])
```
4. **Set visualization type to "Graph"**.
5. **Customize the visualization settings (e.g., set y-axis unit to bytes/sec)**.
**Outbound Traffic**
1. **Add a new panel**.
2. **Set the title to "Network Outbound Traffic (Bytes/s)"**.
3. **Enter the PromQL query**:
```promql
rate(node_network_transmit_bytes_total[5m])
```
4. **Set visualization type to "Graph"**.
5. **Customize the visualization settings (e.g., set y-axis unit to bytes/sec)**.
#### System Load Panel
1. **Add a new panel**.
2. **Set the title to "System Load (1m)"**.
3. **Enter the PromQL query**:
```promql
node_load1
```
4. **Set visualization type to "Graph"**.
5. **Customize the visualization settings (e.g., set y-axis unit to none)**.
#### Uptime Panel
1. **Add a new panel**.
2. **Set the title to "System Uptime (seconds)"**.
3. **Enter the PromQL query**:
```promql
node_time_seconds - node_boot_time_seconds
```
4. **Set visualization type to "Stat"**.
5. **Customize the visualization settings (e.g., set the unit to seconds so the value is displayed as a duration)**.
#### Temperature Panel (if available)
1. **Add a new panel**.
2. **Set the title to "CPU Temperature (°C)"**.
3. **Enter the PromQL query**:
```promql
node_hwmon_temp_celsius
```
4. **Set visualization type to "Graph"**.
5. **Customize the visualization settings (e.g., set y-axis unit to degrees Celsius)**.
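Once the panels are built, the dashboard can be exported as JSON and loaded automatically through a dashboard provider file. A minimal sketch, assuming hypothetical paths: the provider file lives in a directory mounted at `/etc/grafana/provisioning/dashboards`, and the exported JSON is placed in `/var/lib/grafana/dashboards` inside the container.

```yaml
# ./grafana/provisioning/dashboards/provider.yml (hypothetical path)
apiVersion: 1
providers:
  - name: default
    type: file
    disableDeletion: false
    updateIntervalSeconds: 30
    options:
      path: /var/lib/grafana/dashboards   # directory holding exported dashboard JSON
```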
### Summary
By setting up these panels in Grafana, you'll have a comprehensive dashboard displaying key metrics and KPIs for your Linux systems. This will provide valuable insights into the performance and health of your infrastructure.
If you have specific metrics or additional customizations you'd like to include, feel free to ask!

---
### Directory Structure
We'll organize the directories under `/volume1/docker/prometheus` as follows:
```plaintext
/volume1/docker/prometheus
├── alertmanager
│   └── config.yml
├── grafana
├── loki
│   └── local-config.yaml
├── prometheus
│   └── prometheus.yml
├── promtail
│   └── config.yml
└── docker-compose.yml
```
### Docker Compose File (docker-compose.yml)
Update the `docker-compose.yml` with absolute paths:
```yaml
---
version: '3.8'
networks:
  loki:
  monitoring:
    driver: bridge
services:
  prometheus:
    image: prom/prometheus
    container_name: prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    ports:
      - 9090:9090
    restart: unless-stopped
    volumes:
      - /volume1/docker/prometheus/prometheus:/etc/prometheus
      - prom_data:/prometheus
    networks:
      - monitoring
  alertmanager:
    image: prom/alertmanager
    container_name: alertmanager
    ports:
      - 9093:9093
    restart: unless-stopped
    volumes:
      - /volume1/docker/prometheus/alertmanager:/etc/alertmanager
    command:
      - '--config.file=/etc/alertmanager/config.yml'
    networks:
      - monitoring
    depends_on:
      - prometheus
  grafana:
    image: grafana/grafana
    container_name: grafana
    ports:
      - '3030:3000'
    restart: unless-stopped
    volumes:
      - /volume1/docker/prometheus/grafana:/var/lib/grafana
    networks:
      - monitoring
      - loki
    depends_on:
      - prometheus
  node_exporter:
    image: prom/node-exporter
    container_name: node_exporter
    ports:
      - 9100:9100
    restart: unless-stopped
    networks:
      - monitoring
    depends_on:
      - prometheus
  loki:
    image: grafana/loki:2.6.0
    container_name: loki
    ports:
      - "3100:3100"
    command: -config.file=/etc/loki/local-config.yaml
    volumes:
      - /volume1/docker/prometheus/loki:/etc/loki
      - loki_data:/loki
    networks:
      - loki
    # No depends_on here: promtail already depends on loki, and a mutual
    # dependency would form a cycle that Compose refuses to start.
  promtail:
    image: grafana/promtail:2.6.0
    container_name: promtail
    volumes:
      - /var/log:/var/log
      - /volume1/docker/prometheus/promtail:/etc/promtail
    command: -config.file=/etc/promtail/config.yml
    networks:
      - loki
    depends_on:
      - loki
volumes:
  prom_data:
  grafana-storage:
  loki_data:
```
### Prometheus Configuration File (prometheus.yml)
Create the `prometheus.yml` file in the `/volume1/docker/prometheus/prometheus` directory:
```yaml
global:
  scrape_interval: 15s
  scrape_timeout: 10s
  evaluation_interval: 15s
alerting:
  alertmanagers:
    - static_configs:
        - targets: ['alertmanager:9093']
scrape_configs:
  - job_name: 'prometheus'
    static_configs:
      - targets: ['localhost:9090']
  - job_name: 'node_exporter'
    static_configs:
      - targets: ['node_exporter:9100']
```
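The configuration above points Prometheus at Alertmanager but lists no `rule_files`, so no alert is ever evaluated. The following is a minimal sketch of a rules file, assuming a hypothetical file name `alert.rules.yml` placed next to `prometheus.yml` so that it appears inside the container under `/etc/prometheus`.

```yaml
# /volume1/docker/prometheus/prometheus/alert.rules.yml (hypothetical file name)
# Also requires this top-level entry in prometheus.yml:
#   rule_files:
#     - /etc/prometheus/alert.rules.yml
groups:
  - name: node_alerts
    rules:
      - alert: InstanceDown
        expr: up == 0          # the scrape of the target is failing
        for: 5m                # only fire after 5 minutes of continuous failure
        labels:
          severity: critical
        annotations:
          summary: "Instance {{ $labels.instance }} has been unreachable for 5 minutes"
```

After a Prometheus restart, loaded rules are listed at `http://localhost:9090/rules`.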
### Alertmanager Configuration File (config.yml)
Create the `config.yml` file in the `/volume1/docker/prometheus/alertmanager` directory:
```yaml
global:
  resolve_timeout: 5m
route:
  receiver: 'default'
receivers:
  - name: 'default'
```
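The `default` receiver above has no notification settings, so Alertmanager accepts alerts but delivers them nowhere. A sketch of the same file with a webhook target and basic grouping; the URL is a hypothetical placeholder and the timing values are illustrative, not recommendations from the original setup.

```yaml
global:
  resolve_timeout: 5m
route:
  receiver: 'default'
  group_by: ['alertname', 'instance']
  group_wait: 30s        # wait before sending the first notification for a group
  group_interval: 5m     # wait before notifying about new alerts in a group
  repeat_interval: 4h    # resend interval while an alert keeps firing
receivers:
  - name: 'default'
    webhook_configs:
      - url: 'http://example.internal:5001/alerts'   # hypothetical endpoint
```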
### Loki Configuration File (local-config.yaml)
Create the `local-config.yaml` file in the `/volume1/docker/prometheus/loki` directory:
```yaml
auth_enabled: false
server:
  http_listen_port: 3100
ingester:
  lifecycler:
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
  chunk_idle_period: 5m
  chunk_retain_period: 30s
  max_transfer_retries: 0
schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h
storage_config:
  boltdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/cache
    cache_ttl: 24h
    shared_store: filesystem
  filesystem:
    directory: /loki/chunks
limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h
chunk_store_config:
  max_look_back_period: 0s
table_manager:
  retention_deletes_enabled: false
  retention_period: 0s
```
### Promtail Configuration File (config.yml)
Create the `config.yml` file in the `/volume1/docker/prometheus/promtail` directory:
```yaml
server:
  http_listen_port: 9080
  grpc_listen_port: 0
positions:
  filename: /tmp/positions.yaml
clients:
  - url: http://loki:3100/loki/api/v1/push
scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          __path__: /var/log/*log
```
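The job above only tails files under `/var/log`. Promtail can also discover running containers and read their logs through Docker service discovery. This is a sketch under assumptions: the Docker socket would have to be mounted into the Promtail container (for example `/var/run/docker.sock:/var/run/docker.sock`), which the Compose file in this document does not do, and the job name `docker` is arbitrary.

```yaml
scrape_configs:
  - job_name: docker
    docker_sd_configs:
      - host: unix:///var/run/docker.sock
        refresh_interval: 5s
    relabel_configs:
      # Container names are reported with a leading slash; strip it
      - source_labels: ['__meta_docker_container_name']
        regex: '/(.*)'
        target_label: 'container'
```

This entry would sit alongside the `system` job in the same `scrape_configs` list.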
### Configuring Docker to Route Logs to Loki
1. **Install the Docker Loki plugin:**
```bash
docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions
```
2. **Configure Docker daemon to use Loki:**
Edit or create the Docker daemon configuration file (`dockerd.json`):
**Synology:**
```bash
sudo vi /var/packages/ContainerManager/etc/dockerd.json
```
Add the following content:
```json
{
  "log-driver": "loki",
  "log-opts": {
    "loki-url": "http://localhost:3100/loki/api/v1/push"
  }
}
```
3. **Restart Docker daemon:**
**Synology:**
```bash
sudo synopkgctl stop ContainerManager
sudo synopkgctl start ContainerManager
```
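Changing the daemon default affects every container created afterwards. If only some services should log to Loki, the driver can instead be selected per service in the Compose file. A minimal sketch, assuming the plugin was installed with the `loki` alias as in step 1:

```yaml
services:
  grafana:
    logging:
      driver: loki
      options:
        loki-url: "http://localhost:3100/loki/api/v1/push"
```

The URL is resolved by the Docker daemon on the host rather than inside the container, which is why it uses `localhost` and the published port 3100 instead of the `loki` service name.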
### Configuring Grafana to Query Logs
1. **Open Grafana:**
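Navigate to `http://192.168.1.67:3030` (substitute your NAS address; 3030 is the host port that the Compose file above maps to Grafana's port 3000).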
2. **Log in:**
Use the default credentials (admin/admin) and change the password upon first login.
3. **Add Loki as a Data Source:**
- Go to Configuration > Data Sources > Add data source.
- Choose Loki.
- Set the URL to `http://loki:3100`.
- Click Save & Test.
### Deploying the Stack
With all the configuration files in place, navigate to the `/volume1/docker/prometheus` directory and start the stack using Docker Compose:
```bash
cd /volume1/docker/prometheus
docker-compose up -d
```
### Conclusion
You now have a comprehensive observability stack including Prometheus, Grafana, Loki, and Promtail running on your Synology NAS. This setup provides monitoring, alerting, and log aggregation capabilities. If you need any more specific configurations or have any questions, feel free to ask!

---
Yes, using `---` at the beginning of a YAML document is a good practice as it explicitly denotes the start of a document. Here is the revised Docker Compose file with the inclusion of `---`:
### Docker Compose File (docker-compose.yml)
```yaml
---
version: '3.8'
networks:
  loki:
  monitoring:
    driver: bridge
services:
  prometheus:
    image: prom/prometheus
    container_name: prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    ports:
      - 9090:9090
    restart: unless-stopped
    volumes:
      - /volume1/docker/prometheus/prometheus:/etc/prometheus
      - /volume1/docker/prometheus/prom_data:/prometheus
    networks:
      - monitoring
  alertmanager:
    image: prom/alertmanager
    container_name: alertmanager
    ports:
      - 9093:9093
    restart: unless-stopped
    volumes:
      - /volume1/docker/prometheus/alertmanager:/etc/alertmanager
    command:
      - '--config.file=/etc/alertmanager/config.yml'
    networks:
      - monitoring
  grafana:
    image: grafana/grafana
    container_name: grafana
    ports:
      - '3000:3000'
    restart: unless-stopped
    volumes:
      - /volume1/docker/prometheus/grafana:/var/lib/grafana
    networks:
      - monitoring
      - loki
  node_exporter:
    image: prom/node-exporter
    container_name: node_exporter
    ports:
      - 9100:9100
    restart: unless-stopped
    networks:
      - monitoring
  loki:
    image: grafana/loki:2.6.0
    container_name: loki
    ports:
      - "3100:3100"
    command: -config.file=/etc/loki/local-config.yaml
    volumes:
      - /volume1/docker/prometheus/loki:/etc/loki
      - /volume1/docker/prometheus/loki_data:/loki
    networks:
      - loki
  promtail:
    image: grafana/promtail:2.6.0
    container_name: promtail
    volumes:
      - /var/log:/var/log
      - /volume1/docker/prometheus/promtail:/etc/promtail
    command: -config.file=/etc/promtail/config.yml
    networks:
      - loki
volumes:
  prom_data:
  grafana-storage:
  loki_data:
```
### Updated Directory Structure
Here is a reminder of the directory structure:
```plaintext
/volume1/docker/prometheus
├── alertmanager
│   └── config.yml
├── grafana
├── loki
│   └── local-config.yaml
├── prometheus
│   └── prometheus.yml
├── promtail
│   └── config.yml
└── docker-compose.yml
```
### Deploying the Stack
With all the configuration files in place, navigate to the `/volume1/docker/prometheus` directory and start the stack using Docker Compose:
```bash
cd /volume1/docker/prometheus
docker-compose up -d
```
### Configuring Docker to Route Logs to Loki
1. **Install the Docker Loki plugin:**
```bash
docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions
```
2. **Configure Docker daemon to use Loki:**
Edit or create the Docker daemon configuration file (`dockerd.json`):
**Synology:**
```bash
sudo nano /var/packages/Docker/etc/dockerd.json
```
Add the following content:
```json
{
  "log-driver": "loki",
  "log-opts": {
    "loki-url": "http://localhost:3100/loki/api/v1/push"
  }
}
```
3. **Restart Docker daemon:**
**Synology:**
```bash
sudo synopkgctl stop Docker
sudo synopkgctl start Docker
```
### Configuring Grafana to Query Logs
1. **Open Grafana:**
Navigate to `http://<your-synology-ip>:3000`.
2. **Log in:**
Use the default credentials (admin/admin) and change the password upon first login.
3. **Add Loki as a Data Source:**
- Go to Configuration > Data Sources > Add data source.
- Choose Loki.
- Set the URL to `http://loki:3100`.
- Click Save & Test.
---
### Adding Loki and Promtail to the Docker Compose Setup
Here's an updated `docker-compose.yml` file that includes Loki and Promtail:
```yaml
---
version: '3.8'
networks:
  loki:
  monitoring:
    driver: bridge
services:
  prometheus:
    image: prom/prometheus
    container_name: prometheus
    command:
      - '--config.file=/etc/prometheus/prometheus.yml'
    ports:
      - 9090:9090
    restart: unless-stopped
    volumes:
      - ./prometheus:/etc/prometheus
      - prom_data:/prometheus
    networks:
      - monitoring
  alertmanager:
    image: prom/alertmanager
    container_name: alertmanager
    ports:
      - 9093:9093
    restart: unless-stopped
    volumes:
      - ./alertmanager:/etc/alertmanager
    command:
      - '--config.file=/etc/alertmanager/config.yml'
    networks:
      - monitoring
  grafana:
    image: grafana/grafana
    container_name: grafana
    ports:
      - '3000:3000'
    restart: unless-stopped
    volumes:
      - grafana-storage:/var/lib/grafana
    networks:
      - monitoring
      - loki
  node_exporter:
    image: prom/node-exporter
    container_name: node_exporter
    ports:
      - 9100:9100
    restart: unless-stopped
    networks:
      - monitoring
  loki:
    image: grafana/loki:2.6.0
    container_name: loki
    ports:
      - "3100:3100"
    command: -config.file=/etc/loki/local-config.yaml
    volumes:
      - ./loki:/etc/loki
    networks:
      - loki
  promtail:
    image: grafana/promtail:2.6.0
    container_name: promtail
    volumes:
      - /var/log:/var/log
      - ./promtail:/etc/promtail
    command: -config.file=/etc/promtail/config.yml
    networks:
      - loki
volumes:
  prom_data:
  grafana-storage:
```
### Loki Configuration File
Create a `local-config.yaml` file in the `loki` directory with the following content:
```yaml
auth_enabled: false
server:
  http_listen_port: 3100
ingester:
  lifecycler:
    ring:
      kvstore:
        store: inmemory
      replication_factor: 1
  chunk_idle_period: 5m
  chunk_retain_period: 30s
  max_transfer_retries: 0
schema_config:
  configs:
    - from: 2020-10-24
      store: boltdb-shipper
      object_store: filesystem
      schema: v11
      index:
        prefix: index_
        period: 24h
storage_config:
  boltdb_shipper:
    active_index_directory: /loki/index
    cache_location: /loki/cache
    cache_ttl: 24h
    shared_store: filesystem
  filesystem:
    directory: /loki/chunks
limits_config:
  enforce_metric_name: false
  reject_old_samples: true
  reject_old_samples_max_age: 168h
chunk_store_config:
  max_look_back_period: 0s
table_manager:
  retention_deletes_enabled: false
  retention_period: 0s
```
### Promtail Configuration File
Create a `config.yml` file in the `promtail` directory with the following content:
```yaml
server:
  http_listen_port: 9080
  grpc_listen_port: 0
positions:
  filename: /tmp/positions.yaml
clients:
  - url: http://loki:3100/loki/api/v1/push
scrape_configs:
  - job_name: system
    static_configs:
      - targets:
          - localhost
        labels:
          job: varlogs
          __path__: /var/log/*log
```
### Configuring Docker to Route Logs to Loki
1. **Install the Docker Loki plugin:**
```bash
docker plugin install grafana/loki-docker-driver:latest --alias loki --grant-all-permissions
```
2. **Configure Docker daemon to use Loki:**
Edit or create the Docker daemon configuration file (`daemon.json` on standard Linux, `dockerd.json` on Synology):
**Linux default location:**
```bash
sudo nano /etc/docker/daemon.json
```
**Synology:**
```bash
sudo nano /var/packages/Docker/etc/dockerd.json
```
Add the following content:
```json
{
  "log-driver": "loki",
  "log-opts": {
    "loki-url": "http://localhost:3100/loki/api/v1/push"
  }
}
```
3. **Restart Docker daemon:**
**Linux:**
```bash
sudo systemctl restart docker
```
**Synology:**
```bash
sudo synopkgctl stop Docker
sudo synopkgctl start Docker
```
### Configuring Grafana to Query Logs
1. **Open Grafana:**
Navigate to `http://<your-synology-ip>:3000`.
2. **Log in:**
Use the default credentials (admin/admin) and change the password upon first login.
3. **Add Loki as a Data Source:**
- Go to Configuration > Data Sources > Add data source.
- Choose Loki.
- Set the URL to `http://loki:3100`.
- Click Save & Test.
### Deploying the Stack
With all the configuration files in place, navigate to the `/volume1/docker/prometheus` directory and start the stack using Docker Compose:
```bash
cd /volume1/docker/prometheus
docker-compose up -d
```