diff --git a/tech_docs/prometheus.md b/tech_docs/prometheus.md
index 8a1c8f1..54f7985 100644
--- a/tech_docs/prometheus.md
+++ b/tech_docs/prometheus.md
@@ -1,3 +1,294 @@
+To add Grafana to your setup, extend the `docker-compose.yml` file and configure Grafana to use Prometheus as a data source:
+
+### Step 1: Extend docker-compose.yml to Include Grafana
+
+Add the Grafana service to your `docker-compose.yml` file:
+
+```yaml
+version: '3.8'
+
+services:
+  prometheus:
+    image: prom/prometheus:latest
+    container_name: prometheus
+    ports:
+      - "9090:9090"
+    volumes:
+      - ./prometheus.yml:/etc/prometheus/prometheus.yml
+      - ./alert.rules:/etc/prometheus/alert.rules
+    networks:
+      - monitoring
+
+  node_exporter:
+    image: prom/node-exporter:latest
+    container_name: node_exporter
+    networks:
+      - monitoring
+
+  grafana:
+    image: grafana/grafana:latest
+    container_name: grafana
+    ports:
+      - "3000:3000"
+    networks:
+      - monitoring
+    volumes:
+      - grafana-storage:/var/lib/grafana  # named volume keeps dashboards and settings across restarts
+
+networks:
+  monitoring:
+    driver: bridge
+
+volumes:
+  grafana-storage:
+```
+
+### Step 2: Restart Docker Services
+
+Recreate the stack so that the new Grafana container is started:
+
+```bash
+docker-compose down
+docker-compose up -d
+```
+
+### Step 3: Configure Grafana
+
+1. **Access Grafana**:
+   Open `http://localhost:3000` in your web browser and log in with the default credentials `admin`/`admin`. Grafana prompts you to set a new password on first login.
+
+2. **Add Prometheus Data Source**:
+   - Go to **Configuration > Data Sources > Add data source** (in recent Grafana versions: **Connections > Data sources**).
+   - Select **Prometheus**.
+   - Set the URL to `http://prometheus:9090` (the Compose service name resolves on the shared `monitoring` network) and click **Save & test**.
+
+3. **Create a Dashboard**:
+   - Create a new dashboard.
+   - Add panels to visualize Node Exporter metrics such as CPU, memory, and disk usage.
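+
+As an alternative to the manual steps above, the data source can be provisioned from a file. The following is a minimal sketch, not part of the original setup: the host path `./grafana/datasources.yml` is a hypothetical name, and it assumes the file is mounted into the Grafana container under `/etc/grafana/provisioning/datasources/`, where the official image looks for data source definitions on startup.
+
+```yaml
+# Hypothetical file: ./grafana/datasources.yml (name and location are assumptions)
+apiVersion: 1
+
+datasources:
+  - name: Prometheus
+    type: prometheus
+    access: proxy                 # Grafana's backend performs the queries
+    url: http://prometheus:9090   # Compose service name on the `monitoring` network
+    isDefault: true
+```
+
+To try it, add a line such as `- ./grafana/datasources.yml:/etc/grafana/provisioning/datasources/datasources.yml` under the `grafana` service's `volumes` and recreate the container.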
+
+### Example PromQL Queries for Grafana Panels
+
+- **CPU Usage**:
+  ```promql
+  100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
+  ```
+
+- **Memory Usage**:
+  ```promql
+  (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100
+  ```
+
+- **Disk Usage**:
+  ```promql
+  100 - ((node_filesystem_avail_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"} / node_filesystem_size_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"}) * 100)
+  ```
+
+### Step 4: Verify the Setup
+
+1. **Check Grafana Dashboard**:
+   Open Grafana at `http://localhost:3000` and verify that the panels show data from your Linux systems.
+
+2. **Check Prometheus Targets**:
+   Open Prometheus at `http://localhost:9090/targets` and confirm that every target is listed as `UP`.
+
+### Summary
+
+By adding Grafana to your Docker Compose setup and configuring Prometheus as its data source, you can build dashboards that visualize the Node Exporter metrics from your Linux systems.
+
+---
+
+### Key Metrics and KPIs to Monitor
+
+1. **CPU Usage**
+2. **Memory Usage**
+3. **Disk Usage**
+4. **Network Traffic**
+5. **System Load**
+6. **Uptime**
+7. **Temperature (if available)**
+
+### Step-by-Step Guide to Create a Grafana Dashboard
+
+#### Step 1: Access Grafana
+
+Open `http://localhost:3000` in your web browser and log in (Grafana default credentials: `admin`/`admin`).
+
+#### Step 2: Add Prometheus Data Source
+
+1. **Configuration > Data Sources > Add data source**
+2. **Select Prometheus**
+3. **Set the URL to `http://prometheus:9090` and save**
+
+#### Step 3: Create a New Dashboard
+
+1. **Dashboard > New Dashboard > Add a New Panel**
+
+#### Step 4: Add Panels with PromQL Queries
+
+The key metrics and their corresponding PromQL queries are:
+
+1. **CPU Usage**
+
+   - **Panel Title:** CPU Usage (%)
+   - **PromQL Query:**
+     ```promql
+     100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
+     ```
+
+2. **Memory Usage**
+
+   - **Panel Title:** Memory Usage (%)
+   - **PromQL Query:**
+     ```promql
+     (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100
+     ```
+
+3. **Disk Usage**
+
+   - **Panel Title:** Disk Usage (%)
+   - **PromQL Query:**
+     ```promql
+     100 - ((node_filesystem_avail_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"} / node_filesystem_size_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"}) * 100)
+     ```
+
+4. **Network Traffic**
+
+   - **Panel Title:** Network Inbound Traffic (Bytes/s)
+   - **PromQL Query:**
+     ```promql
+     rate(node_network_receive_bytes_total[5m])
+     ```
+
+   - **Panel Title:** Network Outbound Traffic (Bytes/s)
+   - **PromQL Query:**
+     ```promql
+     rate(node_network_transmit_bytes_total[5m])
+     ```
+
+5. **System Load**
+
+   - **Panel Title:** System Load (1m)
+   - **PromQL Query:**
+     ```promql
+     node_load1
+     ```
+
+6. **Uptime**
+
+   - **Panel Title:** System Uptime (seconds)
+   - **PromQL Query:**
+     ```promql
+     node_time_seconds - node_boot_time_seconds
+     ```
+
+7. **Temperature (if available)**
+
+   - **Panel Title:** CPU Temperature (°C)
+   - **PromQL Query:**
+     ```promql
+     node_hwmon_temp_celsius
+     ```
+
+### Example Panel Configurations
+
+#### CPU Usage Panel
+
+1. **Add a new panel**.
+2. **Set the title to "CPU Usage (%)"**.
+3. **Enter the PromQL query**:
+   ```promql
+   100 - (avg by (instance) (irate(node_cpu_seconds_total{mode="idle"}[5m])) * 100)
+   ```
+4. **Set visualization type to "Time series"** (called "Graph" in Grafana 7 and earlier).
+5. **Customize the visualization settings (e.g., set the unit to Percent (0-100))**.
+
+#### Memory Usage Panel
+
+1. **Add a new panel**.
+2. **Set the title to "Memory Usage (%)"**.
+3. **Enter the PromQL query**:
+   ```promql
+   (node_memory_MemTotal_bytes - node_memory_MemAvailable_bytes) / node_memory_MemTotal_bytes * 100
+   ```
+4. **Set visualization type to "Time series"**.
+5. **Customize the visualization settings (e.g., set the unit to Percent (0-100))**.
+
+#### Disk Usage Panel
+
+1. **Add a new panel**.
+2. **Set the title to "Disk Usage (%)"**.
+3. **Enter the PromQL query**:
+   ```promql
+   100 - ((node_filesystem_avail_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"} / node_filesystem_size_bytes{fstype!~"tmpfs|fuse.lxcfs|rootfs"}) * 100)
+   ```
+4. **Set visualization type to "Time series"**.
+5. **Customize the visualization settings (e.g., set the unit to Percent (0-100))**.
+
+#### Network Traffic Panels
+
+**Inbound Traffic**
+
+1. **Add a new panel**.
+2. **Set the title to "Network Inbound Traffic (Bytes/s)"**.
+3. **Enter the PromQL query**:
+   ```promql
+   rate(node_network_receive_bytes_total[5m])
+   ```
+4. **Set visualization type to "Time series"**.
+5. **Customize the visualization settings (e.g., set the unit to bytes/sec)**.
+
+**Outbound Traffic**
+
+1. **Add a new panel**.
+2. **Set the title to "Network Outbound Traffic (Bytes/s)"**.
+3. **Enter the PromQL query**:
+   ```promql
+   rate(node_network_transmit_bytes_total[5m])
+   ```
+4. **Set visualization type to "Time series"**.
+5. **Customize the visualization settings (e.g., set the unit to bytes/sec)**.
+
+#### System Load Panel
+
+1. **Add a new panel**.
+2. **Set the title to "System Load (1m)"**.
+3. **Enter the PromQL query**:
+   ```promql
+   node_load1
+   ```
+4. **Set visualization type to "Time series"**.
+5. **Customize the visualization settings (e.g., leave the unit as none)**.
+
+#### Uptime Panel
+
+1. **Add a new panel**.
+2. **Set the title to "System Uptime (seconds)"**.
+3. **Enter the PromQL query**:
+   ```promql
+   node_time_seconds - node_boot_time_seconds
+   ```
+4. **Set visualization type to "Stat"**.
+5. **Customize the visualization settings (e.g., set the unit to seconds)**.
+
+#### Temperature Panel (if available)
+
+1. **Add a new panel**.
+2. **Set the title to "CPU Temperature (°C)"**.
+3. **Enter the PromQL query**:
+   ```promql
+   node_hwmon_temp_celsius
+   ```
+4. **Set visualization type to "Time series"**.
+5. **Customize the visualization settings (e.g., set the unit to Celsius (°C))**.
+
+### Summary
+
+With these panels in place, a single dashboard shows CPU, memory, disk, network, load, uptime, and (where sensors exist) temperature for your Linux systems.
+
+Other Node Exporter metrics can be added the same way: one panel per PromQL query.
+
+---
+
 ### Directory Structure
 
 We'll organize the directories under `/volume1/docker/prometheus` as follows: