diff --git a/tech_docs/linux_motd.md b/tech_docs/linux_motd.md new file mode 100644 index 0000000..41edfd2 --- /dev/null +++ b/tech_docs/linux_motd.md @@ -0,0 +1,131 @@
Excellent! Let's fine-tune and expand on the system resource monitor idea, as it provides a great opportunity to learn about Linux system operations, bash scripting, and Python data processing. We'll create a more comprehensive script that monitors system resources over time and turns the raw numbers into useful insights.

Here's an enhanced version of the system resource monitor:

```bash
#!/bin/bash

# Function to collect one sample of system data as a CSV row
collect_data() {
    # user + system CPU time from top's %Cpu(s) summary line
    cpu_usage=$(top -bn1 | grep "Cpu(s)" | awk '{print $2 + $4}')
    # used memory as a percentage of total memory
    mem_usage=$(free | grep Mem | awk '{print $3/$2 * 100.0}')
    # Use% of the root filesystem, with the % sign stripped
    disk_usage=$(df -h / | awk '/\// {print $(NF-1)}' | sed 's/%//')
    # %iowait from the second (current-interval) iostat report
    io_wait=$(iostat -c 1 2 | awk '/avg-cpu/ {getline; print $4}' | tail -n 1)
    # 1-minute load average
    load_avg=$(uptime | awk -F'load average:' '{print $2}' | cut -d, -f1 | tr -d ' ')

    echo "$cpu_usage,$mem_usage,$disk_usage,$io_wait,$load_avg,$(date +%s)"
}

# Start with a fresh data file, then collect data every 5 seconds for 1 minute
: > system_data.csv
for i in {1..12}; do
    collect_data >> system_data.csv
    sleep 5
done

# Process data with Python (quoted delimiter so bash leaves the Python code untouched)
python3 - << 'EOF'
import pandas as pd
import matplotlib.pyplot as plt

# Read the CSV data
df = pd.read_csv('system_data.csv', header=None,
                 names=['CPU', 'Memory', 'Disk', 'IO_Wait', 'Load_Avg', 'Timestamp'])

# Convert timestamp to datetime
df['Timestamp'] = pd.to_datetime(df['Timestamp'], unit='s')

# Numeric metric columns (Timestamp is a datetime, so keep it out of the statistics)
metrics = ['CPU', 'Memory', 'Disk', 'IO_Wait', 'Load_Avg']

# Calculate averages
averages = df[metrics].mean()

print("System Resource Usage Summary:")
print(f"Average CPU Usage: {averages['CPU']:.2f}%")
print(f"Average Memory Usage: {averages['Memory']:.2f}%")
print(f"Average Disk Usage: {averages['Disk']:.2f}%")
print(f"Average I/O Wait: {averages['IO_Wait']:.2f}%")
print(f"Average Load Average: {averages['Load_Avg']:.2f}")

# Check for high usage
high_usage = df[(df['CPU'] > 80) | (df['Memory'] > 80) | (df['Disk'] > 80)]
if not high_usage.empty:
    print("\nHigh Usage Detected:")
    print(high_usage)

# Plot the data (load average is a count rather than a percentage,
# but it is still useful to see alongside the other metrics)
plt.figure(figsize=(12, 8))
plt.plot(df['Timestamp'], df['CPU'], label='CPU')
plt.plot(df['Timestamp'], df['Memory'], label='Memory')
plt.plot(df['Timestamp'], df['Disk'], label='Disk')
plt.plot(df['Timestamp'], df['IO_Wait'], label='I/O Wait')
plt.plot(df['Timestamp'], df['Load_Avg'], label='Load Average')
plt.title('System Resource Usage Over Time')
plt.xlabel('Time')
plt.ylabel('Usage (%)')
plt.legend()
plt.grid(True)
plt.savefig('system_usage.png')
plt.close()

print("\nPlot saved as system_usage.png")

# Identify the most variable metric
variances = df[metrics].var()
most_variable = variances.idxmax()
print(f"\nMost variable metric: {most_variable}")
EOF

# Clean up
rm -f system_data.csv
```

This enhanced script does the following:

1. Collects more system metrics: CPU usage, memory usage, disk usage, I/O wait time, and load average.
2. Samples data every 5 seconds for a minute, storing it in a CSV file.
3. Uses Python with pandas for data analysis:
   - Calculates average usage for each metric
   - Detects periods of high usage
   - Creates a time series plot of all metrics
   - Identifies the most variable metric

To run this script:

1. Save it as `system_monitor.sh`
2. Make it executable: `chmod +x system_monitor.sh`
3. Run it: `./system_monitor.sh`
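One note before the first run: `top`, `free`, `df`, and `uptime` ship with essentially every Linux distribution, but `iostat` usually comes from the separate sysstat package, and the analysis step needs pandas and matplotlib. Here's a quick pre-flight check you could run first (the install commands are assumptions for a Debian/Ubuntu-style system; adjust for your distro):

```bash
# Pre-flight check for system_monitor.sh dependencies
# (install commands are assumptions for Debian/Ubuntu; adjust for your distro)
command -v iostat >/dev/null 2>&1 || echo "iostat not found - try: sudo apt install sysstat"
python3 -c "import pandas, matplotlib" 2>/dev/null || echo "missing Python modules - try: pip install pandas matplotlib"
```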
This script provides valuable insights into Linux system operations:

1. **Command-line tools**:
   - `top`: for CPU usage
   - `free`: for memory statistics
   - `df`: for disk usage
   - `iostat`: for I/O statistics
   - `uptime`: for load average

2. **Bash scripting**:
   - Defining and calling functions
   - Looping and timing operations
   - Text processing with `awk`, `sed`, and `cut`

3. **Python data analysis**:
   - Using pandas for data manipulation
   - Using matplotlib for data visualization
   - Basic time series analysis

4. **System metrics understanding**:
   - CPU usage patterns
   - Memory utilization
   - Disk space monitoring
   - I/O wait time implications
   - Load average interpretation

To further expand your learning:

1. Try running this script under different system loads (e.g., while running a big compilation job).
2. Modify the script to monitor specific processes instead of the whole system.
3. Extend the Python analysis to provide recommendations based on the observed patterns.
4. Add network usage monitoring using tools like `netstat` or `ss` (a rough starting sketch follows at the end of this message).

Would you like me to explain any part of this script in more detail or suggest ways to extend it further?
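In the meantime, here's a rough starting point for idea 4, network usage monitoring with `ss`. It's only a sketch that reuses the collect-to-CSV pattern from the main script and simply counts established and listening TCP sockets; you may well want different metrics (per-interface byte counters, UDP sockets, and so on):

```bash
#!/bin/bash

# Sketch for idea 4: sample TCP connection counts with ss,
# using the same CSV-per-interval pattern as system_monitor.sh
collect_net_data() {
    established=$(ss -tan state established | tail -n +2 | wc -l)  # active TCP connections
    listening=$(ss -tln | tail -n +2 | wc -l)                      # listening TCP sockets
    echo "$established,$listening,$(date +%s)"
}

: > network_data.csv
for i in {1..12}; do
    collect_net_data >> network_data.csv
    sleep 5
done
```

The resulting `network_data.csv` can then be fed into the same pandas workflow to plot connection counts over time.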