diff --git a/tech_docs/linux/grep.md b/tech_docs/linux/grep.md
new file mode 100644
index 0000000..b8d638a
--- /dev/null
+++ b/tech_docs/linux/grep.md
@@ -0,0 +1,234 @@
+# **The Grep Mastery Guide: From Basic Search to Advanced Pattern Matching**
+
+## **Table of Contents**
+1. [Grep Fundamentals](#1-grep-fundamentals)
+2. [Pattern Matching](#2-pattern-matching)
+3. [Context Control](#3-context-control)
+4. [File Operations](#4-file-operations)
+5. [Output Control](#5-output-control)
+6. [Performance Optimization](#6-performance-optimization)
+7. [Advanced Techniques](#7-advanced-techniques)
+8. [Real-World Recipes](#8-real-world-recipes)
+9. [Alternatives & Complements](#9-alternatives--complements)
+
+---
+
+## **1. Grep Fundamentals**
+
+### **Core Syntax**
+```bash
+grep [OPTIONS] PATTERN [FILE...]
+```
+
+### **Essential Modes**
+```bash
+grep "text" file.txt        # Basic search (basic regex, BRE)
+grep -E "regex" file.txt    # Extended regex (ERE)
+grep -F "string" file.txt   # Fixed string (no regex)
+grep -P "pattern" file.txt  # Perl regex (GNU only)
+```
+
+### **Common Options**
+```bash
+-i  # Case insensitive
+-v  # Invert match
+-w  # Whole words only
+-n  # Show line numbers
+-c  # Count matching lines
+-l  # Show names of matching files only
+```
+
+---
+
+## **2. Pattern Matching**
+
+### **Regex Basics**
+```bash
+grep "^start" file.txt  # Lines starting with "start"
+grep "end$" file.txt    # Lines ending with "end"
+grep "a.b" file.txt     # Any single char between a and b
+grep "a.*b" file.txt    # Anything (or nothing) between a and b
+```
+
+### **Character Classes**
+```bash
+grep "[[:digit:]]" file.txt  # Any digit
+grep "[A-Za-z]" file.txt     # Any ASCII letter
+grep "[^0-9]" file.txt       # Any non-digit character
+```
+
+### **Quantifiers**
+```bash
+grep -E "a{3}" file.txt    # Exactly 3 a's in a row
+grep -E "a{2,4}" file.txt  # 2 to 4 a's
+grep -E "a+" file.txt      # One or more
+grep -E "a*" file.txt      # Zero or more (matches every line)
+```
+
+---
+
+## **3. Context Control**
+
+### **Show Surrounding Lines**
+```bash
+grep -A 2 "error" log.txt     # 2 lines After
+grep -B 3 "warning" log.txt   # 3 lines Before
+grep -C 1 "critical" log.txt  # 1 line of Context on each side
+```
+
+### **Match Grouping**
+```bash
+grep -E "(error|warning)" file.txt  # OR condition
+grep -E "(fatal).*\1" file.txt      # Backreference (GNU extension in ERE)
+```
+
+---
+
+## **4. File Operations**
+
+### **Recursive Search**
+```bash
+grep -r "function" /src/           # Search a directory tree
+grep -R --include="*.py" "def" .   # Python files only (-R also follows symlinks)
+```
+
+### **Binary Files**
+```bash
+grep -a "text" binary.file  # Treat binary as text
+grep -I "pattern" *         # Skip binary files
+```
+
+### **Multiple Files**
+```bash
+grep "pattern" *.log        # All log files
+grep "pattern" file1 file2  # Specific files
+```
+
+---
+
+## **5. Output Control**
+
+### **Formatting**
+```bash
+grep -H "text" file.txt     # Show filename
+grep -h "text" *.log        # Hide filename
+grep -o "pattern" file.txt  # Only the matching part
+```
+
+### **Color Highlighting**
+```bash
+grep --color=auto "error" log.txt
+export GREP_COLORS='mt=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36'
+```
+
+### **Null-Terminated Output**
+```bash
+grep -l -Z "pattern" *.txt | xargs -0 rm  # NUL-separated names survive spaces and newlines
+```
+
+---
+
+## **6. Performance Optimization**
+
+### **Fast Searches**
+```bash
+grep -F "static string" largefile.txt  # Fixed strings skip regex parsing
+LC_ALL=C grep "pattern" file.txt       # Byte-wise matching, often faster
+```
+
+### **Parallel Grep**
+```bash
+find . -type f -print0 | parallel -0 -j+0 grep -H "pattern" {}
+```
+
+### **Exclude Directories**
+```bash
+grep -r --exclude-dir={node_modules,.git} "function" .
+```
+
+---
+
+## **7. Advanced Techniques**
+
+### **Multiple Patterns**
+```bash
+grep -e "error" -e "warning" log.txt  # Multiple patterns
+grep -f patterns.txt file.txt         # Patterns from file
+```
+
+### **Inverse Matching**
+```bash
+grep -v "success" log.txt | grep "error"  # Lines with "error" but not "success"
+```
+
+### **Process Substitution**
+```bash
+diff <(grep "A" file1) <(grep "B" file2)
+```
+
+---
+
+## **8. Real-World Recipes**
+
+### **Log Analysis**
+```bash
+# Top 10 most frequent error messages
+grep -o "ERROR: .*" app.log | sort | uniq -c | sort -nr | head
+```
+
+### **Code Refactoring**
+```bash
+# Find every call that needs updating
+grep -rnw '/src/' -e 'deprecated_function'
+```
+
+### **CSV Processing**
+```bash
+# Extract column 3 where column 1 matches
+awk -F, '$1 ~ /pattern/{print $3}' data.csv | grep "value"
+```
+
+### **Network Diagnostics**
+```bash
+# Find all unique IPs hitting error pages
+grep " 404 " access.log | grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" | sort -u
+```
+
+---
+
+## **9. Alternatives & Complements**
+
+### **When Not to Use Grep**
+| Task | Better Tool |
+|------|------------|
+| JSON processing | `jq` |
+| Columnar data | `awk` |
+| Complex regex | `perl`/`rg` |
+| File content replacement | `sed` |
+
+### **Enhanced Grep Variants**
+```bash
+rg (ripgrep)          # Faster, .gitignore aware
+ack                   # Perl-powered, for coders
+ag (silver searcher)  # Similar to rg
+```
+
+---
+
+## **Pro Tips**
+- **Always quote patterns** to prevent shell expansion
+- **Combine with `xargs`** for bulk operations:
+  ```bash
+  grep -lZ "old_version" *.js | xargs -0 sed -i 's/old_version/new_version/g'
+  ```
+- **Use `--`** to indicate end of options:
+  ```bash
+  grep -- "-v" file.txt # Search for literal "-v"
+  ```
+
+## **Further Learning**
+- `man grep` - Official documentation
+- [regex101.com](https://regex101.com) - Interactive regex tester
+- [ripgrep](https://github.com/BurntSushi/ripgrep) - Modern grep alternative
+
+**Need a specific grep solution?** Describe your search challenge!
\ No newline at end of file
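The guide's points about quoting patterns, fixed-string mode, and `--` can be checked in a throwaway sandbox. The sketch below is illustrative only: the temporary directory, `notes.txt`, its contents, and the variable names are assumptions made up for this example and do not appear in the guide.

```bash
#!/usr/bin/env bash
# Hypothetical sandbox: file name and contents are invented for illustration.
tmpdir=$(mktemp -d)
printf '%s\n' 'version 1.2' 'version 1x2' 'use -v to invert' > "$tmpdir/notes.txt"

# In a regex the dot matches any character, so "1.2" and "1x2" both match.
regex_hits=$(grep -c '1.2' "$tmpdir/notes.txt")

# -F treats the pattern as a literal string, so only "1.2" matches.
fixed_hits=$(grep -c -F '1.2' "$tmpdir/notes.txt")

# "--" ends option parsing, so "-v" is searched for as text, not read as a flag.
dash_hits=$(grep -c -- '-v' "$tmpdir/notes.txt")

echo "regex=$regex_hits fixed=$fixed_hits dash=$dash_hits"

# Clean up the sandbox.
rm -r "$tmpdir"
```

Single quotes keep the shell from touching the pattern. The gap between the regex count and the fixed-string count shows why `-F` is the safer choice when the search text contains dots or other metacharacters.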