# **The Grep Mastery Guide: From Basic Search to Advanced Pattern Matching**
## **Table of Contents**
1. [Grep Fundamentals](#1-grep-fundamentals)
2. [Pattern Matching](#2-pattern-matching)
3. [Context Control](#3-context-control)
4. [File Operations](#4-file-operations)
5. [Output Control](#5-output-control)
6. [Performance Optimization](#6-performance-optimization)
7. [Advanced Techniques](#7-advanced-techniques)
8. [Real-World Recipes](#8-real-world-recipes)
9. [Alternatives & Complements](#9-alternatives--complements)
---
## **1. Grep Fundamentals**
### **Core Syntax**
```bash
grep [OPTIONS] PATTERN [FILE...]
```
### **Essential Modes**
```bash
grep "text" file.txt # Basic search
grep -E "regex" file.txt # Extended regex (ERE)
grep -F "string" file.txt # Fixed string (no regex)
grep -P "pattern" file.txt # Perl regex (GNU only)
```
### **Common Options**
```bash
-i # Case insensitive
-v # Invert match
-w # Whole words only
-n # Show line numbers
-c # Count matching lines (not individual matches)
-l # Print only the names of files with a match
```
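The difference between counting lines and counting matches is easy to miss. The sketch below uses a made-up `app.log` in a temporary directory (both are illustrative assumptions) to compare `-c` with `-o` piped to `wc -l`.

```bash
# Hypothetical sample data: line 1 contains "error" twice, line 3 has "Error" capitalized.
tmp=$(mktemp -d)
printf '%s\n' 'error: disk error' 'all good' 'Error: timeout' > "$tmp/app.log"

lines=$(grep -c 'error' "$tmp/app.log")                        # matching lines
hits=$(grep -o 'error' "$tmp/app.log" | wc -l | tr -d ' ')     # individual matches
lines_ci=$(grep -ci 'error' "$tmp/app.log")                    # matching lines, ignoring case

echo "lines=$lines hits=$hits lines_ci=$lines_ci"   # prints: lines=1 hits=2 lines_ci=2
rm -rf "$tmp"
```

`-o` is not part of POSIX, but GNU and BSD grep both provide it.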
---
## **2. Pattern Matching**
### **Regex Basics**
```bash
grep "^start" file.txt # Lines starting with
grep "end$" file.txt # Lines ending with
grep "a.b" file.txt # a, any single character, then b
grep "a.*b" file.txt # a, then anything (or nothing), then b
```
### **Character Classes**
```bash
grep "[[:digit:]]" file.txt # Any digit
grep "[A-Za-z]" file.txt # Any letter
grep "[^0-9]" file.txt # Lines containing at least one non-digit character
```
### **Quantifiers**
```bash
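grep -E "a{3}" file.txt # Lines with at least 3 consecutive a's (unanchored)
grep -E "a{2,4}" file.txt # Lines with a run of 2 to 4 a's (longer runs also match)
grep -E "a+" file.txt # Lines with one or more a's
grep -E "a*" file.txt # Zero or more a's: matches every line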
```
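Quantifiers are unanchored by default, so `a{3}` also matches inside a longer run. A small sketch with a hypothetical `runs.txt` shows the effect; `-x` (match the whole line) is a standard grep flag that is not otherwise covered in this guide.

```bash
# Hypothetical sample data: runs of a's with different lengths.
tmp=$(mktemp -d)
printf '%s\n' 'aa' 'aaa' 'aaaa' 'b' > "$tmp/runs.txt"

unanchored=$(grep -cE 'a{3}' "$tmp/runs.txt")    # 'aaa' and 'aaaa' both contain three a's
whole_line=$(grep -cxE 'a{3}' "$tmp/runs.txt")   # -x: the whole line must be exactly 'aaa'
star=$(grep -cE 'a*' "$tmp/runs.txt")            # a* can match the empty string, so every line counts

echo "unanchored=$unanchored whole_line=$whole_line star=$star"   # prints: unanchored=2 whole_line=1 star=4
rm -rf "$tmp"
```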
---
## **3. Context Control**
### **Show Surrounding Lines**
```bash
grep -A 2 "error" log.txt # 2 lines After
grep -B 3 "warning" log.txt # 3 lines Before
grep -C 1 "critical" log.txt # 1 line Context
```
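To see what the context flags actually print, here is a sketch on a five-line, made-up `log.txt`. With `-n`, matching lines are marked with `:` after the line number and context lines with `-`.

```bash
# Hypothetical sample log with a single error in the middle.
tmp=$(mktemp -d)
printf '%s\n' 'start' 'loading' 'error: boom' 'retry' 'done' > "$tmp/log.txt"

after=$(grep -A 1 'error' "$tmp/log.txt")       # the match plus 1 line after it
before=$(grep -B 1 'error' "$tmp/log.txt")      # 1 line before it plus the match
around=$(grep -n -C 1 'error' "$tmp/log.txt")   # 1 line on each side, with line numbers

printf '%s\n' "$around"
# 2-loading
# 3:error: boom
# 4-retry
rm -rf "$tmp"
```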
### **Match Grouping**
```bash
grep -E "(error|warning)" file.txt # OR condition
grep -E "(fatal).*\1" file.txt # Backreference: "fatal" twice on one line (GNU extension in ERE)
```
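The following sketch contrasts alternation with a backreference on a made-up `events.txt`. It assumes GNU grep, since backreferences under `-E` are an extension rather than part of POSIX ERE.

```bash
# Hypothetical sample data.
tmp=$(mktemp -d)
printf '%s\n' 'fatal: retry after fatal' 'fatal: once' 'warning: low disk' 'error: x' > "$tmp/events.txt"

alt=$(grep -cE '(error|warning)' "$tmp/events.txt")    # lines containing either word
backref=$(grep -E '(fatal).*\1' "$tmp/events.txt")     # lines where "fatal" appears twice

echo "alt=$alt"             # prints: alt=2
echo "backref=$backref"     # prints: backref=fatal: retry after fatal
rm -rf "$tmp"
```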
---
## **4. File Operations**
### **Recursive Search**
```bash
grep -r "function" /src/ # Recurse into directories
grep -R --include="*.py" "def" . # Python files only; -R also follows symlinks
```
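A quick way to check that `--include` filters as expected is to run it against a throwaway tree. The directory layout and file names below are invented for the example.

```bash
# Hypothetical project tree: two Python files and one text file, all containing "def".
tmp=$(mktemp -d)
mkdir -p "$tmp/src/pkg"
printf 'def main():\n    pass\n'   > "$tmp/src/app.py"
printf 'def helper():\n    pass\n' > "$tmp/src/pkg/util.py"
printf 'def not_python\n'          > "$tmp/src/notes.txt"

# -l lists file names only; notes.txt is skipped because it does not match *.py.
py_files=$(cd "$tmp/src" && grep -rl --include='*.py' 'def' . | sort)

printf '%s\n' "$py_files"
# ./app.py
# ./pkg/util.py
rm -rf "$tmp"
```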
### **Binary Files**
```bash
grep -a "text" binary.file # Treat binary as text
grep -I "pattern" * # Skip binary files
```
### **Multiple Files**
```bash
grep "pattern" *.log # All log files
grep "pattern" file1 file2 # Specific files
```
---
## **5. Output Control**
### **Formatting**
```bash
grep -H "text" file.txt # Show filename
grep -h "text" *.log # Hide filename
grep -o "pattern" file.txt # Only matching part
```
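The sketch below shows how these flags change the output, using two small made-up log files. By default grep prefixes the file name only when it is given more than one file.

```bash
# Hypothetical sample data in two files.
tmp=$(mktemp -d)
printf '%s\n' 'id=42 user=alice' 'id=7 user=bob' > "$tmp/a.log"
printf '%s\n' 'id=9 user=bob'                    > "$tmp/b.log"

ids=$(grep -oE 'id=[0-9]+' "$tmp/a.log")                 # only the matched text, one match per line
with_name=$(cd "$tmp" && grep -H 'bob' a.log)            # force the file name prefix for a single file
without_name=$(cd "$tmp" && grep -h 'bob' a.log b.log)   # suppress the prefix for multiple files

printf '%s\n' "$with_name"      # prints: a.log:id=7 user=bob
rm -rf "$tmp"
```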
### **Color Highlighting**
```bash
grep --color=auto "error" log.txt
export GREP_COLORS='mt=01;31:sl=:cx=:fn=35:ln=32:bn=32:se=36'
```
### **Null-Terminated Output**
```bash
grep -l -Z "pattern" *.txt | xargs -0 rm -- # NUL-separated names: safe for spaces and newlines
```
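Before pointing a pipeline like this at `rm`, it is worth a dry run with a harmless command. The sketch below assumes GNU grep and uses `--null`, the long spelling of `-Z`. The file names are made up, and one of them contains a space on purpose.

```bash
# Hypothetical files; "my notes.txt" would be split in two by a plain `xargs`.
tmp=$(mktemp -d)
printf 'pattern here\n' > "$tmp/my notes.txt"
printf 'pattern too\n'  > "$tmp/plain.txt"
printf 'nothing\n'      > "$tmp/other.txt"

# Dry run: list what xargs receives instead of deleting it.
matched=$(cd "$tmp" && grep -l --null 'pattern' *.txt | xargs -0 ls | wc -l | tr -d ' ')

echo "matched=$matched"   # prints: matched=2
rm -rf "$tmp"
```

Once the listing looks right, replacing `ls` with `rm --` gives the destructive version.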
---
## **6. Performance Optimization**
### **Fast Searches**
```bash
grep -F "static string" largefile.txt # Fixed string faster
LC_ALL=C grep "pattern" file.txt # Byte-wise matching: faster, but not Unicode-aware
```
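Besides speed, `-F` changes what matches, because regex metacharacters lose their special meaning. A minimal sketch on a hypothetical `prices.txt`:

```bash
# Hypothetical sample data: only the first line contains a literal "3.50".
tmp=$(mktemp -d)
printf '%s\n' 'price: 3.50' 'price: 3x50' > "$tmp/prices.txt"

regex_hits=$(grep -c '3.50' "$tmp/prices.txt")    # '.' matches any character, so both lines match
fixed_hits=$(grep -cF '3.50' "$tmp/prices.txt")   # -F treats '.' literally

echo "regex_hits=$regex_hits fixed_hits=$fixed_hits"   # prints: regex_hits=2 fixed_hits=1
rm -rf "$tmp"
```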
### **Parallel Grep**
```bash
find . -type f -print0 | parallel -0 -j+0 -X grep -H "pattern" {} # Batch files per job; -H keeps filenames
```
### **Exclude Directories**
```bash
grep -r --exclude-dir={node_modules,.git} "function" . # Brace expansion requires bash or zsh
```
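The `{a,b}` form is expanded by the shell (bash or zsh) into separate flags before grep runs. In a plain POSIX `sh` script, repeating `--exclude-dir` has the same effect. The sketch below uses an invented project layout to confirm that the vendored directory is skipped.

```bash
# Hypothetical project with one source file and one vendored file.
tmp=$(mktemp -d)
mkdir -p "$tmp/proj/src" "$tmp/proj/node_modules/lib"
printf 'function main() {}\n'     > "$tmp/proj/src/main.js"
printf 'function vendored() {}\n' > "$tmp/proj/node_modules/lib/index.js"

# One --exclude-dir per directory works without brace expansion.
found=$(cd "$tmp/proj" && grep -rl --exclude-dir=node_modules --exclude-dir=.git 'function' .)

echo "$found"   # prints: ./src/main.js
rm -rf "$tmp"
```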
---
## **7. Advanced Techniques**
### **Multiple Patterns**
```bash
grep -e "error" -e "warning" log.txt # Multiple patterns
grep -f patterns.txt file.txt # Patterns from file
```
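Both forms select the same lines, as this sketch with made-up files shows. One caveat: a pattern file holds one pattern per line, and an empty line in it acts as an empty pattern that matches every input line, so stray blank lines are worth removing.

```bash
# Hypothetical pattern file and log.
tmp=$(mktemp -d)
printf '%s\n' 'error' 'warning'                        > "$tmp/patterns.txt"
printf '%s\n' 'info: ok' 'error: disk' 'warning: temp' > "$tmp/log.txt"

from_flags=$(grep -e 'error' -e 'warning' "$tmp/log.txt")
from_file=$(grep -f "$tmp/patterns.txt" "$tmp/log.txt")

printf '%s\n' "$from_file"
# error: disk
# warning: temp
rm -rf "$tmp"
```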
### **Inverse Matching**
```bash
grep -v "success" log.txt | grep "error" # Lines containing "error" but not "success"
```
### **Process Substitution**
```bash
diff <(grep "A" file1) <(grep "B" file2)
```
---
## **8. Real-World Recipes**
### **Log Analysis**
```bash
# Top 10 most frequent error messages
grep -o "ERROR: .*" app.log | sort | uniq -c | sort -nr | head
```
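Here is the same pipeline run on a tiny, invented `app.log`, so each stage can be checked by eye. The trailing `awk` only normalizes the padding that `uniq -c` puts in front of the count.

```bash
# Hypothetical log: "disk full" occurs twice, "timeout" once.
tmp=$(mktemp -d)
printf '%s\n' \
  '2025-01-01 ERROR: disk full' \
  '2025-01-01 INFO: started' \
  '2025-01-02 ERROR: timeout' \
  '2025-01-03 ERROR: disk full' > "$tmp/app.log"

# Extract the message, count duplicates, sort by count, keep the most frequent one.
top=$(grep -o 'ERROR: .*' "$tmp/app.log" | sort | uniq -c | sort -nr | head -n 1 | awk '{$1=$1; print}')

echo "$top"   # prints: 2 ERROR: disk full
rm -rf "$tmp"
```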
### **Code Refactoring**
```bash
# Find whole-word uses of a deprecated function, with line numbers
grep -rnw '/src/' -e 'deprecated_function'
```
### **CSV Processing**
```bash
# Print column 3 of rows whose column 1 matches, then keep values containing "value" (assumes no quoted commas)
awk -F, '$1 ~ /pattern/{print $3}' data.csv | grep "value"
```
### **Network Diagnostics**
```bash
# Unique IP addresses that appear on lines with a 404 status
grep " 404 " access.log | grep -oE "\b([0-9]{1,3}\.){3}[0-9]{1,3}\b" | sort -u
```
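A variation worth considering: if the log follows the common layout where the client address is the first field (an assumption about the log format), anchoring the pattern with `^` extracts only that address and avoids the non-POSIX `\b`. The log lines below are made up for illustration.

```bash
# Hypothetical access log: client IP first, status code after the quoted request.
tmp=$(mktemp -d)
printf '%s\n' \
  '10.0.0.1 - - [01/Jan/2025] "GET /a HTTP/1.1" 404 12' \
  '10.0.0.2 - - [01/Jan/2025] "GET /b HTTP/1.1" 200 34' \
  '10.0.0.1 - - [01/Jan/2025] "GET /c HTTP/1.1" 404 12' \
  '192.168.1.9 - - [01/Jan/2025] "GET /d HTTP/1.1" 404 12' > "$tmp/access.log"

# Keep 404 lines, extract the leading IPv4 address, de-duplicate.
ips=$(grep ' 404 ' "$tmp/access.log" | grep -oE '^([0-9]{1,3}\.){3}[0-9]{1,3}' | sort -u)

printf '%s\n' "$ips"
# 10.0.0.1
# 192.168.1.9
rm -rf "$tmp"
```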
---
## **9. Alternatives & Complements**
### **When Not to Use Grep**
| Task | Better Tool |
|------|------------|
| JSON processing | `jq` |
| Columnar data | `awk` |
| Complex regex | `perl`/`rg` |
| File content replacement | `sed` |
### **Enhanced Grep Variants**
```bash
rg "pattern" # ripgrep: faster, .gitignore aware
ack "pattern" # ack: Perl-powered, for coders
ag "pattern" # The Silver Searcher: similar to rg
```
---
## **Pro Tips**
- **Always quote patterns** (single quotes are safest) so the shell does not expand `*`, `$`, or backslashes before grep sees them
- **Combine with `xargs`** for bulk operations:
```bash
grep -lZ "old_version" *.js | xargs -0 sed -i 's/old_version/new_version/g' # GNU grep/sed; NUL-separated names handle spaces
```
- **Use `--`** to indicate end of options:
```bash
grep -- "-v" file.txt # Search for literal "-v"
```
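Two equivalent ways to search for a pattern that begins with a dash are sketched below on a made-up `flags.txt`. Without `--` or `-e`, grep would read `-v` as its invert option and treat the file name as the pattern.

```bash
# Hypothetical sample data.
tmp=$(mktemp -d)
printf '%s\n' 'use -v to invert' 'plain line' > "$tmp/flags.txt"

with_dashes=$(grep -- '-v' "$tmp/flags.txt")   # -- ends option parsing
with_e=$(grep -e '-v' "$tmp/flags.txt")        # -e marks the next argument as a pattern

echo "$with_dashes"   # prints: use -v to invert
rm -rf "$tmp"
```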
## **Further Learning**
- `man grep` - Official documentation
- [regex101.com](https://regex101.com) - Interactive regex tester
- [ripgrep](https://github.com/BurntSushi/ripgrep) - Modern grep alternative