diff --git a/tech_docs/cloud/aws_notes.md b/tech_docs/cloud/aws_notes.md
index b751a30..36e206e 100644
--- a/tech_docs/cloud/aws_notes.md
+++ b/tech_docs/cloud/aws_notes.md
@@ -1,3 +1,149 @@
+**Mastering `tcpdump` is invaluable** for cloud engineers, even in AWS/GCP/Azure environments. Here's why, when to use it, and how it complements cloud-native tools:
+
+---
+
+### **1. Why Learn `tcpdump` in the Cloud Era?**
+#### **Situations Where It Shines:**
+- **Debugging EC2 Instance Connectivity**:
+  Use it when Security Groups, NACLs, and Flow Logs all show the traffic as "allowed" but packets still aren't reaching your app.
+  ```bash
+  sudo tcpdump -i eth0 -nnv host 10.0.1.5 and port 80
+  ```
+  - `-nnv`: `-nn` disables host and port name resolution (faster); `-v` adds verbose output.
+
+- **Validating Encryption**:
+  Confirm that TLS handshakes actually take place (e.g., AWS ALB → EC2 traffic).
+  ```bash
+  sudo tcpdump -i eth0 -XX 'tcp port 443 and tcp[tcpflags] & (tcp-syn|tcp-ack) != 0'
+  ```
+
+- **Packet-Level Drops**:
+  Flow Logs show `REJECT` but don't explain *why*; `tcpdump` reveals RST packets, MTU issues, or malformed headers.
+
+#### **Cloud-Native Gaps It Fills:**
+| **Cloud Tool**           | **Limitation**                         | **How `tcpdump` Helps**                     |
+|--------------------------|----------------------------------------|---------------------------------------------|
+| VPC Flow Logs            | No packet payloads                     | Inspect HTTP headers, TLS versions          |
+| Security Groups          | No TCP flag logging                    | Check SYN/ACK/RST flags                     |
+| Network ACLs             | No visibility into interface drops     | See if packets reach the ENI                |
+
+---
+
+### **2. Key `tcpdump` Commands for Cloud Engineers**
+#### **Basic Capture (Save to File)**
+```bash
+sudo tcpdump -i eth0 -w /tmp/debug.pcap host 10.0.1.10 and port 443
+```
+- **Use Case**: Post-mortem analysis with Wireshark.
+
+#### **Filter AWS Metadata Service**
+```bash
+sudo tcpdump -i eth0 -nnv dst 169.254.169.254
+```
+- **Why**: Inspect IMDSv2 token requests, or spot unexpected metadata calls that may indicate SSRF.
+
+#### **Check MTU Issues**
+```bash
+sudo tcpdump -i eth0 -vv 'icmp and icmp[0] == 3 and icmp[1] == 4'
+```
+- **Interpretation**: ICMP "Fragmentation Needed" messages (type 3, code 4). They can be dropped when network ACLs or security group rules do not permit them, which breaks Path MTU Discovery.
+
+#### **Validate NAT Gateway Traffic**
+```bash
+sudo tcpdump -i eth0 -nn 'src host 10.0.1.5 and not dst net 10.0.0.0/16'
+```
+- **Why**: Confirm outbound traffic leaves the instance toward the NAT gateway. The source is still the private IP here; the SNAT itself happens at the gateway.
+
+---
+
+### **3. When to *Avoid* `tcpdump` in the Cloud**
+- **For VPC-Wide Analysis**: Use **VPC Flow Logs** instead (lower overhead).
+- **Encrypted Traffic**: Without decryption keys, `tcpdump` shows only ciphertext (use Layer 7 tools like ALB access logs).
+- **High-Throughput Services**: Capturing 100 Gbps of traffic will overwhelm your instance.
+
+---
+
+### **4. Cloud-Specific `tcpdump` Tricks**
+#### **Traffic Mirroring (AWS)**
+1. Set up a **Traffic Mirror Session** to copy packets to a monitoring instance.
+2. Capture on the mirror interface:
+   ```bash
+   sudo tcpdump -i ens5 -w /tmp/mirror.pcap
+   ```
+
+#### **Containerized Workloads (EKS/ECS)**
+```bash
+kubectl exec -it <pod-name> -- tcpdump -i eth0 -nn -c 10 'port 53'
+```
+- **Why**: Debug DNS issues in Kubernetes pods (replace `<pod-name>` with the target pod; the container image must include `tcpdump`).
+
+#### **Lambda Cold Starts**
+- **Not Possible**: There is no shell access; use **VPC Flow Logs** instead (for VPC-attached functions).
+
+---
+
+### **5. How `tcpdump` Complements Cloud Tools**
+#### **Debugging Flow Log "REJECT" Entries**
+1. **Flow Log Query**:
+   ```sql
+   fields @timestamp, srcAddr, dstAddr, action | filter action="REJECT"
+   ```
+2. **`tcpdump` Follow-Up**:
+   ```bash
+   sudo tcpdump -i eth0 -nnvv host 10.0.1.5 and host 10.0.2.10
+   ```
+   - Look for **RST packets**, which mean a host actively refused the connection. Security Groups and NACLs drop silently, so seeing no packets at all points back to them.
+
+#### **Validating Security Groups**
+- **Flow Logs Say**: Traffic allowed.
+- **But App Fails**:
+  ```bash
+  sudo tcpdump -i eth0 -nn -c 5 port 5432
+  ```
+  - If packets arrive but the app still fails, check the **OS firewall** (iptables/ufw); if none arrive, the problem is upstream of the instance (routing, NACLs, Security Groups).
+
+---
+
+### **6. Learning Roadmap**
+1. **Start With Basics**:
+   ```bash
+   sudo tcpdump -i eth0 -nnvv 'tcp port 22'
+   ```
+2. **Progress To**:
+   - Filtering by TCP flags (`tcp[tcpflags] & (tcp-syn|tcp-ack) != 0`).
+   - Decoding HTTP headers (`tcpdump -A -s0 port 80`).
+3. **Cloud Integration**:
+   - Automate captures during deployments.
+   - Correlate with CloudWatch Logs.
+
+---
+
+### **7. Alternatives in Managed Services**
+| **Scenario**               | **Native Tool**                | **`tcpdump` Equivalent**          |
+|----------------------------|--------------------------------|-----------------------------------|
+| VPC Traffic Inspection     | Traffic Mirroring              | `tcpdump` on mirror target        |
+| Container Networking       | EKS/ECS logs                   | `kubectl exec` + `tcpdump`        |
+| Serverless                 | VPC Flow Logs                  | Not applicable                    |
+
+---
+
+### **Final Verdict**
+**Yes, master `tcpdump`, but strategically:**
+- **Essential For**:
+  - Instance-level debugging.
+  - Validating encryption/MTU.
+  - Hybrid cloud (on-prem + cloud).
+- **Optional For**:
+  - Pure serverless architectures.
+  - High-throughput analytics (use Flow Logs + Athena instead).
+
+**Pro Tip**: Combine with `tshark` (Wireshark CLI) for advanced analysis; `-U` flushes each packet to the pipe as it is captured:
+```bash
+sudo tcpdump -i eth0 -U -w - | tshark -r - -Y 'http.request.method=="GET"'
+```
+
+---
+
 A **Cloud Network SME** operates at the same level of mastery as a traditional network engineer but with a cloud-native lens. Here’s what they have **top of mind**, structured like the OSI model for clarity:
 
 ---