The Ultimate Linux Networking & CLI Fluency Guide for AWS Professionals
(A tactical, no-fluff manual for mastering the fundamentals that power AWS under the hood)
Part 1: Linux Networking Fundamentals
1. TCP/IP Stack: The Bare Metal
Key Concepts
- IP Addressing: IPv4 (e.g., 10.0.0.1/24), IPv6 (e.g., fd00::1/64)
- Ports: 0-65535 (Well-known: 0-1023, Ephemeral: 32768-60999)
- Protocols: TCP (reliable), UDP (unreliable), ICMP (ping/traceroute).
Commands to Master
# View IP addresses and interfaces
ip addr show # Modern replacement for `ifconfig`
ip -4 addr # Show only IPv4 addresses
# Check listening ports
ss -tulnp # Replacement for `netstat -tulnp`
lsof -i :80 # Find processes using port 80
# Test connectivity
ping -c 4 8.8.8.8 # Basic ICMP test
traceroute -n 8.8.8.8 # Path discovery (no DNS resolution)
nc -zv 10.0.1.5 443 # Test TCP port (like telnet)
AWS Relevance
- Security Groups → iptables rules
- VPC CIDR blocks → ip route table
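To make the CIDR analogy concrete, here is a minimal sketch of the membership test that sits behind both a VPC CIDR block and a Linux route match. The helper names `ip_to_int` and `in_cidr` are made up for this illustration, and the sketch covers IPv4 only.

```shell
# Convert a dotted-quad IPv4 address to a 32-bit integer.
ip_to_int() {
  (
    IFS=.            # split on dots; the subshell keeps the caller's IFS intact
    set -- $1
    echo $(( ($1 << 24) | ($2 << 16) | ($3 << 8) | $4 ))
  )
}

# Succeed (exit status 0) when address $1 falls inside CIDR block $2.
in_cidr() {
  addr=$(ip_to_int "$1")
  net=$(ip_to_int "${2%/*}")
  bits=${2#*/}
  if [ "$bits" -eq 0 ]; then
    mask=0
  else
    mask=$(( (0xFFFFFFFF << (32 - $bits)) & 0xFFFFFFFF ))
  fi
  [ $(( $addr & $mask )) -eq $(( $net & $mask )) ]
}

in_cidr 10.0.1.5 10.0.0.0/16 && echo "10.0.1.5 is inside 10.0.0.0/16"
in_cidr 10.1.0.5 10.0.0.0/16 || echo "10.1.0.5 is outside 10.0.0.0/16"
```

The prefix length decides how many leading bits must agree, which is why 0.0.0.0/0 matches every address.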
2. Routing: How Packets Move
Key Concepts
- Default Gateway: Route for "everything else" (0.0.0.0/0).
- Routing Tables: Linux supports multiple tables (e.g., main, local).
- BGP/OSPF: BGP is the dynamic routing protocol used with AWS Direct Connect and Transit Gateway; OSPF is typically confined to the on-prem side.
Commands to Master
# View routing table
ip route show # Show main routing table
ip route show table all # All tables (e.g., AWS uses multiple)
# Add/delete routes
sudo ip route add 10.0.2.0/24 via 10.0.1.1 dev eth0
sudo ip route del 10.0.2.0/24
# Simulate AWS Route Tables
sudo ip rule add from 10.0.1.5 lookup 100 # Like AWS route table associations
AWS Relevance
- VPC Route Tables → ip route
- NAT Gateway → iptables -t nat
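Route selection follows the longest-prefix rule: the most specific matching route wins, and 0.0.0.0/0 only catches what nothing else matched. The sketch below illustrates that rule with awk. The function name `route_lookup`, the two-column table layout, and the target labels (`local`, `igw`) are assumptions for the example, not real `ip route` output.

```shell
# Pick the most specific (longest-prefix) route for a destination address.
# Reads "CIDR target" lines on stdin and prints the winning target.
route_lookup() {
  awk -v dest="$1" '
    function to_int(ip,    o) {
      split(ip, o, ".")
      return ((o[1] * 256 + o[2]) * 256 + o[3]) * 256 + o[4]
    }
    BEGIN { best = -1 }
    {
      split($1, cidr, "/")
      size = 2 ^ (32 - cidr[2])          # number of addresses in the block
      same_block = (int(to_int(dest) / size) == int(to_int(cidr[1]) / size))
      if (same_block && cidr[2] + 0 > best) { best = cidr[2] + 0; target = $2 }
    }
    END { if (best >= 0) print target; else print "unreachable" }
  '
}

# Illustrative table: a VPC-wide route, a more specific route, and a default.
routes='10.0.0.0/16 local
10.0.2.0/24 10.0.1.1
0.0.0.0/0 igw'

printf '%s\n' "$routes" | route_lookup 10.0.2.7   # matched by the /24
printf '%s\n' "$routes" | route_lookup 8.8.8.8    # falls through to the default
```

Integer division by the block size stands in for a bitmask here, since POSIX awk has no bitwise operators.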
3. iptables/nftables: The Firewall
Key Concepts
- Tables: filter (default), nat (NAT rules), mangle (packet modification).
- Chains: INPUT (inbound), OUTPUT (outbound), FORWARD (routed).
Commands to Master
# List all rules
sudo iptables -L -n -v # Security Groups map here
sudo iptables -t nat -L # NAT rules (for NAT Gateway simulation)
# Block/allow traffic (like Security Groups)
sudo iptables -A INPUT -p tcp --dport 22 -j ACCEPT # Allow SSH
sudo iptables -A INPUT -p tcp --dport 80 -j DROP # Block HTTP
# NAT example (AWS NAT Gateway behavior)
sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
AWS Relevance
- Security Groups → iptables filter table
- NACLs → Stateless (no conntrack)
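The stateful/stateless split is easier to see with a toy model. The sketch below is a simplified simulation, not how the kernel or AWS implements filtering: the function name `filter_packets` and the five-column packet format are invented for the example, and both modes allow all outbound traffic plus inbound TCP/22.

```shell
# Compare a stateful filter (Security Group style) with a stateless one (NACL style).
# Input lines: direction src_ip src_port dst_ip dst_port
filter_packets() {
  awk -v mode="$1" '
    $1 == "out" {
      # A stateful filter remembers the flow so the reply can be matched later.
      if (mode == "stateful") seen[$4 ":" $5 ">" $2 ":" $3] = 1
      print "ALLOW", $0
      next
    }
    $1 == "in" {
      key = $2 ":" $3 ">" $4 ":" $5
      if ($5 == 22 || (key in seen)) print "ALLOW", $0
      else print "DROP", $0
    }
  '
}

# An outbound HTTPS request followed by the reply to its ephemeral source port.
packets='out 10.0.1.5 40000 93.184.216.34 443
in 93.184.216.34 443 10.0.1.5 40000'

printf '%s\n' "$packets" | filter_packets stateful
printf '%s\n' "$packets" | filter_packets stateless
```

In stateful mode the reply is allowed because the flow was recorded; in stateless mode it is dropped, which is why NACLs need an explicit inbound rule for the ephemeral port range.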
Part 2: Non-Negotiable CLI Fluency
1. awk: Text Processing Superpower
Key Use Cases
- Extract fields from AWS CLI output.
- Transform logs (e.g., VPC Flow Logs).
Examples
# Extract private IPs from `aws ec2 describe-instances`
aws ec2 describe-instances | jq -r '.Reservations[].Instances[].PrivateIpAddress' | awk '{print "IP:", $1}'
# Parse /etc/passwd
awk -F: '{print $1, $6}' /etc/passwd # Username and home dir
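For the VPC Flow Logs use case, here is a small sketch that counts rejected flows per source address. It assumes the default (version 2) record layout, where the source address is field 4 and the action is field 13. The sample records, the account ID, and the function name `rejects_by_source` are made up for illustration.

```shell
# Default flow log fields:
# version account-id interface-id srcaddr dstaddr srcport dstport protocol packets bytes start end action log-status
sample_log='2 123456789012 eni-0a1b2c3d 10.0.1.5 10.0.2.9 49152 443 6 10 840 1700000000 1700000060 ACCEPT OK
2 123456789012 eni-0a1b2c3d 203.0.113.7 10.0.1.5 51000 22 6 3 180 1700000000 1700000060 REJECT OK
2 123456789012 eni-0a1b2c3d 203.0.113.7 10.0.1.5 51001 3389 6 2 120 1700000000 1700000060 REJECT OK
2 123456789012 eni-0a1b2c3d 198.51.100.4 10.0.1.5 40000 22 6 1 60 1700000000 1700000060 REJECT OK'

# Count rejected flows per source address, busiest source first.
rejects_by_source() {
  awk '$13 == "REJECT" { n[$4]++ } END { for (ip in n) print n[ip], ip }' |
    sort -k1,1nr -k2,2
}

printf '%s\n' "$sample_log" | rejects_by_source
```

With the sample data this prints `2 203.0.113.7` followed by `1 198.51.100.4`. Custom flow log formats reorder the fields, so the column numbers would need adjusting.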
2. jq: JSON Wizardry
Key Use Cases
- Filter AWS CLI JSON output.
- Transform API responses.
Examples
# Get all VPC IDs in a region
aws ec2 describe-vpcs | jq -r '.Vpcs[].VpcId'
# Find Security Groups allowing 0.0.0.0/0
aws ec2 describe-security-groups | jq -r '.SecurityGroups[] | select(any(.IpPermissions[].IpRanges[]; .CidrIp == "0.0.0.0/0")) | .GroupId'
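One jq subtlety worth drilling: if `select` receives a stream of comparisons (one per CIDR range), it emits the group once for every range that matches, so a group with two open rules is listed twice. Wrapping the stream in `any(generator; condition)` collapses it to a single true/false. The sketch below shows this on inline sample JSON; the group IDs and the function name `open_groups` are hypothetical, and it needs jq 1.5 or later.

```shell
# List IDs of security groups with at least one rule open to 0.0.0.0/0.
# Reads describe-security-groups style JSON on stdin.
open_groups() {
  jq -r '.SecurityGroups[]
         | select(any(.IpPermissions[]?.IpRanges[]?; .CidrIp == "0.0.0.0/0"))
         | .GroupId'
}

# Made-up sample: sg-open has two world-open rules, sg-closed has none.
sample_groups='{"SecurityGroups":[
  {"GroupId":"sg-open","IpPermissions":[
    {"IpRanges":[{"CidrIp":"0.0.0.0/0"}]},
    {"IpRanges":[{"CidrIp":"0.0.0.0/0"}]}]},
  {"GroupId":"sg-closed","IpPermissions":[
    {"IpRanges":[{"CidrIp":"10.0.0.0/16"}]}]}
]}'

# Only runs when jq is installed; prints sg-open a single time.
if command -v jq >/dev/null 2>&1; then
  printf '%s\n' "$sample_groups" | open_groups
fi
```

Against a real account the same function takes `aws ec2 describe-security-groups` output on stdin.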
3. tmux: Terminal Multiplexing
Key Use Cases
- Run parallel commands (e.g., tcpdump + aws cli).
- Persist sessions across SSH disconnects.
Cheat Sheet
tmux new -s aws_lab # Start new session
Ctrl+b % # Split pane into left/right halves
Ctrl+b " # Split pane into top/bottom halves
Ctrl+b [arrow key] # Switch panes
tmux attach -t aws_lab # Reattach session
Part 3: AWS + Linux Integration Drills
Drill 1: Simulate a Security Group
# Allow SSH only from 192.168.1.100
sudo iptables -A INPUT -p tcp --dport 22 -s 192.168.1.100 -j ACCEPT
sudo iptables -A INPUT -p tcp --dport 22 -j DROP
# Verify
sudo iptables -L INPUT -n -v
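Rule order is the whole trick in this drill: an iptables chain is walked top to bottom and the first matching rule decides, so the ACCEPT for 192.168.1.100 has to sit above the catch-all DROP. Here is a minimal sketch of that evaluation. The three-column rule format and the name `first_match` are invented for the example, and the fallback assumes the chain policy is ACCEPT.

```shell
# Evaluate an ordered rule list the way an iptables chain does: first match wins.
# Rules on stdin: "source port action" ("any" matches everything).
first_match() {
  awk -v src="$1" -v port="$2" '
    ($1 == src || $1 == "any") && ($2 == port || $2 == "any") { print $3; hit = 1; exit }
    END { if (!hit) print "ACCEPT" }   # no rule matched: fall back to the chain policy
  '
}

# The two rules from the drill, in the order they were appended.
rules='192.168.1.100 22 ACCEPT
any 22 DROP'

printf '%s\n' "$rules" | first_match 192.168.1.100 22   # allowed source
printf '%s\n' "$rules" | first_match 10.9.9.9 22        # everyone else
```

Swapping the two rules makes the DROP match first, which locks out 192.168.1.100 as well.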
Drill 2: Debug EC2 Networking
# Check ENI attachment
ip link show eth0 # Is it UP?
# Verify routes (VPC route table)
ip route show | grep default
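# Test metadata service (IMDSv2 needs a session token first)
TOKEN=$(curl -s -X PUT "http://169.254.169.254/latest/api/token" -H "X-aws-ec2-metadata-token-ttl-seconds: 21600")
curl -s -H "X-aws-ec2-metadata-token: $TOKEN" http://169.254.169.254/latest/meta-data/local-ipv4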
Drill 3: Parse AWS CLI with jq/awk
# Find all EC2 instances with public IPs
aws ec2 describe-instances | jq -r '.Reservations[].Instances[] | select(.PublicIpAddress != null) | .InstanceId'
# Count instances by state (running, stopped, ...)
aws ec2 describe-instances | jq -r '.Reservations[].Instances[] | .State.Name' | awk '{count[$1]++} END {for (s in count) print s, count[s]}'
Cheat Sheets
Linux Networking Quick Reference
| Command | Purpose | AWS Equivalent |
|---|---|---|
| ip addr show | List interfaces | aws ec2 describe-network-interfaces |
| ip route show | View routing table | aws ec2 describe-route-tables |
| sudo iptables -L | List firewall rules | Security Groups/NACLs |
| ss -tulnp | Check listening ports | aws ec2 describe-security-groups |
CLI Fluency Quick Reference
| Tool | Command Example | Use Case |
|---|---|---|
| awk | awk '{print $1}' file.txt | Extract first column |
| jq | jq -r '.VpcId' vpc.json | Parse AWS JSON output |
| tmux | tmux attach -t session | Reattach to a saved session |
Final Challenge
Simulate a NAT Gateway:
- On a Linux VM, enable IP forwarding:
sudo sysctl -w net.ipv4.ip_forward=1
- Add NAT rules:
sudo iptables -t nat -A POSTROUTING -o eth0 -j MASQUERADE
- Test from a private instance (its default route must point at the NAT VM):
curl ifconfig.me # Should return NAT VM's public IP
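A forgotten forwarding flag is the classic reason this challenge fails silently, so a quick preflight check helps. The sketch below reads the kernel flag. The function name `forwarding_enabled` and its optional file argument are conveniences invented here so the check can be exercised without root.

```shell
# Succeed when the kernel is set to forward IPv4 packets between interfaces.
# $1 optionally overrides the flag file (handy for trying the check on a copy).
forwarding_enabled() {
  flag_file=${1:-/proc/sys/net/ipv4/ip_forward}
  [ "$(cat "$flag_file" 2>/dev/null)" = "1" ]
}

if forwarding_enabled; then
  echo "IPv4 forwarding is on"
else
  echo "IPv4 forwarding is off: run 'sudo sysctl -w net.ipv4.ip_forward=1'"
fi
```

The sysctl change does not survive a reboot unless it is also written to a sysctl configuration file.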
This is the toolkit AWS network engineers use daily. Master these, and you’ll spend far less time guessing when a network issue hits.
Want real-world break/fix scenarios to practice? Let me know—I’ll draft a chaos engineering lab!
Here’s the ultimate workhorse lab setup for mastering cloud networking, hybrid environments, and CLI muscle memory—designed by a fellow nerd who values efficiency, realism, and cost-effectiveness.
🏗️ Lab Architecture Overview
Objective: Simulate a hybrid cloud enterprise network with AWS, on-prem, and multi-cloud components—all controllable via CLI.
Physical Hardware (Bare Minimum)
| Component | Purpose | Example Specs |
|---|---|---|
| Proxmox Server | Host VMs/LXC containers for networking services | 32GB RAM, 8 cores, NVMe |
| MicroPC (x2) | Act as "branch offices" (BGP speakers, VPN endpoints) | Intel NUC, 16GB RAM |
| Raspberry Pi 4 | Low-power edge device (IoT, DNS, monitoring) | 4GB RAM |
| Spare Laptop | Jump host/terminal (running tmux, AWS CLI, Terraform) | Any Linux OS |
🔥 Core Lab Components
1. Virtualized AWS Environment (No actual AWS bill needed!)
- LocalStack (AWS API emulator) for practicing AWS CLI commands:
docker run -d -p 4566:4566 --name localstack localstack/localstack
export AWS_ENDPOINT=http://localhost:4566
aws ec2 create-vpc --cidr-block 10.0.0.0/16 --endpoint-url $AWS_ENDPOINT
- Terraform to define "fake AWS" resources (VPCs, TGW, Direct Connect).
2. On-Prem Data Center (Proxmox VMs)
- VyOS (router OS) for BGP/OSPF/VPN:
qm create 1000 --memory 2048 --net0 virtio,bridge=vmbr0 --name vyos-router
wget https://downloads.vyos.io/rolling/current/amd64/vyos-rolling-latest.iso
qm importdisk 1000 vyos-rolling-latest.iso local-lvm
qm start 1000
- FreeIPA for identity management (LDAP, RBAC).
3. Hybrid Connectivity
- WireGuard VPN between "AWS" (LocalStack) and "on-prem" (VyOS):
# On VyOS
set interfaces wireguard wg0 address '10.1.1.1/24'
set interfaces wireguard wg0 peer aws allowed-ips '10.0.0.0/16'
- FRRouting for BGP peering:
sudo vtysh
configure terminal
router bgp 65001
neighbor 10.1.1.2 remote-as 65000 # "AWS" side
network 192.168.1.0/24
4. Observability Stack
- Grafana + Prometheus + Elasticsearch for logs/metrics:
docker-compose up -d # Uses this compose file: https://gist.github.com/your-repo
- NetFlow/sFlow from VyOS to ntopng.
💻 Daily Drills (CLI Muscle Memory)
Drill 1: "AWS" Network Build-Out (10 mins)
# Using LocalStack + Terraform
terraform apply -target=aws_vpc.prod -auto-approve
aws ec2 describe-route-tables --endpoint-url $AWS_ENDPOINT | jq '.RouteTables[].Routes[]'
Drill 2: BGP Route Injection (5 mins)
# On VyOS, inside vtysh (FRRouting syntax; native VyOS config uses `set protocols bgp ...`)
show ip bgp summary # Verify peer
configure terminal
router bgp 65001
network 192.168.2.0/24 # Add new route
Drill 3: Packet Capture Debugging (5 mins)
# On "branch" MicroPC
sudo tcpdump -i eth0 'host 10.1.1.1 and tcp port 179' -nnvv # BGP packets
Drill 4: Cost-Ops Reflex (5 mins)
# Find untagged "AWS" resources (LocalStack)
aws ec2 describe-instances --endpoint-url $AWS_ENDPOINT \
--query 'Reservations[].Instances[?!not_null(Tags[?Key==`Owner`])].InstanceId' | jq
⚙️ Automation & Chaos Engineering
1. Automated Breakage (Nightly Cron)
# Shut down the BGP peer every night at 02:00 so there is always something to fix
0 2 * * * sudo vtysh -c "configure terminal" -c "router bgp 65001" -c "neighbor 10.1.1.2 shutdown"
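A fixed cron line always breaks the same peer. To get real randomness, a small wrapper can pick the victim first. The sketch below is a dry run that only prints the command it would execute. The second peer address (10.1.1.3) and the function names `pick_victim` and `chaos_command` are hypothetical additions for illustration.

```shell
# Pick one of the given peers at random.
pick_victim() {
  # srand() seeds from the clock; rand() then selects one of the input lines.
  printf '%s\n' "$@" |
    awk 'BEGIN { srand() } { peer[NR] = $0 } END { print peer[int(rand() * NR) + 1] }'
}

# Build the vtysh invocation that would shut the chosen peer down.
chaos_command() {
  printf 'vtysh -c "configure terminal" -c "router bgp 65001" -c "neighbor %s shutdown"\n' "$1"
}

victim=$(pick_victim 10.1.1.2 10.1.1.3)
chaos_command "$victim"    # dry run: prints the command instead of executing it
```

Once the output looks right, the printed command can be piped to `sh` from root's crontab. Keeping the dry run as the default avoids breaking the lab by accident.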
2. Self-Healing Scripts
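# monitor_bgp.py (runs on Raspberry Pi; needs FRR's vtysh on that host)
import subprocess
PEER = "10.1.1.2"
out = subprocess.run(["vtysh", "-c", f"show ip bgp neighbors {PEER}"], capture_output=True, text=True).stdout
if "BGP state = Established" not in out:
    # Undo the nightly "neighbor ... shutdown" from the chaos cron job
    subprocess.run(["vtysh", "-c", "configure terminal", "-c", "router bgp 65001", "-c", f"no neighbor {PEER} shutdown"], check=True)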
📊 Lab Validation Checklist
| Test | Command | Expected Result |
|---|---|---|
| AWS VPC Reachability | ping 10.0.0.1 (from VyOS) | 0% packet loss |
| BGP Route Propagation | show ip route (on VyOS) | Sees AWS CIDRs |
| VPN Tunnel Health | wg show | Handshake < 2 mins old |
| Cost Leak Detection | aws ec2 describe-nat-gateways (LocalStack) | No orphaned NATs |
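The "Handshake < 2 mins old" check can be scripted rather than eyeballed. `wg show <interface> latest-handshakes` prints one line per peer with the public key and the Unix timestamp of the last handshake (0 if there has never been one). The sketch below flags stale peers from that output. The function name `stale_peers` and the sample keys are made up for the example.

```shell
# Print peers whose last handshake is older than a limit (default 120 seconds).
# $1 = current Unix time, $2 = optional limit in seconds.
# Reads "public_key<TAB>epoch_seconds" lines on stdin.
stale_peers() {
  awk -v now="$1" -v limit="${2:-120}" '$2 == 0 || now - $2 > limit { print $1 }'
}

# Sample input with made-up keys; in the lab, feed it real data instead:
#   sudo wg show wg0 latest-handshakes | stale_peers "$(date +%s)"
printf 'peerA\t1000\npeerB\t0\npeerC\t880\n' | stale_peers 1010
```

With the sample data, peerB (never connected) and peerC (130 seconds old) are reported, while peerA (10 seconds old) passes.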
🚀 Pro Tips for Nerds
- SSH Config Shortcuts:
# ~/.ssh/config
Host aws-jump
  HostName 192.168.1.100
  User ubuntu
  IdentityFile ~/.ssh/aws-lab
- Tmux Workflow:
tmux new -s lab # Split panes: AWS CLI, tcpdump, BGP monitor
- Cheat Sheets: Print the quick-reference tables from this guide and tape them to your monitor.
💡 Why This Lab Wins
- Zero AWS Costs: LocalStack + Terraform simulates AWS without bills.
- Real Hardware: MicroPCs/RPi force you to deal with physical limitations.
- Chaos-Ready: Automated breakage ensures you’re always troubleshooting.
Want the exact Terraform configs/VyOS scripts? I’ll package them into a GitHub repo for you—just say the word!
The Ultimate CLI Muscle Memory Training Plan
(For Nerds Who Want to Achieve Cloud Networking CLI Mastery Fast)
1. The Setup: Build a Home Lab That Mimics Production
Hardware (Bare Minimum)
- Proxmox Server (or any hypervisor) – Run nested VMs/containers.
- MicroPC/Raspberry Pi – For low-power networking (BGP, VPNs).
- Spare Laptop – As a jump host/terminal.
Software Stack
| Tool | Purpose | Install Command |
|---|---|---|
| AWS CLI v2 | Cloud-native networking | curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && unzip awscliv2.zip && sudo ./aws/install |
| Terraform | IaC for repeatable labs | sudo apt-get install terraform (after adding the HashiCorp apt repository) |
| FRRouting | BGP/OSPF practice | sudo apt-get install frr |
| WireGuard | VPN tunneling | sudo apt-get install wireguard |
| tcpdump | Packet-level debugging | sudo apt-get install tcpdump |
| jq | JSON parsing for AWS CLI outputs | sudo apt-get install jq |
| Tmux | Terminal multiplexing for drills | sudo apt-get install tmux |
2. The Drills: Daily CLI Workouts
(30-60 mins/day, designed for muscle memory)
Drill 1: AWS Networking Speed Run (15 mins)
Goal: Automate VPC creation + troubleshoot.
# Create a VPC with Terraform (save as `vpc.tf`)
resource "aws_vpc" "lab" {
  cidr_block = "10.0.0.0/16"
  tags = { Name = "cli-muscle-memory" }
}
# Deploy and debug
terraform init && terraform apply -auto-approve
aws ec2 describe-vpcs --query 'Vpcs[].CidrBlock' | jq
aws ec2 delete-vpc --vpc-id $(aws ec2 describe-vpcs --query 'Vpcs[?Tags[?Key==`Name` && Value==`cli-muscle-memory`]].VpcId' --output text)
Pro Tip: Time yourself. Aim for <2 mins by Day 7.
Drill 2: BGP + VPN Chaos (20 mins)
Goal: Simulate hybrid cloud failures.
- Set Up FRRouting (BGP) on a Linux VM:
sudo vtysh
configure terminal
router bgp 65001
neighbor 192.168.1.1 remote-as 65002
timers bgp 10 30 # Aggressive timers for failure sim
- Break It:
sudo ip link set eth0 down # Kill primary interface
- Fix It:
sudo vtysh -c 'show ip bgp summary' # Diagnose
sudo ip link set eth0 up && sudo systemctl restart frr
Drill 3: Packet Kung Fu (10 mins)
Goal: Diagnose HTTPS failures without logs.
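# Capture TCP/443 packets with SYN or ACK set (covers the TLS handshake)
sudo tcpdump -i any 'tcp port 443 and tcp[tcpflags] & (tcp-syn|tcp-ack) != 0' -nnvv -w tls.pcap
# Analyze in Wireshark (or CLI):
tshark -r tls.pcap -Y 'tls.handshake.type == 1' # ClientHello messages (use ssl.* on Wireshark < 3.0)
tshark -r tls.pcap -Y 'tls.alert_message' # TLS alerts, the usual sign of a failed handshake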
Drill 4: Cost-Ops Reflex Training (15 mins)
Goal: Find and nuke wasteful resources.
# Find untagged EC2 instances
aws ec2 describe-instances --query 'Reservations[].Instances[?!not_null(Tags[?Key==`Owner`])].InstanceId' | jq
# Terminate with prejudice
aws ec2 terminate-instances --instance-ids $(aws ec2 describe-instances --query 'Reservations[].Instances[?!not_null(Tags[?Key==`Owner`])].InstanceId' --output text)
# List available NAT Gateways (the API has no "idle" flag; check the CloudWatch
# metric BytesOutToDestination in namespace AWS/NATGateway before deleting any)
aws ec2 describe-nat-gateways --filter Name=state,Values=available --query 'NatGateways[].NatGatewayId' | jq
3. The Gauntlet: Weekly Challenges
(Simulate real outages—no Google allowed!)
Challenge 1: "The Silent NACL"
- Scenario: All traffic to TCP/443 is blocked, but Security Groups are open.
- Tools Allowed: Only tcpdump, aws ec2 describe-network-acls.
- Fix Time: <10 mins.
Challenge 2: "BGP Route Leak"
- Scenario: Your VM can’t reach the internet, but ping 8.8.8.8 works.
- Tools Allowed: vtysh, ip route.
- Fix Time: <15 mins.
4. Pro Tips for CLI Dominance
- Alias Everything:
alias aws-vpcs='aws ec2 describe-vpcs --query "Vpcs[*].{ID:VpcId,CIDR:CidrBlock}" --output table'
alias aws-nats='aws ec2 describe-nat-gateways --filter Name=state,Values=available --query "NatGateways[*].{ID:NatGatewayId,VPC:VpcId}" --output table'
# Review the list (and CloudWatch traffic metrics) before running: aws ec2 delete-nat-gateway --nat-gateway-id <id>
- CLI-Only Days:
  - Spend 1 day/week without a GUI (AWS Console, Wireshark, etc.).
- Keybindings:
  - Master Ctrl+R (reverse search), Ctrl+A/E (line navigation).
5. Measure Your Progress
| Skill | Beginner | Master |
|---|---|---|
| VPC Creation | 3+ mins (manual clicks) | <60 secs (CLI/Terraform) |
| BGP Troubleshooting | Relies on logs | tcpdump + vtysh in <5 mins |
| Cost Hunting | Manual Cost Explorer | One-liner to find waste |
Final Wisdom
- Repetition > Theory: Do each drill 3x/week until it’s boring.
- Break Things Intentionally: Corrupt BGP tables, drop packets, then fix.
- Automate Your Punishment: Write scripts that break your lab nightly, forcing you to debug.
Want a ready-to-go Proxmox/K8s lab config? I can share Terraform templates to auto-build breakable environments!