
Here's a workhorse lab setup for mastering cloud networking, hybrid environments, and CLI muscle memory, designed by a fellow nerd who values efficiency, realism, and cost-effectiveness.


🏗️ Lab Architecture Overview

Objective: Simulate a hybrid cloud enterprise network with AWS, on-prem, and multi-cloud components—all controllable via CLI.

Physical Hardware (Bare Minimum)

| Component | Purpose | Example Specs |
|---|---|---|
| Proxmox Server | Host VMs/LXC containers for networking services | 32GB RAM, 8 cores, NVMe |
| MicroPC (x2) | Act as "branch offices" (BGP speakers, VPN endpoints) | Intel NUC, 16GB RAM |
| Raspberry Pi 4 | Low-power edge device (IoT, DNS, monitoring) | 4GB RAM |
| Spare Laptop | Jump host/terminal (running tmux, AWS CLI, Terraform) | Any Linux OS |

🔥 Core Lab Components

1. Virtualized AWS Environment (No actual AWS bill needed!)

  • LocalStack (AWS API emulator) for practicing AWS CLI commands:
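    docker run -d -p 4566:4566 --name localstack localstack/localstack
    export AWS_ENDPOINT=http://localhost:4566
    # LocalStack accepts any credentials, but the AWS CLI refuses to run without some
    export AWS_ACCESS_KEY_ID=test AWS_SECRET_ACCESS_KEY=test AWS_DEFAULT_REGION=us-east-1
    aws ec2 create-vpc --cidr-block 10.0.0.0/16 --endpoint-url "$AWS_ENDPOINT"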
    
  • Terraform to define "fake AWS" resources (VPCs, TGW, Direct Connect).
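
Forgetting `--endpoint-url` sends a command to real AWS instead of LocalStack. A minimal sketch of a wrapper that always appends it is shown below; the helper names `localstack_cmd` and `run_localstack` are hypothetical, and the dummy credentials assume LocalStack's default behavior of accepting any key pair.

```python
import os
import subprocess

# LocalStack edge port, as published by the docker run command above.
DEFAULT_ENDPOINT = "http://localhost:4566"


def localstack_cmd(*aws_args, endpoint=None):
    """Build an `aws` argv that always targets the LocalStack endpoint."""
    endpoint = endpoint or os.environ.get("AWS_ENDPOINT", DEFAULT_ENDPOINT)
    return ["aws", *aws_args, "--endpoint-url", endpoint]


def run_localstack(*aws_args):
    """Run the command and return stdout (needs the aws CLI and a running LocalStack)."""
    env = dict(os.environ)
    # Assumption: dummy values are enough, because LocalStack does not validate them.
    env.setdefault("AWS_ACCESS_KEY_ID", "test")
    env.setdefault("AWS_SECRET_ACCESS_KEY", "test")
    env.setdefault("AWS_DEFAULT_REGION", "us-east-1")
    result = subprocess.run(localstack_cmd(*aws_args), env=env,
                            capture_output=True, text=True, check=True)
    return result.stdout


if __name__ == "__main__":
    # Only prints the command line; nothing is executed here.
    print(" ".join(localstack_cmd("ec2", "create-vpc", "--cidr-block", "10.0.0.0/16")))
```

Calling `run_localstack("ec2", "describe-vpcs")` from the jump host would then behave like the CLI line above without the endpoint flag being typed each time.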

2. On-Prem Data Center (Proxmox VMs)

  • VyOS (router OS) for BGP/OSPF/VPN:
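    # Download the ISO into Proxmox's ISO storage (the download URL may have moved)
    wget -P /var/lib/vz/template/iso https://downloads.vyos.io/rolling/current/amd64/vyos-rolling-latest.iso
    # Attach the ISO as a CD-ROM and add an empty 8 GB disk to install onto;
    # an installer ISO is not a disk image, so qm importdisk does not apply here
    qm create 1000 --name vyos-router --memory 2048 --net0 virtio,bridge=vmbr0 \
      --scsi0 local-lvm:8 --ide2 local:iso/vyos-rolling-latest.iso,media=cdrom
    qm start 1000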
    
  • FreeIPA for identity management (LDAP, RBAC).

3. Hybrid Connectivity

  • WireGuard VPN between "AWS" (LocalStack) and "on-prem" (VyOS):
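    # On VyOS (configuration mode); replace the placeholder with the peer's real key
    # (the keyword is pubkey rather than public-key on older VyOS releases)
    set interfaces wireguard wg0 address '10.1.1.1/24'
    set interfaces wireguard wg0 peer aws public-key '<aws-side-public-key>'
    set interfaces wireguard wg0 peer aws allowed-ips '10.0.0.0/16'
    set interfaces wireguard wg0 peer aws allowed-ips '10.1.1.2/32'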
    
  • FRRouting for BGP peering:
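    sudo vtysh
    configure terminal
    router bgp 65001
     ! Lab shortcut: recent FRR releases exchange no eBGP routes until a policy is applied
     no bgp ebgp-requires-policy
     ! "AWS" side
     neighbor 10.1.1.2 remote-as 65000
     network 192.168.1.0/24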
    

4. Observability Stack

  • Grafana + Prometheus + Elasticsearch for logs/metrics:
    docker-compose up -d  # Uses this compose file: https://gist.github.com/your-repo
    
  • NetFlow/sFlow from VyOS to ntopng.

💻 Daily Drills (CLI Muscle Memory)

Drill 1: "AWS" Network Build-Out (10 mins)

# Using LocalStack + Terraform
terraform apply -target=aws_vpc.prod -auto-approve
aws ec2 describe-route-tables --endpoint-url $AWS_ENDPOINT | jq '.RouteTables[].Routes[]'

Drill 2: BGP Route Injection (5 mins)

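# On the FRR router, inside vtysh (VyOS's own CLI uses "set protocols bgp ..." instead)
show ip bgp summary
! The peer should show a prefix count rather than Idle or Active
configure terminal
router bgp 65001
 ! Add a new route
 network 192.168.2.0/24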

Drill 3: Packet Capture Debugging (5 mins)

# On "branch" MicroPC
sudo tcpdump -i eth0 'host 10.1.1.1 and tcp port 179' -nnvv  # BGP packets

Drill 4: Cost-Ops Reflex (5 mins)

# Find untagged "AWS" resources (LocalStack)
aws ec2 describe-instances --endpoint-url $AWS_ENDPOINT \
  --query 'Reservations[].Instances[?!not_null(Tags[?Key==`Owner`])].InstanceId' | jq
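
The JMESPath filter above is dense, so the same check can be sketched in plain Python for comparison. This assumes the standard `describe-instances` JSON shape (`Reservations[].Instances[]` with an optional `Tags` list); the function name and the sample instance IDs are made up for illustration.

```python
import json


def instances_missing_tag(describe_instances, tag_key="Owner"):
    """Return IDs of instances that carry no tag named tag_key."""
    missing = []
    for reservation in describe_instances.get("Reservations", []):
        for instance in reservation.get("Instances", []):
            # "Tags" is absent entirely on instances that were never tagged.
            keys = {tag["Key"] for tag in instance.get("Tags") or []}
            if tag_key not in keys:
                missing.append(instance["InstanceId"])
    return missing


if __name__ == "__main__":
    # Hypothetical sample shaped like `aws ec2 describe-instances` output.
    sample = json.loads("""
    {"Reservations": [
      {"Instances": [
        {"InstanceId": "i-tagged", "Tags": [{"Key": "Owner", "Value": "netops"}]},
        {"InstanceId": "i-name-only", "Tags": [{"Key": "Name", "Value": "web"}]}
      ]},
      {"Instances": [{"InstanceId": "i-no-tags"}]}
    ]}
    """)
    print(instances_missing_tag(sample))  # prints ['i-name-only', 'i-no-tags']
```

Unlike the CLI query, this returns one flat list instead of one list per reservation, which is easier to feed into a follow-up command.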

⚙️ Automation & Chaos Engineering

1. Automated Breakage (Nightly Cron)

# Drop the BGP peer every night at 02:00 (add further entries for VPN tunnel faults)
0 2 * * * sudo vtysh -c "configure terminal" -c "router bgp 65001" -c "neighbor 10.1.1.2 shutdown"
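
The cron entry always breaks the same thing. To get a genuinely random fault, one option is a small picker script that cron calls instead. The sketch below is an assumption-heavy illustration: the fault names, the `wg0` interface name, and the `--apply` flag are all hypothetical, and it only prints the chosen fault unless `--apply` is passed.

```python
import random
import subprocess
import sys

# Candidate faults; each value is the argv the nightly job would run.
# Assumption: the script runs as root on the router, with wg0 as the tunnel interface.
FAULTS = {
    "bgp-peer-down": ["vtysh", "-c", "configure terminal", "-c", "router bgp 65001",
                      "-c", "neighbor 10.1.1.2 shutdown"],
    "vpn-tunnel-down": ["ip", "link", "set", "wg0", "down"],
}


def pick_fault(rng=random):
    """Choose one fault at random and return (name, argv)."""
    name = rng.choice(sorted(FAULTS))
    return name, FAULTS[name]


if __name__ == "__main__":
    name, argv = pick_fault()
    print(f"tonight's fault: {name}")
    if "--apply" in sys.argv:
        subprocess.run(argv, check=True)
```

Running it without `--apply` first is a cheap way to confirm the fault list before letting cron break anything.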

2. Self-Healing Scripts

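# monitor_bgp.py (run on the router that hosts FRR, where vtysh is available)
import subprocess
# "show ip bgp summary" shows a prefix count, not the word "Established", so check the neighbor detail.
out = subprocess.run(["vtysh", "-c", "show ip bgp neighbors 10.1.1.2"], capture_output=True, text=True).stdout
if "BGP state = Established" not in out:
    # "activate" does not undo the nightly shutdown; remove the shutdown instead.
    subprocess.run(["vtysh", "-c", "configure terminal", "-c", "router bgp 65001", "-c", "no neighbor 10.1.1.2 shutdown"])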
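
After a fix is applied it helps to confirm that the session actually came back and how long that took. The following is a rough sketch, assuming `show ip bgp neighbors <ip>` prints a `BGP state = ...` line; `wait_for_established` and `read_from_vtysh` are hypothetical helper names, and the reported time is the nominal polling time, not a precise measurement.

```python
import re
import subprocess
import time

STATE_RE = re.compile(r"BGP state = (\w+)")


def bgp_state(neighbor_output):
    """Pull the session state out of `show ip bgp neighbors <ip>` text, or None."""
    match = STATE_RE.search(neighbor_output)
    return match.group(1) if match else None


def wait_for_established(read_output, timeout_s=120, interval_s=5, sleep=time.sleep):
    """Poll until the session is Established; return seconds waited, or None on timeout."""
    waited = 0
    while waited <= timeout_s:
        if bgp_state(read_output()) == "Established":
            return waited
        sleep(interval_s)
        waited += interval_s
    return None


def read_from_vtysh(neighbor="10.1.1.2"):
    """Fetch the neighbor detail from the local FRR instance (requires vtysh)."""
    result = subprocess.run(["vtysh", "-c", f"show ip bgp neighbors {neighbor}"],
                            capture_output=True, text=True)
    return result.stdout
```

On the router, `wait_for_established(read_from_vtysh)` returns the recovery time in seconds, which gives the self-healing script a number to log.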

📊 Lab Validation Checklist

| Test | Command | Expected Result |
|---|---|---|
| AWS VPC Reachability | ping 10.0.0.1 (from VyOS) | 0% packet loss |
| BGP Route Propagation | show ip route (on VyOS) | Sees AWS CIDRs |
| VPN Tunnel Health | wg show | Handshake < 2 mins old |
| Cost Leak Detection | aws ec2 describe-nat-gateways (LocalStack) | No orphaned NATs |
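
The "Handshake < 2 mins old" check can be automated. The sketch below assumes the output of `wg show wg0 latest-handshakes`, which lists one peer per line as a public key followed by a Unix timestamp; the function name `stale_peers` and the 120-second default are illustrative choices taken from the table, not part of WireGuard.

```python
import time


def stale_peers(latest_handshakes, max_age_s=120, now=None):
    """Return public keys whose last handshake is older than max_age_s or never happened."""
    now = time.time() if now is None else now
    stale = []
    for line in latest_handshakes.splitlines():
        if not line.strip():
            continue
        public_key, timestamp = line.split()
        # wg reports 0 when no handshake has completed yet.
        if int(timestamp) == 0 or now - int(timestamp) > max_age_s:
            stale.append(public_key)
    return stale
```

Feeding it the command output (for example via `subprocess.run(["wg", "show", "wg0", "latest-handshakes"], capture_output=True, text=True).stdout`) yields an empty list when every tunnel is healthy.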

🚀 Pro Tips for Nerds

  1. SSH Config Shortcuts:
    # ~/.ssh/config
    Host aws-jump
      HostName 192.168.1.100
      User ubuntu
      IdentityFile ~/.ssh/aws-lab
    
  2. Tmux Workflow:
    tmux new -s lab
    # Split panes: AWS CLI, tcpdump, BGP monitor
    
  3. Cheat Sheets: Print these and tape them to your monitor:

💡 Why This Lab Wins

  • Zero AWS Costs: LocalStack + Terraform simulates AWS without bills.
  • Real Hardware: MicroPCs/RPi force you to deal with physical limitations.
  • Chaos-Ready: Automated breakage ensures you're always troubleshooting.

Want the exact Terraform configs/VyOS scripts? I'll package them into a GitHub repo for you. Just say the word!


The Ultimate CLI Muscle Memory Training Plan

(For Nerds Who Want to Achieve Cloud Networking CLI Mastery Fast)


1. The Setup: Build a Home Lab That Mimics Production

Hardware (Bare Minimum)

  • Proxmox Server (or any hypervisor): Run nested VMs/containers.
  • MicroPC/Raspberry Pi: For low-power networking (BGP, VPNs).
  • Spare Laptop: As a jump host/terminal.

Software Stack

| Tool | Purpose | Install Command |
|---|---|---|
| AWS CLI v2 | Cloud-native networking | curl "https://awscli.amazonaws.com/awscli-exe-linux-x86_64.zip" -o "awscliv2.zip" && unzip awscliv2.zip && sudo ./aws/install |
| Terraform | IaC for repeatable labs | sudo apt-get install terraform (after adding the HashiCorp apt repository) |
| FRRouting | BGP/OSPF practice | sudo apt-get install frr |
| WireGuard | VPN tunneling | sudo apt-get install wireguard |
| tcpdump | Packet-level debugging | sudo apt-get install tcpdump |
| jq | JSON parsing for AWS CLI outputs | sudo apt-get install jq |
| Tmux | Terminal multiplexing for drills | sudo apt-get install tmux |

2. The Drills: Daily CLI Workouts

(30-60 mins/day, designed for muscle memory)

Drill 1: AWS Networking Speed Run (15 mins)

Goal: Automate VPC creation + troubleshoot.

# Create a VPC with Terraform (save as `vpc.tf`)
resource "aws_vpc" "lab" {
  cidr_block = "10.0.0.0/16"
  tags = { Name = "cli-muscle-memory" }
}

# Deploy and debug
terraform init && terraform apply -auto-approve
aws ec2 describe-vpcs --query 'Vpcs[].CidrBlock' | jq
aws ec2 delete-vpc --vpc-id $(aws ec2 describe-vpcs --query 'Vpcs[?Tags[?Key==`Name` && Value==`cli-muscle-memory`]].VpcId' --output text)

Pro Tip: Time yourself. Aim for <2 mins by Day 7.
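
Timing is easier to track when it is logged automatically. The sketch below times any command and appends the result to a CSV; the function name `time_drill` and the file name `drill_times.csv` are hypothetical. Passing a shell as the command turns it into a stopwatch: do the drill in the subshell, type `exit`, and the elapsed time is recorded.

```python
import csv
import subprocess
import sys
import time
from datetime import datetime, timezone


def time_drill(name, argv, log_path="drill_times.csv"):
    """Run argv, append (UTC timestamp, drill, seconds, exit code) to a CSV, return seconds."""
    start = time.monotonic()
    result = subprocess.run(argv)
    elapsed = round(time.monotonic() - start, 2)
    with open(log_path, "a", newline="") as handle:
        csv.writer(handle).writerow(
            [datetime.now(timezone.utc).isoformat(), name, elapsed, result.returncode])
    return elapsed
```

For example, `time_drill("vpc-speed-run", ["bash"])` opens a shell for the drill, while `time_drill("smoke", [sys.executable, "-c", "pass"])` is a quick way to check that logging works.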


Drill 2: BGP + VPN Chaos (20 mins)

Goal: Simulate hybrid cloud failures.

  1. Set Up FRRouting (BGP) on a Linux VM:
    sudo vtysh
    configure terminal
    router bgp 65001
     neighbor 192.168.1.1 remote-as 65002
     ! Aggressive timers for failure simulation: 10 s keepalive, 30 s hold time
     timers bgp 10 30
    
  2. Break It:
    sudo ip link set eth0 down  # Kill primary interface
    
  3. Fix It:
    sudo vtysh -c 'show ip bgp summary'  # Diagnose
    sudo ip link set eth0 up && sudo systemctl restart frr
    

Drill 3: Packet Kung Fu (10 mins)

Goal: Diagnose HTTPS failures without logs.

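# Capture TCP 443 packets with SYN or ACK set (this covers the TLS handshake)
sudo tcpdump -i any 'tcp port 443 and tcp[tcpflags] & (tcp-syn|tcp-ack) != 0' -nn -w tls.pcap

# Analyze in Wireshark (or CLI); older tshark versions use the ssl. prefix instead of tls.
tshark -r tls.pcap -Y 'tls.handshake.type == 1'      # ClientHello: every handshake attempt
tshark -r tls.pcap -Y 'tls.record.content_type == 21'  # TLS alerts: where failed handshakes show up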

Drill 4: Cost-Ops Reflex Training (15 mins)

Goal: Find and nuke wasteful resources.

# Find untagged EC2 instances
aws ec2 describe-instances --query 'Reservations[].Instances[?!not_null(Tags[?Key==`Owner`])].InstanceId' | jq

# Terminate with prejudice
aws ec2 terminate-instances --instance-ids $(aws ec2 describe-instances --query 'Reservations[].Instances[?!not_null(Tags[?Key==`Owner`])].InstanceId' --output text)

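# List available NAT Gateways. describe-nat-gateways has no NetworkInterfaces or "in-use" field,
# so idleness has to be judged from CloudWatch metrics such as BytesOutToDestination.
aws ec2 describe-nat-gateways --filter Name=state,Values=available --query 'NatGateways[].{ID:NatGatewayId,VPC:VpcId,Subnet:SubnetId}' --output table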
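
The `describe-nat-gateways` response carries no traffic information, so one way to decide whether a gateway is idle is to look at its CloudWatch traffic. The sketch below assumes JSON from `aws cloudwatch get-metric-statistics --namespace AWS/NATGateway --metric-name BytesOutToDestination --dimensions Name=NatGatewayId,Value=<id> --statistics Sum` (plus a time window and period of your choosing); the function name and the zero-byte threshold are illustrative assumptions.

```python
def is_idle(metric_statistics, threshold_bytes=0):
    """True when the summed traffic across all datapoints is at or below threshold_bytes."""
    datapoints = metric_statistics.get("Datapoints", [])
    # A gateway with no datapoints in the window counts as idle.
    total = sum(point.get("Sum", 0.0) for point in datapoints)
    return total <= threshold_bytes
```

A gateway that stays idle over a window of several days is a reasonable candidate for review before deletion, rather than automatic removal.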

3. The Gauntlet: Weekly Challenges

(Simulate real outages—no Google allowed!)

Challenge 1: "The Silent NACL"

  • Scenario: All traffic to TCP/443 is blocked, but Security Groups are open.
  • Tools Allowed: Only tcpdump, aws ec2 describe-network-acls.
  • Fix Time: <10 mins.
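
NACLs are evaluated in ascending rule-number order and stop at the first match, which is why a low-numbered deny can silently override an allow. A minimal sketch of that evaluation is below, assuming the `Entries` list from `aws ec2 describe-network-acls` (fields `RuleNumber`, `Protocol`, `RuleAction`, `Egress`, `CidrBlock`, `PortRange`); the function name is hypothetical and only IPv4 entries are considered.

```python
import ipaddress


def nacl_verdict(entries, peer_ip, port, protocol="6", egress=False):
    """Return (action, rule_number) of the first matching entry for this packet."""
    candidates = [e for e in entries if e.get("Egress", False) == egress]
    for entry in sorted(candidates, key=lambda e: e["RuleNumber"]):
        # Protocol "-1" means all protocols; "6" is TCP.
        if entry["Protocol"] not in ("-1", protocol):
            continue
        cidr = entry.get("CidrBlock")  # IPv6 entries use Ipv6CidrBlock and are skipped
        if cidr is None or ipaddress.ip_address(peer_ip) not in ipaddress.ip_network(cidr):
            continue
        ports = entry.get("PortRange")  # absent when the rule covers all ports
        if ports and not ports["From"] <= port <= ports["To"]:
            continue
        return entry["RuleAction"], entry["RuleNumber"]
    return "deny", None
```

Because NACLs are stateless, the same function should be run with `egress=True` for the reply traffic on the client's ephemeral port.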

Challenge 2: "BGP Route Leak"

  • Scenario: Your VM can't reach the internet, but ping 8.8.8.8 works.
  • Tools Allowed: vtysh, ip route.
  • Fix Time: <15 mins.

4. Pro Tips for CLI Dominance

  1. Alias Everything:
    alias aws-vpcs='aws ec2 describe-vpcs --query "Vpcs[*].{ID:VpcId,CIDR:CidrBlock}" --output table'
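    # describe-nat-gateways cannot tell idle gateways from busy ones, so this lists every
    # available gateway and xargs -p asks for confirmation before each delete
    alias kill-nats='aws ec2 describe-nat-gateways --filter Name=state,Values=available --query "NatGateways[].NatGatewayId" --output text | xargs -p -n 1 aws ec2 delete-nat-gateway --nat-gateway-id'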
    
  2. CLI-Only Days:
    • Spend 1 day/week without a GUI (AWS Console, Wireshark, etc.).
  3. Keybindings:
    • Master Ctrl+R (reverse search), Ctrl+A/E (line navigation).

5. Measure Your Progress

| Skill | Beginner | Master |
|---|---|---|
| VPC Creation | 3+ mins (manual clicks) | <60 secs (CLI/Terraform) |
| BGP Troubleshooting | Relies on logs | tcpdump + vtysh in <5 mins |
| Cost Hunting | Manual Cost Explorer | One-liner to find waste |

Final Wisdom

  • Repetition > Theory: Do each drill 3x/week until it's boring.
  • Break Things Intentionally: Corrupt BGP tables, drop packets, then fix.
  • Automate Your Punishment: Write scripts that break your lab nightly, forcing you to debug.

Want a ready-to-go Proxmox/K8s lab config? I can share Terraform templates to auto-build breakable environments!