Add tech_docs/networking/sdwan.md

2025-07-28 13:18:15 -05:00
parent fae46d1c96
commit 9531831f0f

@@ -0,0 +1,430 @@
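Yes, you're essentially correct. In Cisco SD-WAN (formerly Viptela), **VPN 0** is the **transport VPN**, and it plays the same role as a **front-door VRF (FVRF)**: it holds the WAN-facing underlay interfaces that the overlay tunnels are sourced from. Cisco's SD-WAN documentation calls it the transport VPN rather than FVRF, but the analogy is sound. Let me break down why it's special and how to configure it properly.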
---
## **Key Characteristics of VPN 0 (Front-Door VRF)**
1. **Transport-Only VPN**:
- Carries control-plane connections (DTLS/TLS sessions to vBond, vSmart, and vManage).
- Carries encrypted data-plane traffic (IPsec/GRE tunnels between edges).
- Does **not** carry unencapsulated user traffic (that lives in service VPNs such as VPN 1 and VPN 2; VPN 512 is reserved for out-of-band management).
2. **Mandatory for SD-WAN Operation**:
- Every SD-WAN router **must** have VPN 0 configured.
- If VPN 0 fails, the device loses connectivity to controllers and other edges.
3. **Uses Physical/Tunnel Interfaces**:
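- Physical WAN interfaces (e.g., `ge0/1` on a vEdge) are assigned to VPN 0.
- The `tunnel-interface` block under each WAN interface turns it into an overlay tunnel endpoint (a TLOC). On IOS XE SD-WAN routers, a `Tunnel` interface is paired with the physical interface for the same purpose.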
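
The VPN numbering described above can be summarized in a few lines of Python. This is only a sketch: `classify_vpn` is a hypothetical helper, and the upper bound of 65530 for service VPNs is an assumption based on vEdge documentation (IOS XE SD-WAN releases may use a slightly different limit).

```python
def classify_vpn(vpn_id: int) -> str:
    """Return the role of a Cisco SD-WAN VPN number (illustrative only)."""
    if vpn_id == 0:
        return "transport"      # WAN-facing underlay, control connections, tunnels
    if vpn_id == 512:
        return "management"     # out-of-band management
    if 1 <= vpn_id <= 65530:    # assumed upper bound; check your platform
        return "service"        # user/LAN traffic
    raise ValueError(f"VPN {vpn_id} is outside the assumed valid range")


for vpn in (0, 1, 10, 512):
    print(vpn, classify_vpn(vpn))
```

A check like this is handy in config-generation scripts, where accidentally placing a LAN interface in VPN 0 or VPN 512 is an easy mistake to make.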
---
## **CLI Configuration Deep Dive (VPN 0)**
Let's expand on your earlier config with more detail. The examples below use vEdge (Viptela OS) syntax:
### **1. Physical Interface in VPN 0**
```bash
vpn 0
 interface ge0/1
  ip address 192.168.1.2/24   # Branch 1 (primary WAN)
  tunnel-interface            # Marks this interface as an SD-WAN transport (creates a TLOC)
   color private1             # Identifies the transport type; policies match on it
   allow-service all          # Lab shortcut: accepts all local services (SSH, NETCONF, ICMP, ...)
  exit
  no shutdown
 exit
exit
```
#### **Explanation:**
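- `color private1` (or `biz-internet`, `metro-ethernet`, etc.) identifies the transport type so that policies can match on it.
- `allow-service` controls which services addressed to the router itself (for example DHCP, DNS, ICMP, SSH, NETCONF) are accepted on the transport interface. DTLS/TLS control connections and IPsec are permitted regardless of this setting, so `all` is a lab convenience; in production, allow only what you need.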
---
### **2. Tunnel Interface (Overlay) in VPN 0**
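On a vEdge there is no separate `Tunnel1` interface to configure. The overlay settings live in the `tunnel-interface` block of the physical WAN interface:
```bash
vpn 0
 interface ge0/1
  tunnel-interface
   encapsulation ipsec   # Required for an encrypted overlay (gre is the unencrypted alternative)
   color private1        # Same tunnel-interface block as in step 1
  exit
  no shutdown
 exit
exit
```
#### **Why Tunnel in VPN 0?**
- The tunnel endpoint carries **control traffic** (DTLS to controllers) and **data-plane traffic** (IPsec to other edges).
- **No user data** passes here in the clear; user packets cross VPN 0 only after SD-WAN encapsulation.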
---
### **3. Static Route (For Controller Reachability)**
```bash
vpn 0
 ip route 0.0.0.0/0 192.168.1.1       # Default route (WAN gateway)
 # Optional: host route for the controller if you are not using a default route
 ip route 203.0.113.1/32 192.168.1.1  # vBond controller
exit
```
#### **Critical Checks:**
- Can the router ping the vBond IP? (`ping vpn 0 203.0.113.1`)
- Is the route present? (`show ip routes vpn 0`)
---
### **4. Verification Commands**
```bash
show control connections   # Are vSmart/vManage connections up? (IOS XE SD-WAN: show sdwan control connections)
show bfd sessions          # Are IPsec tunnels to other edges up? (IOS XE SD-WAN: show sdwan bfd sessions)
show interface vpn 0       # Are the WAN interfaces operational?
show ip routes vpn 0       # Are WAN routes correct?
```
---
## **Common Mistakes in VPN 0 Setup**
1. **Misconfigured `allow-service`**: It does not gate DTLS or IPsec, but disabling defaults such as DNS or DHCP can break vBond name resolution or address assignment.
2. **Wrong `color`**: Policies won't match (e.g., `private1` vs `mpls`).
3. **No route to vBond**: Device can't join the overlay.
4. **Firewall blocking UDP/12346 (DTLS)**: SD-WAN control won't establish.
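
For the firewall item above, it helps to know that an edge router does not stay on a single port. As a rough sketch, assuming the commonly documented defaults (base port 12346, port hopping in steps of 20 across five ports, and an optional `port-offset` of 0 to 19), the UDP source ports to allow can be listed like this. The function name is hypothetical; verify the values against the firewall-port guidance for your release.

```python
BASE_PORT = 12346   # default DTLS base port
HOP_STEP = 20       # assumed port-hopping increment
HOP_COUNT = 5       # assumed number of ports tried


def edge_dtls_ports(port_offset: int = 0) -> list[int]:
    """List the UDP ports an edge router may use for DTLS/IPsec (assumed defaults)."""
    if not 0 <= port_offset <= 19:
        raise ValueError("port-offset is expected to be between 0 and 19")
    return [BASE_PORT + port_offset + i * HOP_STEP for i in range(HOP_COUNT)]


print(edge_dtls_ports())    # [12346, 12366, 12386, 12406, 12426]
print(edge_dtls_ports(1))   # [12347, 12367, 12387, 12407, 12427]
```

Opening only UDP/12346 can therefore work at first and fail later, once the router hops to the next port.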
---
## **Next Steps**
Once VPN 0 is working (controller connectivity is UP), we can move to:
- **VPN 1 (Data VRF)**: Where user traffic lives.
- **Policies**: Centralized rules for traffic steering, QoS, etc.
Would you like to dive deeper into any of these areas?
---
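Absolutely! Let's break down **SD-WAN's three planes** in detail, covering protocols, traffic flows, security, and real-world deployment considerations. (Cisco's own documentation also names a fourth, the orchestration plane, for vBond; it is grouped with management here.)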
---
## **1. SD-WAN Management Plane**
### **Purpose:**
- **Centralized device provisioning, monitoring, and policy enforcement.**
- Handles **initial bootstrap, config push, telemetry, and analytics**.
### **Key Components:**
| Component | Protocol | Port | Description |
|-----------|----------|------|-------------|
| **vManage** | HTTPS (WebUI) | TCP/443 | GUI/API for centralized control. |
| **vBond** | DTLS | UDP/12346 | Orchestrator for device authentication and controller discovery. |
| **Zero-Touch Provisioning (ZTP)** | DHCP/HTTPS | - | Auto-configures devices out-of-the-box. |
### **Traffic Flow:**
1. **Onboarding:**
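- Device contacts **vBond** (DTLS), which authenticates it and returns the addresses of **vManage** and **vSmart**.
- Device then builds DTLS/TLS control connections to both; vManage pushes the configuration over NETCONF inside that connection.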
2. **Ongoing Management:**
- Devices send **telemetry** (metrics, logs) to vManage.
- Policies (security, routing) are pushed **from vManage**.
### **Security Considerations:**
✔ **Use VPN 512** for out-of-band management (keeps management traffic out of the transport and service VPNs).
✔ **Certificate-based mutual authentication** (DTLS/TLS) for device-vManage communication.
✔ **Role-Based Access Control (RBAC)** in vManage.
---
## **2. SD-WAN Control Plane**
### **Purpose:**
- **Distributes reachability info (routes, policies, topology)** between devices.
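- **Influences path selection** through OMP best-path and centralized policy (the loss, latency, and jitter measurements themselves come from BFD probes in the data plane).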
### **Key Protocols:**
| Protocol | Function | Port |
|----------|----------|------|
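| **OMP (Overlay Management Protocol)** | Advertises routes, TLOCs, and service routes; carries policies from vSmart. | Runs inside the DTLS/TLS control connection (UDP/12346 by default) |
| **BGP (optional)** | Legacy WAN integration. | TCP/179 |
| **TLOC Extension** | Lets a router at a dual-router site use a transport attached to its neighbor. | - |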
### **How OMP Works:**
1. **vSmart controllers** act as **route reflectors** for OMP.
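2. **Edge devices (vEdges)** advertise:
- **OMP routes** (prefixes learned from the service/LAN side).
- **TLOCs** (tunnel endpoints, identified by system IP, color, and encapsulation).
- **Service routes** (e.g., a firewall reachable at a site).
3. **vSmart reflects** this info to the other edges and applies centralized policy (e.g., "prefer MPLS for VoIP") along the way.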
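
The reflection behavior can be modeled in a few lines. This is a toy sketch, not Cisco's implementation: `Tloc`, `OmpRoute`, and `VSmart` are hypothetical names, a TLOC is modeled as the (system IP, color, encapsulation) tuple, and policy processing is left out entirely.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Tloc:
    system_ip: str
    color: str
    encap: str = "ipsec"


@dataclass(frozen=True)
class OmpRoute:
    prefix: str
    tloc: Tloc
    origin: str  # name of the edge that advertised the route


class VSmart:
    """Toy route reflector: stores routes and copies them to every other edge."""

    def __init__(self):
        self.routes = set()
        self.edges = {}  # edge name -> list of routes received from vSmart

    def register(self, edge_name):
        # A newly connected edge receives everything learned so far from other edges.
        self.edges[edge_name] = [r for r in self.routes if r.origin != edge_name]

    def advertise(self, route):
        self.routes.add(route)
        for name, table in self.edges.items():
            if name != route.origin and route not in table:
                table.append(route)


vsmart = VSmart()
vsmart.register("branch1")
vsmart.register("branch2")
vsmart.advertise(OmpRoute("10.1.1.0/24", Tloc("1.1.1.1", "mpls"), origin="branch1"))
print(vsmart.edges["branch2"])  # branch2 learns branch1's prefix and TLOC
print(vsmart.edges["branch1"])  # the originator does not get its own route back
```

The point of the model is that edges never peer with each other for routing; every prefix and TLOC they know about arrived through the controller.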
### **Example OMP Route Advertisement:**
```bash
vEdge# show omp routes
# Simplified, illustrative output
RECEIVED ROUTES:
Prefix         TLOC IP        Color          Preference
10.1.1.0/24    203.0.113.1    mpls           100
10.1.1.0/24    198.51.100.1   biz-internet   50
```
*(MPLS is preferred over Internet due to higher preference.)*
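
The preference comparison in that example can be sketched as follows. The data mirrors the simplified table above; `best_routes` is a hypothetical helper, and real OMP best-path selection applies further tie-breakers that are not modeled here.

```python
routes = [
    {"prefix": "10.1.1.0/24", "tloc_ip": "203.0.113.1", "color": "mpls", "preference": 100},
    {"prefix": "10.1.1.0/24", "tloc_ip": "198.51.100.1", "color": "biz-internet", "preference": 50},
]


def best_routes(candidates):
    """Keep, per prefix, the route with the highest preference value."""
    best = {}
    for route in candidates:
        current = best.get(route["prefix"])
        if current is None or route["preference"] > current["preference"]:
            best[route["prefix"]] = route
    return best


winner = best_routes(routes)["10.1.1.0/24"]
print(winner["color"], winner["preference"])  # mpls 100
```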
### **Security Considerations:**
✔ **DTLS encryption** for OMP (no cleartext control traffic!).
✔ **Control-plane policing (CoPP)** to prevent floods.
✔ **Private WAN links (MPLS)** for critical control traffic.
---
## **3. SD-WAN Data Plane**
### **Purpose:**
- **Forwards actual user traffic** (VoIP, video, web) over optimal paths.
### **Key Technologies:**
| Technology | Role |
|------------|------|
| **IPsec/GRE** | Encrypted tunnels between edges. |
| **TLOC (Transport Locator)** | Logical tunnel endpoint, identified by system IP, color, and encapsulation. |
| **Application-Aware Routing (AAR)** | Dynamically switches paths based on SLA. |
### **Data Flow Example:**
1. **Traffic arrives at vEdge:**
- Classified via **DPI (Deep Packet Inspection)**.
- Tagged with **QoS markings (DSCP)**.
2. **Path Selection:**
- Checks **OMP-learned TLOCs** and **SLA metrics**.
- Chooses best path (e.g., MPLS for VoIP, Internet for web).
3. **Encapsulation:**
- Wrapped in **IPsec (ESP)** or **GRE**.
- Sent to peer vEdge via **WAN (MPLS/Internet/5G)**.
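
The path-selection step can be illustrated with a small sketch. Everything here is an assumption for illustration: the class names, the SLA thresholds, and the per-tunnel measurements are made up, and real application-aware routing has more fallback options than "first compliant tunnel".

```python
from dataclasses import dataclass


@dataclass
class TunnelStats:
    color: str
    loss_pct: float
    latency_ms: float
    jitter_ms: float


@dataclass
class SlaClass:
    name: str
    max_loss_pct: float
    max_latency_ms: float
    max_jitter_ms: float


def compliant_tunnels(sla, tunnels):
    """Return the tunnels whose measured loss, latency, and jitter meet the SLA."""
    return [
        t for t in tunnels
        if t.loss_pct <= sla.max_loss_pct
        and t.latency_ms <= sla.max_latency_ms
        and t.jitter_ms <= sla.max_jitter_ms
    ]


def pick_tunnel(sla, tunnels, preferred_color):
    """Prefer the requested color if it is compliant, else any compliant tunnel."""
    candidates = compliant_tunnels(sla, tunnels)
    for tunnel in candidates:
        if tunnel.color == preferred_color:
            return tunnel
    return candidates[0] if candidates else None


voice = SlaClass("voice", max_loss_pct=1.0, max_latency_ms=150.0, max_jitter_ms=30.0)
tunnels = [
    TunnelStats("mpls", loss_pct=0.1, latency_ms=40.0, jitter_ms=5.0),
    TunnelStats("biz-internet", loss_pct=2.5, latency_ms=80.0, jitter_ms=40.0),
]
print(pick_tunnel(voice, tunnels, preferred_color="biz-internet").color)  # mpls
```

Here the Internet tunnel is preferred by policy but violates the loss and jitter thresholds, so the voice traffic is steered onto MPLS instead.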
### **Packet Walkthrough:**
1. **Original Packet:**
```
SRC: 10.1.1.100 (LAN) | DST: 8.8.8.8 (Internet)
```
2. **After SD-WAN Processing:**
```
[Outer IP][UDP][ESP][VPN label][Original Packet]
SRC: 203.0.113.1 (vEdge Public IP)
DST: 198.51.100.2 (Peer vEdge Public IP)
```
### **Security Considerations:**
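✔ **IPsec (AES-256-GCM)** for all tunnels; keys are distributed through vSmart over OMP rather than negotiated with IKE.
✔ **Zone-Based Firewall** on vEdges.
✔ **SLA-based path steering** away from lossy or high-jitter links (this improves resilience; it is not a DDoS defense by itself).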
---
## **Summary Table: SD-WAN Planes Compared**
| Plane | Components | Protocols | Security Measures |
|-------|------------|-----------|------------------|
| **Management** | vManage, vBond | HTTPS, DTLS | VPN 512, certificate authentication, RBAC |
| **Control** | vSmart, OMP | DTLS/OMP, BGP | CoPP, DTLS, Private WAN |
| **Data** | vEdges, TLOCs | IPsec, GRE | IPsec, AAR, QoS |
---
## **Key Takeaways**
1. **Management Plane** = “The brain” (vManage pushes configs).
2. **Control Plane** = “The nervous system” (OMP shares routes/policies).
3. **Data Plane** = “The muscle” (IPsec tunnels move packets).
### **Critical Best Practices:**
- **Isolate management (VPN 512 or a private out-of-band network).**
- **Encrypt control traffic (OMP over DTLS).**
- **Monitor SLA metrics for dynamic failover.**
Would you like a deep dive into **troubleshooting OMP routes** or **IPsec tunnel debugging**? Let me know! 🔍
---
Yes! You're absolutely right to categorize them this way. In modern networking (especially with overlay technologies like VXLAN, SD-WAN, and MPLS), we deal with **three distinct planes**, each serving a critical role:
### **1. Management Plane**
- **Purpose:** Controls **device access and monitoring** (SSH, SNMP, HTTPS, syslog, etc.).
- **Key Traits:**
- Should be **isolated** (dedicated management VRF, OOB management like iLO/DRAC).
- Often runs over **dedicated interfaces or VRFs**.
- **Example:**
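```bash
vrf definition MGMT
 rd 65000:999
 ! The IPv4 address family must be enabled before the interface can hold an IPv4 address
 address-family ipv4
 exit-address-family
!
interface GigabitEthernet0/0
 vrf forwarding MGMT
 ip address 192.168.1.1 255.255.255.0
```
*(A dedicated management VRF keeps management traffic out of the routing tables used by the data and control planes.)*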
### **2. Control Plane**
- **Purpose:** Handles **protocols that build network intelligence** (BGP, OSPF, VXLAN EVPN, SD-WAN OMP, STP, LACP, etc.).
- **Key Traits:**
- **Distributes reachability info** (routes, tunnels, topology).
- Runs on the **CPU (software-based)** and is vulnerable to floods (e.g., BGP attacks).
- Can be placed in a **separate VRF** (but not the management VRF).
- **Example:**
```bash
vrf definition CONTROL_PLANE
 rd 65000:100
 address-family ipv4
 exit-address-family
!
router bgp 65000
 address-family ipv4 vrf CONTROL_PLANE
  neighbor 10.0.0.2 remote-as 65000
  neighbor 10.0.0.2 activate
```
*(BGP in a VRF keeps control traffic logically isolated.)*
### **3. Data Plane (Forwarding Plane)**
- **Purpose:** **Moves user traffic** (packets/frames) at **line rate (hardware-accelerated)**.
- **Key Traits:**
- **ASIC/switch-chip driven** (not CPU).
- **Doesn't compute routes or tunnels**; it only forwards based on the FIB/TCAM entries the control plane installed.
- **Examples:**
- VXLAN data traffic (UDP 4789).
- MPLS-labeled packets.
- SD-WAN data flows.
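
To make "forwards based on the FIB" concrete, here is a minimal longest-prefix-match lookup. It is a sketch only: the `Fib` class and the next-hop addresses are hypothetical, and real hardware uses TCAM or trie structures rather than a linear scan.

```python
import ipaddress


class Fib:
    """Toy forwarding table: the control plane installs entries, the data plane looks them up."""

    def __init__(self):
        self._entries = []  # list of (network, next_hop)

    def install(self, prefix, next_hop):
        self._entries.append((ipaddress.ip_network(prefix), next_hop))

    def lookup(self, destination):
        addr = ipaddress.ip_address(destination)
        matches = [(net, hop) for net, hop in self._entries if addr in net]
        if not matches:
            return None  # no route: the packet would be dropped
        # Longest prefix wins.
        return max(matches, key=lambda entry: entry[0].prefixlen)[1]


fib = Fib()
fib.install("0.0.0.0/0", "192.168.1.1")
fib.install("10.1.0.0/16", "10.0.0.2")
fib.install("10.1.1.0/24", "10.0.0.3")

print(fib.lookup("10.1.1.100"))  # 10.0.0.3 (most specific /24)
print(fib.lookup("10.1.2.1"))    # 10.0.0.2 (falls back to the /16)
print(fib.lookup("8.8.8.8"))     # 192.168.1.1 (default route)
```

The lookup never asks how an entry got there, which is the separation the section describes: BGP, OMP, or a static route all end up as the same kind of FIB entry.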
---
### **Why This Separation Matters**
| Plane | Runs On | Isolation Needed? | Risks if Compromised |
|-------|--------|-------------------|----------------------|
| **Management** | CPU | **Yes (management VRF/OOB)** | Total device takeover |
| **Control** | CPU | **Yes (VRF/CoPP)** | Network meltdown (BGP hijacks, loops) |
| **Data** | ASIC | **No (but ACLs help)** | Performance drops (DDoS), but no config access |
---
### **Common Misconceptions**
1. **"Control Plane = Management Plane"** → **No!**
- **Control Plane:** BGP, OSPF, VXLAN EVPN.
- **Management Plane:** SSH, SNMP.
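- *(They're both CPU-based but serve different purposes.)*
2. **"The Management VRF Can Carry BGP/VXLAN"** → **It shouldn't.**
- Keep the management VRF for **management traffic only**.
- BGP/VXLAN go in the global table or **normal VRFs** (or a dedicated control-plane VRF). Strictly speaking, Cisco's "front-door VRF" is the VRF that holds a tunnel's transport-facing interface, so a true FVRF does carry underlay routing.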
3. **"Data Plane Needs a VRF"** → **Usually No.**
- Data traffic follows the **FIB** (built by the control plane).
- VRFs for data are **tenant isolation** (MPLS VPNs, multi-tenancy).
---
### **Real-World Use Cases**
1. **SD-WAN**
- **Management:** vManage (HTTPS).
- **Control:** OMP (Overlay Management Protocol).
- **Data:** Encrypted tunnels (IPsec/GRE).
2. **VXLAN EVPN**
- **Management:** SSH to switches.
- **Control:** BGP EVPN (MAC/IP routing).
- **Data:** VXLAN-encapsulated traffic.
3. **Service Provider MPLS**
- **Management:** TACACS+ for routers.
- **Control:** LDP/RSVP (label distribution).
- **Data:** Label-switched packets.
---
### **Key Takeaways**
1. **Management Plane** = Your **remote admin access** (management VRF/OOB).
2. **Control Plane** = **Protocols that build the network** (BGP, EVPN, OSPF).
3. **Data Plane** = **Raw packet forwarding** (ASIC-driven, no intelligence).
### **Final Thought**
The industry's failure to **physically separate all three planes** (like servers do with iLO) is a security flaw. But until vendors fix it:
- **Use a dedicated management VRF.**
- **Use VRFs/CoPP for control-plane isolation.**
- **Trust ASICs for the data plane.**
Would you like a deep dive into **protecting the control plane** (CoPP, BGP hardening, etc.)? I can break it down. 🔐
---
### **Front-Door VRF (FVRF) Explained (Using Cisco Gear)**
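This section uses **Front-Door VRF (FVRF)** loosely for the pattern of placing the **management plane** in its own VRF, separate from the **data plane**, on network devices (routers, switches, firewalls). Strictly, Cisco uses the term FVRF for the VRF that holds a tunnel's transport-facing (underlay) interface, as in DMVPN or FlexVPN; the pattern described here is more commonly called a management VRF (for example, `Mgmt-vrf` or `Mgmt-intf` on IOS XE platforms). Either way the mechanism is the same: the management interface (SSH, SNMP, HTTPS, etc.) sits in a separate Virtual Routing and Forwarding (VRF) instance, isolated from the default routing table.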
---
## **Why Use Front-Door VRF?**
1. **Security:** Reduces the exposure of management services to attacks arriving through data-plane interfaces.
2. **Isolation:** Ensures management traffic doesn't mix with production traffic.
3. **Multi-Tenancy:** Useful in service provider environments where management traffic must be segregated per customer.
4. **Simplified Routing:** Avoids route conflicts between management and data networks.
---
## **How FVRF Works**
- The **management interface** (for example `GigabitEthernet0/0`, or a dedicated `Mgmt0` port on platforms that have one) is assigned to a dedicated VRF (e.g., `MGMT-VRF`).
- All management traffic (SSH, SNMP, etc.) must go through this VRF.
- The data plane (regular traffic) uses the **default global routing table** or other VRFs.
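
The isolation comes from each VRF having its own routing table. The sketch below models that with one table per VRF name; the `Router` class is hypothetical, and the addresses simply reuse the ones from the configuration examples in this section.

```python
import ipaddress


class Router:
    """Toy router with one independent routing table per VRF."""

    def __init__(self):
        self.tables = {"global": []}

    def add_route(self, prefix, next_hop, vrf="global"):
        self.tables.setdefault(vrf, []).append((ipaddress.ip_network(prefix), next_hop))

    def lookup(self, destination, vrf="global"):
        addr = ipaddress.ip_address(destination)
        matches = [(net, hop) for net, hop in self.tables.get(vrf, []) if addr in net]
        if not matches:
            return None
        return max(matches, key=lambda entry: entry[0].prefixlen)[1]


router = Router()
router.add_route("0.0.0.0/0", "192.168.1.254", vrf="MGMT-VRF")  # management default route
router.add_route("10.0.0.0/8", "10.0.0.1")                      # production route, global table

print(router.lookup("10.1.1.1", vrf="MGMT-VRF"))  # 192.168.1.254 (global /8 is invisible here)
print(router.lookup("10.1.1.1"))                  # 10.0.0.1
print(router.lookup("8.8.8.8"))                   # None (MGMT default is not in the global table)
```

The same destination resolves differently depending on which VRF the packet arrived in, which is why a management default route cannot leak into production forwarding.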
---
## **Configuration Example (Cisco IOS-XE / IOS)**
### **1. Create the Management VRF**
```bash
configure terminal
vrf definition MGMT-VRF
 ! Route distinguisher (for uniqueness)
 rd 100:1
 address-family ipv4
 exit-address-family
exit
```
### **2. Assign the Management Interface to the VRF**
```bash
interface GigabitEthernet0/0
 description Management Interface
 vrf forwarding MGMT-VRF
 ip address 192.168.1.1 255.255.255.0
 no shutdown
exit
```
### **3. Configure a Default Route for Management Traffic**
```bash
ip route vrf MGMT-VRF 0.0.0.0 0.0.0.0 192.168.1.254
```
*(Where `192.168.1.254` is the gateway for management traffic.)*
### **4. Enable VRF-Aware Services**
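```bash
! Inbound SSH: vrf-also makes the access-class apply to sessions arriving on VRF interfaces
ip access-list standard MGMT-ACCESS
 permit 192.168.1.0 0.0.0.255
line vty 0 4
 transport input ssh
 access-class MGMT-ACCESS in vrf-also
exit
! Outbound services: point each one at the management VRF
! (192.168.1.50 and 192.168.1.51 are placeholder server addresses)
logging host 192.168.1.50 vrf MGMT-VRF
ntp server vrf MGMT-VRF 192.168.1.51
```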
---
## **Verification**
- Check VRF routing table:
```bash
show ip route vrf MGMT-VRF
```
- Verify interface assignment:
```bash
show vrf brief
```
- Test connectivity:
```bash
ping vrf MGMT-VRF 192.168.1.254
```
---
## **Key Considerations**
- **NTP & DNS:** If management relies on NTP/DNS, ensure they are reachable via the FVRF.
- **Backup Access:** Always maintain an alternative access method (console) in case of misconfiguration.
- **Compatibility:** Some older Cisco devices may not support all VRF-aware services.
---
### **Conclusion**
Isolating management in its own VRF is a best practice for securing management traffic in Cisco environments. With management interfaces in a separate VRF, you reduce the attack surface and limit how far a data-plane vulnerability can reach toward management services.
Would you like a more advanced example (e.g., with NAT or ACLs)? Let me know!
---