Update tech_docs/networking/sdwan_primer.md
@@ -1,125 +1,883 @@
|
||||
Here’s the **20% of SD-WAN that covers 80% of what you need to know**, with a focus on **practical knowledge for senior network roles** and seamless integration with your IPSec expertise:
|
||||
Your collection of documents on SD-WAN is exceptionally thorough and well-structured. I've compiled them into a single, comprehensive draft, incorporating the refinements and addressing the inconsistencies we've discussed, particularly around VPN 0, VPN 512, and the Front-Door VRF concept.
|
||||
|
||||
---
|
||||
I've aimed to create a cohesive flow, starting with the "Top 1% Mindset" to set the stage, moving into the crash course for foundational understanding, then diving deep into TLOCs and the three planes, and finally, detailing key configurations and troubleshooting.
|
||||
|
||||
Here is the complete draft:
|
||||
|
||||
-----
|
||||
|
||||
# Mastering SD-WAN: From Fundamentals to the Top 1% Mindset
|
||||
|
||||
## The Top 1% Mindset
|
||||
|
||||
You don’t just deploy SD-WAN—you **orchestrate** it.
|
||||
You think in **abstractions** (colors, TLOCs, VPNs) not hardware.
|
||||
You troubleshoot like a surgeon—control plane first, then data plane, then app logic.
|
||||
|
||||
**Example:**
|
||||
Problem: VoIP calls drop but O365 works.
|
||||
**Top 1% Debug:**
|
||||
|
||||
* Check control connections and BFD sessions (are the tunnels actually up and stable?).
* Verify TLOC preferences (is LTE taking over incorrectly?).
* Inspect the `app-route` policy (is VoIP pinned to MPLS while application-aware routing moves it off an SLA-violating path?).
* Drill into `show sdwan app-route stats` (is jitter spiking on broadband?).
|
||||
|
||||
**Final Thought**
|
||||
Most SD-WAN "engineers" just click through vManage. The **real pros** know:
|
||||
|
||||
* Transport independence isn’t automatic—it’s designed.
|
||||
* Policies aren’t rules—they’re a logic flow.
|
||||
* Troubleshooting isn’t guessing—it’s methodical dissection.
|
||||
|
||||
You’re asking the right questions. Now go break (then fix) some TLOCs. 🚀
|
||||
(And yes, we both know Cisco’s docs don’t explain this stuff clearly—that’s why the top 1% reverse-engineer it.)
|
||||
|
||||
-----
|
||||
|
||||
## SD-WAN Crash Course: The 20% That Matters
|
||||
|
||||
|
||||
**Goal:** Understand **core SD-WAN concepts**, how they differ from traditional WAN, and how they integrate with IPSec.
|
||||
|
||||
---
|
||||
### 1\. SD-WAN vs Traditional WAN
|
||||
|
||||
|
||||
| **Feature** | **Traditional WAN (MPLS/VPN)** | **SD-WAN** |
| :---------- | :----------------------------- | :--------- |
| **Cost** | Expensive (MPLS circuits) | Cheaper (uses Internet + broadband) |
| **Agility** | Manual config changes | Centralized, automated policies |
| **Performance** | Predictable but rigid | Dynamic path selection (jitter/loss-aware) |
| **Security** | Relies on IPSec/MPLS | Built-in encryption (IPSec, TLS) |
| **Topology** | Hub-and-spoke | Any-to-any, mesh |
|
||||
|
||||
**Key Takeaway:**

* SD-WAN **decouples the control plane from the forwarding hardware**, so traffic can be steered dynamically over **any transport (MPLS, LTE, broadband)**.
|
||||
|
||||
|
||||
### 2\. SD-WAN Core Components
|
||||
|
||||
|
||||
**(1) Edge Devices (CPE)**
|
||||
|
||||
|
||||
* e.g., Cisco vEdge, FortiGate, VeloCloud
|
||||
* Sit at branch offices, apply policies, and encrypt traffic.
|
||||
|
||||
|
||||
**(2) Orchestrator (Controller)**
|
||||
|
||||
|
||||
* e.g., Cisco vManage, VMware Orchestrator
|
||||
* **Centralized policy management** (no CLI needed\!).
|
||||
|
||||
|
||||
**(3) Overlay Tunnels**
|
||||
|
||||
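* **Overlay tunnels** between edges: IPSec-encrypted by default; GRE is unencrypted and usually limited to private transports; DTLS/TLS protects the control connections to the controllers.
* Uses **TLOC (Transport Locator)** = System IP + Color (e.g., `biz-internet`, `mpls`) + Encapsulation.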
|
||||
|
||||
**(4) Underlay Transport**
|
||||
|
||||
* **Any WAN link**: MPLS, Internet, LTE, 5G.
|
||||
|
||||
### 3\. How SD-WAN Works (The 80% You Need)
|
||||
|
||||
**(1) Path Selection**
|
||||
|
||||
* **Dynamic multi-path steering**: Chooses best path based on:
|
||||
* **Application SLA** (e.g., VoIP → low latency).
|
||||
* **Real-time metrics** (jitter, packet loss, latency).
|
||||
|
||||
**Example Policy:**

```plaintext
IF (Application == VoIP) AND (Latency > 50ms) → SWITCH to backup link
```
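The rule above can be sketched in a few lines of Python. This is only an illustration of SLA-driven path selection, not how any vendor implements it: the `PathMetrics` type, the `pick_path` function, and the sample measurements are all hypothetical, and the thresholds are taken from the figures used in this primer (50 ms latency, 30 ms jitter, 1% loss).

```python
from dataclasses import dataclass


@dataclass
class PathMetrics:
    """Hypothetical per-path measurements, e.g. derived from probe traffic."""
    name: str
    latency_ms: float
    loss_pct: float
    jitter_ms: float


def pick_path(paths, max_latency_ms=50.0, max_loss_pct=1.0, max_jitter_ms=30.0):
    """Return the first path (in preference order) that meets the SLA.

    If no path is compliant, fall back to the lowest-latency path.
    """
    for path in paths:
        if (path.latency_ms <= max_latency_ms
                and path.loss_pct <= max_loss_pct
                and path.jitter_ms <= max_jitter_ms):
            return path
    return min(paths, key=lambda p: p.latency_ms)


# Paths are listed in order of preference: MPLS first, broadband as backup.
paths = [
    PathMetrics("mpls", latency_ms=72.0, loss_pct=0.1, jitter_ms=4.0),
    PathMetrics("biz-internet", latency_ms=31.0, loss_pct=0.3, jitter_ms=9.0),
]
print(pick_path(paths).name)  # biz-internet (MPLS exceeds the 50 ms latency limit)
```

Keeping the preference order separate from the SLA check mirrors the policy wording: the preferred link wins until it violates the SLA, and only then does traffic move.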
|
||||
|
||||
|
||||
**(2) Zero-Touch Provisioning (ZTP)**
|
||||
|
||||
|
||||
* Plug in a device → auto-configures via orchestrator.
|
||||
|
||||
|
||||
**(3) Application-Aware Routing**
|
||||
|
||||
|
||||
* **DPI (Deep Packet Inspection)** identifies apps (e.g., Teams, SAP).
|
||||
* *(Note: While effective, some advanced encryption like TLS 1.3 can limit DPI's visibility, requiring IP-based fallbacks.)*
|
||||
* **QoS prioritization** (VoIP \> YouTube).
|
||||
|
||||
|
||||
**(4) Security Integration**
|
||||
|
||||
|
||||
* **IPSec for all overlays** (mandatory for Internet links).
|
||||
* **Cloud-based firewalls** (e.g., FortiGate, Zscaler).
|
||||
|
||||
---
|
||||
### 4\. SD-WAN + IPSec Integration
|
||||
|
||||
|
||||
* **SD-WAN uses IPSec for secure tunnels** but adds:
|
||||
* **Automated key rotation** (no manual PSK updates).
|
||||
* **Tunnel bonding** (combines multiple links for throughput).
|
||||
|
||||
---
|
||||
**Key Difference:**
|
||||
|
||||
|
||||
* Traditional IPSec VPN = **static tunnels**.
|
||||
* SD-WAN IPSec = **dynamic, SLA-driven tunnels**.
|
||||
|
||||
|
||||
### 5\. SD-WAN Troubleshooting (Top 5 Issues)
|
||||
|
||||
| **Issue** | **Debug Command** | **Fix** |
| :-------- | :---------------- | :------ |
| **Tunnels not coming up** | `show sdwan bfd sessions` (Cisco) | Check underlay reachability |
| **Poor VoIP quality** | `show sdwan app-route stats` | Adjust SLA thresholds |
| **Orchestrator sync failure** | `show sdwan control connections` | Verify certs/connectivity |
| **Traffic taking wrong path** | `show sdwan policy service-path` | Fix application-aware rules |
| **High latency on backup** | `show sdwan interface` | Check for congestion and drops; enable FEC (Forward Error Correction) if the delay comes from loss-driven retransmissions |
|
||||
|
||||
|
||||
### 6\. SD-WAN vs. DMVPN (Common Interview Qs)
|
||||
|
||||
**Q: When would you use SD-WAN over DMVPN?**
|
||||
|
||||
* **SD-WAN**: When you need **application-aware routing + centralized management**.
|
||||
* **DMVPN**: When you need **scalable IPSec tunnels but don’t need SaaS optimization**.
|
||||
|
||||
**Q: Can SD-WAN replace IPSec?**
|
||||
|
||||
* **No\!** SD-WAN **uses** IPSec for encryption but adds intelligence on top.
|
||||
|
||||
### 7\. Lab Practice (Quick Wins)
|
||||
|
||||
1. **Simulate link failure** in GNS3/EVE-NG → Watch SD-WAN switch paths.
|
||||
2. **Prioritize VoIP traffic** over YouTube.
|
||||
3. **Break the orchestrator** → Observe fallback to local policies.
|
||||
|
||||
**CLI Examples (Cisco IOS XE SD-WAN; on Viptela OS vEdges, drop the `sdwan` keyword):**

```bash
show sdwan control connections   # Check controller (vBond/vSmart/vManage) connection status
show sdwan app-route stats       # Verify per-tunnel loss, latency, and jitter used for path selection
clear sdwan tunnel               # Force tunnel re-establishment
```
|
||||
|
||||
---
|
||||
### 8\. Interview Cheat Sheet
|
||||
|
||||
|
||||
* ✅ **SD-WAN = Automation + Application-Aware Routing + Multiple Underlays**.
|
||||
* ✅ **IPSec is still used, but dynamically managed**.
|
||||
* ✅ **Key metrics: Jitter (\<30ms), Latency (\<150ms), Packet Loss (\<1%)**.
|
||||
* ✅ **Orchestrator is the brain; edges are the muscle**.
|
||||
|
||||
|
||||
-----
|
||||
|
||||
### **Where to Go Next?**
|
||||
1. **Deep dive into your vendor’s SD-WAN** (Cisco, Fortinet, VMware).
|
||||
2. **Learn cloud-integrated SD-WAN** (AWS Transit Gateway, Azure Virtual WAN).
|
||||
3. **Study real-world designs** (e.g., "How SD-WAN replaces MPLS").
|
||||
## The Three Planes of SD-WAN & Modern Networking
|
||||
|
||||
|
||||
In modern networking, especially with overlay technologies like SD-WAN, we deal with **three distinct planes**, each serving a critical role.
|
||||
|
||||
### 1\. Management Plane
|
||||
|
||||
* **Purpose:** Controls **device access and monitoring** (SSH, SNMP, HTTPS, syslog, etc.). It's about how you *interact with* the device.
|
||||
* **Key Components:**
|
||||
| Component | Protocol | Port | Description |
| :-------- | :------- | :--- | :---------- |
| **vManage** | HTTPS (WebUI) | TCP/443 | GUI/API for centralized control and configuration. |
| **vBond** | DTLS | UDP/12346 | Orchestrator that authenticates devices and points them to vSmart and vManage. |
| **Zero-Touch Provisioning (ZTP)** | DHCP/HTTPS | - | Auto-configures devices out-of-the-box. |
|
||||
* **Traffic Flow:**
|
||||
1. **Onboarding:** Device contacts **vBond** (DTLS) and, once authenticated, learns the addresses of **vManage** and **vSmart**. It then builds secure control connections to them and pulls its configuration from vManage.
|
||||
2. **Ongoing Management:** Devices send **telemetry** (metrics, logs) to vManage. Policies (security, routing) are pushed **from vManage**.
|
||||
* **Security Considerations:**
|
||||
* ✔ **Always use isolated VRFs for management traffic** (e.g., a dedicated management VRF on traditional routers, or VPN 512 in SD-WAN for OOB management).
|
||||
* ✔ **Mutual TLS (mTLS)** for device-vManage communication.
|
||||
* ✔ **Role-Based Access Control (RBAC)** in vManage.
|
||||
|
||||
### 2\. Control Plane
|
||||
|
||||
* **Purpose:** Handles **protocols that build network intelligence** (BGP, OSPF, VXLAN EVPN, SD-WAN OMP, STP, LACP, etc.). It's about how the network *learns* its topology and reachability.
|
||||
* **Key Protocols (SD-WAN Specific):**
|
||||
| Protocol | Function | Port |
| :------- | :------- | :--- |
| **OMP (Overlay Management Protocol)** | Advertises routes, TLOCs, policies. | Runs inside the DTLS/TLS control connection (DTLS default: UDP/12346) |
| **BGP (optional)** | Legacy WAN integration or underlay routing. | TCP/179 |
| **TLOC (Transport Locator)** | Not a protocol: an OMP-advertised attribute that maps physical WAN links to logical tunnel endpoints for policy application. | - |
|
||||
* **How OMP Works:**
|
||||
1. **vSmart controllers** act as **route reflectors** for OMP.
|
||||
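2. **Edge devices (vEdges)** advertise:
* **OMP routes** (prefixes learned on the LAN side: connected, static, OSPF, BGP).
* **TLOC routes** (tunnel endpoints, identified by system IP, color, and encapsulation).
* **Service routes** (network services, such as a firewall, reachable at the site).
3. **vSmart applies centralized policy** (e.g., "prefer MPLS for VoIP") and **reflects** the resulting routes to the other edges.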
|
||||
* **Example OMP Route Advertisement:**
|
||||
```bash
vEdge# show omp routes
RECEIVED ROUTES:
Prefix          TLOC IP        Color          Preference
10.1.1.0/24     203.0.113.1    mpls           100
10.1.1.0/24     198.51.100.1   biz-internet   50
```
|
||||
*(MPLS is preferred over Internet due to higher preference.)*
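As a rough illustration of the selection shown in that output, the sketch below keeps, per prefix, the route with the highest preference. It deliberately ignores the other steps of real OMP best-path selection; the `OmpRoute` type and `best_routes` function are hypothetical names invented for this example.

```python
from typing import NamedTuple


class OmpRoute(NamedTuple):
    """One received route, reduced to the columns shown in the table above."""
    prefix: str
    tloc_ip: str
    color: str
    preference: int


def best_routes(routes):
    """Keep the highest-preference route per prefix (higher = more preferred)."""
    best = {}
    for route in routes:
        current = best.get(route.prefix)
        if current is None or route.preference > current.preference:
            best[route.prefix] = route
    return best


received = [
    OmpRoute("10.1.1.0/24", "203.0.113.1", "mpls", 100),
    OmpRoute("10.1.1.0/24", "198.51.100.1", "biz-internet", 50),
]
print(best_routes(received)["10.1.1.0/24"].color)  # mpls
```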
|
||||
* **Key Traits:**
|
||||
* **Distributes reachability info** (routes, tunnels, topology).
|
||||
* Runs on the **CPU (software-based)** and is vulnerable to floods (e.g., BGP attacks).
|
||||
* Can be placed in a **separate VRF**; in SD-WAN the control connections ride the transport VRF (VPN 0), not the management VRF.
|
||||
* **Security Considerations:**
|
||||
* ✔ **DTLS encryption** for OMP (no cleartext control traffic\!).
|
||||
* ✔ **Control-plane policing (CoPP)** to prevent floods.
|
||||
* ✔ **Private WAN links (MPLS)** for critical control traffic.
|
||||
|
||||
### 3\. Data Plane (Forwarding Plane)
|
||||
|
||||
* **Purpose:** **Moves user traffic** (packets/frames) at **line rate (hardware-accelerated)**. It's about *moving* the actual data.
|
||||
* **Key Technologies:**
|
||||
| Technology | Role |
| :--------- | :--- |
| **IPsec/GRE** | Encrypted tunnels between edges. |
| **TLOC (Transport Locator)** | Logical tunnel endpoint (e.g., `public-IP:color`). |
| **Application-Aware Routing (AAR)** | Dynamically switches paths based on SLA. |
|
||||
* **Data Flow Example:**
|
||||
1. **Traffic arrives at vEdge:**
|
||||
* Classified via **DPI (Deep Packet Inspection)**.
|
||||
* Tagged with **QoS markings (DSCP)**.
|
||||
2. **Path Selection:**
|
||||
* Checks **OMP-learned TLOCs** and **SLA metrics**.
|
||||
* Chooses best path (e.g., MPLS for VoIP, Internet for web).
|
||||
3. **Encapsulation:**
|
||||
* Wrapped in **IPsec (ESP/AH)** or **GRE**.
|
||||
* Sent to peer vEdge via **WAN (MPLS/Internet/5G)**.
|
||||
* **Packet Walkthrough (Simplified):**
|
||||
1. **Original Packet:**
|
||||
```
SRC: 10.1.1.100 (LAN) | DST: 8.8.8.8 (Internet)
```
2. **After SD-WAN Processing** (IPsec encapsulation; a GRE tunnel carries a GRE header in place of UDP/ESP):
```
[Outer IP][UDP][ESP][VPN label][Original Packet]
Outer SRC: 203.0.113.1 (vEdge Public IP)
Outer DST: 198.51.100.2 (Peer vEdge Public IP)
```
|
||||
* **Key Traits:**
|
||||
* **ASIC/switch-chip driven** (not CPU).
|
||||
* **Doesn’t care about routes/tunnels**—just forwards based on FIB/TCAM.
|
||||
* **Security Considerations:**
|
||||
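* ✔ **IPsec (AES-256-GCM)** for all tunnels; in Cisco SD-WAN the keys are distributed through vSmart over OMP rather than negotiated with IKE.
* ✔ **Zone-Based Firewall** on vEdges.
* ✔ **SLA-based path avoidance** (steer traffic away from links with high jitter or loss).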
|
||||
|
||||
### Why This Separation Matters
|
||||
|
||||
| Plane | Runs On | Isolation Needed? | Risks if Compromised |
| :---- | :------ | :---------------- | :------------------- |
| **Management** | CPU | **Yes (Dedicated VRF/OOB)** | Total device takeover |
| **Control** | CPU | **Yes (VRF/CoPP)** | Network meltdown (BGP hijacks, loops) |
| **Data** | ASIC | **No (but ACLs help)** | Performance drops (DDoS), but no config access |
|
||||
|
||||
### Common Misconceptions
|
||||
|
||||
1. **"Control Plane = Management Plane"** → **No\!**
|
||||
* **Control Plane:** BGP, OSPF, VXLAN EVPN.
|
||||
* **Management Plane:** SSH, SNMP.
|
||||
* *(They’re both CPU-based but serve different purposes.)*
|
||||
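2. **"A management VRF can carry BGP/VXLAN"** → **No\!**
* A management VRF (e.g., `Mgmt-vrf`, or VPN 512 in SD-WAN) is **only for management** traffic, isolated from data/control.
* BGP/VXLAN go in **normal VRFs** or a dedicated control-plane VRF. A Front-Door VRF (FVRF) is a different tool: it isolates the WAN-facing transport interface, which is the role VPN 0 plays in SD-WAN.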
|
||||
3. **"Data Plane Needs a VRF"** → **Usually No.**
|
||||
* Data traffic follows the **FIB** (built by the control plane).
|
||||
* VRFs for data are typically for **tenant isolation** (e.g., MPLS VPNs, multi-tenancy service VPNs in SD-WAN).
|
||||
|
||||
### Real-World Use Cases
|
||||
|
||||
1. **SD-WAN**
|
||||
* **Management:** vManage (HTTPS).
|
||||
* **Control:** OMP (Overlay Management Protocol).
|
||||
* **Data:** Encrypted tunnels (IPsec/GRE).
|
||||
2. **VXLAN EVPN**
|
||||
* **Management:** SSH to switches.
|
||||
* **Control:** BGP EVPN (MAC/IP routing).
|
||||
* **Data:** VXLAN-encapsulated traffic.
|
||||
3. **Service Provider MPLS**
|
||||
* **Management:** TACACS+ for routers.
|
||||
* **Control:** LDP/RSVP (label distribution).
|
||||
* **Data:** Label-switched packets.
|
||||
|
||||
### Key Takeaways
|
||||
|
||||
1. **Management Plane** = Your **remote admin access** (dedicated VRF/OOB).
|
||||
2. **Control Plane** = **Protocols that build the network** (BGP, EVPN, OSPF, OMP).
|
||||
3. **Data Plane** = **Raw packet forwarding** (ASIC-driven, no intelligence).
|
||||
|
||||
### Final Thought
|
||||
|
||||
The industry’s failure to **physically separate all three planes** (like servers do with iLO) is a security flaw. But until vendors fix it:
|
||||
|
||||
* **Isolate management traffic in dedicated VRFs (a management VRF on traditional routers, or SD-WAN's VPN 512 for OOB).**
|
||||
* **Use VRFs/CoPP for control-plane isolation and protection.**
|
||||
* **Trust ASICs for the data plane.**
|
||||
|
||||
-----
|
||||
|
||||
## Deep Dive: TLOCs (Transport Locators) – The Spine of SD-WAN
|
||||
|
||||
TLOCs are the **make-or-break abstraction** in SD-WAN architectures (especially Cisco Viptela). They’re the glue between the underlay (physical links) and overlay (logical policies). But most engineers only *think* they understand them. Let’s fix that.
|
||||
|
||||
### 1\. TLOCs: The Core Concept
|
||||
|
||||
A **TLOC** is a **logical representation** of a WAN edge router’s transport connection. It’s defined by three key attributes:
|
||||
|
||||
* **System IP** (identifies the router that owns the TLOC; the interface's public/private addresses travel as TLOC attributes).
* **Color** (e.g., `mpls`, `biz-internet`, `lte`).
* **Encapsulation** (IPsec or GRE).
|
||||
|
||||
**Why this matters:**
|
||||
|
||||
* TLOCs **decouple policies from hardware**. You can swap circuits (e.g., change ISP) without rewriting all your rules.
|
||||
* They enable **transport-independent routing**—policies reference colors, not IPs.
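To make the "policies reference colors, not IPs" point concrete, here is a minimal Python sketch. The `Tloc` type, the `preferred_tloc` function, and the color ordering are hypothetical names chosen for illustration; the TLOC is modeled as an identity of system IP, color, and encapsulation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Tloc:
    """Illustrative TLOC identity: (system IP, color, encapsulation)."""
    system_ip: str
    color: str
    encap: str = "ipsec"


def preferred_tloc(tlocs, color_order) -> Optional[Tloc]:
    """Pick a TLOC by walking a color preference list; no IP appears in the policy."""
    by_color = {t.color: t for t in tlocs}
    for color in color_order:
        if color in by_color:
            return by_color[color]
    return None


voip_policy = ["mpls", "biz-internet", "lte"]  # assumed preference order for VoIP

site_tlocs = [Tloc("172.16.100.1", "biz-internet"), Tloc("172.16.100.1", "mpls")]
print(preferred_tloc(site_tlocs, voip_policy).color)  # mpls

# If the MPLS circuit is removed, the same policy still works unchanged.
print(preferred_tloc(site_tlocs[:1], voip_policy).color)  # biz-internet
```

Because the policy is a list of colors, swapping the ISP behind a color changes nothing in the policy itself.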
|
||||
|
||||
### 2\. TLOC Components – What’s Under the Hood
|
||||
|
||||
#### A. TLOC Extended Attributes
|
||||
|
||||
These are **hidden knobs** that influence path selection:
|
||||
|
||||
* **Preference** (similar in spirit to BGP local preference: higher = better).
|
||||
* **Weight** (for load-balancing across equal paths).
|
||||
* **Public/Private IP** (for NAT traversal).
|
||||
* **Site-ID** (prevents misrouting in multi-tenant setups).
|
||||
|
||||
**Example:**
|
||||
|
||||
```plaintext
# Illustrative pseudo-config (not literal CLI) showing the attributes a TLOC carries
tloc {
  ip         = 203.0.113.1
  color      = biz-internet
  encap      = ipsec
  preference = 100   # Higher = more preferred
}
```
|
||||
|
||||
#### B. TLOC Groups
|
||||
|
||||
* **Primary/Backup Groups:** Force deterministic failover (e.g., "Use LTE only if MPLS is down").
|
||||
* **Geographic Groups:** Steer traffic regionally (e.g., "EU branches prefer EU-based TLOCs").
|
||||
|
||||
**Pro Tip:** Misconfigured groups cause **asymmetric routing**, so always validate with `show sdwan omp tlocs`.
|
||||
|
||||
### 3\. TLOC Lifecycle – How They’re Born, Live, and Die
|
||||
|
||||
#### A. TLOC Formation
|
||||
|
||||
* **Discovery:** Router advertises its TLOCs via OMP (Overlay Management Protocol).
|
||||
* **Validation:** BFD (Bidirectional Forwarding Detection) confirms reachability.
|
||||
* **Installation:** TLOC enters the RIB (Routing Information Base) if valid.
|
||||
|
||||
**Critical Check:**
|
||||
|
||||
* `show sdwan omp tlocs` \# Verify TLOC advertisements
|
||||
* `show sdwan bfd sessions` \# Confirm liveliness
|
||||
|
||||
#### B. TLOC States
|
||||
|
||||
* **Up/Active:** BFD is healthy, traffic can flow.
|
||||
* **Down/Dead:** BFD failed, TLOC is pulled from RIB.
|
||||
* **Partial:** One direction works (asymmetric routing risk\!).
|
||||
|
||||
**Debugging:**
|
||||
|
||||
* `show sdwan tloc | include Partial` \# Hunt for flapping TLOCs
|
||||
|
||||
### 4\. TLOC Policies – The Real Power
|
||||
|
||||
#### A. Influencing Path Selection
|
||||
|
||||
* **Route Policy:** Modify TLOC preferences per-application.
|
||||
```plaintext
apply-policy {
  app-route voip {
    tloc = mpls preference 200   # Always prefer MPLS for VoIP
  }
}
```
|
||||
* **Smart TLOC Preemption:** Fail back aggressively (or not).
|
||||
|
||||
#### B. TLOC Affinity
|
||||
|
||||
* **Sticky TLOCs:** Pin flows to a TLOC (e.g., for SIP trunks).
|
||||
* **Load-Balancing:** Distribute across TLOCs with equal weight.
|
||||
|
||||
**Gotcha:** Affinity can conflict with **application-aware routing (AAR)**, which may move a pinned flow when its path violates the SLA. Tune carefully\!
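One way to picture sticky, weight-based load-balancing is hashing each flow onto a weighted list of TLOC colors. The sketch below is an assumption-laden illustration, not the vendor's algorithm: the function name `pick_tloc_for_flow`, the flow-key format, and the weights are all made up for the example.

```python
import hashlib


def pick_tloc_for_flow(flow_key: str, weighted_tlocs: dict) -> str:
    """Map a flow to a TLOC color, proportionally to integer weights.

    The same flow key always lands on the same color (stickiness), as long
    as the set of TLOCs and their weights do not change.
    """
    # Expand weights into slots, e.g. {"mpls": 2, "lte": 1} -> [lte, mpls, mpls]
    slots = [color
             for color, weight in sorted(weighted_tlocs.items())
             for _ in range(weight)]
    digest = hashlib.sha256(flow_key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(slots)
    return slots[index]


weights = {"mpls": 2, "biz-internet": 1}       # hypothetical weights
flow = "10.1.1.100:5060->10.2.2.50:5060/udp"  # hypothetical SIP flow key

first = pick_tloc_for_flow(flow, weights)
second = pick_tloc_for_flow(flow, weights)
print(first == second)  # True: the flow stays pinned to one TLOC
```

Note the caveat in the docstring: when a TLOC disappears, the slot list changes and flows can be remapped, which is one reason affinity and SLA-driven rerouting need to be tuned together.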
|
||||
|
||||
### 5\. TLOC Troubleshooting – The Dark Arts
|
||||
|
||||
#### A. Common TLOC Failures
|
||||
|
||||
* **BFD Flapping** → TLOCs bounce.
|
||||
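* Fix: Tune BFD timers per color, e.g. `bfd color mpls hello-interval 300 multiplier 3` (hello interval 300 ms, multiplier 3, so a tunnel is declared down after roughly 900 ms of missed hellos). On lossy links, relax the timers rather than tightening them.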
|
||||
* **Color Mismatch** → Tunnels don’t form between sites.
* Fix: With `restrict` set, tunnels only form between TLOCs of the same color, so make sure both ends use the identical color keyword.
|
||||
* **NAT Issues** → Private IP leaks.
|
||||
* Fix: Use `tloc-extension public-ip`.
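For the BFD flapping case above, the trade-off between detection speed and stability comes down to simple arithmetic: detection time is the hello interval times the multiplier. The Python sketch below works through it under that assumption; `bfd_detection_ms` and `count_down_events` are hypothetical helper names, and the hello trace is invented.

```python
def bfd_detection_ms(hello_interval_ms: int, multiplier: int) -> int:
    """Time without hellos before the session is declared down."""
    return hello_interval_ms * multiplier


def count_down_events(hello_received, multiplier: int) -> int:
    """Count how often consecutive missed hellos reach the multiplier."""
    misses = 0
    downs = 0
    for ok in hello_received:
        if ok:
            misses = 0
        else:
            misses += 1
            if misses == multiplier:
                downs += 1
    return downs


print(bfd_detection_ms(300, 3))  # 900

# A lossy link: True = hello arrived, False = hello lost.
trace = [True, False, False, True, False, False, False, True]
print(count_down_events(trace, multiplier=2))  # 2
print(count_down_events(trace, multiplier=3))  # 1
```

The same loss pattern produces fewer down events with a larger multiplier, at the cost of slower failure detection.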
|
||||
|
||||
#### B. Advanced Debugging
|
||||
|
||||
* `debug sdwan omp tlocs` \# Watch TLOC advertisements in real-time
|
||||
* `debug sdwan bfd events` \# Catch BFD failures
|
||||
* `show sdwan tloc-history` \# Track TLOC changes over time
|
||||
|
||||
### 6\. TLOC vs. The World
|
||||
|
||||
| Concept | TLOC | Traditional WAN Addressing |
| :------ | :--- | :------------------------- |
| **Addressing** | Logical (color-based) | Physical (IP-based) |
| **Failover** | Fast (BFD + OMP; sub-second only with aggressive BFD timers) | Slow (BGP convergence) |
| **Policies** | Transport-agnostic | Hardcoded to interfaces |
|
||||
|
||||
**Key Takeaway:** TLOCs turn **network plumbing** into **policy-driven intent**.
|
||||
|
||||
**Final Word**
|
||||
Mastering TLOCs means:
|
||||
|
||||
* ✅ You **never** blame "the SD-WAN" for routing issues—you dissect TLOC states.
|
||||
* ✅ You **design for intent** (colors, groups) instead of hacking interface configs.
|
||||
* ✅ You **troubleshoot like a surgeon**—OMP → BFD → TLOC → Policy.
|
||||
|
||||
Now go forth and make TLOCs obey. 🚀
|
||||
(And when Cisco TAC says "it’s a TLOC issue," you’ll know exactly where to look.)
|
||||
|
||||
-----
|
||||
|
||||
## SD-WAN Site ID + Color + Management Subnet Integration Guide
|
||||
|
||||
To build a **scalable, intuitive, and operationally efficient** SD-WAN fabric, we’ll combine:
|
||||
|
||||
1. **Site IDs** (Logical location identifiers)
|
||||
2. **Colors** (Underlay transport identification)
|
||||
3. **Management Subnet** (VRF for OOB/In-band management)
|
||||
|
||||
Here’s how to plan and implement them cohesively:
|
||||
|
||||
### 1\. Hierarchy & Assignment Strategy
|
||||
|
||||
#### A. Site ID + Color + Management Subnet Relationship
|
||||
|
||||
| Component | Purpose | Example Value | Design Tip |
| :-------- | :------ | :------------ | :--------- |
| **Site ID** | Uniquely identifies a branch/DC | `100` (HQ), `200` (Branch) | Use geographic encoding (e.g., `1` = Americas). |
| **Color** | Identifies WAN transport types | `mpls`, `biz-internet`, `lte` | Map each ISP/underlay to one predefined color and document the mapping (e.g., Verizon MPLS → `mpls`). |
| **Mgmt Subnet** | Dedicated subnet for OOB/In-band mgmt | `10.255.100.0/24` (VPN 0 or VPN 512) | Isolate from data VPNs (1-511). |
|
||||
|
||||
#### B. Structured Numbering Example
|
||||
|
||||
**Scenario**: A multinational with:
|
||||
|
||||
* **Region 1 (Americas)**: MPLS + Internet
|
||||
* **Region 2 (EMEA)**: MPLS + LTE
|
||||
|
||||
| Site | Site ID | System IP | Colors (Transport) | Management Subnet |
| :--- | :------ | :-------- | :----------------- | :---------------- |
| **HQ (Dallas)** | `100` | `172.16.100.1` | `mpls`, `biz-internet` | `10.255.100.0/24` (VPN 0) |
| **Branch (NY)** | `101` | `172.16.101.1` | `mpls`, `biz-internet` | `10.255.101.0/24` (VPN 0) |
| **DC (Frankfurt)** | `200` | `172.16.200.1` | `mpls`, `lte` | `10.255.200.0/24` (VPN 0) |
|
||||
|
||||
### 2\. Color Planning Best Practices
|
||||
|
||||
#### A. Standardize Color Naming
|
||||
|
||||
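* Use **descriptive, consistent names**. On Cisco SD-WAN, colors come from a predefined list (`mpls`, `biz-internet`, `public-internet`, `lte`, `private1`, `gold`, and so on), so map each carrier circuit to one of them and record the mapping:
```plaintext
<carrier> <type> → <color>   (e.g., AT&T MPLS → `mpls`, Comcast business Internet → `biz-internet`)
```
* Avoid undocumented, generic labels like `primary` or `secondary` in your mapping (confusing at scale).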
|
||||
|
||||
#### B. Color Redundancy Rules
|
||||
|
||||
* Assign **at least 2 colors per site** (e.g., `mpls` + `internet`).
|
||||
* Use **BFD** for fast failover between colors.
|
||||
|
||||
#### C. Color Mapping to TLOCs
|
||||
|
||||
* Each **color** corresponds to a **TLOC** (Transport Locator).
|
||||
* Example TLOC config:
|
||||
```bash
vEdge(config)# vpn 0 interface ge0/0
 tunnel-interface
  color mpls restrict   # Restrict to MPLS underlay
```
|
||||
|
||||
### 3\. Management Subnet Strategy
|
||||
|
||||
#### A. Key Requirements
|
||||
|
||||
* **Isolation**: Management traffic should be isolated.
|
||||
* **In-band Management:** Typically resides in **VPN 0** (shares the transport VRF with control/data overlay traffic but is logically separate).
|
||||
* **Out-of-Band (OOB) Management:** For dedicated management ports (e.g., `mgmt0` on a vEdge or `GigabitEthernet0` on an IOS XE router), use **VPN 512**. Routes in VPN 512 are **NOT** advertised into the OMP overlay.
|
||||
* **Subnet Size**: `/24` recommended (supports up to 254 devices).
|
||||
|
||||
#### B. Addressing Scheme Example
|
||||
|
||||
For **In-band Management (VPN 0)**:
|
||||
|
||||
```plaintext
10.255.<Site ID>.0/24
Example:
- Site ID 100 → `10.255.100.0/24`
- Site ID 200 → `10.255.200.0/24`
```
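The scheme above is easy to generate and check with Python's standard `ipaddress` module. The sketch below is one possible helper, with `site_plan` as a hypothetical name; it also derives the `172.16.<Site ID>.1` system IP used elsewhere in this guide. Note the limit it exposes: because the site ID is embedded in a single octet, this scheme only works for site IDs 0 to 255.

```python
import ipaddress


def site_plan(site_id: int):
    """Derive the in-band management subnet and system IP from a site ID."""
    if not 0 <= site_id <= 255:
        raise ValueError("this addressing scheme only fits site IDs 0-255")
    mgmt_subnet = ipaddress.ip_network(f"10.255.{site_id}.0/24")
    system_ip = ipaddress.ip_address(f"172.16.{site_id}.1")
    return mgmt_subnet, system_ip


mgmt, system_ip = site_plan(100)
print(mgmt, system_ip)          # 10.255.100.0/24 172.16.100.1
print(mgmt.num_addresses - 2)   # 254 usable host addresses
```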
|
||||
|
||||
For **Out-of-Band Management (VPN 512)**: Use a completely separate, non-overlapping management subnet, typically on a dedicated physical interface.
|
||||
|
||||
**Benefits**:
|
||||
|
||||
* Predictable IPs (easy troubleshooting).
|
||||
* No overlaps with service VPNs.
|
||||
|
||||
#### C. vManage Integration
|
||||
|
||||
* Define management subnets in **vManage Templates**:
|
||||
```bash
device vpn 0
 interface eth0
  ip address 10.255.100.1/24
  tunnel-interface
   color biz-internet restrict
```
|
||||
(For VPN 512, you'd configure a separate interface under `device vpn 512`).
|
||||
|
||||
### 4\. Putting It All Together: Design Checklist
|
||||
|
||||
1. **Site IDs**: Geographic/role-based, unique, documented in IPAM.
|
||||
2. **Colors**: Named after carriers, assigned to TLOCs, redundant.
|
||||
3. **Management Subnet**:
|
||||
* `/24` in VPN 0 for in-band.
|
||||
* `/24` in VPN 512 for OOB (preferred for dedicated management ports).
|
||||
4. **System IPs**: Align with Site ID (e.g., Site ID `100` → `172.16.100.1`).
|
||||
|
||||
### 5\. Common Pitfalls
|
||||
|
||||
❌ **Color Conflicts**: Reusing `mpls` for two different carriers on the same router (give each circuit its own color, e.g., `mpls` and `private1`).
|
||||
❌ **Mgmt Overlaps**: Sharing `10.255.100.0/24` across sites (always subnet per site).
|
||||
❌ **Unstructured Site IDs**: Random numbers (hard to scale beyond 50 sites).
|
||||
❌ **Incorrect VPN for Internet Breakout**: Using VPN 512 for DIA (it's for OOB management). DIA should be in a service VPN or VPN 0.
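Two of these pitfalls (overlapping management subnets and duplicate site IDs) are mechanical enough to catch with a small script before anything is deployed. The following is a minimal sketch, assuming the site plan is kept as a list of dictionaries; `find_plan_conflicts` and the field names are hypothetical.

```python
import ipaddress
from itertools import combinations


def find_plan_conflicts(sites):
    """Report duplicate site IDs and overlapping management subnets."""
    problems = []
    seen_ids = {}
    for site in sites:
        site_id = site["site_id"]
        if site_id in seen_ids:
            problems.append(
                f"duplicate site ID {site_id}: {seen_ids[site_id]} and {site['name']}")
        else:
            seen_ids[site_id] = site["name"]
    for a, b in combinations(sites, 2):
        net_a = ipaddress.ip_network(a["mgmt_subnet"])
        net_b = ipaddress.ip_network(b["mgmt_subnet"])
        if net_a.overlaps(net_b):
            problems.append(f"mgmt overlap: {a['name']} and {b['name']}")
    return problems


plan = [
    {"name": "Dallas", "site_id": 100, "mgmt_subnet": "10.255.100.0/24"},
    {"name": "NY", "site_id": 101, "mgmt_subnet": "10.255.100.0/24"},  # copy-paste error
    {"name": "Frankfurt", "site_id": 200, "mgmt_subnet": "10.255.200.0/24"},
]
print(find_plan_conflicts(plan))  # ['mgmt overlap: Dallas and NY']
```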
|
||||
|
||||
### Final Topology Example
|
||||
|
||||
```plaintext
Site ID: 100 (Dallas HQ)
- System IP: 172.16.100.1
- Colors: mpls, biz-internet
- Mgmt Subnet: 10.255.100.0/24 (VPN 0 for in-band)
- Service VPNs: 10 (LAN), 20 (VoIP)
```
|
||||
|
||||
-----
|
||||
|
||||
## SD-WAN Fabric Bring-Up Essentials
|
||||
|
||||
To **bring up an SD-WAN fabric**, you need to configure key components correctly. Below is a **concise, step-by-step breakdown** of the essentials, along with **critical design considerations**.
|
||||
|
||||
### 1\. Underlay Network (VPN 0 - Transport VRF / Front-Door VRF)
|
||||
|
||||
* **Purpose**: Handles **control-plane traffic** (OMP, DTLS/TLS tunnels between devices) and **encapsulated data-plane traffic**. All physical WAN interfaces that connect to the underlay belong to VPN 0.
|
||||
* **Key Configurations**:
|
||||
* **Interfaces**: Assign WAN interfaces (e.g., MPLS, Internet, LTE) to VPN 0.
|
||||
* **Routing**:
|
||||
* Static routes (for simple setups).
|
||||
* BGP/OSPF (for dynamic underlay routing in larger deployments).
|
||||
* **TLOC Extensions**: Define public/private IPs for tunnel endpoints, along with colors.
|
||||
* **Design Considerations**:
|
||||
* **Dual Underlay**: Use at least **two transport types** (e.g., MPLS + Internet) for redundancy.
|
||||
* **TLOC Preference**: Prioritize the more reliable, unmetered links (e.g., MPLS over LTE).

### 2\. Overlay Network (OMP Routing)

* **Purpose**: Distributes routes and policies across the fabric.
* **Key Configurations**:
    * **OMP (Overlay Management Protocol)**: Advertises routes, TLOCs, and policies between vSmart controllers and edges.
    * **Route Policies**: Control which prefixes are shared (e.g., only corporate LAN routes).
* **Design Considerations**:
    * **Route Aggregation**: Minimize prefixes advertised to vSmart (e.g., summarize branch LANs).
    * **TLOC Redundancy**: Make sure each route is reachable through more than one TLOC so traffic can fail over.

### 3\. Service VPNs (VPN 1-511 and 513-65530)

* **Purpose**: Segments user/data traffic (e.g., corporate LAN, guest Wi-Fi, VoIP). VPN 0 and VPN 512 are reserved; the exact upper limit for service VPN IDs depends on the platform.
* **Key Configurations**:
    * **VRF Creation**: Define VPNs (e.g., `vpn 10` for corporate LAN).
    * **Interface Assignment**: Assign LAN interfaces to the correct VPN.
    * **Route Leaking**: If needed, allow controlled traffic flow between VPNs (via centralized policies).
* **Design Considerations**:
    * **QoS Tagging**: Apply DSCP markings per VPN (e.g., EF for VoIP in `vpn 20`).
    * **Security Policies**: Restrict inter-VPN communication (e.g., guest Wi-Fi in `vpn 30` can’t reach `vpn 10`).
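
The VPN numbering rules are easy to get wrong in templates, so a quick sanity check can help. The Python sketch below is a hypothetical helper, not part of any Cisco tooling. It classifies a VPN ID as transport, management, or service. The `MAX_VPN_ID` value of 65530 is an assumption; the real upper bound differs between platforms and releases.

```python
TRANSPORT_VPN = 0      # underlay / WAN interfaces
MGMT_VPN = 512         # out-of-band management
MAX_VPN_ID = 65530     # assumption: check the limit for your platform


def classify_vpn(vpn_id: int) -> str:
    """Return 'transport', 'management', or 'service' for a VPN ID."""
    if not 0 <= vpn_id <= MAX_VPN_ID:
        raise ValueError(f"VPN ID {vpn_id} is outside 0-{MAX_VPN_ID}")
    if vpn_id == TRANSPORT_VPN:
        return "transport"
    if vpn_id == MGMT_VPN:
        return "management"
    return "service"


def check_lan_vpn(vpn_id: int) -> None:
    """Reject LAN-facing interfaces placed in a reserved VPN."""
    role = classify_vpn(vpn_id)
    if role != "service":
        raise ValueError(f"VPN {vpn_id} is the {role} VPN, not a service VPN")


for vpn in (0, 10, 20, 512, 999):
    print(vpn, classify_vpn(vpn))
```

A check like `check_lan_vpn` would catch the common mistake called out earlier: treating VPN 512 as a place for user or internet traffic.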

### 4\. Internet Breakout

* **Purpose**: Local internet access (DIA) from branches or centralized internet access from a datacenter.
* **Key Configurations**:
    * **NAT & Firewall**: Enable NAT overload (PAT) for private→public IP translation on the egress interface.
    * **Traffic Steering (Data Policy)**: Steer specific traffic (e.g., SaaS apps, guest Wi-Fi) to the local internet path with a centralized data policy that matches the application or prefix.
* **Design Considerations**:
    * **Security**: Apply **ZTNA/Umbrella** or other security services for secure internet access.
    * **Backup Path**: If local DIA fails, fall back to centralized internet via the overlay.
* **Note**: This is typically configured in a **service VPN (e.g., VPN 10, or a dedicated internet VPN like VPN 999)**, or by routing traffic directly out a VPN 0 interface with specific policies and NAT. **VPN 512 is reserved for Out-of-Band Management, not Internet Breakout.**
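
For readers less familiar with NAT overload, the following Python sketch shows the core bookkeeping behind PAT: many private (IP, port) pairs share one public IP and are told apart by the translated source port. It is a toy model under stated assumptions. The `PatTable` class, the port range, and the example addresses are hypothetical, and real devices also track protocol, timers, and connection state.

```python
class PatTable:
    """Toy NAT-overload table: one public IP, one translated port per flow."""

    def __init__(self, public_ip, first_port=1024, last_port=65535):
        self.public_ip = public_ip
        self._free_ports = iter(range(first_port, last_port + 1))
        self._outbound = {}   # (private_ip, private_port) -> public_port
        self._inbound = {}    # public_port -> (private_ip, private_port)

    def translate_out(self, private_ip, private_port):
        """Map a private source to (public_ip, public_port), reusing entries."""
        key = (private_ip, private_port)
        if key not in self._outbound:
            try:
                port = next(self._free_ports)
            except StopIteration:
                raise RuntimeError("PAT port pool exhausted") from None
            self._outbound[key] = port
            self._inbound[port] = key
        return self.public_ip, self._outbound[key]

    def translate_in(self, public_port):
        """Reverse lookup for return traffic; None if no translation exists."""
        return self._inbound.get(public_port)


pat = PatTable("203.0.113.10")
print(pat.translate_out("10.10.0.5", 51000))  # ('203.0.113.10', 1024)
print(pat.translate_out("10.10.0.6", 51000))  # ('203.0.113.10', 1025)
print(pat.translate_in(1025))                 # ('10.10.0.6', 51000)
```

Two hosts using the same source port still get distinct translations, which is why a single egress IP is enough for a whole branch.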

### 5\. Management & Control Plane Connectivity

* **Purpose**: Ensures vEdges can securely connect to controllers (vManage, vSmart, vBond).
* **Key Configurations**:
    * **Controller IPs**: Ensure vEdges can reach vManage/vSmart/vBond over VPN 0.
    * **Certificate Auth**: Use device certificates for secure onboarding.
* **Design Considerations**:
    * **Out-of-Band (OOB) Management (VPN 512)**: Use a separate OOB network with interfaces in VPN 512 for high availability and isolation of management traffic from the overlay.
    * **Geo-Redundancy**: Deploy controllers in multiple regions.

### 6\. Security Policies

* **Purpose**: Enforce traffic rules (e.g., blocking, inspection).
* **Key Configurations**:
    * **Zone-Based Firewall**: Group VPNs or interfaces into zones (e.g., "inside," "outside") and define rules per zone pair.
    * **Application-Aware Policies**: Block high-risk or non-business apps (e.g., Tor, Netflix).
* **Design Considerations**:
    * **Default-Deny**: Start with "deny all," then allow only needed traffic.
    * **IPS/IDS**: Enable for internet-bound traffic.
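
The default-deny idea can be sketched in a few lines of Python. This is a conceptual model only. The zone names, the `RULES` table, and the `is_allowed` helper are hypothetical and do not mirror any firewall API. Anything not explicitly listed for a zone pair is dropped.

```python
# Allowed services per (source zone, destination zone) pair.
# Any pair or service missing from this table is implicitly denied.
RULES = {
    ("inside", "outside"): {"https", "dns"},
    ("guest", "outside"): {"https", "dns"},
    ("inside", "voice"): {"sip", "rtp"},
}


def is_allowed(src_zone: str, dst_zone: str, service: str) -> bool:
    """Default-deny: permit only services listed for the zone pair."""
    return service in RULES.get((src_zone, dst_zone), set())


print(is_allowed("inside", "outside", "https"))  # True
print(is_allowed("guest", "inside", "https"))    # False: no guest->inside pair
print(is_allowed("outside", "inside", "ssh"))    # False: nothing inbound allowed
```

Note that the guest-to-inside case needs no explicit deny rule. The absence of a zone pair is the deny.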

### 7\. High Availability (HA)

* **Design Considerations**:
    * **Dual vSmarts**: Avoid single points of failure for the control plane.
    * **Active/Standby Edges**: Use VRRP/HSRP for LAN-side HA at critical sites.
    * **Cloud Gateway Redundancy**: Deploy redundant cloud gateways for Cloud onRamp (e.g., AWS/Azure).

### Summary Checklist

| **Step** | **Action** | **Critical Design Tip** |
| :------- | :--------- | :---------------------- |
| **1. Underlay** | Configure VPN 0 interfaces & routing | Dual transports (MPLS + Internet) |
| **2. Overlay** | Set up OMP & route policies | Summarize routes to reduce overhead |
| **3. Service VPNs** | Define service VPNs (any ID except 0 and 512) & assign interfaces | Use QoS for VoIP/VC traffic |
| **4. Internet** | Configure DIA in a Service VPN or VPN 0 | Add ZTNA/Umbrella for security |
| **5. Management** | Ensure controllers are reachable via VPN 0 | OOB management (VPN 512) for resiliency |
| **6. Security** | Apply firewall/IPS policies | Default-deny approach |
| **7. HA** | Deploy redundant controllers/edges | Active/standby for critical sites |

-----

## SD-WAN Application-Aware Routing (AAR) with `match app-list`

*Control traffic flows based on applications using vManage policies.*

### 1\. What is `match app-list`?

* **Purpose:** Identifies specific applications (e.g., Zoom, Netflix, VoIP) to steer traffic via policies.
* **Use Cases:**
    * Pin VoIP to the MPLS transport.
    * Block high-risk apps (e.g., Tor).
    * Local internet breakout (DIA) for SaaS apps.

### 2\. How It Works

1. **Application Detection:**
    * Uses **Deep Packet Inspection (DPI)** to identify apps by signature rather than by port alone; encrypted flows are typically classified from metadata such as the TLS SNI.
    * Predefined app lists in vManage (e.g., `VOICE-AND-VIDEO`, `BUSINESS-APPS`).
2. **Policy Matching:**
    * Policies reference `app-list` to trigger actions (e.g., change path, apply QoS).

### 3\. Configuration Steps

#### 3.1 Define an App List in vManage

1. Navigate to: **Configuration \> Policies \> Custom Options \> App-Aware Routing**
2. Create a new app list:

    ```plaintext
    Name: CORPORATE-APPS
    Applications:
    - Microsoft-365
    - Webex-Teams
    - Zoom-Cloud
    ```

#### 3.2 Create a Policy Using `match app-list`

**Example:** *"Break out Microsoft-365 traffic from VPN 10 to the local internet"*

*(Note: VPN 512 is for Out-of-Band Management, not Internet Breakout. Use a service VPN like VPN 10 or route out VPN 0 for DIA. The list name `VPN-10-LIST` is a placeholder, and exact policy syntax can vary by release.)*

```bash
policy
 data-policy MICROSOFT-365-DIA
  vpn-list VPN-10-LIST          ! traffic sourced from service VPN 10
   sequence 10
    match
     app-list CORPORATE-APPS    ! match the app list defined in 3.1
    action accept
     nat use-vpn 0              ! local internet breakout via NAT in VPN 0
     set dscp 46                ! mark for QoS (EF)
   default-action accept        ! leave all other traffic untouched
```

#### 3.3 Apply Policy to Sites

1. Attach the policy to a **Centralized Policy** in vManage and bind it to a site list.
2. Activate the centralized policy so it is pushed to the target sites.

### 4\. Best Practices

#### 4.1 App List Design

* **Group logically:**
    * `VOICE-AND-VIDEO`: Zoom, Webex, MS-Teams.
    * `BUSINESS-CRITICAL`: SAP, Oracle, Salesforce.
* **Avoid overly broad lists** (e.g., "ALL-WEB") to prevent unintended matches.

#### 4.2 Policy Ordering

* Sequences are evaluated in ascending order (**lower number first**), and the first match wins. The example uses a data policy because `drop` is a data-policy action; `VPN-10-LIST` is a placeholder list name.

    ```bash
    policy
     data-policy AAR-POLICY
      vpn-list VPN-10-LIST
       sequence 10
        match
         app-list VOICE-AND-VIDEO
        action accept
         set local-tloc color mpls   ! force MPLS for voice
       sequence 20
        match
         app-list NETFLIX
        action drop                  ! block Netflix
       default-action accept
    ```
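
If the first-match behavior feels abstract, this Python sketch walks through it. It is a simplified model, not how a controller is implemented. The `Sequence` class and `evaluate` function are hypothetical, and the `default_action` parameter is an assumption of the sketch (on a real policy the default action is whatever you configure).

```python
from dataclasses import dataclass


@dataclass
class Sequence:
    number: int
    app_list: set
    action: str


def evaluate(sequences, app, default_action="accept"):
    """Walk sequences in ascending order; the first match decides."""
    for seq in sorted(sequences, key=lambda s: s.number):
        if app in seq.app_list:
            return seq.number, seq.action
    return None, default_action


# Deliberately listed out of order: evaluation order follows the
# sequence number, not the position in the list.
policy = [
    Sequence(20, {"netflix"}, "drop"),
    Sequence(10, {"zoom", "webex"}, "accept via mpls"),
    Sequence(30, {"zoom"}, "drop"),  # shadowed for zoom by sequence 10
]

print(evaluate(policy, "zoom"))     # (10, 'accept via mpls')
print(evaluate(policy, "netflix"))  # (20, 'drop')
print(evaluate(policy, "ssh"))      # (None, 'accept')
```

Sequence 30 shows the classic ordering bug: a rule that can never fire because an earlier sequence already matches the same application.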
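
#### 4.3 SLA-Based Fallback

* Combine with an **SLA class** so traffic switches paths when the preferred transport stops meeting the SLA (path quality is measured by BFD probes). `WEBEX-SLA` is a placeholder name:

    ```bash
    sla-class WEBEX-SLA
     latency 100                     ! milliseconds
    ! inside app-route-policy <name> / vpn-list <list>:
    sequence 10
     match app-list WEBEX
     action sla-class WEBEX-SLA preferred-color mpls   ! prefer MPLS while it meets the SLA
    ```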
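
The selection logic behind SLA-based fallback can be sketched as follows. This is a conceptual Python model under stated assumptions. The `PathStats` and `SlaClass` types, the loss and jitter thresholds, and the "use any available path when nothing is compliant" rule are all hypothetical choices for illustration; actual behavior when no path meets the SLA depends on how the policy is configured.

```python
from dataclasses import dataclass


@dataclass
class PathStats:
    color: str
    latency_ms: float
    loss_pct: float
    jitter_ms: float


@dataclass
class SlaClass:
    latency_ms: float
    loss_pct: float
    jitter_ms: float


def meets_sla(path: PathStats, sla: SlaClass) -> bool:
    """A path is compliant only if every metric is within its threshold."""
    return (path.latency_ms <= sla.latency_ms
            and path.loss_pct <= sla.loss_pct
            and path.jitter_ms <= sla.jitter_ms)


def choose_paths(paths, sla, preferred_color):
    """Preferred color if compliant, else any compliant path, else all paths."""
    compliant = [p for p in paths if meets_sla(p, sla)]
    preferred = [p for p in compliant if p.color == preferred_color]
    if preferred:
        return preferred
    if compliant:
        return compliant
    return list(paths)  # assumption of this sketch: nothing compliant


paths = [
    PathStats("mpls", latency_ms=40, loss_pct=0.1, jitter_ms=5),
    PathStats("biz-internet", latency_ms=70, loss_pct=0.5, jitter_ms=12),
]
webex_sla = SlaClass(latency_ms=100, loss_pct=1.0, jitter_ms=30)

print([p.color for p in choose_paths(paths, webex_sla, "mpls")])  # ['mpls']

paths[0].latency_ms = 180  # MPLS latency degrades past the 100 ms threshold
print([p.color for p in choose_paths(paths, webex_sla, "mpls")])  # ['biz-internet']
```

The preferred color is a tiebreaker among compliant paths, not an override: once MPLS violates the SLA, it stops being a candidate even though it is preferred.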

### 5\. Verification & Troubleshooting

#### 5.1 Key Commands

| Command | Purpose |
| :------ | :------ |
| `show sdwan app-route stats` | Shows per-tunnel loss, latency, and jitter used for SLA decisions. |
| `show sdwan policy app-route-policy-filter` | Checks app-route policy hits (counters). |
| `show sdwan app-fwd dpi flows` | Inspects DPI-classified flows. |

#### 5.2 Common Issues

| Symptom | Likely Cause | Fix |
| :------ | :----------- | :-- |
| App traffic not matching | Incorrect app-list definition | Verify app names in vManage. |
| Policy not applying | Wrong policy priority | Reorder policies (lower sequence = higher priority). |
| DPI not detecting apps | Encryption (TLS 1.3) | Use IP-based matching as fallback. |

### 6\. Advanced Use Cases

#### 6.1 Custom DPI Signatures

* For proprietary apps, define a custom application. The block below is illustrative pseudo-config, not literal CLI; in practice, custom applications are defined in vManage from attributes such as server name, IP address, port, and protocol:

    ```bash
    app-list CUSTOM-APP
     signature TCP port 5000 protocol HTTP user-agent "MyApp*"
    ```

#### 6.2 Combining with QoS

* Mark apps for prioritization:

    ```bash
    ! inside data-policy <name> / vpn-list <list>:
    sequence 10
     match app-list VOICE
     action accept
      set dscp 46        ! EF (Expedited Forwarding) for VoIP
    ```

#### 6.3 Internet Breakout for Specific Apps

```bash
! inside data-policy <name> / vpn-list <list containing VPN 10>:
sequence 10
 match app-list SALESFORCE
 action accept
  nat use-vpn 0      ! NAT out a VPN 0 internet-facing interface (local breakout)
```

### 7\. Summary Checklist

* [ ] Define app lists in vManage (**Configuration \> Policies \> Custom Options \> App-Aware Routing**).
* [ ] Use `match app-list` in policies to steer traffic.
* [ ] Test with `show sdwan app-route stats` and `show sdwan app-fwd dpi flows`.
* [ ] Combine with SLA for dynamic failover.

### Key Takeaways

1. **`match app-list` enables application-aware routing** (not just IP/port-based).
2. **DPI visibility can be affected by strong encryption** (e.g., TLS 1.3 with ESNI) → May need fallback to IP-based matching.
3. **Policy order matters**: the lowest sequence number evaluates first.

-----

## Front-Door VRF (FVRF) Explained (Using Cisco Gear)

**Front-Door VRF (FVRF)** is a Cisco design pattern that places an outward-facing interface in its own Virtual Routing and Forwarding (VRF) instance, isolated from the routing table that carries production traffic. Strictly, the term describes putting the WAN/transport interface (the tunnel source) in a dedicated VRF. This section applies the same isolation technique to the **management plane**: the management interface (SSH, SNMP, HTTPS, etc.) is placed in a separate VRF, isolating it from the default global routing table.

**Note:** While this section describes the general concept in Cisco devices, in Cisco SD-WAN (Viptela-based) architectures:

* **VPN 0** is often referred to as the "Front-Door VRF" in the sense that it is the transport VRF carrying all overlay control and data tunnel traffic, and often in-band management.
* **VPN 512** is used for isolated *out-of-band* management, conceptually similar to the management VRF configured in this section.

### Why Use Front-Door VRF?

1. **Security:** Prevents unauthorized access to management interfaces via data-plane attacks.
2. **Isolation:** Ensures management traffic doesn’t mix with production traffic.
3. **Multi-Tenancy:** Useful in service provider environments where management traffic must be segregated per customer.
4. **Simplified Routing:** Avoids route conflicts between management and data networks.

### How FVRF Works

* The **management interface (e.g., GigabitEthernet0/0)** is assigned to a dedicated VRF (e.g., `MGMT-VRF`).
* All management traffic (SSH, SNMP, etc.) must go through this VRF.
* The data plane (regular traffic) uses the **default global routing table** or other service VRFs.
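
The isolation described above comes down to each VRF having its own routing table. The Python sketch below models that with the standard `ipaddress` module. It is a conceptual illustration only. The `Router` class and its methods are hypothetical, and the prefixes and next hops are example values.

```python
import ipaddress


class Router:
    """Toy router with one independent routing table per VRF."""

    def __init__(self):
        self.tables = {"global": []}  # VRF name -> [(network, next_hop)]

    def add_route(self, vrf, prefix, next_hop):
        network = ipaddress.ip_network(prefix)
        self.tables.setdefault(vrf, []).append((network, next_hop))

    def lookup(self, vrf, destination):
        """Longest-prefix match restricted to a single VRF's table."""
        addr = ipaddress.ip_address(destination)
        matches = [(net, nh) for net, nh in self.tables.get(vrf, [])
                   if addr in net]
        if not matches:
            return None
        return max(matches, key=lambda m: m[0].prefixlen)[1]


r = Router()
r.add_route("MGMT-VRF", "0.0.0.0/0", "192.168.1.254")  # management default
r.add_route("global", "10.0.0.0/8", "10.1.1.1")        # production route

print(r.lookup("MGMT-VRF", "8.8.8.8"))    # 192.168.1.254
print(r.lookup("global", "8.8.8.8"))      # None: global table has no default
print(r.lookup("global", "10.20.30.40"))  # 10.1.1.1
```

The management default route never appears in the global lookup, and the production route never appears in the management lookup. That separation is the whole point of the design.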

### Configuration Example (Cisco IOS-XE / IOS)

#### 1\. Create the Management VRF

```bash
configure terminal
vrf definition MGMT-VRF
 rd 100:1                  ! Route Distinguisher (for uniqueness)
 address-family ipv4
 exit-address-family
exit
```

#### 2\. Assign the Management Interface to the VRF

```bash
interface GigabitEthernet0/0
 description Management Interface
 vrf forwarding MGMT-VRF
 ip address 192.168.1.1 255.255.255.0
 no shutdown
exit
```

#### 3\. Configure a Default Route for Management Traffic

```bash
ip route vrf MGMT-VRF 0.0.0.0 0.0.0.0 192.168.1.254
```

*(Where `192.168.1.254` is the gateway for management traffic.)*
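
#### 4\. Enable VRF-Aware Services

The ACL name `MGMT-ACCESS` and its permitted subnet are placeholders; adjust them to your management sources. Command availability varies by platform and release.

```bash
ip access-list standard MGMT-ACCESS
 permit 192.168.1.0 0.0.0.255
line vty 0 4
 transport input ssh
 access-class MGMT-ACCESS in vrf-also        ! vrf-also admits sessions that arrive in a VRF
exit
ip http secure-server                         ! HTTPS instead of plain HTTP
ip http access-class ipv4 MGMT-ACCESS         ! limit web access to the management subnet
ip ssh source-interface GigabitEthernet0/0    ! device-originated SSH leaves via the MGMT-VRF interface
```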

### Verification

* Check VRF routing table:

    ```bash
    show ip route vrf MGMT-VRF
    ```

* Verify interface assignment:

    ```bash
    show vrf brief
    ```

* Test connectivity:

    ```bash
    ping vrf MGMT-VRF 192.168.1.254
    ```

### Key Considerations

* **NTP & DNS:** If management relies on NTP/DNS, ensure they are reachable via the FVRF.
* **Backup Access:** Always maintain an alternative access method (console) in case of misconfiguration.
* **Compatibility:** Some older Cisco devices may not support all VRF-aware services.

### Conclusion

Front-Door VRF is a best practice for securing management traffic in Cisco environments. By isolating management interfaces in a separate VRF, you reduce attack surfaces and prevent unauthorized access through data-plane vulnerabilities.

-----