Add tech_docs/networking/network_iac_framework.md
Here’s the distilled meta-framework for designing a configuration management system that balances pragmatism with pride of ownership:

### **The Pillars of Sustainable Network Configuration Design**

#### **1. The Iron Triangle of Configuration Systems**
```mermaid
graph TD
    A[Human Understandability] --> B[Automation Compatibility]
    B --> C[Audit Compliance]
    C --> A
```

**Key Insight**: Optimize for the center where all three overlap.

#### **2. Decision Filters for Every Component**
Ask of each element:
1. **"Will this still make sense in 5 years?"**
   (Avoid tools/languages with steep lifecycle curves)
2. **"Does this reduce cognitive load?"**
   (Favor structures that document themselves)
3. **"Can we exit this gracefully?"**
   (No dead-end dependencies)

#### **3. The Template Hierarchy of Needs**
```text
        [ Reliability ]
         /           \
[Consistency]   [Reproducibility]
         \           /
        [Documentation]
```

**Implementation Rule**: Satisfy each layer before moving up.

#### **4. Interface-Oriented Design**
Define clear boundaries between:
- **Data** (Variables/Secrets)
- **Templates** (Rendering Rules)
- **Delivery** (CLI/API/SSH)
- **Validation** (Pre/Post Checks)

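One way to picture these four boundaries is as separate callables that only meet inside a thin pipeline function. The sketch below is a minimal illustration under stated assumptions, not a prescribed design: every name in it (`render`, `no_leftover_placeholders`, `run_pipeline`) is hypothetical, the `{{ NAME }}` marker style is assumed, and delivery is replaced by an in-memory list so nothing touches a device.

```python
from typing import Callable, Dict, List

# Hypothetical type aliases, one per boundary named above.
Data = Dict[str, str]                    # Data: variables (secrets would live here too)
Renderer = Callable[[str, Data], str]    # Templates: rendering rules
Deliverer = Callable[[str], None]        # Delivery: CLI/API/SSH
Validator = Callable[[str], List[str]]   # Validation: returns a list of problems


def render(template: str, data: Data) -> str:
    """Replace {{ NAME }} markers with values from the data layer."""
    out = template
    for key, value in data.items():
        out = out.replace("{{ " + key + " }}", value)
    return out


def no_leftover_placeholders(config: str) -> List[str]:
    """Pre-check: report any marker the data layer did not fill."""
    return ["unfilled placeholder"] if "{{" in config else []


def run_pipeline(template: str, data: Data, renderer: Renderer,
                 deliver: Deliverer, validators: List[Validator]) -> List[str]:
    """Wire the four layers together; deliver only if validation is clean."""
    config = renderer(template, data)
    problems = [p for check in validators for p in check(config)]
    if not problems:
        deliver(config)
    return problems


# Usage: delivery is an in-memory list here (an assumption for the demo).
sent: List[str] = []
problems = run_pipeline("hostname {{ HOSTNAME }}", {"HOSTNAME": "SWITCH-01"},
                        render, sent.append, [no_leftover_placeholders])
```

Because each layer is passed in as an argument, swapping SSH delivery for an API call, or adding a post-check, should not require touching the templates or the data.
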
#### **5. The Versioning Covenant**
```text
v1.0.0
 │ │ └─ Patch (Template fixes)
 │ └── Minor (New profiles)
 └──── Major (Structure changes)
```

**Golden Rule**: Never break backward template compatibility.

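The mapping from change type to version field can be made mechanical. The following is a small sketch, assuming tags of the form `vMAJOR.MINOR.PATCH`; the change labels `structure`, `profile`, and `fix` are hypothetical names chosen to mirror the diagram above.

```python
from typing import Tuple

# Hypothetical mapping from the kind of change to the version field it bumps.
BUMP_FIELD = {"structure": 0, "profile": 1, "fix": 2}


def parse_version(tag: str) -> Tuple[int, int, int]:
    """Turn 'v1.0.0' into (1, 0, 0)."""
    major, minor, patch = tag.lstrip("v").split(".")
    return int(major), int(minor), int(patch)


def bump(tag: str, change: str) -> str:
    """Return the next version tag for a 'structure', 'profile', or 'fix' change."""
    parts = list(parse_version(tag))
    field = BUMP_FIELD[change]
    parts[field] += 1
    # Fields to the right of the bumped one reset to zero.
    for i in range(field + 1, 3):
        parts[i] = 0
    return "v" + ".".join(str(p) for p in parts)


next_tag = bump("v1.0.0", "profile")
```
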
#### **6. The Compliance Sandwich**
```python
from typing import Callable

Step = Callable[[str], None]  # each step is assumed to raise on failure

def deploy(config: str, pre_checks: Step, apply: Step, post_checks: Step) -> None:
    pre_checks(config)   # Template validations
    apply(config)        # Actual deployment
    post_checks(config)  # Agent verification
```

#### **7. The Escalation Ladder**
Design for progressive enhancement:
1. **Manual Stage**: `cat template.txt > device`
2. **Assisted Stage**: `./validate.py | ssh device`
3. **Automated Stage**: CI/CD pipeline

**Critical Feature**: Each stage produces human-readable artifacts.

#### **8. The Pride Metrics**
Measure system quality by:
- **MTTD** (Mean Time To Document) - How long to explain any config
- **RFR** (Ready-for-Rollback) - Seconds to revert any change
- **CIA** (Config-Impact-Awareness) - % of team who can trace configs to requirements

### **Practical Implementation Checklist**

1. **Start Here**:
   ```bash
   mkdir config_system
   touch config_system/design_principles.md  # Document your meta-rules first
   ```

2. **First Templates**:
   ```text
   switches/
   ├── base/
   │   └── core.txt    # Immutable base configs
   └── overlays/
       ├── finance/    # Compliance additions
       └── branch/     # Site-specific tweaks
   ```

3. **Validation Hook**:
   ```python
   # hooks/validate.py
   def check(ok, message):
       if not ok:
           raise SystemExit(f"validation failed: {message}")

   def main(files_changed, config):
       check(len(files_changed) <= 3, "more than 3 files changed")  # Prevent bulk undocumented changes
       check(any(line.startswith("!") for line in config.splitlines()), "config has no comments")
       check("password 0 " not in config, "cleartext secret found")  # assumed heuristic: type-0 passwords
   ```

4. **The "Proud" Test**:
   Ask quarterly: "Would we open-source this system today?"
   (If not, refine until the answer is yes)

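The `base/` plus `overlays/` layout in step 2 implies a composition step: base configs first, then the overlays that apply to a given device. Here is one possible sketch of that step, assuming every fragment is a `.txt` file and that sorted file-name order within a directory is acceptable. The `compose` function and the `logging.txt` overlay file are hypothetical.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List


def compose(root: Path, overlays: Iterable[str]) -> str:
    """Concatenate base configs first, then each named overlay, in sorted file order."""
    parts: List[str] = []
    dirs = [root / "base"] + [root / "overlays" / name for name in overlays]
    for directory in dirs:
        for path in sorted(directory.glob("*.txt")):
            # An IOS-style comment records which fragment each section came from.
            parts.append(f"! --- {path.relative_to(root).as_posix()} ---")
            parts.append(path.read_text().rstrip("\n"))
    return "\n".join(parts) + "\n"


# Usage against a throwaway directory tree (file contents are made up).
with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp) / "switches"
    (root / "base").mkdir(parents=True)
    (root / "overlays" / "finance").mkdir(parents=True)
    (root / "base" / "core.txt").write_text("hostname core\n")
    (root / "overlays" / "finance" / "logging.txt").write_text("logging host 192.0.2.10\n")
    result = compose(root, ["finance"])
```

The source-marker comments keep the output human-readable, which fits the rule that every stage should produce artifacts a person can review.
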
### **When to Stop Designing**
The system is "complete" when:
- New engineers can safely make config changes in their first week
- Any config can be traced to a business requirement in ≤3 clicks
- 95% of changes use existing patterns (not one-offs)

This framework gives you a foundation that’s:
- **Engineer-Respectful**: Values human time over tool whims
- **Future-Proof**: Ready for, but not dependent on, automation
- **Institutionally Durable**: Survives team turnover

Would you like to pressure-test this against your specific organizational constraints?

---

Here's a pragmatic hybrid approach that blends simplicity with modern requirements:

### **The Hybrid Configuration Framework**

#### **1. Core Architecture**
```mermaid
graph TD
    A[Human-Readable Templates] --> B[Git-Versioned Sources]
    B --> C[Agent Validation]
    B --> D[Manual CLI Deployment]
    B --> E[Automated Deployment]
```

#### **2. Implementation Layers**

**Layer 1: Static Templates (Foundation)**
- Keep your existing template structure
- Add validation markers for agents:
```text
! COMPLIANCE-TAG: PCI-DSS-4.1
aaa authentication login default group tacacs+ local
! VALIDATION: show aaa sessions
```

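Because the markers live in ordinary comment lines, an agent can collect them with plain string handling. The sketch below is one possible approach: it pulls `COMPLIANCE-TAG` and `VALIDATION` values out of a template. The function name `extract_markers` is hypothetical, and the parsing rule (a line starting with `!`, then `NAME: value`) is an assumption based on the example above.

```python
from typing import Dict, List

MARKERS = ("COMPLIANCE-TAG", "VALIDATION")


def extract_markers(template: str) -> Dict[str, List[str]]:
    """Collect the values of '! COMPLIANCE-TAG:' and '! VALIDATION:' comment lines."""
    found: Dict[str, List[str]] = {name: [] for name in MARKERS}
    for line in template.splitlines():
        stripped = line.strip()
        if not stripped.startswith("!"):
            continue  # only comment lines can carry markers
        body = stripped[1:].strip()
        name, sep, value = body.partition(":")
        if sep and name in found:
            found[name].append(value.strip())
    return found


sample = """\
! COMPLIANCE-TAG: PCI-DSS-4.1
aaa authentication login default group tacacs+ local
! VALIDATION: show aaa sessions
"""
markers = extract_markers(sample)
```

The `VALIDATION` values could then be handed to whatever runs post-deployment checks, whether that is a person at a terminal or a script.
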
**Layer 2: Agent Hooks**
```python
# Basic validation agent (example)
def validate_config(config, scanned_standards=("PCI-DSS", "NIST-800")):
    """Return an alert for each compliance tag outside the scanned standards."""
    alerts = []
    for line in config.splitlines():
        if "! COMPLIANCE-TAG:" in line:
            tag = line.split(":", 1)[1].strip()
            # "PCI-DSS-4.1" counts as covered when "PCI-DSS" is a scanned standard
            if not tag.startswith(tuple(scanned_standards)):
                alerts.append(f"Missing scan for {tag}")
    return alerts
```

**Layer 3: Deployment Bridge**
```text
# Sample CI pipeline (runs on config changes; steps are illustrative pseudocode)
1. jinja2 render --validate-tags
2. agent validate --level PCI
3. if pass: scp config to device
4. if fail: create JIRA ticket
```

#### **3. File Structure Upgrade**
```text
configs/
├── templates/          # Your existing templates
├── agents/
│   ├── validator.py    # Lightweight compliance checks
│   └── tags.db         # Required compliance markers
└── hooks/
    ├── pre-deploy.sh   # Signature verification
    └── post-deploy.sh  # Compliance snapshot
```

#### **4. Workflow Integration**
**Manual Mode:**
```bash
# Engineer workflow
vim templates/BASE/02_system.txt
./hooks/pre-deploy.sh  # Verifies templates
# Then deploy manually
```

**Automated Mode:**
```yaml
# GitLab CI example
deploy_config:
  only:
    - /^v\d+\.\d+/  # Tagged releases
  script:
    - ./agents/validator.py -t PCI
    - ansible-playbook deploy.yml
```

#### **5. Compliance Bridge**
**Template Markers → Agent Food**
```text
! COMPLIANCE-REQ: Ensure TACACS timeout < 10s
tacacs-server timeout 9
```
Validated by:
```python
# Agent checks (`config` holds the rendered configuration text)
for line in config.splitlines():
    if line.strip().startswith("tacacs-server timeout"):
        timeout = int(line.split()[-1])
        assert timeout < 10, "TACACS timeout violation"
```

#### **6. Key Advantages**
1. **Progressive Compliance**:
   - Start with manual template reviews
   - Gradually automate checks
2. **Tool Agnostic**:
   - Validators work with Ansible, Terraform, or manual deploys
3. **Audit Trail**:
   ```text
   2023-11-25 14:22: [AGENT] PASSED PCI-4.1 (template: 02_system.txt)
   ```

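An audit line in the layout shown above can be produced by a very small formatter. This is a sketch only: the `audit_line` function and the `FAILED` status word are assumptions, since the original shows just a single `PASSED` entry.

```python
from datetime import datetime


def audit_line(when: datetime, passed: bool, control: str, template: str) -> str:
    """Format one audit-trail entry in the layout shown above."""
    status = "PASSED" if passed else "FAILED"  # "FAILED" is an assumed counterpart
    stamp = when.strftime("%Y-%m-%d %H:%M")
    return f"{stamp}: [AGENT] {status} {control} (template: {template})"


line = audit_line(datetime(2023, 11, 25, 14, 22), True, "PCI-4.1", "02_system.txt")
```
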
#### **7. Reality Checks**
**When This Fits:**
- Teams with 1-2 engineers managing 50-300 devices
- Mixed vendor environments
- Gradually maturing compliance programs

**When to Avoid:**
- Fully automated cloud-native networks
- Teams already using Terraform Enterprise/SDN

#### **8. Starter Kit**
```bash
# Initial setup
git clone https://example.com/network-templates
cd network-templates
pip install -r agents/requirements.txt  # Only dependency: PyYAML

# First validation
./agents/validator.py -t NIST-800-53
```

This gives you:
- Immediate compliance visibility
- Zero automation lock-in
- Clear path from templates to full IaC
- All while keeping your original template system intact

Would you like me to develop a specific component further? Particularly:
1. The agent validation logic
2. Git hook integration
3. Compliance tagging standards

---

Understood. Here's a lean, focused `README.md` that strictly covers template management without venturing into LCM or automation:

````markdown
# Cisco 4500 Series Configuration Templates

## Purpose
Standardized configuration templates for:
- Cisco 4507R+E
- Cisco 4510R+E

(Layer 3 Access Switch role)

## File Structure
```text
templates/
├── BASE/                  # Core building blocks
│   ├── 01_license.txt     # License/config-register
│   ├── 02_system.txt      # Hostname/services/NTP
│   ├── 03_aaa.txt         # Authentication
│   └── 04_vlans.txt       # VLAN definitions
│
├── PORT_PROFILES/         # Interface templates
│   ├── access_data.txt    # Data+Voice ports
│   ├── access_voice.txt   # Voice-only ports
│   └── trunk_uplink.txt   # Uplink ports
│
└── POLICIES/              # Reusable policy blocks
    ├── qos_marking.txt    # QoS class-maps
    └── acl_preauth.txt    # Port ACLs
```

## Usage Instructions

### 1. Build a Configuration
```bash
# Combine templates in order (run from the templates/ directory)
cat BASE/01_license.txt BASE/02_system.txt > switch_config.txt

# Apply port profiles (example)
sed 's/{{ VLAN_ID }}/210/g' PORT_PROFILES/access_data.txt >> switch_config.txt
```

### 2. Required Customizations
Each template contains explicit replacement markers:

```text
! REQUIRED CUSTOMIZATIONS (search for these):
{{ HOSTNAME }}      # Device hostname
{{ MGMT_IP }}       # Management interface IP
{{ VLAN_2xx }}      # Data VLAN ID (2xx range)
```

### 3. Validation Checklist
After customization, verify:
1. No remaining `{{ }}` placeholders
2. VLAN IDs match port assignments
3. Unique IPs for all interfaces

## Template Versioning
| Version | Date       | Changes                      |
|---------|------------|------------------------------|
| v1.0    | 2023-11-20 | Initial release              |
| v1.1    | 2023-11-25 | Updated RADIUS server list   |

## Contribution Rules
1. **Modify templates only via numbered changes**:
   ```bash
   cp BASE/02_system.txt BASE/02_system_v1.1.txt
   ```
2. **Document changes in headers**:
   ```text
   ! HISTORY:
   ! 2023-11-25 - Updated NTP servers (jsmith)
   ! 2023-11-20 - Initial version (network-team)
   ```

## Policy Exceptions
For non-standard configurations:
1. Create new profile in `PORT_PROFILES/custom_*.txt`
2. Add exception note:
   ```text
   ! EXCEPTION: IDF-SW23 requires special QoS
   ! Approved by: jdoe (2023-11-15)
   ! Ticket: NET-1245
   ```

## Best Practices
1. **Search before creating** - Reuse existing profiles when possible
2. **Keep customizations visible** - Never delete placeholder markers
3. **Test incrementally** - Apply changes in sections
````

This version:
1. **Eliminates all automation/LCM references** - Pure configuration templates
2. **Focuses on manual operations** - `cat` and `sed` as primary tools
3. **Maintains auditability** - Clear versioning and change tracking
4. **Prevents scope creep** - No CI/CD, Ansible, or deployment processes

The template system stands alone as a documentation and standardization tool, decoupled from any specific deployment methodology.

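The first item of the README's validation checklist (no remaining `{{ }}` placeholders) is easy to check mechanically even in a manual workflow. A minimal sketch, assuming markers look like `{{ NAME }}` with letters, digits, and underscores; the function name and the sample config are made up for illustration.

```python
import re
from typing import List

# Matches markers such as {{ HOSTNAME }} or {{ VLAN_2xx }}.
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z0-9_]+)\s*\}\}")


def remaining_placeholders(config: str) -> List[str]:
    """Return 'line N: NAME' for every unreplaced marker in a built config."""
    hits: List[str] = []
    for number, line in enumerate(config.splitlines(), start=1):
        for name in PLACEHOLDER.findall(line):
            hits.append(f"line {number}: {name}")
    return hits


built = (
    "hostname IDF-SW01\n"
    "interface Vlan{{ VLAN_2xx }}\n"
    " ip address {{ MGMT_IP }} 255.255.255.0\n"
)
leftovers = remaining_placeholders(built)
```

An empty list means the config passed this one check; the other two checklist items (VLAN/port consistency and unique IPs) need knowledge of the config's meaning and are not covered here.
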
---

Here's a comprehensive `README.md` for your configuration template system:

````markdown
# Network Configuration Template System

## Overview
Modular, version-controlled templates for Cisco 4507R+E/4510R+E switch configurations. Designed for both manual operations and automation readiness.

## Directory Structure
```text
config_components/
├── core_settings/          # Fundamental device configuration
│   ├── 00_licensing.j2
│   ├── 10_system_settings.j2
│   └── 20_aaa.j2
├── network_services/       # Network-wide services
│   ├── 30_vlans.j2
│   └── 40_routing.j2
├── interfaces/             # Interface configurations
│   ├── 50_port_profiles/
│   │   ├── access_port.j2
│   │   └── trunk_port.j2
│   └── 60_interface_assignments.j2
└── policies/               # Policy definitions
    ├── 70_qos.j2
    └── 80_access_lists.j2
```

## Key Features
- **Numbered Load Order**: Files processed sequentially (00_ → 90_)
- **Port Profile System**: Reusable interface configurations
- **Policy/Service Separation**: Clean abstraction boundaries
- **Version Embedded**: Each template contains its own changelog

## Usage Guide

### 1. Preparation
```bash
# Clone repository
git clone https://example.com/network-templates.git
cd network-templates

# Create site variables file
cp examples/site_vars.yml site/nyc_floor3.yml
```

### 2. Configuration Generation

#### Manual Method (Quick Start)
```bash
# Using sed for simple replacements
sed 's/{{ HOSTNAME }}/SWITCH-01/g' base_config.j2 > output.txt

# Using Jinja2 CLI (requires Python; the yaml extra adds YAML input support)
pip install "jinja2-cli[yaml]"
jinja2 base_config.j2 site/nyc_floor3.yml --format=yaml > deployed_config.txt
```

#### Automated Method (Ansible)
```yaml
- name: Deploy switch config
  hosts: switches
  tasks:
    - template:
        src: "{{ role_path }}/templates/base_config.j2"
        dest: "/tmp/{{ inventory_hostname }}.cfg"
```

### 3. Deployment
```bash
# Manual deployment
ssh admin@switch < deployed_config.txt

# Automated validation
ansible-playbook validate.yml -e @site/nyc_floor3.yml
```

## Template Development

### Adding New Components
1. Create numbered template file:
   ```bash
   touch config_components/policies/90_ntp.j2
   ```
2. Add metadata header:
   ```jinja2
   {# META:
      Version: 1.0
      Dependencies: 10_system_settings.j2
      Validated: 2023-11-20
   #}
   ```

### Version Control
```bash
# Standard workflow
git checkout -b feature/new_vlan_profile
# Edit appropriate .j2 files
git commit -m "feat: Add new voice VLAN profile"
```

## Validation Framework
Each generated config includes:
```text
! VALIDATION MARKS
! [REQUIRED] Verify VLAN assignments: show vlan brief
! [RECOMMENDED] Check interface status: show int status
! [OPTIONAL] Check QoS: show policy-map interface <interface>
```

## Variable Hierarchy
1. `defaults.yml` - Organization-wide standards
2. `site/<location>.yml` - Site-specific overrides
3. `device/<hostname>.yml` - Device-level exceptions

## Maintenance
```text
# Changelog Format
## [Version] YYYY-MM-DD
- [CHANGE] Description (Author)
- [FIX] Bug description (Author)

Example:
## [1.1] 2023-11-20
- [CHANGE] Updated RADIUS servers (jsmith)
- [FIX] Corrected VLAN numbering (adoe)
```

## FAQ
**Q: How to handle one-off exceptions?**
A: Create device-specific vars in `device/` or use conditional blocks:
```jinja2
{% if inventory_hostname == 'switch23' %}
! Special configuration
{% endif %}
```

**Q: Can I use this without Jinja2?**
A: Yes - templates work as:
1. Manual copy-paste docs
2. sed/awk processing sources
3. Full Jinja2 automation

**Q: How to test changes safely?**
A: Render to standard output first and review the result before deploying:
```bash
# Dry-run generation
jinja2 base_config.j2 test_vars.yml --format=yaml
```

## License
This template system is [MIT Licensed](LICENSE).
````

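The "Variable Hierarchy" section of the README lists three layers but does not say how they combine. One plausible reading is a left-to-right merge in which later layers win and nested mappings are merged key by key. The sketch below illustrates that reading. The deep-merge behaviour, the `merge_vars` name, and the sample values are all assumptions, and plain dicts stand in for the parsed YAML files.

```python
from typing import Any, Dict


def merge_vars(*layers: Dict[str, Any]) -> Dict[str, Any]:
    """Merge layers left to right; later layers override earlier ones, recursing into dicts."""
    merged: Dict[str, Any] = {}
    for layer in layers:
        for key, value in layer.items():
            if isinstance(value, dict) and isinstance(merged.get(key), dict):
                merged[key] = merge_vars(merged[key], value)
            else:
                merged[key] = value
    return merged


# Stand-ins for defaults.yml, site/<location>.yml, and device/<hostname>.yml.
defaults = {"ntp_servers": ["192.0.2.1"], "vlans": {"data": 210, "voice": 310}}
site = {"vlans": {"data": 220}}
device = {"hostname": "SWITCH-01"}
effective = merge_vars(defaults, site, device)
```

With a deep merge, a site file only has to state what differs from the organization-wide defaults; a shallow merge would instead force it to repeat the whole `vlans` mapping.
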
This README provides:
1. **Progressive Disclosure** - Simple to advanced usage
2. **Multi-Modal Support** - Manual and automated paths
3. **Built-In Governance** - Versioning and validation
4. **Team Onboarding** - Clear contribution guidelines

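The README's numbered load order can be enforced rather than left to directory listing order. Here is a minimal sketch, assuming every template file name starts with a numeric prefix followed by an underscore. The `load_order` helper is hypothetical, and rejecting unnumbered names is a design choice of this sketch, not something the README specifies.

```python
import re
from typing import List

PREFIX = re.compile(r"^(\d+)_")


def load_order(names: List[str]) -> List[str]:
    """Sort template file names by numeric prefix; names without one are rejected."""
    def key(name: str) -> int:
        match = PREFIX.match(name)
        if match is None:
            raise ValueError(f"template has no numeric prefix: {name}")
        return int(match.group(1))
    return sorted(names, key=key)


ordered = load_order([
    "30_vlans.j2",
    "00_licensing.j2",
    "80_access_lists.j2",
    "10_system_settings.j2",
])
```

Sorting on the parsed integer rather than the raw string keeps the order stable even if someone later adds a prefix with a different number of digits.
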
Would you like me to add any specific:
- Security considerations?
- Disaster recovery procedures?
- Integration with existing CMDB systems?