From deabd0d296e52111f853624c74d52fd15c77e929 Mon Sep 17 00:00:00 2001
From: medusa
Date: Sat, 2 Aug 2025 10:34:47 -0500
Subject: [PATCH] Add tech_docs/networking/network_iac_framework.md

---
 tech_docs/networking/network_iac_framework.md | 519 ++++++++++++++++++
 1 file changed, 519 insertions(+)
 create mode 100644 tech_docs/networking/network_iac_framework.md

diff --git a/tech_docs/networking/network_iac_framework.md b/tech_docs/networking/network_iac_framework.md
new file mode 100644
index 0000000..0963a8f
--- /dev/null
+++ b/tech_docs/networking/network_iac_framework.md
@@ -0,0 +1,519 @@
+Here’s the distilled meta-framework for designing a configuration management system that balances pragmatism with pride-of-ownership:
+
+### **The Pillars of Sustainable Network Configuration Design**
+
+#### **1. The Iron Triangle of Configuration Systems**
+```mermaid
+graph TD
+    A[Human Understandability] --> B[Automation Compatibility]
+    B --> C[Audit Compliance]
+    C --> A
+```
+
+**Key Insight**: Optimize for the center where all three overlap.
+
+#### **2. Decision Filters for Every Component**
+Ask of each element:
+1. **"Will this still make sense in 5 years?"**
+   (Avoid tools/languages with steep lifecycle curves)
+2. **"Does this reduce cognitive load?"**
+   (Favor structures that document themselves)
+3. **"Can we exit this gracefully?"**
+   (No dead-end dependencies)
+
+#### **3. The Template Hierarchy of Needs**
+```text
+       [ Reliability ]
+        /           \
+[Consistency]   [Reproducibility]
+        \           /
+       [Documentation]
+```
+
+**Implementation Rule**: Satisfy each layer before moving up.
+
+#### **4. Interface-Oriented Design**
+Define clear boundaries between:
+- **Data** (Variables/Secrets)
+- **Templates** (Rendering Rules)
+- **Delivery** (CLI/API/SSH)
+- **Validation** (Pre/Post Checks)
+
+#### **5. The Versioning Covenant**
+```text
+v1.0.0
+│ │ └─ Patch (Template fixes)
+│ └── Minor (New profiles)
+└──── Major (Structure changes)
+```
+
+**Golden Rule**: Never break backward template compatibility.
+
+#### **6. The Compliance Sandwich**
+```python
+def deploy(config):
+    pre_checks(config)   # Template validations
+    apply(config)        # Actual deployment
+    post_checks(config)  # Agent verification
+```
+
+#### **7. The Escalation Ladder**
+Design for progressive enhancement:
+1. **Manual Stage**: `cat template.txt > device`
+2. **Assisted Stage**: `./validate.py | ssh device`
+3. **Automated Stage**: CI/CD pipeline
+
+**Critical Feature**: Each stage produces human-readable artifacts.
+
+#### **8. The Pride Metrics**
+Measure system quality by:
+- **MTTD** (Mean Time To Document) - How long to explain any config
+- **RFR** (Ready-for-Rollback) - Seconds to revert any change
+- **CIA** (Config-Impact-Awareness) - % of team who can trace configs to requirements
+
+### **Practical Implementation Checklist**
+
+1. **Start Here**:
+   ```bash
+   mkdir config_system
+   touch config_system/design_principles.md  # Document your meta-rules first
+   ```
+
+2. **First Templates**:
+   ```text
+   switches/
+   ├── base/
+   │   └── core.txt      # Immutable base configs
+   └── overlays/
+       ├── finance/      # Compliance additions
+       └── branch/       # Site-specific tweaks
+   ```
+
+3. **Validation Hook**:
+   ```python
+   # hooks/validate.py (sketch: check() and the helpers below are placeholders)
+   def main():
+       check(files_changed <= 3)  # Prevent bulk undocumented changes
+       check(has_comments(config))
+       check(no_secrets_in_clear())
+   ```
+
+4. **The "Proud" Test**:
+   Ask quarterly: "Would we open-source this system today?"
+   (If not, refine until the answer is yes)
+
+### **When to Stop Designing**
+The system is "complete" when:
+- New engineers can safely make config changes in their first week
+- Any config can be traced to a business requirement in ≤3 clicks
+- 95% of changes use existing patterns (not one-offs)
+
+This framework gives you a foundation that’s:
+- **Engineer-Respectful**: Values human time over tool whims
+- **Future-Proof**: Ready for automation but not dependent on it
+- **Institutionally Durable**: Survives team turnover
+
+Would you like to pressure-test this against your specific organizational constraints?
+
+---
+
+Here's a pragmatic hybrid approach that blends simplicity with modern requirements:
+
+### **The Hybrid Configuration Framework**
+
+#### **1. Core Architecture**
+```mermaid
+graph TD
+    A[Human-Readable Templates] --> B[Git-Versioned Sources]
+    B --> C[Agent Validation]
+    B --> D[Manual CLI Deployment]
+    B --> E[Automated Deployment]
+```
+
+#### **2. Implementation Layers**
+
+**Layer 1: Static Templates (Foundation)**
+- Keep your existing template structure
+- Add validation markers for agents:
+  ```text
+  ! COMPLIANCE-TAG: PCI-DSS-4.1
+  aaa authentication login default group tacacs+ local
+  ! VALIDATION: show aaa sessions
+  ```
+
+**Layer 2: Agent Hooks**
+```python
+# Basic validation agent (example)
+def validate_config(config, scanned_standards=("PCI-DSS", "NIST-800")):
+    for line in config.splitlines():
+        if "! COMPLIANCE-TAG:" in line:
+            tag = line.split(":", 1)[1].strip()
+            # Prefix match (an assumption): "PCI-DSS-4.1" counts as "PCI-DSS"
+            if not tag.startswith(tuple(scanned_standards)):
+                print(f"Missing scan for {tag}")
+```
+
+**Layer 3: Deployment Bridge**
+```text
+# Sample CI pipeline, as pseudocode (runs on config changes)
+1. jinja2 render --validate-tags
+2. agent validate --level PCI
+3. if pass: scp config to device
+4. if fail: create JIRA ticket
+```
+
+#### **3. File Structure Upgrade**
+```text
+configs/
+├── templates/          # Your existing templates
+├── agents/
+│   ├── validator.py    # Lightweight compliance checks
+│   └── tags.db         # Required compliance markers
+└── hooks/
+    ├── pre-deploy.sh   # Signature verification
+    └── post-deploy.sh  # Compliance snapshot
+```
+
+#### **4. Workflow Integration**
+**Manual Mode:**
+```bash
+# Engineer workflow
+vim templates/BASE/02_system.txt
+./hooks/pre-deploy.sh  # Verifies templates
+# Then deploy manually
+```
+
+**Automated Mode:**
+```yaml
+# GitLab CI example
+deploy_config:
+  only:
+    - /^v\d+\.\d+/  # Tagged releases
+  script:
+    - ./agents/validator.py -t PCI
+    - ansible-playbook deploy.yml
+```
+
+#### **5. Compliance Bridge**
+**Template Markers → Agent Food**
+```text
+! COMPLIANCE-REQ: Ensure TACACS timeout < 10s
+tacacs-server timeout 9
+```
+Validated by:
+```python
+# Agent checks (config holds the rendered configuration text)
+for line in config.splitlines():
+    if line.startswith("tacacs-server timeout"):
+        assert int(line.split()[-1]) < 10, "TACACS timeout violation"
+```
+
+#### **6. Key Advantages**
+1. **Progressive Compliance**:
+   - Start with manual template reviews
+   - Gradually automate checks
+2. **Tool Agnostic**:
+   - Validators work with Ansible, Terraform, or manual deploys
+3. **Audit Trail**:
+   ```text
+   2023-11-25 14:22: [AGENT] PASSED PCI-DSS-4.1 (template: 02_system.txt)
+   ```
+
+#### **7. Reality Checks**
+**When This Fits:**
+- Teams with 1-2 engineers managing 50-300 devices
+- Mixed vendor environments
+- Gradually maturing compliance programs
+
+**When to Avoid:**
+- Fully automated cloud-native networks
+- Teams already using Terraform Enterprise/SDN
+
+#### **8. Starter Kit**
+```bash
+# Initial setup
+git clone https://example.com/network-templates
+cd network-templates
+pip install -r agents/requirements.txt  # Only dependency: PyYAML
+
+# First validation
+./agents/validator.py -t NIST-800-53
+```
+
+This gives you:
+- Immediate compliance visibility
+- Zero automation lock-in
+- Clear path from templates to full IaC
+- All while keeping your original template system intact
+
+Would you like me to develop a specific component further? Particularly:
+1. The agent validation logic
+2. Git hook integration
+3. Compliance tagging standards
+
+---
+
+Understood. Here's a lean, focused `README.md` that strictly covers template management without venturing into LCM or automation:
+
+````markdown
+# Cisco 4500 Series Configuration Templates
+
+## Purpose
+Standardized configuration templates for:
+- Cisco 4507R+E
+- Cisco 4510R+E
+(Layer 3 Access Switch role)
+
+## File Structure
+```text
+templates/
+├── BASE/                  # Core building blocks
+│   ├── 01_license.txt     # License/config-register
+│   ├── 02_system.txt      # Hostname/services/NTP
+│   ├── 03_aaa.txt         # Authentication
+│   └── 04_vlans.txt       # VLAN definitions
+│
+├── PORT_PROFILES/         # Interface templates
+│   ├── access_data.txt    # Data+Voice ports
+│   ├── access_voice.txt   # Voice-only ports
+│   └── trunk_uplink.txt   # Uplink ports
+│
+└── POLICIES/              # Reusable policy blocks
+    ├── qos_marking.txt    # QoS class-maps
+    └── acl_preauth.txt    # Port ACLs
+```
+
+## Usage Instructions
+
+### 1. Build a Configuration
+```bash
+# Combine templates in order
+cat BASE/01_license.txt BASE/02_system.txt > switch_config.txt
+
+# Apply port profiles (example)
+sed 's/{{ VLAN_ID }}/210/g' PORT_PROFILES/access_data.txt >> switch_config.txt
+```
+
+### 2. Required Customizations
+Each template contains explicit replacement markers:
+
+```text
+! REQUIRED CUSTOMIZATIONS (search for these):
+{{ HOSTNAME }}   # Device hostname
+{{ MGMT_IP }}    # Management interface IP
+{{ VLAN_2xx }}   # Data VLAN ID (2xx range)
+```
+
+### 3. Validation Checklist
+After customization, verify:
+1. No remaining `{{ }}` placeholders
+2. VLAN IDs match port assignments
+3. Unique IPs for all interfaces
+
+## Template Versioning
+| Version | Date       | Changes                    |
+|---------|------------|----------------------------|
+| v1.0    | 2023-11-20 | Initial release            |
+| v1.1    | 2023-11-25 | Updated RADIUS server list |
+
+## Contribution Rules
+1. **Modify templates only via numbered changes**:
+   ```bash
+   cp BASE/02_system.txt BASE/02_system_v1.1.txt
+   ```
+2. **Document changes in headers**:
+   ```text
+   ! HISTORY:
+   ! 2023-11-25 - Updated NTP servers (jsmith)
+   ! 2023-11-20 - Initial version (network-team)
+   ```
+
+## Policy Exceptions
+For non-standard configurations:
+1. Create new profile in `PORT_PROFILES/custom_*.txt`
+2. Add exception note:
+   ```text
+   ! EXCEPTION: IDF-SW23 requires special QoS
+   ! Approved by: jdoe (2023-11-15)
+   ! Ticket: NET-1245
+   ```
+
+## Best Practices
+1. **Search before creating** - Reuse existing profiles when possible
+2. **Keep customizations visible** - Never delete placeholder markers
+3. **Test incrementally** - Apply changes in sections
+````
+
+This version:
+1. **Eliminates all automation/LCM references** - Pure configuration templates
+2. **Focuses on manual operations** - `cat` and `sed` as primary tools
+3. **Maintains auditability** - Clear versioning and change tracking
+4. **Prevents scope creep** - No CI/CD, Ansible, or deployment processes
+
+The template system stands alone as a documentation and standardization tool, decoupled from any specific deployment methodology.
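The validation checklist in the README above asks that no `{{ }}` placeholders remain after customization. A minimal sketch of how that check might be scripted, assuming the `{{ NAME }}` marker style shown in the templates. The function name `find_placeholders` and the sample config text are hypothetical and not part of the original template system:

```python
import re

# Matches markers such as {{ HOSTNAME }} or {{ VLAN_2xx }} (assumed marker style).
PLACEHOLDER = re.compile(r"\{\{\s*([A-Za-z0-9_]+)\s*\}\}")


def find_placeholders(config_text):
    """Return (line_number, marker_name) for every unreplaced marker."""
    leftovers = []
    for line_number, line in enumerate(config_text.splitlines(), start=1):
        for match in PLACEHOLDER.finditer(line):
            leftovers.append((line_number, match.group(1)))
    return leftovers


if __name__ == "__main__":
    # Hypothetical built config with one marker left unreplaced.
    sample = "hostname IDF-SW01\ninterface Vlan{{ VLAN_2xx }}\n"
    for line_number, name in find_placeholders(sample):
        print(f"line {line_number}: unreplaced marker {name}")
```

An empty result corresponds to checklist item 1 only; the VLAN and IP consistency items would still need a separate review.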
+
+---
+
+Here's a comprehensive `README.md` for your configuration template system:
+
+````markdown
+# Network Configuration Template System
+
+## Overview
+Modular, version-controlled templates for Cisco 4507R+E/4510R+E switch configurations. Designed for both manual operations and automation readiness.
+
+## Directory Structure
+```text
+config_components/
+├── core_settings/            # Fundamental device configuration
+│   ├── 00_licensing.j2
+│   ├── 10_system_settings.j2
+│   └── 20_aaa.j2
+├── network_services/         # Network-wide services
+│   ├── 30_vlans.j2
+│   └── 40_routing.j2
+├── interfaces/               # Interface configurations
+│   ├── 50_port_profiles/
+│   │   ├── access_port.j2
+│   │   └── trunk_port.j2
+│   └── 60_interface_assignments.j2
+└── policies/                 # Policy definitions
+    ├── 70_qos.j2
+    └── 80_access_lists.j2
+```
+
+## Key Features
+- **Numbered Load Order**: Files processed sequentially (00_ → 90_)
+- **Port Profile System**: Reusable interface configurations
+- **Policy/Service Separation**: Clean abstraction boundaries
+- **Version Embedded**: Each template contains its own changelog
+
+## Usage Guide
+
+### 1. Preparation
+```bash
+# Clone repository
+git clone https://example.com/network-templates.git
+cd network-templates
+
+# Create site variables file
+cp examples/site_vars.yml site/nyc_floor3.yml
+```
+
+### 2. Configuration Generation
+
+#### Manual Method (Quick Start)
+```bash
+# Using sed for simple replacements
+sed "s//SWITCH-01/g" base_config.j2 > output.txt
+
+# Using Jinja2 CLI (requires Python; the yaml extra enables --format=yaml)
+pip install "jinja2-cli[yaml]"
+jinja2 base_config.j2 site/nyc_floor3.yml --format=yaml > deployed_config.txt
+```
+
+#### Automated Method (Ansible)
+```yaml
+- name: Deploy switch config
+  hosts: switches
+  tasks:
+    - template:
+        src: "{{ role_path }}/templates/base_config.j2"
+        dest: "/tmp/{{ inventory_hostname }}.cfg"
+```
+
+### 3. Deployment
+```bash
+# Manual deployment
+ssh admin@switch < deployed_config.txt
+
+# Automated validation
+ansible-playbook validate.yml -e @site/nyc_floor3.yml
+```
+
+## Template Development
+
+### Adding New Components
+1. Create numbered template file:
+   ```bash
+   touch config_components/policies/90_ntp.j2
+   ```
+2. Add metadata header:
+   ```jinja2
+   {# META:
+   Version: 1.0
+   Dependencies: 10_system_settings.j2
+   Validated: 2023-11-20
+   #}
+   ```
+
+### Version Control
+```bash
+# Standard workflow
+git checkout -b feature/new_vlan_profile
+# Edit appropriate .j2 files
+git commit -m "feat: Add new voice VLAN profile"
+```
+
+## Validation Framework
+Each generated config includes:
+```text
+! VALIDATION MARKS
+! [REQUIRED] Verify VLAN assignments: show vlan brief
+! [RECOMMENDED] Check interface status: show int status
+! [OPTIONAL] Test QoS: test policy-map
+```
+
+## Variable Hierarchy
+1. `defaults.yml` - Organization-wide standards
+2. `site/.yml` - Site-specific overrides
+3. `device/.yml` - Device-level exceptions
+
+## Maintenance
+```text
+# Changelog Format
+## [Version] YYYY-MM-DD
+- [CHANGE] Description (Author)
+- [FIX] Bug description (Author)
+
+Example:
+## [1.1] 2023-11-20
+- [CHANGE] Updated RADIUS servers (jsmith)
+- [FIX] Corrected VLAN numbering (adoe)
+```
+
+## FAQ
+**Q: How to handle one-off exceptions?**
+A: Create device-specific vars in `device/` or use conditional blocks:
+```jinja2
+{% if inventory_hostname == 'switch23' %}
+! Special configuration
+{% endif %}
+```
+
+**Q: Can I use this without Jinja2?**
+A: Yes - templates work as:
+1. Manual copy-paste docs
+2. sed/awk processing sources
+3. Full Jinja2 automation
+
+**Q: How to test changes safely?**
+```bash
+# Dry-run generation
+jinja2 base_config.j2 test_vars.yml --format=yaml
+```
+
+## License
+This template system is [MIT Licensed](LICENSE).
+````
+
+This README provides:
+1. **Progressive Disclosure** - Simple to advanced usage
+2. **Multi-Modal Support** - Manual and automated paths
+3. **Built-In Governance** - Versioning and validation
+4. **Team Onboarding** - Clear contribution guidelines
+
+Would you like me to add any specific:
+- Security considerations?
+- Disaster recovery procedures?
+- Integration with existing CMDB systems?
\ No newline at end of file