Cisco Nexus Technical Preparation Guide
1. Nexus Hardware Platforms
- Nexus 9000 Series (9300, 9500)
- Nexus 7000 Series
- Nexus 5000/6000 Series
- Differences and use cases for each platform
2. NX-OS Operating System
- NX-OS architecture and features
- Command-line interface (CLI) and configuration basics
- NX-OS software upgrade procedures
- High availability features (ISSU, vPC)
3. Layer 2 Technologies
- VLANs and VLAN Trunking Protocol (VTP)
- Spanning Tree Protocol variations (RPVST+, MST)
- Link Aggregation (LACP)
- Virtual Port Channel (vPC) configuration and troubleshooting
- FabricPath and TRILL
4. Layer 3 Routing
- Static routing
- Dynamic routing protocols (OSPF, EIGRP, BGP)
- First Hop Redundancy Protocols (HSRP, VRRP)
- VRF-lite and MPLS VPN support
5. Data Center Fabric Technologies
- VXLAN overview and configuration
- EVPN for VXLAN
- Cisco FabricPath
- Overlay Transport Virtualization (OTV)
6. Nexus Specific Features
- Virtual Device Contexts (VDC)
- Fabric Extender (FEX) technology
- Cisco Dynamic Fabric Automation (DFA)
- Nexus Converged Fabric
7. Quality of Service (QoS)
- Classification and marking
- Policing and shaping
- Queuing and scheduling
- QoS policy implementation on Nexus switches
8. Security Features
- Access Control Lists (ACLs)
- Authentication, Authorization, and Accounting (AAA)
- Control Plane Policing (CoPP)
- Port Security
- DHCP snooping and Dynamic ARP Inspection
9. Monitoring and Troubleshooting
- SPAN and ERSPAN configuration
- NetFlow implementation
- SNMP and syslog configuration
- Embedded Event Manager (EEM)
- Packet capture techniques
10. Data Center Network Design
- Spine-leaf architecture implementation with Nexus switches
- Oversubscription ratios and capacity planning
- Traffic flow optimization in a Nexus-based data center
- High availability design considerations
11. Virtualization Integration
- VMware vSphere integration (VEM, DVS)
- Microsoft Hyper-V network virtualization support
- Network containerization technologies (e.g., Cisco Contiv)
12. Automation and Programmability
- NX-API REST and NX-API CLI
- Python scripting for Nexus automation
- Ansible playbooks for Nexus configuration
- NETCONF/YANG model usage
13. Cisco Application Centric Infrastructure (ACI) Integration
- ACI fabric access policies for Nexus switches
- Migrating from traditional Nexus environments to ACI
- ACI Multi-Pod and Multi-Site with Nexus spines
14. Performance and Scalability
- Nexus switch performance characteristics
- Forwarding Information Base (FIB) and TCAM utilization
- Buffer management and microburst handling
- Load balancing algorithms and configuration
15. Emerging Technologies
- Intent-based networking on Nexus platforms
- Integration with Cisco DNA Center
- Edge computing support in Nexus switches
- AI/ML applications in Nexus-based networks
16. Compliance and Standards
- Data center compliance requirements (PCI DSS, HIPAA)
- Implementation of network segmentation for compliance
- Industry standards support (IEEE, IETF)
17. Interoperability
- Working with multi-vendor environments
- Integration with legacy network infrastructures
- Cloud connectivity options (AWS Direct Connect, Azure ExpressRoute)
18. Disaster Recovery and Business Continuity
- Data center interconnect (DCI) solutions using Nexus
- Configuration backup and restore procedures
- Failure scenario planning and mitigation strategies
19. Green Data Center Initiatives
- Power efficiency features in Nexus switches
- Environmental monitoring and reporting
- Sustainable networking practices
20. Case Studies and Scenarios
- Large-scale Nexus deployments in enterprise environments
- Troubleshooting complex issues in Nexus-based networks
- Migration strategies from older platforms to Nexus 9000
- ACI shifts the focus from network-centric to application-centric configurations:
  - Traditional networking focuses on configuring individual network devices (switches, routers) and protocols.
  - ACI instead focuses on the applications and their requirements, abstracting away much of the underlying network complexity.
  - This shift allows network administrators to think in terms of application needs rather than network topology.
- Network policies are defined based on application requirements:
  - In ACI, you define what an application needs in terms of connectivity, security, and performance.
  - These requirements are translated into network policies automatically.
  - For example, you might specify that a web server needs to communicate with a database server on a specific port, and ACI will configure the necessary network settings.
- Applications are grouped into "End Point Groups" (EPGs):
  - An EPG is a logical grouping of endpoints that require similar network policies.
  - Endpoints can be physical servers, virtual machines, containers, or even individual IP addresses.
  - EPGs abstract away the physical and logical topology, focusing instead on the application function.
- EPGs are collections of endpoints that share common policy requirements:
  - All endpoints in an EPG are treated the same from a policy perspective.
  - This simplifies policy management: instead of configuring policies for each individual endpoint, you configure them once for the EPG.
  - For example, all web servers might be in one EPG, while all database servers are in another.
- Contracts define how EPGs communicate with each other:
  - Contracts are the ACI equivalent of Access Control Lists (ACLs) in traditional networking.
  - They specify which EPGs can communicate with each other and how.
  - Contracts can define allowed protocols, ports, and even quality of service (QoS) settings.
  - They follow a provider-consumer model: one EPG provides a contract, and another EPG consumes it.
Example scenario: Imagine a three-tier web application with web, application, and database layers. In ACI:
- You'd create three EPGs: Web-EPG, App-EPG, and DB-EPG.
- You'd then create contracts:
  - Web-to-App contract (allows HTTP/HTTPS traffic)
  - App-to-DB contract (allows specific database port traffic)
- The Web-EPG would consume the Web-to-App contract, and the App-EPG would provide it.
- The App-EPG would consume the App-to-DB contract, and the DB-EPG would provide it.
This approach allows for intuitive, application-focused network design and management, with built-in security and scalability.
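The provider-consumer relationship in the three-tier example can be sketched as a small Python model. This is only an illustration of the default-deny logic, not how the APIC implements it: the `Contract` and `EPG` classes, the `is_allowed` helper, and the database port 3306 are all assumptions made for the example.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Contract:
    """A named set of allowed TCP destination ports (hypothetical model)."""
    name: str
    tcp_ports: frozenset


@dataclass
class EPG:
    """An endpoint group with the contracts it provides and consumes."""
    name: str
    provides: set = field(default_factory=set)
    consumes: set = field(default_factory=set)


def is_allowed(consumer: EPG, provider: EPG, port: int, contracts: dict) -> bool:
    """Allow traffic only if a shared contract permits the port (default deny)."""
    shared = consumer.consumes & provider.provides
    return any(port in contracts[name].tcp_ports for name in shared)


contracts = {
    "Web-to-App": Contract("Web-to-App", frozenset({80, 443})),
    "App-to-DB": Contract("App-to-DB", frozenset({3306})),  # 3306 is an assumed DB port
}
web = EPG("Web-EPG", consumes={"Web-to-App"})
app = EPG("App-EPG", provides={"Web-to-App"}, consumes={"App-to-DB"})
db = EPG("DB-EPG", provides={"App-to-DB"})

print(is_allowed(web, app, 443, contracts))  # Web -> App over HTTPS
print(is_allowed(web, db, 3306, contracts))  # Web -> DB: no contract between them
```

Because nothing links Web-EPG to DB-EPG, the second check fails even though the port itself is permitted elsewhere, which mirrors the allow-list behavior described above.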
Cisco ACI: Network-Centric Guide
1. Physical Topology: Leaf-Spine Architecture
ACI uses a leaf-spine architecture:
- Leaf switches: Connect to end devices (servers, firewalls, load balancers)
- Spine switches: Interconnect all leaf switches
- Every leaf connects to every spine (leaves do not connect to other leaves, nor spines to other spines), so any two leaves are always two hops apart
Benefits:
- Predictable latency
- High bandwidth
- No spanning tree protocol needed
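To make the topology concrete, the sketch below builds the leaf-to-spine links for a small fabric and lists the equal-cost paths between two leaves. The function names and the `leafN`/`spineN` labels are invented for illustration; the fabric sizes are arbitrary.

```python
from itertools import product


def build_fabric(num_spines: int, num_leaves: int) -> set:
    """Return the set of (leaf, spine) links; each leaf uplinks to every spine."""
    leaves = [f"leaf{i}" for i in range(1, num_leaves + 1)]
    spines = [f"spine{i}" for i in range(1, num_spines + 1)]
    return set(product(leaves, spines))


def leaf_to_leaf_paths(links: set, src: str, dst: str) -> list:
    """List the equal-cost two-hop paths src -> spine -> dst."""
    src_spines = {s for leaf, s in links if leaf == src}
    dst_spines = {s for leaf, s in links if leaf == dst}
    return sorted((src, s, dst) for s in src_spines & dst_spines)


links = build_fabric(num_spines=2, num_leaves=4)
print(len(links))  # number of leaves times number of spines
print(leaf_to_leaf_paths(links, "leaf1", "leaf3"))
```

Every leaf pair gets one path per spine, all of the same length, which is where the predictable latency and the aggregate bandwidth come from.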
2. APIC (Application Policy Infrastructure Controller)
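- Centralized management and policy plane for the fabric; it is not in the data path, so switches keep forwarding if the controllers are unreachable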
- Cluster of 3 or more controllers for high availability
- Manages all aspects of the ACI fabric
3. Underlay and Overlay Network: IS-IS and VXLAN
- IS-IS (Intermediate System to Intermediate System) is the underlay routing protocol used inside the fabric
- VXLAN (Virtual Extensible LAN) is the overlay encapsulation used for network virtualization
- Allows layer 2 segments to extend across the layer 3 fabric
- 24-bit VNID (VXLAN Network Identifier) for segment identification
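As a rough illustration of what a 24-bit identifier means on the wire, the sketch below packs and unpacks the standard 8-byte VXLAN header from RFC 7348. ACI uses its own extended variant of this header inside the fabric, so treat this as the generic format only; the helper names and the example VNI are assumptions.

```python
import struct

VXLAN_FLAG_VNI_VALID = 0x08  # "I" flag from RFC 7348
MAX_VNI = (1 << 24) - 1      # largest value that fits in 24 bits


def pack_vxlan_header(vni: int) -> bytes:
    """Build the 8-byte RFC 7348 VXLAN header for a given VNI."""
    if not 0 <= vni <= MAX_VNI:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags (8 bits) + 24 reserved bits; word 2: VNI (24 bits) + 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def unpack_vni(header: bytes) -> int:
    """Extract the VNI from an 8-byte VXLAN header."""
    _, word2 = struct.unpack("!II", header)
    return word2 >> 8


header = pack_vxlan_header(10100)  # 10100 is an arbitrary example VNI
print(header.hex())
print(unpack_vni(header), MAX_VNI + 1)
```

The 24-bit field allows 16,777,216 segment identifiers, compared with the 4,094 usable IDs of a 12-bit VLAN tag.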
4. Tenant Network Virtualization
- Tenants: Logical containers for policies, services, and network segments
- VRF (Virtual Routing and Forwarding): Provides IP address space isolation
- Bridge Domains: Layer 2 forwarding domains, similar to VLANs
- Subnets: IP address ranges associated with Bridge Domains
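The tenant hierarchy maps fairly directly onto the nested JSON that the APIC REST API accepts. The sketch below only builds such a payload; it sends nothing. The class names (`fvTenant`, `fvCtx`, `fvBD`, `fvRsCtx`, `fvSubnet`) are the ACI object-model names as I understand them and should be checked against the APIC object model reference for your release; the tenant, VRF, bridge domain names and the subnet are made up.

```python
import json


def tenant_payload(tenant: str, vrf: str, bd: str, gateway: str) -> dict:
    """Build a nested tenant object with one VRF, one bridge domain, one subnet."""
    return {
        "fvTenant": {
            "attributes": {"name": tenant},
            "children": [
                {"fvCtx": {"attributes": {"name": vrf}}},  # VRF
                {
                    "fvBD": {  # bridge domain
                        "attributes": {"name": bd},
                        "children": [
                            # Bind the bridge domain to its VRF.
                            {"fvRsCtx": {"attributes": {"tnFvCtxName": vrf}}},
                            # Subnet is the gateway address with its prefix length.
                            {"fvSubnet": {"attributes": {"ip": gateway}}},
                        ],
                    }
                },
            ],
        }
    }


# All names and addresses below are hypothetical.
payload = tenant_payload("Example-Tenant", "Example-VRF", "Example-BD", "10.0.1.1/24")
print(json.dumps(payload, indent=2))
```

A payload of this shape would typically be posted to the policy universe (for example `/api/mo/uni.json`) after authenticating; that step is left out here on purpose.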
5. External Connectivity
- L3Out: Connects ACI fabric to external layer 3 networks
- Supports BGP, OSPF, EIGRP, and static routing
- L2Out: Connects ACI fabric to external layer 2 networks
6. Packet Flow
- Ingress leaf switch performs VXLAN encapsulation
- Spine switches route based on VXLAN outer header
- Egress leaf switch performs VXLAN decapsulation
- Policy enforcement occurs at ingress and egress leaf switches
7. Hardware Components
- Nexus 9000 series switches
- 9300 platform for leaf switches
- 9500 platform for spine switches
- APIC appliances or virtual machines
8. Key Protocols and Technologies
- LLDP (Link Layer Discovery Protocol): Neighbor discovery
- CDP (Cisco Discovery Protocol): Cisco-specific neighbor discovery
- COOP (Council of Oracle Protocol): Endpoint location distribution
- MP-BGP EVPN: For multi-site deployments
9. Multicast
- Uses a modified version of PIM BiDir (Bidirectional Protocol Independent Multicast)
- Optimized for the leaf-spine architecture
10. Quality of Service (QoS)
- Implemented through fabric QoS classes (priority levels) and custom QoS policies
- Policies can be applied at various levels (EPG, contract, etc.)
Understanding these network-centric aspects is crucial for effectively designing, implementing, and troubleshooting an ACI fabric.
Here’s an outline of all the topics we’ve discussed during our conversation:
1. Cisco APIC Overview
- Basic Concepts: Introduction to the Cisco APIC (Application Policy Infrastructure Controller) and its role in Cisco ACI.
- APIC Architecture: Overview of tenants, EPGs (Endpoint Groups), and contracts in Cisco ACI.
- Network Abstraction and Centralized Policy Management: How APIC abstracts the network and applies policies across the fabric.
2. Endpoint Groups (EPGs)
- Definition and Purpose: Logical grouping of endpoints (servers, VMs, containers) that share the same policies.
- EPG Example: Example of an EPG for a three-tier web application (Web, App, and Database EPGs).
- Communication Between EPGs: Using contracts to control traffic between EPGs and enforce policies.
3. Tenants in Cisco ACI
- Tenant Overview: Explanation of tenants as logical containers that provide isolation between different network segments.
- Types of Tenants:
- Common Tenant: Shared services across the entire fabric.
- Infrastructure Tenant: Used for fabric-level configurations.
- User Tenants: Representing departments, applications, or business units.
- Example of Tenant Usage: Different departments with isolated network and security policies.
4. Contracts in Cisco ACI
- Purpose: Contracts define rules for communication between EPGs.
- Step-by-Step Guide for Creating Contracts:
- How to create a contract, subjects, and filters.
- Attaching contracts to EPGs (providing and consuming contracts).
- Example of Contract: Setting up HTTP traffic between Web and App EPGs.
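The contract, subject, and filter relationship can be sketched as the JSON objects that would sit under a tenant. As with any hand-built payload, the class and attribute names (`vzBrCP`, `vzSubj`, `vzFilter`, `vzEntry`, `vzRsSubjFiltAtt`, `fvRsProv`, `fvRsCons`) are given from memory of the ACI object model and should be verified for your release; the subject and filter names are hypothetical.

```python
import json


def http_contract_payload(contract: str, subject: str, flt: str) -> list:
    """Return tenant children: one filter matching TCP/80 and one contract using it."""
    filter_obj = {
        "vzFilter": {
            "attributes": {"name": flt},
            "children": [
                {"vzEntry": {"attributes": {
                    "name": "http", "etherT": "ip", "prot": "tcp",
                    "dFromPort": "80", "dToPort": "80",
                }}}
            ],
        }
    }
    contract_obj = {
        "vzBrCP": {
            "attributes": {"name": contract},
            "children": [
                {"vzSubj": {
                    "attributes": {"name": subject},
                    "children": [
                        # The subject references the filter by name.
                        {"vzRsSubjFiltAtt": {"attributes": {"tnVzFilterName": flt}}}
                    ],
                }}
            ],
        }
    }
    return [filter_obj, contract_obj]


def epg_binding(contract: str, role: str) -> dict:
    """Return the EPG child object that provides or consumes a contract."""
    cls = {"provide": "fvRsProv", "consume": "fvRsCons"}[role]
    return {cls: {"attributes": {"tnVzBrCPName": contract}}}


children = http_contract_payload("Web-to-App", "http-subject", "allow-http")
print(json.dumps(children, indent=2))
print(epg_binding("Web-to-App", "provide"))  # would be attached under the App EPG
print(epg_binding("Web-to-App", "consume"))  # would be attached under the Web EPG
```

Keeping the filter separate from the contract lets several subjects reuse the same port definition.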
5. Monitoring Contracts
- APIC GUI Monitoring: Monitoring contracts in the APIC GUI and tracking communication between EPGs.
- CLI Monitoring: Using CLI commands to check contract usage, traffic, and faults.
- REST API for Monitoring: Programmatically monitor contract stats using the ACI REST API.
- SNMP and Syslog: Configuring SNMP traps and Syslog for external monitoring and logging.
6. Viewing and Exporting Faults
- Viewing Faults:
- APIC GUI: Viewing active and historical faults in the APIC interface.
- CLI: Checking fault details using CLI commands.
- REST API: Retrieving faults via API for automation and integration.
- Exporting Fault Logs: Steps to export fault logs to CSV or JSON formats for analysis and sharing.
- Using Syslog and SNMP: Sending faults to a Syslog server or SNMP traps for centralized monitoring.
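Retrieving faults over the REST API and exporting them to CSV could look roughly like the sketch below. It only builds the query URL and converts a response body; authentication and the HTTP call are left out. The `faultInst` class query and the `imdata` response wrapper follow the usual APIC REST conventions as I understand them, while the hostname and the sample fault are invented purely to exercise the parser.

```python
import csv
import io
import json
from urllib.parse import quote


def fault_query_url(apic: str, severity: str) -> str:
    """Build a class query for faultInst objects filtered by severity."""
    flt = f'eq(faultInst.severity,"{severity}")'
    return f"https://{apic}/api/node/class/faultInst.json?query-target-filter={quote(flt)}"


def faults_to_csv(response_text: str, columns=("code", "severity", "dn", "descr")) -> str:
    """Flatten the imdata list of an APIC-style JSON response into CSV text."""
    imdata = json.loads(response_text).get("imdata", [])
    out = io.StringIO()
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(columns)
    for item in imdata:
        attrs = item["faultInst"]["attributes"]
        writer.writerow([attrs.get(col, "") for col in columns])
    return out.getvalue()


# Made-up sample response; the fault code and DN are placeholders, not real data.
sample = json.dumps({"totalCount": "1", "imdata": [{"faultInst": {"attributes": {
    "code": "F0000", "severity": "major",
    "dn": "topology/pod-1/node-101/example", "descr": "example fault",
}}}]})

print(fault_query_url("apic.example.com", "major"))
print(faults_to_csv(sample))
```

The same flattened rows could be written to a file or forwarded to a ticketing system; CSV is used here only because it matches the export format mentioned above.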
7. Resolving Faults
- Identifying Faults: How to analyze fault details like severity, cause, and affected object.
- Common Faults and Resolutions:
- Interface Down or Flapping: Troubleshooting physical and configuration issues.
- vPC Issues: Fixing vPC pairing and port-channel policy mismatches (ACI vPC peers communicate over the fabric rather than a dedicated peer-link).
- Contract Denied or Misconfigured: Resolving issues with blocked traffic due to incorrect contracts.
- Node Unreachable: Rebooting or reconfiguring unreachable fabric nodes.
- Configuration Out of Sync: Re-syncing fabric configurations with the APIC.
8. Clearing Faults
- Clearing Faults via APIC GUI: Acknowledge or clear faults from the APIC interface.
- Clearing Faults via CLI: Manually clearing faults using CLI commands.
- REST API for Clearing Faults: Using the API to programmatically clear faults.
- Best Practices for Clearing Faults: Ensure issues are resolved before clearing faults.
9. Major Faults in Cisco ACI
- Common Causes of Major Faults:
- Misconfigured Contracts and Filters: Issues with denied traffic.
- Interface or Port Issues: Speed/duplex mismatches, down interfaces.
- vPC Misconfiguration: Mismatched vPC domain or interface policies between the two peer leaf switches.
- Misconfigured Fabric Policies: Problems with access policies or QoS settings.
- Node Resource Utilization: High CPU or memory utilization.
- Configuration Out of Sync: Mismatch between APIC and fabric node configurations.
- Reachability Issues: APIC or fabric node connectivity problems.
- Firmware Bugs: Issues introduced by software bugs.
10. Updating Cisco ACI Firmware
- Step-by-Step Firmware Update Process:
- Pre-Upgrade: Download firmware, back up configuration, verify compatibility.
- Upgrade APIC Controllers: Perform a rolling upgrade of APIC controllers.
- Upgrade Fabric Nodes: Upgrade leaf and spine switches in separate groups so that redundant nodes are never reloaded together, which minimizes downtime.
- Post-Upgrade: Verify versions, check health, and resolve faults.
11. Common Issues During ACI Firmware Upgrade
- Fabric Nodes Failing to Upgrade: Causes include insufficient disk space, corrupted firmware, or connectivity issues.
- APIC Cluster Quorum Loss: Loss of connectivity or sync issues during the APIC upgrade.
- vPC Inconsistencies: vPC configuration mismatches between peer leaf switches after the upgrade.
- Connectivity Issues Post-Upgrade: Traffic loss due to policy enforcement problems or stale ARP/MAC entries.
- Node Reboot Loops: Continuous reboot cycles caused by firmware or hardware failures.
12. Best Practices for Cisco ACI Firmware Upgrades
- Pre-Upgrade:
- Validate compatibility and upgrade path.
- Back up the configuration and schedule a maintenance window.
- Test in a lab environment.
- During the Upgrade:
- Upgrade APIC controllers first.
- Upgrade switches in groups so that redundant nodes stay in service.
- Monitor system health and logs.
- Post-Upgrade:
- Verify all nodes are upgraded.
- Monitor health and faults.
- Re-check connectivity and policies.
- Take a post-upgrade configuration backup.
Re-syncing nodes in Cisco ACI ensures that any configuration discrepancies between the APIC controller and the fabric nodes (leaf or spine switches) are corrected. A re-sync forces the APIC to re-push the configuration to a node to ensure that the fabric nodes are in sync with the intended policies, contracts, and other configurations.
Re-syncing nodes can be necessary when there are configuration out-of-sync faults, node reachability issues, or after performing firmware upgrades to ensure that all configurations have been applied properly.
Here’s how to re-sync nodes in Cisco ACI using both the APIC GUI and CLI.
1. Re-sync Nodes via APIC GUI
Step-by-Step Process:
- Log into the APIC GUI:
  - Open your browser and log into the APIC using your credentials.
- Navigate to the Fabric Membership:
  - On the left-hand menu, navigate to Fabric > Inventory > Fabric Membership.
  - This page displays all the fabric nodes (both leaf and spine switches) and their status.
- Check for Out-of-Sync Nodes:
  - Look for any faults related to configuration out-of-sync issues.
  - You may notice specific out-of-sync faults for nodes that need to be re-synchronized with the APIC.
- Select the Node to Re-sync:
  - In the Fabric Membership page, locate the node (leaf or spine) you want to re-sync.
  - Right-click on the node or click on the node's options menu (three dots) next to the node’s name.
- Re-sync the Node:
  - Select Re-sync Config from the dropdown menu.
  - This will force the APIC to re-apply the current configuration to the selected node.
- Monitor the Re-sync Process:
  - After initiating the re-sync, you can monitor the status of the process.
  - Check for any faults or issues that may arise during the re-sync.
  - Once complete, verify that the node is healthy and synchronized by checking its health score and ensuring that there are no out-of-sync configuration errors.
2. Re-sync Nodes via CLI
Step-by-Step Process:
- Access the APIC CLI:
  - SSH into your APIC controller using the following command:
    ssh admin@<APIC-IP>
- List the Fabric Nodes:
  - To see the current fabric nodes (leaf and spine switches) and their IDs, run:
    show fabric membership
  - This will list all the nodes in the fabric and their Node ID.
- Re-sync a Specific Node:
  - To re-sync a specific node, use the following command:
    fabric re-sync node <node-id>
  - Replace <node-id> with the actual ID of the node you want to re-sync, which you obtained from the previous step.
- Monitor the Re-sync Process:
  - After issuing the command, the APIC will push the configuration to the node and attempt to bring it in sync.
  - Use the following command to monitor the node's synchronization status and check for faults:
    show fault
- Verify Node Sync Status:
  - Once the re-sync is complete, ensure the node is healthy and that there are no out-of-sync errors by running:
    show fabric membership
3. When to Re-sync Nodes?
Re-syncing nodes is commonly needed in the following situations:
- Configuration Out-of-Sync Faults: When the configuration on the node and APIC doesn’t match, re-syncing can resolve the mismatch.
- Node Not Responding to Policy Changes: If policy changes or new configurations are not reflected on a node, re-syncing forces the node to apply them.
- After Firmware Upgrades: After upgrading the firmware of APIC controllers or fabric nodes, re-syncing ensures that all configuration updates are properly applied.
- After Node Reboots: If a node was rebooted or brought back online, re-syncing ensures it fully re-joins the fabric with the correct configuration.
4. Best Practices for Re-syncing Nodes
- Monitor Health and Faults: Always monitor the node’s health and check for faults both before and after the re-sync. This will help identify potential issues that could prevent the node from re-syncing properly.
- Re-sync During Maintenance Windows: If possible, perform re-syncs during a maintenance window, especially if you're working with critical nodes, to minimize the impact on network performance.
- Avoid Re-syncing Multiple Nodes at Once: Re-sync nodes one at a time to avoid overwhelming the APIC and fabric with multiple configuration pushes at once.
It sounds like you're ready to consolidate all the detailed information we've covered into a single, comprehensive artifact. Here’s a suggested outline to summarize everything about Cisco ACI firmware upgrades, focusing on best practices, common issues, and the step-by-step process.
Artifact: Cisco ACI Firmware Upgrade Guide
1. Introduction
- Purpose: This guide outlines the recommended practices and step-by-step instructions for upgrading the firmware of Cisco ACI, including APIC controllers and Nexus leaf/spine switches. It also covers common issues, troubleshooting tips, and how to ensure a smooth and successful upgrade.
- Audience: Network engineers, administrators, and IT professionals responsible for managing and upgrading Cisco ACI infrastructure.
2. Pre-Upgrade Planning
Key Preparations Before Upgrading:
- Back Up the Configuration: Always back up the ACI fabric configuration before starting the upgrade. Navigate to Admin > Import/Export > Config Export in the APIC GUI.
- Understand Compatibility: Review the ACI compatibility matrix and release notes to ensure that the APIC and Nexus switches can be upgraded to the target version.
- Review the Upgrade Path: Ensure you're following the correct upgrade path, especially when moving between major versions. Some versions may require intermediate upgrades.
- Check Disk Space: Confirm that APIC controllers and Nexus switches have adequate disk space for the upgrade files, using the show system internal flash command on switches.
- Test in a Lab Environment: If possible, simulate the upgrade in a test environment to identify potential issues.
- Schedule a Maintenance Window: Plan for downtime, notify stakeholders, and ensure that the upgrade is performed during a low-traffic period.
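Working out whether intermediate upgrades are needed is a shortest-path search over the supported upgrade steps. The sketch below shows the idea with a breadth-first search; the release names and the `SUPPORTED_STEPS` table are placeholders, and real paths must come from Cisco's upgrade matrix and release notes.

```python
from collections import deque

# Hypothetical upgrade graph: key -> releases reachable in one supported step.
# Real paths must come from Cisco's upgrade matrix, not from this table.
SUPPORTED_STEPS = {
    "rel-A": ["rel-B"],
    "rel-B": ["rel-C", "rel-D"],
    "rel-C": ["rel-D"],
    "rel-D": [],
}


def upgrade_path(graph: dict, current: str, target: str):
    """Breadth-first search for the shortest chain of supported upgrade steps."""
    queue = deque([[current]])
    seen = {current}
    while queue:
        path = queue.popleft()
        if path[-1] == target:
            return path
        for nxt in graph.get(path[-1], []):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(path + [nxt])
    return None  # no supported path exists


print(upgrade_path(SUPPORTED_STEPS, "rel-A", "rel-D"))
```

In this made-up table there is no direct step from rel-A to rel-D, so the search returns a path through rel-B as the required intermediate release.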
3. Step-by-Step Upgrade Process
a. Download Firmware:
- Download the firmware packages for APIC controllers and Nexus switches (leaf and spine) from the Cisco Software Download Portal.
- Upload the firmware to the APIC by navigating to Admin > Firmware > Firmware Repository.
b. APIC Controller Upgrade:
- Navigate to Admin > Firmware > Infrastructure Firmware.
- Start a rolling upgrade by selecting Upgrade Now or scheduling the upgrade.
- Upgrade the APICs one by one to maintain cluster quorum.
- Monitor the upgrade process and verify the firmware version after each APIC has been upgraded.
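The reason for upgrading one controller at a time can be shown with simple majority arithmetic. This is a simplification: it assumes quorum means a strict majority of cluster members, whereas the APIC cluster's actual data-replication behavior is more involved. The function names are invented for the example.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of members that forms a strict majority."""
    return cluster_size // 2 + 1


def can_take_offline(cluster_size: int, already_down: int = 0) -> bool:
    """True if one more controller can go down while a majority stays up."""
    remaining = cluster_size - already_down - 1
    return remaining >= majority(cluster_size)


for size in (3, 5):
    print(size, majority(size), can_take_offline(size))
print(can_take_offline(3, already_down=1))
```

With three controllers, taking one offline leaves two, which is still a majority; taking a second one down at the same time would not.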
c. Nexus Leaf and Spine Switch Upgrade:
- Start by upgrading spine nodes first, then leaf nodes.
- ACI-mode switches reload during an upgrade, so place redundant nodes (for example, the two members of a vPC pair) in different upgrade groups to minimize downtime.
- Monitor the upgrade progress in Admin > Firmware > Infrastructure Firmware.
- Verify that all fabric nodes are running the correct firmware version after the upgrade using the show version command.
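The ordering described in this step (spines before leaves, in small batches) can be planned ahead of time. The sketch below additionally splits each role into odd and even node IDs so that neighbouring nodes do not land in the same batch; that split assumes redundant pairs have consecutive IDs, which may not hold in your fabric. The inventory and node IDs are hypothetical.

```python
def plan_batches(nodes: dict, batch_size: int) -> list:
    """Order nodes spines-first, then leaves, split by ID parity, then batched.

    `nodes` maps a node ID to its role ("spine" or "leaf").
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batches = []
    for role in ("spine", "leaf"):
        ids = sorted(node_id for node_id, r in nodes.items() if r == role)
        for parity in (1, 0):  # odd IDs first, then even IDs
            group = [n for n in ids if n % 2 == parity]
            for start in range(0, len(group), batch_size):
                batches.append((role, group[start:start + batch_size]))
    return batches


# Hypothetical inventory: node IDs are illustrative only.
inventory = {201: "spine", 202: "spine", 101: "leaf", 102: "leaf", 103: "leaf"}
for role, ids in plan_batches(inventory, batch_size=2):
    print(role, ids)
```

Each batch would be upgraded and verified before the next one starts, so a failure affects at most one batch.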
4. Post-Upgrade Actions
a. Verify the Firmware Versions:
- Use the APIC GUI or the show version command on switches to ensure all nodes are running the correct firmware.
b. Health Checks:
- Monitor the overall health of the fabric in Fabric > Inventory > Fabric Membership.
- Check for new faults under Monitoring > Faults and resolve any major or critical issues.
c. Policy and Connectivity Validation:
- Test critical applications and network policies to ensure EPGs and contracts are working as expected. Use connectivity tests (e.g., ping, traceroute) between endpoints.
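Ping and traceroute only prove IP reachability; a contract that permits a specific port is better checked with a TCP connection to that port. A minimal sketch follows, using only the standard library. It would need to run from a host inside the consumer EPG, and the labels are hypothetical; the self-test uses a throwaway local listener so the script runs anywhere.

```python
import socket


def tcp_reachable(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def run_checks(checks: list) -> dict:
    """Run each (label, host, port) check and map the label to its result."""
    return {label: tcp_reachable(host, port) for label, host, port in checks}


# Self-test against a temporary local listener instead of real endpoints.
with socket.socket() as server:
    server.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
    server.listen(1)
    port = server.getsockname()[1]
    print(run_checks([("local-listener", "127.0.0.1", port)]))
```

In practice the check list would hold one entry per contract, such as the application port between the web and app tiers, and the results would be compared before and after the upgrade.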
d. Post-Upgrade Backup:
- After verifying the upgrade, create a new backup of the ACI configuration using Admin > Import/Export > Config Export.
5. Common Upgrade Issues and Resolutions
a. Fabric Nodes Failing to Upgrade:
- Symptoms: Leaf or spine switches remain on the old firmware version.
- Resolution: Check for insufficient disk space or upload the firmware again. Ensure the correct upgrade path is followed.
b. APIC Cluster Quorum Loss:
- Symptoms: One or more APIC controllers fail to rejoin the cluster.
- Resolution: Ensure that out-of-band management is properly configured. Reboot APICs or re-sync the database.
c. VPC Inconsistencies:
- Symptoms: Virtual Port Channels stop functioning after the upgrade.
- Resolution: Review the vPC configuration and confirm that both peer leaf switches are reachable over the fabric. Re-apply or reconfigure vPC settings if necessary.
d. Connectivity Issues:
- Symptoms: Endpoints lose connectivity after the upgrade.
- Resolution: Check for stale ARP/MAC entries, clear them if necessary, and verify contract and policy enforcement.
e. Fabric Node Reboot Loops:
- Symptoms: Nodes repeatedly reboot after the upgrade.
- Resolution: Reload firmware manually or replace faulty hardware if needed.
6. Best Practices for ACI Firmware Upgrades
a. Upgrade APIC Controllers First: Always upgrade the APICs before fabric nodes, maintaining cluster quorum.
b. Upgrade Switches in Groups: ACI-mode switches reload during an upgrade, so split redundant nodes across separate upgrade groups to minimize disruption.
c. Upgrade in Phases: For large environments, upgrade nodes in small batches.
d. Monitor System Health: Continuously monitor the health of the system during and after the upgrade, watching for critical faults or performance degradation.
e. Review Release Notes and Known Bugs: Stay informed about potential issues with the firmware version by reviewing Cisco's release notes and bug tracker.
7. Conclusion
- Summary: Upgrading Cisco ACI firmware is essential for ensuring a secure and stable infrastructure. Following best practices and carefully monitoring the process helps mitigate risks, reduce downtime, and maintain network continuity.
- Further Support: Always consult Cisco’s technical documentation and reach out to Cisco TAC for assistance if any issues arise during the upgrade process.
Appendices
- ACI Firmware Compatibility Matrix: (Insert link or reference to the Cisco matrix)
- Useful CLI Commands:
  - Check current firmware version:
    show version
  - Verify cluster health:
    acidiag health
  - Check disk space on switches:
    show system internal flash
  - Re-sync fabric configuration:
    fabric re-sync node <node-id>