Understood. I'll create comprehensive artifacts for each of the seven main topics you've outlined. This will provide you with a thorough reference for your interview preparation.

1. Kubernetes (k8s) Fundamentals:

# Kubernetes Fundamentals

## Core Components

### Pods
- Smallest deployable units in Kubernetes
- Can contain one or more containers
- Containers in a Pod share a network namespace and storage
- Ephemeral by nature; can be replaced at any time
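
A minimal multi-container Pod illustrating these points (names and images are placeholders):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: example-pod          # hypothetical name
spec:
  containers:
  - name: app
    image: nginx:1.25
  - name: sidecar
    image: busybox:1.36
    command: ["sh", "-c", "sleep infinity"]
    # both containers share the Pod's network namespace (localhost)
    # and can mount the same volumes
```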

### Deployments
- Manage ReplicaSets
- Provide declarative updates for Pods
- Support rolling updates and rollbacks
- Ensure the desired number of Pod replicas is running
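
A minimal Deployment sketch (name, labels, and image are placeholders):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: example-deployment
spec:
  replicas: 3                # desired number of Pod replicas
  selector:
    matchLabels:
      app: example
  template:                  # Pod template, managed via a ReplicaSet
    metadata:
      labels:
        app: example
    spec:
      containers:
      - name: app
        image: nginx:1.25
```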

### Services
- Expose Pods as network services
- Types:
  - ClusterIP: internal cluster communication (the default)
  - NodePort: exposes the service on each Node's IP at a static port
  - LoadBalancer: exposes the service externally using a cloud provider's load balancer
  - ExternalName: maps the service to a DNS name
- Provide service discovery and load balancing
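
A minimal ClusterIP Service sketch, assuming Pods labeled `app: example`:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: example-service
spec:
  type: ClusterIP            # default; swap for NodePort/LoadBalancer as needed
  selector:
    app: example             # routes traffic to Pods carrying this label
  ports:
  - port: 80                 # port the Service listens on
    targetPort: 8080         # port the container listens on
```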

### ConfigMaps and Secrets
- Store configuration data and sensitive information
- Can be mounted as volumes or exposed as environment variables
- ConfigMaps: for non-sensitive configuration data
- Secrets: for sensitive data (base64-encoded, not encrypted by default)
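
A sketch of a ConfigMap consumed as environment variables (names are placeholders):

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config           # hypothetical name
data:
  LOG_LEVEL: "info"
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
  - name: app
    image: nginx:1.25
    envFrom:
    - configMapRef:
        name: app-config     # exposes LOG_LEVEL as an env var in the container
```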

## Namespace Management
- Logical partitions within a cluster
- Provide resource isolation and access control
- Quota management per namespace (ResourceQuota)
- Default namespaces: default, kube-system, kube-public, kube-node-lease

## Resource Management
- Requests: minimum resources guaranteed to a container
- Limits: maximum resources a container may use
- CPU measured in cores or millicores; memory in bytes (Mi, Gi)
- Quality of Service (QoS) classes: Guaranteed, Burstable, BestEffort
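
A resources stanza sketch showing requests and limits, and how they map to QoS classes:

```yaml
spec:
  containers:
  - name: app
    image: nginx:1.25
    resources:
      requests:
        cpu: 250m            # 0.25 core guaranteed at scheduling time
        memory: 128Mi
      limits:
        cpu: 500m            # CPU is throttled above this
        memory: 256Mi        # container is OOM-killed above this
# requests == limits for every container  -> Guaranteed QoS
# requests  < limits (or only requests)   -> Burstable
# neither set                             -> BestEffort
```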

## Persistent Volumes and Claims
- PersistentVolumes (PV): cluster-wide storage resources
- PersistentVolumeClaims (PVC): storage requests made by Pods
- StorageClasses: enable dynamic provisioning of PVs
- Reclaim policies: Retain, Delete, Recycle (Recycle is deprecated)
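
A minimal PVC sketch (the StorageClass name depends on the cluster; `standard` is a common default and assumed here):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-claim
spec:
  accessModes:
  - ReadWriteOnce              # mountable read-write by a single node
  storageClassName: standard   # triggers dynamic provisioning if available
  resources:
    requests:
      storage: 10Gi
```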

## Advanced Concepts
- StatefulSets: for stateful applications
- DaemonSets: ensure all (or some) nodes run a copy of a Pod
- Jobs and CronJobs: for batch and scheduled tasks
- Ingress: HTTP(S) routing to services
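
A minimal Ingress sketch routing a hostname to a Service (host and Service name are placeholders; an Ingress controller must be installed for this to take effect):

```yaml
apiVersion: networking.k8s.io/v1
kind: Ingress
metadata:
  name: example-ingress
spec:
  rules:
  - host: app.example.com      # hypothetical hostname
    http:
      paths:
      - path: /
        pathType: Prefix
        backend:
          service:
            name: example-service
            port:
              number: 80
```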

2. Deployment Types in Kubernetes:

# Kubernetes Deployment Types

## Rolling Updates
- Default deployment strategy in Kubernetes
- Gradually replace old Pods with new ones
- Controlled with the `maxSurge` and `maxUnavailable` parameters
- Pros: zero downtime, controlled rollout
- Cons: both versions coexist during the update, so they must be compatible

Implementation:

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%          # extra Pods allowed above the desired count
      maxUnavailable: 25%    # Pods that may be unavailable during the update
```

## Blue/Green Deployments
- Two identical environments: blue (current) and green (new)
- Switch traffic all at once by updating a Service
- Easy rollback by switching back to the old version
- Pros: instant cutover, easy rollback
- Cons: resource-intensive; requires double the resources

Implementation:

1. Deploy the new version with a different label
2. Test the new deployment
3. Update the Service selector to point to the new version
4. Remove the old deployment once the new version is stable
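
Step 3 can be sketched as a Service whose selector carries a version label, assuming the two Deployments label their Pods `version: blue` and `version: green`:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
    version: green       # was "blue"; changing this switches all traffic at once
  ports:
  - port: 80
    targetPort: 8080
```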

## Canary Deployments
- Release to a subset of users or servers
- Gradually increase traffic to the new version
- Ideal for real-world testing and risk mitigation
- Pros: controlled exposure, real user testing
- Cons: complexity in managing the traffic split

Implementation:

1. Deploy both versions
2. Create a Service that targets both versions
3. Control the traffic split using an Ingress or a service mesh (e.g., Istio)

Example using Istio:

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-service
spec:
  hosts:
  - my-service
  http:
  - route:
    - destination:
        host: my-service-v1
      weight: 90
    - destination:
        host: my-service-v2
      weight: 10
```

## Choosing a Deployment Strategy
- Rolling Update: for most applications, especially stateless ones
- Blue/Green: when you need atomic updates or easy rollbacks
- Canary: for high-risk changes or when you need gradual user exposure

Best Practices:
- Use readiness probes to ensure only healthy Pods receive traffic
- Implement proper monitoring and logging for all deployment types
- Automate deployments using CI/CD pipelines
- Practice rollbacks to ensure they work as expected

3. NATS in Kubernetes:


# NATS in Kubernetes

## NATS Architecture
- Distributed messaging system
- Publish-subscribe and request-reply messaging patterns
- Core NATS: at-most-once delivery, no persistence
- NATS Streaming (STAN): at-least-once delivery with persistence (legacy; superseded by JetStream)
- Key components:
  - Subjects: hierarchical naming for messages
  - Queue groups: load balancing across subscribers
  - Clusters: groups of NATS servers

## Deploying NATS on Kubernetes

1. Core NATS Deployment:
   - Use a StatefulSet for ordered, unique network identifiers
   - Headless Service for peer discovery
   - ConfigMap for NATS server configuration
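
A headless Service sketch for the peer discovery mentioned above (matches the `app: nats` label used in the StatefulSet example below):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: nats
spec:
  clusterIP: None        # headless: DNS returns individual Pod IPs for discovery
  selector:
    app: nats
  ports:
  - port: 4222
    name: client
  - port: 6222
    name: cluster
  - port: 8222
    name: monitor
```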

Example NATS StatefulSet:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nats
spec:
  serviceName: "nats"
  replicas: 3
  selector:
    matchLabels:
      app: nats
  template:
    metadata:
      labels:
        app: nats
    spec:
      containers:
      - name: nats
        image: nats:2.1.9-alpine3.12
        ports:
        - containerPort: 4222
          name: client
        - containerPort: 6222
          name: cluster
        - containerPort: 8222
          name: monitor
        command:
        - "nats-server"
        - "--config"
        - "/etc/nats-config/nats.conf"
```

2. NATS Streaming Deployment:
   - Similar to core NATS, but with additional configuration
   - Use persistent volumes for message storage

## NATS Streaming (Persistence)
- Provides message persistence and guaranteed delivery
- Features:
  - Durable subscriptions
  - Message replay
  - Last-value semantics
- Configuration options:
  - File-based storage
  - SQL database storage (MySQL, PostgreSQL)

NATS Streaming configuration example:

```conf
streaming {
  id: "nats-streaming-cluster"
  store: "file"
  dir: "/data/nats-streaming"
  channels {
    max_inactivity: "24h"
  }
}
```

## Best Practices for NATS in Kubernetes

1. Use anti-affinity rules to spread NATS servers across nodes
2. Implement proper monitoring (e.g., Prometheus metrics)
3. Use NetworkPolicies to control access to NATS servers
4. Regularly back up NATS Streaming data
5. Implement proper authentication and authorization
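
The anti-affinity rule in point 1 can be sketched as a stanza in the StatefulSet's Pod template:

```yaml
spec:
  affinity:
    podAntiAffinity:
      requiredDuringSchedulingIgnoredDuringExecution:
      - labelSelector:
          matchLabels:
            app: nats
        topologyKey: kubernetes.io/hostname   # at most one NATS Pod per node
```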

## Scaling NATS in Kubernetes
- Horizontal scaling: increase the number of NATS server instances
- Vertical scaling: increase resources for NATS server Pods
- Use a NATS supercluster for multi-cluster setups

## Troubleshooting NATS in Kubernetes

1. Check Pod logs: `kubectl logs <pod-name>`
2. Verify network connectivity between Pods
3. Use NATS CLI tools for debugging (nats-top, nats-bench)
4. Monitor NATS server metrics through the monitoring endpoint

4. Event-Driven Architecture (EDA) Concepts:


# Event-Driven Architecture (EDA) Concepts

## Event Producers and Consumers

### Event Producers
- Components that generate events
- Examples: user actions, system state changes, IoT devices
- Responsibilities:
  - Create events in a standardized format
  - Publish events to appropriate channels or topics

### Event Consumers
- Components that react to events
- Examples: microservices, analytics systems, notification services
- Responsibilities:
  - Subscribe to relevant event channels or topics
  - Process events and perform actions

## Publish-Subscribe Pattern
- Core communication pattern in EDA
- Characteristics:
  - Decoupled communication between producers and consumers
  - One-to-many distribution of events
  - Asynchronous processing

### Topics or Channels
- Categorize events for efficient routing
- Allow fine-grained subscription by consumers
- Hierarchical naming is often used (e.g., "orders.created", "users.updated")

### Benefits

1. Scalability: easy to add new producers or consumers
2. Flexibility: systems can evolve independently
3. Resilience: a failure in one consumer does not block producers or other consumers

## Message Queues vs. Streaming

### Message Queues
- Characteristics:
  - Often used for task distribution
  - Competing-consumers model
  - Messages typically removed after processing
- Use cases:
  - Work queue processing
  - Load balancing tasks across workers

### Streaming
- Characteristics:
  - Maintains an ordered log of events
  - Allows multiple consumers to read the same events
  - Supports event replay and time-based access
- Use cases:
  - Event sourcing
  - Audit trails
  - Real-time analytics

## Key EDA Patterns

1. Event Notification: events carry a minimal payload; consumers fetch details if needed
2. Event-Carried State Transfer: events contain the relevant state data
3. Event Sourcing: store state changes as a sequence of events
4. CQRS (Command Query Responsibility Segregation): separate read and write models

## Challenges in EDA

1. Eventual Consistency: managing data consistency across services
2. Event Schema Evolution: handling changes in event structure over time
3. Error Handling and Dead Letter Queues: managing failed event processing
4. Monitoring and Tracing: tracking event flow across distributed systems

## EDA in Kubernetes
- Use message brokers like Kafka, NATS, or RabbitMQ
- Leverage Kubernetes features:
  - StatefulSets for stateful messaging systems
  - Services for service discovery
  - ConfigMaps and Secrets for configuration

## Best Practices

1. Design clear and well-documented event schemas
2. Implement idempotent event consumers
3. Use correlation IDs for tracing events across services
4. Implement proper error handling and retry mechanisms
5. Monitor system health and performance metrics

5. Kafka Basics:


# Kafka Basics

## Core Concepts

### Topics
- Categorize streams of records
- Similar to tables in a database
- Can have multiple partitions for parallelism
- Retention policies define how long data is kept

### Partitions
- Ordered, immutable sequence of records
- Each record is assigned a sequential offset
- Enable parallel processing and scalability
- Distributed across Kafka brokers

### Offsets
- Unique identifier for each record within a partition
- Consumers track their position using offsets
- Allow consumers to replay data from a specific point

## Producers and Consumers

### Producers
- Publish records to topics
- Can choose the partition for each record
- Support asynchronous and synchronous sending
- Configurable acknowledgment levels (acks)

### Consumers
- Read records from topics
- Organized into consumer groups for scalability
- Each partition is consumed by only one consumer within a group
- Offset management: automatic or manual commit

## Kafka Streams

- Library for building streaming applications
- Stateful and stateless processing
- Exactly-once semantics
- Supports windowing operations
- Key abstractions:
  - KStream: a record stream, where each record is an independent event
  - KTable: a changelog stream, keeping the latest value per key
  - GlobalKTable: a fully replicated table, useful for enrichment and joins

## Kafka Connect
- Framework for connecting Kafka with external systems
- Source connectors: import data from external systems
- Sink connectors: export data to external systems
- Distributed mode for scalability

## Key Features

- High throughput and low latency
- Scalability: can handle trillions of events a day
- Durability: persists messages on disk
- Fault tolerance: replication across multiple brokers
- Exactly-once semantics (since version 0.11)

## Use Cases

- Messaging system
- Activity tracking
- Metrics aggregation from various sources
- Application log aggregation
- Stream processing
- Event sourcing
- Commit log service

## Kafka in Kubernetes
- Use a Kafka operator (e.g., Strimzi) for deployment and management
- StatefulSets for Kafka brokers
- Headless Services for broker discovery
- Configure a proper storage class for persistence
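
With an operator such as Strimzi installed (an assumption here, not something the notes above mandate), a topic can be declared as a custom resource instead of via CLI tooling; names and sizes below are placeholders:

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: my-topic
  labels:
    strimzi.io/cluster: my-cluster   # Kafka cluster the topic belongs to
spec:
  partitions: 3
  replicas: 3
  config:
    retention.ms: 604800000          # keep records for 7 days
```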

## Best Practices

1. Choose an appropriate partition count and replication factor
2. Implement proper monitoring and alerting
3. Tune producer and consumer configurations for performance
4. Implement proper security measures (authentication, authorization, encryption)
5. Regular maintenance: log compaction, cluster balancing

## Common Challenges

1. Topic management at scale
2. Handling large messages
3. Dealing with rebalancing in consumer groups
4. Ensuring exactly-once processing in stream applications
5. Disaster recovery planning

6. Troubleshooting in Kubernetes:


# Troubleshooting in Kubernetes

## Logging and Monitoring Tools

### Logging

1. `kubectl logs`
   - Basic logging: `kubectl logs <pod-name>`
   - Follow logs: `kubectl logs -f <pod-name>`
   - Logs from a previous instance: `kubectl logs <pod-name> --previous`

2. Centralized Logging
   - ELK Stack (Elasticsearch, Logstash, Kibana)
   - Fluentd + Elasticsearch + Kibana
   - Loki (part of the Grafana stack)

### Monitoring

1. Prometheus
   - Open-source monitoring and alerting toolkit
   - Pull-based metrics collection
   - PromQL for querying metrics

2. Grafana
   - Visualization platform for metrics
   - Supports multiple data sources (including Prometheus)
   - Customizable dashboards

3. Kubernetes Dashboard
   - Web-based UI for Kubernetes clusters
   - Overview of applications running on the cluster
   - Create and modify Kubernetes resources

## Debugging Pods and Services

### Pod Debugging

1. Describe the Pod
   - `kubectl describe pod <pod-name>`
   - Check events, status, and configuration

2. Exec into the Container
   - `kubectl exec -it <pod-name> -- /bin/sh`
   - Investigate from within the container

3. Port Forwarding
   - `kubectl port-forward <pod-name> 8080:80`
   - Access the Pod's port locally for debugging

### Service Debugging

1. Verify the Service Configuration
   - `kubectl get svc <service-name>`
   - `kubectl describe svc <service-name>`

2. Check Endpoints
   - `kubectl get endpoints <service-name>`
   - Ensure Pods are correctly registered

3. DNS Troubleshooting
   - Deploy a debug pod: `kubectl run -it --rm debug --image=busybox --restart=Never -- sh`
   - Use `nslookup <service-name>` from the debug pod to verify DNS resolution