Add tech_docs/review_k8s.md
1. Kubernetes (k8s) Fundamentals:
# Kubernetes Fundamentals

## Core Components

### Pods
- Smallest deployable units in Kubernetes
- Can contain one or more containers
- Containers in a Pod share the network namespace and can share storage volumes
- Ephemeral by nature; can be replaced at any time
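To make the shared namespace and storage concrete, here is a minimal sketch of a two-container Pod sharing an `emptyDir` volume (names and images are illustrative):

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: web-with-sidecar      # illustrative name
spec:
  volumes:
    - name: shared-logs
      emptyDir: {}            # scratch volume shared by both containers
  containers:
    - name: web
      image: nginx:1.25
      volumeMounts:
        - name: shared-logs
          mountPath: /var/log/nginx
    - name: log-tailer        # sidecar reading what the web container writes
      image: busybox:1.36
      command: ["sh", "-c", "tail -F /logs/access.log"]
      volumeMounts:
        - name: shared-logs
          mountPath: /logs
```

Because both containers share the Pod's network namespace, they can also reach each other over `localhost`.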

### Deployments
- Manage ReplicaSets
- Provide declarative updates for Pods
- Support rolling updates and rollbacks
- Ensure the desired number of Pod replicas is running
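A minimal Deployment manifest tying these points together (labels and image are illustrative):

```yaml
apiVersion: apps/v1
kind: Deployment
metadata:
  name: web
spec:
  replicas: 3                 # desired number of Pod replicas
  selector:
    matchLabels:
      app: web                # must match the Pod template labels
  template:
    metadata:
      labels:
        app: web
    spec:
      containers:
        - name: web
          image: nginx:1.25
          ports:
            - containerPort: 80
```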

### Services
- Expose Pods as network services
- Types:
  - ClusterIP: internal cluster communication (the default)
  - NodePort: exposes the Service on each Node's IP at a static port
  - LoadBalancer: exposes the Service externally using a cloud provider's load balancer
  - ExternalName: maps the Service to a DNS name
- Provide service discovery and load balancing
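For example, a ClusterIP Service selecting Pods by label might be sketched as (selector and ports are illustrative):

```yaml
apiVersion: v1
kind: Service
metadata:
  name: web
spec:
  type: ClusterIP            # default; may be omitted
  selector:
    app: web                 # routes to Pods carrying this label
  ports:
    - port: 80               # port the Service exposes
      targetPort: 80         # port on the container
```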

### ConfigMaps and Secrets
- Store configuration data and sensitive information, respectively
- Can be mounted as volumes or exposed as environment variables
- ConfigMaps: for non-sensitive configuration data
- Secrets: for sensitive data (base64-encoded, not encrypted by default)
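A sketch of both consumption mechanisms, assuming a hypothetical `app-config` ConfigMap and a pre-existing `db-credentials` Secret:

```yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: app-config
data:
  LOG_LEVEL: "info"
---
apiVersion: v1
kind: Pod
metadata:
  name: app
spec:
  containers:
    - name: app
      image: busybox:1.36
      command: ["sh", "-c", "env && sleep 3600"]
      envFrom:
        - configMapRef:
            name: app-config        # all keys become environment variables
      env:
        - name: DB_PASSWORD
          valueFrom:
            secretKeyRef:
              name: db-credentials  # hypothetical Secret
              key: password
```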

## Namespace Management
- Logical partitions within a cluster
- Provide resource isolation and access control
- Quota management per namespace
- Default namespaces: default, kube-system, kube-public (plus kube-node-lease in newer versions)
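Per-namespace quota management can be sketched with a ResourceQuota (namespace name and values are illustrative):

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: team-quota
  namespace: team-a          # hypothetical namespace
spec:
  hard:
    requests.cpu: "4"        # sum of CPU requests across the namespace
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
    pods: "20"               # maximum number of Pods
```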

## Resource Management
- Requests: minimum resources guaranteed to a container
- Limits: maximum resources a container may use
- CPU measured in cores or millicores; memory in bytes (with Ki/Mi/Gi suffixes)
- Quality of Service (QoS) classes: Guaranteed, Burstable, BestEffort
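Requests and limits are set per container; a sketch (values are illustrative):

```yaml
spec:
  containers:
    - name: app
      image: nginx:1.25
      resources:
        requests:
          cpu: 250m          # 0.25 cores guaranteed by the scheduler
          memory: 128Mi
        limits:
          cpu: 500m          # throttled above this
          memory: 256Mi      # OOM-killed above this
```

Setting requests equal to limits for every container yields the Guaranteed QoS class; requests below limits yields Burstable; specifying neither yields BestEffort.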

## Persistent Volumes and Claims
- PersistentVolumes (PV): cluster-wide storage resources
- PersistentVolumeClaims (PVC): storage requests made by Pods
- StorageClasses: enable dynamic provisioning of PVs
- Reclaim policies: Retain, Delete, Recycle (Recycle is deprecated in favor of dynamic provisioning)
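A PVC requesting dynamically provisioned storage might be sketched as (the StorageClass name is illustrative and assumed to exist in the cluster):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
    - ReadWriteOnce          # mountable read-write by a single node
  storageClassName: standard # assumed StorageClass
  resources:
    requests:
      storage: 10Gi
```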

## Advanced Concepts
- StatefulSets: for stateful applications needing stable identity and storage
- DaemonSets: ensure all (or selected) nodes run a copy of a Pod
- Jobs and CronJobs: for batch and scheduled tasks
- Ingress: HTTP(S) routing to Services

2. Deployment Types in Kubernetes:
# Kubernetes Deployment Types

## Rolling Updates
- Default deployment strategy in Kubernetes
- Gradually replaces old Pods with new ones
- Controlled with the `maxSurge` and `maxUnavailable` parameters
- Pros: zero downtime, controlled rollout
- Cons: both versions coexist during the update, which may require backward compatibility

Implementation:

```yaml
spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
```

## Blue/Green Deployments
- Two identical environments: blue (current) and green (new)
- Switch all traffic at once by updating a Service
- Easy rollback by switching back to the old version
- Pros: instant cutover, easy rollback
- Cons: resource-intensive; requires double the resources during the transition
Implementation:
|
||||
1. Deploy new version with a different label
|
||||
2. Test the new deployment
|
||||
3. Update the Service selector to point to the new version
|
||||
4. Remove old deployment if new version is stable
|
||||
|
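Step 3 amounts to patching the Service's selector; a sketch, assuming deployments labeled with an illustrative `version` label:

```yaml
apiVersion: v1
kind: Service
metadata:
  name: my-app
spec:
  selector:
    app: my-app
    version: green    # was "blue"; changing this switches all traffic at once
  ports:
    - port: 80
      targetPort: 8080
```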

## Canary Deployments
- Release to a subset of users or servers
- Gradually increase traffic to the new version
- Ideal for real-world testing and risk mitigation
- Pros: controlled exposure, real user testing
- Cons: complexity in managing the traffic split

Implementation:
1. Deploy both versions
2. Create a Service that targets both versions
3. Control the traffic split using an Ingress controller or a service mesh (e.g., Istio)

Example using Istio (note that `weight` sits at the same level as `destination` within each route entry):

```yaml
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-service
spec:
  hosts:
    - my-service
  http:
    - route:
        - destination:
            host: my-service-v1
          weight: 90
        - destination:
            host: my-service-v2
          weight: 10
```

## Choosing a Deployment Strategy
- Rolling Update: for most applications, especially stateless ones
- Blue/Green: when you need atomic updates or easy rollbacks
- Canary: for high-risk changes or when you need gradual user exposure

Best Practices:
- Use readiness probes to ensure only healthy Pods receive traffic
- Implement proper monitoring and logging for all deployment types
- Automate deployments using CI/CD pipelines
- Practice rollbacks to ensure they work as expected

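The readiness-probe practice above can be sketched as follows (the health endpoint path and timings are illustrative):

```yaml
spec:
  containers:
    - name: app
      image: nginx:1.25
      readinessProbe:
        httpGet:
          path: /healthz     # assumed health endpoint
          port: 80
        initialDelaySeconds: 5
        periodSeconds: 10
```

A Pod that fails its readiness probe is removed from the Service's endpoints, so during a rollout traffic only reaches Pods that have passed the check.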
3. NATS in Kubernetes:
# NATS in Kubernetes

## NATS Architecture
- Distributed messaging system
- Supports publish-subscribe and request-reply messaging patterns
- Core NATS: at-most-once delivery, no persistence
- NATS Streaming (STAN): at-least-once delivery with persistence (since superseded by JetStream)
- Key components:
  - Subjects: hierarchical naming for messages
  - Queue groups: load balancing across subscribers
  - Clusters: groups of NATS servers

## Deploying NATS on Kubernetes
1. Core NATS Deployment:
   - Use a StatefulSet for ordered, unique network identifiers
   - Headless Service for peer discovery
   - ConfigMap for NATS server configuration

Example NATS StatefulSet:

```yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nats
spec:
  serviceName: "nats"
  replicas: 3
  selector:
    matchLabels:
      app: nats
  template:
    metadata:
      labels:
        app: nats
    spec:
      containers:
        - name: nats
          image: nats:2.1.9-alpine3.12
          ports:
            - containerPort: 4222
              name: client
            - containerPort: 6222
              name: cluster
            - containerPort: 8222
              name: monitor
          command:
            - "nats-server"
            - "--config"
            - "/etc/nats-config/nats.conf"
```

2. NATS Streaming Deployment:
   - Similar to core NATS, but with additional configuration
   - Use persistent volumes for message storage

## NATS Streaming (Persistence)
- Provides message persistence and guaranteed (at-least-once) delivery
- Features:
  - Durable subscriptions
  - Message replay
  - Last-value semantics
- Storage options:
  - File-based storage
  - SQL database storage (MySQL, PostgreSQL)

NATS Streaming configuration example (NATS server configuration format, not YAML):

```
streaming {
  id: "nats-streaming-cluster"
  store: "file"
  dir: "/data/nats-streaming"
  channels {
    max_inactivity: "24h"
  }
}
```

## Best Practices for NATS in Kubernetes
1. Use Pod anti-affinity rules to spread NATS servers across nodes
2. Implement proper monitoring (e.g., Prometheus metrics via the monitoring endpoint)
3. Use NetworkPolicies to control access to NATS servers
4. Regularly back up NATS Streaming data
5. Implement proper authentication and authorization
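Practice 1 can be sketched as a `podAntiAffinity` rule in the StatefulSet's Pod template (labels are illustrative and assume the `app: nats` label used above):

```yaml
spec:
  template:
    spec:
      affinity:
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchLabels:
                  app: nats
              topologyKey: kubernetes.io/hostname   # at most one NATS server per node
```

Using `preferredDuringSchedulingIgnoredDuringExecution` instead makes the rule a soft preference, which still schedules Pods when fewer nodes than replicas are available.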

## Scaling NATS in Kubernetes
- Horizontal scaling: increase the number of NATS server instances
- Vertical scaling: increase resources for NATS server Pods
- Use a NATS supercluster for multi-cluster setups

## Troubleshooting NATS in Kubernetes
1. Check Pod logs: `kubectl logs <pod-name>`
2. Verify network connectivity between Pods
3. Use NATS CLI tools for debugging (nats-top, nats-bench)
4. Monitor NATS server metrics through the monitoring endpoint (port 8222)

4. Event-Driven Architecture (EDA) Concepts:
# Event-Driven Architecture (EDA) Concepts

## Event Producers and Consumers

### Event Producers
- Components that generate events
- Examples: user actions, system state changes, IoT devices
- Responsibilities:
  - Create events in a standardized format
  - Publish events to the appropriate channels or topics

### Event Consumers
- Components that react to events
- Examples: microservices, analytics systems, notification services
- Responsibilities:
  - Subscribe to relevant event channels or topics
  - Process events and perform the resulting actions

## Publish-Subscribe Pattern
- Core communication pattern in EDA
- Characteristics:
  - Decoupled communication between producers and consumers
  - One-to-many distribution of events
  - Asynchronous processing

### Topics or Channels
- Categorize events for efficient routing
- Allow fine-grained subscription by consumers
- Hierarchical naming is often used (e.g., "orders.created", "users.updated")

### Benefits
1. Scalability: easy to add new producers or consumers
2. Flexibility: systems can evolve independently
3. Resilience: failure in one component doesn't affect others

## Message Queues vs. Streaming

### Message Queues
- Characteristics:
  - Often used for task distribution
  - Competing-consumers model
  - Messages typically removed after processing
- Use cases:
  - Work queue processing
  - Load balancing tasks across workers

### Streaming
- Characteristics:
  - Maintains an ordered, append-only log of events
  - Allows multiple consumers to read the same events independently
  - Supports event replay and time-based access
- Use cases:
  - Event sourcing
  - Audit trails
  - Real-time analytics

## Key EDA Patterns
1. Event Notification: lightweight events announcing that something happened, with minimal payload
2. Event-Carried State Transfer: events contain the relevant state data
3. Event Sourcing: store state changes as a sequence of events
4. CQRS (Command Query Responsibility Segregation): separate read and write models

## Challenges in EDA
1. Eventual Consistency: managing data consistency across services
2. Event Schema Evolution: handling changes in event structure over time
3. Error Handling and Dead Letter Queues: managing failed event processing
4. Monitoring and Tracing: tracking event flow across distributed systems

## EDA in Kubernetes
- Use message brokers such as Kafka, NATS, or RabbitMQ
- Leverage Kubernetes features:
  - StatefulSets for stateful messaging systems
  - Services for service discovery
  - ConfigMaps and Secrets for configuration

## Best Practices
1. Design clear, well-documented event schemas
2. Implement idempotent event consumers
3. Use correlation IDs to trace events across services
4. Implement proper error handling and retry mechanisms
5. Monitor system health and performance metrics

5. Kafka Basics:
# Kafka Basics

## Core Concepts

### Topics
- Categorize streams of records
- Loosely analogous to tables in a database
- Can have multiple partitions for parallelism
- Retention policies define how long data is kept

### Partitions
- Ordered, immutable sequences of records
- Each record is assigned a sequential offset
- Enable parallel processing and scalability
- Distributed across Kafka brokers

### Offsets
- Unique identifier for each record within a partition
- Consumers track their position using offsets
- Allow consumers to replay data from a specific point

## Producers and Consumers

### Producers
- Publish records to topics
- Can choose the partition for each record (e.g., by key)
- Support asynchronous and synchronous sending
- Configurable acknowledgment levels (`acks`)

### Consumers
- Read records from topics
- Organized into consumer groups for scalability
- Each partition is consumed by only one consumer within a group
- Offset management: automatic or manual commit

## Kafka Streams
- Library for building streaming applications
- Stateful and stateless processing
- Exactly-once semantics
- Supports windowing operations
- Key abstractions:
  - KStream: represents a record stream
  - KTable: represents a changelog stream
  - GlobalKTable: fully replicated table, useful for enrichment

## Kafka Connect
- Framework for connecting Kafka with external systems
- Source connectors: import data from external systems
- Sink connectors: export data to external systems
- Distributed mode for scalability and fault tolerance

## Key Features
- High throughput and low latency
- Scalability: can handle trillions of events a day
- Durability: persists messages on disk
- Fault tolerance: replication across multiple brokers
- Exactly-once semantics (since version 0.11)

## Use Cases
- Messaging system
- Activity tracking
- Metrics aggregation from various sources
- Application log aggregation
- Stream processing
- Event sourcing
- Commit log service

## Kafka in Kubernetes
- Use a Kafka operator (e.g., Strimzi) for deployment and management
- StatefulSets for Kafka brokers
- Headless Services for broker discovery
- Configure an appropriate StorageClass for persistence
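With an operator such as Strimzi, topics themselves become declarative resources. A hedged sketch (the cluster name and settings are illustrative, and the exact CRD schema depends on the Strimzi version installed):

```yaml
apiVersion: kafka.strimzi.io/v1beta2
kind: KafkaTopic
metadata:
  name: orders-created
  labels:
    strimzi.io/cluster: my-cluster   # assumed Kafka cluster name
spec:
  partitions: 6
  replicas: 3
  config:
    retention.ms: 604800000          # keep records for 7 days
```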

## Best Practices
1. Choose an appropriate partition count and replication factor
2. Implement proper monitoring and alerting
3. Tune producer and consumer configurations for performance
4. Implement proper security measures (authentication, authorization, encryption)
5. Perform regular maintenance: log compaction, cluster balancing

## Common Challenges
1. Topic management at scale
2. Handling large messages
3. Dealing with rebalancing in consumer groups
4. Ensuring exactly-once processing in stream applications
5. Disaster recovery planning

6. Troubleshooting in Kubernetes:
# Troubleshooting in Kubernetes

## Logging and Monitoring Tools

### Logging
1. kubectl logs
   - Basic logging: `kubectl logs <pod-name>`
   - Follow logs: `kubectl logs -f <pod-name>`
   - Logs from the previous instance: `kubectl logs <pod-name> --previous`

2. Centralized Logging
   - ELK Stack (Elasticsearch, Logstash, Kibana)
   - Fluentd + Elasticsearch + Kibana (EFK)
   - Loki (part of the Grafana stack)

### Monitoring
1. Prometheus
   - Open-source monitoring and alerting toolkit
   - Pull-based metrics collection
   - PromQL for querying metrics

2. Grafana
   - Visualization platform for metrics
   - Supports multiple data sources (including Prometheus)
   - Customizable dashboards

3. Kubernetes Dashboard
   - Web-based UI for Kubernetes clusters
   - Overview of applications running on the cluster
   - Create and modify Kubernetes resources

## Debugging Pods and Services

### Pod Debugging
1. Describe Pod
   - `kubectl describe pod <pod-name>`
   - Check events, status, and configuration

2. Exec into Container
   - `kubectl exec -it <pod-name> -- /bin/sh`
   - Investigate from within the container

3. Port Forwarding
   - `kubectl port-forward <pod-name> 8080:80`
   - Access the Pod's port locally for debugging

### Service Debugging
1. Verify Service Configuration
   - `kubectl get svc <service-name>`
   - `kubectl describe svc <service-name>`

2. Check Endpoints
   - `kubectl get endpoints <service-name>`
   - Ensure Pods are correctly registered

3. DNS Troubleshooting
   - Deploy a debug pod: `kubectl run -it --rm debug --image=busybox --restart=Never -- sh`
   - Use `nslookup <service-name>` from the debug pod to verify DNS resolution