the_information_nexus/tech_docs/review_k8s.md
2024-08-07 17:26:48 +00:00

Understood. I'll create comprehensive artifacts for each of the seven main topics you've outlined. This will provide you with a thorough reference for your interview preparation.

  1. Kubernetes (k8s) Fundamentals:

Kubernetes Fundamentals

Core Components

Pods

  • Smallest deployable units in Kubernetes
  • Can contain one or more containers
  • Containers in a Pod share a network namespace and can share storage volumes
  • Ephemeral by nature, can be replaced at any time

Deployments

  • Manage ReplicaSets
  • Provide declarative updates for Pods
  • Support rolling updates and rollbacks
  • Ensure desired number of Pod replicas are running
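
The last point can be pictured as a reconciliation loop: a controller compares the desired replica count with what is actually running and acts on the difference. The sketch below is a simplified illustration rather than the real controller logic; `reconcile` and the Pod names are hypothetical.

```python
def reconcile(desired_replicas, running_pods):
    """Return the actions needed to move from the current state to the desired one."""
    diff = desired_replicas - len(running_pods)
    if diff > 0:
        # Too few Pods: create the missing ones
        return [("create", None)] * diff
    if diff < 0:
        # Too many Pods: delete the surplus (here simply the last ones listed)
        return [("delete", name) for name in running_pods[diff:]]
    # Desired and actual state already match
    return []


print(reconcile(3, ["web-a"]))
print(reconcile(1, ["web-a", "web-b", "web-c"]))
```

A real controller runs this comparison continuously, which is why a deleted Pod is replaced without manual intervention.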

Services

  • Expose Pods as network services
  • Types:
    • ClusterIP: Internal cluster communication
    • NodePort: Exposes the service on each Node's IP at a static port
    • LoadBalancer: Exposes the service externally using a cloud provider's load balancer
    • ExternalName: Maps the service to a DNS name
  • Service discovery and load balancing

ConfigMaps and Secrets

  • Store configuration data and sensitive information
  • Can be mounted as volumes or environment variables
  • ConfigMaps for non-sensitive configuration data
  • Secrets for sensitive data (values are base64-encoded, which is not encryption)
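
To see why base64 encoding alone does not protect a Secret, the short sketch below encodes and then decodes a value with Python's standard library. The password string is a made-up example.

```python
import base64

# Encode a made-up secret value the way it would appear in a Secret manifest
encoded = base64.b64encode(b"s3cr3t-password").decode("ascii")

# Anyone who can read the manifest can reverse the encoding without a key
decoded = base64.b64decode(encoded).decode("utf-8")

print(encoded)
print(decoded)
```

Because decoding needs no key, access to Secrets is usually restricted with RBAC, and encryption at rest can be enabled separately.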

Namespace Management

  • Logical partitions within a cluster
  • Resource isolation and access control
  • Quota management per namespace
  • Default namespaces: default, kube-system, kube-public, kube-node-lease

Resource Management

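  • Requests: Resources reserved for a container; the scheduler uses them to place the Pod
  • Limits: Maximum resources a container may use (CPU is throttled; exceeding the memory limit gets the container OOM-killed)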
  • CPU (in cores or millicores) and memory (in bytes)
  • Quality of Service (QoS) classes: Guaranteed, Burstable, BestEffort
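
The QoS class follows from how requests and limits are set. The function below is a simplified sketch of those rules, assuming each container is described by a plain dict with optional `requests` and `limits` entries; `qos_class` is a hypothetical helper, and it compares quantity strings literally (so `"500m"` and `"0.5"` would count as different).

```python
def qos_class(containers):
    """Classify a Pod as Guaranteed, Burstable, or BestEffort."""
    any_set = False
    guaranteed = True
    for container in containers:
        requests = container.get("requests", {})
        limits = container.get("limits", {})
        if requests or limits:
            any_set = True
        for resource in ("cpu", "memory"):
            limit = limits.get(resource)
            # A missing request defaults to the limit
            request = requests.get(resource, limit)
            if limit is None or request != limit:
                guaranteed = False
    if not any_set:
        # No container sets any request or limit
        return "BestEffort"
    return "Guaranteed" if guaranteed else "Burstable"


print(qos_class([{"requests": {"cpu": "250m"}}]))
```

Under node pressure, BestEffort Pods are evicted first and Guaranteed Pods last, which is why the class matters for critical workloads.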

Persistent Volumes and Claims

  • PersistentVolumes (PV): Cluster-wide storage resources
  • PersistentVolumeClaims (PVC): Storage requests by Pods
  • StorageClasses: Dynamic provisioning of PVs
  • Reclaim policies: Retain, Delete, Recycle (deprecated)

Advanced Concepts

  • StatefulSets: For stateful applications
  • DaemonSets: Ensure all (or some) nodes run a copy of a Pod
  • Jobs and CronJobs: For batch and scheduled tasks
  • Ingress: HTTP(S) routing to services

  2. Deployment Types in Kubernetes:

Kubernetes Deployment Types

Rolling Updates

  • Default deployment strategy in Kubernetes
  • Gradually replace old Pods with new ones
  • Control with maxSurge and maxUnavailable parameters
  • Pros: Zero downtime, controlled rollout
  • Cons: Both versions coexist during update, may require compatibility

Implementation:

spec:
  strategy:
    type: RollingUpdate
    rollingUpdate:
      maxSurge: 25%
      maxUnavailable: 25%
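
Percentages are converted to absolute Pod counts during a rollout: maxSurge rounds up and maxUnavailable rounds down. The sketch below works through that arithmetic; `resolve` and `rollout_bounds` are hypothetical helper names used only for illustration.

```python
def resolve(value, replicas, round_up):
    """Turn an int or a percentage string such as '25%' into an absolute Pod count."""
    if isinstance(value, int):
        return value
    percent = int(value.rstrip("%"))
    if round_up:
        return -(-replicas * percent // 100)  # ceiling division
    return replicas * percent // 100          # floor division


def rollout_bounds(replicas, max_surge="25%", max_unavailable="25%"):
    """Return the Pod-count envelope a rolling update must stay within."""
    surge = resolve(max_surge, replicas, round_up=True)
    unavailable = resolve(max_unavailable, replicas, round_up=False)
    return {
        "max_total_pods": replicas + surge,
        "min_available_pods": replicas - unavailable,
    }


print(rollout_bounds(10))
```

With 10 replicas and the 25% settings above, the rollout may run up to 13 Pods in total and must keep at least 8 available.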

Blue/Green Deployments

  • Two identical environments: blue (current) and green (new)
  • Switch traffic at once by updating a Service
  • Easy rollback by switching back to the old version
  • Pros: Instant cutover, easy rollback
  • Cons: Needs roughly double the resources while both environments run

Implementation:

  1. Deploy new version with a different label
  2. Test the new deployment
  3. Update the Service selector to point to the new version
  4. Remove old deployment if new version is stable
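
The cutover in step 3 works because a Service picks its backends purely by label selector. The following sketch models that with plain dicts; the Pod names and the `version` label are assumptions made for illustration.

```python
def endpoints(selector, pods):
    """Return the names of Pods whose labels include every key/value in the selector."""
    return [
        pod["name"]
        for pod in pods
        if all(pod["labels"].get(key) == value for key, value in selector.items())
    ]


pods = [
    {"name": "app-blue-1", "labels": {"app": "my-app", "version": "blue"}},
    {"name": "app-blue-2", "labels": {"app": "my-app", "version": "blue"}},
    {"name": "app-green-1", "labels": {"app": "my-app", "version": "green"}},
    {"name": "app-green-2", "labels": {"app": "my-app", "version": "green"}},
]

service_selector = {"app": "my-app", "version": "blue"}
before = endpoints(service_selector, pods)

# Step 3: the cutover is a single change to the selector
service_selector["version"] = "green"
after = endpoints(service_selector, pods)

print(before, after)
```

Rolling back is the same one-field change in the other direction, provided the blue Pods are still running.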

Canary Deployments

  • Release to a subset of users or servers
  • Gradually increase traffic to the new version
  • Ideal for real-world testing and risk mitigation
  • Pros: Controlled exposure, real user testing
  • Cons: Complexity in managing traffic split

Implementation:

  1. Deploy both versions
  2. Create a Service that targets both versions
  3. Control traffic split using Ingress or Service Mesh (e.g., Istio)

Example using Istio:

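# VirtualService that splits traffic 90/10 between two versions
apiVersion: networking.istio.io/v1alpha3
kind: VirtualService
metadata:
  name: my-service
spec:
  hosts:
  - my-service              # host that clients call
  http:
  - route:
    # Weights across all destinations must add up to 100
    - destination:
        host: my-service-v1 # stable version
      weight: 90
    - destination:
        host: my-service-v2 # canary version
      weight: 10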
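
To get a feel for what a 90/10 weighting means in practice, the sketch below simulates 10,000 requests being routed by weight. It only imitates the effect of the traffic split and is not how the mesh implements it; the fixed seed is there to make the run repeatable.

```python
import random
from collections import Counter

# Same weights as the route definition above
routes = {"my-service-v1": 90, "my-service-v2": 10}


def pick_destination(rng):
    """Choose a destination host with probability proportional to its weight."""
    hosts = list(routes)
    return rng.choices(hosts, weights=[routes[h] for h in hosts], k=1)[0]


rng = random.Random(42)  # seeded for a repeatable simulation
counts = Counter(pick_destination(rng) for _ in range(10_000))
canary_share = counts["my-service-v2"] / 10_000

print(counts, canary_share)
```

The canary share hovers around 10% rather than hitting it exactly, which is worth remembering when comparing error rates on low-traffic services.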

Choosing a Deployment Strategy

  • Rolling Update: For most applications, especially stateless
  • Blue/Green: When you need atomic updates or easy rollbacks
  • Canary: For high-risk changes or when you need gradual user exposure

Best Practices:

  • Use readiness probes to ensure only healthy Pods receive traffic
  • Implement proper monitoring and logging for all deployment types
  • Automate deployments using CI/CD pipelines
  • Practice rollbacks to ensure they work as expected

  3. NATS in Kubernetes:

NATS in Kubernetes

NATS Architecture

  • Distributed messaging system
  • Publish-subscribe and request-reply messaging patterns
  • Core NATS: At-most-once delivery, no persistence
  • NATS Streaming (STAN): At-least-once delivery, persistence (now deprecated in favor of JetStream)
  • Key components:
    • Subjects: Hierarchical naming for messages
    • Queue groups: Load-balance messages across a group of subscribers
    • Clusters: Group of NATS servers
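
Hierarchical subjects matter mostly because subscribers can use wildcards: in NATS, `*` matches exactly one token and `>` matches one or more trailing tokens. The matcher below is a minimal sketch of those rules for illustration, not the server's implementation; `subject_matches` is a hypothetical name.

```python
def subject_matches(pattern, subject):
    """Check whether a dot-separated subject matches a subscription pattern."""
    pattern_tokens = pattern.split(".")
    subject_tokens = subject.split(".")
    for i, token in enumerate(pattern_tokens):
        if token == ">":
            # '>' must be the last token and needs at least one token left to match
            return i == len(pattern_tokens) - 1 and len(subject_tokens) > i
        if i >= len(subject_tokens):
            return False
        if token != "*" and token != subject_tokens[i]:
            return False
    # Without '>', both sides must have the same number of tokens
    return len(pattern_tokens) == len(subject_tokens)


print(subject_matches("orders.*", "orders.created"))
print(subject_matches("orders.>", "orders.created.eu"))
```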

Deploying NATS on Kubernetes

  1. Core NATS Deployment:
    • Use StatefulSet for ordered, unique network identifiers
    • Headless Service for peer discovery
    • ConfigMap for NATS server configuration

Example NATS StatefulSet:

apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: nats
spec:
  serviceName: "nats"
  replicas: 3
  selector:
    matchLabels:
      app: nats
  template:
    metadata:
      labels:
        app: nats
    spec:
      containers:
      - name: nats
        image: nats:2.1.9-alpine3.12
        ports:
        - containerPort: 4222
          name: client
        - containerPort: 6222
          name: cluster
        - containerPort: 8222
          name: monitor
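        command:
        - "nats-server"
        - "--config"
        - "/etc/nats-config/nats.conf"
        volumeMounts:
        # Mount the server configuration where --config expects it
        - name: config-volume
          mountPath: /etc/nats-config
      volumes:
      # Assumes a ConfigMap named nats-config that contains a nats.conf key
      - name: config-volume
        configMap:
          name: nats-config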

  2. NATS Streaming Deployment:
    • Similar to core NATS, but with additional configuration
    • Use persistent volumes for message storage

NATS Streaming (Persistence)

  • Provides message persistence and guaranteed delivery
  • Features:
    • Durable subscriptions
    • Message replay
    • Last value semantics
  • Configuration options:
    • File-based storage
    • SQL database storage (MySQL, PostgreSQL)

NATS Streaming configuration example:

streaming {
  id: "nats-streaming-cluster"
  store: "file"
  dir: "/data/nats-streaming"
  store_limits {
    max_inactivity: "24h"
  }
}

Best Practices for NATS in Kubernetes

  1. Use anti-affinity rules to spread NATS servers across nodes
  2. Implement proper monitoring (e.g., Prometheus metrics)
  3. Use NetworkPolicies to control access to NATS servers
  4. Regularly backup NATS Streaming data
  5. Implement proper authentication and authorization

Scaling NATS in Kubernetes

  • Horizontal scaling: Increase the number of NATS server instances
  • Vertical scaling: Increase resources for NATS server Pods
  • Use a NATS supercluster (clusters connected through gateways) for multi-cluster setups

Troubleshooting NATS in Kubernetes

  1. Check Pod logs: kubectl logs <pod-name>

  2. Verify network connectivity between Pods

  3. Use NATS CLI tools for debugging (nats-top, nats-bench)

  4. Monitor NATS server metrics through the monitoring endpoint

  4. Event-Driven Architecture (EDA) Concepts:

Event-Driven Architecture (EDA) Concepts

Event Producers and Consumers

Event Producers

  • Components that generate events
  • Examples: User actions, system state changes, IoT devices
  • Responsibilities:
    • Create events in a standardized format
    • Publish events to appropriate channels or topics

Event Consumers

  • Components that react to events
  • Examples: Microservices, analytics systems, notification services
  • Responsibilities:
    • Subscribe to relevant event channels or topics
    • Process events and perform actions

Publish-Subscribe Pattern

  • Core communication pattern in EDA
  • Characteristics:
    • Decoupled communication between producers and consumers
    • One-to-many distribution of events
    • Asynchronous processing

Topics or Channels

  • Categorize events for efficient routing
  • Allow fine-grained subscription by consumers
  • Hierarchical naming often used (e.g., "orders.created", "users.updated")

Benefits

  1. Scalability: Easy to add new producers or consumers
  2. Flexibility: Systems can evolve independently
  3. Resilience: Failure in one component doesn't affect others
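
A tiny in-memory broker makes the decoupling concrete: the producer publishes to a topic name and never references its consumers. This is a deliberately simplified, synchronous sketch; the `Broker` class is hypothetical, and real brokers deliver asynchronously over the network.

```python
from collections import defaultdict


class Broker:
    """Minimal in-memory publish-subscribe broker."""

    def __init__(self):
        self._subscribers = defaultdict(list)

    def subscribe(self, topic, handler):
        self._subscribers[topic].append(handler)

    def publish(self, topic, event):
        # The producer does not know who, if anyone, is listening
        for handler in self._subscribers[topic]:
            handler(event)


broker = Broker()
emails, audit = [], []

# Two independent consumers subscribe to the same topic
broker.subscribe("orders.created", emails.append)
broker.subscribe("orders.created", audit.append)

broker.publish("orders.created", {"order_id": 1})
broker.publish("users.updated", {"user_id": 7})  # no subscribers, nothing happens

print(emails, audit)
```

Adding a third consumer requires only another `subscribe` call, with no change to the producer.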

Message Queues vs. Streaming

Message Queues

  • Characteristics:
    • Often used for task distribution
    • Competing consumers model
    • Messages typically removed after processing
  • Use cases:
    • Work queue processing
    • Load balancing tasks across workers

Streaming

  • Characteristics:
    • Maintains an ordered log of events
    • Allows multiple consumers to read the same events
    • Supports event replay and time-based access
  • Use cases:
    • Event sourcing
    • Audit trails
    • Real-time analytics
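
The difference between the two models comes down to who owns the read position. In the sketch below, the queue hands each message to exactly one worker and forgets it, while the log keeps every event and each consumer tracks its own offset. All names are illustrative.

```python
from collections import deque

# Message queue: a message is gone once a worker has taken it
queue = deque(["job-1", "job-2", "job-3"])
worker_a = [queue.popleft()]
worker_b = [queue.popleft(), queue.popleft()]

# Stream: an append-only log; every consumer keeps its own offset
log = ["evt-1", "evt-2", "evt-3"]
offsets = {"analytics": 0, "audit": 0}


def poll(consumer, max_records=10):
    """Return the next records for a consumer and advance its offset."""
    start = offsets[consumer]
    records = log[start:start + max_records]
    offsets[consumer] = start + len(records)
    return records


analytics_seen = poll("analytics")
audit_seen = poll("audit")

# Replay: move the offset back and read the same events again
offsets["audit"] = 1
audit_replayed = poll("audit")

print(worker_a, worker_b, analytics_seen, audit_replayed)
```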

Key EDA Patterns

  1. Event Notification: Events carry minimal data (typically just an identifier); consumers fetch details if needed
  2. Event-Carried State Transfer: Events contain relevant state data
  3. Event Sourcing: Store state changes as a sequence of events
  4. CQRS (Command Query Responsibility Segregation): Separate read and write models

Challenges in EDA

  1. Eventual Consistency: Managing data consistency across services
  2. Event Schema Evolution: Handling changes in event structure over time
  3. Error Handling and Dead Letter Queues: Managing failed event processing
  4. Monitoring and Tracing: Tracking event flow across distributed systems
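
For the error-handling challenge, a common shape is bounded retries followed by parking the event in a dead letter queue. The following is a minimal sketch under that assumption; `process_with_retries` and the sample handler are hypothetical, and a production version would add backoff between attempts.

```python
def process_with_retries(events, handler, max_attempts=3):
    """Run handler on each event; collect events that still fail after all attempts."""
    dead_letters = []
    for event in events:
        last_error = None
        for _ in range(max_attempts):
            try:
                handler(event)
                break  # success, stop retrying
            except Exception as exc:
                last_error = exc
        else:
            # Every attempt failed: park the event instead of blocking the rest
            dead_letters.append({"event": event, "error": str(last_error)})
    return dead_letters


def handle_payment(event):
    if event.get("amount") is None:
        raise ValueError("missing amount")


dead = process_with_retries([{"id": 1, "amount": 5}, {"id": 2}], handle_payment)
print(dead)
```

Keeping the error message alongside the event makes it easier to inspect and replay dead letters later.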

EDA in Kubernetes

  • Use message brokers like Kafka, NATS, or RabbitMQ
  • Leverage Kubernetes features:
    • StatefulSets for stateful messaging systems
    • Services for service discovery
    • ConfigMaps and Secrets for configuration

Best Practices

  1. Design clear and well-documented event schemas

  2. Implement idempotent event consumers

  3. Use correlation IDs for tracing events across services

  4. Implement proper error handling and retry mechanisms

  5. Monitor system health and performance metrics
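
Idempotent consumers (practice 2) matter because at-least-once delivery can hand the same event to a consumer twice. One simple approach, sketched below, is to remember processed event IDs; the `event_id` and `correlation_id` field names are assumptions, and a real service would keep the ID set in durable storage.

```python
class IdempotentConsumer:
    """Applies each event at most once, keyed by its event ID."""

    def __init__(self):
        self.processed_ids = set()
        self.balance = 0

    def handle(self, event):
        if event["event_id"] in self.processed_ids:
            # Duplicate delivery: skip without changing state
            return False
        self.balance += event["amount"]
        self.processed_ids.add(event["event_id"])
        return True


consumer = IdempotentConsumer()
deposit = {"event_id": "evt-42", "correlation_id": "req-7", "amount": 100}

first = consumer.handle(deposit)
second = consumer.handle(deposit)  # redelivery of the same event

print(first, second, consumer.balance)
```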

  5. Kafka Basics:

Kafka Basics

Core Concepts

Topics

  • Categorize streams of records
  • Loosely comparable to tables in a database, but append-only
  • Can have multiple partitions for parallelism
  • Retention policies define how long data is kept

Partitions

  • Ordered, immutable sequence of records
  • Each record assigned a sequential offset
  • Enable parallel processing and scalability
  • Distributed across Kafka brokers

Offsets

  • Unique identifier for each record within a partition
  • Consumers track their position using offsets
  • Allows consumers to replay data from a specific point

Producers and Consumers

Producers

  • Publish records to topics
  • Can choose the partition for each record
  • Support for asynchronous and synchronous sending
  • Configurable acknowledgment levels (acks)
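
When a record has a key, the partition is usually derived from a hash of that key, so records with the same key land in the same partition and keep their order. The sketch below uses CRC32 purely as a stand-in hash (Kafka's default partitioner uses murmur2); `choose_partition` is a hypothetical helper.

```python
import zlib


def choose_partition(key, num_partitions):
    """Map a record key to a partition index in a stable way."""
    # crc32 is a stand-in here; it is NOT the hash Kafka itself uses
    return zlib.crc32(key.encode("utf-8")) % num_partitions


first = choose_partition("customer-17", 6)
again = choose_partition("customer-17", 6)

print(first, again)
```

Changing the partition count changes the mapping, which is one reason to choose the partition count carefully up front.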

Consumers

  • Read records from topics
  • Organized into consumer groups for scalability
  • Each partition consumed by only one consumer in a group
  • Offset management: automatic or manual commit
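
The one-consumer-per-partition rule is easiest to see in an assignment table. The function below is a simplified round-robin sketch, not one of Kafka's actual assignor implementations; the consumer names are made up.

```python
def assign_round_robin(partitions, consumers):
    """Give each partition to exactly one consumer in the group."""
    assignment = {consumer: [] for consumer in consumers}
    for index, partition in enumerate(partitions):
        owner = consumers[index % len(consumers)]
        assignment[owner].append(partition)
    return assignment


balanced = assign_round_robin([0, 1, 2, 3, 4, 5], ["c1", "c2", "c3", "c4"])
# More consumers than partitions: the extra consumer stays idle
oversized = assign_round_robin([0, 1], ["c1", "c2", "c3"])

print(balanced)
print(oversized)
```

This is why adding consumers beyond the partition count does not increase throughput.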

Kafka Streams

  • Library for building streaming applications
  • Stateful and stateless processing
  • Exactly-once semantics (when enabled via processing.guarantee)
  • Supports windowing operations
  • Key features:
    • KStream: represents a stream of records
    • KTable: represents a changelog stream (latest value per key)
    • GlobalKTable: fully replicated table, useful for enrichment joins

Kafka Connect

  • Framework for connecting Kafka with external systems
  • Source connectors: import data from external systems
  • Sink connectors: export data to external systems
  • Distributed mode for scalability

Key Features

  • High throughput and low latency
  • Scalability: can handle trillions of events a day
  • Durability: persists messages on disk
  • Fault-tolerance: replication across multiple brokers
  • Exactly-once semantics (since version 0.11)

Use Cases

  • Messaging system
  • Activity tracking
  • Gather metrics from various sources
  • Application logs gathering
  • Stream processing
  • Event sourcing
  • Commit log service

Kafka in Kubernetes

  • Use a Kafka operator (e.g., Strimzi) for easier deployment and management
  • StatefulSets for Kafka brokers
  • Headless Services for broker discovery
  • Configure proper storage class for persistence

Best Practices

  1. Choose appropriate partition count and replication factor
  2. Implement proper monitoring and alerting
  3. Tune producer and consumer configurations for performance
  4. Implement proper security measures (authentication, authorization, encryption)
  5. Regular maintenance: log compaction, cluster balancing

Common Challenges

  1. Topic management at scale

  2. Handling large messages

  3. Dealing with rebalancing in consumer groups

  4. Ensuring exactly-once processing in stream applications

  5. Disaster recovery planning

  6. Troubleshooting in Kubernetes:

Troubleshooting in Kubernetes

Logging and Monitoring Tools

Logging

  1. kubectl logs

    • Basic logging: kubectl logs <pod-name>
    • Follow logs: kubectl logs -f <pod-name>
    • Logs from previous instance: kubectl logs <pod-name> --previous
  2. Centralized Logging

    • ELK Stack (Elasticsearch, Logstash, Kibana)
    • Fluentd + Elasticsearch + Kibana
    • Loki (part of Grafana stack)

Monitoring

  1. Prometheus

    • Open-source monitoring and alerting toolkit
    • Pull-based metrics collection
    • PromQL for querying metrics
  2. Grafana

    • Visualization platform for metrics
    • Supports multiple data sources (including Prometheus)
    • Customizable dashboards
  3. Kubernetes Dashboard

    • Web-based UI for Kubernetes clusters
    • Overview of applications running on the cluster
    • Create and modify Kubernetes resources

Debugging Pods and Services

Pod Debugging

  1. Describe Pod

    • kubectl describe pod <pod-name>
    • Check events, status, and configuration
  2. Exec into Container

    • kubectl exec -it <pod-name> -- /bin/sh
    • Investigate from within the container
  3. Port Forwarding

    • kubectl port-forward <pod-name> 8080:80
    • Access pod's port locally for debugging

Service Debugging

  1. Verify Service Configuration

    • kubectl get svc <service-name>
    • kubectl describe svc <service-name>
  2. Check Endpoints

    • kubectl get endpoints <service-name>
    • Ensure pods are correctly registered
  3. DNS Troubleshooting

    • Deploy a debug pod: kubectl run -it --rm debug --image=busybox --restart=Never -- sh
    • Use `nsl