DS-920+ 20 GB RAM + SSD-Cache: Production-grade Docker Swarm Lab Design
────────────────────────────────────────────────────────────────────────

High-level goal
• 6 VMs (5 Swarm nodes + 1 gateway)
• All-in-one on the NAS today, but architected so that any node can be migrated to bare metal later.
• Observability stack is first-class (Prometheus, Loki, Grafana).
• After the bake-in period we downsize RAM per VM and rely on SSD-cache + ballooning.


  1. Physical resource envelope

CPU : 4 cores / 8 threads (J4125)
RAM : 20 GB (16 GB upgrade + 4 GB stock)
Storage : 4×HDD in SHR-1 + 2×NVMe read/write SSD cache (RAID-1)
Network : 1×1 GbE (bond later if you add a USB-NIC)

Hard limits
• Max 4 vCPUs per VM to leave headroom for DSM.
• Plan for ≤ 18 GB total VM RAM so DSM + containers never swap to HDD (the generous initial sizing below briefly runs 19 GB; the post-bake-in rightsizing drops it well under the limit).


  2. VM map (initial “generous” sizing)

| VM     | vCPU | RAM (GB) | Disk                       | Role / Notes                 |
|--------|------|----------|----------------------------|------------------------------|
| d12-gw | 1    | 1        | 8 GB                       | Router, DNS, DHCP, jump box  |
| d12-m1 | 2    | 4        | 16 GB                      | Swarm mgr-1 + Prometheus     |
| d12-m2 | 2    | 4        | 16 GB                      | Swarm mgr-2 + Loki           |
| d12-m3 | 2    | 4        | 16 GB                      | Swarm mgr-3 + Grafana        |
| d12-w1 | 2    | 3        | 32 GB                      | Swarm worker-1               |
| d12-w2 | 2    | 3        | 32 GB                      | Swarm worker-2               |
| TOTAL  | 11   | 19       | ≈ 120 GB thin-provisioned  |                              |

Disk layout on NAS
• All VMs on SSD-cache-backed volume (QoS = high).
• Enable “SSD cache advisor” → pin VM disks (random I/O) into cache.


  3. Network topology

Virtual switches
vs-lan : 192.168.1.0/24 (eth0 on every VM) upstream & mgmt.
vs-swarm : 10.10.10.0/24 (eth1 on every VM) overlay & control-plane.

Firewall rules on d12-gw
• Forward/NAT only if you need Internet from vs-swarm.
• SSH jump via d12-gw (port 2222 → internal 10.10.10.x:22).
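
As an illustration, the port-forward behind that SSH jump could be a simple iptables DNAT rule on d12-gw (10.10.10.11 stands in for whichever internal node you jump to):

# d12-gw: forward external port 2222 to SSH on an internal node (address is illustrative)
sudo iptables -t nat -A PREROUTING -p tcp --dport 2222 \
  -j DNAT --to-destination 10.10.10.11:22
sudo iptables -A FORWARD -p tcp -d 10.10.10.11 --dport 22 -j ACCEPT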

MTU tuning
• vs-swarm MTU 1550 → Docker overlay MTU 1450 (leaves 100 bytes for VXLAN).
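
For example, the Swarm overlay network could be created with the reduced MTU like this (run on a manager; the network name lab-overlay is illustrative):

docker network create \
  --driver overlay \
  --attachable \
  --opt com.docker.network.driver.mtu=1450 \
  lab-overlay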


  1. Storage layer

Inside Swarm
• Local (SSD-cache-backed) Docker volumes on each worker for stateless services.
• NFS share on d12-gw (/srv/nfs) exported to 10.10.10.0/24 for shared volumes (logs, Prometheus TSDB cold tier); see the example below.
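
A node can mount that export as a named volume via the local driver's NFS options (the volume name, subdirectory, and gateway address are illustrative):

# run on each node that needs the volume, or declare the same options in the stack file
docker volume create \
  --driver local \
  --opt type=nfs \
  --opt o=addr=10.10.10.1,rw \
  --opt device=:/srv/nfs/prometheus \
  prom-cold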

Backup policy
• DSM Snapshot Replication on VM folders nightly.
• Off-site push via Hyper-Backup to cloud bucket.


  5. Observability stack (first-class)

Stack: monitoring

| Service       | Placement      | Resource limits   | Notes                            |
|---------------|----------------|-------------------|----------------------------------|
| Prometheus    | mgr-1          | 1 vCPU, 2 GB RAM  | 15-day retention, 10 GiB volume  |
| Loki          | mgr-2          | 1 vCPU, 1 GB RAM  | 7-day retention, 5 GiB volume    |
| Grafana       | mgr-3          | 1 vCPU, 1 GB RAM  | persistent grafana.db on NFS     |
| node-exporter | global (all 5) | 0.1 vCPU, 64 MB   | host metrics                     |
| cadvisor      | global (all 5) | 0.2 vCPU, 128 MB  | container metrics                |
| promtail      | global (all 5) | 0.1 vCPU, 64 MB   | forwards to Loki                 |

Deploy everything via a single compose file (observability-stack.yml) using docker stack deploy.
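
A minimal sketch of such a file, trimmed to two of the services above (image tags, volume name, and limits are illustrative, and the placement constraint assumes the node labels defined in the next section):

version: "3.9"
services:
  prometheus:
    image: prom/prometheus:latest
    volumes:
      - prom-data:/prometheus
    deploy:
      placement:
        constraints:
          - node.labels.role == monitor
      resources:
        limits:
          cpus: "1"
          memory: 2G
  node-exporter:
    image: prom/node-exporter:latest
    deploy:
      mode: global
      resources:
        limits:
          cpus: "0.1"
          memory: 64M
volumes:
  prom-data:

Deploy it with:

docker stack deploy -c observability-stack.yml monitoring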


  6. Swarm service placement rules

# managers run only control-plane duties (caution: a full drain also evicts the
# monitoring services pinned to mgr-* above; skip this step, or rely on the label
# constraints below, if monitoring stays on the managers)
for n in d12-m1 d12-m2 d12-m3; do docker node update --availability drain "$n"; done

constraints:                       # pick one per service, under deploy.placement
  - node.labels.role == worker     # user workloads
  - node.labels.role == monitor    # monitoring (mgr-*)

Label nodes:

for n in d12-w1 d12-w2; do docker node update --label-add role=worker "$n"; done
for n in d12-m1 d12-m2 d12-m3; do docker node update --label-add role=monitor "$n"; done
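
User workloads can then pin themselves to the workers, e.g. (service name and image are illustrative):

docker service create \
  --name app \
  --replicas 2 \
  --constraint 'node.labels.role==worker' \
  nginx:alpine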

  7. Post-bake-in “rightsizing” schedule

Monitor Grafana → “VM Memory Utilization” for 2 weeks.

Typical safe cuts

| VM           | New RAM   | Justification                           |
|--------------|-----------|-----------------------------------------|
| d12-gw       | 512 MB    | Static routes + dnsmasq only            |
| d12-m{1,2,3} | 2 GB each | 1 GB OS + 1 GB Prometheus/Loki/Grafana  |
| d12-w{1,2}   | 2 GB each | 1 GB OS + 1 GB workload burst           |

TOTAL after resize ≈ 10.5 GB (leaves ~8 GB for DSM & SSD-cache buffers).

Use virtio-balloon so DSM can reclaim unused RAM dynamically.


  8. Security & hardening checklist

✓ TLS on Docker socket (dockerd --tlsverify)
✓ SSH key-only, fail2ban on d12-gw
✓ sysctl hardening: net.ipv4.ip_forward=1 only on d12-gw, disabled elsewhere
✓ Secrets via Docker secrets, NOT env-vars
✓ Weekly DSM offline snapshots of VM disks
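
For the sysctl item, a minimal sketch (the drop-in filename is arbitrary):

# on d12-gw (the only router)
echo 'net.ipv4.ip_forward=1' | sudo tee /etc/sysctl.d/99-swarm-lab.conf
sudo sysctl --system

# on every other VM
echo 'net.ipv4.ip_forward=0' | sudo tee /etc/sysctl.d/99-swarm-lab.conf
sudo sysctl --system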


  9. Day-2 growth path

• Add USB-NIC → LACP bond → 2 GbE for DSM + VMs.
• When CPU becomes bottleneck, migrate workers to bare-metal NUC; keep managers on NAS.
• Move NFS to dedicated SSD shelf via USB-C enclosure if I/O saturates.


One-command bootstrap (after VMs exist)

# on d12-m1
git clone https://github.com/you/swarm-lab
cd swarm-lab
./scripts/init.sh   # labels nodes, deploys observability stack

After 5 minutes you'll have metrics, logs, and a resource dashboard.
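
If that script does not exist yet, here is a minimal sketch of what init.sh could contain, assuming the node names from the VM map and the observability-stack.yml sketched earlier:

#!/usr/bin/env bash
# scripts/init.sh -- illustrative sketch; adapt names and paths to your repo
set -euo pipefail

# label the worker and monitoring nodes
for n in d12-w1 d12-w2; do
  docker node update --label-add role=worker "$n"
done
for n in d12-m1 d12-m2 d12-m3; do
  docker node update --label-add role=monitor "$n"
done

# deploy the observability stack
docker stack deploy -c observability-stack.yml monitoring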


Below is a concise, end-to-end walkthrough that takes you from “I have a fresh Linux VM” to “I have a 3-node Docker Swarm running a small demo service.”
Everything is 100% CLI-based and works on Debian 12, Ubuntu 22.04/24.04, or Alpine with Docker ≥ 27.3 installed.


  1. Prerequisites

• 3 Linux hosts (1 manager + 2 workers) on the same L2/L3 network
• Docker Engine installed and started on every host
• Ports 2377 (TCP), 7946 (TCP & UDP), and 4789 (UDP) open between hosts
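
If a host firewall such as ufw is active (an assumption; adapt to whatever firewall you run), the ports can be opened like this:

sudo ufw allow 2377/tcp
sudo ufw allow 7946/tcp
sudo ufw allow 7946/udp
sudo ufw allow 4789/udp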


  2. Install Docker (example for Debian/Ubuntu)

sudo apt update && sudo apt install -y docker.io
sudo systemctl enable --now docker
sudo usermod -aG docker $USER   # log out & back in
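
A quick sanity check that the Engine is running (after logging back in):

docker version
docker run --rm hello-world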

  3. Initialize the swarm (on the manager)

# Find the interface that other nodes can reach, e.g. 192.168.1.10
docker swarm init --advertise-addr 192.168.1.10

You'll receive a docker swarm join command with a token.
Example output:

Swarm initialized ...  
To add a worker, run:
  docker swarm join --token SWMTKN-1-xxxxx 192.168.1.10:2377

Save that token.
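
If you misplace it, a manager can reprint the join command at any time:

docker swarm join-token worker    # join command for workers
docker swarm join-token manager   # join command for additional managers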


  4. Join the worker nodes

SSH to each worker and paste the join command:

docker swarm join --token <token> 192.168.1.10:2377

Back on the manager, verify:

docker node ls

You should see 3 nodes in “Ready/Active” state.


  5. Deploy your first service

Create a replicated nginx service (3 instances) and expose it on port 80:

docker service create \
  --name web \
  --replicas 3 \
  --publish 80:80 \
  nginx:latest

Check its status:

docker service ls
docker service ps web

Swarm automatically spreads the three nginx containers across the three nodes.
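
Because of Swarm's ingress routing mesh, the published port 80 answers on every node, not only the ones running a replica. For example, against the manager address used above:

curl -I http://192.168.1.10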


  6. Scale or update with one command

Scale up:

docker service scale web=5

Rolling update to a new image:

docker service update --image nginx:1.25-alpine web
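
If the new image misbehaves, revert to the previously deployed spec:

docker service rollback web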

  7. Optional: deploy with a stack file

Save as demo-stack.yml:

version: "3.9"
services:
  web:
    image: nginx:alpine
    ports:
      - "80:80"
    deploy:
      replicas: 3

Deploy:

docker stack deploy -c demo-stack.yml demo

You now have the same nginx cluster expressed declaratively.
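
Inspect or tear the stack down with:

docker stack services demo
docker stack ps demo
docker stack rm demo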


  8. Day-2 commands you'll use often

• List nodes: docker node ls
• Inspect a service: docker service inspect --pretty web
• Drain a node (maintenance): docker node update --availability drain <node>
• Remove a service: docker service rm web
• Leave the swarm (worker): docker swarm leave
• Tear down the whole swarm (manager): docker swarm leave --force


Key concepts in 30 seconds

Node = Docker host joined to the swarm
Manager = schedules tasks & keeps the swarm state
Worker = runs the containers (tasks)
Service = desired state for a set of containers (image, replicas, ports, etc.)
Task = a single container instance scheduled by the swarm
Stack = group of services defined in a Compose file

That's all you need to get productive. Once you're comfortable, add secrets, configs, health checks, and multiple managers for HA.
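
For example, promoting an existing worker to a manager is a single command (run on a current manager; use the hostname shown by docker node ls):

docker node promote <node-name>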