DS-920+ 20 GB RAM + SSD-Cache: **Production-grade Docker-Swarm Lab Design**
────────────────────────────────────────────────────────────────────────────

High-level goal
• 6 VMs (5 Swarm nodes + 1 gateway)
• All-in-one on the NAS today, but architected so that **any** node can be migrated to bare metal later.
• Observability stack is **first-class** (Prometheus, Loki, Grafana).
• After bake-in period we **down-size RAM** per VM and rely on SSD-cache + ballooning.

------------------------------------------------
1. Physical resource envelope
------------------------------------------------
CPU     : 4 cores / 8 threads (J4125)
RAM     : 20 GB (16 GB upgrade + 4 GB stock)
Storage : 4×HDD in SHR-1 + 2×NVMe read/write SSD cache (RAID-1)
Network : 1×1 GbE (bond later if you add a USB-NIC)

Hard limits
• Max 4 vCPUs per VM to leave headroom for DSM.
• Plan for **≤ 18 GB VM RAM total** so DSM + containers never swap to HDD.

------------------------------------------------
2. VM map (initial “generous” sizing)
------------------------------------------------
| VM        | vCPU   | RAM (GB) | Disk  | Role / Notes                |
|-----------|--------|----------|-------|-----------------------------|
| d12-gw    | 1      | 1        | 8 GB  | Router, DNS, DHCP, jump box |
| d12-m1    | 2      | 4        | 16 GB | Swarm mgr-1 + Prometheus    |
| d12-m2    | 2      | 4        | 16 GB | Swarm mgr-2 + Loki          |
| d12-m3    | 2      | 4        | 16 GB | Swarm mgr-3 + Grafana       |
| d12-w1    | 2      | 3        | 32 GB | Swarm worker-1              |
| d12-w2    | 2      | 3        | 32 GB | Swarm worker-2              |
| **TOTAL** | **11** | **19**   | ≈ 120 GB thin-provisioned | |

Disk layout on NAS
• All VMs on the **SSD-cache-backed** volume (QoS = high).
• Enable **“SSD cache advisor”** → pin VM disks (random I/O) into cache.

------------------------------------------------
3. Network topology
------------------------------------------------
Virtual switches
• **vs-lan**   : 192.168.1.0/24 (eth0 on every VM) – upstream & mgmt.
• **vs-swarm** : 10.10.10.0/24 (eth1 on every VM) – overlay & control-plane.

Firewall rules on d12-gw
• Forward/NAT only if you need Internet access from vs-swarm.
• SSH jump via d12-gw (port 2222 → internal 10.10.10.x:22) – see the DNAT sketch below.
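
A minimal sketch of that jump rule on d12-gw, assuming eth0 faces vs-lan, eth1 faces vs-swarm, and 10.10.10.11 is the node you want to reach (these interface names and addresses are assumptions for this lab):

```bash
# forward external port 2222 on d12-gw to a swarm node's SSH port
sudo iptables -t nat -A PREROUTING  -i eth0 -p tcp --dport 2222 \
     -j DNAT --to-destination 10.10.10.11:22
# masquerade traffic leaving towards vs-swarm so replies return via the gateway
sudo iptables -t nat -A POSTROUTING -o eth1 -j MASQUERADE
# persist across reboots if iptables-persistent is installed
sudo sh -c 'iptables-save > /etc/iptables/rules.v4'
```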

MTU tuning
• vs-swarm MTU 1550 → Docker overlay MTU 1450 (leaves 100 bytes for VXLAN) – set it when creating the overlay network, as sketched below.
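
One way to pin the overlay MTU explicitly is a driver option at network-creation time; the network name `lab-overlay` is just an illustration:

```bash
# attachable overlay network with an explicit MTU matching the plan above
docker network create \
  --driver overlay \
  --attachable \
  --opt com.docker.network.driver.mtu=1450 \
  lab-overlay
```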

------------------------------------------------
4. Storage layer
------------------------------------------------
Inside Swarm
• Local-path-provisioner on each worker (fast SSD-cache) for stateless pods.
• NFS share on d12-gw (`/srv/nfs`) exported to 10.10.10.0/24 for shared volumes (logs, Prometheus TSDB cold tier) – see the sketch after this list.
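
A minimal sketch of that wiring, assuming d12-gw answers on 10.10.10.1 and using a hypothetical volume name `shared-logs`:

```bash
# on d12-gw (assumes nfs-kernel-server is installed): export the share to the swarm subnet
echo '/srv/nfs 10.10.10.0/24(rw,sync,no_subtree_check)' | sudo tee -a /etc/exports
sudo exportfs -ra

# on a swarm node: declare an NFS-backed named volume via the local driver
docker volume create \
  --driver local \
  --opt type=nfs \
  --opt o=addr=10.10.10.1,rw,nfsvers=4 \
  --opt device=:/srv/nfs \
  shared-logs
```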

Backup policy
• DSM Snapshot Replication on the VM folders nightly.
• Off-site push via Hyper Backup to a cloud bucket.

------------------------------------------------
5. Observability stack (first-class)
------------------------------------------------
Namespace: `monitoring`

| Service       | Placement      | Resource limits  | Notes                           |
|---------------|----------------|------------------|---------------------------------|
| Prometheus    | mgr-1          | 1 vCPU, 2 GB RAM | 15-day retention, 10 GiB volume |
| Loki          | mgr-2          | 1 vCPU, 1 GB RAM | 7-day retention, 5 GiB volume   |
| Grafana       | mgr-3          | 1 vCPU, 1 GB RAM | persistent `grafana.db` on NFS  |
| node-exporter | global (all 5) | 0.1 vCPU, 64 MB  | host metrics                    |
| cadvisor      | global (all 5) | 0.2 vCPU, 128 MB | container metrics               |
| promtail      | global (all 5) | 0.1 vCPU, 64 MB  | forwards to Loki                |

Deploy via a single compose file (`observability-stack.yml`) using `docker stack deploy`; a trimmed sketch follows.
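
The stack file itself isn’t reproduced here, so the following is a hypothetical skeleton (image tags, port, and resource numbers are illustrative, not a tested manifest); only two of the six services are shown, the rest follow the same pattern:

```bash
cat > observability-stack.yml <<'EOF'
version: "3.9"
services:
  prometheus:
    image: prom/prometheus:latest
    ports: ["9090:9090"]
    deploy:
      placement:
        constraints: ["node.labels.role == monitor"]
      resources:
        limits: { cpus: "1", memory: 2G }
  node-exporter:
    image: prom/node-exporter:latest
    deploy:
      mode: global                        # one task on every node
      resources:
        limits: { cpus: "0.10", memory: 64M }
EOF

docker stack deploy -c observability-stack.yml monitoring
```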

------------------------------------------------
6. Swarm service placement rules
------------------------------------------------
```yaml
# Keep user workloads off the managers with placement constraints.
# (Draining the managers – docker node update --availability drain <node> –
#  would also evict the monitoring stack that runs on mgr-*, so prefer labels.)
deploy:
  placement:
    constraints:
      - node.labels.role == worker    # user workloads
      # - node.labels.role == monitor # monitoring stack (mgr-*)
```

Label nodes:
```bash
# docker node update takes exactly one node per invocation, so loop
for n in d12-w1 d12-w2;        do docker node update --label-add role=worker  "$n"; done
for n in d12-m1 d12-m2 d12-m3; do docker node update --label-add role=monitor "$n"; done
```
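
With the labels in place, the same constraint works from the CLI; a quick throw-away check (the service name is arbitrary):

```bash
# tasks should land on d12-w1 / d12-w2 only
docker service create \
  --name placement-test \
  --replicas 2 \
  --constraint 'node.labels.role == worker' \
  nginx:alpine
docker service ps placement-test
docker service rm placement-test
```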

------------------------------------------------
7. Post-bake-in “rightsizing” schedule
------------------------------------------------
Monitor **Grafana → “VM Memory Utilization”** for 2 weeks.

Typical safe cuts

| VM           | New RAM   | Justification                          |
|--------------|-----------|----------------------------------------|
| d12-gw       | 512 MB    | Static routes + dnsmasq only           |
| d12-m{1,2,3} | 2 GB each | 1 GB OS + 1 GB Prometheus/Loki/Grafana |
| d12-w{1,2}   | 2 GB each | 1 GB OS + 1 GB workload burst          |

TOTAL after resize ≈ **10.5 GB** (leaves ~8 GB for DSM & SSD-cache buffers).

Use **virtio-balloon** so DSM can reclaim unused RAM dynamically.

------------------------------------------------
8. Security & hardening checklist
------------------------------------------------
✓ TLS on the Docker socket (`dockerd --tlsverify`)
✓ SSH key-only auth, fail2ban on d12-gw
✓ `sysctl` hardening: `net.ipv4.ip_forward=1` only on d12-gw, disabled elsewhere
✓ Secrets via Docker secrets, NOT env-vars – see the sketch after this list
✓ Weekly DSM offline snapshots of the VM disks
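
A minimal sketch of the secrets pattern (the `db_password` name and the postgres image are illustrative only):

```bash
# create a secret from stdin; Swarm stores it encrypted in the Raft log
printf 'S3cr3t!' | docker secret create db_password -

# attach it to a service; it appears in the container at /run/secrets/db_password
docker service create \
  --name db \
  --secret db_password \
  -e POSTGRES_PASSWORD_FILE=/run/secrets/db_password \
  postgres:16
```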

------------------------------------------------
9. Day-2 growth path
------------------------------------------------
• Add USB-NIC → LACP bond → 2 GbE for DSM + VMs.
• When CPU becomes the bottleneck, migrate workers to bare-metal NUCs; keep managers on the NAS.
• Move NFS to a dedicated SSD shelf via a USB-C enclosure if I/O saturates.

------------------------------------------------
One-command bootstrap (after VMs exist)
------------------------------------------------
```bash
# on d12-m1
git clone https://github.com/you/swarm-lab
cd swarm-lab
./scripts/init.sh   # labels nodes, deploys observability stack
```
After 5 minutes you’ll have metrics, logs, and a resource dashboard; a sketch of what `init.sh` could contain follows.
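
The repo URL above is a placeholder, so here is a hypothetical outline of such an `init.sh`, reusing only commands already shown in this document:

```bash
#!/usr/bin/env bash
# hypothetical scripts/init.sh – label nodes, then deploy the monitoring stack
set -euo pipefail

for n in d12-w1 d12-w2;        do docker node update --label-add role=worker  "$n"; done
for n in d12-m1 d12-m2 d12-m3; do docker node update --label-add role=monitor "$n"; done

docker stack deploy -c observability-stack.yml monitoring
```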

---

Below is a concise, end-to-end walkthrough that takes you from “I have a fresh Linux VM” to “I have a 3-node Docker Swarm running a small demo service.”
Everything is 100 % CLI-based and works on Debian 12, Ubuntu 22/24, or Alpine with Docker ≥ 27.3 installed.

--------------------------------------------------
1. Prerequisites
--------------------------------------------------
• 3 Linux hosts (1 manager + 2 workers) on the same L2/L3 network
• Docker Engine installed and started on every host
• Port **2377** (TCP), **7946** (TCP & UDP), and **4789** (UDP) open between hosts – see the firewall sketch after this list
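
If the hosts happen to run ufw, one way to open those ports looks like this (adapt to whatever firewall you actually use):

```bash
# swarm control plane, node gossip, and VXLAN overlay traffic
sudo ufw allow 2377/tcp
sudo ufw allow 7946/tcp
sudo ufw allow 7946/udp
sudo ufw allow 4789/udp
sudo ufw reload
```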

--------------------------------------------------
2. Install Docker (example for Debian/Ubuntu)
--------------------------------------------------
```bash
sudo apt update && sudo apt install -y docker.io
sudo systemctl enable --now docker
sudo usermod -aG docker $USER   # log out & back in
```
(The distro’s `docker.io` package can lag behind; install from Docker’s official apt repository if you need a current engine.)

--------------------------------------------------
3. Initialize the swarm (on the manager)
--------------------------------------------------
```bash
# advertise an address (or interface) the other nodes can reach, e.g. 192.168.1.10
docker swarm init --advertise-addr 192.168.1.10
```
You’ll receive a `docker swarm join` command with a token.
Example output:
```
Swarm initialized ...
To add a worker, run:
    docker swarm join --token SWMTKN-1-xxxxx 192.168.1.10:2377
```
Save that token; you can re-print it at any time, as shown below.
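
If the token goes missing, the manager can regenerate the full join command on demand:

```bash
# print the current join command for workers (or managers)
docker swarm join-token worker
docker swarm join-token manager
```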

--------------------------------------------------
4. Join the worker nodes
--------------------------------------------------
SSH to each worker and paste the join command:
```bash
docker swarm join --token <token> 192.168.1.10:2377
```
Back on the manager, verify:
```bash
docker node ls
```
You should see 3 nodes in the “Ready/Active” state.

--------------------------------------------------
5. Deploy your first service
--------------------------------------------------
Create a replicated nginx service (3 instances) and expose it on port 80:
```bash
docker service create \
  --name web \
  --replicas 3 \
  --publish 80:80 \
  nginx:latest
```
Check its status:
```bash
docker service ls
docker service ps web
```
Swarm automatically spreads the 3 nginx containers across the three nodes, and the ingress routing mesh answers on port 80 of every node – a quick check is sketched below.
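
A quick smoke test from any machine that can reach the nodes (192.168.1.10 is the manager address used above; the second address is an assumed worker IP):

```bash
# the routing mesh serves the service on every node,
# even on nodes that currently run no nginx task
curl -s http://192.168.1.10/ | head -n 4
curl -s http://192.168.1.11/ | head -n 4
```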

--------------------------------------------------
6. Scale or update with one command
--------------------------------------------------
Scale up:
```bash
docker service scale web=5
```
Rolling update to a new image:
```bash
docker service update --image nginx:1.25-alpine web
```
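
Rolling updates can be throttled, and a bad one reverted; both knobs are built in:

```bash
# make future updates go one task at a time with a 10 s pause between tasks
docker service update --update-parallelism 1 --update-delay 10s web

# revert to the previously deployed spec if an update misbehaves
docker service rollback web
```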

--------------------------------------------------
7. Optional: deploy with a stack file
--------------------------------------------------
Save as `demo-stack.yml`:
```yaml
version: "3.9"
services:
  web:
    image: nginx:alpine
    ports:
      - "80:80"
    deploy:
      replicas: 3
```
Deploy:
```bash
docker stack deploy -c demo-stack.yml demo
```
You now have the same nginx cluster expressed declaratively.
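
Useful follow-ups on that stack (they all take the `demo` stack name used above):

```bash
docker stack services demo   # one line per service with replica counts
docker stack ps demo         # every task and the node it landed on
docker stack rm demo         # tear the whole stack down again
```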

--------------------------------------------------
8. Day-2 commands you’ll use often
--------------------------------------------------
• List nodes: `docker node ls`
• Inspect a service: `docker service inspect --pretty web`
• Drain a node (maintenance): `docker node update --availability drain <node>`
• Remove a service: `docker service rm web`
• Leave the swarm (worker): `docker swarm leave`
• Tear down the whole swarm (manager): `docker swarm leave --force`

--------------------------------------------------
Key concepts in 30 seconds
--------------------------------------------------
Node    = Docker host joined to the swarm
Manager = schedules tasks & keeps the swarm state
Worker  = runs the containers (tasks)
Service = desired state for a set of containers (image, replicas, ports, etc.)
Task    = a single container instance scheduled by the swarm
Stack   = group of services defined in a Compose file

That’s all you need to get productive. Once you’re comfortable, add secrets, configs, health checks, and multiple managers for HA.