Kontinuum Node — Deployment
Build process, packaging, hardware requirements, network topology, zero-downtime updates, rollback. P1 prerequisite before the first deployment.
Audience: SRE / DevOps · partner operators · self-hosted PRO users.
Related documents:
- configuration.md: config files, env vars
- bootstrap.md: first-launch sequences
- operations.md: DR plan, admin surfaces
- observability.md: Grafana / Prometheus integration
Build process
Source layout
kontinuum-node/
├── Cargo.toml          # workspace: server + admin
├── server/
├── admin/
└── deploy/             # ← deployment artifacts (this document)
    ├── docker/
    ├── k8s/
    ├── systemd/
    ├── ansible/
    ├── grafana/        # see observability.md
    ├── prometheus/
    └── directus/
Build profiles
# Cargo.toml workspace
[profile.release]
opt-level = 3
lto = "fat" # full link-time optimization
codegen-units = 1 # max optimization (slower compile, faster runtime)
strip = "symbols" # remove debug symbols
panic = "abort" # smaller binary, no unwinding tables
[profile.release-debug]
inherits = "release"
debug = true # keep debug info for production debugging
strip = "none"
Build commands
# Development build
cargo build --workspace
# Release build (for prod deployment)
cargo build --workspace --release
# Static binary (for a Docker scratch image)
RUSTFLAGS='-C target-feature=+crt-static -C link-arg=-s' \
cargo build --workspace --release --target x86_64-unknown-linux-musl
# Cross-compile for ARM (home server / Raspberry Pi)
cross build --workspace --release --target aarch64-unknown-linux-musl
Binary size targets
- kontinuum-node-server (release, stripped, musl): ≤ 30 MB
- kontinuum-node-admin (release, stripped, musl): ≤ 20 MB
If a binary exceeds its target, review the dependencies (especially the enabled tokio and libp2p features).
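The size targets could be enforced in CI with a small check. This is a minimal sketch, not part of the project: `check_size_budget` is a hypothetical helper, and it interprets "MB" as 1024 × 1024 bytes, which is an assumption.

```shell
#!/usr/bin/env bash
# Hypothetical helper: fail when a binary exceeds its size budget.
# Assumption: "MB" in the targets means MiB (1024 * 1024 bytes).
# Usage: check_size_budget <path> <max_mib>
check_size_budget() {
  local path=$1 max_mib=$2
  local bytes
  # wc -c reports the file size in bytes; a missing file yields exit code 2.
  bytes=$(wc -c < "$path") || return 2
  local limit=$(( max_mib * 1024 * 1024 ))
  if (( bytes > limit )); then
    echo "FAIL: $path is $bytes bytes, budget is $limit bytes" >&2
    return 1
  fi
  echo "OK: $path is $bytes bytes (budget $limit bytes)"
}
```

A CI step might call `check_size_budget target/x86_64-unknown-linux-musl/release/kontinuum-node-server 30` and the equivalent with `20` for the admin binary.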
Reproducible builds
Goal: a bit-identical binary for the same source and dependencies. We use:
- Pinned Cargo.lock (committed to the repo).
- SOURCE_DATE_EPOCH env var for timestamp determinism.
- Vendored dependencies (cargo vendor) for air-gapped builds.
SOURCE_DATE_EPOCH=$(git log -1 --format=%ct) \
  cargo build --workspace --release --frozen --locked
Docker packaging
Image: kontinuum-node-server
# Stage 1 — Builder
FROM rust:1.78-slim AS builder
RUN apt-get update && apt-get install -y \
pkg-config libssl-dev musl-tools \
&& rm -rf /var/lib/apt/lists/*
WORKDIR /build
COPY . .
RUN rustup target add x86_64-unknown-linux-musl \
&& RUSTFLAGS='-C target-feature=+crt-static' \
cargo build --release --target x86_64-unknown-linux-musl \
--bin kontinuum-node-server
# Stage 2 — Runtime (distroless)
FROM gcr.io/distroless/static-debian12:nonroot
COPY --from=builder \
/build/target/x86_64-unknown-linux-musl/release/kontinuum-node-server \
/usr/local/bin/
# rustfs binary (if embedded)
# COPY --from=builder /build/target/.../rustfs /usr/local/bin/
WORKDIR /var/lib/kontinuum-node
VOLUME ["/var/lib/kontinuum-node/data", "/var/lib/kontinuum-node/keys"]
EXPOSE 4001/tcp 4001/udp 9100/tcp
USER nonroot:nonroot
ENTRYPOINT ["/usr/local/bin/kontinuum-node-server"]
CMD ["--config", "/etc/kontinuum-node/node.toml"]
Image: kontinuum-node-admin
Same structure, with the additional ports 8080 (REST API) and 9090 (billing webhooks).
Image tags
kontinuum.io/kontinuum-node-server:1.0.0 # immutable version
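kontinuum.io/kontinuum-node-server:1.0 # latest patch for 1.0.x
kontinuum.io/kontinuum-node-server:latest # latest stable
kontinuum.io/kontinuum-node-server:edge # bleeding-edge (CI builds)
Multi-arch builds
docker buildx build \
  --platform linux/amd64,linux/arm64,linux/arm/v7 \
  -t kontinuum.io/kontinuum-node-server:1.0.0 \
  --push .
linux/arm/v7 covers Raspberry Pi 4 home servers. Because the Dockerfile above pins x86_64-unknown-linux-musl, the builder stage has to select the Rust target per platform (for example from BuildKit's TARGETARCH build argument) for the arm images to contain arm binaries.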
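The tag scheme listed under Image tags can be derived mechanically from a release version. The sketch below is illustrative only: `image_tags` is a hypothetical helper, and it assumes the version being published is the newest stable release (otherwise the moving `MAJOR.MINOR` and `latest` tags should not be updated).

```shell
#!/usr/bin/env bash
# Hypothetical helper: print the tags a stable release should receive.
# Assumption: <version> is the newest stable release.
# Usage: image_tags <repository> <MAJOR.MINOR.PATCH>
image_tags() {
  local repo=$1 version=$2
  if [[ ! $version =~ ^([0-9]+)\.([0-9]+)\.([0-9]+)$ ]]; then
    echo "not a MAJOR.MINOR.PATCH version: $version" >&2
    return 1
  fi
  local major=${BASH_REMATCH[1]} minor=${BASH_REMATCH[2]}
  echo "$repo:$version"        # immutable version
  echo "$repo:$major.$minor"   # moving tag: latest patch of this minor line
  echo "$repo:latest"          # moving tag: latest stable
}
```

Each printed line can be passed to `docker buildx build` as a separate `-t` argument. The `edge` tag is left out on purpose, since it tracks CI builds rather than releases.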
Kubernetes deployment
StatefulSet (server)
# deploy/k8s/server-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kontinuum-node-server
spec:
  serviceName: kontinuum-node-server
  replicas: 1
  selector:
    matchLabels: { app: kontinuum-node-server }
  template:
    metadata:
      labels: { app: kontinuum-node-server }
    spec:
      containers:
        - name: server
          image: kontinuum.io/kontinuum-node-server:1.0.0
          ports:
            - name: libp2p-tcp
              containerPort: 4001
              protocol: TCP
            - name: libp2p-quic
              containerPort: 4001
              protocol: UDP
            - name: prometheus
              containerPort: 9100
              protocol: TCP
          volumeMounts:
            - name: data
              mountPath: /var/lib/kontinuum-node/data
            - name: keys
              mountPath: /var/lib/kontinuum-node/keys
              readOnly: true
            - name: config
              mountPath: /etc/kontinuum-node
              readOnly: true
          resources:
            requests:
              cpu: 500m
              memory: 1Gi
            limits:
              cpu: 2000m
              memory: 4Gi
          livenessProbe:
            httpGet:
              path: /healthz
              port: 9100
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /readyz
              port: 9100
            initialDelaySeconds: 10
            periodSeconds: 5
      volumes:
        - name: keys
          secret:
            secretName: kontinuum-node-keys
            defaultMode: 0600
        - name: config
          configMap:
            name: kontinuum-node-config
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ReadWriteOnce]
        resources:
          requests:
            storage: 100Gi
        storageClassName: fast-ssd
Service / Ingress
# deploy/k8s/server-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: kontinuum-node-server
spec:
  type: LoadBalancer # for external libp2p access
  ports:
    - name: libp2p-tcp
      port: 4001
      targetPort: 4001
      protocol: TCP
    - name: libp2p-quic
      port: 4001
      targetPort: 4001
      protocol: UDP
  selector:
    app: kontinuum-node-server
ConfigMap + Secret
# deploy/k8s/server-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kontinuum-node-config
data:
  node.toml: |
    [identity]
    key_path = "/var/lib/kontinuum-node/keys/node.key"
    cert_path = "/var/lib/kontinuum-node/keys/node.cert"
    [network]
    listen_addrs = ["/ip4/0.0.0.0/tcp/4001", "/ip4/0.0.0.0/udp/4001/quic-v1"]
    # ...
# deploy/k8s/server-secret.yaml: do NOT commit to git! Generate via sealed-secrets / Vault
apiVersion: v1
kind: Secret
metadata:
  name: kontinuum-node-keys
type: Opaque
data:
  node.key: <base64-encoded encrypted Ed25519 key>
  node.cert: <base64-encoded cert>
Helm chart (optional)
For repeated deployments, use the chart in deploy/k8s/helm/kontinuum-node/. Its values.yaml exposes:
# values.yaml
image:
  repository: kontinuum.io/kontinuum-node-server
  tag: "1.0.0"
config:
  tier: 1
  tenancy: "single"
  operator: "org"
  geoZone: "eu-west"
  maxTotalGb: 50
resources:
  requests: { cpu: 500m, memory: 1Gi }
  limits: { cpu: 2000m, memory: 4Gi }
storage:
  size: 100Gi
  className: fast-ssd
Systemd deployment (Linux servers without k8s)
Unit file
# deploy/systemd/kontinuum-node-server.service
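[Unit]
Description=Kontinuum Node Server
# Order after network-online.target (not just network.target) so the listen addresses are usable
After=network-online.target
Wants=network-online.target
# Restart rate limiting belongs in [Unit]: at most 5 starts per 600 s
StartLimitIntervalSec=600
StartLimitBurst=5
[Service]
Type=simple
User=kontinuum-node
Group=kontinuum-node
ExecStart=/usr/local/bin/kontinuum-node-server --config /etc/kontinuum-node/node.toml
ExecReload=/bin/kill -SIGHUP $MAINPID
Restart=always
RestartSec=10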
# Security hardening
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
ReadWritePaths=/var/lib/kontinuum-node /var/log/kontinuum-node
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
RestrictNamespaces=true
RestrictRealtime=true
RestrictSUIDSGID=true
LockPersonality=true
MemoryDenyWriteExecute=true
# Resource limits
LimitNOFILE=65535
LimitNPROC=4096
MemoryMax=4G
# Logging
StandardOutput=journal
StandardError=journal
[Install]
WantedBy=multi-user.target
Installation script
#!/usr/bin/env bash
# deploy/systemd/install.sh
set -euo pipefail
# Create the service user (skipped if it already exists, so the script can be re-run)
id -u kontinuum-node >/dev/null 2>&1 || \
  useradd --system --no-create-home --shell /usr/sbin/nologin kontinuum-node
# Install binary
install -m 755 target/release/kontinuum-node-server /usr/local/bin/
# Set up directories; the log directory must exist because the unit lists it in ReadWritePaths
install -d -m 750 -o kontinuum-node -g kontinuum-node \
  /var/lib/kontinuum-node/data \
  /var/lib/kontinuum-node/keys \
  /var/log/kontinuum-node
install -d -m 755 /etc/kontinuum-node
# Install unit file
install -m 644 deploy/systemd/kontinuum-node-server.service /etc/systemd/system/
systemctl daemon-reload
systemctl enable kontinuum-node-server.service
echo "Next steps:"
echo " 1. Configure /etc/kontinuum-node/node.toml (see docs/node/configuration.md)"
echo " 2. Bootstrap identity: kontinuum-node-server bootstrap"
echo " 3. Request a cert via admin"
echo " 4. systemctl start kontinuum-node-server"
Hardware requirements
Per-tier minimum specs
| Tier | CPU | RAM | Disk (SSD) | Network |
|---|---|---|---|---|
| Tier 0 anchor | 4 vCPU | 8 GB | 50 GB | 100 Mbps symmetric, public IP |
| Tier 1 Small | 2 vCPU | 2 GB | 20 GB | 100 Mbps, public IP |
| Tier 1 Medium | 4 vCPU | 4 GB | 50 GB | 200 Mbps, public IP |
| Tier 1 Large | 8 vCPU | 8 GB | 100 GB | 1 Gbps, public IP |
| Tier 2 byo:vps | 2-4 vCPU | 2-4 GB | 20-50 GB | 50+ Mbps, public IP preferred |
| Tier 2 byo:home | ARMv7+ | 1+ GB | 50+ GB (any) | 25+ Mbps, NAT OK (DCUtR) |
IOPS requirements:
- Tier 0 / 1: ≥ 500 IOPS sustained (NVMe or enterprise SSD).
- Tier 2: ≥ 100 IOPS is sufficient (consumer SSD; HDD is discouraged for the DB but OK for blob storage).
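The fixed minimums in the table lend themselves to a preflight check before provisioning. The following is a rough sketch under stated assumptions: `meets_tier_minimum` and the tier keys (`tier0`, `tier1-small`, ...) are hypothetical names, and Tier 2 is left out because its requirements are given as ranges.

```shell
#!/usr/bin/env bash
# Hypothetical preflight: compare a host against the per-tier minimum specs.
# Exit codes: 0 = meets minimums, 1 = below minimums, 2 = unknown tier.
# Usage: meets_tier_minimum <tier> <vcpus> <ram_gb> <disk_gb>
meets_tier_minimum() {
  local tier=$1 cpu=$2 ram=$3 disk=$4
  local min_cpu min_ram min_disk
  case $tier in
    tier0)        min_cpu=4 min_ram=8 min_disk=50 ;;
    tier1-small)  min_cpu=2 min_ram=2 min_disk=20 ;;
    tier1-medium) min_cpu=4 min_ram=4 min_disk=50 ;;
    tier1-large)  min_cpu=8 min_ram=8 min_disk=100 ;;
    *) echo "unknown tier: $tier" >&2; return 2 ;;
  esac
  # Arithmetic test: succeeds only if every resource meets its minimum.
  (( cpu >= min_cpu && ram >= min_ram && disk >= min_disk ))
}
```

On Linux the inputs could come from `nproc`, `/proc/meminfo`, and `df`. `MemTotal` is usually slightly below the nominal RAM size, so round up rather than truncate before comparing.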
Capacity-tier mapping
| Cost tier (see pricing.md §8.3) | Disk (allocated) | Egress allowance/mo |
|---|---|---|
| Small | 20 GB | 100 GB |
| Medium | 50 GB | 300 GB |
| Large | 100 GB | 1 TB |
Network requirements
Open ports
| Port | Protocol | Direction | Purpose |
|---|---|---|---|
| 4001 | TCP | inbound + outbound | libp2p (Noise + Yamux) |
| 4001 | UDP | inbound + outbound | libp2p QUIC |
| 9000 | TCP | localhost only | rustfs internal (auth-shim bridged) |
| 9100 | TCP | inbound (Prom) | Prometheus metrics scrape |
| 8080 | TCP | inbound (admin) | Admin REST API |
| 9090 | TCP | inbound (billing) | Billing webhook receiver |
Firewall rules (example UFW)
# server
ufw allow 4001/tcp comment 'libp2p TCP'
ufw allow 4001/udp comment 'libp2p QUIC'
ufw allow from <prometheus-server-ip> to any port 9100 comment 'Prometheus scrape'
ufw deny 9000 # rustfs internal only
# admin (separate VM/instance)
ufw allow from <org-admin-vpn-cidr> to any port 8080 comment 'Admin REST'
ufw allow from <billing-system-ip> to any port 9090 comment 'Billing webhooks'
NAT / firewall traversal
For Tier 2 byo:home (often behind NAT):
- DCUtR (Direct Connection Upgrade through Relay): libp2p hole punching; succeeds for roughly 70% of NAT types.
- Circuit-relay v2 via Tier 1 / Tier 0 nodes: fallback for restrictive NATs.
- UPnP / NAT-PMP: opportunistic port forwarding if the router supports it.
- If NAT traversal fails completely, the node works only as a client (it accepts no inbound peers), which severely limits its role in the DHT.
DNS requirements
| Tier | DNS record needed |
|---|---|
| Tier 0 anchor | A record + AAAA (e.g., anchor1.kontinuum.network), permanent |
| Tier 1 (org-run) | A record (e.g., relay-eu-west-01.kontinuum.network) |
| Tier 1 (partner / user-owned) | IP address sufficient; libp2p PeerId stable identifier |
| Tier 2 | IP address sufficient |
Tier 0 anchors must have stable DNS names because these are hardcoded in client/node binaries (see bootstrap.md).
Zero-downtime updates
Rolling restart procedure
# 1. Put the node into drain mode
curl -X POST https://admin.kontinuum.network/api/v1/nodes/$NODE_ID/drain \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"reason":"version-upgrade","estimated_duration_minutes":10}'
# 2. Wait for the drain to complete (existing connections finish gracefully)
sleep 60
# 3. Update binary
systemctl stop kontinuum-node-server
install -m 755 /tmp/kontinuum-node-server-1.0.1 /usr/local/bin/kontinuum-node-server
systemctl start kontinuum-node-server
# 4. Wait until the node is ready
while ! curl -sf http://localhost:9100/readyz; do sleep 2; done
# 5. Drain mode is removed automatically when the service starts with a valid cert.
# Verify via the admin API:
curl https://admin.kontinuum.network/api/v1/nodes/$NODE_ID
Kubernetes rolling update
# StatefulSet update strategy
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 0 # update all replicas, one at a time
  podManagementPolicy: OrderedReady # wait for ready before next pod
Database migrations
Migrations run automatically at startup via rusqlite_migration (see db-schemas.md). They are forward-only.
Critical: before deploying a new version that contains a migration:
- Back up the DB (full snapshot).
- Test the migration in staging on a copy of the prod DB.
- Monitor the first prod nodes carefully; if the migration breaks them, stop the rollout.
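The last point can be turned into an automated gate: poll a freshly upgraded node with a deadline and abort the rollout if it never becomes ready. This is a minimal sketch; `wait_until` is a hypothetical helper, not an existing project script.

```shell
#!/usr/bin/env bash
# Hypothetical gate: run a probe command until it succeeds or a deadline passes.
# Returns 0 as soon as the probe succeeds, 1 on timeout.
# Usage: wait_until <timeout_seconds> <interval_seconds> <command...>
wait_until() {
  local timeout=$1 interval=$2
  shift 2
  # SECONDS is a bash built-in counter of seconds since the shell started.
  local deadline=$(( SECONDS + timeout ))
  until "$@"; do
    if (( SECONDS >= deadline )); then
      echo "timed out after ${timeout}s waiting for: $*" >&2
      return 1
    fi
    sleep "$interval"
  done
}
```

A rollout script might use it as `wait_until 120 2 curl -sf http://localhost:9100/readyz || exit 1`, so that a node stuck in a failing migration stops the rollout instead of blocking it forever. The 120-second budget is an illustrative value.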
Wire protocol compatibility
Major version bumps (protocol_version in NodeHello) require an overlap window of ≥ 6 months (see wire-types.md §Schema evolution).
- Two-phase rollout:
  - Phase A: deploy the new version with dual-protocol support (v1 + v2).
  - Phase B (after 6 months): drop v1 support.
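The earliest Phase B date follows from the Phase A date by calendar arithmetic. The sketch below is an illustration only: `overlap_end` and `may_drop_v1` are hypothetical helpers, and dates are assumed to be in `YYYY-MM-DD` form.

```shell
#!/usr/bin/env bash
# Hypothetical helper: date on which the overlap window ends.
# The day of month is copied unchanged, so a result such as 2026-02-31 is
# possible; it still compares correctly as a string.
# Usage: overlap_end <phase_a_date YYYY-MM-DD> <months>
overlap_end() {
  local start=$1 months=$2
  local y=${start:0:4} m=${start:5:2} d=${start:8:2}
  # 10# forces base 10 so that "08" and "09" are not parsed as octal.
  local total=$(( 10#$y * 12 + 10#$m - 1 + months ))
  printf '%04d-%02d-%s\n' "$(( total / 12 ))" "$(( total % 12 + 1 ))" "$d"
}

# Succeeds once <today> is on or after the end of the overlap window.
# Usage: may_drop_v1 <phase_a_date> <today> [months, default 6]
may_drop_v1() {
  local end
  end=$(overlap_end "$1" "${3:-6}")
  # ISO dates sort chronologically as strings.
  [[ ! $2 < $end ]]
}
```

For example, `overlap_end 2025-09-01 6` prints `2026-03-01`, so a Phase A deployed on 1 September 2025 allows Phase B from 1 March 2026.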
Rollback procedure
Scenario 1 — Binary regression (no schema change)
# Stop service
systemctl stop kontinuum-node-server
# Restore previous binary
install -m 755 /var/backup/kontinuum-node-server-1.0.0 /usr/local/bin/kontinuum-node-server
# Start
systemctl start kontinuum-node-server
Scenario 2 — Schema migration regression
Critical: migrations are forward-only in production. If migration v002 breaks something, the rollback options are:
- Restore the DB from the backup taken before deployment.
- Restore the previous binary.
- Investigate manually and write a v003 migration that fixes v002.
Never run rollback (down) migrations in prod: it is too risky.
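For the first option, the backup to restore is the newest one that predates the deployment. Assuming backups are keyed by a `YYYYMMDD-HH` prefix (as in the backup example in this document), the selection can be sketched as below; `latest_backup_before` is a hypothetical helper.

```shell
#!/usr/bin/env bash
# Hypothetical helper: pick the newest backup key that is not newer than a cutoff.
# Keys use the YYYYMMDD-HH format, which sorts chronologically as plain strings.
# Reads one key per line on stdin; returns 1 if no key qualifies.
# Usage: latest_backup_before <cutoff_key> < list_of_keys
latest_backup_before() {
  local cutoff=$1 key best=""
  while IFS= read -r key; do
    # Skip blank lines and keys that are newer than the cutoff.
    [[ -z $key || $key > $cutoff ]] && continue
    [[ -z $best || $key > $best ]] && best=$key
  done
  [[ -n $best ]] && printf '%s\n' "$best"
}
```

Given the keys `20260519-08` to `20260519-11` and a deployment that started in hour `20260519-10`, the helper prints `20260519-10`. How the key list is obtained (for example by listing the backup bucket) is left open here.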
Scenario 3 — Cert revocation incident
(see bootstrap.md §Tier 0 key compromise)
CI/CD pipeline
GitLab CI (.gitlab-ci.yml)
stages:
  - test
  - build
  - publish
  - deploy
test:
  stage: test
  script:
    - cargo fmt --check
    - cargo clippy --workspace -- -D warnings
    - cargo test --workspace
    - cargo deny check
    - cargo audit
build:
  stage: build
  script:
    - cargo build --workspace --release --target x86_64-unknown-linux-musl
  artifacts:
    paths:
      - target/x86_64-unknown-linux-musl/release/kontinuum-node-server
      - target/x86_64-unknown-linux-musl/release/kontinuum-node-admin
publish-docker:
  stage: publish
  only: [tags]
  script:
    - docker buildx build --platform linux/amd64,linux/arm64
      -t kontinuum.io/kontinuum-node-server:${CI_COMMIT_TAG}
      --push .
deploy-staging:
  stage: deploy
  only: [main]
  script:
    - kubectl set image statefulset/kontinuum-node-server
      server=kontinuum.io/kontinuum-node-server:${CI_COMMIT_SHORT_SHA}
      -n staging
deploy-prod:
  stage: deploy
  only: [tags]
  when: manual # require manual approval for prod
  script:
    - kubectl set image statefulset/kontinuum-node-server
      server=kontinuum.io/kontinuum-node-server:${CI_COMMIT_TAG}
      -n production
Backup strategy
What to backup
| Resource | Frequency | Where | Retention |
|---|---|---|---|
| node.key (encrypted with master_key) | Once (provisioning) | Offline cold storage (3+ copies) | Permanent |
| node.cert | At every renewal | Hot storage + offline | Permanent |
| Server DB (node.db) | Hourly | S3-compatible backup bucket | 30 days |
| Admin DB (admin.db) | Daily | S3-compatible backup bucket | 90 days |
| Blob storage | Continuous (replicas serve as backup) | N/A | N/A |
Backup encryption
All backups are encrypted before upload (with age or GPG):
# Backup encryption example
sqlite3 /var/lib/kontinuum-node/data/node.db ".backup /tmp/node.db.tmp"
age -r "$BACKUP_PUBKEY" -o /tmp/node.db.tmp.age /tmp/node.db.tmp
aws s3 cp /tmp/node.db.tmp.age "s3://backup-bucket/$(date +%Y%m%d-%H)/node.db.age"
rm /tmp/node.db.tmp /tmp/node.db.tmp.age
Restore procedure
# 1. Download backup
aws s3 cp s3://backup-bucket/20260519-10/node.db.age /tmp/
# 2. Decrypt
age -d -i $BACKUP_PRIVATE_KEY -o /tmp/node.db /tmp/node.db.age
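# 3. Stop service, replace DB, restart
systemctl stop kontinuum-node-server
cp /tmp/node.db /var/lib/kontinuum-node/data/node.db
# Remove stale WAL/SHM files left by the old DB, if any, so SQLite does not replay them
rm -f /var/lib/kontinuum-node/data/node.db-wal /var/lib/kontinuum-node/data/node.db-shm
chown kontinuum-node:kontinuum-node /var/lib/kontinuum-node/data/node.db
systemctl start kontinuum-node-server
# 4. Network anti-entropy gossip then repairs any divergence from other nodes
Tested deployment matrix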
| Platform | Versions tested | Status |
|---|---|---|
| Ubuntu 22.04 LTS | Latest patch | Primary |
| Ubuntu 24.04 LTS | Latest patch | Tested |
| Debian 12 | Latest patch | Tested |
| Fedora 39+ | Latest | Best-effort |
| Arch | Rolling | Best-effort |
| Alpine 3.19+ | (musl-static builds) | Container only |
| macOS | - | Dev only |
| Windows | - | Not supported |
| Raspberry Pi OS | (arm64 builds) | Tier 2 byo:home |
Implementation checklist
- [ ] deploy/docker/Dockerfile.server + Dockerfile.admin.
- [ ] deploy/k8s/: StatefulSet, Service, ConfigMap, Secret templates.
- [ ] deploy/k8s/helm/kontinuum-node/: Helm chart (optional but recommended).
- [ ] deploy/systemd/kontinuum-node-server.service + kontinuum-node-admin.service.
- [ ] deploy/ansible/: playbooks for bare-metal / VM provisioning (optional).
- [ ] deploy/scripts/install.sh: single-machine setup.
- [ ] deploy/scripts/backup.sh + restore.sh.
- [ ] CI/CD pipeline in .gitlab-ci.yml.
- [ ] Reproducible build verification in CI (compare hashes between clean builds).
- [ ] Multi-arch builds verified (amd64 + arm64).
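The reproducible-build item in the checklist amounts to building twice from clean checkouts and comparing digests. A minimal sketch of the comparison step follows; `same_digest` is a hypothetical helper, and it assumes GNU coreutils `sha256sum` is available on the CI runner.

```shell
#!/usr/bin/env bash
# Hypothetical CI check: succeed only if two build outputs are bit-identical.
# Exit codes: 0 = identical, 1 = different, 2 = an artifact is missing or unreadable.
# Usage: same_digest <artifact_from_build_1> <artifact_from_build_2>
same_digest() {
  local a b
  [[ -r $1 && -r $2 ]] || { echo "missing artifact" >&2; return 2; }
  # Reading from stdin makes sha256sum print "<hash>  -"; keep only the hash.
  a=$(sha256sum < "$1") && b=$(sha256sum < "$2") || return 2
  echo "build 1: ${a%% *}"
  echo "build 2: ${b%% *}"
  [[ ${a%% *} == "${b%% *}" ]]
}
```

In CI the two arguments would be the same binary path taken from two separate clean builds that both use the `SOURCE_DATE_EPOCH` command from the Reproducible builds section. Printing both digests makes a mismatch easy to spot in the job log.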
Open implementation questions
- Distroless vs Alpine vs scratch. Distroless is the secure default but has no shell for debugging. Alpine is small and has a shell. Scratch is the smallest, with no debug tooling. Compromise: distroless for prod, Alpine for dev/staging images.
- Helm vs raw manifests. Helm makes repeated deployments easier but adds a dependency. Decision: raw manifests are primary in deploy/k8s/; the Helm chart is optional, for convenience.
- systemd vs OpenRC. The spec prioritizes systemd (modern Linux). Alpine uses OpenRC, which needs an alternative service script, post-v1.0.
- k0s / k3s vs full Kubernetes. k3s is lightweight and suitable for a small org; full k8s is for larger scale. Decision: the documentation works for both, and the choice is the operator's.
- Auto-scaling. Tier 1 nodes are not auto-scaled (provisioning goes through billing). Auto-scaling applies only to the admin process (REST API under load).