Kontinuum Node — Deployment

Covers the build process, packaging, hardware requirements, network topology, zero-downtime updates, and rollback. P1 prerequisite before the first deployment.

Audience: SRE / DevOps · partner operators · self-hosted PRO users.

Related documents:


Build process

Source layout

kontinuum-node/
├── Cargo.toml          # workspace: server + admin
├── server/
├── admin/
└── deploy/             # ← deployment artifacts (this document)
    ├── docker/
    ├── k8s/
    ├── systemd/
    ├── ansible/
    ├── grafana/        # see observability.md
    ├── prometheus/
    └── directus/

Build profiles

toml
# Cargo.toml workspace

[profile.release]
opt-level = 3
lto = "fat"              # full link-time optimization
codegen-units = 1        # max optimization (slower compile, faster runtime)
strip = "symbols"        # remove debug symbols
panic = "abort"          # smaller binary, no unwinding tables

[profile.release-debug]
inherits = "release"
debug = true             # keep debug info for production debugging
strip = "none"

Build commands

bash
# Development build
cargo build --workspace

# Release build (for prod deployment)
cargo build --workspace --release

# Static binary (for a scratch or distroless Docker image)
RUSTFLAGS='-C target-feature=+crt-static -C link-arg=-s' \
cargo build --workspace --release --target x86_64-unknown-linux-musl

# Cross-compile for ARM (home server / Raspberry Pi)
cross build --workspace --release --target aarch64-unknown-linux-musl

Binary size targets

  • kontinuum-node-server (release, stripped, musl): ≤ 30 MB
  • kontinuum-node-admin (release, stripped, musl): ≤ 20 MB

If a binary exceeds its target, review the dependencies (especially the enabled tokio and libp2p features).
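
The size targets above can be enforced mechanically, for example as a CI step after the release build. The following is a minimal sketch, not part of the Kontinuum tooling: `check_size` is a hypothetical helper, and it assumes 1 MB means 1024 × 1024 bytes.

```bash
# Hypothetical helper: fail when a binary exceeds its size budget.
# Usage: check_size <path> <limit-in-MB>   (1 MB taken as 1024*1024 bytes)
check_size() {
    local path="$1" limit_mb="$2"
    if [ ! -f "$path" ]; then
        echo "check_size: no such file: $path" >&2
        return 2
    fi
    local size_bytes limit_bytes
    # wc -c prints the byte count; tr strips padding some platforms add
    size_bytes=$(wc -c < "$path" | tr -d ' ')
    limit_bytes=$(( limit_mb * 1024 * 1024 ))
    if [ "$size_bytes" -gt "$limit_bytes" ]; then
        echo "FAIL: $path is $size_bytes bytes, limit is $limit_mb MB" >&2
        return 1
    fi
    echo "OK: $path is $size_bytes bytes, limit is $limit_mb MB"
}

# Example (paths follow the musl release build shown earlier):
#   check_size target/x86_64-unknown-linux-musl/release/kontinuum-node-server 30
#   check_size target/x86_64-unknown-linux-musl/release/kontinuum-node-admin 20
```

A non-zero exit status lets the CI job fail as soon as a dependency change pushes a binary over its budget.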

Reproducible builds

Goal: a bit-identical binary for the same source and dependencies. We use:

  • A pinned Cargo.lock (committed to the repo).
  • The SOURCE_DATE_EPOCH env var for timestamp determinism.
  • Vendored dependencies (cargo vendor) for air-gapped builds.
bash
SOURCE_DATE_EPOCH=$(git log -1 --format=%ct) \
cargo build --workspace --release --frozen --locked
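
Whether two builds really are bit-identical can be checked by comparing hashes of the resulting binaries. The sketch below assumes two clean builds were made in separate checkouts; `verify_reproducible` and the `build-a` / `build-b` directory names are hypothetical, and `sha256sum` (GNU coreutils) is assumed to be installed.

```bash
# Hypothetical helper: succeed only when two build outputs are bit-identical.
# Usage: verify_reproducible <binary-from-build-A> <binary-from-build-B>
verify_reproducible() {
    local file_a="$1" file_b="$2"
    if [ ! -f "$file_a" ] || [ ! -f "$file_b" ]; then
        echo "verify_reproducible: both arguments must be existing files" >&2
        return 2
    fi
    local hash_a hash_b
    # Reading from stdin keeps the file name out of the sha256sum output
    hash_a=$(sha256sum < "$file_a" | cut -d ' ' -f 1)
    hash_b=$(sha256sum < "$file_b" | cut -d ' ' -f 1)
    if [ "$hash_a" = "$hash_b" ]; then
        echo "reproducible: sha256 $hash_a"
    else
        echo "MISMATCH: $hash_a != $hash_b" >&2
        return 1
    fi
}

# Example, after running the build command above in two fresh checkouts:
#   verify_reproducible \
#       build-a/target/release/kontinuum-node-server \
#       build-b/target/release/kontinuum-node-server
```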

Docker packaging

Image: kontinuum-node-server

dockerfile
# Stage 1 — Builder
FROM rust:1.78-slim AS builder

RUN apt-get update && apt-get install -y \
    pkg-config libssl-dev musl-tools \
    && rm -rf /var/lib/apt/lists/*

WORKDIR /build
COPY . .
RUN rustup target add x86_64-unknown-linux-musl \
    && RUSTFLAGS='-C target-feature=+crt-static' \
    cargo build --release --target x86_64-unknown-linux-musl \
        --bin kontinuum-node-server

# Stage 2 — Runtime (distroless)
FROM gcr.io/distroless/static-debian12:nonroot

COPY --from=builder \
    /build/target/x86_64-unknown-linux-musl/release/kontinuum-node-server \
    /usr/local/bin/

# rustfs binary (if embedded)
# COPY --from=builder /build/target/.../rustfs /usr/local/bin/

WORKDIR /var/lib/kontinuum-node

VOLUME ["/var/lib/kontinuum-node/data", "/var/lib/kontinuum-node/keys"]

EXPOSE 4001/tcp 4001/udp 9100/tcp

USER nonroot:nonroot

ENTRYPOINT ["/usr/local/bin/kontinuum-node-server"]
CMD ["--config", "/etc/kontinuum-node/node.toml"]

Image: kontinuum-node-admin

Built the same way, additionally exposing ports 8080 (REST API) and 9090 (billing webhooks).

Image tags

kontinuum.io/kontinuum-node-server:1.0.0       # immutable version
kontinuum.io/kontinuum-node-server:1.0         # latest patch for 1.0.x
kontinuum.io/kontinuum-node-server:latest      # latest stable
kontinuum.io/kontinuum-node-server:edge        # bleeding-edge (CI builds)
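
The immutable and floating tags for a release can be derived from the version string. This is a sketch under assumptions: `release_tags` is a hypothetical helper, only plain MAJOR.MINOR.PATCH versions are accepted, and `edge` is left to CI builds, so it is not emitted here.

```bash
# Hypothetical helper: print the tags to push for a stable release.
# Usage: release_tags <repository> <MAJOR.MINOR.PATCH>
release_tags() {
    local repo="$1" version="$2"
    # Reject pre-releases and malformed versions; they must not move floating tags
    if ! printf '%s\n' "$version" | grep -Eq '^[0-9]+\.[0-9]+\.[0-9]+$'; then
        echo "release_tags: expected MAJOR.MINOR.PATCH, got: $version" >&2
        return 2
    fi
    printf '%s:%s\n' "$repo" "$version"        # immutable version
    printf '%s:%s\n' "$repo" "${version%.*}"   # latest patch of this minor line
    printf '%s:%s\n' "$repo" "latest"          # latest stable
}

# Example:
#   release_tags kontinuum.io/kontinuum-node-server 1.0.0
```

Moving `latest` this way is only correct when the version being released is the newest stable one; a patch release on an older minor line should push only the first two tags.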

Multi-arch builds

bash
docker buildx build \
    --platform linux/amd64,linux/arm64,linux/arm/v7 \
    -t kontinuum.io/kontinuum-node-server:1.0.0 \
    --push .

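linux/arm/v7 targets home servers running a 32-bit ARM OS (for example 32-bit Raspberry Pi OS); a Raspberry Pi 4 with a 64-bit OS uses the linux/arm64 image. The Dockerfile above pins the x86_64-unknown-linux-musl target, so the Rust target has to be selected per platform (for example from BuildKit's TARGETARCH build argument) before the non-amd64 images contain a matching binary.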


Kubernetes deployment

StatefulSet (server)

yaml
# deploy/k8s/server-statefulset.yaml
apiVersion: apps/v1
kind: StatefulSet
metadata:
  name: kontinuum-node-server
spec:
  serviceName: kontinuum-node-server
  replicas: 1
  selector:
    matchLabels: { app: kontinuum-node-server }
  template:
    metadata:
      labels: { app: kontinuum-node-server }
    spec:
      containers:
        - name: server
          image: kontinuum.io/kontinuum-node-server:1.0.0
          ports:
            - name: libp2p-tcp
              containerPort: 4001
              protocol: TCP
            - name: libp2p-quic
              containerPort: 4001
              protocol: UDP
            - name: prometheus
              containerPort: 9100
              protocol: TCP
          volumeMounts:
            - name: data
              mountPath: /var/lib/kontinuum-node/data
            - name: keys
              mountPath: /var/lib/kontinuum-node/keys
              readOnly: true
            - name: config
              mountPath: /etc/kontinuum-node
              readOnly: true
          resources:
            requests:
              cpu: 500m
              memory: 1Gi
            limits:
              cpu: 2000m
              memory: 4Gi
          livenessProbe:
            httpGet:
              path: /healthz
              port: 9100
            initialDelaySeconds: 30
            periodSeconds: 10
          readinessProbe:
            httpGet:
              path: /readyz
              port: 9100
            initialDelaySeconds: 10
            periodSeconds: 5
      volumes:
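        - name: keys
          secret:
            secretName: kontinuum-node-keys
            # Secret files are owned by root by default. Verify that the nonroot
            # container user can read them (e.g. via securityContext.fsGroup).
            defaultMode: 0600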
        - name: config
          configMap:
            name: kontinuum-node-config
  volumeClaimTemplates:
    - metadata:
        name: data
      spec:
        accessModes: [ReadWriteOnce]
        resources:
          requests:
            storage: 100Gi
        storageClassName: fast-ssd

Service / Ingress

yaml
# deploy/k8s/server-service.yaml
apiVersion: v1
kind: Service
metadata:
  name: kontinuum-node-server
spec:
  type: LoadBalancer        # for external libp2p access
  ports:
    - name: libp2p-tcp
      port: 4001
      targetPort: 4001
      protocol: TCP
    - name: libp2p-quic
      port: 4001
      targetPort: 4001
      protocol: UDP
  selector:
    app: kontinuum-node-server

ConfigMap + Secret

yaml
# deploy/k8s/server-config.yaml
apiVersion: v1
kind: ConfigMap
metadata:
  name: kontinuum-node-config
data:
  node.toml: |
    [identity]
    key_path = "/var/lib/kontinuum-node/keys/node.key"
    cert_path = "/var/lib/kontinuum-node/keys/node.cert"
    
    [network]
    listen_addrs = ["/ip4/0.0.0.0/tcp/4001", "/ip4/0.0.0.0/udp/4001/quic-v1"]
    # ...
yaml
# deploy/k8s/server-secret.yaml: do NOT commit to git! Generate via sealed-secrets / Vault
apiVersion: v1
kind: Secret
metadata:
  name: kontinuum-node-keys
type: Opaque
data:
  node.key: <base64-encoded encrypted Ed25519 key>
  node.cert: <base64-encoded cert>

Helm chart (optional)

For repeated deployments, use the chart in deploy/k8s/helm/kontinuum-node/. Its values.yaml exposes:

yaml
# values.yaml
image:
  repository: kontinuum.io/kontinuum-node-server
  tag: "1.0.0"

config:
  tier: 1
  tenancy: "single"
  operator: "org"
  geoZone: "eu-west"
  maxTotalGb: 50

resources:
  requests: { cpu: 500m, memory: 1Gi }
  limits: { cpu: 2000m, memory: 4Gi }

storage:
  size: 100Gi
  className: fast-ssd

Systemd deployment (Linux servers without k8s)

Unit file

ini
# deploy/systemd/kontinuum-node-server.service
[Unit]
Description=Kontinuum Node Server
# Order after network-online.target; Wants= alone only pulls it in
After=network-online.target
Wants=network-online.target
# Restart rate limiting belongs in [Unit] on current systemd versions
StartLimitIntervalSec=600
StartLimitBurst=5

[Service]
Type=simple
User=kontinuum-node
Group=kontinuum-node
ExecStart=/usr/local/bin/kontinuum-node-server --config /etc/kontinuum-node/node.toml
ExecReload=/bin/kill -HUP $MAINPID
Restart=always
RestartSec=10

# Security hardening
NoNewPrivileges=true
PrivateTmp=true
ProtectSystem=strict
ProtectHome=true
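# The leading "-" makes systemd ignore the log directory if it does not exist
# (install.sh does not create it; logs go to the journal)
ReadWritePaths=/var/lib/kontinuum-node -/var/log/kontinuum-node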
ProtectKernelTunables=true
ProtectKernelModules=true
ProtectControlGroups=true
RestrictNamespaces=true
RestrictRealtime=true
RestrictSUIDSGID=true
LockPersonality=true
MemoryDenyWriteExecute=true

# Resource limits
LimitNOFILE=65535
LimitNPROC=4096
MemoryMax=4G

# Logging
StandardOutput=journal
StandardError=journal

[Install]
WantedBy=multi-user.target

Installation script

bash
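#!/usr/bin/env bash
# deploy/systemd/install.sh
set -euo pipefail

# Create the service user (skipped if it already exists, so re-runs do not fail)
id -u kontinuum-node >/dev/null 2>&1 || \
    useradd --system --no-create-home --shell /usr/sbin/nologin kontinuum-node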

# Install binary
install -m 755 target/release/kontinuum-node-server /usr/local/bin/

# Setup directories
install -d -m 750 -o kontinuum-node -g kontinuum-node \
    /var/lib/kontinuum-node/data \
    /var/lib/kontinuum-node/keys
install -d -m 755 /etc/kontinuum-node

# Install unit file
install -m 644 deploy/systemd/kontinuum-node-server.service /etc/systemd/system/

systemctl daemon-reload
systemctl enable kontinuum-node-server.service

echo "Next steps:"
echo "  1. Configure /etc/kontinuum-node/node.toml (see docs/node/configuration.md)"
echo "  2. Bootstrap identity: kontinuum-node-server bootstrap"
echo "  3. Request a cert via admin"
echo "  4. systemctl start kontinuum-node-server"

Hardware requirements

Per-tier minimum specs

| Tier | CPU | RAM | Disk (SSD) | Network |
|---|---|---|---|---|
| Tier 0 anchor | 4 vCPU | 8 GB | 50 GB | 100 Mbps symmetric, public IP |
| Tier 1 Small | 2 vCPU | 2 GB | 20 GB | 100 Mbps, public IP |
| Tier 1 Medium | 4 vCPU | 4 GB | 50 GB | 200 Mbps, public IP |
| Tier 1 Large | 8 vCPU | 8 GB | 100 GB | 1 Gbps, public IP |
| Tier 2 byo:vps | 2-4 vCPU | 2-4 GB | 20-50 GB | 50+ Mbps, public IP preferred |
| Tier 2 byo:home | ARMv7+ | 1+ GB | 50+ GB (any) | 25+ Mbps, NAT OK (DCUtR) |

IOPS requirements:

  • Tier 0 / 1: ≥ 500 IOPS sustained (NVMe or enterprise SSD).
  • Tier 2: ≥ 100 IOPS is sufficient (consumer SSD; HDD is discouraged for the DB but OK for blob storage).

Capacity-tier mapping

| Cost tier (see pricing.md §8.3) | Disk (allocated) | Egress allowance/mo |
|---|---|---|
| Small | 20 GB | 100 GB |
| Medium | 50 GB | 300 GB |
| Large | 100 GB | 1 TB |

Network requirements

Open ports

| Port | Protocol | Direction | Purpose |
|---|---|---|---|
| 4001 | TCP | inbound + outbound | libp2p (Noise + Yamux) |
| 4001 | UDP | inbound + outbound | libp2p QUIC |
| 9000 | TCP | localhost only | rustfs internal (auth-shim bridged) |
| 9100 | TCP | inbound (Prom) | Prometheus metrics scrape |
| 8080 | TCP | inbound (admin) | Admin REST API |
| 9090 | TCP | inbound (billing) | Billing webhook receiver |

Firewall rules (example UFW)

bash
# server
ufw allow 4001/tcp comment 'libp2p TCP'
ufw allow 4001/udp comment 'libp2p QUIC'
ufw allow from <prometheus-server-ip> to any port 9100 comment 'Prometheus scrape'
ufw deny 9000  # rustfs internal only

# admin (separate VM/instance)
ufw allow from <org-admin-vpn-cidr> to any port 8080 comment 'Admin REST'
ufw allow from <billing-system-ip> to any port 9090 comment 'Billing webhooks'

NAT / firewall traversal

For Tier 2 byo:home nodes (often behind NAT):

  • DCUtR (Direct Connection Upgrade through Relay): libp2p hole punching; succeeds for roughly 70% of NAT types.
  • Circuit-relay v2 via Tier 1 / Tier 0 nodes: fallback for restrictive NATs.
  • UPnP / NAT-PMP: opportunistic port forwarding if the router supports it.
  • If NAT traversal fails completely, the node operates as a client only (it accepts no inbound peers), which severely limits its role in the DHT.

DNS requirements

| Tier | DNS record needed |
|---|---|
| Tier 0 anchor | A record + AAAA (e.g., anchor1.kontinuum.network), permanent |
| Tier 1 (org-run) | A record (e.g., relay-eu-west-01.kontinuum.network) |
| Tier 1 (partner / user-owned) | IP address sufficient; libp2p PeerId is the stable identifier |
| Tier 2 | IP address sufficient |

Tier 0 anchors must have stable DNS names, because these are hardcoded in the client and node binaries (see bootstrap.md).


Zero-downtime updates

Rolling restart procedure

bash
# 1. Put the node into drain mode
curl -X POST https://admin.kontinuum.network/api/v1/nodes/$NODE_ID/drain \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"reason":"version-upgrade","estimated_duration_minutes":10}'

# 2. Wait for the drain to complete (existing connections finish gracefully)
sleep 60

# 3. Update the binary
systemctl stop kontinuum-node-server
install -m 755 /tmp/kontinuum-node-server-1.0.1 /usr/local/bin/kontinuum-node-server
systemctl start kontinuum-node-server

# 4. Wait until the node is ready
while ! curl -sf http://localhost:9100/readyz; do sleep 2; done

# 5. Drain mode is cleared automatically when the service starts with a valid cert.
# Verify via the admin API:
curl -H "Authorization: Bearer $TOKEN" \
  https://admin.kontinuum.network/api/v1/nodes/$NODE_ID
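
The readiness loop in step 4 never gives up if the node fails to start. A bounded variant could look like the sketch below; `wait_until` is a hypothetical helper rather than part of the Kontinuum tooling, and the 120-second limit in the example is an arbitrary assumption.

```bash
# Hypothetical helper: retry a command until it succeeds or a timeout expires.
# Usage: wait_until <timeout-seconds> <interval-seconds> <command> [args...]
# Only the sleep time is counted, so the real wall-clock limit is slightly higher.
wait_until() {
    local timeout_s="$1" interval_s="$2"
    shift 2
    local elapsed=0
    until "$@"; do
        if [ "$elapsed" -ge "$timeout_s" ]; then
            echo "timed out after ${elapsed}s waiting for: $*" >&2
            return 1
        fi
        sleep "$interval_s"
        elapsed=$(( elapsed + interval_s ))
    done
}

# Example replacement for step 4:
#   wait_until 120 2 curl -sf http://localhost:9100/readyz || {
#       journalctl -u kontinuum-node-server -n 50 --no-pager
#       exit 1
#   }
```

Failing loudly here gives the operator a clear point at which to switch to the rollback procedure instead of waiting indefinitely.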

Kubernetes rolling update

yaml
# StatefulSet update strategy
spec:
  updateStrategy:
    type: RollingUpdate
    rollingUpdate:
      partition: 0   # update all replicas, one at a time
  podManagementPolicy: OrderedReady   # default; governs ordering when scaling (rolling updates already replace one pod at a time)

Database migrations

Migrations run automatically at startup via rusqlite_migration (see db-schemas.md). They are forward-only.

Critical: before deploying a new version that includes a migration:

  1. Back up the DB (full snapshot).
  2. Test the migration in staging on a copy of the prod DB.
  3. Monitor the first prod nodes carefully; if the migration breaks them, stop the rollout.

Wire protocol compatibility

Major version bumps (protocol_version in NodeHello) require an overlap window of at least 6 months (see wire-types.md §Schema evolution).

  • Two-phase rollout:
    1. Phase A: deploy new version with dual-protocol support (v1 + v2).
    2. Phase B (after 6 months): drop v1 support.

Rollback procedure

Scenario 1 — Binary regression (no schema change)

bash
# Stop service
systemctl stop kontinuum-node-server

# Restore previous binary
install -m 755 /var/backup/kontinuum-node-server-1.0.0 /usr/local/bin/kontinuum-node-server

# Start
systemctl start kontinuum-node-server
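
This rollback only works if the previous binary was saved before the upgrade. One way to guarantee that is to back up the installed binary as part of every install. The sketch below is an assumption about how that could be done; `install_with_backup` is a hypothetical helper, and the /var/backup location simply mirrors the path used above.

```bash
# Hypothetical helper: keep a copy of the installed binary before replacing it,
# so that a later rollback has something to restore.
# Usage: install_with_backup <new-binary> <dest-path> <backup-dir> <current-version>
install_with_backup() {
    local new_bin="$1" dest="$2" backup_dir="$3" current_version="$4"
    mkdir -p "$backup_dir" || return 1
    if [ -f "$dest" ]; then
        # Stored as <name>-<version>, e.g. kontinuum-node-server-1.0.0
        cp -p "$dest" "$backup_dir/$(basename "$dest")-$current_version" || return 1
    fi
    install -m 755 "$new_bin" "$dest"
}

# Example, upgrading from 1.0.0 to 1.0.1 while the service is stopped:
#   install_with_backup /tmp/kontinuum-node-server-1.0.1 \
#       /usr/local/bin/kontinuum-node-server /var/backup 1.0.0
```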

Scenario 2 — Schema migration regression

Critical: migrations are forward-only in production. If migration v002 breaks, the rollback path is:

  1. Restore the DB from the backup taken before the deployment.
  2. Restore the previous binary.
  3. Investigate manually, then write a v003 migration that fixes v002.

Never run rollback (down) migrations in prod: it is too risky.

Scenario 3 — Cert revocation incident

(see bootstrap.md §Tier 0 key compromise)


CI/CD pipeline

GitLab CI (.gitlab-ci.yml)

yaml
stages:
  - test
  - build
  - publish
  - deploy

test:
  stage: test
  script:
    - cargo fmt --check
    - cargo clippy --workspace -- -D warnings
    - cargo test --workspace
    - cargo deny check
    - cargo audit

build:
  stage: build
  script:
    - cargo build --workspace --release --target x86_64-unknown-linux-musl
  artifacts:
    paths:
      - target/x86_64-unknown-linux-musl/release/kontinuum-node-server
      - target/x86_64-unknown-linux-musl/release/kontinuum-node-admin

publish-docker:
  stage: publish
  only: [tags]
  script:
    - docker buildx build --platform linux/amd64,linux/arm64
        -t kontinuum.io/kontinuum-node-server:${CI_COMMIT_TAG}
        --push .

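deploy-staging:
  stage: deploy
  only: [main]
  script:
    # Assumes an image tagged with the short commit SHA has been pushed;
    # publish-docker above only runs for tags and pushes ${CI_COMMIT_TAG}.
    - kubectl set image statefulset/kontinuum-node-server
        server=kontinuum.io/kontinuum-node-server:${CI_COMMIT_SHORT_SHA}
        -n staging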

deploy-prod:
  stage: deploy
  only: [tags]
  when: manual   # require manual approval for prod
  script:
    - kubectl set image statefulset/kontinuum-node-server
        server=kontinuum.io/kontinuum-node-server:${CI_COMMIT_TAG}
        -n production

Backup strategy

What to backup

| Resource | Frequency | Where | Retention |
|---|---|---|---|
| node.key (encrypted with master_key) | Once (provisioning) | Offline cold storage (3+ copies) | Permanent |
| node.cert | At every renewal | Hot storage + offline | Permanent |
| Server DB (node.db) | Hourly | S3-compatible backup bucket | 30 days |
| Admin DB (admin.db) | Daily | S3-compatible backup bucket | 90 days |
| Blob storage | Continuous (replicas serve as backup) | N/A | N/A |

Backup encryption

All backups are encrypted before upload (using age or GPG):

bash
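# Backup encryption example
# Take a consistent snapshot with sqlite3's online backup command
sqlite3 /var/lib/kontinuum-node/data/node.db ".backup /tmp/node.db.tmp"
# Encrypt to the backup recipient's public key
age -r "$BACKUP_PUBKEY" -o /tmp/node.db.tmp.age /tmp/node.db.tmp
# Upload under an hourly prefix (YYYYMMDD-HH), then remove the local copies
aws s3 cp /tmp/node.db.tmp.age "s3://backup-bucket/$(date +%Y%m%d-%H)/node.db.age"
rm /tmp/node.db.tmp /tmp/node.db.tmp.age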

Restore procedure

bash
# 1. Download backup
aws s3 cp s3://backup-bucket/20260519-10/node.db.age /tmp/

# 2. Decrypt
age -d -i $BACKUP_PRIVATE_KEY -o /tmp/node.db /tmp/node.db.age

# 3. Stop service, replace DB, restart
systemctl stop kontinuum-node-server
cp /tmp/node.db /var/lib/kontinuum-node/data/node.db
chown kontinuum-node:kontinuum-node /var/lib/kontinuum-node/data/node.db
systemctl start kontinuum-node-server

# 4. Anti-entropy gossip then reconciles any divergence with other nodes

Tested deployment matrix

| Platform | Versions tested | Status |
|---|---|---|
| Ubuntu 22.04 LTS | Latest patch | Primary |
| Ubuntu 24.04 LTS | Latest patch | Tested |
| Debian 12 | Latest patch | Tested |
| Fedora 39+ | Latest | Best-effort |
| Arch | Rolling | Best-effort |
| Alpine 3.19+ | (musl-static builds) | Container only |
| macOS | - | Dev only |
| Windows | - | Not supported |
| Raspberry Pi OS | (arm64 builds) | Tier 2 byo:home |

Implementation checklist

  • [ ] deploy/docker/Dockerfile.server + Dockerfile.admin.
  • [ ] deploy/k8s/ — StatefulSet, Service, ConfigMap, Secret templates.
  • [ ] deploy/k8s/helm/kontinuum-node/: Helm chart (optional but recommended).
  • [ ] deploy/systemd/kontinuum-node-server.service + kontinuum-node-admin.service.
  • [ ] deploy/ansible/: playbooks for bare-metal / VM provisioning (optional).
  • [ ] deploy/scripts/install.sh — single-machine setup.
  • [ ] deploy/scripts/backup.sh + restore.sh.
  • [ ] CI/CD pipeline in .gitlab-ci.yml.
  • [ ] Reproducible build verification in CI (compare hashes between clean builds).
  • [ ] Multi-arch builds verified (amd64 + arm64).

Open implementation questions

  1. Distroless vs Alpine vs scratch. Distroless: secure default, but no shell for debugging. Alpine: small and has a shell. Scratch: smallest, no debugging tools. Compromise: distroless for prod, Alpine for dev/staging images.
  2. Helm vs raw manifests. Helm makes repeated deployments easier but adds a dependency. Decision: raw manifests in deploy/k8s/ are primary; the Helm chart is optional, for convenience.
  3. systemd vs OpenRC. The spec prioritizes systemd (modern Linux). Alpine uses OpenRC, so an alternative service definition is needed, post-v1.0.
  4. k0s / k3s vs full Kubernetes. k3s is lightweight and suitable for a small org; full k8s is for larger scale. Decision: the documentation works for both, and the choice is the operator's.
  5. Auto-scaling. Tier 1 nodes are not auto-scaled (they are provisioned through billing). Auto-scaling applies only to the admin process (REST API under load).