Mixed-Architecture Cluster

Mac minis + Raspberry Pis + maybe one x86 box. Cross-arch (arm64-apple, arm64-linux, amd64-linux), cross-OS (macOS + Linux), one registry. This is the shape the example registry that ships with the repo is built for.

TL;DR

Hardware: Any combination of M-series Mac minis, arm64-linux Pis, and amd64-linux boxes (NUC, mini PC, old desktop)
Architectures supported: arm64-apple, arm64-linux, amd64-linux, amd64-apple (Intel Mac)
OS combos: macOS + Linux mixed in the same cluster — no special flags
Image strategy: One buildx host produces multi-arch images; deploys pull the right variant per host
Time to first cross-arch deploy: ~1 hour from cold start; less if you already have buildx working
Biggest gotcha: Anything that hard-codes linux/amd64 (a vendor image, or your own Dockerfile with the wrong base) will fail on arm64 nodes, either silently or with confusing errors

Why pick this shape

  • Use the hardware you already have. The ten-year-old desktop in the basement is a perfectly fine x86 worker.
  • Apple Silicon for the dev-feeling control plane (Caddy + DNS + Vault on a mini), Pis for the cheap 24/7 services, an x86 box for whatever needs amd64.
  • The full registry.example.yml at the repo root demonstrates a 7-host version of this layout: 3 Macs (m1, mini1, mini2) + 4 Pis (pi1–pi4).

Reading the example registry

hosts:
  m1:
    ip: 192.0.2.20
    arch: arm64-apple
    ssh_user: <ssh-user>
    path: <base-path>
    roles: [user_services, workflows, vault]
  mini1:
    ip: 192.0.2.10
    arch: arm64-apple
    roles: [infrastructure, test_databases, mlx_backends]
  mini2:
    ip: 192.0.2.11
    arch: arm64-apple
    roles: [production_databases, mlx_inference]
  pi1:
    ip: 192.0.2.31
    arch: arm64-linux
    roles: [kag_ecosystem]
  pi2:
    arch: arm64-linux
    roles: [management_services]
  pi3:
    arch: arm64-linux
    roles: [business_services, ingestion]
  pi4:
    arch: arm64-linux
    roles: [coding_service]

The arch: key tells Portoser which image variant to pull or build. roles: is purely for human readability — it doesn't affect routing — but using it consistently makes the registry navigable as it grows.
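For a quick sanity view of the arch mix, plain awk is enough. This is an indentation-sensitive sketch that assumes the two-space layout shown above; in practice you would feed it registry.example.yml instead of the inline excerpt:

```shell
# Print "host arch" pairs from a registry excerpt (sketch; in practice
# read from registry.example.yml instead of the here-doc).
awk '
  /^  [a-z0-9_]+:$/ { host = $1; sub(/:$/, "", host) }
  /^    arch:/      { print host, $2 }
' <<'YAML'
hosts:
  m1:
    arch: arm64-apple
  pi1:
    arch: arm64-linux
YAML
```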

Building multi-arch images

Pick whichever machine has the most cores as the buildx host; usually that is also your dev box.

./portoser cluster setup-buildx
# Configures a buildx builder with linux/arm64 + linux/amd64 platforms

./portoser cluster build --all
# Cross-builds every service that has a Dockerfile, in parallel batches

Under the hood this calls into lib/cluster/buildx.sh:setup_cluster_buildx, which spins up a docker buildx builder named after your registry and verifies it's ready before any build runs.
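If you want to see (or reproduce) roughly what that does by hand, the manual equivalent is a single buildx builder with both Linux platforms. The builder name "portoser" below is an assumption; the real one is derived from your registry:

```shell
# Rough manual equivalent of `cluster setup-buildx`. Builder name is assumed.
platforms="linux/arm64,linux/amd64"
if command -v docker >/dev/null 2>&1; then
  docker buildx create --name portoser --driver docker-container \
    --platform "$platforms" --use
  docker buildx inspect --bootstrap   # starts BuildKit and verifies readiness
fi
```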

If you push to a registry: configure docker_registry in .env (see .env.example). If you don't: buildx can store images locally and cluster deploy will docker save | ssh ... docker load them across — slower, but works without infrastructure.
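The registry-less path boils down to streaming a serialized image over SSH. A minimal sketch, with illustrative image and host names:

```shell
# Push-less image transfer: serialize locally, load on the target host.
# Image name and target host are illustrative.
ship_image() {  # usage: ship_image <image:tag> <host>
  docker save "$1" | gzip | ssh "$2" 'gunzip | docker load'
}
# ship_image ingestion:latest pi3
```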

Cross-arch traps

Hard-coded --platform=linux/amd64

Easy to do by accident:

FROM --platform=linux/amd64 python:3.12-slim    # ← wrong on arm64 hosts

Drop the --platform flag and let buildx figure it out per target. Or specify it conditionally if you really need to pin one stage.
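In Dockerfile terms (BUILDPLATFORM is a built-in buildx argument; the pinned stage is shown only for the rare case where you need it):

```dockerfile
# Let buildx choose the right variant per --platform target:
FROM python:3.12-slim

# Rare case: a stage that must run on the build host's own arch can use the
# automatic BUILDPLATFORM arg instead of a hard-coded literal:
# FROM --platform=$BUILDPLATFORM python:3.12-slim AS build
```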

Vendor images that are amd64-only

Some upstream images (older databases, certain ML SDKs) only publish for amd64. You have three choices:

  1. Pin the service to your amd64 host by setting current_host: nuc1 for any service whose image won't run on arm64.
  2. Run via QEMU emulation — works under Docker Desktop on macOS but is slow and unreliable for heavy workloads.
  3. Build your own arm64 image if the upstream is just a Python package with no native deps.
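For option 2 on a Linux node (an extension of the Docker Desktop note above, which ships its own emulation), the usual way to register QEMU handlers is tonistiigi/binfmt:

```shell
# Register QEMU so amd64 images can run on an arm64-linux host.
# Slow; fine for light services, not for heavy workloads.
emu_arch="amd64"
if command -v docker >/dev/null 2>&1; then
  docker run --privileged --rm tonistiigi/binfmt --install "$emu_arch"
  docker run --rm --platform linux/amd64 alpine uname -m   # x86_64 under QEMU
fi
```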

Native services across OSes

deployment_type: native works on both. lib/platform/detector.sh looks at arch: (or uname on the host) and dispatches to launchctl (Darwin) or systemctl (Linux). The service_file you reference can declare both — Portoser will pick the right block.

services:
  postgres-prod:
    hostname: postgres-prod.internal
    current_host: mini2          # Apple Silicon → launchctl
    deployment_type: native
    service_file: /postgres/service.yml
    port: 5432
  ingestion:
    hostname: ingestion.internal
    current_host: pi3            # arm64-linux → systemctl
    deployment_type: docker      # mixing types is fine
    docker_compose: /ingestion_service/docker-compose.yml
    port: 8555
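As a purely hypothetical sketch of the dual-OS service_file idea (the key names below are invented for illustration; check the repo's own service files for the real schema):

```yaml
# HYPOTHETICAL schema — illustrative only; see the repo for the real one.
darwin:                                  # picked on macOS hosts (launchctl)
  label: com.example.postgres-prod
  program: /opt/homebrew/bin/postgres
linux:                                   # picked on Linux hosts (systemctl)
  unit: postgres-prod.service
  exec_start: /usr/bin/postgres
```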

Bringing it up

# 1. Make sure SSH keys reach every node (key-only auth, no passwords)
for host in mini1 mini2 m1 pi1 pi2 pi3 pi4; do
  ssh-copy-id "$host"
done

# 2. Validate the registry — this catches bad arch values, missing fields
./portoser registry validate

# 3. Set up buildx on the builder host
./portoser cluster setup-buildx

# 4. Build everything once
./portoser cluster build --all

# 5. Deploy in role order — infrastructure first
./portoser deploy mini1 dnsmasq caddy
./portoser deploy mini1 vault
./portoser deploy mini2 postgres-prod
./portoser cluster deploy --all

The cluster deploy --all path SSHes into each host in turn, runs docker compose up -d (or local/native equivalents), and waits for health to come green. lib/cluster/deploy.sh:verify_deployment is the gate.
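The health gate is essentially a poll-until-green loop per service. A hedged sketch in the spirit of verify_deployment (the real logic lives in lib/cluster/deploy.sh; the /health path and 5-second interval are assumptions):

```shell
# Sketch of a verify_deployment-style gate: poll a service's health
# endpoint over SSH until it answers, or give up after N tries.
wait_healthy() {  # usage: wait_healthy <host> <port> [tries]
  _host=$1; _port=$2; _tries=${3:-12}
  while [ "$_tries" -gt 0 ]; do
    ssh "$_host" "curl -fsS http://localhost:$_port/health" >/dev/null 2>&1 && return 0
    _tries=$((_tries - 1)); sleep 5
  done
  return 1
}
```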

Health and observability

./portoser cluster health --watch         # live, all hosts, all services
./portoser cluster docker-health -v        # container counts per host
./portoser dependencies graph              # cross-host dependency edges
./portoser metrics                          # per-service resource usage

Each host runs the same metrics collector (lib/metrics/collector.sh) shelled in over SSH. Cross-arch differences are invisible in the output — the collector reads /proc on Linux and top/vm_stat on macOS, and produces the same JSON shape either way.

Where this shape falls down

  • The buildx host becomes a soft dependency. If you can't build, you can't push new images. Keep the buildx config reproducible (it's a single setup-buildx invocation).
  • Diagnosing a problem that only happens on one arch ("works on my Mac, fails on the Pi") is real and tedious. The self-healing loop's playbooks are arch-aware in some patterns but not all — port conflicts, stale processes, and disk exhaustion are arch-agnostic; everything else may need eyeballs.
  • If your hardware is so heterogeneous that more than half your services pin to a specific host, you're not really running a cluster — you're running multiple single-host setups behind one registry. That's fine, just be honest about it when reasoning about failover.

Next