Mixed-Architecture Cluster

Mac minis + Raspberry Pis + maybe one x86 box. Cross-arch (arm64-apple, arm64-linux, amd64-linux), cross-OS (macOS + Linux), one registry. This is the shape the example registry that ships with the repo is built for.

TL;DR

Hardware: Any combination of M-series Mac minis, arm64-linux Pis, and amd64-linux boxes (NUC, mini PC, old desktop)
Architectures supported: arm64-apple, arm64-linux, amd64-linux, amd64-apple (Intel Mac)
OS combos: macOS + Linux mixed in the same cluster — no special flags
Image strategy: One buildx host produces multi-arch images; deploys pull the right variant per host
Time to first cross-arch deploy: ~1 hour from cold start; less if you already have buildx working
Biggest gotcha: Anything that hard-codes linux/amd64 (a vendor image, or your own Dockerfile with the wrong base) will fail on arm64 nodes, either silently or with confusing errors

Why pick this shape

  • Use the hardware you already have. The ten-year-old desktop in the basement is a perfectly fine x86 worker.
  • Apple Silicon for the dev-feeling control plane (Caddy + DNS + Vault on a mini), Pis for the cheap 24/7 services, an x86 box for whatever needs amd64.
  • The full registry.example.yml at the repo root demonstrates a 7-host version of this layout: 3 Macs (m1, mini1, mini2) + 4 Pis (pi1–pi4).

Reading the example registry

hosts:
  m1:
    ip: 192.0.2.20
    arch: arm64-apple
    ssh_user: <ssh-user>
    path: <base-path>
    roles: [user_services, workflows, vault]
  mini1:
    ip: 192.0.2.10
    arch: arm64-apple
    roles: [infrastructure, test_databases, mlx_backends]
  mini2:
    ip: 192.0.2.11
    arch: arm64-apple
    roles: [production_databases, mlx_inference]
  pi1:
    ip: 192.0.2.31
    arch: arm64-linux
    roles: [kag_ecosystem]
  pi2:
    arch: arm64-linux
    roles: [management_services]
  pi3:
    arch: arm64-linux
    roles: [business_services, ingestion]
  pi4:
    arch: arm64-linux
    roles: [coding_service]

The arch: key tells Portoser which image variant to pull or build. roles: is purely for human readability — it doesn't affect routing — but using it consistently makes the registry navigable as it grows.
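For a quick sanity view of the arch mix, plain awk is enough. This is an indentation-sensitive sketch that assumes the two-space layout shown above; in practice you would feed it registry.example.yml instead of the inline excerpt:

```shell
# Print "host arch" pairs from a registry excerpt (sketch; in practice
# read from registry.example.yml instead of the here-doc).
awk '
  /^  [a-z0-9_]+:$/ { host = $1; sub(/:$/, "", host) }
  /^    arch:/      { print host, $2 }
' <<'YAML'
hosts:
  m1:
    arch: arm64-apple
  pi1:
    arch: arm64-linux
YAML
```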

Building multi-arch images

Pick whichever machine has the most cores as the buildx host; usually that is also your dev box.

./portoser cluster setup-buildx
# Configures a buildx builder with linux/arm64 + linux/amd64 platforms

./portoser cluster build --all
# Cross-builds every service that has a Dockerfile, in parallel batches

Under the hood this calls into lib/cluster/buildx.sh:setup_cluster_buildx, which spins up a docker buildx builder named after your registry and verifies it's ready before any build runs.
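If you want to see (or reproduce) roughly what that does by hand, the manual equivalent is a single buildx builder with both Linux platforms. The builder name "portoser" below is an assumption; the real one is derived from your registry:

```shell
# Rough manual equivalent of `cluster setup-buildx`. Builder name is assumed.
platforms="linux/arm64,linux/amd64"
if command -v docker >/dev/null 2>&1; then
  docker buildx create --name portoser --driver docker-container \
    --platform "$platforms" --use
  docker buildx inspect --bootstrap   # starts BuildKit and verifies readiness
fi
```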

If you push to a registry: configure docker_registry in .env (see .env.example). If you don't: buildx can store images locally and cluster deploy will docker save | ssh ... docker load them across — slower, but works without infrastructure.
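The registry-less path boils down to streaming a serialized image over SSH. A minimal sketch, with illustrative image and host names:

```shell
# Push-less image transfer: serialize locally, load on the target host.
# Image name and target host are illustrative.
ship_image() {  # usage: ship_image <image:tag> <host>
  docker save "$1" | gzip | ssh "$2" 'gunzip | docker load'
}
# ship_image ingestion:latest pi3
```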

Cross-arch traps

Hard-coded --platform=linux/amd64

Easy to do by accident:

FROM --platform=linux/amd64 python:3.12-slim    # ← wrong on arm64 hosts

Drop the --platform flag and let buildx figure it out per target. Or specify it conditionally if you really need to pin one stage.
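In Dockerfile terms (BUILDPLATFORM is a built-in buildx argument; the pinned stage is shown only for the rare case where you need it):

```dockerfile
# Let buildx choose the right variant per --platform target:
FROM python:3.12-slim

# Rare case: a stage that must run on the build host's own arch can use the
# automatic BUILDPLATFORM arg instead of a hard-coded literal:
# FROM --platform=$BUILDPLATFORM python:3.12-slim AS build
```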

Vendor images that are amd64-only

Some upstream images (older databases, certain ML SDKs) only publish for amd64. You have three choices:

  1. Pin the service to your amd64 host by setting current_host: nuc1 for any service whose image won't run on arm64.
  2. Run via QEMU emulation — works under Docker Desktop on macOS but is slow and unreliable for heavy workloads.
  3. Build your own arm64 image if the upstream is just a Python package with no native deps.
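For option 2 on a Linux node (an extension of the Docker Desktop note above, which ships its own emulation), the usual way to register QEMU handlers is tonistiigi/binfmt:

```shell
# Register QEMU so amd64 images can run on an arm64-linux host.
# Slow; fine for light services, not for heavy workloads.
emu_arch="amd64"
if command -v docker >/dev/null 2>&1; then
  docker run --privileged --rm tonistiigi/binfmt --install "$emu_arch"
  docker run --rm --platform linux/amd64 alpine uname -m   # x86_64 under QEMU
fi
```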

Native services across OSes

deployment_type: native works on both. lib/platform/detector.sh looks at arch: (or uname on the host) and dispatches to launchctl (Darwin) or systemctl (Linux). The service_file you reference can declare both — Portoser will pick the right block.

services:
  postgres-prod:
    hostname: postgres-prod.internal
    current_host: mini2          # Apple Silicon → launchctl
    deployment_type: native
    service_file: /postgres/service.yml
    port: 5432
  ingestion:
    hostname: ingestion.internal
    current_host: pi3            # arm64-linux → systemctl
    deployment_type: docker      # mixing types is fine
    docker_compose: /ingestion_service/docker-compose.yml
    port: 8555
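As a purely hypothetical sketch of the dual-OS service_file idea (the key names below are invented for illustration; check the repo's own service files for the real schema):

```yaml
# HYPOTHETICAL schema — illustrative only; see the repo for the real one.
darwin:                                  # picked on macOS hosts (launchctl)
  label: com.example.postgres-prod
  program: /opt/homebrew/bin/postgres
linux:                                   # picked on Linux hosts (systemctl)
  unit: postgres-prod.service
  exec_start: /usr/bin/postgres
```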

Bringing it up

# 1. Make sure SSH keys reach every node (key-only auth, no passwords)
for host in mini1 mini2 m1 pi1 pi2 pi3 pi4; do
  ssh-copy-id "$host"
done

# 2. Validate the registry — this catches bad arch values, missing fields
./portoser registry validate

# 3. Set up buildx on the builder host
./portoser cluster setup-buildx

# 4. Build everything once
./portoser cluster build --all

# 5. Deploy in role order — infrastructure first
./portoser deploy mini1 dnsmasq caddy
./portoser deploy mini1 vault
./portoser deploy mini2 postgres-prod
./portoser cluster deploy --all

The cluster deploy --all path SSHes into each host in turn, runs docker compose up -d (or local/native equivalents), and waits for health to come green. lib/cluster/deploy.sh:verify_deployment is the gate.
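The health gate is essentially a poll-until-green loop per service. A hedged sketch in the spirit of verify_deployment (the real logic lives in lib/cluster/deploy.sh; the /health path and 5-second interval are assumptions):

```shell
# Sketch of a verify_deployment-style gate: poll a service's health
# endpoint over SSH until it answers, or give up after N tries.
wait_healthy() {  # usage: wait_healthy <host> <port> [tries]
  _host=$1; _port=$2; _tries=${3:-12}
  while [ "$_tries" -gt 0 ]; do
    ssh "$_host" "curl -fsS http://localhost:$_port/health" >/dev/null 2>&1 && return 0
    _tries=$((_tries - 1)); sleep 5
  done
  return 1
}
```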

Health and observability

./portoser cluster health --watch         # live, all hosts, all services
./portoser cluster docker-health -v        # container counts per host
./portoser dependencies graph              # cross-host dependency edges
./portoser metrics                          # per-service resource usage

Each host runs the same metrics collector (lib/metrics/collector.sh) shelled in over SSH. Cross-arch differences are invisible in the output — the collector reads /proc on Linux and top/vm_stat on macOS, and produces the same JSON shape either way.

Where this shape falls down

  • The buildx host becomes a soft dependency. If you can't build, you can't push new images. Keep the buildx config reproducible (it's a single setup-buildx invocation).
  • Diagnosing a problem that only happens on one arch ("works on my Mac, fails on the Pi") is real and tedious. The self-healing loop's playbooks are arch-aware in some patterns but not all — port conflicts, stale processes, and disk exhaustion are arch-agnostic; everything else may need eyeballs.
  • If your hardware is so heterogeneous that more than half your services pin to a specific host, you're not really running a cluster — you're running multiple single-host setups behind one registry. That's fine, just be honest about it when reasoning about failover.

Next