# Architecture
Portoser is built as a modular, layered system that combines CLI tooling, intelligent automation, and visual management into a cohesive platform.
## High-Level Architecture

## Components

### 1. User Interfaces

#### CLI Tool (portoser)
- Technology: Bash/Shell scripts
- Purpose: Primary interface for service management
- Key Features:
- Service lifecycle management (deploy, start, stop, restart)
- Machine and service registration
- Health checks and diagnostics
- Vault integration
- Registry management
#### Web UI
- Technology: React 18 + Vite + TailwindCSS
- Purpose: Visual cluster management
- Key Features:
- Drag-and-drop service assignment
- Real-time health monitoring
- Metrics dashboards
- Deployment history
- Service dependency graphs
#### MCP Server
- Technology: FastMCP + RestrictedPython
- Purpose: AI agent integration
- Key Features:
- Tool-based service management
- Safe script execution
- Audit logging
- Authentication and authorization
### 2. Core Services Layer

#### Service Registry
- Format: YAML (`registry.yml`)
- Contents:
- Machine definitions (host, user, platform)
- Service definitions (name, type, path, machine)
- Configuration parameters
- Health check endpoints
- Dependencies
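As a rough illustration of what such a registry might look like, here is a hypothetical `registry.yml` built from the field list above; the exact schema (key names like `machines`, `services`, `depends_on`) is an assumption, not the authoritative Portoser format:

```shell
# Write a hypothetical registry file; field names follow the lists above,
# but the nesting and key names are illustrative assumptions.
cat > /tmp/registry.yml <<'EOF'
machines:
  web-1:
    host: 192.168.1.10
    user: deploy
    platform: linux
services:
  api:
    type: docker
    path: /opt/api
    machine: web-1
    health:
      endpoint: "/health"
      type: "http"
      interval: 30
    depends_on:
      - postgres
EOF

# Sanity check: the service is bound to a defined machine
grep -q 'machine: web-1' /tmp/registry.yml && echo "service bound to machine"
```

The health block mirrors the custom health-check format shown later under Extension Points.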
#### Vault Integration
- Technology: HashiCorp Vault client
- Purpose: Centralized secret management
- Features:
- Per-machine AppRole authentication
- Secret injection into services
- Certificate storage
- Credential rotation
#### Health Monitor
- Purpose: Real-time service health tracking
- Features:
- HTTP/TCP health checks
- Process monitoring
- Resource utilization tracking
- Failure detection
#### Metrics Collector
- Purpose: Usage and performance tracking
- Metrics:
- CPU and memory usage
- Disk utilization
- Network statistics
- Service uptime
- Deployment frequency
### 3. Self-Healing Loop
Every deploy passes through a four-stage loop. Each stage is a real script in lib/:
| Stage | What it does | Source |
|---|---|---|
| Observe | Reads target host state — SSH, disk, ports, Docker daemon, dependency health | lib/observe/observer.sh |
| Analyze | Classifies failures against known patterns — port conflict, stale process, Docker down, disk pressure, dependency unhealthy, SSH/permission errors | lib/diagnose/analyzer.sh |
| Solve | Runs the matching playbook from ~/.portoser/knowledge/playbooks/ | lib/solve/solver.sh |
| Learn | When a new fix succeeds, writes it back as a playbook with a frequency count | lib/standardize/learning.sh |
The full loop is on by default; opt out per-deploy with --no-auto-heal. See Intelligent Deployment for the full mechanism and Self-Healing Loop for how to extend it.
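The shape of the loop can be sketched in a few lines of shell; the function bodies and error patterns below are illustrative stand-ins, not the real `lib/` scripts:

```shell
# Minimal sketch of the four-stage loop. Patterns and names are invented
# for illustration; the real scripts live in lib/ as listed in the table.

observe() {           # lib/observe/observer.sh: gather target host state
  # The real observer probes SSH, disk, ports, and the Docker daemon;
  # here we just pass a captured error message through.
  echo "$1"
}

analyze() {           # lib/diagnose/analyzer.sh: classify the failure
  case "$1" in
    *"address already in use"*)            echo "port_conflict" ;;
    *"no space left on device"*)           echo "disk_pressure" ;;
    *"Cannot connect to the Docker daemon"*) echo "docker_down" ;;
    *)                                     echo "unknown" ;;
  esac
}

solve() {             # lib/solve/solver.sh: run the matching playbook
  # A real solver would execute the playbook script for this diagnosis.
  echo "playbook:$1"
}

state=$(observe "bind: address already in use")
diagnosis=$(analyze "$state")
solve "$diagnosis"    # prints: playbook:port_conflict
```

The Learn stage would then record any newly successful fix as a playbook, which Analyze can match on the next failure.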
### 4. Deployment Engines

#### Docker Engine
- Purpose: Deploy Docker Compose applications
- Features:
- Automatic Docker daemon startup
- Compose file validation
- Container health checks
- Volume management
- Network configuration
#### Local Engine (UV)
- Purpose: Deploy Python applications
- Features:
- UV package manager integration
- Virtual environment management
- Process supervision
- Log rotation
#### Native Engine
- Purpose: Deploy system services
- Features:
- systemd integration (Linux)
- launchd integration (macOS)
- Service file generation
- Auto-start configuration
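Service file generation for the native engine might look roughly like this; the unit template and the helper name are assumptions, not Portoser's actual output:

```shell
# Hypothetical generator for a systemd unit; the real engine's template
# and fields are not shown in this document, so this is a sketch.
gen_systemd_unit() {
  local name="$1" exec_path="$2"
  cat <<EOF
[Unit]
Description=Portoser service: ${name}

[Service]
ExecStart=${exec_path}
Restart=on-failure

[Install]
WantedBy=multi-user.target
EOF
}

# Generate a unit for an example service and verify the key directives
gen_systemd_unit my-service /opt/my-service/run.sh > /tmp/my-service.service
grep -q 'Restart=on-failure' /tmp/my-service.service && echo "unit generated"
```

On macOS the same idea would emit a launchd plist instead of a unit file.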
### 5. Infrastructure Services

#### Caddy Reverse Proxy
- Purpose: Automatic routing and TLS
- Features:
- Auto-generated Caddyfile from registry
- Automatic HTTPS
- mTLS support
- Load balancing
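A minimal sketch of Caddyfile generation from registry entries, assuming an invented `domain:upstream` input format (the `reverse_proxy` directive itself is real Caddy syntax):

```shell
# Turn "domain:upstream" lines into Caddy site blocks. The input format
# is made up for illustration; a real generator would read registry.yml.
gen_caddyfile() {
  while IFS=: read -r domain upstream; do
    printf '%s {\n    reverse_proxy %s\n}\n' "$domain" "$upstream"
  done
}

# Emits a site block routing api.example.com to a local service port
printf 'api.example.com:localhost:8080\n' | gen_caddyfile
```

Caddy then provisions TLS certificates for each listed domain automatically.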
## Data Flow

### Service Deployment Flow

### Health Monitoring Flow

## Communication Patterns

### CLI ↔ Target Machines
- Protocol: SSH
- Authentication: Key-based
- Encryption: SSH tunnel
### Web UI ↔ Backend
- Protocol: REST API + WebSockets
- Authentication: JWT + mTLS
- Format: JSON
### Backend ↔ Vault
- Protocol: HTTPS
- Authentication: AppRole
- Encryption: TLS
### Services ↔ Caddy
- Protocol: HTTP/HTTPS
- Routing: Domain-based
- TLS: Automatic certificates
## Security Architecture

### Authentication Layers

### Secret Management

### Network Security
- All inter-service communication encrypted
- Caddy handles TLS termination
- mTLS for sensitive endpoints
- Network segmentation supported
## Scalability

### Horizontal Scaling
- Add more target machines
- Distribute services across machines
- Load balance with Caddy
### Vertical Scaling
- Increase machine resources
- Adjust service resource limits
- Optimize deployment strategies
### Performance Considerations
- Parallel deployments supported
- Async health checks
- Cached registry reads
- Connection pooling
## Extension Points

### Custom Deployment Types
Add new deployment engines in lib/:
- Implement deploy/start/stop/status functions
- Register in service registry
- Follow deployment interface
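A hypothetical engine satisfying the deploy/start/stop/status interface could be as small as the sketch below; the `<type>_<action>` dispatch convention is an assumption about how lib/ wires engines, not documented behavior:

```shell
# Stub engine implementing the four interface functions for a made-up
# deployment type "mytype". Real engines would do actual work here.
mytype_deploy() { echo "deploying $1"; }
mytype_start()  { echo "starting $1"; }
mytype_stop()   { echo "stopping $1"; }
mytype_status() { echo "$1: running"; }

# Hypothetical dispatcher: resolve <type>_<action> and call it
engine_call() {
  local type="$1" action="$2" service="$3"
  "${type}_${action}" "$service"
}

engine_call mytype deploy api   # prints: deploying api
```

Once the functions exist, the service only needs a matching `type` entry in the registry.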
### Custom Health Checks
Define in registry:
```yaml
services:
  my-service:
    health:
      endpoint: "/health"
      type: "http"
      interval: 30
```
### Custom Solution Patterns
Add to lib/solve/patterns/:
- Define problem detection
- Implement solution logic
- Document in knowledge base
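As a sketch of that detect/solve split, here is a hypothetical pattern that clears a stale PID file; the function names and file layout are illustrative, not Portoser's actual pattern API:

```shell
# Detection half: a PID file whose process no longer exists is stale.
stale_pid_detect() {
  local pidfile="$1"
  [ -f "$pidfile" ] && ! kill -0 "$(cat "$pidfile")" 2>/dev/null
}

# Solution half: remove the stale file so the service can restart.
stale_pid_solve() {
  rm -f "$1" && echo "removed stale pid file $1"
}

# Simulate a stale lock with a PID that cannot exist, then heal it
echo 99999999 > /tmp/stale.pid
if stale_pid_detect /tmp/stale.pid; then
  stale_pid_solve /tmp/stale.pid   # prints: removed stale pid file /tmp/stale.pid
fi
```

A pattern like this would pair with a knowledge-base entry so the Learn stage can track how often it fires.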
### MCP Tools
Extend MCP server with custom tools:
- Add tool definition
- Implement handler
- Configure authentication
## Technology Stack Summary
| Layer | Technology | Purpose |
|---|---|---|
| CLI | Bash/Shell | Command-line interface |
| Web UI | React + Vite | Visual management |
| Backend API | FastAPI | REST endpoints |
| MCP Server | FastMCP | AI integration |
| Database | PostgreSQL | MCP data storage |
| Cache | Redis | Sessions, rate limiting |
| Secrets | HashiCorp Vault | Secret management |
| Proxy | Caddy | Reverse proxy, TLS |
| Monitoring | Custom + Prometheus | Metrics collection |
| Registry | YAML | Service definitions |
## Design Principles
- Simplicity: Easy to understand and use
- Intelligence: Auto-healing and problem detection
- Flexibility: Support multiple deployment types
- Security: Defense in depth
- Observability: Comprehensive monitoring and logging
- Extensibility: Easy to add new features
- Reliability: Fault-tolerant and self-healing
## Next Steps
- Service Registry — understanding the registry
- Self-Healing Loop — Observe → Diagnose → Solve → Standardize on every deploy
- Deployment Types — when to pick `docker` vs `local` vs `native`
- Certificates & mTLS — how trust is distributed across the cluster