Roadmap

Milestone 1 — Core Orchestration (MVP)

Status: Done

Deploy containers across multiple servers using a familiar YAML manifest.

Status: Done

Per-container health status, logs, and visibility from the CLI.

Agent monitors container health after deployment (running, exited, restarting)
Agent reports per-container status back to Engine via gRPC
banyan-cli status shows per-service and per-container status (not just aggregate)
CLI command to stream container logs from agents (via engine gRPC proxy)
Detect and surface failed containers (e.g. exited immediately after start)
banyan-cli down command to stop and remove all containers for a deployment
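
The failure detection and per-service roll-up described above could look roughly like the sketch below. It is illustrative only: the `ContainerReport` type, the `FAST_EXIT_SECONDS` threshold, and the status strings `failed` and `degraded` are assumptions, not part of Banyan's actual code.

```python
from dataclasses import dataclass

# Hypothetical threshold: a container that exits this soon after starting
# is treated as "failed on start" rather than a normal exit.
FAST_EXIT_SECONDS = 5.0


@dataclass
class ContainerReport:
    name: str
    state: str            # "running", "exited", or "restarting"
    exit_code: int = 0
    uptime_seconds: float = 0.0


def is_failed(report: ContainerReport) -> bool:
    """Flag containers that exited with an error or right after start."""
    if report.state != "exited":
        return False
    return report.exit_code != 0 or report.uptime_seconds < FAST_EXIT_SECONDS


def service_status(reports: list[ContainerReport]) -> str:
    """Collapse per-container reports into one per-service status string."""
    if not reports:
        return "unknown"
    if any(is_failed(r) for r in reports):
        return "failed"
    if all(r.state == "running" for r in reports):
        return "running"
    return "degraded"
```

Keeping the per-container reports alongside the roll-up is what lets a status command show both levels instead of only the aggregate.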

Status: Done

Secure gRPC communication between CLI, Engine, and Agents.

All inter-component gRPC calls are authenticated: a password for calls to the Engine, a session token for Engine → Agent calls
Agent → Engine: password in gRPC metadata on every call
CLI → Engine: password in gRPC metadata on every call
Engine → Agent: session token authentication for log streaming
Config file at /etc/banyan/banyan.yaml with sections: security, engine, agent, cli
init commands for the engine, agent, and CLI that prompt for credentials and connection info
Three separate binaries: banyan-engine, banyan-agent, banyan-cli
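
As a rough illustration of the "password in gRPC metadata on every call" items above, the receiving side might validate each call as sketched below. The metadata key `x-banyan-password` and the function name are hypothetical; the roadmap does not say how Banyan names or checks this field.

```python
import hmac

# Hypothetical metadata key; the roadmap does not name the real one.
PASSWORD_KEY = "x-banyan-password"


def is_authorized(metadata, expected_password: str) -> bool:
    """Check the password carried in a call's metadata.

    `metadata` is a sequence of (key, value) pairs, the shape gRPC uses
    for call metadata.
    """
    for key, value in metadata:
        if key.lower() == PASSWORD_KEY:
            # Constant-time comparison avoids leaking how many leading
            # characters of the password matched.
            return hmac.compare_digest(value.encode(), expected_password.encode())
    return False
```

In a real server this check would typically sit in an interceptor so that no handler can be reached without it.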

Collect and store resource metrics from every node and container.

Smarter task distribution based on node resources instead of simple round-robin.

Agent reports node resource usage (CPU, memory, disk) to Engine via etcd
Engine selects the node with the most available resources when scheduling new tasks
Resource requests in banyan.yaml: services can declare CPU and memory requirements (e.g. cpus: 2, memory: 4g)
Engine validates that target node has sufficient resources before assigning a task
Engine rejects deployments that exceed total cluster capacity
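
The scheduling rules above can be sketched as follows. This is a minimal example under stated assumptions: the `Node` type, the binary interpretation of `4g`, and the ranking by free CPU and then free memory are all guesses, since the roadmap does not define how "most available resources" is measured.

```python
from dataclasses import dataclass

_UNITS = {"k": 1024, "m": 1024**2, "g": 1024**3}


def parse_memory(text: str) -> int:
    """Convert a size such as '4g' or '512m' to bytes (binary units assumed)."""
    text = text.strip().lower()
    if text and text[-1] in _UNITS:
        return int(float(text[:-1]) * _UNITS[text[-1]])
    return int(text)


@dataclass
class Node:
    name: str
    free_cpus: float
    free_memory: int  # bytes


def pick_node(nodes, cpus: float, memory: str):
    """Return the fitting node with the most free resources, or None."""
    need = parse_memory(memory)
    fits = [n for n in nodes if n.free_cpus >= cpus and n.free_memory >= need]
    if not fits:
        return None
    # Rank by free CPU first, then free memory; the real weighting is unknown.
    return max(fits, key=lambda n: (n.free_cpus, n.free_memory))


def exceeds_cluster(nodes, requests) -> bool:
    """True if the summed (cpus, memory) requests exceed total free capacity."""
    total_cpu = sum(n.free_cpus for n in nodes)
    total_mem = sum(n.free_memory for n in nodes)
    need_cpu = sum(c for c, _ in requests)
    need_mem = sum(parse_memory(m) for _, m in requests)
    return need_cpu > total_cpu or need_mem > total_mem
```

Filtering before ranking keeps the two concerns separate: a node that cannot fit the request is never chosen, however idle it looks on one dimension.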

Multiple active engine nodes share workload for high availability and horizontal scaling.

Active-active engines: Any engine can handle CLI requests and schedule tasks
etcd coordination: Task claiming via Compare-And-Swap to prevent duplication
Distributed registry: Index-based lookup so agents pull images from the correct engine
Optimistic locking: Concurrent deployment updates are serialized
Session state in etcd: Agents can reconnect to any engine
Client load balancing: CLI connects to any available engine
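
To illustrate how compare-and-swap task claiming prevents two engines from scheduling the same task, here is a small sketch. The `KVStore` class is an in-memory stand-in, not the etcd client, and the key layout `/tasks/<id>/owner` is hypothetical; in a real deployment the same "write only if absent" step would be a single atomic transaction in the coordination store.

```python
import threading


class KVStore:
    """In-memory stand-in for a store with compare-and-swap semantics."""

    def __init__(self):
        self._data = {}
        self._lock = threading.Lock()

    def put_if_absent(self, key: str, value: str) -> bool:
        """Atomically write value only if key does not exist yet."""
        with self._lock:
            if key in self._data:
                return False
            self._data[key] = value
            return True

    def get(self, key: str):
        with self._lock:
            return self._data.get(key)


def claim_task(store: KVStore, task_id: str, engine_id: str) -> bool:
    """Try to claim a task; only the first engine to write the key wins."""
    # Hypothetical key layout.
    return store.put_if_absent(f"/tasks/{task_id}/owner", engine_id)
```

An engine that gets `False` back simply skips the task, so no extra coordination between engines is needed.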

See Multi-Engine HA Design for detailed architecture.

Scale services based on metrics and support zero-downtime updates.

Auto-scaling: Define scaling rules in the manifest (min/max replicas, target thresholds)
Auto-scaling: Engine evaluates metrics against rules and adjusts replica count
Auto-scaling: Graceful scale-down (drain before stopping)
Redeployment: Rolling update when service image or config changes
Redeployment: Health check between rollout steps
Redeployment: Automatic rollback on failure
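
The scaling and redeployment items above might be sketched as below. Both functions are assumptions for illustration: the proportional rule in `desired_replicas` is borrowed from the approach used by the Kubernetes Horizontal Pod Autoscaler, not taken from Banyan, and the `deploy` and `healthy` callbacks are hypothetical interfaces.

```python
import math


def desired_replicas(current: int, metric: float, target: float,
                     min_replicas: int, max_replicas: int) -> int:
    """Proportional scaling rule, clamped to the manifest's min/max."""
    if target <= 0:
        raise ValueError("target must be positive")
    wanted = math.ceil(current * metric / target)
    return max(min_replicas, min(max_replicas, wanted))


def rolling_update(replicas, new_image, deploy, healthy):
    """Update replicas one at a time; roll back all on the first failure.

    `replicas` maps replica name -> current image. `deploy(name, image)` and
    `healthy(name)` are caller-supplied callbacks (hypothetical interfaces).
    Returns True if the rollout completed, False if it was rolled back.
    """
    previous = dict(replicas)
    updated = []
    for name in list(replicas):
        deploy(name, new_image)
        replicas[name] = new_image
        updated.append(name)
        if not healthy(name):
            # Restore every replica touched so far, newest first.
            for done in reversed(updated):
                deploy(done, previous[done])
                replicas[done] = previous[done]
            return False
    return True
```

Checking health after each replica, rather than once at the end, limits a bad release to a single replica before the rollback starts.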

Give operators visibility into the cluster through a web UI and CLI commands.

Stronger authentication model for production environments.

Deeper observability and richer operational tooling.