Server Configuration

Configuration format: YAML. Default filename: server.yaml.


Minimal Configuration

server:
  host: 0.0.0.0          # Listen address
  port: 8787              # Listen port

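storage: {}                             # Empty: SQLite shadow storage, runs execute inline in the API process
# To enable external stores, replace the empty mapping above with:
# storage:
#   postgres_url: ${env.DATABASE_URL}   # Optional; falls back to local SQLite shadow storage when omitted
#   redis_url: ${env.REDIS_URL}         # Optional; runs execute inline in the API process when omitted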

sandbox:
  provider: embedded                  # embedded | self_hosted | e2b
  # fleet:
  #   min_count: 1
  #   max_count: 32
  #   warm_empty_count: 1
  #   resource_cpu_pressure_threshold: 0.8
  #   resource_memory_pressure_threshold: 0.8
  #   max_workspaces_per_sandbox: 32
  #   ownerless_pool: shared          # shared | dedicated
  # self_hosted:
  #   base_url: http://oah-sandbox:8787/internal/v1
  # e2b:
  #   base_url: https://api.e2b.dev
  #   api_key: ${env.E2B_API_KEY}

paths:
  workspace_dir: /srv/openharness/workspaces       # Project workspace root
  runtime_state_dir: /srv/openharness/.openharness  # Runtime-private state root
  runtime_dir: /srv/openharness/runtimes        # Workspace runtime directory
  model_dir: /srv/openharness/models               # Platform model directory
  tool_dir: /srv/openharness/tools                 # Platform tool directory
  skill_dir: /srv/openharness/skills               # Platform skill directory

workers:
  embedded:
    min_count: 2                # Minimum worker count in API + embedded worker mode
    max_count: 4                # Upper bound for light local autoscaling
    scale_interval_ms: 1000     # Scaling check interval
    idle_ttl_ms: 30000          # How long surplus workers may stay idle before cleanup
    scale_up_window: 2          # Consecutive high-pressure samples required before scale-up
    scale_down_window: 2        # Consecutive low-pressure samples required before scale-down
    cooldown_ms: 1000           # Minimum cooldown between scaling actions
    reserved_capacity_for_subagent: 1  # Spare capacity reserved for subagent backlog

llm:
  default_model: openai-default   # Default model name (must exist in model_dir)

info Use ${env.VAR_NAME} syntax to reference environment variables.

tip Neither storage.postgres_url nor storage.redis_url is mandatory. Omitting PostgreSQL falls back to local SQLite shadow persistence; omitting Redis keeps run execution inline in the current API process.


Configuration Fields

server

Field | Type | Default | Description
host | string | 127.0.0.1 | Listen address
port | number | 8787 | Listen port

storage

Field | Type | Required | Description
postgres_url | string | No | PostgreSQL connection string. Workspaces without serviceName use this database directly; once serviceName is set, the default database keeps only the workspace/session/run routing index, while runtime truth is routed to a sibling database with a derived name (for example OAH-acme). When omitted, OAH falls back to local SQLite shadow persistence.
redis_url | string | No | Redis connection string. Used for queues, locks, rate limiting, and SSE event fanout.

tip Without PostgreSQL, workspace/session/run persistence falls back to local SQLite shadow state. That is fine for single-node development, but it is not a shared source of truth for multi-instance deployments.

tip Without Redis, runs execute in-process on the API server (suitable for local dev). With Redis, multiple worker instances can consume the queue.
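As a sketch of the multi-instance case described above, a storage block that sets both backends might look like the following. The variable names DATABASE_URL and REDIS_URL are the ones used in the minimal configuration; the connection strings they resolve to are deployment-specific.

```yaml
storage:
  # PostgreSQL becomes the shared source of truth for workspace/session/run state
  postgres_url: ${env.DATABASE_URL}
  # Redis carries queues, locks, rate limiting, and SSE event fanout
  redis_url: ${env.REDIS_URL}
```

Removing either key returns that concern to its fallback: SQLite shadow persistence for PostgreSQL, inline execution in the API process for Redis.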

object_storage

Field | Type | Description
provider | string | Currently only S3-compatible object storage is supported
bucket | string | Target bucket
region | string | Object storage region
endpoint | string | Optional custom S3/OSS/MinIO endpoint
access_key | string | Optional access credential
secret_key | string | Optional access credential
session_token | string | Optional temporary credential
force_path_style | boolean | Whether to force path-style URLs
workspace_backing_store.enabled | boolean | Enables managed workspace object-storage backing. Active workspace writes still flush only on idle / drain / delete
workspace_backing_store.key_prefix | string | Object-storage key prefix used for workspace backing
mirrors.paths | string[] | Readonly prefixes mirrored locally. Supports runtime / model / tool / skill
mirrors.sync_on_boot | boolean | Whether mirrored prefixes should be pulled from object storage on startup
mirrors.sync_on_change | boolean | Whether mirrored readonly prefixes are polled for changes. This does not live-sync active workspace writes
mirrors.poll_interval_ms | number | Mirror poll interval
mirrors.key_prefixes.* | object | Object-storage key prefix mapping for each readonly mirrored path
managed_paths / key_prefixes.* / sync_on_* | legacy | Backward-compatible legacy fields; prefer workspace_backing_store and mirrors for new configs. Loading them emits a deprecation warning

tip The runtime / model / tool / skill prefixes listed in mirrors.paths are still mirrored through ObjectStorageMirrorController, both on boot and through change polling.

tip workspace_backing_store only controls managed workspace externalRef / backing-store semantics. Active workspace writes do not flush on every change; they flush through the workspace materialization idle / drain lifecycle.
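Putting the fields above together, an object_storage block could be sketched as follows. The nesting is inferred from the dotted field names in the table. The provider identifier s3, the bucket, endpoint, credential variable names, key prefixes, and poll interval are all assumptions for illustration, so check them against the JSON Schema before use.

```yaml
object_storage:
  provider: s3                            # Assumed identifier for the S3-compatible provider
  bucket: oah-example-bucket              # Hypothetical bucket name
  region: us-east-1                       # Illustrative region
  endpoint: http://minio:9000             # Hypothetical custom endpoint; omit when not needed
  access_key: ${env.OAH_S3_ACCESS_KEY}    # Hypothetical variable name
  secret_key: ${env.OAH_S3_SECRET_KEY}    # Hypothetical variable name
  force_path_style: true                  # Path-style URLs are typical for MinIO-style endpoints
  workspace_backing_store:
    enabled: true                         # Workspace writes flush on idle / drain / delete
    key_prefix: workspaces/               # Hypothetical prefix
  mirrors:
    paths: [runtime, model, tool, skill]  # Readonly prefixes mirrored locally
    sync_on_boot: true
    sync_on_change: true                  # Polls mirrored prefixes; not a live sync of workspace writes
    poll_interval_ms: 5000                # Illustrative value
    key_prefixes:                         # Hypothetical per-path prefixes
      runtime: runtimes/
      model: models/
      tool: tools/
      skill: skills/
```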

sandbox

Field | Type | Description
provider | string | Sandbox provider. Supports embedded, self_hosted, and e2b. Defaults to embedded. embedded means the worker is hosted inside oah-api; self_hosted / e2b mean a standalone worker runs inside a real sandbox.
fleet.min_count | number | Minimum sandbox count the controller should maintain for self_hosted / e2b providers. Defaults to 1 for remote providers and 0 for embedded.
fleet.max_count | number | Maximum sandbox count the controller may target. Defaults to 64.
fleet.warm_empty_count | number | Extra empty sandboxes to keep warm so new workspaces can bind quickly at any time. Defaults to 1 for remote providers and 0 for embedded.
fleet.resource_cpu_pressure_threshold | number | Sandbox CPU pressure threshold. Ownerless workspaces prefer an empty sandbox when the CPU load ratio exceeds this value. Defaults to 0.8.
fleet.resource_memory_pressure_threshold | number | Sandbox memory pressure threshold. Ownerless workspaces prefer an empty sandbox when the memory used ratio exceeds this value. Defaults to 0.8.
fleet.max_workspaces_per_sandbox | number | Capacity limit for how many workspaces a single real sandbox should carry. Defaults to 32.
fleet.ownerless_pool | string | How workspaces without ownerId are grouped into sandboxes. shared uses a shared pool; dedicated gives each workspace its own sandbox.
self_hosted.base_url | string | Required when provider=self_hosted. Base /internal/v1 URL exposed by the sandbox-resident standalone worker.
self_hosted.headers | object | Optional static headers attached to remote self-hosted sandbox requests.
e2b.base_url | string | Optional when provider=e2b. Overrides the native E2B API base URL; legacy /internal/v1-style URLs are normalized automatically.
e2b.api_key | string | Optional. When set, OAH sends it as Authorization: Bearer <key> on e2b requests.
e2b.domain | string | Optional. Overrides the E2B sandbox domain.
e2b.template | string | Optional. Selects the E2B template used when creating sandboxes.
e2b.timeout_ms | number | Optional. Timeout for sandbox create / resolve operations.
e2b.request_timeout_ms | number | Optional. Timeout for individual E2B HTTP requests.
e2b.headers | object | Optional static headers attached to e2b requests.
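For a self-hosted deployment, the fields above might combine as in the sketch below. The base_url is the one shown in the minimal configuration; max_count is lowered only for illustration, and the header name and value are hypothetical placeholders.

```yaml
sandbox:
  provider: self_hosted
  fleet:
    min_count: 1                              # Documented default for remote providers
    max_count: 8                              # Illustrative; the documented default is 64
    warm_empty_count: 1                       # Keep one empty sandbox ready for new workspaces
    resource_cpu_pressure_threshold: 0.8      # Documented default
    resource_memory_pressure_threshold: 0.8   # Documented default
    max_workspaces_per_sandbox: 32            # Documented default
    ownerless_pool: shared                    # shared | dedicated
  self_hosted:
    base_url: http://oah-sandbox:8787/internal/v1
    headers:
      X-Example-Header: example-value         # Hypothetical static header
```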

tip OAH keeps the external /sandboxes API stable. Switching sandbox.provider changes only the server-side sandbox backend wiring; the Web app, OpenAPI clients, and runtime callers do not need to change their request shape.

tip The /sandboxes surface, the /workspace root, and the sandbox-scoped file / command semantics are deliberately shaped to stay compatible with E2B. Treat them as a stable contract, not as a temporary legacy shim that should eventually revert to /workspaces. The /workspaces API itself remains in place for workspace metadata, catalog, and lifecycle concerns.

tip self_hosted and e2b share the same execution semantics: oah-api routes workspaces into a real sandbox, while the standalone worker inside that sandbox owns the live workspace copy, local file state, and command execution context.
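Because the two remote providers share execution semantics, switching to E2B should mostly be a matter of changing the provider block. A minimal sketch, assuming the template name and timeout values shown here (they are hypothetical, not documented defaults):

```yaml
sandbox:
  provider: e2b
  e2b:
    base_url: https://api.e2b.dev     # Optional override of the native E2B API base URL
    api_key: ${env.E2B_API_KEY}       # Sent as Authorization: Bearer <key>
    template: example-template        # Hypothetical template name
    timeout_ms: 60000                 # Hypothetical timeout for sandbox create / resolve
    request_timeout_ms: 15000         # Hypothetical timeout per E2B HTTP request
```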

tip The controller now treats sandbox fleet demand as a first-class signal: the same ownerId prefers the same real sandbox, while ownerless workspaces use a shared pool by default. Ownerless workspaces first reuse existing sandboxes whose CPU and memory are both below threshold; when either CPU or memory crosses the threshold, placement falls back to the empty sandboxes reserved by warm_empty_count.

tip Starting with the current version, createSession asynchronously prewarms the target workspace after the session is created. With a remote sandbox provider, that eagerly binds the workspace to a sandbox; with workspace materialization enabled, it also prepares the active workspace copy ahead of the first user message. Combined with the remote-provider default fleet.warm_empty_count = 1, this removes most first-message cold-start latency, although very large first-time materializations can still dominate.

tip sandbox is a host-layer concept, not a project-layer concept. One sandbox may carry multiple active workspaces. It answers “where does the worker run?”, while a workspace answers “which project and capability set is being executed?”

paths

Field | Type | Description
workspace_dir | string | Project workspace root directory
runtime_state_dir | string | Runtime-private state root for SQLite shadow data, archive exports, and legacy materialization state. Defaults to dirname(workspace_dir)/.openharness
runtime_dir | string | Workspace runtime directory
model_dir | string | Platform model definition directory
tool_dir | string | Platform tool source directory, primarily used for runtime imports and shared single-workspace sources
skill_dir | string | Platform skill source directory, primarily used for runtime imports and shared single-workspace sources

workspace

Field | Type | Description
materialization.idle_ttl_ms | number | How long an active workspace copy may stay idle before flush / cleanup is considered. Default 900000.
materialization.maintenance_interval_ms | number | Background maintenance interval for workspace materialization. Default 5000.

tip workspace.materialization primarily affects object-storage backing stores, remote sandboxes, and active workspace-copy lifecycle timing. It does not change the declarative workspace capability model.
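Written out explicitly, the documented defaults correspond to the block below. The nesting is inferred from the dotted field names in the table.

```yaml
workspace:
  materialization:
    idle_ttl_ms: 900000             # 15 minutes of idleness before flush / cleanup is considered
    maintenance_interval_ms: 5000   # Background maintenance pass every 5 seconds
```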

llm

Field | Type | Description
default_model | string | Default model name. Must exist in model_dir. Resolved to platform/<name> at runtime.

workers

Field | Type | Description
embedded.min_count | number | Minimum always-on worker count in API + embedded worker mode.
embedded.max_count | number | Maximum embedded worker count under queue pressure.
embedded.scale_interval_ms | number | Rebalance interval for the embedded worker pool.
embedded.idle_ttl_ms | number | How long surplus embedded workers may stay idle before cleanup.
embedded.scale_up_window | number | Consecutive high-pressure samples required before scaling up.
embedded.scale_down_window | number | Consecutive low-pressure samples required before scaling down.
embedded.cooldown_ms | number | Cooldown between embedded worker scaling actions.
embedded.reserved_capacity_for_subagent | number | Minimum spare embedded capacity reserved for subagent backlog.
standalone.min_replicas | number | Minimum sandbox replicas the controller may keep for standalone workers. Set 0 to allow scale-to-zero when idle.
standalone.max_replicas | number | Maximum sandbox replicas the controller may target for standalone workers.
standalone.ready_sessions_per_capacity_unit | number | Queue-density target used by the controller when translating observed worker capacity into sandbox replica demand.
standalone.reserved_capacity_for_subagent | number | Minimum observed execution capacity reserved for subagent backlog.
standalone.slots_per_pod | number | Legacy compatibility field. The controller no longer uses this static value to size sandbox replicas and instead relies on worker-reported observed capacity.
controller.scale_interval_ms | number | How often the controller samples backlog / worker-registry state and recomputes desired replicas.
controller.scale_up_window | number | Consecutive high-pressure samples required before scaling up.
controller.scale_down_window | number | Consecutive low-pressure samples required before scaling down.
controller.cooldown_ms | number | Cooldown between controller scaling actions.
controller.scale_up_busy_ratio_threshold | number | Busy-ratio threshold in the range 0..1 that may trigger extra scale-up.
controller.scale_up_max_ready_age_ms | number | Allows scale-up when the oldest schedulable ready session exceeds this age.
controller.leader_election.type | string | Leader-election type for the controller. Supports noop and kubernetes.
controller.leader_election.kubernetes.* | object | Kubernetes lease settings such as namespace, lease name, API URL, token file, CA file, skip TLS verify, and identity.
controller.scale_target.type | string | Scale-target backend. Supports noop, kubernetes, and docker_compose.
controller.scale_target.allow_scale_down | boolean | Whether the controller may actively scale down replicas.
controller.scale_target.kubernetes.* | object | Kubernetes Deployment /scale target settings such as namespace, deployment, label selector, API URL, token file, CA file, and skip TLS verify.
controller.scale_target.docker_compose.* | object | Local Docker Compose scaling settings such as compose file, project name, service, command, plus optional remote executor endpoint, auth token, and timeout.

tip The controller boundary is now explicitly sandbox-only. How many threads, slots, or processes run inside a sandbox is owned by the worker runtime itself; the controller only consumes the observed capacity those workers publish and turns it into sandbox replica and placement decisions.
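A standalone-worker and controller configuration could be sketched as below. The nesting under workers is inferred from the field table, and every numeric value is illustrative rather than a documented default. The kubernetes and docker_compose sub-blocks are left out because their exact key names are defined in the JSON Schema, not in this page.

```yaml
workers:
  standalone:
    min_replicas: 0                        # Allows scale-to-zero when idle
    max_replicas: 8                        # Illustrative upper bound
    ready_sessions_per_capacity_unit: 1    # Illustrative queue-density target
    reserved_capacity_for_subagent: 1      # Illustrative reserve for subagent backlog
  controller:
    scale_interval_ms: 5000                # Illustrative sampling interval
    scale_up_window: 2                     # Consecutive high-pressure samples before scale-up
    scale_down_window: 3                   # Consecutive low-pressure samples before scale-down
    cooldown_ms: 15000                     # Illustrative cooldown between scaling actions
    scale_up_busy_ratio_threshold: 0.75    # Must lie in the range 0..1
    scale_up_max_ready_age_ms: 2000        # Illustrative age threshold
    leader_election:
      type: noop                           # noop | kubernetes
    scale_target:
      type: noop                           # noop | kubernetes | docker_compose
      allow_scale_down: true
```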


Directory Reference

Path and Layer Boundaries

Object | Role | Active execution location
workspace_dir | workspace source / managed root | Not always
runtime_state_dir | engine-private state root | No
runtime_dir | initialization source for new workspaces | No
Active Workspace Copy | active execution copy of a workspace | Yes

Read them like this:

  • workspace_dir answers “which workspaces exist”
  • runtime_dir answers “how a new workspace is initialized”
  • sandbox answers “where the current run executes”
  • runtime_state_dir answers “where engine-private state lives”

workspace_dir

Each direct subdirectory is treated as one project workspace. Only first-level subdirectories are scanned. This directory should hold workspace source roots only and should not be relied on as an engine-internal state root.

In embedded mode, active execution often happens directly against the local workspace. In self_hosted / e2b, the active execution copy is usually materialized into the owner sandbox, so workspace_dir behaves more like a managed source root than the final execution location.

runtime_state_dir

Stores runtime-private state, including:

  • SQLite shadow history.db
  • Archive export output
  • Legacy object-store materialization state

The default is dirname(workspace_dir)/.openharness, which keeps the live workspace root separate from internal runtime state. If you want this state to survive container restarts, mount it to durable writable storage explicitly.
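One way to make this state durable under Docker Compose is a named volume mounted at the state root. The fragment below is a sketch that assumes workspace_dir is /srv/openharness/workspaces, so the default state root resolves to /srv/openharness/.openharness. The service name oah-api and the volume name oah-state are hypothetical.

```yaml
services:
  oah-api:                                        # Hypothetical service name
    volumes:
      - oah-state:/srv/openharness/.openharness   # Durable home for runtime_state_dir

volumes:
  oah-state: {}                                   # Named volume that survives container restarts
```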

runtime_dir

Stores workspace runtimes. When creating a new workspace via POST /workspaces, a runtime from this directory is used as the initialization source. Runtimes are never loaded as active workspaces at runtime.

runtime_dir does not participate in run execution and never holds the active execution copy of a workspace. It only answers “how do we initialize a workspace?”, not “where is it currently running?”

model_dir

Recursively scans *.yaml files in the directory. File format matches workspace .openharness/models/*.yaml. Loaded models appear as platform/<name> in the model catalog.

Example (model_dir/openai-default.yaml):

openai-default:
  provider: openai
  key: ${env.OPENAI_API_KEY}
  name: gpt-5

tool_dir

Platform-level tool source directory. Its structure should match workspace .openharness/tools (settings.yaml + servers/*). In the current implementation it is primarily used as the import source for runtime imports.tools, and as a shared source in single-workspace mode.

tip When OAH runs inside Docker, HTTP MCP servers configured with http://127.0.0.1:... or http://localhost:... are rewritten at runtime to a host-reachable alias. The default alias is host.docker.internal. Override it with OAH_DOCKER_HOST_ALIAS if needed.

skill_dir

Platform-level skill source directory. In the current implementation it is primarily used as the import source for runtime imports.skills, and as a shared source in single-workspace mode.

warning Contents of tool_dir and skill_dir are primarily imported during runtime initialization. At runtime, workspaces use only capabilities declared in their own .openharness directory, plus any content already copied into that workspace during initialization.


Runtime Modes

Mode | Command | Description
API + embedded worker | pnpm exec tsx --tsconfig ./apps/server/tsconfig.json ./apps/server/src/index.ts -- --config server.yaml | Smallest deployment. One oah-api process directly hosts the embedded worker.
API only | pnpm exec tsx --tsconfig ./apps/server/tsconfig.json ./apps/server/src/index.ts -- --config server.yaml --api-only | Starts oah-api only. Typically paired with oah-controller and oah-sandbox.
Standalone worker | pnpm exec tsx --tsconfig ./apps/server/tsconfig.json ./apps/server/src/worker.ts -- --config server.yaml | Standalone worker, typically running inside a self-hosted or E2B sandbox.

Environment Variable Overrides

In addition to YAML config, the server also reads a set of runtime environment variables for recovery, worker-pool behavior, and diagnostics.

Stale Run Recovery

Variable | Default | Description
OAH_STALE_RUN_RECOVERY_STRATEGY | requeue_running with Redis, otherwise fail | Stale-run recovery strategy. Supports fail, requeue_running, and requeue_all.
OAH_STALE_RUN_RECOVERY_MAX_ATTEMPTS | 1 | Maximum number of automatic requeue attempts per run.
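If the server runs under Docker Compose, these variables could be set on the service as sketched here. The service name is hypothetical and the chosen values are only an example of a more aggressive recovery policy.

```yaml
services:
  oah-api:                                        # Hypothetical service name
    environment:
      OAH_STALE_RUN_RECOVERY_STRATEGY: requeue_all
      OAH_STALE_RUN_RECOVERY_MAX_ATTEMPTS: "2"    # Quoted so Compose passes it as a string
```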

Embedded Worker Pool

Variable | Default | Description
OAH_EMBEDDED_WORKER_MIN | 2 with Redis, otherwise 1 | Minimum embedded worker instances; standalone worker processes always keep at least 1.
OAH_EMBEDDED_WORKER_MAX | Same as OAH_EMBEDDED_WORKER_MIN | Maximum embedded worker instances.
OAH_EMBEDDED_WORKER_SCALE_INTERVAL_MS | 5000 | Embedded worker pool rebalance interval.
OAH_EMBEDDED_WORKER_READY_SESSIONS_PER_CAPACITY_UNIT | 1 | Target ready-session density per observed execution-capacity unit.
OAH_EMBEDDED_WORKER_SCALE_UP_COOLDOWN_MS | 1000 | Scale-up cooldown.
OAH_EMBEDDED_WORKER_SCALE_DOWN_COOLDOWN_MS | 15000 | Scale-down cooldown.
OAH_EMBEDDED_WORKER_SCALE_UP_SAMPLE_SIZE | 2 | Consecutive high-pressure samples required before scaling up.
OAH_EMBEDDED_WORKER_SCALE_DOWN_SAMPLE_SIZE | 3 | Consecutive low-pressure samples required before scaling down.
OAH_EMBEDDED_WORKER_SCALE_UP_BUSY_RATIO_PERCENT | 75 | Busy-ratio threshold that may unlock extra scale-up when combined with queue age.
OAH_EMBEDDED_WORKER_SCALE_UP_MAX_READY_AGE_MS | 2000 | Allows age-driven scale-up once the oldest schedulable session waits longer than this.
OAH_EMBEDDED_WORKER_RESERVED_CAPACITY_FOR_SUBAGENT | 1 | Extra spare capacity reserved when subagent backlog appears; may be set to 0.
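As an illustration of how these variables shape the pool, the Compose fragment below widens the gap between minimum and maximum so the pool has room to scale, and slows scale-down. The service name and all values are assumptions chosen for the example.

```yaml
services:
  oah-api:                                                    # Hypothetical service name
    environment:
      OAH_EMBEDDED_WORKER_MIN: "2"                            # Always-on workers
      OAH_EMBEDDED_WORKER_MAX: "6"                            # Without this, max equals min and the pool cannot grow
      OAH_EMBEDDED_WORKER_SCALE_DOWN_COOLDOWN_MS: "30000"     # Scale down more cautiously than the default
      OAH_EMBEDDED_WORKER_RESERVED_CAPACITY_FOR_SUBAGENT: "0" # Disable the subagent reserve
```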

Other Runtime Parameters

Variable | Default | Description
OAH_HISTORY_EVENT_RETENTION_DAYS | 7 | Retention window for historical events in PostgreSQL mode.
OAH_STORAGE_ADMIN_REDIS_OVERVIEW_KEY_LIMIT | 200 | Maximum number of Redis session queue / lock / event keys scanned and returned per category in storage overview, capped at 10000; responses include truncated flags when the cap is reached.
OAH_RUNTIME_DEBUG | unset | Mirrors runtime debug logs to stdout when set.
OAH_DOCKER_HOST_ALIAS | host.docker.internal | Host alias used when OAH runs inside Docker and an HTTP MCP server is configured with a loopback URL such as 127.0.0.1 or localhost.

tip With Redis in API + embedded worker mode, OAH defaults to at least 2 embedded workers and scales lightly based on the gap between ready-queue pressure and available worker capacity. scale_up_window, scale_down_window, and cooldown_ms still gate each action. If subagent backlog appears, the pool first tries to restore reserved_capacity_for_subagent so parent runs are less likely to be starved by normal backlog.

tip OAH_DOCKER_HOST_ALIAS is mainly for the case where containerized OAH needs to reach an HTTP MCP server running on the host machine. The local docker-compose.local.yml already injects host.docker.internal:host-gateway, so the default works in most setups.
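For a custom Compose file that does not already include this mapping, the equivalent wiring might look like the sketch below. The service name is hypothetical; the host-gateway entry is the mapping the tip above attributes to docker-compose.local.yml.

```yaml
services:
  oah-api:                                          # Hypothetical service name
    extra_hosts:
      - "host.docker.internal:host-gateway"         # Makes the host reachable from the container
    environment:
      OAH_DOCKER_HOST_ALIAS: host.docker.internal   # Matches the default; set only to override
```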


Schema

JSON Schema: schemas/server-config.schema.json