# Server Configuration

Configuration format: YAML. Default filename: `server.yaml`.

## Minimal Configuration
```yaml
server:
  host: 0.0.0.0 # Listen address
  port: 8787    # Listen port

storage: {}
# postgres_url: ${env.DATABASE_URL} # Optional; falls back to local SQLite shadow storage when omitted
# redis_url: ${env.REDIS_URL}       # Optional; runs execute inline in the API process when omitted

sandbox:
  provider: embedded # embedded | self_hosted | e2b
  # fleet:
  #   min_count: 1
  #   max_count: 32
  #   warm_empty_count: 1
  #   resource_cpu_pressure_threshold: 0.8
  #   resource_memory_pressure_threshold: 0.8
  #   max_workspaces_per_sandbox: 32
  #   ownerless_pool: shared # shared | dedicated
  # self_hosted:
  #   base_url: http://oah-sandbox:8787/internal/v1
  # e2b:
  #   base_url: https://api.e2b.dev
  #   api_key: ${env.E2B_API_KEY}

paths:
  workspace_dir: /srv/openharness/workspaces       # Project workspace root
  runtime_state_dir: /srv/openharness/.openharness # Runtime-private state root
  runtime_dir: /srv/openharness/runtimes           # Workspace runtime directory
  model_dir: /srv/openharness/models               # Platform model directory
  tool_dir: /srv/openharness/tools                 # Platform tool directory
  skill_dir: /srv/openharness/skills               # Platform skill directory

workers:
  embedded:
    min_count: 2                      # Minimum worker count in API + embedded worker mode
    max_count: 4                      # Upper bound for light local autoscaling
    scale_interval_ms: 1000           # Scaling check interval
    idle_ttl_ms: 30000                # How long surplus workers may stay idle before cleanup
    scale_up_window: 2                # Consecutive high-pressure samples required before scale-up
    scale_down_window: 2              # Consecutive low-pressure samples required before scale-down
    cooldown_ms: 1000                 # Minimum cooldown between scaling actions
    reserved_capacity_for_subagent: 1 # Spare capacity reserved for subagent backlog

llm:
  default_model: openai-default # Default model name (must exist in model_dir)
```
info Use `${env.VAR_NAME}` syntax to reference environment variables.

tip Neither `storage.postgres_url` nor `storage.redis_url` is mandatory. Omitting PostgreSQL falls back to local SQLite shadow persistence; omitting Redis keeps run execution inline in the current API process.
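For instance, a sketch of pulling credentials from the environment at load time (the variable names `E2B_API_KEY` and `DATABASE_URL` are conventional examples, not required names):

```yaml
storage:
  postgres_url: ${env.DATABASE_URL} # substituted from the DATABASE_URL environment variable

sandbox:
  provider: e2b
  e2b:
    api_key: ${env.E2B_API_KEY} # keeps the secret out of server.yaml
```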
## Configuration Fields

### server

| Field | Type | Default | Description |
|---|---|---|---|
| `host` | string | `127.0.0.1` | Listen address |
| `port` | number | `8787` | Listen port |
### storage

| Field | Type | Required | Description |
|---|---|---|---|
| `postgres_url` | string | No | PostgreSQL connection string. Workspaces without `serviceName` use this database directly; once `serviceName` is set, the default database keeps only the workspace/session/run routing index while runtime truth is routed to a sibling derived database name (for example `OAH-acme`). When omitted, OAH falls back to local SQLite shadow persistence. |
| `redis_url` | string | No | Redis connection string. Used for queues, locks, rate limiting, and SSE event fanout. |
tip Without PostgreSQL, workspace/session/run persistence falls back to local SQLite shadow state. That is fine for single-node development, but it is not a shared source of truth for multi-instance deployments.
tip Without Redis, runs execute in-process on the API server (suitable for local dev). With Redis, multiple worker instances can consume the queue.
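Putting the two tips together, a sketch of the three deployment shapes (hostnames and credentials are placeholders):

```yaml
# 1. Single-node development: no external services.
#    SQLite shadow persistence + inline run execution.
storage: {}

# 2. Durable persistence, still in-process execution.
# storage:
#   postgres_url: postgres://oah:secret@db:5432/oah

# 3. Multi-instance: PostgreSQL as shared truth, Redis for
#    queues, locks, rate limiting, and SSE fanout.
# storage:
#   postgres_url: postgres://oah:secret@db:5432/oah
#   redis_url: redis://redis:6379/0
```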
### object_storage

| Field | Type | Description |
|---|---|---|
| `provider` | string | Currently only s3-compatible object storage is supported |
| `bucket` | string | Target bucket |
| `region` | string | Object storage region |
| `endpoint` | string | Optional custom S3/OSS/MinIO endpoint |
| `access_key` | string | Optional access credential |
| `secret_key` | string | Optional access credential |
| `session_token` | string | Optional temporary credential |
| `force_path_style` | boolean | Whether to force path-style URLs |
| `workspace_backing_store.enabled` | boolean | Enables managed workspace object-storage backing. Active workspace writes still flush only on idle / drain / delete |
| `workspace_backing_store.key_prefix` | string | Object-storage key prefix used for workspace backing |
| `mirrors.paths` | string[] | Read-only prefixes mirrored locally. Supports `runtime` / `model` / `tool` / `skill` |
| `mirrors.sync_on_boot` | boolean | Whether mirrored prefixes should be pulled from object storage on startup |
| `mirrors.sync_on_change` | boolean | Whether mirrored read-only prefixes are polled for changes. This does not live-sync active workspace writes |
| `mirrors.poll_interval_ms` | number | Mirror poll interval (ms) |
| `mirrors.key_prefixes.*` | object | Object-storage key prefix mapping for each read-only mirrored path |
| `managed_paths` / `key_prefixes.*` / `sync_on_*` | legacy | Backward-compatible legacy fields; prefer `workspace_backing_store` and `mirrors` for new configs. Loading them emits a deprecation warning |
tip `runtime` / `model` / `tool` / `skill` in `mirrors.paths` are still mirrored through `ObjectStorageMirrorController` on boot and on change polling.

tip `workspace_backing_store` only controls managed workspace `externalRef` / backing-store semantics. Active workspace writes do not flush on every change; they flush through the workspace materialization idle / drain lifecycle.
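A sketch assembling the fields above into one block, for example against a local MinIO (the `provider` value `s3`, the bucket name, and the key prefixes are placeholders; the JSON Schema is the authoritative shape):

```yaml
object_storage:
  provider: s3                 # assumed value for "s3-compatible"
  bucket: oah-artifacts        # placeholder bucket name
  region: us-east-1
  endpoint: http://minio:9000  # optional custom endpoint, e.g. MinIO
  access_key: ${env.S3_ACCESS_KEY}
  secret_key: ${env.S3_SECRET_KEY}
  force_path_style: true       # MinIO typically requires path-style URLs
  workspace_backing_store:
    enabled: true              # flushes only on idle / drain / delete
    key_prefix: workspaces/
  mirrors:
    paths: [runtime, model, tool, skill]
    sync_on_boot: true
    sync_on_change: true
    poll_interval_ms: 30000
    key_prefixes:
      model: models/           # placeholder mapping for the model mirror
```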
### sandbox

| Field | Type | Description |
|---|---|---|
| `provider` | string | Sandbox provider. Supports `embedded`, `self_hosted`, and `e2b`. Defaults to `embedded`. `embedded` means the worker is hosted inside `oah-api`; `self_hosted` / `e2b` mean a standalone worker runs inside a real sandbox. |
| `fleet.min_count` | number | Minimum sandbox count the controller should maintain for self-hosted / e2b providers. Defaults to `1` for remote providers and `0` for embedded. |
| `fleet.max_count` | number | Maximum sandbox count the controller may target. Defaults to `64`. |
| `fleet.warm_empty_count` | number | Extra empty sandboxes to keep warm so new workspaces can bind quickly at any time. Defaults to `1` for remote providers and `0` for embedded. |
| `fleet.resource_cpu_pressure_threshold` | number | Sandbox CPU pressure threshold. Ownerless workspaces prefer an empty sandbox when the CPU load ratio exceeds this value. Defaults to `0.8`. |
| `fleet.resource_memory_pressure_threshold` | number | Sandbox memory pressure threshold. Ownerless workspaces prefer an empty sandbox when the memory-used ratio exceeds this value. Defaults to `0.8`. |
| `fleet.max_workspaces_per_sandbox` | number | Capacity limit for how many workspaces a single real sandbox should carry. Defaults to `32`. |
| `fleet.ownerless_pool` | string | How workspaces without `ownerId` are grouped into sandboxes. `shared` uses a shared pool; `dedicated` gives each workspace its own sandbox. |
| `self_hosted.base_url` | string | Required when `provider=self_hosted`. Base `/internal/v1` URL exposed by the sandbox-resident standalone worker. |
| `self_hosted.headers` | object | Optional static headers attached to remote self-hosted sandbox requests. |
| `e2b.base_url` | string | Optional when `provider=e2b`. Overrides the native E2B API base URL; legacy `/internal/v1`-style URLs are normalized automatically. |
| `e2b.api_key` | string | Optional. When set, OAH sends it as `Authorization: Bearer <key>` on e2b requests. |
| `e2b.domain` | string | Optional. Overrides the E2B sandbox domain. |
| `e2b.template` | string | Optional. Selects the E2B template used when creating sandboxes. |
| `e2b.timeout_ms` | number | Optional. Timeout for sandbox create / resolve operations. |
| `e2b.request_timeout_ms` | number | Optional. Timeout for individual E2B HTTP requests. |
| `e2b.headers` | object | Optional static headers attached to e2b requests. |
tip OAH keeps the external `/sandboxes` API stable. Switching `sandbox.provider` changes only the server-side sandbox backend wiring; the Web app, OpenAPI clients, and runtime callers do not need to change their request shape.

tip The `/sandboxes` surface, the `/workspace` root, and sandbox-scoped file / command semantics are intentionally kept this way to stay compatible with E2B. Treat them as a deliberate contract, not as a temporary legacy shim that should default back to `/workspaces`. The `/workspaces` API itself remains in place for workspace metadata, catalog, and lifecycle concerns.

tip `self_hosted` and `e2b` share the same execution semantics: `oah-api` routes workspaces into a real sandbox, while the standalone worker inside that sandbox owns the live workspace copy, local file state, and command execution context.

tip The controller now treats sandbox fleet demand as a first-class signal: the same `ownerId` prefers the same real sandbox, while ownerless workspaces use a shared pool by default. Ownerless workspaces first reuse existing sandboxes whose CPU and memory are both below threshold; when either crosses its threshold, placement falls back to the empty sandboxes reserved by `warm_empty_count`.

tip Starting with the current version, `createSession` asynchronously prewarms the target workspace after the session is created. With a remote sandbox provider, this eagerly binds the workspace to a sandbox; with workspace materialization enabled, it also prepares the active workspace copy ahead of the first user message. Combined with the remote-provider default `fleet.warm_empty_count = 1`, this removes most first-message cold-start latency, although very large first-time materializations can still dominate.

tip `sandbox` is a host-layer concept, not a project-layer concept. One sandbox may carry multiple active workspaces. It answers "where does the worker run?", while a workspace answers "which project and capability set is being executed?"
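For example, a hedged e2b configuration using the fields above (the template name and fleet sizes are placeholders, not recommended values):

```yaml
sandbox:
  provider: e2b
  fleet:
    min_count: 1          # remote-provider default
    max_count: 16
    warm_empty_count: 1   # keep one empty sandbox warm for fast first binds
    ownerless_pool: shared
  e2b:
    api_key: ${env.E2B_API_KEY}
    template: oah-worker  # placeholder E2B template name
    timeout_ms: 60000     # sandbox create / resolve timeout
```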
### paths

| Field | Type | Description |
|---|---|---|
| `workspace_dir` | string | Project workspace root directory |
| `runtime_state_dir` | string | Runtime-private state root for SQLite shadow data, archive exports, and legacy materialization state. Defaults to `dirname(workspace_dir)/.openharness` |
| `runtime_dir` | string | Workspace runtime directory |
| `model_dir` | string | Platform model definition directory |
| `tool_dir` | string | Platform tool source directory, primarily used for runtime imports and shared single-workspace sources |
| `skill_dir` | string | Platform skill source directory, primarily used for runtime imports and shared single-workspace sources |
### workspace

| Field | Type | Description |
|---|---|---|
| `materialization.idle_ttl_ms` | number | How long an active workspace copy may stay idle before flush / cleanup is considered. Default `900000`. |
| `materialization.maintenance_interval_ms` | number | Background maintenance interval for workspace materialization. Default `5000`. |
tip `workspace.materialization` primarily affects object-storage backing stores, remote sandboxes, and active workspace-copy lifecycle timing. It does not change the declarative workspace capability model.
### llm

| Field | Type | Description |
|---|---|---|
| `default_model` | string | Default model name. Must exist in `model_dir`. Resolved to `platform/<name>` at runtime. |
### workers

| Field | Type | Description |
|---|---|---|
| `embedded.min_count` | number | Minimum always-on worker count in API + embedded worker mode. |
| `embedded.max_count` | number | Maximum embedded worker count under queue pressure. |
| `embedded.scale_interval_ms` | number | Rebalance interval for the embedded worker pool. |
| `embedded.idle_ttl_ms` | number | How long surplus embedded workers may stay idle before cleanup. |
| `embedded.scale_up_window` | number | Consecutive high-pressure samples required before scaling up. |
| `embedded.scale_down_window` | number | Consecutive low-pressure samples required before scaling down. |
| `embedded.cooldown_ms` | number | Cooldown between embedded worker scaling actions. |
| `embedded.reserved_capacity_for_subagent` | number | Minimum spare embedded capacity reserved for subagent backlog. |
| `standalone.min_replicas` | number | Minimum sandbox replicas the controller may keep for standalone workers. Set `0` to allow scale-to-zero when idle. |
| `standalone.max_replicas` | number | Maximum sandbox replicas the controller may target for standalone workers. |
| `standalone.ready_sessions_per_capacity_unit` | number | Queue-density target used by the controller when translating observed worker capacity into sandbox replica demand. |
| `standalone.reserved_capacity_for_subagent` | number | Minimum observed execution capacity reserved for subagent backlog. |
| `standalone.slots_per_pod` | number | Legacy compatibility field. The controller no longer uses this static value to size sandbox replicas; it relies on worker-reported observed capacity instead. |
| `controller.scale_interval_ms` | number | How often the controller samples backlog / worker-registry state and recomputes desired replicas. |
| `controller.scale_up_window` | number | Consecutive high-pressure samples required before scaling up. |
| `controller.scale_down_window` | number | Consecutive low-pressure samples required before scaling down. |
| `controller.cooldown_ms` | number | Cooldown between controller scaling actions. |
| `controller.scale_up_busy_ratio_threshold` | number | Busy-ratio threshold in the range 0..1 that may trigger extra scale-up. |
| `controller.scale_up_max_ready_age_ms` | number | Allows scale-up when the oldest schedulable ready session exceeds this age. |
| `controller.leader_election.type` | string | Leader-election type for the controller. Supports `noop` and `kubernetes`. |
| `controller.leader_election.kubernetes.*` | object | Kubernetes lease settings such as namespace, lease name, API URL, token file, CA file, skip TLS verify, and identity. |
| `controller.scale_target.type` | string | Scale-target backend. Supports `noop`, `kubernetes`, and `docker_compose`. |
| `controller.scale_target.allow_scale_down` | boolean | Whether the controller may actively scale down replicas. |
| `controller.scale_target.kubernetes.*` | object | Kubernetes Deployment `/scale` target settings such as namespace, deployment, label selector, API URL, token file, CA file, and skip TLS verify. |
| `controller.scale_target.docker_compose.*` | object | Local Docker Compose scaling settings such as compose file, project name, service, command, plus optional remote executor endpoint, auth token, and timeout. |
tip The controller boundary is now explicitly sandbox-only. How many threads, slots, or processes run inside a sandbox is owned by the worker runtime itself; the controller only consumes the observed capacity those workers publish and turns it into sandbox replica and placement decisions.
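A sketch of a Kubernetes-scaled standalone-worker setup using the fields above (the numbers are placeholders; the `kubernetes.*` sub-keys are left out because their exact spellings are not listed here):

```yaml
workers:
  standalone:
    min_replicas: 0   # allow scale-to-zero when idle
    max_replicas: 8
    ready_sessions_per_capacity_unit: 1
    reserved_capacity_for_subagent: 1
  controller:
    scale_interval_ms: 5000
    scale_up_window: 2    # two high-pressure samples before scale-up
    scale_down_window: 3  # three low-pressure samples before scale-down
    cooldown_ms: 15000
    leader_election:
      type: kubernetes    # lease-based election; see leader_election.kubernetes.*
    scale_target:
      type: kubernetes    # scales a Deployment via the /scale subresource
      allow_scale_down: true
```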
## Directory Reference

### Path and Layer Boundaries

| Object | Role | Active execution location |
|---|---|---|
| `workspace_dir` | workspace source / managed root | Not always |
| `runtime_state_dir` | engine-private state root | No |
| `runtime_dir` | initialization source for new workspaces | No |
| Active Workspace Copy | active execution copy of a workspace | Yes |
Read them like this:
- `workspace_dir` answers "which workspaces exist"
- `runtime_dir` answers "how a new workspace is initialized"
- `sandbox` answers "where the current run executes"
- `runtime_state_dir` answers "where engine-private state lives"

### workspace_dir
Each direct subdirectory is treated as one project workspace. Only first-level subdirectories are scanned. This directory should hold workspace source roots only and should not be relied on as an engine-internal state root.
In embedded mode, active execution often happens directly against the local workspace. In `self_hosted` / `e2b`, the active execution copy is usually materialized into the owner sandbox, so `workspace_dir` behaves more like a managed source root than the final execution location.

### runtime_state_dir
Stores runtime-private state, including:
- SQLite shadow `history.db`
- Archive export output
- Legacy object-store materialization state
The default is `dirname(workspace_dir)/.openharness`, which keeps the live workspace root separate from internal runtime state. If you want this state to survive container restarts, mount it to durable writable storage explicitly.
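In a Docker Compose deployment, that mount might look like the following sketch (the service and volume names are illustrative, not taken from the shipped compose files):

```yaml
services:
  oah-api:
    volumes:
      - oah-workspaces:/srv/openharness/workspaces       # workspace source roots
      - oah-runtime-state:/srv/openharness/.openharness  # SQLite shadow, exports

volumes:
  oah-workspaces:
  oah-runtime-state:
```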
### runtime_dir

Stores workspace runtimes. When creating a new workspace via `POST /workspaces`, a runtime from this directory is used as the initialization source. Runtimes are never loaded as active workspaces at runtime.

`runtime_dir` does not participate in run execution and never holds the active execution copy of a workspace. It only answers "how do we initialize a workspace?", not "where is it currently running?"
### model_dir

Recursively scans `*.yaml` files in the directory. The file format matches workspace `.openharness/models/*.yaml`. Loaded models appear as `platform/<name>` in the model catalog.

Example (`model_dir/openai-default.yaml`):
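The example file itself is not reproduced on this page; as a placeholder, a hypothetical sketch is shown below — every field name here is an assumption, so consult the workspace `.openharness/models/*.yaml` format for the real schema:

```yaml
# model_dir/openai-default.yaml — hypothetical sketch; field names are assumptions
provider: openai
model: gpt-4o
api_key: ${env.OPENAI_API_KEY}
base_url: https://api.openai.com/v1
```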
### tool_dir

Platform-level tool source directory. Its structure should match workspace `.openharness/tools` (`settings.yaml` + `servers/*`). In the current implementation it is primarily used as the import source for runtime `imports.tools`, and as a shared source in single-workspace mode.
tip When OAH runs inside Docker, HTTP MCP servers configured with `http://127.0.0.1:...` or `http://localhost:...` are rewritten at runtime to a host-reachable alias. The default alias is `host.docker.internal`. Override it with `OAH_DOCKER_HOST_ALIAS` if needed.
### skill_dir

Platform-level skill source directory. In the current implementation it is primarily used as the import source for runtime `imports.skills`, and as a shared source in single-workspace mode.
warning Contents of `tool_dir` and `skill_dir` are primarily imported during runtime initialization. At runtime, workspaces use only capabilities declared in their own `.openharness` directory, plus any content already copied into that workspace during initialization.
## Runtime Modes

| Mode | Command | Description |
|---|---|---|
| API + embedded worker | `pnpm exec tsx --tsconfig ./apps/server/tsconfig.json ./apps/server/src/index.ts -- --config server.yaml` | Smallest deployment. One `oah-api` process directly hosts the embedded worker. |
| API only | `pnpm exec tsx --tsconfig ./apps/server/tsconfig.json ./apps/server/src/index.ts -- --config server.yaml --api-only` | Starts `oah-api` only. Typically paired with `oah-controller` and `oah-sandbox`. |
| Standalone worker | `pnpm exec tsx --tsconfig ./apps/server/tsconfig.json ./apps/server/src/worker.ts -- --config server.yaml` | Standalone worker, typically running inside a self-hosted or E2B sandbox. |
## Environment Variable Overrides
In addition to YAML config, the server also reads a set of runtime environment variables for recovery, worker-pool behavior, and diagnostics.
### Stale Run Recovery

| Variable | Default | Description |
|---|---|---|
| `OAH_STALE_RUN_RECOVERY_STRATEGY` | `requeue_running` with Redis, otherwise `fail` | Stale-run recovery strategy. Supports `fail`, `requeue_running`, and `requeue_all`. |
| `OAH_STALE_RUN_RECOVERY_MAX_ATTEMPTS` | `1` | Maximum number of automatic requeue attempts per run. |
### Embedded Worker Pool

| Variable | Default | Description |
|---|---|---|
| `OAH_EMBEDDED_WORKER_MIN` | `2` with Redis, otherwise `1` | Minimum embedded worker instances; standalone worker processes always keep at least 1. |
| `OAH_EMBEDDED_WORKER_MAX` | Same as `OAH_EMBEDDED_WORKER_MIN` | Maximum embedded worker instances. |
| `OAH_EMBEDDED_WORKER_SCALE_INTERVAL_MS` | `5000` | Embedded worker pool rebalance interval. |
| `OAH_EMBEDDED_WORKER_READY_SESSIONS_PER_CAPACITY_UNIT` | `1` | Target ready-session density per observed execution-capacity unit. |
| `OAH_EMBEDDED_WORKER_SCALE_UP_COOLDOWN_MS` | `1000` | Scale-up cooldown. |
| `OAH_EMBEDDED_WORKER_SCALE_DOWN_COOLDOWN_MS` | `15000` | Scale-down cooldown. |
| `OAH_EMBEDDED_WORKER_SCALE_UP_SAMPLE_SIZE` | `2` | Consecutive high-pressure samples required before scaling up. |
| `OAH_EMBEDDED_WORKER_SCALE_DOWN_SAMPLE_SIZE` | `3` | Consecutive low-pressure samples required before scaling down. |
| `OAH_EMBEDDED_WORKER_SCALE_UP_BUSY_RATIO_PERCENT` | `75` | Busy-ratio threshold that may unlock extra scale-up when combined with queue age. |
| `OAH_EMBEDDED_WORKER_SCALE_UP_MAX_READY_AGE_MS` | `2000` | Allows age-driven scale-up once the oldest schedulable session waits longer than this. |
| `OAH_EMBEDDED_WORKER_RESERVED_CAPACITY_FOR_SUBAGENT` | `1` | Extra spare capacity reserved when subagent backlog appears; may be set to `0`. |
### Other Runtime Parameters

| Variable | Default | Description |
|---|---|---|
| `OAH_HISTORY_EVENT_RETENTION_DAYS` | `7` | Retention window for historical events in PostgreSQL mode. |
| `OAH_STORAGE_ADMIN_REDIS_OVERVIEW_KEY_LIMIT` | `200` | Maximum number of Redis session queue / lock / event keys scanned and returned per category in the storage overview, capped at 10000; responses include truncated flags when the cap is reached. |
| `OAH_RUNTIME_DEBUG` | unset | Mirrors runtime debug logs to stdout when set. |
| `OAH_DOCKER_HOST_ALIAS` | `host.docker.internal` | Host alias used when OAH runs inside Docker and an HTTP MCP server is configured with a loopback URL such as `127.0.0.1` or `localhost`. |
tip With Redis plus API + embedded worker, OAH defaults to at least `2` embedded workers and performs lightweight scaling based on the gap between ready-queue pressure and available worker capacity. `scale_up_window`, `scale_down_window`, and `cooldown_ms` still gate each action. If subagent backlog appears, the pool first tries to restore `reserved_capacity_for_subagent` so parent runs are less likely to be starved by normal backlog.

tip `OAH_DOCKER_HOST_ALIAS` is mainly for the case where containerized OAH needs to reach an HTTP MCP server running on the host machine. The local `docker-compose.local.yml` already injects `host.docker.internal:host-gateway`, so the default works in most setups.
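Because these overrides are plain environment variables, in Docker Compose they can sit next to the config mount; a sketch (the service name `oah-api` and the chosen values are illustrative):

```yaml
services:
  oah-api:
    environment:
      OAH_STALE_RUN_RECOVERY_STRATEGY: requeue_running
      OAH_EMBEDDED_WORKER_MIN: "2"
      OAH_EMBEDDED_WORKER_MAX: "4"
      OAH_RUNTIME_DEBUG: "1"  # mirror runtime debug logs to stdout
    extra_hosts:
      - host.docker.internal:host-gateway  # lets the default OAH_DOCKER_HOST_ALIAS resolve
```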
## Schema

JSON Schema: `schemas/server-config.schema.json`