## Two services, two ports
| Port | Service | What lives there |
|---|---|---|
| 8080 | Gateway (Go) | `/v1/chat/completions`, `/v1/route`, `/v1/models`, `/health` |
| 8000 | Management (Py) | Distillation jobs, trace search, clustering, evaluations, datasets |
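A client typically keeps one base URL per service. A minimal sketch, assuming a local deployment (the `localhost` hostnames are an assumption; only the ports come from the table above):

```python
# Base URLs for the two services; localhost is assumed for a local
# deployment, the ports come from the table above.
GATEWAY_URL = "http://localhost:8080"     # Go gateway: completions, routing
MANAGEMENT_URL = "http://localhost:8000"  # Python management: jobs, traces

def endpoint(base: str, path: str) -> str:
    """Join a service base URL and an endpoint path."""
    return base.rstrip("/") + path

print(endpoint(GATEWAY_URL, "/v1/chat/completions"))
# → http://localhost:8080/v1/chat/completions
```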
## Authentication
Out of the box, neither port requires auth. The engine expects you to put it behind your own proxy (Traefik, Caddy, a VPN, a service mesh) before exposing it to the internet. When you configure authentication, both services accept a bearer token in the `Authorization` header.
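A sketch of attaching that token to a request, using Python's standard library (the token value and the host are placeholders, not part of this document):

```python
import urllib.request

# Sketch: attaching a bearer token once auth is configured.
# "YOUR_TOKEN" and the localhost host are placeholders.
token = "YOUR_TOKEN"
req = urllib.request.Request(
    "http://localhost:8080/v1/models",
    headers={"Authorization": f"Bearer {token}"},
)
print(req.get_header("Authorization"))  # → Bearer YOUR_TOKEN
```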
## Request format

Every endpoint is JSON-in, JSON-out.
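For example, a minimal chat-completions request body looks like this (the `"auto"` model name is illustrative; the message schema follows the OpenAI-compatible format):

```python
import json

# A minimal JSON body for /v1/chat/completions. The "auto" model name is
# illustrative; the messages array follows the OpenAI-compatible schema.
payload = {
    "model": "auto",
    "messages": [{"role": "user", "content": "Hello"}],
}
body = json.dumps(payload).encode("utf-8")
# JSON-in, JSON-out, so both content-negotiation headers are JSON:
headers = {"Content-Type": "application/json", "Accept": "application/json"}
```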
## OpenTracy-specific response headers

The gateway adds a few headers to every `/v1/chat/completions` response:
| Header | Meaning |
|---|---|
X-OpenTracy-Selected-Model | Concrete model the request was answered by. |
X-OpenTracy-Cluster-ID | Semantic cluster the prompt landed in (0–99). |
X-OpenTracy-Expected-Error | The router’s predicted error rate for this cluster/model. |
X-OpenTracy-Routing-Ms | Wall time spent on the routing decision. |
X-OpenTracy-Session-Id | For multi-turn tool calls — pass back on follow-up requests. |
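A sketch of consuming that metadata on the client side; the header values below are illustrative stand-ins for a real response:

```python
# Sketch: reading the routing metadata the gateway attaches to a response.
# These values are illustrative stand-ins, not real output.
resp_headers = {
    "X-OpenTracy-Selected-Model": "some-model",
    "X-OpenTracy-Cluster-ID": "42",
    "X-OpenTracy-Expected-Error": "0.03",
    "X-OpenTracy-Routing-Ms": "1.7",
    "X-OpenTracy-Session-Id": "sess-123",
}
cluster_id = int(resp_headers["X-OpenTracy-Cluster-ID"])            # 0-99
expected_error = float(resp_headers["X-OpenTracy-Expected-Error"])
# On a follow-up request in a multi-turn tool call, echo the session id:
follow_up = {"X-OpenTracy-Session-Id": resp_headers["X-OpenTracy-Session-Id"]}
```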
## Pages
### Chat completions
`POST /v1/chat/completions` — the OpenAI-compatible entry point.
### Routing decision

`POST /v1/route` — ask the router which model it would pick, without generating.
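A sketch of such a call with Python's standard library; the host and the prompt are assumptions, and the response schema is not specified in this document:

```python
import json
import urllib.request

# Sketch of a /v1/route call: same message payload as a chat completion,
# but nothing is generated. Host and prompt are assumptions.
body = json.dumps(
    {"messages": [{"role": "user", "content": "Summarize this contract."}]}
).encode("utf-8")
req = urllib.request.Request(
    "http://localhost:8080/v1/route",
    data=body,
    headers={"Content-Type": "application/json"},
    method="POST",
)
# urllib.request.urlopen(req) would return the routing decision as JSON;
# the exact response schema is not documented here.
```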
### Models & health

`GET /v1/models` and `GET /health` — discover what's configured.
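A minimal liveness probe built on `/health` might look like this (the host is an assumption for a local deployment):

```python
import urllib.request

# Minimal liveness probe against the gateway's /health endpoint.
# The localhost base URL is an assumption for a local deployment.
def gateway_healthy(base: str = "http://localhost:8080") -> bool:
    try:
        with urllib.request.urlopen(base + "/health", timeout=2) as resp:
            return resp.status == 200
    except OSError:
        # Connection refused, DNS failure, timeout, etc.
        return False
```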
### Distillation

Create jobs, poll status, and fetch artifacts over HTTP.
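The create/poll/fetch pattern reduces to a generic poll loop. A sketch; the terminal status names here are hypothetical and should be replaced with the management API's actual values:

```python
import time

# Generic poll loop for a distillation job. The "succeeded"/"failed"
# status names are hypothetical stand-ins for the management API's
# actual terminal states.
def wait_for_job(fetch_status, poll_interval=1.0, max_polls=600):
    """Call fetch_status() until it reports a terminal state."""
    for _ in range(max_polls):
        status = fetch_status()
        if status in ("succeeded", "failed"):
            return status
        time.sleep(poll_interval)
    raise TimeoutError("distillation job did not finish in time")
```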
### Traces
Search captured traces by model, time range, cost, or metadata.
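Filters like these usually travel as query parameters. A sketch of building such a query; the endpoint path and parameter names are hypothetical, since the document only lists the filter dimensions:

```python
from urllib.parse import urlencode

# Sketch: building a trace-search query string. The /v1/traces path and
# the parameter names are hypothetical; only the filter dimensions
# (model, time range, cost, metadata) come from the document.
params = urlencode({"model": "some-model", "max_cost_usd": "0.01"})
url = "http://localhost:8000/v1/traces?" + params
```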

