Most of OpenTracy works from just pip install opentracy. You need the full self-hosted stack if you want:
  • Trace storage and analytics (ClickHouse-backed).
  • The UI at localhost:3000 for browsing traces, datasets, experiments.
  • Distillation (Distiller client — requires the REST API + a GPU).
  • Engine-side routing aliases (the model="smart" → distilled-student swap).

Prerequisites

  • Docker ≥ 24 and Docker Compose v2
  • An NVIDIA GPU + nvidia-container-toolkit if you plan to run distillation
  • ~10 GB free disk for ClickHouse + weights + training artifacts
On Linux with an NVIDIA GPU:
sudo apt-get install -y nvidia-container-toolkit
sudo systemctl restart docker
docker run --rm --gpus all nvidia/cuda:12.6.3-base-ubuntu22.04 nvidia-smi
If that last command prints your GPU, you’re ready.

Start the stack

git clone https://github.com/OpenTracy/opentracy.git
cd opentracy
make start-full
After a minute or two:
Service             URL                     Purpose
Gateway engine      http://localhost:8080   OpenAI-compatible, routes requests.
REST / Python API   http://localhost:8000   Datasets, distillation, evaluations.
UI                  http://localhost:3000   Browser dashboard.
ClickHouse HTTP     http://localhost:8123   Trace storage.
Health-check them all:
curl -s http://localhost:8080/health | jq .
curl -s http://localhost:8000/health | jq .
make start-full is defined in the repo’s top-level Makefile. It pulls the pinned images, bakes a docker-compose.yml with the four services above, and waits until ClickHouse is reachable before starting the engine. Run make help to see every command.
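If you script the startup (CI, provisioning), the two curl checks above can be folded into a small wait loop. A minimal stdlib-only sketch — the /health URLs are the ones documented above; the function name is illustrative:

```python
# Poll the stack's health endpoints until every one answers HTTP 200.
import time
import urllib.error
import urllib.request

def wait_for_health(urls, timeout=120.0, interval=2.0):
    """Block until each URL returns HTTP 200, or raise TimeoutError."""
    deadline = time.monotonic() + timeout
    pending = set(urls)
    while pending:
        for url in list(pending):
            try:
                with urllib.request.urlopen(url, timeout=5) as resp:
                    if resp.status == 200:
                        pending.discard(url)
            except (urllib.error.URLError, OSError):
                pass  # service not up yet; retry on the next pass
        if pending and time.monotonic() > deadline:
            raise TimeoutError(f"services not healthy: {sorted(pending)}")
        if pending:
            time.sleep(interval)
```

Call it with the engine and API endpoints, e.g. `wait_for_health(["http://localhost:8080/health", "http://localhost:8000/health"])`, after `make start-full` returns.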

What each service is

  • opentracy-engine (Go) — the gateway. Receives /v1/chat/completions, authenticates to upstream providers, streams responses back, writes traces to ClickHouse. Also exposes /v1/route for auto-routing.
  • opentracy-api (Python, FastAPI) — business logic on top of traces: datasets, evaluations, distillation jobs. Uses ClickHouse for reads, launches training subprocesses.
  • clickhouse — column-store for traces and analytics.
  • ui (Next.js) — the dashboard.
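Since the engine speaks the OpenAI wire format, you can see exactly what a request to it looks like without any SDK. A sketch that only builds the /v1/chat/completions request (the JSON shape is the standard OpenAI one, which the engine is described as compatible with; the helper name is illustrative):

```python
# Build an OpenAI-style chat request aimed at the gateway engine.
import json
import urllib.request

def chat_request(base_url, model, messages):
    """Return a ready-to-send POST request for /v1/chat/completions."""
    body = json.dumps({"model": model, "messages": messages}).encode()
    return urllib.request.Request(
        f"{base_url}/v1/chat/completions",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )

req = chat_request(
    "http://localhost:8080",
    "openai/gpt-4o-mini",
    [{"role": "user", "content": "hi"}],
)
# urllib.request.urlopen(req) would send it through the gateway, which
# forwards to the provider and writes the trace to ClickHouse.
```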

Configuration knobs

Provider API keys live under ~/.opentracy/secrets.json on the host (mounted into the engine container):
{
  "openai_api_key": "sk-...",
  "anthropic_api_key": "sk-ant-...",
  "groq_api_key": "gsk_...",
  "deepseek_api_key": "sk-...",
  "huggingface_api_key": "hf_..."
}
Or via env vars passed through to the engine:
OPENAI_API_KEY=... ANTHROPIC_API_KEY=... make start-full
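If you already export the conventional provider env vars, you can generate the secrets file from them instead of writing it by hand. A sketch — the JSON key names match the example file above, the env-var names are the usual provider conventions, and the helper itself is illustrative:

```python
# Write ~/.opentracy/secrets.json from the shell environment, owner-only.
import json
import os
from pathlib import Path

ENV_TO_KEY = {
    "OPENAI_API_KEY": "openai_api_key",
    "ANTHROPIC_API_KEY": "anthropic_api_key",
    "GROQ_API_KEY": "groq_api_key",
    "DEEPSEEK_API_KEY": "deepseek_api_key",
    "HUGGINGFACE_API_KEY": "huggingface_api_key",
}

def write_secrets(path=None):
    """Collect whichever provider keys are set and write the secrets file."""
    path = Path(path or Path.home() / ".opentracy" / "secrets.json")
    secrets = {key: os.environ[var] for var, key in ENV_TO_KEY.items()
               if var in os.environ}
    path.parent.mkdir(parents=True, exist_ok=True)
    path.write_text(json.dumps(secrets, indent=2) + "\n")
    path.chmod(0o600)  # API keys are secrets: owner read/write only
    return path
```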
Other env vars worth knowing (all read by the opentracy-api container):
Variable                  Default      Purpose
OPENTRACY_CH_HOST         clickhouse   ClickHouse host inside the compose network.
OPENTRACY_CH_DATABASE     opentracy    ClickHouse database name.
OPENTRACY_CH_ENABLED      false        Enable ClickHouse writes. make start-full sets this.
OPENTRACY_DATA_DIR        /app/data    Training artifacts root (mapped to a volume).
OPENTRACY_TRACE_REDACT    false        Strip PII patterns from trace content.
OPENTRACY_TRACE_CONTENT   true         Set to false to drop prompt/response text.
AUTO_TRAINER              false        Enable the autonomous operator loop.
Legacy LUNAR_* env vars still work and fall back to the new OPENTRACY_* names with a one-time DeprecationWarning. Migrate when you touch the env file — no rush.
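The fallback behavior is simple enough to sketch. This is not the actual opentracy-api internals, just an illustration of the documented contract — new name wins, legacy name works with a one-time DeprecationWarning:

```python
# Resolve a setting: prefer OPENTRACY_*, fall back to legacy LUNAR_*.
import os
import warnings

_warned = set()

def get_setting(name, default=None):
    """name is the suffix, e.g. "CH_HOST" for OPENTRACY_CH_HOST."""
    new, old = f"OPENTRACY_{name}", f"LUNAR_{name}"
    if new in os.environ:
        return os.environ[new]
    if old in os.environ:
        if old not in _warned:  # warn once per variable, not per read
            _warned.add(old)
            warnings.warn(f"{old} is deprecated; use {new}",
                          DeprecationWarning)
        return os.environ[old]
    return default
```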

Point your app at the local stack

import os
os.environ["OPENTRACY_ENGINE_URL"] = "http://localhost:8080"

import opentracy as ot
resp = ot.completion(model="openai/gpt-4o-mini", messages=[{"role": "user", "content": "hi"}])
# Request flowed through the engine, trace persisted in ClickHouse.
Or with the OpenAI SDK — see the drop-in-openai guide.

Running distillation

With the stack up and at least one GPU visible:
from opentracy import Distiller

d = Distiller(base_url="http://localhost:8000")
print([t["id"] for t in d.teacher_models()][:5])
print([s["id"] for s in d.student_models()][:5])

job = d.create(
    name="demo",
    dataset_id="ds_my_dataset",
    teacher_model="openai/gpt-4o-mini",
    student_model="llama-3.2-1b",
    num_prompts=50,
    training_steps=30,
)
job = d.wait(job["id"])
print(d.artifacts(job["id"]))
If the training step fails with “HuggingFace access required”, the student model is gated — add your HuggingFace token in UI → Settings → Integrations, then retry. The engine’s trainer subprocess picks up the token from secrets storage automatically.

Updating

git pull
make start-full    # rebuilds and restarts changed services
Migration: database schema changes are handled automatically on startup. Routing weights update whenever a newer version appears in the Hub (ot.download("weights-default") refreshes them locally).

Stopping

make stop-full
Data persists in volumes (clickhouse-data, weights-data, distillation-data). To wipe everything and start fresh:
docker compose down --volumes

Production considerations

The engine has no auth out of the box. Do not expose port 8080 publicly. Put it behind an auth proxy (Traefik, Caddy with mTLS, a VPN, etc.).
ClickHouse defaults are for dev. The compose file starts ClickHouse with the default password opentracy and no TLS. For production, set a real password (CLICKHOUSE_PASSWORD env var), enable HTTPS, and take regular backups.
Distillation artifacts are on the host. The distillation-data volume ends up under Docker’s volume root; set up regular backups if the adapters matter beyond reproducing them from the dataset.
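Once you set a real ClickHouse password, ad-hoc queries against the HTTP interface on port 8123 need credentials too. A stdlib sketch that builds an authenticated query request using ClickHouse's X-ClickHouse-User / X-ClickHouse-Key headers — the table name in the example is illustrative, and the defaults match the dev setup described above:

```python
# Build an authenticated query against ClickHouse's HTTP interface.
import urllib.parse
import urllib.request

def ch_query(sql, user="default", password="opentracy",
             base="http://localhost:8123"):
    """Return a GET request carrying the SQL and ClickHouse auth headers."""
    url = f"{base}/?{urllib.parse.urlencode({'query': sql})}"
    return urllib.request.Request(
        url,
        headers={
            "X-ClickHouse-User": user,
            "X-ClickHouse-Key": password,
        },
    )

# Table name is illustrative; inspect the opentracy database for the
# actual trace table.
req = ch_query("SELECT count() FROM traces")
# urllib.request.urlopen(req).read() returns the result once the stack is up.
```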

Next

Pipeline

Understand how the self-hosted stack implements the pipeline.

Python SDK

The thin client that talks to the stack you just started.