Using opentracy directly — the Python-first path for new apps
The Python SDK (opentracy) is the native entry point. Use it if you’re
starting a new project or if you want features (auto-routing, distillation,
trace ingestion) that aren’t part of the OpenAI API shape.
One install pulls a platform-specific wheel with the Go engine binary, the
ONNX embedder, and pre-trained routing weights bundled in. No extras
needed for the core path.
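Assuming the package is published under the same name as the module (an assumption; check the project's install docs), the core path is one command:

```shell
# Assumed package name; the wheel bundles the Go engine binary,
# the ONNX embedder, and the routing weights -- no extras needed.
pip install opentracy
```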
Load the pre-trained router once; it picks the right model per prompt:
```python
import opentracy as ot

auto = ot.load_router(cost_weight=0.5)
decision = auto.route("Write a haiku about autumn")

print(decision.selected_model)  # e.g. "ministral-3b-latest"
print(decision.cluster_id)      # e.g. 87
print(decision.expected_error)  # e.g. 0.212
print(decision.all_scores)      # full score dict
```
Combined with ot.completion, the router becomes a cost-optimizing client:
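A minimal sketch of that combination, assuming ot.completion accepts an OpenAI-style messages list and a model keyword (inferred from the surrounding examples, not a confirmed signature):

```python
import opentracy as ot

auto = ot.load_router(cost_weight=0.5)

prompt = "Write a haiku about autumn"
decision = auto.route(prompt)  # pick the cheapest acceptable model

resp = ot.completion(
    model=decision.selected_model,  # e.g. "ministral-3b-latest"
    messages=[{"role": "user", "content": prompt}],
)
```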
acompletion shares its request-preparation path with the sync version,
so force_engine, force_direct, fallbacks, and engine-prefix handling
all behave identically.
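Because the async client shares that preparation path, the same kwargs carry over unchanged; a hedged sketch, assuming acompletion mirrors completion's signature:

```python
import asyncio
import opentracy as ot

async def main():
    # Identical kwargs to the sync call, including per-call overrides.
    resp = await ot.acompletion(
        model="mistral/ministral-3b-latest",  # hypothetical prefixed model id
        messages=[{"role": "user", "content": "Write a haiku about autumn"}],
        force_direct=True,
    )

asyncio.run(main())
```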
If you have existing logs from another LLM provider and want to use them
for dataset building or distillation in OpenTracy, you can import them
directly:
```python
from opentracy import add_trace, add_traces, import_traces

# Single trace
add_trace({
    "prompt": "Classify: ...",
    "response": "billing",
    "model": "openai/gpt-4o",
    "total_cost_usd": 0.00025,
    "latency_ms": 340,
    "metadata": {"source": "legacy-log-export"},
})

# Batch
add_traces([{...}, {...}, {...}])

# From a JSONL file
import_traces("path/to/exported-traces.jsonl")
```
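For import_traces, each JSONL line is presumably one trace object in the same shape as the add_trace dict above (an assumption based on that example); the stdlib is enough to produce such a file:

```python
import json
import os
import tempfile

traces = [
    {"prompt": "Classify: refund request", "response": "billing",
     "model": "openai/gpt-4o", "total_cost_usd": 0.00025,
     "latency_ms": 340, "metadata": {"source": "legacy-log-export"}},
    {"prompt": "Classify: password reset", "response": "account",
     "model": "openai/gpt-4o", "total_cost_usd": 0.00019,
     "latency_ms": 290, "metadata": {"source": "legacy-log-export"}},
]

path = os.path.join(tempfile.mkdtemp(), "exported-traces.jsonl")
with open(path, "w", encoding="utf-8") as f:
    for t in traces:
        f.write(json.dumps(t) + "\n")  # one JSON object per line

# Round-trip check before handing the file to import_traces(path)
with open(path, encoding="utf-8") as f:
    loaded = [json.loads(line) for line in f]
print(len(loaded))  # 2
```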
Once OPENTRACY_ENGINE_URL points at a running engine, ot.completion(...) routes through it.
Per-call overrides:
```python
# Always engine (even if OPENTRACY_ENGINE_URL is unset):
ot.completion(..., force_engine=True)

# Always direct (even if OPENTRACY_ENGINE_URL is set):
ot.completion(..., force_direct=True)
```
Why isn’t this automatic? Because silently routing through whatever happens
to be listening on localhost:8080 is a footgun. Opt-in is explicit.
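Opting in is one environment variable; the URL below is the local default mentioned above (adjust for your deployment):

```shell
# Point the SDK at a running engine; unset the variable to go back
# to direct provider calls.
export OPENTRACY_ENGINE_URL=http://localhost:8080
```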
Five providers have dedicated classes (OpenAI, Anthropic, Google, Groq,
Mistral); the remaining seven (DeepSeek, Perplexity, Cerebras, SambaNova,
Together, Fireworks, Cohere) route through a UnifiedClient that speaks
the OpenAI-chat protocol. Bedrock is registered but raises a clear error
on construction — AWS SigV4 is not handled by UnifiedClient yet; use
ot.completion(force_engine=True) instead.
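For the UnifiedClient providers, the call shape stays the same OpenAI-style completion; only the provider prefix on the model id changes. A sketch, assuming a "provider/model" format like the "openai/gpt-4o" seen above (the DeepSeek id here is hypothetical):

```python
import opentracy as ot

# DeepSeek goes through UnifiedClient, which speaks the OpenAI-chat protocol.
resp = ot.completion(
    model="deepseek/deepseek-chat",  # hypothetical "provider/model" id
    messages=[{"role": "user", "content": "Classify: refund request"}],
)
```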
Legacy research APIs (load_router, UniRouteRouter, RouterEvaluator,
LLMJudge, …) resolve lazily via __getattr__ — they import the first
time you touch them, so the initial import opentracy stays fast.
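The lazy-resolution pattern is PEP 562's module-level __getattr__. A self-contained sketch of the mechanism (the module and attribute names here are stand-ins, not opentracy's real internals):

```python
import importlib
import sys
import types

# Stand-in for a heavy internal module (assumption: the real package
# resolves legacy research APIs from internal submodules like this).
research = types.ModuleType("lazydemo._research")
research.load_router = lambda **kwargs: "router-instance"
sys.modules["lazydemo._research"] = research

pkg = types.ModuleType("lazydemo")
sys.modules["lazydemo"] = pkg

_LAZY = {"load_router"}

def _lazy_getattr(name):
    # PEP 562: called only when normal lookup on the module fails,
    # i.e. the first time a legacy name is touched.
    if name in _LAZY:
        mod = importlib.import_module("lazydemo._research")
        value = getattr(mod, name)
        setattr(pkg, name, value)  # cache: later lookups skip __getattr__
        return value
    raise AttributeError(name)

pkg.__getattr__ = _lazy_getattr

import lazydemo
print(lazydemo.load_router())  # the submodule is only imported now
```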
Legacy code using import lunar_router as lr keeps working via a
backwards-compat shim that redirects to opentracy and emits a
DeprecationWarning. New code should use import opentracy as ot.
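A shim like that can be sketched with the same module-level __getattr__ trick: forward every attribute lookup to the new package and warn once per access. Everything below is an illustrative stand-in, not the real shim:

```python
import sys
import types
import warnings

# Stand-in for the real package so the sketch is self-contained
# (assumption: the real shim forwards attribute access to opentracy).
_real = types.ModuleType("opentracy_standin")
_real.completion = lambda **kwargs: "ok"

shim = types.ModuleType("lunar_router")

def _forward(name):
    warnings.warn(
        "import lunar_router is deprecated; use 'import opentracy as ot'",
        DeprecationWarning,
        stacklevel=2,
    )
    return getattr(_real, name)

shim.__getattr__ = _forward
sys.modules["lunar_router"] = shim

import lunar_router as lr

with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    result = lr.completion()

print(result)  # "ok" -- the old import path still works
```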