Tracy

We cut your cost per token by 62%

Smart routing sends simple prompts to cheap models, hard ones to powerful ones. One API, 13+ providers, automatic fallbacks. Your AI stack, finally optimized.

Open source. MIT Licensed. No credit card required.

Dashboard screenshots:
  • Intelligence & Observability — requests routed, cost savings, model distribution
  • Distill Evaluations — overview with activity and model leaderboard
  • Evaluations list — running, completed, and failed evaluation runs
  • Cost Analysis — baseline model comparison, latency distribution, router efficiency
  • Cost Analysis — savings trend, cost over time, baseline comparison
  • Cost Analysis — cost by model, cost by provider, most expensive requests

Works with every major provider

OpenAI
Anthropic
Google
Groq
Azure
Cohere
DeepSeek
Fireworks
Ollama
Together
Gemini

Everything you need to
ship AI at scale

Route, trace, evaluate, and distill — one platform, every provider.

13+
LLM Providers
70+
Models
0
Routing Overhead
100%
Open Source
One API, Every Model

One endpoint.
Every LLM provider.

One OpenAI-compatible API that routes to every major provider — OpenAI, Anthropic, Google, Mistral, Groq, and more. Swap providers in one line. Automatic fallbacks keep you online.

OpenAI · Anthropic · Google · Mistral · Groq · +10 more
```python
import opentracy as ot

# Call any model — one line
response = ot.completion(
    model="openai/gpt-4o-mini",
    messages=[{"role": "user", "content": "Hello!"}],
    fallbacks=["anthropic/claude-3-haiku"],
)

print(response.choices[0].message.content)
print(f"Cost: ${response._cost:.6f}")
```
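The fallback behavior in the snippet above can be sketched generically: try the primary model, then each fallback in order, and only fail if every model fails. This is a simplified illustration of the pattern, not Tracy's internals:

```python
# Simplified fallback chain: try each model in order until one succeeds.
def complete_with_fallbacks(call, models):
    last_err = None
    for model in models:
        try:
            return call(model)   # e.g. a provider API call
        except Exception as err:
            last_err = err       # remember the failure, try the next model
    raise last_err               # every model in the chain failed

# Example: the first "provider" is down, the second answers.
def flaky(model):
    if model == "openai/gpt-4o-mini":
        raise RuntimeError("provider outage")
    return f"response from {model}"

print(complete_with_fallbacks(flaky, ["openai/gpt-4o-mini", "anthropic/claude-3-haiku"]))
```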
Smart Routing — cost baseline comparison
Smart Routing

Route smarter.
Pay less per request.

Automatically send simple prompts to fast, cheap models and route complex reasoning to the most capable one — across any provider, no code changes needed.
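The idea can be sketched with a toy heuristic — short, simple prompts stay on a cheap model, long or reasoning-heavy ones escalate. The thresholds, hint words, and model IDs below are made up for illustration; Tracy's real classifier is not shown here:

```python
# Toy complexity-based router (illustrative heuristic, not Tracy's policy).
CHEAP_MODEL = "openai/gpt-4o-mini"
STRONG_MODEL = "anthropic/claude-3-opus"

REASONING_HINTS = ("prove", "derive", "step by step", "analyze", "debug")

def pick_model(prompt: str) -> str:
    """Long or reasoning-heavy prompts go to the strong model; the rest stay cheap."""
    text = prompt.lower()
    if len(prompt) > 2000 or any(hint in text for hint in REASONING_HINTS):
        return STRONG_MODEL
    return CHEAP_MODEL
```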

Cost Tracking

Know exactly where
every dollar goes.

Per-token pricing on 70+ models, broken down by model, user, or feature. Set budget alerts and hard caps — no more end-of-month surprises.
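Per-token cost math and a hard cap are simple to sketch. The prices below are placeholders, not real provider rates, and the `Budget` class is an illustration of the hard-cap idea rather than Tracy's API:

```python
# Illustrative per-token cost calculation with a hard budget cap.
PRICES_PER_MTOK = {
    "openai/gpt-4o-mini": (0.15, 0.60),        # (input, output) USD per 1M tokens
    "anthropic/claude-3-haiku": (0.25, 1.25),  # placeholder prices
}

def request_cost(model, input_tokens, output_tokens):
    inp, out = PRICES_PER_MTOK[model]
    return (input_tokens * inp + output_tokens * out) / 1_000_000

class Budget:
    """Hard cap: refuse any charge that would push spend over the limit."""
    def __init__(self, cap_usd):
        self.cap_usd = cap_usd
        self.spent_usd = 0.0

    def charge(self, cost_usd):
        if self.spent_usd + cost_usd > self.cap_usd:
            raise RuntimeError("budget cap exceeded")
        self.spent_usd += cost_usd
```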

Cost Tracking — savings trend over time
Real-time observability dashboard
Real-Time Traces

Complete visibility
into every request.

Every request logged with full input, output, cost, latency, and model metadata. AI-powered scanning detects hallucinations before your users do.
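A per-request trace record might look like the following. The field names here are assumptions chosen to mirror the fields listed above, not Tracy's actual schema:

```python
# Sketch of a per-request trace record: input, output, cost, latency, model metadata.
from dataclasses import dataclass, asdict
import json
import time

@dataclass
class Trace:
    model: str
    input_text: str
    output_text: str
    cost_usd: float
    latency_ms: float
    timestamp: float

trace = Trace(
    model="openai/gpt-4o-mini",
    input_text="Hello!",
    output_text="Hi there!",
    cost_usd=0.000002,
    latency_ms=412.0,
    timestamp=time.time(),
)
print(json.dumps(asdict(trace)))  # ship to a log pipeline as one JSON line
```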

Model Distillation

Train your own model
from production data.

Turn production traces into fine-tuning datasets automatically. Get frontier-model quality from a model you own — at a fraction of the inference cost.
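The trace-to-dataset step can be sketched as mapping each logged request/response pair to a chat-format training example, one JSON object per line. The trace dict keys below are assumptions, not Tracy's export format:

```python
# Sketch: turning logged traces into OpenAI-style chat-format JSONL records.
import json

def trace_to_example(trace: dict) -> dict:
    """One fine-tuning example per production trace."""
    return {
        "messages": [
            {"role": "user", "content": trace["input"]},
            {"role": "assistant", "content": trace["output"]},
        ]
    }

traces = [
    {"input": "Summarize: LLM routing cuts costs.",
     "output": "Routing sends each prompt to the cheapest capable model."},
]
jsonl = "\n".join(json.dumps(trace_to_example(t)) for t in traces)
print(jsonl)
```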

Model Distillation — evaluation runs
Quality Monitoring — evaluation overview
Quality Monitoring

Catch quality drops
before users do.

Continuous evaluations on production traffic detect regressions and hallucinations automatically.
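A regression check of this kind can be as simple as comparing the rolling mean of recent eval scores against a baseline. The window and tolerance below are made-up defaults for illustration, not Tracy's settings:

```python
# Illustrative regression check on a stream of eval scores.
def detect_regression(scores, baseline, tolerance=0.05, window=10):
    """True if the mean of the last `window` scores drops below baseline - tolerance."""
    recent = scores[-window:]
    mean = sum(recent) / len(recent)
    return mean < baseline - tolerance
```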

Simple, predictable pricing

Start free. Scale when you need to. No surprises.

Free

$0

For developers exploring LLM routing and observability.

  • Up to 10,000 requests/month
  • 3 distillation runs/month
  • All 13+ providers
  • Full trace logging
  • Community support
Best Value

Starter

$10/mo

For teams running LLMs in production.

  • Unlimited requests
  • Unlimited distillation
  • Advanced evaluations
  • AI quality scanning
  • Priority support
  • Team collaboration
  • Self-host option
  • Custom routing rules

Enterprise

Custom

For organizations that want a turnkey AI infrastructure with hands-on guidance.

  • Everything in Starter
  • 24/7 dedicated support
  • Full setup done for you
  • Implementation consulting
  • Dedicated onboarding
  • Custom API architecture review
  • VPC deployment
  • SSO / SAML
  • Audit logs
  • Custom SLAs
  • On-premise option
  • BYOK encryption

Join the community

Open source, open development. Build with us.

1.2k
GitHub Stars
48
Contributors
850+
Discord Members

Open source. Self-host or cloud.

Run on your own infrastructure with full control, or use our managed cloud. MIT licensed, no vendor lock-in.

Free tier available. No credit card required.