Port 8000 exposes the distillation pipeline
over REST. Use this if you’re driving training from an environment where
the Python client isn’t available: CI jobs, a TypeScript backend, or a
Rust CLI, for example.
These are the REST endpoints backing the Python Distiller client. Any call you make through the SDK can also be made over HTTP.
POST /v1/distillation
Create a new distillation job. Returns immediately with status: "pending";
training happens asynchronously on the engine host.
Request body
| Field | Type | Notes |
|---|---|---|
| tenant_id | string | Workspace key. Defaults to "default". |
| name | string | Human label. |
| description | string | Optional. |
| config.teacher_model | string | Provider-prefixed, e.g. openai/gpt-4o. |
| config.student_model | string | HF-style ID, e.g. llama-3.2-1b. |
| config.num_prompts | int | Cap on dataset rows to use. |
| config.n_samples | int | Best-of-N candidates per prompt (default 4). |
| config.training_steps | int | Fine-tune steps. |
| config.bond_beta | float | BOND preference weight (default 0.5). |
| config.bond_gamma | float | KL regularization strength (default 0.1). |
| config.export_gguf | bool | Convert trained adapter to GGUF after training. |
| config.quantization_types | array of strings | Quantization flavors, e.g. ["q4_k_m", "q8_0"]. |
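
As a rough illustration of how the fields above fit together, the sketch below builds a create-job request with Python's standard library, without sending it. Several things here are assumptions rather than documented facts: the base URL `http://localhost:8000`, the nesting of the dotted `config.*` fields under a `config` object, the absence of any auth header, and the illustrative values for `name`, `num_prompts`, and `training_steps`. The helper name `build_create_request` is hypothetical.

```python
import json
import urllib.request

# Assumption: the engine is reachable here; adjust for your deployment.
BASE_URL = "http://localhost:8000"


def build_create_request(name, teacher_model, student_model, **config_overrides):
    """Build (but do not send) a POST /v1/distillation request."""
    config = {
        "teacher_model": teacher_model,
        "student_model": student_model,
        "n_samples": 4,     # documented default
        "bond_beta": 0.5,   # documented default
        "bond_gamma": 0.1,  # documented default
    }
    config.update(config_overrides)
    # Assumption: dotted names in the table map to a nested "config" object.
    body = {"tenant_id": "default", "name": name, "config": config}
    return urllib.request.Request(
        f"{BASE_URL}/v1/distillation",
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_create_request(
    "support-bot-v1",              # illustrative job name
    teacher_model="openai/gpt-4o",
    student_model="llama-3.2-1b",
    num_prompts=500,               # illustrative value
    training_steps=200,            # illustrative value
    export_gguf=True,
    quantization_types=["q4_k_m", "q8_0"],
)
# To submit against a live engine: json.load(urllib.request.urlopen(req))
```

Keeping request construction separate from sending makes the payload easy to inspect or log before any training is started.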
Response
Curl
GET /v1/distillation/{job_id}
Fetch the current state of a job.
Response
The status field moves through pending → running, then ends in completed, failed,
or cancelled. The phase field is more granular: initializing → data_generation
→ curation → training → export → (done).
Polling idiom
GET /v1/distillation — list jobs
Response
Query parameters: tenant_id, status, limit (max 100), offset.
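A small sketch of building the list URL, assuming these four names are passed as query-string parameters. The base URL, the helper name `list_jobs_url`, and the client-side default of `limit=20` are assumptions. How the server treats a limit above 100 is not stated, so this sketch rejects it before sending.

```python
from urllib.parse import urlencode

# Assumption: the engine is reachable here; adjust for your deployment.
BASE_URL = "http://localhost:8000"
MAX_LIMIT = 100  # documented maximum page size


def list_jobs_url(tenant_id=None, status=None, limit=20, offset=0):
    """Build the GET /v1/distillation URL for one page of jobs."""
    if not 1 <= limit <= MAX_LIMIT:
        raise ValueError(f"limit must be between 1 and {MAX_LIMIT}")
    if offset < 0:
        raise ValueError("offset must be non-negative")
    params = {"limit": limit, "offset": offset}
    # Only include filters the caller actually set.
    if tenant_id is not None:
        params["tenant_id"] = tenant_id
    if status is not None:
        params["status"] = status
    return f"{BASE_URL}/v1/distillation?{urlencode(params)}"


url = list_jobs_url(tenant_id="default", status="running", limit=50)
```

To walk all jobs, increase `offset` by `limit` on each request until a page comes back with fewer rows than requested.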
POST /v1/distillation/{job_id}/cancel
Cancel a running job. Safe at any phase; partial artifacts are kept.
GET /v1/distillation/{job_id}/artifacts
Fetch file paths on the engine host for the trained adapter and GGUF exports. Paths are relative to the engine’s OPENTRACY_DATA_DIR.
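Because the returned paths are relative, code running on the engine host has to join them with OPENTRACY_DATA_DIR before opening a file. A minimal sketch, assuming the variable is set in the process environment; the helper name `resolve_artifact` and the example paths are hypothetical, and the shape of the artifacts response is not assumed here.

```python
import os
from pathlib import Path


def resolve_artifact(relative_path, data_dir=None):
    """Join an artifact path from the API with the engine's data directory."""
    rel = Path(relative_path)
    if rel.is_absolute():
        # The endpoint is documented to return relative paths only.
        raise ValueError(f"expected a relative path, got {relative_path!r}")
    root = Path(data_dir if data_dir is not None else os.environ["OPENTRACY_DATA_DIR"])
    return root / rel


# Hypothetical values, for illustration only.
full_path = resolve_artifact("job-123/model-q4_k_m.gguf", data_dir="/srv/opentracy")
```

This only helps on the machine that holds the data directory; a remote client would still need its own way to copy the files.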
Errors
| Status | error.code | Meaning |
|---|---|---|
| 400 | invalid_config | Unknown model, missing required field, or bad range. |
| 402 | insufficient_credits | Cost estimate exceeds tenant’s budget. |
| 404 | job_not_found | job_id doesn’t exist (or belongs to another tenant). |
| 409 | job_already_running | Attempted to mutate a terminal job. |
| 500 | training_error | Subprocess crashed; see logs endpoint for details. |
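
A client can turn these responses into typed exceptions so callers branch on `code` rather than on raw status numbers. The sketch below assumes the error body is JSON shaped like `{"error": {"code": "..."}}`, which is inferred from the `error.code` column header and not confirmed. The names `DistillationError` and `error_from_response` are hypothetical.

```python
import json

# Status/code pairs from the table above, used as a fallback.
KNOWN_ERRORS = {
    400: "invalid_config",
    402: "insufficient_credits",
    404: "job_not_found",
    409: "job_already_running",
    500: "training_error",
}


class DistillationError(Exception):
    def __init__(self, status, code):
        super().__init__(f"HTTP {status}: {code}")
        self.status = status
        self.code = code


def error_from_response(status, body_text):
    """Turn a non-2xx response into a DistillationError.

    Prefers the code in the body; falls back to the table's code for that
    status, then to "unknown_error" if neither is available.
    """
    code = None
    try:
        code = json.loads(body_text)["error"]["code"]
    except (ValueError, KeyError, TypeError):
        pass  # body missing, not JSON, or not in the assumed shape
    if not isinstance(code, str):
        code = KNOWN_ERRORS.get(status, "unknown_error")
    return DistillationError(status, code)


err = error_from_response(402, '{"error": {"code": "insufficient_credits"}}')
fallback = error_from_response(404, "not json")
```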

