Documentation
Trace Collection
Capture full traces from your LLM agents and individual calls. Kalmia offers two approaches: the SDK for rich agent tracing with automatic span capture, and the Proxy for zero-code-change tracing of individual LLM calls.
Use traced() to group multiple calls and tool executions into a single trace with full conversation history.
Quick Start
Three steps to start collecting traces from your existing code:
1. Install the SDK: pip install kalmia-sdk or npm install @kalmia/sdk
2. Initialize the logger and wrap your client.
3. Use traced() to group agent runs into a single trace, or let individual calls create their own traces automatically.
from kalmia_sdk import init_logger, wrap_anthropic, traced
import anthropic
# 1. Point traces at your Kalmia instance
init_logger(project_name="my-agent", base_url="http://localhost:3000")
# 2. Wrap your client — every LLM call is now auto-traced
client = wrap_anthropic(anthropic.Anthropic())
# 3. Use traced() to group calls into one trace
@traced(name="my-agent-run")
def run(prompt):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

run("What is the capital of France?")
Python SDK
The Python SDK has zero external dependencies. It supports both OpenAI and Anthropic clients.
Installation
pip install kalmia-sdk
Wrap OpenAI
from kalmia_sdk import init_logger, wrap_openai
import openai
init_logger(project_name="my-project", base_url="http://localhost:3000")
client = wrap_openai(openai.OpenAI())
# Every call is now auto-traced
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
Wrap Anthropic
from kalmia_sdk import init_logger, wrap_anthropic
import anthropic
init_logger(project_name="my-project", base_url="http://localhost:3000")
client = wrap_anthropic(anthropic.Anthropic())
# Every call is now auto-traced
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
API
TypeScript SDK
The TypeScript SDK uses AsyncLocalStorage for automatic context propagation. No manual span passing required.
Installation
npm install @kalmia/sdk
Wrap OpenAI
import OpenAI from "openai";
import { initLogger, wrapOpenAI } from "@kalmia/sdk";
initLogger({ projectName: "my-project", baseUrl: "http://localhost:3000" });
const client = wrapOpenAI(new OpenAI());
const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});
Wrap Anthropic
import Anthropic from "@anthropic-ai/sdk";
import { initLogger, wrapAnthropic } from "@kalmia/sdk";
initLogger({ projectName: "my-project", baseUrl: "http://localhost:3000" });
const client = wrapAnthropic(new Anthropic());
const message = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello!" }],
});
API
Agent Tracing
Use traced() to capture an entire agent run — multiple LLM calls, tool executions, and the full conversation — as a single trace. Without traced(), each LLM call creates its own standalone trace.
Tool Spans
Wrap tool executions with traced() using span_type="tool" to capture them as tool spans. These appear with a green icon in the dashboard.
from kalmia_sdk import init_logger, wrap_anthropic, traced
import anthropic, json
init_logger(project_name="weather-agent", base_url="http://localhost:3000")
client = wrap_anthropic(anthropic.Anthropic())
TOOLS = [{
    "name": "get_weather",
    "description": "Get weather for a city",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
}]

def get_weather(city):
    # Placeholder tool implementation; replace with a real weather lookup.
    return {"city": city, "forecast": "unknown"}

@traced(name="weather-agent")
def run_agent(prompt):
    messages = [{"role": "user", "content": prompt}]
    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            tools=TOOLS,
            messages=messages,
        )
        tool_uses = [b for b in response.content if b.type == "tool_use"]
        if not tool_uses:
            # No more tool requests: return the final text answer
            return response.content[0].text
        tool_results = []
        for tool_use in tool_uses:
            # Tool calls are captured as tool spans
            with traced(name=tool_use.name, span_type="tool") as span:
                result = get_weather(tool_use.input["city"])
                span.log(input=tool_use.input, output=result)
            tool_results.append({
                "type": "tool_result",
                "tool_use_id": tool_use.id,
                "content": json.dumps(result),
            })
        # Feed the assistant turn and the tool results back into the conversation
        messages.append({"role": "assistant",
                         "content": [b.model_dump() for b in response.content]})
        messages.append({"role": "user", "content": tool_results})

run_agent("Compare weather in Tokyo, New York, and San Francisco")
What Shows Up in the Dashboard
A traced agent run produces a conversation view showing every message exchange and tool call.
OpenAI Proxy
Alternatively, point the OpenAI SDK directly at Kalmia. The proxy forwards requests to api.openai.com and captures each call as a trace. No SDK wrapper needed.
Note: the proxy captures individual calls only. For multi-step agent tracing with grouped spans, use the SDK approach above.
from openai import OpenAI
client = OpenAI(
    base_url="http://localhost:3000/api/v1",
    api_key="sk-...",
    default_headers={"x-kalmia-project": "my-project"},
)
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)
Anthropic / Claude Proxy
Point the Anthropic SDK at /api/v1. The proxy handles the /messages path and forwards the x-api-key and anthropic-version headers.
import anthropic
client = anthropic.Anthropic(
    base_url="http://localhost:3000/api/v1",
    api_key="sk-ant-...",
    default_headers={"x-kalmia-project": "my-project"},
)
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)
Streaming
Both the SDK wrappers and the proxy support streaming. The SDK captures the full response after the stream completes. The proxy passes SSE chunks through in real time.
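Conceptually, capturing a response "after the stream completes" amounts to a pass-through generator that hands each chunk to the caller and reports the joined text once at the end. The sketch below illustrates that idea with plain strings; capture_stream and on_complete are hypothetical names, and this is not the SDK's actual code.

```python
from typing import Callable, Iterable, Iterator


def capture_stream(chunks: Iterable[str],
                   on_complete: Callable[[str], None]) -> Iterator[str]:
    """Yield each chunk unchanged, then report the joined text once."""
    parts = []
    for chunk in chunks:
        parts.append(chunk)
        yield chunk  # the caller sees every chunk as soon as it arrives
    # Reached only after the caller has consumed the whole stream.
    on_complete("".join(parts))


captured = []
received = list(capture_stream(["Once ", "upon ", "a time"], captured.append))
print(received)
print(captured)
```

One consequence of this design is that a stream the caller abandons partway never reaches the completion step, so nothing is reported for it in this sketch.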
# Streaming works transparently with wrapped clients
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
)
for chunk in stream:
    if chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
# Span is auto-created when stream ends
Direct Trace Ingestion
Already have traces in Braintrust format? Post them directly to the ingestion endpoint. Accepts a single trace object or an array.
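If your traces are produced in Python, a small helper can assemble the payload before posting. The sketch below uses only the standard library and the field names from the request body in this section; build_trace and build_ingest_request are hypothetical helpers, and the span ID naming scheme is an assumption made for illustration.

```python
import json
import urllib.request

INGEST_URL = "http://localhost:3000/api/v1/traces/ingest"


def build_trace(trace_id, project_id, user_text, output_text, provider, model):
    """Assemble one trace object with a single LLM span."""
    messages = [{"role": "user", "content": user_text}]
    span_id = f"{trace_id}-span-1"  # hypothetical naming scheme
    return {
        "id": trace_id,
        "project_id": project_id,
        "input": messages,
        "output": output_text,
        "metadata": {"provider": provider, "model": model},
        "spans": [{
            "id": span_id,
            "span_id": span_id,
            "root_span_id": span_id,
            "name": f"{provider} chat",
            "input": messages,
            "output": output_text,
            "span_attributes": {"type": "llm"},
        }],
    }


def build_ingest_request(traces):
    """Wrap a list of traces in a JSON POST request (not sent yet)."""
    return urllib.request.Request(
        INGEST_URL,
        data=json.dumps(traces).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_ingest_request([
    build_trace("trace-1", "my-project", "Hello", "Hi there!", "openai", "gpt-4o"),
])

# To send it, a Kalmia instance must be running at INGEST_URL:
# with urllib.request.urlopen(request) as resp:
#     print(resp.status)
```

Building the request separately from sending it keeps the payload easy to inspect or log before anything goes over the network.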
curl -X POST http://localhost:3000/api/v1/traces/ingest \
  -H "Content-Type: application/json" \
  -d '[{
    "id": "trace-1",
    "project_id": "my-project",
    "input": [{"role": "user", "content": "Hello"}],
    "output": "Hi there!",
    "metadata": {"provider": "openai", "model": "gpt-4o"},
    "spans": [{
      "id": "span-1",
      "span_id": "span-1",
      "root_span_id": "span-1",
      "name": "openai chat",
      "input": [{"role": "user", "content": "Hello"}],
      "output": "Hi there!",
      "span_attributes": {"type": "llm"},
      "metrics": {"prompt_tokens": 10, "completion_tokens": 5}
    }]
  }]'
Projects
Traces are grouped into projects. With the SDK, set project_name in init_logger(). With the proxy, use the x-kalmia-project header. Each project gets its own experiment in the dashboard.
All traces for a project go into that one experiment: new traces are appended to it rather than creating a duplicate experiment.
API Reference
All endpoints are served from your Kalmia instance. Replace localhost:3000 with your deployed URL in production.
/api/v1/chat/completions: OpenAI-compatible proxy. Forwards to api.openai.com and captures trace.
/api/v1/messages: Anthropic-compatible proxy. Forwards to api.anthropic.com and captures trace.
/api/v1/traces/ingest: Ingest pre-built trace objects. Accepts a single trace or an array.
/api/v1/traces: List captured traces. Supports ?project= and ?provider= query filters.
/api/experiments: List all experiments, including auto-created proxy experiments.
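As an illustration of the query filters on the traces endpoint, the sketch below builds list URLs with the standard library. traces_url is a hypothetical helper. The reference above does not state the HTTP method or the response format, so the sketch only constructs the URL and assumes the endpoint is read with a plain GET.

```python
from urllib.parse import urlencode

BASE_URL = "http://localhost:3000"  # replace with your deployed URL


def traces_url(project=None, provider=None, base_url=BASE_URL):
    """Build the list-traces URL, adding only the filters that were given."""
    filters = {"project": project, "provider": provider}
    query = urlencode({k: v for k, v in filters.items() if v is not None})
    url = f"{base_url}/api/v1/traces"
    return f"{url}?{query}" if query else url


print(traces_url())
print(traces_url(project="my-project"))
print(traces_url(project="my-project", provider="openai"))
```

Using urlencode rather than string concatenation keeps project names with spaces or special characters correctly escaped.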

