Kalmia

Documentation

Trace Collection

Capture full traces from your LLM agents and individual calls. Kalmia offers two approaches: the SDK for rich agent tracing with automatic span capture, and the Proxy for zero-code-change tracing of individual LLM calls.

SDK (recommended): Wrap your OpenAI or Anthropic client. Every LLM call is auto-captured. Use traced() to group multiple calls and tool executions into a single trace with full conversation history.
Proxy: Point your SDK's base URL at Kalmia. Requests are forwarded to the real provider and each call is recorded as an individual trace.

Quick Start

Three steps to start collecting traces from your existing code:

1. Install the SDK: pip install kalmia-sdk or npm install @kalmia/sdk
2. Initialize the logger and wrap your client.
3. Use traced() to group agent runs into a single trace, or let individual calls create their own traces automatically.

from kalmia_sdk import init_logger, wrap_anthropic, traced
import anthropic

# 1. Point traces at your Kalmia instance
init_logger(project_name="my-agent", base_url="http://localhost:3000")

# 2. Wrap your client — every LLM call is now auto-traced
client = wrap_anthropic(anthropic.Anthropic())

# 3. Use traced() to group calls into one trace
@traced(name="my-agent-run")
def run(prompt):
    response = client.messages.create(
        model="claude-sonnet-4-20250514",
        max_tokens=1024,
        messages=[{"role": "user", "content": prompt}],
    )
    return response.content[0].text

run("What is the capital of France?")

Python SDK

The Python SDK has zero external dependencies. It supports both OpenAI and Anthropic clients.

Installation

pip install kalmia-sdk

Wrap OpenAI

from kalmia_sdk import init_logger, wrap_openai
import openai

init_logger(project_name="my-project", base_url="http://localhost:3000")
client = wrap_openai(openai.OpenAI())

# Every call is now auto-traced
response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)

Wrap Anthropic

from kalmia_sdk import init_logger, wrap_anthropic
import anthropic

init_logger(project_name="my-project", base_url="http://localhost:3000")
client = wrap_anthropic(anthropic.Anthropic())

# Every call is now auto-traced
message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)

API

init_logger(project_name, base_url): Configure trace destination
wrap_openai(client): Auto-trace chat completions
wrap_anthropic(client): Auto-trace message creation
@traced(name=...) / with traced(): Group calls into one trace
current_span(): Get active span from context

TypeScript SDK

The TypeScript SDK uses AsyncLocalStorage for automatic context propagation. No manual span passing required.

Installation

npm install @kalmia/sdk

Wrap OpenAI

import OpenAI from "openai";
import { initLogger, wrapOpenAI } from "@kalmia/sdk";

initLogger({ projectName: "my-project", baseUrl: "http://localhost:3000" });
const client = wrapOpenAI(new OpenAI());

const response = await client.chat.completions.create({
  model: "gpt-4o",
  messages: [{ role: "user", content: "Hello!" }],
});

Wrap Anthropic

import Anthropic from "@anthropic-ai/sdk";
import { initLogger, wrapAnthropic } from "@kalmia/sdk";

initLogger({ projectName: "my-project", baseUrl: "http://localhost:3000" });
const client = wrapAnthropic(new Anthropic());

const message = await client.messages.create({
  model: "claude-sonnet-4-20250514",
  max_tokens: 1024,
  messages: [{ role: "user", content: "Hello!" }],
});

API

initLogger({ projectName, baseUrl }): Configure trace destination
wrapOpenAI(client): Auto-trace chat completions
wrapAnthropic(client): Auto-trace message creation
traced(callback, { name }): Group calls into one trace
currentSpan(): Get active span from context

Agent Tracing

Use traced() to capture an entire agent run — multiple LLM calls, tool executions, and the full conversation — as a single trace. Without traced(), each LLM call creates its own standalone trace.

Tool Spans

Wrap tool executions with traced() using span_type="tool" to capture them as tool spans. These appear with a green icon in the dashboard.

from kalmia_sdk import init_logger, wrap_anthropic, traced
import anthropic, json

init_logger(project_name="weather-agent", base_url="http://localhost:3000")
client = wrap_anthropic(anthropic.Anthropic())

TOOLS = [{
    "name": "get_weather",
    "description": "Get weather for a city",
    "input_schema": {
        "type": "object",
        "properties": {"city": {"type": "string"}},
        "required": ["city"],
    },
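}]

def get_weather(city):
    # Placeholder tool implementation; swap in a real weather lookup.
    return {"city": city, "forecast": "sunny", "temp_c": 21}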

@traced(name="weather-agent")
def run_agent(prompt):
    messages = [{"role": "user", "content": prompt}]

    while True:
        response = client.messages.create(
            model="claude-sonnet-4-20250514",
            max_tokens=1024,
            tools=TOOLS,
            messages=messages,
        )

        tool_uses = [b for b in response.content if b.type == "tool_use"]
        if not tool_uses:
            return response.content[0].text

        tool_results = []
        for tool_use in tool_uses:
            # Tool calls are captured as tool spans
            with traced(name=tool_use.name, span_type="tool") as span:
                result = get_weather(tool_use.input["city"])
                span.log(input=tool_use.input, output=result)

            tool_results.append({
                "type": "tool_result",
                "tool_use_id": tool_use.id,
                "content": json.dumps(result),
            })

        messages.append({"role": "assistant",
                         "content": [b.model_dump() for b in response.content]})
        messages.append({"role": "user", "content": tool_results})

run_agent("Compare weather in Tokyo, New York, and San Francisco")

What Shows Up in the Dashboard

A traced agent run produces a conversation view showing every message exchange and tool call:

LLM spans: Purple icon, auto-captured with tokens
Tool spans: Green icon, manual via traced()
Conversation: Full message history in trace view
Metrics: Token counts, timing per span

OpenAI Proxy

Alternatively, point the OpenAI SDK directly at Kalmia. The proxy forwards requests to api.openai.com and captures each call as a trace. No SDK wrapper needed.

Note: the proxy captures individual calls only. For multi-step agent tracing with grouped spans, use the SDK approach above.

from openai import OpenAI

client = OpenAI(
    base_url="http://localhost:3000/api/v1",
    api_key="sk-...",
    default_headers={"x-kalmia-project": "my-project"},
)

response = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Hello!"}],
)
print(response.choices[0].message.content)

Anthropic / Claude Proxy

Point the Anthropic SDK at /api/v1. The proxy handles the /messages path and forwards the x-api-key and anthropic-version headers.

import anthropic

client = anthropic.Anthropic(
    base_url="http://localhost:3000/api/v1",
    api_key="sk-ant-...",
    default_headers={"x-kalmia-project": "my-project"},
)

message = client.messages.create(
    model="claude-sonnet-4-20250514",
    max_tokens=1024,
    messages=[{"role": "user", "content": "Hello!"}],
)
print(message.content[0].text)

Streaming

Both the SDK wrappers and the proxy support streaming. The SDK captures the full response after the stream completes. The proxy passes SSE chunks through in real time.

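from kalmia_sdk import init_logger, wrap_openai
import openai

init_logger(project_name="my-project", base_url="http://localhost:3000")
client = wrap_openai(openai.OpenAI())

# Streaming works transparently with wrapped clients
stream = client.chat.completions.create(
    model="gpt-4o",
    messages=[{"role": "user", "content": "Tell me a story"}],
    stream=True,
)
for chunk in stream:
    # Some chunks carry no choices or no text, so check before printing
    if chunk.choices and chunk.choices[0].delta.content:
        print(chunk.choices[0].delta.content, end="")
# Span is auto-created when stream ends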
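The idea of capturing the full response only once the stream completes can be sketched with a plain generator. This is an illustration under assumptions, not the SDK's code: capture_stream and the on_complete callback are hypothetical, and plain strings stand in for provider chunk objects.

```python
def capture_stream(chunks, on_complete):
    """Yield each chunk unchanged, then report the joined text at the end."""
    parts = []
    for chunk in chunks:
        parts.append(chunk)
        yield chunk
    # Reached only after the consumer has exhausted the stream.
    on_complete("".join(parts))


recorded = []
stream = capture_stream(iter(["Once ", "upon ", "a time"]), recorded.append)

# The consumer reads chunks exactly as it would from the raw stream.
seen = list(stream)
```

In this sketch nothing is recorded if the consumer abandons the stream early; a real wrapper would have to decide how to handle that case.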

Direct Trace Ingestion

Already have traces in Braintrust format? Post them directly to the ingestion endpoint, which accepts a single trace object or an array.

curl -X POST http://localhost:3000/api/v1/traces/ingest \
  -H "Content-Type: application/json" \
  -d '[{
    "id": "trace-1",
    "project_id": "my-project",
    "input": [{"role": "user", "content": "Hello"}],
    "output": "Hi there!",
    "metadata": {"provider": "openai", "model": "gpt-4o"},
    "spans": [{
      "id": "span-1",
      "span_id": "span-1",
      "root_span_id": "span-1",
      "name": "openai chat",
      "input": [{"role": "user", "content": "Hello"}],
      "output": "Hi there!",
      "span_attributes": {"type": "llm"},
      "metrics": {"prompt_tokens": 10, "completion_tokens": 5}
    }]
  }]'
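The same request can be prepared from Python with only the standard library. The sketch below builds the POST request without sending it; build_ingest_request is a hypothetical helper, the payload mirrors the fields of the curl example, and the response format is not assumed.

```python
import json
import urllib.request


def build_ingest_request(traces, base_url="http://localhost:3000"):
    """Build a POST request for the ingestion endpoint without sending it."""
    # The endpoint accepts a single trace object or an array,
    # so the payload is serialized exactly as given.
    body = json.dumps(traces).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/api/v1/traces/ingest",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


messages = [{"role": "user", "content": "Hello"}]
trace = {
    "id": "trace-1",
    "project_id": "my-project",
    "input": messages,
    "output": "Hi there!",
    "metadata": {"provider": "openai", "model": "gpt-4o"},
    "spans": [{
        "id": "span-1",
        "span_id": "span-1",
        "root_span_id": "span-1",
        "name": "openai chat",
        "input": messages,
        "output": "Hi there!",
        "span_attributes": {"type": "llm"},
        "metrics": {"prompt_tokens": 10, "completion_tokens": 5},
    }],
}
request = build_ingest_request([trace])

# To send it against a running instance:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```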

Projects

Traces are grouped into projects. With the SDK, set project_name in init_logger(). With the proxy, use the x-kalmia-project header. Each project gets its own Experiment in the dashboard.

SDK: init_logger(project_name="...")
Proxy header: x-kalmia-project
Default: default
Experiment name: {project} (Proxy)

Traces for the same project are grouped into one experiment. New traces append to the existing experiment rather than creating duplicates.
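As a rough illustration of the defaulting rule listed above, the following sketch shows how a proxy might pick the project for an incoming request. The resolve_project function is hypothetical, and treating a blank header value as missing is an assumption rather than documented behavior.

```python
DEFAULT_PROJECT = "default"
PROJECT_HEADER = "x-kalmia-project"


def resolve_project(headers):
    """Return the project named by the proxy header, or the default project."""
    # HTTP header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    project = normalized.get(PROJECT_HEADER, "").strip()
    # Assumption: a blank header value is treated the same as a missing one.
    return project or DEFAULT_PROJECT


named = resolve_project({"X-Kalmia-Project": "my-project"})
fallback = resolve_project({"Content-Type": "application/json"})
```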

API Reference

All endpoints are served from your Kalmia instance. Replace localhost:3000 with your deployed URL in production.

POST /api/v1/chat/completions

OpenAI-compatible proxy. Forwards to api.openai.com and captures trace.

POST /api/v1/messages

Anthropic-compatible proxy. Forwards to api.anthropic.com and captures trace.

POST /api/v1/traces/ingest

Ingest pre-built trace objects. Accepts a single trace or an array.

GET /api/v1/traces

List captured traces. Supports ?project= and ?provider= query filters.
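A small helper can assemble the listing URL with its optional filters. This is a sketch: traces_url is a hypothetical name, and it assumes the two filters can be combined in one request, which the description above does not state explicitly.

```python
from urllib.parse import urlencode


def traces_url(base_url="http://localhost:3000", project=None, provider=None):
    """Build the trace listing URL, adding only the filters that were given."""
    params = {}
    if project is not None:
        params["project"] = project
    if provider is not None:
        params["provider"] = provider
    url = f"{base_url}/api/v1/traces"
    # urlencode escapes values so project names with special characters are safe.
    return f"{url}?{urlencode(params)}" if params else url


filtered = traces_url(project="my-project", provider="openai")
unfiltered = traces_url()
```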

GET /api/experiments

List all experiments, including auto-created proxy experiments.