Behavior Detection

Define custom behaviors in plain language and let Kalmia's AI automatically scan your traces and tag matches.

How it works

Behavior detection uses a two-table system: behaviors are definitions (what to look for) and annotations are instances (where it was found). They're linked by a shared label string.

Behavior: A definition, e.g. 'agent retries the same tool 3+ times'

Annotation: An instance, i.e. a specific trace tagged with a behavior

Detection: AI scans traces and auto-creates annotations on matches
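The relationship between the two tables can be pictured with a minimal Python sketch. The class and field names here (Behavior, Annotation, trace_id, annotations_for) are illustrative assumptions, not Kalmia's actual schema; the only detail taken from this guide is that the two records are joined on a shared label string.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Behavior:
    """A definition: what to look for (hypothetical shape)."""
    label: str
    description: str


@dataclass(frozen=True)
class Annotation:
    """An instance: one trace tagged with a behavior label (hypothetical shape)."""
    trace_id: str
    label: str  # matches Behavior.label; this string is the only link


def annotations_for(behavior, annotations):
    """Return the annotations whose label matches the given behavior."""
    return [a for a in annotations if a.label == behavior.label]


behaviors = [
    Behavior(
        label="excessive-retries",
        description="The agent retries the same tool call 3 or more times in a row",
    )
]
annotations = [
    Annotation(trace_id="trace-abc", label="excessive-retries"),
    Annotation(trace_id="trace-def", label="hallucination"),
]

# Look up every instance of the first behavior by its label.
matches = annotations_for(behaviors[0], annotations)
```

Because the join is on a plain string rather than an ID, an annotation only belongs to a behavior while the two labels are identical.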

Define a behavior

Create a behavior definition with a label and plain-language description. The description tells Kalmia's AI what pattern to look for.

curl -X POST /api/behaviors \
  -H "Content-Type: application/json" \
  -d '{
    "label": "excessive-retries",
    "description": "The agent retries the same tool call 3 or more times in a row without changing its approach"
  }'

Run detection

Trigger detection on a set of traces. Kalmia's internal agent (powered by Claude Haiku) scans each trace against your behavior definitions and creates an annotation, with a confidence level, for each match.

curl -X POST /api/behaviors/detect \
  -H "Content-Type: application/json" \
  -d '{
    "traceIds": ["trace-abc", "trace-def"],
    "behaviorLabels": ["excessive-retries"]
  }'
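The same request can be built from a script. The sketch below uses only Python's standard library and mirrors the curl call above; the base URL and the helper name build_detect_request are assumptions for illustration, so substitute the address of your own Kalmia instance.

```python
import json
import urllib.request


def build_detect_request(base_url, trace_ids, behavior_labels):
    """Build (but do not send) a POST request to /api/behaviors/detect."""
    body = json.dumps(
        {"traceIds": trace_ids, "behaviorLabels": behavior_labels}
    ).encode("utf-8")
    return urllib.request.Request(
        url=base_url.rstrip("/") + "/api/behaviors/detect",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# "http://localhost:3000" is a placeholder host, not a documented default.
request = build_detect_request(
    "http://localhost:3000",
    ["trace-abc", "trace-def"],
    ["excessive-retries"],
)

# To actually send it against a running instance:
# with urllib.request.urlopen(request) as response:
#     print(response.status)
```

Keeping construction separate from sending makes the payload easy to inspect or log before any detection run is triggered.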

Three creation paths

Annotations can be created in three ways:

Manual: Create an annotation directly on a trace from the dashboard

AI auto-detection: Triggered via /api/behaviors/detect, which scans traces and tags matches

Batch fetch: Loading annotations automatically runs detection for any missing behaviors

Confidence levels

The AI assigns each detection a confidence level: high, medium, or low. Only high- and medium-confidence matches are auto-saved as annotations; low-confidence matches are discarded.
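The filtering rule can be illustrated with a short Python sketch. The shape of a detection result (a dict with traceId, label, and confidence keys) is an assumption made for the example; this guide does not specify the actual response format.

```python
# Levels that are kept, per the rule described above.
AUTO_SAVE_LEVELS = {"high", "medium"}


def select_for_auto_save(detections):
    """Keep high and medium confidence detections; drop low ones."""
    return [d for d in detections if d["confidence"] in AUTO_SAVE_LEVELS]


# Hypothetical detection results for three traces.
detections = [
    {"traceId": "trace-abc", "label": "excessive-retries", "confidence": "high"},
    {"traceId": "trace-def", "label": "excessive-retries", "confidence": "low"},
    {"traceId": "trace-ghi", "label": "excessive-retries", "confidence": "medium"},
]

saved = select_for_auto_save(detections)
saved_trace_ids = [d["traceId"] for d in saved]
```

In this example only trace-abc and trace-ghi would end up with annotations; the low-confidence match on trace-def leaves no record.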

Cascading delete

Deleting a behavior definition automatically deletes all annotations with that label. This keeps your data clean when you remove or rename behaviors.
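As a rough illustration of the cascade, the sketch below models both tables in memory and removes a behavior together with every annotation that carries its label. The data shapes and the delete_behavior helper are hypothetical; Kalmia's actual storage and deletion code are not described in this guide.

```python
def delete_behavior(label, behaviors, annotations):
    """Return copies of both tables with the behavior and its annotations removed."""
    remaining_behaviors = {k: v for k, v in behaviors.items() if k != label}
    remaining_annotations = [a for a in annotations if a["label"] != label]
    return remaining_behaviors, remaining_annotations


# Hypothetical in-memory tables: behaviors keyed by label, annotations as rows.
behaviors = {
    "excessive-retries": "The agent retries the same operation 3+ times",
    "backtracking": "The agent undoes or reverts changes it previously made",
}
annotations = [
    {"traceId": "trace-abc", "label": "excessive-retries"},
    {"traceId": "trace-def", "label": "backtracking"},
    {"traceId": "trace-ghi", "label": "excessive-retries"},
]

behaviors_after, annotations_after = delete_behavior(
    "excessive-retries", behaviors, annotations
)
```

After the call, only the backtracking behavior and its single annotation on trace-def remain, so no annotation is left pointing at a label that no longer has a definition.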

Example behaviors

Hallucination: "The agent generates code or file references that don't exist in the project"
Excessive retries: "The agent retries the same operation 3+ times without changing approach"
Permission errors: "The agent encounters permission denied or access errors"
Backtracking: "The agent undoes or reverts changes it previously made"