//DOCS Chat & Agents

Route-visible chat economics, save-as-agent, and scheduled single-task agents.
Open App

Chat & Agents

Punk's interactive surface has two parts:

  • Chat: a conversation UI where every assistant reply is a real gateway run. Each reply carries its route badge and exact cost, so you watch conversations get cheaper as the runtime learns.
  • Agents: scheduled single-task runners built on the workflow engine. An agent is instructions + a prompt template + a model + an optional cron schedule, with no graph editing required.

Both are tenant-scoped and use the same auth as the rest of the API (session cookie or bearer key).

Chat

How a turn works

POST /api/v1/conversations/:id/messages builds the full message history (the conversation's system prompt, every prior turn, then the new message) and routes it through the same runtime path as any gateway request: governance, exact cache, promoted artifacts, learned substitutions, then the live provider. Whatever path served the reply, the stored assistant message records:

FieldMeaning
runIdThe gateway run behind the reply (full trace at /api/v1/runs/:id).
routelive, exact_cache, artifact, model_substitution, …
costUsdWhat the reply actually cost.
savedUsdWhat it avoided versus the live path.

This is the point: ask the same question twice (even in a fresh thread with the same system prompt and model) and the second answer comes back as an exact_cache hit at $0 with the savings on the reply. Chat turns run at temperature 0 under app id chat, so they cache, fingerprint, and feed the learning loop like all other traffic.

Failed or policy-blocked turns store nothing; the gateway's error (including punk_* codes) is returned as-is.

Endpoints

MethodPathNotes
GET/api/v1/conversationsTenant's conversations, most recently active first.
POST/api/v1/conversations{ title?, model?, system? }, where model defaults to gpt-4o.
GET/api/v1/conversations/:idConversation plus its messages in order.
DELETE/api/v1/conversations/:idRemoves the thread and all messages.
POST/api/v1/conversations/:id/messages{ content }{ conversation, userMessage, assistantMessage }.

Conversations auto-title from the first user message (first 6 words).

Dashboard

#/chat shows the conversation list beside the thread. User turns render on the right; assistant turns on the left with the route badge, cost, savings, and a link to the underlying run. The composer sets the model and an optional collapsible system prompt when starting a thread; an existing thread pins both. Enter sends; shift+Enter inserts a newline.

SAVE AS AGENT turns a conversation's setup into an agent: the system prompt becomes the agent's instructions, the last user message becomes its prompt template, and the model carries over, prefilled into the agent form for you to schedule.

Agents

An agent is the simplest useful automation: one governed LLM task, runnable on demand or on a cron schedule. Under the hood it is a workflow row with kind: "agent" and a fixed three-node graph (start → llm → output), where the llm node holds the agent's instructions (system), prompt template, and model and saves its answer. That means agents inherit the entire workflow machinery for free:

  • runs execute through the deterministic interpreter, with node timelines on the trace ledger (/api/v1/workflow-runs/:id);
  • the LLM step is a real gateway run (governed, cached, learnable), so repeated agent runs become patterns and, with evidence, deterministic artifacts;
  • cost and savings roll up per run, and the shared savings endpoint (/api/v1/workflows/:id/savings) works unchanged;
  • cron schedules fire through the existing minute-cadence schedule sweep, and deleting an agent unschedules it atomically.

The kind column only partitions the views: agents never appear in the workflow editor's list (GET /api/v1/workflows?kind=workflow), and plain workflows never appear as agents.

Endpoints

MethodPathNotes
GET/api/v1/agentsTenant's agents (name, instructions, prompt, model, cron, enabled).
POST/api/v1/agentsAdmin. { name, instructions, prompt, model?, scheduleCron?, description?, enabled? }.
GET/api/v1/agents/:idOne agent.
PATCH/api/v1/agents/:idAdmin. Partial update; the graph is rebuilt and the version bumps.
DELETE/api/v1/agents/:idAdmin. Also removes its schedule.
POST/api/v1/agents/:id/runAdmin. { input? } → synchronous completed run.

Prompt templates interpolate run input with {{input.field}} (e.g. Triage this ticket: {{input.ticket}}). scheduleCron is a 5-field UTC cron expression.

Dashboard

#/agents lists agents with model, cron badge, enabled toggle, last run, and total savings. The create/edit form is four fields plus a schedule, deliberately not the graph editor. RUN NOW executes synchronously and shows the output, cost/savings, and a link to the run's node timeline.

Economics in one sentence

Chat and agents are both front-ends to the same runtime loop: traffic is observed, repeated shapes become patterns, proven patterns route through caches and artifacts, and the route badge on every reply shows you exactly when that happened and what it saved.