TypeScript client surface for chat, tools, feedback, and SOM fetch.

@punk/sdk — API reference

The TypeScript client for the Punk gateway. Zero dependencies; works in Bun and Node 18+ (uses global fetch). Source: packages/sdk/. For a guided tour, read ONBOARDING.md.

import { Punk } from "@punk/sdk";

All response types (Run, Pattern, Artifact, SavingsSummary, SomSnapshot, …) are exported from the package. They are local mirrors of the canonical @punk/trace-schema contracts, copied so the published SDK stays dependency-free.


Constructor

new Punk(opts?: PunkOptions)
| Option | Type | Default | Sent as |
| --- | --- | --- | --- |
| baseUrl | string | "http://localhost:4100" | not sent (trailing slashes stripped) |
| apiKey | string | none | Authorization: Bearer <apiKey> on every request |
| app | string | "default-app" | X-Punk-App on chat |
| agent | string | none | X-Punk-Agent on chat |
| subject | string | none | X-Punk-Subject on chat; subject field on tool-cache calls |

The client is stateless — construct one per (app, agent, subject) identity. apiKey is only needed when the gateway sets PUNK_API_KEY.
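To make the option table concrete, here is a minimal sketch of how the options could map onto a chat request. It is a local illustration, not the SDK's source: `PunkOptionsSketch`, `resolveBaseUrl`, and `chatHeaders` are hypothetical names, and the `Content-Type` header is an assumption that the table does not mention.

```typescript
// Hypothetical local mirror of the option names in the table above.
interface PunkOptionsSketch {
  baseUrl?: string;
  apiKey?: string;
  app?: string;
  agent?: string;
  subject?: string;
}

// Apply the documented default and strip trailing slashes.
function resolveBaseUrl(opts: PunkOptionsSketch = {}): string {
  return (opts.baseUrl ?? "http://localhost:4100").replace(/\/+$/, "");
}

// Headers for a chat request. The bearer token is documented as going on
// every request; the X-Punk-* identity headers are documented for chat.
function chatHeaders(opts: PunkOptionsSketch = {}): Record<string, string> {
  const headers: Record<string, string> = {
    "Content-Type": "application/json", // assumption, not stated in the table
    "X-Punk-App": opts.app ?? "default-app",
  };
  if (opts.apiKey) headers["Authorization"] = `Bearer ${opts.apiKey}`;
  if (opts.agent) headers["X-Punk-Agent"] = opts.agent;
  if (opts.subject) headers["X-Punk-Subject"] = opts.subject;
  return headers;
}
```

Because identity is fixed at construction time, two agents that share a gateway would each get their own client rather than mutating a shared one.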


Chat

chat(params: ChatParams): Promise<ChatResult>

POST /v1/chat/completions (OpenAI-compatible) with the X-Punk-* identity headers. Forces stream: false — for streaming, use any OpenAI client pointed at the gateway instead.

interface ChatParams {
  model: string;
  messages: Array<{ role: string; content: string }>;
  temperature?: number;
  response_format?: unknown;
}

interface ChatResult {
  content: string; // choices[0].message.content, "" if absent
  runId: string;   // x-punk-run-id response header, "" if absent
  route: string;   // x-punk-route response header, "live" if absent
  raw: any;        // full OpenAI-shaped response body
}

Errors: throws on any non-2xx (including policy blocks, which return the verdict in the body).
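The fallbacks in ChatResult can be sketched as a pure function. This is an illustration of the documented defaults only, not the SDK's implementation; `toChatResult` and `ChatResultSketch` are hypothetical names, and `getHeader` stands in for `Response.headers.get` so the sketch runs without a network call.

```typescript
// Local copy of the documented result shape.
interface ChatResultSketch {
  content: string;
  runId: string;
  route: string;
  raw: unknown;
}

// Maps an OpenAI-shaped body plus response headers to the documented result,
// applying the documented fallbacks: "" for content and runId, "live" for route.
function toChatResult(
  body: unknown,
  getHeader: (name: string) => string | null,
): ChatResultSketch {
  const choices = (body as { choices?: Array<{ message?: { content?: unknown } }> } | null)?.choices;
  const content = choices?.[0]?.message?.content;
  return {
    content: typeof content === "string" ? content : "",
    runId: getHeader("x-punk-run-id") ?? "",
    route: getHeader("x-punk-route") ?? "live",
    raw: body,
  };
}
```

An empty `runId` means there is no run to attach tool traces or feedback to, so callers may want to check it before passing it on.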


Tool tracing

traceTool<TArgs, TResult>(def: ToolDefinition<TArgs, TResult>): TracedTool<TArgs, TResult>

Wraps a tool function so invocations are traced into a run and read-only results participate in the tool-result cache.

interface ToolDefinition<TArgs, TResult> {
  name: string;
  sideEffectLevel?: SideEffectLevel; // 0–4; default 3 (conservative)
  ttlSeconds?: number;               // level <= 1 + ttl > 0 => cacheable
  execute: (args: TArgs) => Promise<TResult> | TResult;
}

type TracedTool<TArgs, TResult> =
  (args: TArgs, ctx?: { runId?: string }) => Promise<TResult>;

Behavior of the returned function, in order:

  1. Cache check (only if sideEffectLevel <= 1 and ttlSeconds > 0): POST /api/v1/tool-cache/check with { toolName, subject, args }. On a hit, returns the cached result without executing; if a runId was given, traces tool.completed with cached: true. Network failure degrades to a miss.
  2. Trace tool.called with { name, args, sideEffectLevel } — only when ctx.runId is provided.
  3. Trace side_effect.planned with { toolName, level, payload } — only for sideEffectLevel >= 2, before execution, so replay/shadow can suppress it.
  4. Execute def.execute(args).
  5. Trace tool.completed with { name, result }.
  6. Cache store (cacheable tools only): POST /api/v1/tool-cache/store with the result and TTL.

Guarantees: without ctx.runId the tool executes untraced; trace and cache failures are swallowed (telemetry never breaks the tool call); errors thrown by execute propagate to the caller unchanged.
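The six steps and the guarantees above can be sketched against an injected backend. This is a minimal model under stated assumptions, not the SDK's code: `traceToolSketch` and `SketchBackend` are hypothetical, the `side_effect.planned` payload is assumed to be the tool arguments, and the cached `tool.completed` event is assumed to carry the name and result alongside `cached: true`.

```typescript
type Level = 0 | 1 | 2 | 3 | 4;

interface ToolDef<A, R> {
  name: string;
  sideEffectLevel?: Level;
  ttlSeconds?: number;
  execute: (args: A) => Promise<R> | R;
}

// Hypothetical stand-ins for the gateway calls; each may reject.
interface SketchBackend {
  cacheCheck(tool: string, args: unknown): Promise<{ hit: boolean; result?: unknown }>;
  cacheStore(tool: string, args: unknown, result: unknown, ttl: number): Promise<void>;
  trace(runId: string, type: string, payload: Record<string, unknown>): Promise<void>;
}

function traceToolSketch<A, R>(def: ToolDef<A, R>, backend: SketchBackend) {
  const level = def.sideEffectLevel ?? 3; // conservative default
  const ttl = def.ttlSeconds ?? 0;
  const cacheable = level <= 1 && ttl > 0;

  // Best-effort tracing: skipped without a runId, failures swallowed.
  const emit = async (runId: string | undefined, type: string, payload: Record<string, unknown>) => {
    if (!runId) return;
    try {
      await backend.trace(runId, type, payload);
    } catch {
      // telemetry never breaks the tool call
    }
  };

  return async (args: A, ctx?: { runId?: string }): Promise<R> => {
    const runId = ctx?.runId;
    if (cacheable) {
      // Step 1: a failed check degrades to a miss.
      const cached = await backend
        .cacheCheck(def.name, args)
        .catch((): { hit: boolean; result?: unknown } => ({ hit: false }));
      if (cached.hit) {
        await emit(runId, "tool.completed", { name: def.name, result: cached.result, cached: true });
        return cached.result as R;
      }
    }
    await emit(runId, "tool.called", { name: def.name, args, sideEffectLevel: level }); // step 2
    if (level >= 2) {
      // Step 3: announced before execution. Using args as the payload is an assumption.
      await emit(runId, "side_effect.planned", { toolName: def.name, level, payload: args });
    }
    const result = await def.execute(args); // step 4: errors propagate unchanged
    await emit(runId, "tool.completed", { name: def.name, result }); // step 5
    if (cacheable) {
      await backend.cacheStore(def.name, args, result, ttl).catch(() => {}); // step 6
    }
    return result;
  };
}
```

Note how the default level of 3 means a tool with no declared `sideEffectLevel` is never cached, even if it sets `ttlSeconds`.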

trace(runId: string, type: TraceEventType | string, payload: Record<string, unknown>): Promise<void>

POST /api/v1/trace with { runId, type, payload }. Appends a trace event to a run's ledger. Throws on non-2xx (unlike the internal tracing in traceTool, which is best-effort).


Feedback

feedback(runId: string, rating: 1 | -1, correction?: string): Promise<void>

POST /api/v1/runs/:id/feedback with { type: "rating", rating, correction }. Corrections are the strongest learning signal — they count against pattern stability and artifact confidence. Throws on non-2xx.


Semantic web (SOM)

fetchSom(url: string, opts?: { bypassCache?: boolean }): Promise<WebFetchResult>

POST /api/v1/web/fetch. Fetches a page and compiles it to a Semantic Object Model — regions and elements with stable ids — instead of raw HTML.

interface WebFetchResult {
  som: SomSnapshot;            // regions/elements, meta with byte counts
  source: string;              // "plasmate" | "builtin" | "cache"
  cached: boolean;             // served from the SOM cache
  htmlBytes: number;
  somBytes: number;
  tokensSavedEstimate: number; // raw-HTML tokens you didn't spend
  diff?: SomDiff;              // semantic diff vs. previous snapshot (on refetch)
  context: string;             // compact prompt-ready text rendering
}

bypassCache: true forces a refetch; when a prior snapshot exists, diff reports semantically weighted changes (pricing changed is high-significance; footer noise is low) and an aggregate driftScore in [0,1]. Throws on non-2xx.
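A caller might use these fields to decide whether a refetched page needs re-reading. The sketch below uses cut-down local types and makes two assumptions: that `driftScore` is a field on the diff object, and that 0.3 is a sensible threshold (it is an arbitrary illustration, not a gateway setting). `compressionRatio` and `pageDrifted` are hypothetical helpers.

```typescript
// Minimal local shapes; the real SomDiff and WebFetchResult carry more fields.
interface SomDiffSketch {
  driftScore: number; // assumed location of the aggregate score, in [0,1]
}

interface FetchResultSketch {
  cached: boolean;
  htmlBytes: number;
  somBytes: number;
  diff?: SomDiffSketch;
}

// Fraction of bytes removed by compiling HTML to SOM (0 when htmlBytes is 0).
function compressionRatio(r: FetchResultSketch): number {
  return r.htmlBytes > 0 ? 1 - r.somBytes / r.htmlBytes : 0;
}

// True when a refetch produced a diff whose drift meets the threshold.
// A result without a diff (first fetch) never counts as drifted.
function pageDrifted(r: FetchResultSketch, threshold = 0.3): boolean {
  return r.diff !== undefined && r.diff.driftScore >= threshold;
}
```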

Web sessions & actions — punk.web.*

The perception→action loop: open a stateful session, act on SOM element ids, observe the result. Actions are protocol-level (follow links, fill/submit forms — no JS engine, by design) and governed server-side.

punk.web.openSession(url): Promise<WebSessionOpenResult>   // POST /api/v1/web/sessions
punk.web.act(sessionId, intent): Promise<WebActResult>     // POST /api/v1/web/sessions/:id/act
punk.web.closeSession(sessionId): Promise<{ ok: boolean }> // DELETE /api/v1/web/sessions/:id
punk.web.listSessions()                                    // GET /api/v1/web/sessions

interface WebActionIntent {
  action: "click" | "type" | "select" | "submit";
  target: string;   // SOM element id e_… (or region id r_form… for submit)
  value?: string;   // for type/select
}

interface WebActResult {
  result: WebActionResult; // { ok, action, target, resolved?, navigated?, url, error? }
  som: SomSnapshot;        // fresh SOM after the action
  diff?: SomDiff;          // semantic diff vs. the pre-action snapshot
  context: string;         // prompt-ready rendering of the fresh SOM
}
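The open, act, observe loop can be sketched against a small local interface. Everything here is hedged: `WebClientSketch` only approximates `punk.web`, the `sessionId` field on the open result is an assumption (the shape of WebSessionOpenResult is not shown above), and `runSteps` is a hypothetical helper.

```typescript
interface IntentSketch {
  action: "click" | "type" | "select" | "submit";
  target: string;
  value?: string;
}

interface ActResultSketch {
  result: { ok: boolean; error?: string };
  context: string;
}

// Local approximation of punk.web; sessionId on the open result is assumed.
interface WebClientSketch {
  openSession(url: string): Promise<{ sessionId: string }>;
  act(sessionId: string, intent: IntentSketch): Promise<ActResultSketch>;
  closeSession(sessionId: string): Promise<{ ok: boolean }>;
}

// Runs each step in order and collects the prompt-ready context after each one.
// Stops at the first failed action; the session is closed in every case.
async function runSteps(web: WebClientSketch, url: string, steps: IntentSketch[]): Promise<string[]> {
  const { sessionId } = await web.openSession(url);
  const contexts: string[] = [];
  try {
    for (const step of steps) {
      const { result, context } = await web.act(sessionId, step);
      if (!result.ok) {
        throw new Error(`${step.action} on ${step.target} failed: ${result.error ?? "unknown"}`);
      }
      contexts.push(context);
    }
    return contexts;
  } finally {
    // Best-effort close; idle sessions also expire on the server side.
    await web.closeSession(sessionId).catch(() => undefined);
  }
}
```

Closing in `finally` keeps a failed step from leaving a session open until the idle timeout.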

Governance levels (PRD §17): type/select are level 0 (session-local form state), click is level 1 (read:web), and submit is level 3, a write:web gated by the same policy engine as chat tools. A policy deny or approval_required verdict on a submit returns 403 with the verdict. Observe-mode keys can never submit (403, "observe-mode keys cannot perform web writes"), though their reads run normally.

Every action is audited, and every navigation destination (session open, link hrefs, form actions) is SSRF-guarded. Idle sessions auto-close after 5 minutes. Sessions are tenant-private: another tenant's key sees 404.
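The governance levels described above can be mirrored client-side to fail fast before a round trip. This is a hypothetical pre-check only: the gateway remains the authority, and `ACTION_LEVEL` and `precheckAction` are names invented for this sketch.

```typescript
type WebAction = "click" | "type" | "select" | "submit";

// Levels as stated in the text: type/select 0, click 1, submit 3.
const ACTION_LEVEL: Record<WebAction, number> = {
  type: 0,
  select: 0,
  click: 1,
  submit: 3,
};

// Mirrors one documented rule: observe-mode keys cannot submit.
// Policy deny and approval_required verdicts are decided server-side
// and are not modelled here.
function precheckAction(
  action: WebAction,
  observeMode: boolean,
): { allowed: boolean; level: number; reason?: string } {
  const level = ACTION_LEVEL[action];
  if (observeMode && action === "submit") {
    return { allowed: false, level, reason: "observe-mode keys cannot perform web writes" };
  }
  return { allowed: true, level };
}
```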


Read APIs

savings(): Promise<SavingsSummary>

GET /api/v1/savings. Tenant rollup: totalRuns, liveRuns, optimizedRuns, blockedRuns, totalCostUsd, totalSavedUsd, ghostSavedUsd (observe-mode "would have saved" accounting), totalSavedMs, cacheHitRate, artifactHitRate, somTokensSaved.

patterns(): Promise<Pattern[]>

GET /api/v1/patterns, unwraps { patterns } ([] if absent). Each Pattern carries its lifecycle state (observed → candidate → … → promoted, or negative/retired), fingerprints, runCount, cost/latency averages, stabilityScore, and optimizableScore.

artifacts(): Promise<Artifact[]>

GET /api/v1/artifacts, unwraps { artifacts } ([] if absent). Each Artifact carries state, type, the declarative representation (interpreted DSL — never generated code), provenance (source/holdout run ids), Beta-posterior confidence (alpha, beta, confidence), and replay/shadow/live pass-fail counters.
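The "unwraps, [] if absent" behavior shared by patterns() and artifacts() amounts to a small defensive helper. A minimal sketch, assuming the response body has already been parsed as JSON; `unwrapList` is a hypothetical name, not an SDK export.

```typescript
// Returns body[key] when it is an array, otherwise the documented [] fallback.
// The element type T is asserted, not validated.
function unwrapList<T>(body: unknown, key: string): T[] {
  if (typeof body !== "object" || body === null) return [];
  const value = (body as Record<string, unknown>)[key];
  return Array.isArray(value) ? (value as T[]) : [];
}
```

Callers can therefore iterate the result directly without a null check, for example `for (const p of unwrapList(body, "patterns"))`.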

artifactDetail(id: string): Promise<ArtifactDetail>

GET /api/v1/artifacts/:id.

interface ArtifactDetail {
  artifact: Artifact;
  evaluations: ArtifactEvaluation[]; // replay/shadow/live evidence rows
  pattern: Pattern | null;           // the source pattern
}

runDetail(id: string): Promise<RunDetail>

GET /api/v1/runs/:id.

interface RunDetail {
  run: Run;                        // includes routeExplanation
  events: TraceEvent[];            // the full append-only trace
  sideEffects: SideEffectRecord[]; // planned/executed/suppressed/blocked
}

run.routeExplanation is the audit story: route, reason, rejected alternatives, policy verdict, cache/artifact details, estimated savings, fallback.

cacheStats(): Promise<CacheStats>

GET /api/v1/cache/stats. Returns { stats: Array<{ cacheType, entries, hits }> }, one entry per tier (exact_response, tool_result, som, negative, …).


Learning lifecycle

learningTick(): Promise<LearningReport>

POST /api/v1/learning/tick. Forces one learning pass (it also runs on a timer inside the gateway). Returns at least:

interface LearningReport {
  artifactsSynthesized: number;
  promotionsEligible: string[]; // artifact ids that passed the gates
  autoPromoted: string[];       // promoted hands-free (PUNK_AUTO_PROMOTE)
  synthesisReports?: Array<Record<string, unknown>>;
  [key: string]: unknown;
}

promoteArtifact(id: string): Promise<Artifact>

POST /api/v1/artifacts/:id/promote, unwraps { artifact }. The gateway enforces the promotion gate — replay evidence plus shadow agreement; side-effect-bearing artifacts additionally require operator action. Throws on non-2xx, including "gate not satisfied" rejections.
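An operator script might combine learningTick and promoteArtifact as follows. This is a sketch against a local interface so it can run with a stub: `PromoterClient` and `tickAndPromote` are hypothetical, and skipping ids that already appear in `autoPromoted` assumes the two lists can overlap, which the reference does not state.

```typescript
interface LearningReportSketch {
  promotionsEligible: string[];
  autoPromoted: string[];
}

// The two SDK methods this sketch relies on, typed locally.
interface PromoterClient {
  learningTick(): Promise<LearningReportSketch>;
  promoteArtifact(id: string): Promise<unknown>;
}

// Forces one learning pass, then tries to promote every eligible artifact.
// Gate rejections are collected instead of aborting the loop.
async function tickAndPromote(client: PromoterClient) {
  const report = await client.learningTick();
  const already = new Set(report.autoPromoted);
  const promoted: string[] = [];
  const rejected: Array<{ id: string; reason: string }> = [];
  for (const id of report.promotionsEligible) {
    if (already.has(id)) continue; // assumed: already promoted by the gateway
    try {
      await client.promoteArtifact(id);
      promoted.push(id);
    } catch (err) {
      // promoteArtifact throws on non-2xx, including gate rejections.
      rejected.push({ id, reason: err instanceof Error ? err.message : String(err) });
    }
  }
  return { promoted, rejected };
}
```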


Tool-result cache (low level)

traceTool calls these for you; they're public for manual integration.

toolCacheCheck(toolName: string, args: unknown): Promise<{ hit: boolean; result?: unknown }>

POST /api/v1/tool-cache/check with { toolName, subject, args }. Never throws — any failure returns { hit: false }.

toolCacheStore(toolName: string, args: unknown, result: unknown, ttlSeconds?: number): Promise<void>

POST /api/v1/tool-cache/store with { toolName, subject, args, result, ttlSeconds }. Never throws — caching is an optimization, not a failure mode.


Error behavior summary

| Surface | On failure |
| --- | --- |
| chat, trace, feedback, fetchSom, all read APIs, learningTick, promoteArtifact | throws Error("Punk API <METHOD> <path> failed: <status> <statusText> — <body, first 500 chars>") |
| Tracing inside traceTool | swallowed; the tool call succeeds untraced |
| toolCacheCheck | degrades to { hit: false } |
| toolCacheStore | swallowed |
| def.execute inside a traced tool | propagates unchanged |

There are no retries in the SDK; the gateway is local-first and the router fails open server-side.
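Since the SDK does not retry, callers that want retries have to add them. The sketch below is one possible caller-side wrapper, not part of the SDK: `statusFromPunkError` and `withRetry` are hypothetical, the status is recovered by parsing the documented error message format, and the policy of retrying only network errors and 5xx responses is an assumption. It should only wrap calls that are safe to repeat.

```typescript
// Extracts the HTTP status from an error in the documented message format,
// or returns null when the error has no recognizable status (e.g. network failure).
function statusFromPunkError(err: unknown): number | null {
  const match = err instanceof Error ? /failed: (\d{3}) /.exec(err.message) : null;
  return match ? Number(match[1]) : null;
}

// Retries `call` up to `attempts` times with exponential backoff.
// 4xx responses (for example policy blocks) are rethrown immediately,
// because repeating the request would not change the outcome.
async function withRetry<T>(
  call: () => Promise<T>,
  attempts = 3,
  baseDelayMs = 100,
): Promise<T> {
  let lastError: unknown;
  for (let i = 0; i < attempts; i++) {
    try {
      return await call();
    } catch (err) {
      lastError = err;
      const status = statusFromPunkError(err);
      if (status !== null && status < 500) break;
      if (i < attempts - 1) {
        await new Promise<void>((resolve) => setTimeout(resolve, baseDelayMs * 2 ** i));
      }
    }
  }
  throw lastError;
}
```

Read calls such as savings() or patterns() are natural candidates; wrapping a level 3 web submit this way could repeat a write.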

Properties

punk.baseUrl, punk.app, punk.agent, punk.subject are readable on the instance. The API key is private.