Production shape, workers, backups, retention, security, and health.

Operations

This guide covers running Punk locally and operating it in a deployed environment.

Local Development

bun install
bun run dev

Local endpoints:

  • Dashboard: http://localhost:4100
  • Chat gateway: http://localhost:4100/v1/chat/completions
  • Health: http://localhost:4100/health

In another terminal:

bun run demo

Production Shape

Minimum production process layout:

api process      bun run dev or equivalent entrypoint
worker process   bun run worker
database         Postgres via PUNK_DATABASE_URL

Recommended production settings:

  • Set PUNK_API_KEY.
  • Set PUNK_DATABASE_URL.
  • Set live provider variables.
  • Keep PUNK_ALLOW_PRIVATE_WEB_FETCH=false unless explicitly needed.
  • Keep PUNK_ALLOW_PRIVATE_WEBHOOKS=false unless explicitly needed.
  • Use observe mode API keys for first customer traffic.
  • Enable tenant redaction where tool payloads may include secrets.
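
The settings above can be checked before startup. The following is a minimal sketch, not part of Punk: the function name checkProductionEnv is hypothetical, and it assumes the two PUNK_ALLOW_PRIVATE_* flags are enabled only by the literal value "true".

```typescript
// Hypothetical pre-flight check for the recommended production settings.
// Assumption: the PUNK_ALLOW_PRIVATE_* flags count as enabled only when "true".
type Env = Record<string, string | undefined>;

function checkProductionEnv(env: Env): string[] {
  const problems: string[] = [];

  // Variables the guide says must be set in production.
  for (const name of ["PUNK_API_KEY", "PUNK_DATABASE_URL"]) {
    const value = (env[name] ?? "").trim();
    if (value === "") problems.push(`${name} is not set`);
  }

  // Flags the guide says should stay false unless explicitly needed.
  for (const name of ["PUNK_ALLOW_PRIVATE_WEB_FETCH", "PUNK_ALLOW_PRIVATE_WEBHOOKS"]) {
    const value = (env[name] ?? "").trim().toLowerCase();
    if (value === "true") problems.push(`${name} is enabled; confirm this is intended`);
  }

  return problems;
}
```

Passing the process environment to checkProductionEnv and refusing to start when the returned list is non-empty would be one way to use it.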

Health Checks

Use:

curl http://localhost:4100/health

Expected fields:

  • ok
  • version
  • provider
  • plasmate
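
A monitoring script can verify that the response carries the expected fields. This is a sketch only: missingHealthFields is a hypothetical helper, and it checks for the presence of each field without assuming anything about the field types or values.

```typescript
// Field names taken from the list above; value types are not assumed.
const EXPECTED_HEALTH_FIELDS = ["ok", "version", "provider", "plasmate"];

// Hypothetical helper: returns the expected fields that are absent
// from a parsed /health response body.
function missingHealthFields(body: unknown): string[] {
  if (typeof body !== "object" || body === null) {
    return [...EXPECTED_HEALTH_FIELDS];
  }
  const record = body as Record<string, unknown>;
  return EXPECTED_HEALTH_FIELDS.filter((field) => !(field in record));
}
```

Feed it the parsed JSON from the curl call above. An empty result means every expected field is present.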

Worker

The API process starts an embedded worker by default. For deployed Postgres-backed environments, run a separate worker:

bun run worker

The job system handles retries, durable queues, and stats exposed through /api/v1/jobs.

Backups

SQLite:

  • Back up data/punk.db.
  • Include WAL/SHM files during live backup or stop the process first.
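
For reference, SQLite in WAL mode keeps its sidecar files next to the database, named with -wal and -shm suffixes. A small sketch (sqliteBackupFiles is a hypothetical helper) that lists the files a live backup should copy:

```typescript
// Hypothetical helper: the database file plus SQLite's WAL-mode sidecar files.
// The -wal and -shm files may not exist if the database is not in WAL mode
// or was cleanly closed, so a backup script should tolerate missing files.
function sqliteBackupFiles(dbPath: string): string[] {
  return [dbPath, `${dbPath}-wal`, `${dbPath}-shm`];
}
```

For the default path this yields data/punk.db, data/punk.db-wal, and data/punk.db-shm.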

Postgres:

  • Use provider-native snapshots.
  • Confirm retention covers audit and replay needs.

Critical data:

  • Runs.
  • Trace events.
  • Audit events.
  • API key metadata.
  • Settings.
  • Patterns/artifacts/evaluations.
  • Approvals and policy exceptions.

Retention

PUNK_RETENTION_DAYS controls the defaults for retention sweeps. A sweep deletes a tenant's old runs, events, and audit records. Confirm the value matches customer and compliance expectations before enabling retention in production.
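
To see what a given value would delete, it helps to compute the cutoff date it implies. The sketch below assumes the sweep removes records older than the given number of 24-hour days before "now"; how Punk actually computes the cutoff is not stated here, and retentionCutoff is a hypothetical helper.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Hypothetical helper. Assumption: records older than `retentionDays`
// full 24-hour days before `now` are eligible for deletion.
function retentionCutoff(now: Date, retentionDays: number): Date {
  if (!Number.isFinite(retentionDays) || retentionDays <= 0) {
    // Assumption for this sketch only: non-positive values are rejected.
    throw new RangeError("retentionDays must be a positive number");
  }
  return new Date(now.getTime() - retentionDays * MS_PER_DAY);
}
```

Comparing the cutoff against the oldest audit or replay data a customer needs is a quick sanity check before turning retention on.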

Security Checklist

Before exposing Punk publicly:

  • Set PUNK_API_KEY.
  • Use HTTPS at the edge.
  • Use Postgres for durable multi-process operation.
  • Create scoped tenant keys instead of sharing the bootstrap admin token.
  • Pin keys to app IDs where possible.
  • Use observe mode before optimize mode for new tenants.
  • Set rate limits.
  • Keep private web fetch disabled unless required.
  • Keep private webhooks disabled unless required.
  • Enable redaction for sensitive tool traces.
  • Review policies in PUNK_POLICIES_DIR.

Rate Limits

Defaults:

  • API: 300 RPM per caller.
  • Chat: 600 RPM per caller.

Set either to 0 to disable. The current limiter is in-process, so each instance enforces its own counters. Multi-instance rate limiting would require a shared backend and is not currently supported.
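
To illustrate what "in-process" implies, here is a minimal fixed-window limiter. It is not Punk's limiter: the algorithm is a stand-in chosen for brevity, and the class name is hypothetical. The counters live in a Map inside one process, so a second instance would start with its own empty counters.

```typescript
// Illustrative fixed-window limiter; not Punk's actual implementation.
class FixedWindowLimiter {
  private readonly limitPerMinute: number;
  // Per-caller window state, held only in this process's memory.
  private readonly windows = new Map<string, { start: number; count: number }>();

  constructor(limitPerMinute: number) {
    this.limitPerMinute = limitPerMinute;
  }

  // Returns true if the caller may proceed at time `nowMs` (milliseconds).
  allow(caller: string, nowMs: number): boolean {
    if (this.limitPerMinute === 0) return true; // 0 disables limiting

    const current = this.windows.get(caller);
    if (current === undefined || nowMs - current.start >= 60000) {
      // First request, or the previous one-minute window has expired.
      this.windows.set(caller, { start: nowMs, count: 1 });
      return true;
    }
    if (current.count >= this.limitPerMinute) return false;
    current.count += 1;
    return true;
  }
}
```

With the documented defaults, this would be constructed with 300 for API traffic and 600 for chat traffic. Running N instances behind a load balancer lets a caller reach up to N times the configured limit, which is why a shared backend is needed for true multi-instance limits.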

Webhooks

The tenant setting webhook_url must point to a public address unless PUNK_ALLOW_PRIVATE_WEBHOOKS=true.

When webhook_secret is configured, use it for signing and verification. The settings GET response does not return the secret.

Deployment Notes

Punk includes vercel.json and scripts/vercel-build.ts, but a long-running gateway with a worker is usually better suited to a persistent runtime. If you deploy to a serverless platform, confirm:

  • Database adapter behavior.
  • Worker scheduling.
  • Durable jobs.
  • Long-lived streaming.
  • File-system assumptions.

Observability

Use:

  • /health for uptime.
  • /api/v1/savings for value metrics.
  • /api/v1/jobs for background work.
  • /api/v1/audit for governance events.
  • /api/v1/runs for request-level behavior.
  • /api/v1/cache/stats for cache performance.
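
A monitoring job can derive the full URLs from a single base address. This is a sketch: observabilityUrls is a hypothetical helper, and any authentication the /api/v1 endpoints require is left out because it is not described here.

```typescript
// Paths taken from the list above.
const OBSERVABILITY_PATHS = [
  "/health",
  "/api/v1/savings",
  "/api/v1/jobs",
  "/api/v1/audit",
  "/api/v1/runs",
  "/api/v1/cache/stats",
];

// Hypothetical helper: joins each path onto the base URL,
// tolerating trailing slashes on the base.
function observabilityUrls(baseUrl: string): string[] {
  const base = baseUrl.replace(/\/+$/, "");
  return OBSERVABILITY_PATHS.map((path) => base + path);
}
```

For local development the base would be http://localhost:4100; in production, substitute the public HTTPS address.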