Operations
This guide covers running Punk locally and operating it in a deployed environment.
Local Development
bun install
bun run dev
Local endpoints:
- Dashboard: http://localhost:4100
- Chat gateway: http://localhost:4100/v1/chat/completions
- Health: http://localhost:4100/health
In another terminal:
bun run demo
Production Shape
Minimum production process layout:
- api process: bun run dev or equivalent entrypoint
- worker process: bun run worker
- database: Postgres via PUNK_DATABASE_URL
Recommended production settings:
- Set PUNK_API_KEY.
- Set PUNK_DATABASE_URL.
- Set live provider variables.
- Keep PUNK_ALLOW_PRIVATE_WEB_FETCH=false unless explicitly needed.
- Keep PUNK_ALLOW_PRIVATE_WEBHOOKS=false unless explicitly needed.
- Use observe mode API keys for first customer traffic.
- Enable tenant redaction where tool payloads may include secrets.
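The settings above can be checked before a deploy. The following is a minimal sketch, not part of Punk itself: `checkProductionEnv` is a hypothetical helper, and only the variable names are taken from this guide. It flags required variables that are missing and private-network flags that have been switched on.

```typescript
// Hypothetical pre-flight check; Punk's own startup validation may differ.
type Env = Record<string, string | undefined>;

const REQUIRED = ["PUNK_API_KEY", "PUNK_DATABASE_URL"];
const KEEP_FALSE = ["PUNK_ALLOW_PRIVATE_WEB_FETCH", "PUNK_ALLOW_PRIVATE_WEBHOOKS"];

function checkProductionEnv(env: Env): string[] {
  const problems: string[] = [];

  // Required settings named in this guide.
  for (const name of REQUIRED) {
    const value = env[name];
    if (value === undefined || value.trim() === "") {
      problems.push(`${name} is not set`);
    }
  }

  // Flags that should stay false unless explicitly needed.
  for (const name of KEEP_FALSE) {
    if (env[name] === "true") {
      problems.push(`${name}=true; confirm this is intentional`);
    }
  }

  return problems;
}
```

Calling `checkProductionEnv(process.env)` in a deploy script and failing on a non-empty result is one way to use it.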
Health Checks
Use:
curl http://localhost:4100/health
Expected fields:
- ok
- version
- provider
- plasmate
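A scripted probe can confirm that the response carries the expected fields (ok, version, provider, plasmate). This sketch assumes the endpoint returns a JSON object. The guide does not specify field types, so only presence is checked. `missingHealthFields` and `checkHealth` are hypothetical names.

```typescript
// Field names come from this guide; the JSON response shape is an assumption.
const EXPECTED_HEALTH_FIELDS = ["ok", "version", "provider", "plasmate"];

// Returns the expected fields that are absent from a parsed response body.
function missingHealthFields(body: unknown): string[] {
  if (typeof body !== "object" || body === null) {
    return [...EXPECTED_HEALTH_FIELDS];
  }
  const record = body as Record<string, unknown>;
  return EXPECTED_HEALTH_FIELDS.filter((field) => !(field in record));
}

// Fetches /health from the given base URL and reports any missing fields.
async function checkHealth(baseUrl: string): Promise<string[]> {
  const response = await fetch(`${baseUrl}/health`);
  if (!response.ok) {
    throw new Error(`health check returned HTTP ${response.status}`);
  }
  return missingHealthFields(await response.json());
}
```

For a local instance, `await checkHealth("http://localhost:4100")` should resolve to an empty array when all four fields are present.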
Worker
The API process starts an embedded worker by default. For deployed Postgres-backed environments, run a separate worker:
bun run worker
The job system handles retries, durable queues, and stats exposed through /api/v1/jobs.
Backups
SQLite:
- Back up data/punk.db.
- Include WAL/SHM files during live backup or stop the process first.
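As an illustration of the SQLite steps, the sketch below copies the database file together with its `-wal` and `-shm` sidecar files, which is SQLite's standard naming for write-ahead-log files. `backupFileList` and `backupSqlite` are hypothetical helpers, not Punk commands. A plain file copy of a live database is not guaranteed to be consistent, so stopping the process first remains the safer option.

```typescript
import { copyFileSync, existsSync, mkdirSync } from "node:fs";
import { basename, join } from "node:path";

// Lists the database file and its WAL-mode sidecar files.
function backupFileList(dbPath: string): string[] {
  return ["", "-wal", "-shm"].map((suffix) => dbPath + suffix);
}

// Copies whichever of those files exist into backupDir and returns the copies.
function backupSqlite(dbPath: string, backupDir: string): string[] {
  if (!existsSync(dbPath)) {
    throw new Error(`database not found: ${dbPath}`);
  }
  mkdirSync(backupDir, { recursive: true });

  const copied: string[] = [];
  for (const source of backupFileList(dbPath)) {
    if (!existsSync(source)) continue; // sidecars may be absent when the process is stopped
    const target = join(backupDir, basename(source));
    copyFileSync(source, target);
    copied.push(target);
  }
  return copied;
}
```

For example, `backupSqlite("data/punk.db", "backups/latest")` copies into a directory name chosen here purely for illustration.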
Postgres:
- Use provider-native snapshots.
- Confirm retention covers audit and replay needs.
Critical data:
- Runs.
- Trace events.
- Audit events.
- API keys metadata.
- Settings.
- Patterns/artifacts/evaluations.
- Approvals and policy exceptions.
Retention
PUNK_RETENTION_DAYS controls retention sweep defaults. Retention deletes old runs/events/audit records for a tenant. Confirm the value matches customer and compliance expectations before enabling in production.
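To sanity-check a candidate value before enabling the sweep, it can help to compute which records would fall outside the window. This sketch assumes the cutoff is simply "now minus N days". How Punk actually derives the cutoff from PUNK_RETENTION_DAYS is not documented here, and both function names are hypothetical.

```typescript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Assumed rule: anything created before (now - retentionDays) is eligible for deletion.
function retentionCutoff(now: Date, retentionDays: number): Date {
  if (!Number.isFinite(retentionDays) || retentionDays <= 0) {
    throw new RangeError("retentionDays must be a positive number");
  }
  return new Date(now.getTime() - retentionDays * MS_PER_DAY);
}

// True when a record created at createdAt is older than the retention window.
function isExpired(createdAt: Date, now: Date, retentionDays: number): boolean {
  return createdAt.getTime() < retentionCutoff(now, retentionDays).getTime();
}
```

Running `isExpired` over a sample of run timestamps gives a rough preview of how much data a given setting would remove.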
Security Checklist
Before exposing Punk publicly:
- Set PUNK_API_KEY.
- Use HTTPS at the edge.
- Use Postgres for durable multi-process operation.
- Create scoped tenant keys instead of sharing the bootstrap admin token.
- Pin keys to app ids where possible.
- Use observe mode before optimize mode for new tenants.
- Set rate limits.
- Keep private web fetch disabled unless required.
- Keep private webhooks disabled unless required.
- Enable redaction for sensitive tool traces.
- Review policies in PUNK_POLICIES_DIR.
Rate Limits
Defaults:
- API: 300 RPM per caller.
- Chat: 600 RPM per caller.
Set either to 0 to disable. The current limiter is in-process, so each instance counts requests independently. Rate limiting across multiple instances would require a shared backend and is not currently supported.
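To make the in-process limitation concrete, here is a minimal fixed-window limiter that keeps its counters in a local `Map`. The algorithm and the class name `FixedWindowLimiter` are assumptions for illustration, since this guide does not say which algorithm Punk uses. Only the per-caller RPM limit and the "0 disables" rule come from the text above.

```typescript
interface WindowState {
  start: number; // window start, in milliseconds
  count: number; // requests seen in this window
}

class FixedWindowLimiter {
  private readonly limitPerMinute: number;
  // State lives in this process only; another instance has its own map.
  private readonly windows = new Map<string, WindowState>();

  constructor(limitPerMinute: number) {
    this.limitPerMinute = limitPerMinute;
  }

  allow(caller: string, nowMs: number = Date.now()): boolean {
    if (this.limitPerMinute === 0) return true; // 0 disables limiting

    const state = this.windows.get(caller);
    if (state === undefined || nowMs - state.start >= 60000) {
      this.windows.set(caller, { start: nowMs, count: 1 });
      return true;
    }
    if (state.count >= this.limitPerMinute) return false;
    state.count += 1;
    return true;
  }
}
```

Because each process holds its own map, two instances behind a load balancer would each admit up to the full limit for the same caller. That is why multi-instance limiting needs shared state.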
Webhooks
Tenant setting webhook_url must be public unless PUNK_ALLOW_PRIVATE_WEBHOOKS=true.
Use webhook_secret for signing/verification when configured. The secret is not returned by settings GET.
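A receiver typically verifies a signed webhook by recomputing the signature over the raw request body. This guide does not specify Punk's signing scheme, so the sketch below assumes a hex-encoded HMAC-SHA256 of the body, which is a common convention. The algorithm, encoding, and function names are all assumptions and should be checked against the actual payloads.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

// Assumed scheme: hex-encoded HMAC-SHA256 over the raw body, keyed by webhook_secret.
function signPayload(secret: string, rawBody: string): string {
  return createHmac("sha256", secret).update(rawBody, "utf8").digest("hex");
}

// Compares the received signature to the expected one in constant time.
function verifySignature(secret: string, rawBody: string, receivedHex: string): boolean {
  const expected = Buffer.from(signPayload(secret, rawBody), "hex");
  const received = Buffer.from(receivedHex, "hex");
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

Verification should run against the exact bytes received, before any JSON parsing or re-serialization, because re-encoding can change the body and invalidate the signature.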
Deployment Notes
Punk includes vercel.json and scripts/vercel-build.ts, but long-running gateway plus worker behavior is usually better suited to a persistent runtime. If deploying serverless, confirm:
- Database adapter behavior.
- Worker scheduling.
- Durable jobs.
- Long-lived streaming.
- File-system assumptions.
Observability
Use:
- /health for uptime.
- /api/v1/savings for value metrics.
- /api/v1/jobs for background work.
- /api/v1/audit for governance events.
- /api/v1/runs for request-level behavior.
- /api/v1/cache/stats for cache performance.