Punk Docs - Learning Loop

Inspecting the Learning Loop

Punk's inductive compiler is a glass box, not a black box. Every time the learning loop tries to turn a pattern into a deterministic artifact, Punk persists the attempt, whether it succeeds, abstains, or fails replay, along with the compiler's verbatim reasoning. Operators can see exactly why an artifact did or did not synthesize and promote. Evidence, not magic.

The synthesis attempt log

Synthesis used to be ephemeral: the compiler's report lines were returned from POST /api/v1/patterns/:id/synthesize and then forgotten. Now every run of the compiler records a row in synthesis_attempts:

| Field | Meaning |
| --- | --- |
| `outcome` | `synthesized` · `aborted` · `replay_failed` |
| `reason` | the one-line verdict (the abort line, or the synthesized state reason) |
| `report` | the compiler's full report-line array, verbatim |
| `sampleCount` | aligned live samples the compiler loaded |
| `holdoutAccuracy` | self-verification accuracy on the chronological holdout (when reached) |
| `artifactId` | the artifact produced, when one was |
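
For client code that reads these rows, the fields above can be modelled as a small typed record. The sketch below is illustrative only: the field names come from the table, but the Python types, the optionality of `holdoutAccuracy` and `artifactId`, and the helper name `attempt_from_json` are assumptions, not Punk's actual schema.

```python
from dataclasses import dataclass, field
from typing import List, Optional

# Outcome values exactly as listed in the table above.
OUTCOMES = ("synthesized", "aborted", "replay_failed")


@dataclass
class SynthesisAttempt:
    outcome: str
    reason: str
    report: List[str] = field(default_factory=list)
    sample_count: int = 0
    # Assumed optional: the table says "when reached" / "when one was".
    holdout_accuracy: Optional[float] = None
    artifact_id: Optional[str] = None

    def __post_init__(self) -> None:
        # Reject outcomes that the documentation does not list.
        if self.outcome not in OUTCOMES:
            raise ValueError(f"unknown outcome: {self.outcome!r}")


def attempt_from_json(row: dict) -> SynthesisAttempt:
    """Map one JSON attempt row (camelCase keys) onto the record (hypothetical helper)."""
    return SynthesisAttempt(
        outcome=row["outcome"],
        reason=row["reason"],
        report=list(row.get("report", [])),
        sample_count=int(row.get("sampleCount", 0)),
        holdout_accuracy=row.get("holdoutAccuracy"),
        artifact_id=row.get("artifactId"),
    )
```

Validating `outcome` at construction time means an unexpected value surfaces immediately instead of being silently rendered in a report.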

Both writers record the same shape:

- The learning loop (LearningLoop.tick) records an attempt for each candidate it compiles: aborted when the compiler abstains, synthesized when replay proves the program, replay_failed when it doesn't generalize.
- The manual endpoint (POST /api/v1/patterns/:id/synthesize) records an attempt every time an operator hits Synthesize now.
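
The three outcomes described above reduce to a small decision. The following is a minimal sketch of that mapping, not Punk's implementation: the function name and its two boolean inputs are hypothetical stand-ins for whatever the compiler and the replay step actually return.

```python
def classify_attempt(compiler_abstained: bool, replay_passed: bool) -> str:
    """Map a compile-and-replay result to one of the documented outcomes.

    Both parameters are assumptions made for illustration.
    """
    if compiler_abstained:
        # The compiler declined to produce a program at all.
        return "aborted"
    if replay_passed:
        # Replay proved the induced program.
        return "synthesized"
    # A program was produced, but replay showed it does not generalize.
    return "replay_failed"
```

Checking abstention first reflects the order in the text: replay only matters once the compiler has produced a program.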

The report lines are the gold. They narrate each decision the compiler made:

compiling pattern "support-triage" (pat_…, taskClass schema_deterministic)
loaded 14 live samples → 10 train / 4 holdout (chronological split)
field 'priority': induced 3-rule decision list
field 'note': no deterministic program found, delegating to small-model fallback (hybrid)
self-verification: train accuracy 100.0% (10 samples), holdout accuracy 100.0% (4 samples)
aborting: holdout accuracy 62.0% < 80% floor; the induced program does not generalize
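
Because the report is stored verbatim, it can also be mined by scripts. The sketch below extracts the sample split and the abort verdict from report lines shaped like the examples above. The regular expressions are inferred from those examples only; the real line formats may differ or change, so treat the patterns and the `summarize_report` name as assumptions.

```python
import re
from typing import Dict, List

# Patterns inferred from the example report lines; not a documented format.
SPLIT_RE = re.compile(r"loaded (\d+) live samples → (\d+) train / (\d+) holdout")
ABORT_RE = re.compile(r"^aborting: (.+)$")
FLOOR_RE = re.compile(r"holdout accuracy ([\d.]+)% < ([\d.]+)% floor")


def summarize_report(lines: List[str]) -> Dict[str, object]:
    """Pull a few structured facts out of verbatim report lines."""
    summary: Dict[str, object] = {}
    for line in lines:
        split = SPLIT_RE.search(line)
        if split:
            samples, train, holdout = (int(g) for g in split.groups())
            summary.update(samples=samples, train=train, holdout=holdout)
        abort = ABORT_RE.match(line)
        if abort:
            summary["abort_reason"] = abort.group(1)
            floor = FLOOR_RE.search(line)
            if floor:
                summary["holdout_accuracy_pct"] = float(floor.group(1))
                summary["floor_pct"] = float(floor.group(2))
    return summary
```

Lines that match none of the patterns are ignored, so the function tolerates report lines it does not know about.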

Endpoints

- GET /api/v1/learning/attempts?patternId=&limit= → `{ attempts }`: the log, newest-first, tenant-scoped (read-only, not admin). Omit patternId for the whole tenant.
- GET /api/v1/patterns/:id/evidence → `{ pattern, artifacts, attempts, routeArms }`: everything known about one pattern's optimization, aggregated. Each artifact carries an evaluation summary (replay/shadow pass-fail counts) and a confidence trajectory (the running confidence as evidence accrued, for an inline sparkline).
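
As a rough illustration of calling these endpoints, the helpers below build the two request URLs with the standard library. The paths and query parameter names are taken from the list above; the base URL (`https://punk.example` in the usage note) is a placeholder, and authentication is left out because this page does not describe it.

```python
from typing import Optional
from urllib.parse import quote, urlencode


def attempts_url(base_url: str, pattern_id: Optional[str] = None,
                 limit: Optional[int] = None) -> str:
    """URL for the attempt log; omit pattern_id to query the whole tenant."""
    params = {}
    if pattern_id is not None:
        params["patternId"] = pattern_id
    if limit is not None:
        params["limit"] = limit
    url = base_url.rstrip("/") + "/api/v1/learning/attempts"
    query = urlencode(params)
    return f"{url}?{query}" if query else url


def evidence_url(base_url: str, pattern_id: str) -> str:
    """URL for the aggregated evidence of one pattern."""
    # Percent-encode the id so it is safe as a single path segment.
    return f"{base_url.rstrip('/')}/api/v1/patterns/{quote(pattern_id, safe='')}/evidence"
```

For example, `attempts_url("https://punk.example", "pat_123", 20)` yields a URL ending in `/api/v1/learning/attempts?patternId=pat_123&limit=20`, which can then be fetched with any HTTP client.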

The Learning view

The dashboard's Learning view (#/learning) turns the log into a panel:

- A pattern-centric table: name, state, runs, stability, task class, and the latest synthesis outcome with a reason snippet.
- Click a pattern → the evidence panel: the lifecycle pipeline, the synthesis attempt log (each attempt's outcome badge, reason, and report lines in a mono console block), per-artifact replay/shadow pass-fail bars, a confidence-trajectory sparkline, and the pattern's route arms.
- A Synthesize now button runs the compiler against the pattern and refreshes the attempt log, so an operator can watch the compiler reason in real time.

This is the same evidence the promotion gate uses, surfaced so a human can audit the runtime's decisions.
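
The confidence-trajectory sparkline can be approximated outside the dashboard, for instance in a terminal. The sketch below is an assumption-heavy stand-in: this page does not say how Punk computes confidence, so a running pass rate over replay/shadow results is used as a placeholder, and the function names are hypothetical.

```python
from typing import List

# Eight block characters, lowest to highest.
BARS = "▁▂▃▄▅▆▇█"


def running_confidence(results: List[bool]) -> List[float]:
    """Running pass rate after each evaluation (placeholder for Punk's confidence)."""
    trajectory = []
    passes = 0
    for count, passed in enumerate(results, start=1):
        if passed:
            passes += 1
        trajectory.append(passes / count)
    return trajectory


def sparkline(values: List[float]) -> str:
    """Render values in [0, 1] as a one-line text sparkline."""
    chars = []
    for value in values:
        clamped = min(max(value, 0.0), 1.0)
        # Scale to a bar index; 1.0 maps to the tallest bar.
        index = min(int(clamped * len(BARS)), len(BARS) - 1)
        chars.append(BARS[index])
    return "".join(chars)
```

Calling `sparkline(running_confidence(results))` on a list of pass/fail booleans gives a compact view of how the pass rate moved as evidence accrued.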
