
The synthesis attempt log, the evidence view, and inspecting why an artifact did or didn't synthesize.

Inspecting the Learning Loop

Punk's inductive compiler is a glass box, not a black box. Every time the learning loop tries to turn a pattern into a deterministic artifact, Punk persists the attempt, including the compiler's verbatim reasoning, whether the attempt succeeds, abstains, or fails replay. Operators can see exactly why an artifact did or didn't synthesize and promote. Evidence, not magic.

The synthesis attempt log

Synthesis used to be ephemeral: the compiler's report lines were returned from POST /patterns/:id/synthesize and then forgotten. Now every run of the compiler records a row in synthesis_attempts:

| field | meaning |
| --- | --- |
| outcome | synthesized · aborted · replay_failed |
| reason | the one-line verdict (the abort line, or the synthesized state reason) |
| report | the compiler's full report-line array, verbatim |
| sampleCount | aligned live samples the compiler loaded |
| holdoutAccuracy | self-verification accuracy on the chronological holdout (when reached) |
| artifactId | the artifact produced, when one was |
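For readers consuming these rows from code, the fields above could be modelled roughly as follows. This is a sketch only: the field names come from the table, but the TypeScript types, the optionality of holdoutAccuracy and artifactId, the 0-to-1 scale for holdoutAccuracy, and the hasArtifact helper are assumptions, not Punk's published schema.

```typescript
// Hypothetical shape of one synthesis_attempts row.
// Field names are from the table; types and optionality are assumptions.
type SynthesisOutcome = "synthesized" | "aborted" | "replay_failed";

interface SynthesisAttempt {
  outcome: SynthesisOutcome;
  reason: string;            // the one-line verdict
  report: string[];          // the compiler's report lines, verbatim
  sampleCount: number;       // aligned live samples the compiler loaded
  holdoutAccuracy?: number;  // assumed 0..1; absent if the holdout step was never reached
  artifactId?: string;       // absent when no artifact was produced
}

// Hypothetical helper: narrows to attempts that carry an artifact id.
function hasArtifact(
  a: SynthesisAttempt
): a is SynthesisAttempt & { artifactId: string } {
  return typeof a.artifactId === "string";
}

// Example row built from the sample report lines in this page.
const example: SynthesisAttempt = {
  outcome: "aborted",
  reason:
    "aborting: holdout accuracy 62.0% < 80% floor; the induced program does not generalize",
  report: [
    "loaded 14 live samples → 10 train / 4 holdout (chronological split)",
    "aborting: holdout accuracy 62.0% < 80% floor; the induced program does not generalize",
  ],
  sampleCount: 14,
  holdoutAccuracy: 0.62,
};
```

Keeping the two "when present" fields optional mirrors the table's wording ("when reached", "when one was") and forces callers to handle the abstain path explicitly.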

Both writers record the same shape:

  • The learning loop (LearningLoop.tick) records an attempt for each candidate it compiles: aborted when the compiler abstains, synthesized when replay proves the program, replay_failed when it doesn't generalize.

  • The manual endpoint (POST /api/v1/patterns/:id/synthesize) records an attempt every time an operator hits Synthesize now.
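The three outcomes described for the learning loop can be read as a small decision function. The sketch below is illustrative only: CompileResult and its abstained and replayPassed fields are hypothetical names invented for this example, not the real return type of the compiler or of LearningLoop.tick.

```typescript
type SynthesisOutcome = "synthesized" | "aborted" | "replay_failed";

// Hypothetical summary of one compiler run (names are assumptions).
interface CompileResult {
  abstained: boolean;      // the compiler declined to emit a program
  replayPassed?: boolean;  // meaningful only when a program was emitted and replayed
}

// Map a compiler run to the outcome recorded on the attempt row.
function classifyOutcome(result: CompileResult): SynthesisOutcome {
  if (result.abstained) {
    return "aborted";             // the compiler abstained
  }
  return result.replayPassed
    ? "synthesized"               // replay proved the program
    : "replay_failed";            // the program did not generalize
}
```

Checking abstention first means a missing replayPassed value can never be mistaken for a failed replay on a run that never produced a program.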

The report lines are the gold. They narrate each decision the compiler made:

compiling pattern "support-triage" (pat_…, taskClass schema_deterministic)
loaded 14 live samples → 10 train / 4 holdout (chronological split)
field 'priority': induced 3-rule decision list
field 'note': no deterministic program found, delegating to small-model fallback (hybrid)
self-verification: train accuracy 100.0% (10 samples), holdout accuracy 100.0% (4 samples)
aborting: holdout accuracy 62.0% < 80% floor; the induced program does not generalize
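Because the report is stored verbatim as an array of lines, a script can pull simple facts out of it. The sketch below assumes the line formats look exactly like the samples above; that wording is not documented as a stable contract, so treat the regular expression and the "aborting: " prefix as assumptions. parseSplit and abortReason are hypothetical helper names.

```typescript
interface SplitInfo {
  total: number;
  train: number;
  holdout: number;
}

// Find the "loaded N live samples → A train / B holdout" line, if any.
function parseSplit(report: string[]): SplitInfo | null {
  const pattern = /^loaded (\d+) live samples → (\d+) train \/ (\d+) holdout/;
  for (const line of report) {
    const match = pattern.exec(line);
    if (match) {
      return {
        total: Number(match[1]),
        train: Number(match[2]),
        holdout: Number(match[3]),
      };
    }
  }
  return null;
}

// Return the text after "aborting: " from the first abort line, or null.
function abortReason(report: string[]): string | null {
  const prefix = "aborting: ";
  for (const line of report) {
    if (line.startsWith(prefix)) {
      return line.slice(prefix.length);
    }
  }
  return null;
}
```

For anything beyond ad hoc inspection, the structured fields (outcome, reason, sampleCount, holdoutAccuracy) are likely a safer source than parsing free text.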

Endpoints

  • GET /api/v1/learning/attempts?patternId=&limit= → `{ attempts }`: the log, newest-first, tenant-scoped (read-only, not admin). Omit patternId for the whole tenant.

  • GET /api/v1/patterns/:id/evidence → `{ pattern, artifacts, attempts, routeArms }`: everything known about one pattern's optimization, aggregated. Each artifact carries an evaluation summary (replay/shadow pass-fail counts) and a confidence trajectory (the running confidence as evidence accrued, for an inline sparkline).
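As a rough illustration of calling these two endpoints, the helpers below build the request URLs. The paths and query parameter names are taken from the list above; the base URL, the pattern id, and the helper names attemptsUrl and evidenceUrl are hypothetical, and authentication is deliberately left out because this page does not describe it.

```typescript
interface AttemptsQuery {
  patternId?: string; // omit for the whole tenant
  limit?: number;
}

// Build the URL for GET /api/v1/learning/attempts.
function attemptsUrl(baseUrl: string, query: AttemptsQuery = {}): string {
  const parts: string[] = [];
  if (query.patternId !== undefined) {
    parts.push("patternId=" + encodeURIComponent(query.patternId));
  }
  if (query.limit !== undefined) {
    parts.push("limit=" + encodeURIComponent(String(query.limit)));
  }
  const suffix = parts.length > 0 ? "?" + parts.join("&") : "";
  return baseUrl + "/api/v1/learning/attempts" + suffix;
}

// Build the URL for GET /api/v1/patterns/:id/evidence.
function evidenceUrl(baseUrl: string, patternId: string): string {
  return baseUrl + "/api/v1/patterns/" + encodeURIComponent(patternId) + "/evidence";
}
```

The resulting strings can be passed to any HTTP client; the attempts response is expected to expose an attempts array, and the evidence response the pattern, artifacts, attempts, and routeArms keys listed above.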

The Learning view

The dashboard's Learning view (#/learning) turns the log into a panel:

  1. A pattern-centric table: name, state, runs, stability, task class, and the latest synthesis outcome with a reason snippet.

  2. Click a pattern → the evidence panel: the lifecycle pipeline, the synthesis attempt log (each attempt's outcome badge, reason, and report lines in a mono console block), per-artifact replay/shadow pass-fail bars, a confidence-trajectory sparkline, and the pattern's route arms.

  3. A Synthesize now button runs the compiler against the pattern and refreshes the attempt log, so an operator can watch the compiler reason in real time.
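To make the confidence-trajectory sparkline concrete, here is one way such a series could be drawn as text. It is not the dashboard's actual renderer: it assumes the trajectory is an array of confidence values between 0 and 1 (the page does not state the scale), and sparkline is a hypothetical helper name.

```typescript
// Eight block glyphs, lowest to highest.
const BARS = "▁▂▃▄▅▆▇█";

// Render confidence values (assumed to lie in 0..1) as a one-line sparkline.
function sparkline(values: number[]): string {
  return values
    .map((value) => {
      const clamped = Math.min(1, Math.max(0, value)); // guard against out-of-range input
      const index = Math.min(BARS.length - 1, Math.floor(clamped * BARS.length));
      return BARS.charAt(index);
    })
    .join("");
}
```

A rising line across an artifact's evaluations would suggest confidence building as replay and shadow evidence accrued, which is the pattern an operator would look for before promotion.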

This is the same evidence the promotion gate uses, surfaced so a human can audit the runtime's decisions.