Anti-Fragile Education Institute

Expert Encoding System

The mechanism that captures what makes any educator great and reproduces it at scale. Implementation plan for version one.

March 2026 · Internal document · Version 1.0

What We Are Encoding

Great educators are not great because of their content. They are great because of how they respond to student thinking. The way they praise, the way they challenge, what they notice first, what they refuse to let slide, how they adjust when a student is struggling versus thriving.

That responsive intelligence is what we are encoding. The word “voice” undersells it. We are encoding a pedagogical identity — a composite of seven distinct layers, each of which varies independently between educators.

The core insight: A single critique tells us how an educator responds to one type of work. Five critiques across the quality spectrum reveal how they modulate — and that modulation is the gold.

The current platform has one hardcoded educator voice — your voice — with 7 pedagogical moves, a critique structure, key phrases, and a teaching philosophy baked into four prompt files. It works beautifully for one workshop.

Expert Encoding is the mechanism that makes this a platform. It is the difference between a tool that delivers one workshop and a system that captures what makes any educator great and reproduces it at scale. It is the core IP.

The Seven Layers of Pedagogical Identity

Layer 1

Diagnostic Lens — What They See

Every educator looks at student work through a particular lens. You see everything through power, stewardship, and the CMRC anatomy. A business school professor sees market viability and unit economics. A fine art tutor sees intention, material honesty, and cultural positioning. This layer is the educator’s framework — the conceptual vocabulary they use to parse the world. It is the most fundamental layer and the hardest to capture because educators rarely articulate it explicitly. They just see differently.

In the current system: CMRC, reflect/affect, three problem types, stewardship. Lives in lecture-content.ts, referenced across all prompts.

Layer 2

Pedagogical Moves — How They Provoke

The seven moves in ideation-prompt.ts (Invert the assumption, Ask who loses, Name the data, Push the affordance, Scale the network, Challenge the obvious, Find the control) are your specific repertoire of provocations. These are not generic — they are signatures. One educator might always ask “who pays for this?” while another always asks “what happens in ten years?”

Layer 3

Evaluation Priorities — What They Reward and Punish

When two aspects of a submission are both strong, which one does the educator praise? When two aspects are both weak, which one do they address first? This ordering reveals values. You value “who loses?” thinking above technical sophistication. Another educator might prioritise originality above everything.

Layer 4

Tonal Register — How They Sound

Formality level, use of humour, directness versus indirection, questions versus statements, sentence rhythm. You use phrases like “less but better” and “that’s elegant product thinking.” Equally important: anti-patterns — things this educator would never say. “Great job!” is generic. “That’s a really interesting idea” is filler. Capturing what they wouldn’t say is as valuable as capturing what they would.

Layer 5

Scaffolding Strategy — How They Structure Feedback

Does the educator lead with strengths or weaknesses? Build to a crescendo or front-load the hard truth? End with a question or a statement? Your current strategy: Strengths → Provocations → Connections → Reflection. Another educator might use Problem → Reframe → Challenge → Next Steps. Or no fixed structure at all.

Layer 6

Reference Repertoire — What They Draw On

You reference Dieter Rams, Airbnb, Waze — drawing from design history, platform economics, and public services. Another educator might reference critical theory, scientific method, or case law. The reference repertoire signals intellectual context and shapes the conceptual world the student is invited into.

Layer 7

Calibration Intuition — How They Adjust to the Student

The most subtle and most important layer. A great educator speaks differently to a confused student than to a brilliant one. They know when to push harder and when to soften. This layer is about the differential — how does the educator’s behaviour change as a function of student quality? This is why we need synthetic students at different quality levels.

System Architecture

The encoding system sits alongside the existing workshop engine as a parallel tool for educators. It produces encoded prompt data that the existing API routes consume.

System overview (data flow):

Encoding UI → Calibration Loop → Encoded Prompt Blocks → assemblePrompt() + assembleFewShotMessages() → /api/critique, /api/chat (ideation), /api/summarise, /api/chat (lecture) → Student workshop experience

Key Architectural Principles

Architecture Decision: Storage Strategy

No auth or database tables exist in the current system. All state is localStorage-based.

  • Option A: localStorage for everything (consistent with current pattern, works immediately)
  • Option B: Supabase tables (future-proof, requires auth setup)

Recommendation: Use localStorage for encoding session state in version one, matching the existing workshop pattern. Create the Supabase migration SQL as a file ready to run when auth is in place. The encoding data shape is designed to map directly to the database schema, so migration will be straightforward. This lets us build and test the full encoding flow without blocking on auth infrastructure.
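
As a concrete sketch of the localStorage approach — mirroring the existing loadWorkshopState()/saveWorkshopState() pattern. The storage key name and SSR guard here are illustrative assumptions, not confirmed details of the existing helpers:

// encoding-state.ts sketch. Key name is hypothetical.
import type { EncodingSession } from "@/lib/encoding/types";

const STORAGE_KEY = "encoding-session";

export function loadEncodingSession(): EncodingSession | null {
  if (typeof window === "undefined") return null; // guard for server rendering
  const raw = window.localStorage.getItem(STORAGE_KEY);
  return raw ? (JSON.parse(raw) as EncodingSession) : null;
}

export function saveEncodingSession(session: EncodingSession): void {
  if (typeof window === "undefined") return;
  window.localStorage.setItem(STORAGE_KEY, JSON.stringify(session));
}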

The Encoding Process

The encoding process has five phases, each capturing different layers. Total time investment: approximately 90 minutes. This is deliberate — encoding an expert should feel like a serious professional process, not a quick form.

The minimum viable encoding is Phase 4 alone (critique encoding, ~30 minutes). This gives a critique prompt that sounds approximately right. The other phases extend encoding to ideation, briefing, and lecture. For version one, we build all five phases because we are encoding two known educators.

Phase 1 · Framework Interview · ~15 min

A structured AI-mediated conversation that captures Layers 1–3 (diagnostic lens, moves, evaluation priorities). Seven seed questions, each probed with follow-ups. The AI mirrors back what it hears and asks “is that right?” — forcing the educator to correct, refine, and sharpen.

Output: Populated Identity, Framework, Moves, and Evaluation blocks.

Questions: (1) What 3–5 concepts do you always return to? (2) What do you notice first in student work? (3) What’s your instinct when a student gives you a mediocre idea? (4) What do you never let slide? (5) What are your go-to provocations? (6) What vocabulary should students adopt? (7) What do you believe about teaching that colleagues disagree with?

Phase 2 · Ideation Encoding · ~20 min

The educator sees 3 synthetic students at different points in their ideation journey. For each, they write the provocation they would give. The AI then attempts the same provocation using the draft prompt. Side-by-side comparison and calibration.

Output: Calibrated ideation moves block + 2–3 few-shot provocation examples.

Phase 3 · Briefing Encoding · ~10 min

The educator sees synthetic student briefing responses and writes the summary they would give. The AI attempts the same summary. Side-by-side comparison.

Output: Calibrated briefing scaffold block + 1–2 few-shot summary examples.

Phase 4 · Critique Encoding · ~30 min

The core calibration loop. The educator critiques 5 synthetic student submissions spanning the full quality spectrum. After each, the AI attempts the same critique. The educator provides structured feedback on the gap. The prompt is updated between rounds. This is described in full in Section 05.

Output: Calibrated critique evaluation block, critique scaffold block, tone block, calibration block + 2–3 few-shot critique examples.

Phase 5 · Lecture Q&A Encoding · ~10 min

The educator answers 3 student questions from different angles — one conceptual, one practical, one off-topic. The AI attempts the same answers. Comparison reveals explanation style and intellectual honesty about limits.

Output: Calibrated lecture scaffold block + 2–3 few-shot Q&A examples.

The Calibration Loop

This is the heart of the system. Get this right and the encoding works. Get it wrong and you have a generic AI pretending to be a specific person.

The Loop: One Round

Each round of the critique calibration loop has four phases:

Single calibration round (flow):

A. Read & Critique → B. Self-Reflection → C. Side-by-Side → D. Gap Feedback → Meta-calibration updates prompt blocks → Next round uses improved prompt

Phase A: Read and Critique

The educator sees a synthetic student’s submission, presented exactly as the existing critique page presents it — same layout, same typography, same structure. Student background is shown as one sentence at the top. Quality band is not shown (to prevent bias). Provocation history is behind a toggle.

Below the submission: a single large text area. No structure imposed. No form fields. The prompt says: “Critique this student’s work as you would in person. Be yourself.” This is critical — the moment the interface imposes structure, you get structured responses instead of authentic ones.

Phase B: Self-Reflection

After saving the critique, four questions extract metacognitive process:

  1. “What was the first thing you noticed about this work?”
  2. “What did you choose NOT to say, and why?”
  3. “On a scale of 1–5, how confident are you in this student’s potential?”
  4. “If you could tell this student one thing, what would it be?”

Phase C: Side-by-Side Comparison

After the educator writes their critique, the AI generates its own critique of the same student using the current draft prompt. Two columns: left is “Your words,” right is “The encoded version.” Identical formatting. No value-laden labels like “human” versus “AI.”

Phase D: Gap Feedback

A structured feedback form. All fields are optional, but at least one must be completed:

  1. “What did the AI miss?” — Free text. Captures false negatives.
  2. “What did the AI say that you would never say?” — Free text. Captures anti-patterns. Extremely high-value signal.
  3. “How was the tone wrong?” — Multiple choice (too harsh / too soft / too generic / too formal / too casual / wrong emphasis) + free text.
  4. “Rate the AI’s attempt” — Slider, 1–10. Anchored: 1 = “completely wrong voice”, 5 = “recognisably similar but noticeably off”, 10 = “I could have written this.”
  5. “What’s the single most important thing to fix?” — Free text. Forces prioritisation.

Student Sequencing

The 5 synthetic students are presented in this deliberate order (not by band number):

Order · Band · Rationale
1st · Band 2 (Competent) · Easy to critique, establishes baseline
2nd · Band 4 (Surface-level) · Tests how the educator handles weaker work
3rd · Band 1 (Exceptional) · Tests nuance: finding genuine gaps in strong work
4th · Band 5 (Off-track) · Tests redirection and empathy
5th · Band 3 (Promising but unfocused) · Tests the ability to find the buried gem

What Happens Between Rounds

After each round of feedback:

  1. The system takes the educator’s gap analysis
  2. Feeds it to Claude as a meta-prompt: “Here is the educator’s critique and the AI’s attempt. Here is what the educator says is wrong. Update the system prompt to address these gaps.”
  3. Stores the updated prompt as a new version
  4. Uses the updated prompt for the next round’s AI critique

This creates a version history. Each version should be better than the last. The educator sees improvement in real time.
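
A minimal sketch of the between-rounds update on the client side. The request/response contract for /api/encoding/update-prompt is an assumption here; the route itself is specified in Step 08 of the build sequence:

// Between-rounds prompt update (sketch, assumed route contract).
import type { EncodedPrompt, EncodingRound } from "@/lib/encoding/types";

export async function nextPromptVersion(
  current: EncodedPrompt,
  round: EncodingRound
): Promise<EncodedPrompt> {
  const res = await fetch("/api/encoding/update-prompt", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({ current, round }),
  });
  // The server runs the meta-calibration prompt and returns the updated
  // EncodedPrompt with an incremented version number (per Step 08).
  return (await res.json()) as EncodedPrompt;
}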

Completion Criteria

Critique calibration is complete when the educator rates the AI’s attempt 7 or higher for two consecutive rounds (the threshold wired into Step 09 of the build sequence). The holdout test then confirms that the encoding generalises.

The Holdout Test

After encoding is complete, generate 2–3 new synthetic students the educator has never seen. The AI critiques them using the final encoded prompt. The educator rates: voice fidelity, diagnostic accuracy, calibration accuracy (each 1–10) and flags any anti-pattern violations. This is the truest measure of encoding quality.
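
A small sketch of the holdout pass check, using the holdoutScores shape defined under Data Shapes and Types and the 7+ threshold from the acceptance criteria later in this document:

// Holdout pass: 7+ on all three dimensions, no anti-pattern violations.
import type { EncodedEducator } from "@/lib/encoding/types";

export function holdoutPassed(encoding: EncodedEducator): boolean {
  const scores = encoding.holdoutScores;
  if (!scores) return false;
  return (
    scores.voiceFidelity >= 7 &&
    scores.diagnosticAccuracy >= 7 &&
    scores.calibrationAccuracy >= 7 &&
    !scores.antiPatternViolations
  );
}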

Synthetic Students

We need five distinct quality bands. Each represents a meaningfully different kind of student and tests a different dimension of the educator’s range.

Band 1 · Exceptional
Profile: Sharp insight, elegant execution, one non-obvious blind spot.
Tests: Can they praise without flattering? Can they find the one real gap in strong work?

Band 2 · Competent
Profile: Solid work, follows the framework, nothing surprising.
Tests: Can they distinguish “good enough” from “good”? Do they push for excellence or accept competence?

Band 3 · Promising but Unfocused
Profile: Strong instincts, scattered execution, tries to do too much.
Tests: Can they identify the buried gem and help the student focus?

Band 4 · Surface-Level
Profile: Uses the right words but hasn’t really thought deeply.
Tests: Can they name what’s missing without being cruel? How do they handle intellectual dishonesty?

Band 5 · Off-Track
Profile: Misunderstood the brief, or solved a problem nobody has.
Tests: Can they redirect without discouraging? How much ground truth do they preserve?

What Each Synthetic Student Contains

Each synthetic student is a complete WorkshopState object, matching the exact shape in workshop-state.ts. The full field list is given by the SyntheticStudent type under Data Shapes and Types.

The realism problem: AI-generated “bad” student work often reads like a parody. Real confused students are confused in specific, human ways: they mix up “model” and “render,” describe a database when they think they’re describing AI, write aspirational user journeys rather than grounded ones, bury one brilliant observation in mediocre thinking. The generation prompt must produce work that feels like a particular human wrote it.

Architecture Decision: Synthetic Student Generation

Options:

  • Option A: Generate via Claude API at encoding time (dynamic, but slow and unpredictable quality)
  • Option B: Pre-generate as static content like existing test personas (faster, more reliable)
  • Option C: Hybrid — pre-generate a default set, allow regeneration via API if the educator flags one as unrealistic

Recommendation: Option C (Hybrid). Pre-generate 5 static synthetic students per workshop (one per band) as the default set, stored in synthetic-students.ts alongside the existing test-personas.ts. Add an API route for on-demand regeneration. The pre-generated set can be carefully crafted for realism. The API regeneration handles edge cases. For version one with two workshops, we hand-craft all 10 synthetic students (5 per workshop) and automate later.

Provocation History Is Critical

The provocation history must be generated too, because it shows the student’s trajectory. A Band 1 student improves dramatically across 5 provocations. A Band 4 student nods along but doesn’t actually change their thinking. This trajectory is essential context for the educator’s critique.

Domain Coverage

The 5 synthetic students for each workshop should cover at least 4 of the 8 domains (Education, Health, Travel & Transport, Government, Retail, Manufacturing, Environment & Energy, Creative Arts & Culture), ensuring the educator encounters varied subject matter.

Modular Prompt Assembly

The final encoded system prompt is not one monolithic blob. It is assembled from discrete blocks that can be swapped depending on the interaction stage.

The Seven Prompt Blocks

Block · ~Tokens · Contents
Identity · 500 · Who you are. Teaching philosophy in 3–4 sentences. Core belief about education. How students describe your style.
Framework · 600 · Analytical framework. Named concepts. (CMRC, reflect/affect, three problem types, stewardship — or whatever the educator’s equivalent primitives are.)
Moves · 500 · 5–7 named intervention patterns, each with a one-sentence description and a characteristic phrase.
Evaluation · 400 · What you prioritise in student work, ranked. What you reward. What you challenge. What you never let slide.
Tone · 300 · Tonal register. Key phrases. Anti-patterns (things you would NEVER say). Sentence rhythm.
Calibration · 400 · How you adjust to student quality. “When a student is strong, do X. When they struggle, do Y.”
Scaffold · 300 · Structural template for this specific interaction type. (Stage-specific — one each for critique, ideation, briefing, lecture.)

Total declarative prompt: ~3,000 tokens. This leaves room for workshop-specific context (the submission, lecture content, etc.).

Stage-Specific Assembly

Each workshop stage uses a different combination of blocks:

Stage · Blocks used
Critique · Identity + Framework + Evaluation + Tone + Calibration + Critique Scaffold
Ideation · Identity + Framework + Moves + Tone + Calibration + Ideation Scaffold
Briefing · Identity + Framework + Tone + Briefing Scaffold
Lecture Q&A · Identity + Framework + Tone + Lecture Scaffold

Shared blocks (Identity, Framework, Tone) ensure consistency. Stage-specific blocks (Moves, Evaluation, Scaffolds) ensure appropriate behaviour.
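
A sketch of how assemblePrompt() could realise this mapping. The block ordering and join separator are assumptions; the table above defines the intent, and the EncodedPrompt type below defines the field names:

// Stage → block mapping inside assemble-prompt.ts (sketch).
import type { EncodedPrompt } from "@/lib/encoding/types";

type Stage = "critique" | "ideation" | "briefing" | "lecture";
type BlockKey = Exclude<keyof EncodedPrompt, "version">;

const STAGE_BLOCKS: Record<Stage, BlockKey[]> = {
  critique: ["identityBlock", "frameworkBlock", "critiqueEvaluationBlock",
             "toneBlock", "calibrationBlock", "critiqueScaffoldBlock"],
  ideation: ["identityBlock", "frameworkBlock", "ideationMovesBlock",
             "toneBlock", "calibrationBlock", "ideationScaffoldBlock"],
  briefing: ["identityBlock", "frameworkBlock", "toneBlock", "briefingScaffoldBlock"],
  lecture:  ["identityBlock", "frameworkBlock", "toneBlock", "lectureScaffoldBlock"],
};

export function assembleBlocks(prompt: EncodedPrompt, stage: Stage): string {
  // Concatenate the stage's blocks into one system prompt string.
  return STAGE_BLOCKS[stage].map((key) => prompt[key]).join("\n\n");
}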

Few-Shot Examples

The educator’s actual critiques become few-shot examples, placed in the messages array (not the system prompt):

// Assembled request for a real student critique. The system prompt is
// passed separately (as in chat(systemPrompt, messages)); only the few-shot
// pairs and the real submission live in the messages array.
system: "[assembled blocks for critique stage]"
messages: [
  { role: "user", content: "Here is a student submission: {Band 2 submission}" },
  { role: "assistant", content: "{Educator's actual Band 2 critique}" },
  { role: "user", content: "Here is a student submission: {Band 4 submission}" },
  { role: "assistant", content: "{Educator's actual Band 4 critique}" },
  { role: "user", content: "Here is a student submission: {real student's submission}" }
]

Two examples are the default — one strong student, one weaker student — showing the AI the range of the educator’s voice; a third can be added if the token budget allows.
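
A sketch of the critique-stage few-shot formatter. It assumes the stored examples already cover the strong/weak range described above, and the JSON.stringify submission formatting is a placeholder for whatever formatter the real route uses:

// Few-shot formatter for the critique stage (sketch).
import type { EncodedEducator } from "@/lib/encoding/types";

type FewShotMessage = { role: "user" | "assistant"; content: string };

export function critiqueFewShots(
  encoding: EncodedEducator,
  maxExamples = 2 // drop to 1 if the token budget is tight
): FewShotMessage[] {
  return encoding.critiqueExamples.slice(0, maxExamples).flatMap((example) => [
    {
      role: "user" as const,
      content: `Here is a student submission:\n${JSON.stringify(example.submission)}`,
    },
    { role: "assistant" as const, content: example.critique },
  ]);
}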

The Integration Surface

// These two functions are the entire integration surface.
// Everything else in the encoding system exists to produce
// the data that feeds into them.

function assemblePrompt(
  encoding: EncodedEducator,
  stage: "critique" | "ideation" | "briefing" | "lecture",
  context: Record<string, unknown>
): string;

function assembleFewShotMessages(
  encoding: EncodedEducator,
  stage: "critique" | "ideation" | "briefing" | "lecture"
): Array<{ role: "user" | "assistant"; content: string }>;

Data Shapes and Types

All types live in src/lib/encoding/types.ts. They import from the existing workshop-state.ts types.

import type { WorkshopState, SubmissionData } from "@/lib/workshop-state";

// --- Synthetic Students ---

export type QualityBand = 1 | 2 | 3 | 4 | 5;

export interface SyntheticStudent {
  id: string;
  personaName: string;
  background: string;          // one sentence
  qualityBand: QualityBand;
  failureModes: string[];      // what goes wrong in their thinking
  workshopState: WorkshopState;
  isHoldout: boolean;          // reserved for final test
}

// --- Encoding Session ---

export type EncodingPhase =
  | "interview"
  | "ideation"
  | "briefing"
  | "critique"
  | "lecture"
  | "holdout"
  | "complete";

export type CritiqueRoundPhase =
  | "writing"       // educator writes critique
  | "reflecting"    // self-reflection questions
  | "comparing"     // side-by-side view
  | "feedback";     // gap feedback form

export interface SelfReflection {
  firstNoticed: string;
  choseNotToSay: string;
  confidence: number;          // 1-5
  oneThing: string;
}

export interface GapFeedback {
  missed: string;              // what AI missed
  neverSay: string;            // what AI said that educator never would
  toneIssues: string[];        // checkboxes + free text
  rating: number;              // 1-10 slider
  priorityFix: string;         // single most important thing to fix
}

export interface EncodingRound {
  roundNumber: number;
  studentId: string;
  stage: "critique" | "ideation" | "briefing" | "lecture";
  educatorResponse: string;
  aiResponse: string | null;
  selfReflection: SelfReflection | null;
  gapFeedback: GapFeedback | null;
}

export interface EncodingSession {
  id: string;
  workshopId: string;
  status: EncodingPhase;
  currentRound: number;
  rounds: EncodingRound[];
  promptVersions: EncodedPrompt[];
  createdAt: string;
  completedAt: string | null;
}

// --- Encoded Prompt Blocks ---

export interface EncodedPrompt {
  version: number;
  identityBlock: string;
  frameworkBlock: string;
  toneBlock: string;
  calibrationBlock: string;
  ideationMovesBlock: string;
  ideationScaffoldBlock: string;
  critiqueEvaluationBlock: string;
  critiqueScaffoldBlock: string;
  briefingScaffoldBlock: string;
  lectureScaffoldBlock: string;
}

// --- The Full Encoded Educator ---

export interface EncodedEducator {
  id: string;
  workshopId: string;
  prompt: EncodedPrompt;       // latest version

  // Few-shot examples per stage
  critiqueExamples: Array<{
    submission: SubmissionData;
    critique: string;
  }>;
  ideationExamples: Array<{
    context: string;
    provocation: string;
  }>;
  briefingExamples: Array<{
    responses: string;
    summary: string;
  }>;
  lectureExamples: Array<{
    question: string;
    answer: string;
  }>;

  // Metadata
  encodingVersion: number;
  calibrationScores: number[];
  holdoutScores: {
    voiceFidelity: number;
    diagnosticAccuracy: number;
    calibrationAccuracy: number;
    antiPatternViolations: boolean;
  } | null;
}

File Structure

Every new file the encoding system introduces, and every existing file it modifies. Modified existing files are marked MODIFY; all other entries are new.

src/lib/encoding/
  types.ts -- All TypeScript types for encoding
  encoding-state.ts -- localStorage persistence for encoding sessions
  assemble-prompt.ts -- Takes encoding data + stage, returns system prompt
  assemble-few-shots.ts -- Selects and formats few-shot examples as messages
  calibrate.ts -- Takes gap feedback, calls meta-calibration, returns updated blocks
  bootstrap-prompt.ts -- Creates initial v0 prompt from existing hardcoded prompts

src/lib/prompts/encoding/
  interview-prompts.ts -- System prompts for Phase 1 framework interview
  student-generation-prompt.ts -- Prompt for generating synthetic students per band
  meta-calibration-prompt.ts -- Meta-prompt: update blocks from gap feedback
  holdout-prompt.ts -- Instructions for holdout evaluation

src/lib/content/
  synthetic-students.ts -- 5 pre-crafted synthetic students (quality bands 1-5)

src/app/educator/
  layout.tsx -- Educator area layout (nav, back link)

src/app/educator/encoding/
  page.tsx -- Encoding dashboard (progress, status, phases)
  interview/page.tsx -- Phase 1: Framework interview chat
  ideation/page.tsx -- Phase 2: Ideation encoding
  briefing/page.tsx -- Phase 3: Briefing encoding
  critique/page.tsx -- Phase 4: Critique calibration loop (core UI)
  lecture/page.tsx -- Phase 5: Lecture Q&A encoding
  review/page.tsx -- Holdout test + final approval

src/app/api/encoding/
  generate-students/route.ts -- Generate synthetic student batch via Claude
  interview/route.ts -- Streaming chat for framework interview
  calibrate/route.ts -- Generate AI critique attempt for side-by-side
  update-prompt/route.ts -- Update encoded prompt blocks from gap feedback
  holdout/route.ts -- Generate holdout critiques for evaluation

src/app/api/
  critique/route.ts -- MODIFY: add encoding-aware path
  chat/route.ts -- MODIFY: add encoding-aware path
  summarise/route.ts -- MODIFY: add encoding-aware path

scripts/
  encoding-migration.sql -- Supabase migration for all 6 encoding tables

Total new files: ~25. Modified existing files: 3 (API routes only, purely additive changes).

Build Sequence

The build is ordered so that each step produces something testable. Steps 1–4 are foundational. Steps 5–9 build the critique calibration loop. Steps 10–13 build the remaining encoding phases. Step 14 integrates everything with the workshop, and Step 15 prepares the database migration.

Version one delivers all 15 steps. The system will be fully functional for your workshop and Charlotte’s workshop, with all five encoding phases, the calibration loop, holdout testing, and integrated prompt assembly.

Step 01 · Types and Encoding State

All TypeScript types for the encoding system. localStorage persistence layer matching the existing loadWorkshopState() / saveWorkshopState() pattern.

src/lib/encoding/types.ts src/lib/encoding/encoding-state.ts
Step 02 · Bootstrap Prompt Extraction

Extract the current hardcoded educator’s identity into modular blocks. This creates the initial v0 EncodedPrompt by decomposing the existing prompts in critique-prompt.ts, ideation-prompt.ts, briefing-prompt.ts, and lecture-prompt.ts into the seven block types. The v0 prompt is the starting point before calibration.

src/lib/encoding/bootstrap-prompt.ts
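
A sketch of the Step 02 bootstrap. The block strings below are placeholders: the real decomposition is an editorial pass over the four existing prompt files, not an automated one.

// bootstrap-prompt.ts sketch: hand-decomposed v0 blocks.
import type { EncodedPrompt } from "@/lib/encoding/types";

export function bootstrapPrompt(): EncodedPrompt {
  return {
    version: 0,
    identityBlock: "[teaching philosophy lifted from critique-prompt.ts]",
    frameworkBlock: "[CMRC, reflect/affect, three problem types, stewardship]",
    toneBlock: "[key phrases + anti-patterns]",
    calibrationBlock: "[quality-adjustment guidance]",
    ideationMovesBlock: "[the seven moves from ideation-prompt.ts]",
    ideationScaffoldBlock: "[ideation structure]",
    critiqueEvaluationBlock: "[evaluation priorities]",
    critiqueScaffoldBlock: "[Strengths → Provocations → Connections → Reflection]",
    briefingScaffoldBlock: "[briefing summary structure from briefing-prompt.ts]",
    lectureScaffoldBlock: "[lecture Q&A structure from lecture-prompt.ts]",
  };
}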
Step 03 · Prompt Assembly Functions

assemblePrompt() concatenates the relevant blocks for a given stage into a system prompt string. assembleFewShotMessages() formats the educator’s examples as a message array. These are the two functions that connect the encoding system to the workshop API routes.

src/lib/encoding/assemble-prompt.ts src/lib/encoding/assemble-few-shots.ts
Step 04 · Synthetic Students

Hand-craft 5 complete SyntheticStudent objects (one per quality band) for the CMRC workshop. Each is a full WorkshopState with believable human imperfections. The existing Maya, Tom, and Priya are Band 1–2. We need Bands 3–5 and can refine the existing personas. Also write the generation prompt for future on-demand creation.

src/lib/content/synthetic-students.ts src/lib/prompts/encoding/student-generation-prompt.ts src/app/api/encoding/generate-students/route.ts
Step 05 · Educator Layout and Encoding Dashboard

The educator area layout (matching workshop layout pattern — sticky nav, progress indicator, back link). The encoding dashboard page showing phase progress, current status, and links to each encoding phase.

src/app/educator/layout.tsx src/app/educator/encoding/page.tsx
Step 06 · Critique Capture UI

The critique encoding page — the core calibration loop. Reuses the submission display pattern from the existing critique/page.tsx. State machine managing: writing → reflecting → comparing → feedback. The “writing” phase shows the synthetic student’s submission and a single large textarea.

src/app/educator/encoding/critique/page.tsx
Step 07 · AI Critique Generation (Side-by-Side)

API route that takes the current encoding and a student submission, generates the AI’s critique attempt using the assembled prompt. Returns the AI critique for the side-by-side comparison view.

src/app/api/encoding/calibrate/route.ts
Step 08 · Meta-Calibration

The meta-calibration prompt and API route. Takes the educator’s critique, the AI’s attempt, and the gap feedback. Asks Claude to update the prompt blocks to close the gaps. Returns the updated EncodedPrompt with an incremented version number.

src/lib/prompts/encoding/meta-calibration-prompt.ts src/lib/encoding/calibrate.ts src/app/api/encoding/update-prompt/route.ts
Step 09 · Full Critique Loop Integration

Wire everything together: 5 rounds, sequenced by quality band (2 → 4 → 1 → 5 → 3), prompt updates between rounds, progress tracking, completion detection (7+ for two consecutive rounds). At this point, critique encoding is fully functional end-to-end.

Updates to src/app/educator/encoding/critique/page.tsx
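
A minimal sketch of the completion check named above (7+ for two consecutive rounds), using the EncodingRound type from Data Shapes and Types:

// Completion detection for the critique loop (sketch).
import type { EncodingRound } from "@/lib/encoding/types";

export function critiqueLoopComplete(rounds: EncodingRound[]): boolean {
  const ratings = rounds
    .filter((r) => r.stage === "critique" && r.gapFeedback !== null)
    .map((r) => r.gapFeedback!.rating);
  // True once two consecutive rounds are rated 7 or higher.
  return ratings.some((rating, i) => i > 0 && rating >= 7 && ratings[i - 1] >= 7);
}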

Step 10 · Framework Interview

Phase 1 of encoding. A streaming chat interface (reuses the existing SSE pattern from /api/chat). The AI conducts the 7-question interview, probes with follow-ups, mirrors back understanding. Outputs populated Identity, Framework, Moves, and Evaluation blocks.

src/lib/prompts/encoding/interview-prompts.ts src/app/api/encoding/interview/route.ts src/app/educator/encoding/interview/page.tsx
Step 11 · Ideation Encoding

Phase 2. Same calibration loop pattern as critique but for ideation provocations. Educator sees 3 synthetic students mid-ideation, writes provocations, compares with AI, provides feedback.

src/app/educator/encoding/ideation/page.tsx
Step 12 · Briefing and Lecture Encoding

Phases 3 and 5. Same pattern: educator writes response, AI attempts, side-by-side, feedback. Briefing encoding uses synthetic briefing responses. Lecture encoding uses 3 student questions (conceptual, practical, off-topic).

src/app/educator/encoding/briefing/page.tsx src/app/educator/encoding/lecture/page.tsx
Step 13 · Holdout Evaluation

Generate 2–3 new synthetic students the educator has never seen. AI critiques them with the final encoded prompt. Educator rates voice fidelity, diagnostic accuracy, calibration accuracy, and flags anti-pattern violations.

src/lib/prompts/encoding/holdout-prompt.ts src/app/api/encoding/holdout/route.ts src/app/educator/encoding/review/page.tsx
Step 14 · Workshop Integration

Modify the three existing API routes to check for encoded educator data. If an encoding exists, use assemblePrompt() and assembleFewShotMessages(). If not, fall back to the existing hardcoded prompts. The student workshop experience is identical — only the voice behind the critique changes.

src/app/api/critique/route.ts (modify) src/app/api/chat/route.ts (modify) src/app/api/summarise/route.ts (modify)
Step 15 · Database Migration (Ready for Future)

SQL migration file with all 6 encoding tables, ready to run when Supabase auth is in place. The localStorage data shapes map directly to these tables.

scripts/encoding-migration.sql

Integration with the Existing System

The existing workshop continues to work exactly as it does today. The encoding system is purely additive. Here is how the modified API routes work:

// src/app/api/critique/route.ts (modified)

import { NextResponse } from "next/server";
import { chat } from "@/lib/claude";
import { getCritiqueSystemPrompt } from "@/lib/prompts/critique-prompt";
import { assemblePrompt } from "@/lib/encoding/assemble-prompt";
import { assembleFewShotMessages } from "@/lib/encoding/assemble-few-shots";
// formatSubmission is the existing helper; its import path is assumed here.
import { formatSubmission } from "@/lib/format-submission";

export async function POST(req: Request) {
  const { submission, encoding } = await req.json();

  let systemPrompt: string;
  let messages: Array<{ role: "user" | "assistant"; content: string }>;

  if (encoding) {
    // Use encoded educator prompt + few-shot examples
    systemPrompt = assemblePrompt(encoding, "critique", { submission });
    const fewShots = assembleFewShotMessages(encoding, "critique");
    messages = [
      ...fewShots,
      { role: "user", content: `Review this submission:\n${formatSubmission(submission)}` }
    ];
  } else {
    // Fall back to the existing hardcoded prompt
    systemPrompt = getCritiqueSystemPrompt(submission);
    messages = [
      { role: "user", content: "Please review this submission and deliver your critique." }
    ];
  }

  const result = await chat(systemPrompt, messages);
  return NextResponse.json({ critique: result });
}

The pattern is identical across all three API routes: check for encoding data, assemble prompt if present, fall back to hardcoded prompt if not.

Database Schema

This schema is prepared as a migration file for when Supabase auth is in place. For version one, the same data shapes are stored in localStorage.

-- Encoding sessions
create table encoding_sessions (
  id uuid primary key default gen_random_uuid(),
  educator_id uuid not null references profiles(id),
  workshop_id uuid not null references workshops(id),
  status text not null default 'interview'
    check (status in (
      'interview', 'ideation', 'briefing',
      'critique', 'lecture', 'holdout', 'complete'
    )),
  current_round integer default 0,
  created_at timestamptz default now(),
  completed_at timestamptz
);

-- Synthetic students for encoding
create table synthetic_students (
  id uuid primary key default gen_random_uuid(),
  session_id uuid not null references encoding_sessions(id) on delete cascade,
  workshop_id uuid not null references workshops(id),
  persona_name text not null,
  background text not null,
  quality_band integer not null check (quality_band between 1 and 5),
  failure_modes text[],
  workshop_state jsonb not null,
  generation_metadata jsonb,
  is_holdout boolean default false,
  created_at timestamptz default now()
);

-- Encoding rounds (each critique-compare-calibrate cycle)
create table encoding_rounds (
  id uuid primary key default gen_random_uuid(),
  session_id uuid not null references encoding_sessions(id) on delete cascade,
  stage text not null check (stage in ('critique', 'ideation', 'briefing', 'lecture')),
  round_number integer not null,
  student_id uuid not null references synthetic_students(id),
  educator_response text not null,
  ai_response text,
  gap_feedback jsonb,
  self_reflection jsonb,
  created_at timestamptz default now()
);

-- Versioned encoded prompts
create table encoded_prompts (
  id uuid primary key default gen_random_uuid(),
  session_id uuid not null references encoding_sessions(id) on delete cascade,
  version integer not null,
  identity_block text,
  framework_block text,
  tone_block text,
  calibration_block text,
  ideation_moves_block text,
  ideation_scaffold_block text,
  critique_evaluation_block text,
  critique_scaffold_block text,
  briefing_scaffold_block text,
  lecture_scaffold_block text,
  created_at timestamptz default now()
);

-- Few-shot examples
create table encoding_examples (
  id uuid primary key default gen_random_uuid(),
  prompt_id uuid not null references encoded_prompts(id) on delete cascade,
  stage text not null check (stage in ('critique', 'ideation', 'briefing', 'lecture')),
  input_data jsonb not null,
  educator_response text not null,
  created_at timestamptz default now()
);

-- Holdout evaluations
create table holdout_evaluations (
  id uuid primary key default gen_random_uuid(),
  session_id uuid not null references encoding_sessions(id) on delete cascade,
  student_id uuid not null references synthetic_students(id),
  ai_critique text not null,
  voice_fidelity integer check (voice_fidelity between 1 and 10),
  diagnostic_accuracy integer check (diagnostic_accuracy between 1 and 10),
  calibration_accuracy integer check (calibration_accuracy between 1 and 10),
  anti_pattern_violations boolean,
  anti_pattern_details text,
  created_at timestamptz default now()
);

Edge Cases and Mitigations

Synthetic students don’t feel real

If synthetic students read as obviously AI-generated, the educator will critique the artificiality rather than the content.

Mitigation: Hand-craft the first set of 10 synthetic students (5 per workshop) with specific, human-like imperfections. Include an option for the educator to flag “this doesn’t feel like a real student” and regenerate. The generation prompt must produce work that feels like a particular human wrote it, not work that reads like an AI deliberately writing badly.

The educator’s voice is too subtle to capture

Some educators’ power is in pause, timing, facial expression — things that don’t translate to text.

Mitigation: During the framework interview, explicitly ask “What parts of your teaching rely on being in the room? What would you lose in text?” The encoding can compensate by being more explicit about intent where the educator would rely on presence.

The educator is inconsistent across critiques

Human educators have good days and bad days. They respond differently to work that touches their interests.

Mitigation: If the educator critiques two similar-quality students very differently, the system asks: “You responded quite differently to these two similar students. Can you help us understand what drove that difference?” This reveals hidden variables.

Charlotte teaches outside CMRC

The system must not assume CMRC or any specific framework. The Framework block is populated from the educator’s own vocabulary.

Mitigation: For version one, both workshops use the same WorkshopState submission shape (which includes CMRC fields). If Charlotte’s framework is significantly different, we parameterise the submission shape and synthetic student generation. The encoding system itself is already framework-agnostic — only the synthetic students and the workshop UI assume CMRC.

The AI plateaus below satisfaction

After 10 rounds the educator still rates below 7.

Mitigation: Surface this honestly: “We’ve captured X% of your voice. Here are the dimensions where the gap remains.” Options: (a) accept and monitor, (b) provide additional examples, (c) flag specific interactions for manual review.

Token budget pressure

Assembled prompts (~3,000 tokens) plus few-shot examples (2–3 full critiques) plus the actual student submission could push against context limits.

Mitigation: The assembleFewShotMessages() function selects the best 2 examples (one strong student, one weak student). If token budget is tight, it falls back to 1 example. The blocks themselves are kept tight by the meta-calibration prompt, which is instructed to keep each block under its token target.
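
As an illustration of the fallback logic — the characters-per-token heuristic and the budget figure below are illustrative assumptions, not measured values:

// Rough token-budget guard (sketch). ~4 characters per token is a common
// rule of thumb; the 8,000-token budget is an illustrative figure.
function estimateTokens(text: string): number {
  return Math.ceil(text.length / 4);
}

export function selectFewShotCount(
  systemPrompt: string,
  examples: string[],
  submission: string,
  budget = 8000
): number {
  for (let count = Math.min(2, examples.length); count >= 0; count--) {
    const parts = [systemPrompt, submission, ...examples.slice(0, count)];
    const total = parts.reduce((sum, t) => sum + estimateTokens(t), 0);
    if (total <= budget) return count; // fall back to fewer examples
  }
  return 0;
}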

Verification Protocol

How we know the system works, tested in sequence:

Technical Verification

  1. Synthetic student realism: Generate 5 synthetic students. Read them cold. Can you tell which band they’re in without being told? Do they feel like real students?
  2. Encoding flow: Walk through the full encoding process. Does the interface let the educator be themselves, or does it constrain them?
  3. Calibration convergence: After 5 rounds of feedback, does the AI’s critique sound recognisably like the educator? What’s still missing?
  4. Prompt assembly: Verify that assemblePrompt() produces valid system prompts for all four stages. Check token counts are within budget.
  5. Regression: Existing workshop still works with hardcoded prompts. No changes to student-facing behaviour when no encoding is present.

Voice Verification

  1. Holdout test: Show yourself 2–3 AI-generated critiques of students you’ve never seen. Could you have written these? Would your students believe you wrote them?
  2. Cross-encoding test: Run the same process with Charlotte. Her encoding should sound distinctly different from yours. If the two encoded voices are indistinguishable, the system is capturing something generic, not something personal.
  3. The ultimate test: Show an encoded critique to someone who knows the educator. Can they identify whose voice it is without being told?

Acceptance Criteria for Version One

Criterion · Measure
Your encoding complete · All 5 phases done. Holdout score 7+ on all three dimensions.
Charlotte’s encoding complete · All 5 phases done. Holdout score 7+ on all three dimensions.
Voice differentiation · Blind reader can tell which educator wrote which critique at least 80% of the time.
Student experience unchanged · Workshop flow is identical. Only the voice behind critique/ideation/briefing changes.
No anti-pattern violations · Neither encoding produces phrases the educator flagged as “would never say.”

Version One Scope: Your Workshop and Charlotte’s

Version one delivers the complete Expert Encoding system, proven on two educators with distinct pedagogical identities. Here is exactly what it includes and what it defers.

Included in Version One

Capability · Detail
Full 5-phase encoding · Framework interview, ideation encoding, briefing encoding, critique calibration (5-round loop), lecture Q&A encoding.
10 hand-crafted synthetic students · 5 per workshop, one per quality band, covering at least 4 of 8 domains each. Plus holdout students for testing.
Modular prompt assembly · 7 blocks per educator, stage-specific assembly, few-shot example injection.
Meta-calibration · Automated prompt refinement between calibration rounds. Version history.
Holdout testing · Blind evaluation of encoding quality on unseen students.
Workshop integration · Encoded prompts flow seamlessly into existing API routes. Fallback to hardcoded prompts when no encoding exists.
Two complete encodings · Your voice and Charlotte’s voice, fully encoded and verified.
DB migration ready · SQL file prepared for Supabase. Data shapes designed for easy migration from localStorage.

Deferred to Version Two

Capability · Rationale for deferral
Auth & multi-educator accounts · Version one is used by you and Charlotte directly. No need for self-service onboarding yet.
Database persistence · localStorage works for two known educators. Migrate to Supabase when auth is in place.
Automated synthetic student generation · Hand-crafted students are higher quality for version one. Automate when we need to scale beyond 2 educators.
Framework-agnostic submission shapes · Both version one workshops can share the CMRC submission shape. Parameterise when a non-CMRC educator joins.
Multi-workshop encoding transfer · Version one has one workshop per educator. Transfer logic comes when an educator teaches multiple workshops.
Encoding analytics & monitoring · Version one validates the approach. Monitoring comes when it’s in production with real students.

Charlotte’s Workshop: What We Need to Know

Open question: Does Charlotte use CMRC, or a different framework? If her framework is different, we need to either (a) parameterise the submission shape and synthetic student generation now, or (b) map her framework onto CMRC fields for version one. Option (b) is faster but may lose nuance. The framework interview (Phase 1) will reveal this naturally.

Regardless of the answer, the encoding system itself is framework-agnostic. The seven prompt blocks capture her concepts, her moves, her values. The only place the CMRC assumption lives is in the WorkshopState / SubmissionData type shape and the synthetic student content. If Charlotte’s framework maps reasonably onto collect/model/render/control fields (even with different labels), version one works. If it’s fundamentally different, we parameterise the submission shape — which is a clean, bounded change.

The Version One Experience

For you: Navigate to /educator/encoding. Walk through all five phases (~90 minutes). See your pedagogical identity decomposed into seven layers. Watch the AI get progressively better at sounding like you across five calibration rounds. Verify with holdout students. Your encoded voice is now live for every student who takes your workshop.

For Charlotte: Same process, same interface. Her encoding captures a distinctly different pedagogical identity. Her students get critiques that sound like her, not like you. The proof that the system works is that two outputs of the same encoding process are recognisably different people.

For students: Nothing changes about their experience. They still go through Lecture → Briefing → Ideation → Submission → Critique. The only difference is that the voice behind the critique, the ideation provocations, and the briefing summary is now specifically their educator, not a generic design educator.

The proof of concept: If someone who knows you reads an AI-generated critique from your encoding, and someone who knows Charlotte reads one from hers, and both recognise the voice — Expert Encoding works. The core IP is proven. Everything after that is scaling.