
Fan-out, reduction, parsing, and scoring

What it is: the staged model that separates candidate multiplicity, reduced-candidate selection, output normalization, and final scoring.

When it matters: whenever num_samples is greater than one or metrics need a parsed shape that differs from raw model output.

What you provide: candidate policy, optional reducer, optional parser, and metrics.

What Themis provides: ordered stage execution, typed contexts, and persisted artifacts between stages.

Use this pipeline when you need to see exactly where the output changes shape.

```mermaid
flowchart LR
    A["One dataset case"] --> B["Fan out N candidates"]
    B --> C["Reduce to one candidate or summary"]
    C --> D["Parse into metric-ready shape"]
    D --> E["Score or run workflow-backed evaluation"]
    E --> F["Persist artifacts and projections"]
```

Each stage exists so multiplicity, selection, normalization, and scoring can evolve independently without collapsing into one overloaded component.
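The separation described above can be illustrated with a minimal, library-independent sketch. This is not the Themis API: `StageArtifacts`, `fan_out`, `majority_vote`, `parse_int`, `exact_match`, `run_case`, and `fake_generate` are hypothetical names chosen for illustration, and majority voting is only one possible reduction policy.

```python
import re
from collections import Counter
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class StageArtifacts:
    """Hypothetical record of what each stage produced for one dataset case."""
    candidates: List[str]
    reduced: str
    parsed: Optional[int]
    score: float


def fan_out(generate: Callable[[str, int], str], prompt: str, num_samples: int) -> List[str]:
    # Multiplicity: produce num_samples raw candidates for one case.
    return [generate(prompt, i) for i in range(num_samples)]


def majority_vote(candidates: List[str]) -> str:
    # Reduction: keep the most common candidate (ties go to the first one seen).
    return Counter(candidates).most_common(1)[0][0]


def parse_int(raw: str) -> Optional[int]:
    # Normalization: extract the last integer in the raw text, or None if absent.
    matches = re.findall(r"-?\d+", raw)
    return int(matches[-1]) if matches else None


def exact_match(parsed: Optional[int], expected: int) -> float:
    # Scoring: operates on the parsed shape, never on raw model output.
    return 1.0 if parsed == expected else 0.0


def run_case(generate: Callable[[str, int], str], prompt: str,
             expected: int, num_samples: int) -> StageArtifacts:
    # Each stage consumes only the previous stage's output, so any one of them
    # can be swapped without touching the others.
    candidates = fan_out(generate, prompt, num_samples)
    reduced = majority_vote(candidates)
    parsed = parse_int(reduced)
    score = exact_match(parsed, expected)
    return StageArtifacts(candidates, reduced, parsed, score)


def fake_generate(prompt: str, sample_index: int) -> str:
    # Deterministic stand-in for a model call (assumption for the example).
    return "Answer: 5" if sample_index == 2 else "Answer: 4"


artifacts = run_case(fake_generate, "What is 2 + 2?", expected=4, num_samples=3)
print(artifacts)
```

Keeping every intermediate value on the returned record mirrors the idea of persisting artifacts between stages: the three raw candidates, the reduced string, the parsed integer, and the score stay separately inspectable.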

What to inspect when it goes wrong: generated candidates first, then reduction output, then parsed output, then scores or evaluation executions.
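The inspection order above can be sketched as a small helper that walks the stages in sequence and reports the first one with a missing or empty artifact. The dictionary layout and the names `INSPECTION_ORDER` and `first_suspect_stage` are assumptions made for this example, not Themis structures, and "missing or empty" is only a rough proxy for "wrong".

```python
from typing import Any, Dict, Optional

# Stages listed in the order they should be inspected (hypothetical keys).
INSPECTION_ORDER = ["candidates", "reduced", "parsed", "score"]


def first_suspect_stage(artifacts: Dict[str, Any]) -> Optional[str]:
    """Return the earliest stage whose artifact is missing or empty, else None."""
    for stage in INSPECTION_ORDER:
        value = artifacts.get(stage)
        if value is None or value == [] or value == "":
            return stage
    return None


# Candidates and reduction look fine, but the parser found no integer,
# so the low score is a symptom rather than the cause.
case = {
    "candidates": ["Answer: four", "Answer: four"],
    "reduced": "Answer: four",
    "parsed": None,
    "score": 0.0,
}
suspect = first_suspect_stage(case)
print(suspect)
```

Checking upstream stages first avoids tuning a metric when the real problem is a parser or reducer that never produced usable input for it.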