Protocols reference

Use this page when you are implementing custom components rather than using builtin ids.

Important runtime instrumentation contracts

| Name | Kind | Use when | Key constraints / notes |
| --- | --- | --- | --- |
| LifecycleSubscriber | Instrumentation protocol | You want callbacks around stage boundaries or raw on_event(...) notifications | Observes execution without changing run_id |
| TracingProvider | Instrumentation protocol | You want span-oriented tracing around runs or stages | Implements start_span(...) and end_span(...) hooks |

Important config/runtime contracts

| Name | Kind | Use when | Key constraints / notes |
| --- | --- | --- | --- |
| Generator | Generation protocol | Candidate production logic belongs in your own code rather than a builtin or adapter | Used in both direct Python authoring and config-loaded experiments |
| CandidateReducer | Reduction protocol | Multi-candidate output needs custom collapse or synthesis logic | Pair with fan-out generation |
| Parser | Parsing protocol | Reduced output needs custom normalization before scoring | Keep parser responsibility separate from metric logic |
| Metric protocols | Evaluation protocols | You need custom deterministic, workflow-backed, or trace-aware scoring | Choose the smallest metric protocol that matches the task |
| WorkflowRunner | Runtime protocol | Workflow-backed metrics need custom execution semantics | Keeps workflow execution separate from observation hooks |
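
The division of labor in the table above (fan-out generation, then reduction, then parsing) can be sketched as follows. This is an illustration only: the class names, the method names `generate`, `reduce`, and `parse`, and their parameters are assumptions made for this example and are not taken from `themis.core.protocols`. Check the real signatures before implementing a component.

```python
from collections import Counter
from typing import List


class CannedGenerator:
    """Hypothetical generator: fans out by returning up to n canned candidates."""

    def __init__(self, canned: List[str]) -> None:
        self.canned = canned

    def generate(self, prompt: str, n: int) -> List[str]:
        # A real generator would call a model here; this one ignores the prompt.
        return self.canned[:n]


class MajorityVoteReducer:
    """Hypothetical reducer: collapses many candidates into the most common one."""

    def reduce(self, candidates: List[str]) -> str:
        stripped = [c.strip() for c in candidates]
        # most_common(1) returns [(value, count)]; ties keep first-seen order.
        return Counter(stripped).most_common(1)[0][0]


class FinalAnswerParser:
    """Hypothetical parser: normalizes the reduced output, and does no scoring."""

    def parse(self, text: str) -> str:
        # Keep only what follows the last "Answer:" marker.
        return text.rsplit("Answer:", 1)[-1].strip()


generator = CannedGenerator(["Answer: 42", "Answer: 41 ", "Answer: 42"])
candidates = generator.generate("What is 6 * 7?", n=3)
reduced = MajorityVoteReducer().reduce(candidates)
parsed = FinalAnswerParser().parse(reduced)
```

Keeping the parser free of scoring logic means the same parsed value can be handed to several metrics without being normalized again.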

Generated contracts:

themis.core.protocols

Runtime-checkable extension protocols for Themis.
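
"Runtime-checkable" refers to the standard `typing.runtime_checkable` decorator, which lets `isinstance()` test a protocol structurally. The sketch below shows that behavior with a stand-in protocol. `SupportsParse` and its `parse` method are hypothetical and defined locally, not imported from Themis.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class SupportsParse(Protocol):
    """Hypothetical stand-in protocol, not the real Themis Parser."""

    def parse(self, text: str) -> str: ...


class StripParser:
    # No inheritance required: having a matching method name is enough.
    def parse(self, text: str) -> str:
        return text.strip()


class NotAParser:
    # Different method name, so this does not satisfy SupportsParse.
    def normalize(self, text: str) -> str:
        return text.strip()


# isinstance() checks only that the members exist, not their signatures.
is_parser = isinstance(StripParser(), SupportsParse)
is_not_parser = isinstance(NotAParser(), SupportsParse)
```

Because the runtime check ignores parameter and return types, a static type checker is still the better tool for catching signature mismatches.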

AfterGenerate

Bases: Protocol

Hook invoked after a generator returns a candidate.

AfterJudge

Bases: Protocol

Hook invoked after a workflow-backed metric finishes.

AfterParse

Bases: Protocol

Hook invoked after parsing completes.

AfterReduce

Bases: Protocol

Hook invoked after reduction produces a final candidate.

AfterScore

Bases: Protocol

Hook invoked after a pure metric emits a score or error.

BeforeGenerate

Bases: Protocol

Hook invoked before a generator runs.

BeforeJudge

Bases: Protocol

Hook invoked before a workflow-backed metric begins judging.

BeforeParse

Bases: Protocol

Hook invoked before parsing a reduced candidate.

BeforeReduce

Bases: Protocol

Hook invoked before reduction starts.

BeforeScore

Bases: Protocol

Hook invoked before a pure metric runs.

CandidateReducer

Bases: Protocol

Protocol for reducers that collapse multiple candidates into one.

CandidateSelector

Bases: Protocol

Protocol for selectors that choose candidates before reduction.

EvaluationWorkflow

Bases: Protocol

Protocol for workflow-backed metrics driven by judge model calls.

Generator

Bases: Protocol

Protocol for generation components that produce candidate outputs.

JudgeModel

Bases: Protocol

Protocol for judge models used inside evaluation workflows.

LLMMetric

Bases: Protocol

Protocol for metrics that judge a reduced candidate set with an LLM.

LifecycleSubscriber

Bases: BeforeGenerate, AfterGenerate, BeforeReduce, AfterReduce, BeforeParse, AfterParse, BeforeScore, AfterScore, BeforeJudge, AfterJudge, OnEvent, Protocol

Aggregate lifecycle subscriber protocol.
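
An aggregate protocol of this shape is the intersection of its base protocols: an object satisfies it only if it provides every hook. The sketch below shows the pattern with three locally defined hooks. Only `on_event` is named in this page; `before_generate`, `after_generate`, all parameters, and the class names are assumptions made for illustration. Whether Themis accepts subscribers that implement only some hooks is not stated here.

```python
from typing import Any, List, Protocol, runtime_checkable


@runtime_checkable
class BeforeGenerateHook(Protocol):
    def before_generate(self, payload: Any) -> None: ...  # hypothetical name


@runtime_checkable
class AfterGenerateHook(Protocol):
    def after_generate(self, payload: Any) -> None: ...  # hypothetical name


@runtime_checkable
class OnEventHook(Protocol):
    def on_event(self, event: Any) -> None: ...


@runtime_checkable
class AggregateSubscriber(
    BeforeGenerateHook, AfterGenerateHook, OnEventHook, Protocol
):
    """Stand-in aggregate: requires every hook from its base protocols."""


class EventCounter:
    """Implements a single hook, so it matches only OnEventHook."""

    def __init__(self) -> None:
        self.seen: List[Any] = []

    def on_event(self, event: Any) -> None:
        self.seen.append(event)


class FullRecorder(EventCounter):
    """Adds the remaining hooks, so it matches the aggregate as well."""

    def before_generate(self, payload: Any) -> None:
        self.seen.append(("before_generate", payload))

    def after_generate(self, payload: Any) -> None:
        self.seen.append(("after_generate", payload))


partial_ok = isinstance(EventCounter(), OnEventHook)
partial_is_aggregate = isinstance(EventCounter(), AggregateSubscriber)
full_is_aggregate = isinstance(FullRecorder(), AggregateSubscriber)
```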

OnEvent

Bases: Protocol

Hook invoked after an execution event is persisted.

Parser

Bases: Protocol

Protocol for parsers that normalize reduced candidate outputs.

PureMetric

Bases: Protocol

Protocol for deterministic metrics that score parsed outputs directly.
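
A deterministic metric in this sense returns the same score for the same parsed output and expected value, with no model calls. A minimal exact-match sketch follows. The `score` method, its parameters, the `metric_id` attribute, and the `Score` type are all hypothetical names for this example and do not describe the real `PureMetric` signature.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Score:
    """Hypothetical result type for this example."""

    metric_id: str
    value: float


class ExactMatchMetric:
    metric_id = "exact_match"  # hypothetical identifier

    def score(self, parsed: str, expected: str) -> Score:
        # Compare case-insensitively after trimming surrounding whitespace.
        matched = parsed.strip().casefold() == expected.strip().casefold()
        return Score(self.metric_id, 1.0 if matched else 0.0)


metric = ExactMatchMetric()
hit = metric.score(" Paris ", "paris")
miss = metric.score("Lyon", "paris")
```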

SelectionMetric

Bases: Protocol

Protocol for metrics that judge multiple generated candidates.

TraceMetric

Bases: Protocol

Protocol for metrics that score traces or conversations.

TracingProvider

Bases: Protocol

Protocol for span-based tracing integrations.
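
The contracts table on this page names `start_span(...)` and `end_span(...)` as the hooks a tracing provider implements, but does not show their arguments. The in-memory sketch below therefore assumes them: the `name`, `attributes`, and `status` parameters and the `Span` type are illustrative choices, not the Themis signatures.

```python
import time
from dataclasses import dataclass
from typing import Any, Dict, List, Optional


@dataclass
class Span:
    """Hypothetical span record for this example."""

    name: str
    attributes: Dict[str, Any]
    started_at: float
    ended_at: Optional[float] = None


class InMemoryTracingProvider:
    """Collects finished spans in a list, which is handy in tests."""

    def __init__(self) -> None:
        self.finished: List[Span] = []

    def start_span(
        self, name: str, attributes: Optional[Dict[str, Any]] = None
    ) -> Span:
        # Assumed signature: returns a handle that is later passed to end_span.
        return Span(name, dict(attributes or {}), time.perf_counter())

    def end_span(self, span: Span, status: str = "ok") -> None:
        # Assumed signature: closes the span and records an outcome.
        span.ended_at = time.perf_counter()
        span.attributes["status"] = status
        self.finished.append(span)


provider = InMemoryTracingProvider()
span = provider.start_span("generate", {"case_id": "case-1"})
provider.end_span(span)
```

A provider backed by a real tracing system would forward these two calls to that system's span API instead of appending to a list.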

WorkflowRunner

Bases: Protocol

Protocol for executing evaluation workflows and returning traces.