Extractors¶
Built-in parsing helpers are registered automatically by PluginRegistry.
builtin ¶
Built-in extractor implementations auto-registered by PluginRegistry.
BoxedTextExtractor ¶
Extract the final LaTeX-style boxed answer from raw text.
Source code in themis/extractors/builtin.py
extract ¶
extract(
trial: TrialSpec,
candidate: CandidateRecord,
config: Mapping[str, JSONValueType] | None = None,
) -> ExtractionRecord
Extract the final boxed segment from candidate output.
Source code in themis/extractors/builtin.py
ChoiceLetterExtractor ¶
Extract an uppercase multiple-choice letter from candidate raw text.
Source code in themis/extractors/builtin.py
extract ¶
extract(
trial: TrialSpec,
candidate: CandidateRecord,
config: Mapping[str, JSONValueType] | None = None,
) -> ExtractionRecord
Extract a multiple-choice answer letter from candidate output.
Source code in themis/extractors/builtin.py
FirstNumberExtractor ¶
Extract the first integer or floating-point token from raw text.
Source code in themis/extractors/builtin.py
extract ¶
extract(
trial: TrialSpec,
candidate: CandidateRecord,
config: Mapping[str, JSONValueType] | None = None,
) -> ExtractionRecord
Extract the first numeric token from candidate output.
Source code in themis/extractors/builtin.py
JsonSchemaExtractor ¶
Parse candidate raw text as JSON and validate it against a schema.
Source code in themis/extractors/builtin.py
extract ¶
extract(
trial: TrialSpec,
candidate: CandidateRecord,
config: Mapping[str, JSONValueType] | None = None,
) -> ExtractionRecord
Parse candidate output as JSON and validate it against a schema.
Source code in themis/extractors/builtin.py
NormalizedTextExtractor ¶
Normalize free-form text for robust exact-match style scoring.
Source code in themis/extractors/builtin.py
extract ¶
extract(
trial: TrialSpec,
candidate: CandidateRecord,
config: Mapping[str, JSONValueType] | None = None,
) -> ExtractionRecord
Normalize either the boxed answer or the full raw text.
Source code in themis/extractors/builtin.py
RegexExtractor ¶
Extract a regex match or capture group from candidate raw text.
Source code in themis/extractors/builtin.py
extract ¶
extract(
trial: TrialSpec,
candidate: CandidateRecord,
config: Mapping[str, JSONValueType] | None = None,
) -> ExtractionRecord
Extract a configured regex match from candidate output.