refactor(intrinsics): requirement_check_to_bool raises on schema mismatch (Epic #929 Phase 1)#1320
Conversation
…mputing#1320 - ALoraRequirement docstring now accurately scopes the LLMaJ fallback to generation-time errors only; schema mismatches from output_to_bool propagate intentionally as AdapterSchemaMismatchError - requirement_check_to_bool: broaden score guard from `is None` to isinstance check, so non-numeric scores (e.g. JSON strings) raise AdapterSchemaMismatchError instead of bare TypeError - Add tests for null and string-typed score values - Remove issue number refs from test docstrings per project convention Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
psschwei
left a comment
There was a problem hiding this comment.
A couple of minor things that came up on review. Feel free to push back on any.
|
Testing summary Local (macOS, no GPU):
LSF/Cuda (GPU,
|
psschwei
left a comment
There was a problem hiding this comment.
LGTM
Put a hold in case @jakelorocco wants to take a look, but if not I'll remove the hold/merge tomorrow morning.
|
This is next up on my merge conflict list. |
|
@AngeloDanducci feel free to drop the hold label as you see fit |
jakelorocco
left a comment
There was a problem hiding this comment.
I think this seems reasonable. Raising an exception during output_to_bool slightly changes the contract that output_to_bool puts forward. I don't know if we should just flag for end users that output_to_bool / validate might raise arbitrary exceptions? Should validate have a some means of flagging these exceptions differently from True/False?
|
Added documentation around the new raise behavior, will open an issue for addressing this the "right" way in the future if it does not end up covered by changes to iocontract etc in the rest of the epic. |
…atch (Epic generative-computing#929 Phase 1) requirement_check_to_bool no longer silently returns False when the adapter output does not match the expected contract. Both missing-key paths now raise AdapterSchemaMismatchError so callers learn about schema drift immediately rather than silently treating every call as "requirement not met" (generative-computing#1008). requirement_check gains a model_options parameter and a declared output contract in its docstring. Signed-off-by: Nigel Jones <jonesn@uk.ibm.com> Assisted-by: Claude Code Closes generative-computing#1138
…mputing#1320 - ALoraRequirement docstring now accurately scopes the LLMaJ fallback to generation-time errors only; schema mismatches from output_to_bool propagate intentionally as AdapterSchemaMismatchError - requirement_check_to_bool: broaden score guard from `is None` to isinstance check, so non-numeric scores (e.g. JSON strings) raise AdapterSchemaMismatchError instead of bare TypeError - Add tests for null and string-typed score values - Remove issue number refs from test docstrings per project convention Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
- requirement.py: rename `likelihood` -> `req_check` (holdover from pre-schema-change field name) - core.py: requirement_check now raises AdapterSchemaMismatchError on malformed adapter output instead of bare KeyError/TypeError - core.py: add model_options parameter to check_certainty and find_context_attributions for consistency with requirement_check - core.py: import AdapterSchemaMismatchError; clean up Returns section in requirement_check docstring - test: integration test confirming AdapterSchemaMismatchError propagates through ALoraRequirement.validate() (documents deliberate hard-fail) Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
…sics.md
Example comment showed {"requirement_likelihood": 1.0} (pre-generative-computing#1008 schema).
Updated to {"requirement_check": {"score": 1.0}} and corrected the
accompanying description from "likelihood score" to the actual output shape.
Assisted-by: Claude Code
Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
- Tighten ALoraRequirement docstring: the LLMaJ fallback is a pre-generation availability check, not an exception handler around output parsing. Reword to avoid implying schema errors are caught. - Add cross-reference comments between the two parallel validation blocks in requirement_check_to_bool and requirement_check; Phase 2 will consolidate both via IOContract. Signed-off-by: Nigel Jones <jonesn@uk.ibm.com> Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
…ment_check NaN, +/-Inf, and scores outside [0.0, 1.0] were passing the existing isinstance guard silently -- nan > 0.5 evaluates False rather than raising. Tighten both guards with math.isfinite + range check. Update Raises: docstrings to cover the broadened condition. Add unit tests for NaN, Inf, above-range, and below-range cases in both test_requirement.py and the new test_core_schema.py. Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
Assisted-by: Claude Code Signed-off-by: Nigel Jones <jonesn@uk.ibm.com>
Signed-off-by: AngeloDanducci <angelo.danducci.ii@ibm.com>
dd7a1bb to
618528a
Compare
e81333d
Context
This is one of three parallel Wave 3 issues in Epic #929 Phase 1:
rag.pymigrationrequirement_checkmigrationguardian.pymigrationPhase 1 is gated on #1136 (PR #1269, merged 2026-06-23), which rewrote
call_intrinsicto use the newresolve_adapter()path from the Phase 0 scaffolding. All three Wave 3 issues were unblocked by that merge.No file overlap with #1321 — that PR touches
_util.pyandrag.py; this one touchescore.py,requirement.py, and tests. They can merge in any order.Approach note: #1321 introduces
IOContractsubclasses wired throughcall_intrinsic. This PR uses inline validation directly in the helper functions. TheIOContract-based approach forrequirement-checkis Phase 2 work; inline validation here achieves the stated goal of #1138 (loud schema mismatch) without depending on Phase 2 being complete.Problem
PR #1008 changed the
requirement-checkadapter output schema from{"requirement_likelihood": 0.9}to{"requirement_check": {"score": 0.9}}, butrequirement_check_to_boolwas never updated to match. The result: every call silently returnedFalsewith a log warning — the kind of failure that looks like a working system until someone notices the model is never actually satisfying any requirements.This issue is also called out in the Epic as the worked example for how adapter output contracts should be enforced going forward: schema drift must raise immediately, not degrade quietly.
What changed
Core fix
requirement_check_to_boolnow raisesAdapterSchemaMismatchErrorinstead of returningFalsewhen the adapter output does not match{"requirement_check": {"score": <float>}}. Covers missing top-level key, missing nestedscore, and wrong-type score (null, string, etc.).requirement_check(the higher-level wrapper that calls the adapter directly) now applies the same validation — previously it raised a bareKeyErroron bad output.Consistency
check_certaintyandfind_context_attributionsnow accept amodel_optionsparameter, bringing all three intrinsic helpers incore.pyinto line.Documentation
ALoraRequirementdocstring previously said "falls back to LLMaJ on error" without qualification. That fallback only covers generation failures (adapter not found, etc.) — output-parsing failures were always a hard raise and now correctly documented as such.docs/advanced/intrinsics.mdexample comment updated from the old{"requirement_likelihood": 1.0}schema to the current{"requirement_check": {"score": 1.0}}.Breaking change
requirement_check_to_boolno longer returnsFalseon parse failure — it raisesAdapterSchemaMismatchError. Any caller treating silentFalseas "requirement not met" must now also handle this exception. The old behaviour was masking real errors, so the assumption was already wrong.Test plan
test/stdlib/requirements/test_requirement.py(was 19), covering the oldrequirement_likelihoodschema, missing nested key, null score, string score, and an integration test confirming the exception propagates throughALoraRequirement.validate()uv run ruff format . && uv run ruff check .— cleanuv run mypy mellea/stdlib/requirements/requirement.py mellea/stdlib/components/intrinsic/core.py test/stdlib/requirements/test_requirement.py --ignore-missing-imports— cleanCloses #1138