
fix(search): frame the root subject, not the first graph node #503

Merged: ddeboer merged 1 commit into main from fix/search-frame-root-subject on Jun 19, 2026

@ddeboer ddeboer commented Jun 19, 2026

Problem

`frameByType` groups each root subject with the triples of the one-hop nodes it references, then yields `framed['@graph'][0]`. When a root subject references another subject of the same root type, `jsonld.frame({'@type': rootType})` returns several root nodes and `[0]` can be the referenced one. The referenced subject is then emitted twice (once for its own group and once in place of the referencing subject), and the referencing subject is dropped entirely.
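The failure mode can be illustrated without running `jsonld` at all. The sketch below is a minimal stand-in, not the project's code: `FramedNode`, `pickFirst`, the example IRIs, and the order of the nodes in the array are all assumptions made for illustration.

```typescript
// Minimal JSON-LD-like node shape (assumption; real framed nodes carry more properties).
interface FramedNode {
  '@id': string;
  '@type': string;
  [property: string]: unknown;
}

// Hypothetical '@graph' array as it might look after framing one group by root type:
// the referenced dataset has the same type as the root, so both nodes match the frame.
const framedGraph: FramedNode[] = [
  { '@id': 'https://example.org/dataset/referenced', '@type': 'dcat:Dataset' },
  {
    '@id': 'https://example.org/dataset/referencing',
    '@type': 'dcat:Dataset',
    terminologySource: { '@id': 'https://example.org/dataset/referenced' },
  },
];

// The old selection strategy: take whichever node happens to come first.
function pickFirst(graph: readonly FramedNode[]): FramedNode | undefined {
  return graph[0];
}

// The group was built for the referencing dataset, but the first node is the referenced one.
const groupRootIri = 'https://example.org/dataset/referencing';
const picked = pickFirst(framedGraph);
console.log(picked?.['@id'] === groupRootIri); // false
```

Nothing throws here, which matches the silent nature of the bug: the yielded document is valid, it is just the wrong one.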

This is not hypothetical. In the NDE Dataset Register a dataset can list a terminologySource (or other reference) whose IRI is itself a separately registered dcat:Dataset. On the full production corpus this silently dropped about 1% of datasets – each dropped one replaced by a duplicate of the dataset it referenced – with no error, because both the projection and the Typesense import still succeed.

Fix

Thread the root subject IRI through `groupByRoot` and yield the framed node whose `@id` matches that root, instead of blindly taking `[0]`.
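A minimal sketch of that selection step, assuming the framed nodes are available as a plain array. `JsonLdNode`, `selectRootNode`, and the example IRIs are hypothetical names, not taken from the repository.

```typescript
// Minimal node shape (assumption): only '@id' matters for the selection.
interface JsonLdNode {
  '@id': string;
  [property: string]: unknown;
}

// Return the node that belongs to the given root IRI, regardless of its position.
function selectRootNode(
  candidates: readonly JsonLdNode[],
  rootIri: string,
): JsonLdNode | undefined {
  return candidates.find((node) => node['@id'] === rootIri);
}

const candidates: JsonLdNode[] = [
  { '@id': 'https://example.org/dataset/referenced' },
  { '@id': 'https://example.org/dataset/referencing' },
];

const root = selectRootNode(candidates, 'https://example.org/dataset/referencing');
console.log(root?.['@id']); // https://example.org/dataset/referencing
```

Returning `undefined` when no node matches leaves it to the caller to decide whether a missing root should be skipped or treated as an error.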

Validation

  • New regression test: a root subject that references another root-typed subject – both are projected, and the referenced one is not duplicated.
  • Reproduced against the full production Dataset Register read: before the fix, `projectGraph` yielded 2625 documents of which only 2602 were unique (23 dropped); after the fix, it yields 2625 documents, 2625 unique and 0 duplicates, and the previously missing datasets (e.g. the Gouda Tijdmachine knowledge graph) are present.
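The total / unique / duplicate figures above can be derived with a check along these lines. This is a sketch only: `ProjectedDocument` and `summarizeProjection` are hypothetical, and it assumes each projected document carries a string `id`.

```typescript
// Assumed shape of a projected document; only the identifier is needed here.
interface ProjectedDocument {
  id: string;
}

interface ProjectionSummary {
  total: number;
  unique: number;
  duplicates: number;
}

// Count how many documents were yielded and how many distinct ids they cover.
function summarizeProjection(documents: readonly ProjectedDocument[]): ProjectionSummary {
  const uniqueIds = new Set(documents.map((document) => document.id));
  return {
    total: documents.length,
    unique: uniqueIds.size,
    duplicates: documents.length - uniqueIds.size,
  };
}

// Tiny sample: 'a' appears twice, so one document is a duplicate.
const sample: ProjectedDocument[] = [{ id: 'a' }, { id: 'b' }, { id: 'a' }];
const summary = summarizeProjection(sample);
console.log(summary.total, summary.unique, summary.duplicates); // 3 2 1
```

Applied to the numbers reported above, 2625 total and 2602 unique gives 23 duplicates, each of which corresponds to one dropped dataset.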

Local vitest/tsc could not run in my worktree (its shared node_modules predates the @tpluscode/rdf-ns-builders dependency on main); CI runs the suite with the correct dependencies.

@ddeboer ddeboer enabled auto-merge (squash) June 19, 2026 13:24
@ddeboer ddeboer force-pushed the fix/search-frame-root-subject branch from 173b26f to 8f1491f on June 19, 2026 13:36
frameByType framed each subgraph by root type and took `framed['@graph'][0]`.
When a root subject one-hop references another subject of the same root type —
e.g. a terminology source that is also a separately registered dataset —
`jsonld.frame` returns several root nodes, so `[0]` could be the referenced
one: it was emitted twice and the referencing subject was dropped.

Frame each subgraph by the specific root subject `@id` instead, so exactly that
subject is returned. Keeps the original branch structure (no coverage change).
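As an illustration of framing by subject rather than by type alone, a frame can name the root's `@id`, which JSON-LD framing uses to match only that node. The helper below is a hypothetical sketch: `RootFrame` and `buildRootFrame` are not from the repository, and whether the merged code keeps `@type` in the frame or adds an `@context` is an assumption.

```typescript
// Assumed frame shape: match on both the root type and the specific subject IRI.
interface RootFrame {
  '@type': string;
  '@id': string;
}

// Build a frame that can only match the given root subject.
function buildRootFrame(rootType: string, rootIri: string): RootFrame {
  return { '@type': rootType, '@id': rootIri };
}

const frame = buildRootFrame('dcat:Dataset', 'https://example.org/dataset/referencing');
console.log(JSON.stringify(frame));
// {"@type":"dcat:Dataset","@id":"https://example.org/dataset/referencing"}
```

A frame like this would be passed to `jsonld.frame` in place of the type-only frame, so a referenced node of the same type no longer matches as a root.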
@ddeboer ddeboer force-pushed the fix/search-frame-root-subject branch from 8f1491f to 722b82c on June 19, 2026 13:40
@ddeboer ddeboer merged commit eb1394b into main Jun 19, 2026
2 checks passed
@ddeboer ddeboer deleted the fix/search-frame-root-subject branch June 19, 2026 13:42