autotagging: consistent handling of albums and singletons #6682

Open
snejus wants to merge 4 commits into refactor-change-representation from consistent-handling-of-albums-singletons

@snejus snejus commented May 29, 2026

This PR refactors beets.autotag so matching is organized around persistent candidate collections instead of one-shot helper functions.

Fixes: #6685
Supersedes: #6117

In one line

The autotag pipeline moves from "tag_album() / tag_item() return a Proposal" to "AlbumCandidates / TrackCandidates own search, deduplication, sorting, and recommendation state", and that logic is then split into a dedicated beets.autotag.candidates module.


Why this change exists

Before this PR, matching behavior was spread across:

  • free functions like tag_album() and tag_item()
  • helper logic in match.py
  • importer/session code that had to juggle both candidates and rec
  • a Proposal wrapper used to pass results around

That made album and singleton flows similar in purpose, but different in structure.

This PR gives both flows the same model:

  1. build a Source
  2. create a candidate collection
  3. resolve candidates in place
  4. read sorted matches from the collection
  5. read the recommendation from the same object

Architectural change

Before

Source
  -> `tag_album()` / `tag_item()`
      -> search by ID and/or text
      -> build matches
      -> sort matches
      -> compute recommendation
  -> return `Proposal(candidates, recommendation)`

After

Source
  -> `AlbumCandidates(source)` / `TrackCandidates(source)`
      -> `resolve(search_ids)`
          -> ID lookup
          -> library-ID lookup
          -> text search
          -> dedupe by candidate identifier
          -> build `AlbumMatch` / `TrackMatch`
      -> `matches`
      -> `recommendation`

This is the core refactor: matching is now modeled as a stateful collection object instead of a function that assembles and returns a result bundle.
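
To make the contrast concrete, here is a minimal, self-contained sketch of the two shapes. It is not the code from this PR: `FakeMatch`, `tag_source`, `ToyCandidates`, `add()`, and the `0.1` threshold are hypothetical stand-ins, and only the overall shape (a function returning a bundle versus an object holding state) is taken from the description above.

```python
from dataclasses import dataclass, field
from typing import NamedTuple


@dataclass(frozen=True)
class FakeMatch:
    """Hypothetical stand-in for a match; lower distance is better."""

    candidate_id: str
    distance: float


def recommend(matches: list[FakeMatch]) -> str:
    """Toy recommendation rule; the 0.1 threshold is an assumption."""
    if not matches:
        return "none"
    return "strong" if matches[0].distance < 0.1 else "low"


# Before: a one-shot function assembles and returns a result bundle.
class Proposal(NamedTuple):
    candidates: list[FakeMatch]
    recommendation: str


def tag_source(found: list[FakeMatch]) -> Proposal:
    ordered = sorted(found, key=lambda m: m.distance)
    return Proposal(ordered, recommend(ordered))


# After: a collection object keeps the state and derives results from it.
@dataclass
class ToyCandidates:
    _by_id: dict[str, FakeMatch] = field(default_factory=dict)

    def add(self, found: list[FakeMatch]) -> None:
        # Keep the first match seen for each candidate identifier.
        for match in found:
            self._by_id.setdefault(match.candidate_id, match)

    @property
    def matches(self) -> list[FakeMatch]:
        return sorted(self._by_id.values(), key=lambda m: m.distance)

    @property
    def recommendation(self) -> str:
        return recommend(self.matches)


first = [FakeMatch("a", 0.30), FakeMatch("b", 0.20)]
later = [FakeMatch("b", 0.20), FakeMatch("c", 0.05)]

proposal = tag_source(first)  # a second search would need a new Proposal

candidates = ToyCandidates()
candidates.add(first)
candidates.add(later)  # a second search updates the same object
```

In the stateful version the caller never has to replace one result bundle with another: the ordering and the recommendation are always read from whatever the collection currently holds.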


Main architecture changes

1. Candidate lifecycle moved into Candidates objects

New collection types:

  • Candidates
  • AlbumCandidates
  • TrackCandidates

These objects now own:

  • candidate aggregation
  • duplicate suppression
  • ID search
  • text search
  • sorting, exposed through matches
  • recommendation calculation, exposed through recommendation
  • the "ID match first, then search if needed" flow, driven by resolve()

That centralizes the matching pipeline in one place instead of duplicating it across album and singleton helper functions.
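
One way to picture this centralization is a base class that implements the shared pipeline once, with album and track subclasses supplying only the lookup step. The sketch below assumes that structure for illustration; the class names, the `_lookup` hook, and the data are all made up and are not taken from the PR's code.

```python
from abc import ABC, abstractmethod


class ToyCandidates(ABC):
    """Hypothetical base class: the shared pipeline lives here once."""

    def __init__(self, source: str) -> None:
        self.source = source
        self._distances: dict[str, float] = {}  # candidate id -> distance

    @abstractmethod
    def _lookup(self, query: str) -> dict[str, float]:
        """Return {candidate_id: distance} for a query (subclass-specific)."""

    def search(self, query: str) -> None:
        for candidate_id, distance in self._lookup(query).items():
            # Duplicate suppression: the first result for an id wins.
            self._distances.setdefault(candidate_id, distance)

    @property
    def matches(self) -> list[tuple[str, float]]:
        # Sorting is derived from state, closest candidate first.
        return sorted(self._distances.items(), key=lambda pair: pair[1])


class ToyAlbumCandidates(ToyCandidates):
    def _lookup(self, query: str) -> dict[str, float]:
        return {"album-2": 0.40, "album-1": 0.02}  # made-up data


class ToyTrackCandidates(ToyCandidates):
    def _lookup(self, query: str) -> dict[str, float]:
        return {"track-9": 0.15}  # made-up data


albums = ToyAlbumCandidates("some album")
albums.search("some album")
albums.search("some album")  # a repeated search adds no duplicates

tracks = ToyTrackCandidates("some track")
tracks.search("some track")
```

With this layout a change to deduplication or ordering is made in one place and both flows pick it up.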


2. Matching collections were split into a new module

The latest commit extracts:

  • Candidates
  • AlbumCandidates
  • TrackCandidates
  • Recommendation

from beets/autotag/match.py into beets/autotag/candidates.py.

That improves module boundaries:

`hooks.py`
  -> metadata info objects
     like `AlbumInfo`, `TrackInfo`

`match.py`
  -> concrete match objects
     like `AlbumMatch`, `TrackMatch`
  -> metadata application and item assignment

`candidates.py`
  -> candidate discovery
  -> deduplication
  -> search orchestration
  -> recommendation logic

High-level effect: clearer separation between "one possible match" and "the collection of possible matches".


3. Match validation moved into match classes

AlbumMatch and TrackMatch now own their own construction rules through try_create():

  • AlbumMatch.try_create()
    • rejects empty-track candidates
    • checks required fields
    • computes item-to-track assignment
    • computes distance
    • rejects ignored penalties
  • TrackMatch.try_create()
    • computes track distance
    • returns a ready TrackMatch

This removes helper-style validation code and keeps creation rules next to the match types they belong to.
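
The `try_create()` idea can be sketched as a classmethod that either returns a ready match or returns `None` when the candidate should be rejected. The example below is a simplified illustration, not the PR's implementation: the `Toy*` types, the set-based distance, the penalty names, and the `ignored` parameter are assumptions chosen to mirror the rules listed above.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class ToyAlbumInfo:
    """Hypothetical stand-in for metadata fetched from an external source."""

    album_id: str
    tracks: tuple[str, ...]


@dataclass(frozen=True)
class ToyAlbumMatch:
    """Hypothetical stand-in for one validated match."""

    info: ToyAlbumInfo
    distance: float

    @classmethod
    def try_create(
        cls,
        items: list[str],
        info: ToyAlbumInfo,
        ignored: frozenset[str] = frozenset(),
    ) -> Optional["ToyAlbumMatch"]:
        if not info.tracks:
            return None  # reject empty-track candidates

        wanted, offered = set(items), set(info.tracks)
        penalties = set()
        if offered - wanted:
            penalties.add("missing_tracks")  # candidate has extra tracks
        if wanted - offered:
            penalties.add("unmatched_tracks")  # local items left over
        if penalties & ignored:
            return None  # reject candidates carrying an ignored penalty

        # Toy distance: share of titles not common to both sides.
        distance = 1 - len(wanted & offered) / len(wanted | offered)
        return cls(info, distance)


empty = ToyAlbumMatch.try_create(["a", "b"], ToyAlbumInfo("x", ()))
exact = ToyAlbumMatch.try_create(["a", "b"], ToyAlbumInfo("x", ("a", "b")))
partial = ToyAlbumMatch.try_create(["a"], ToyAlbumInfo("x", ("a", "b")))
rejected = ToyAlbumMatch.try_create(
    ["a"], ToyAlbumInfo("x", ("a", "b")), ignored=frozenset({"missing_tracks"})
)
```

Because rejection is expressed as `None`, a collection can build its list of matches by calling `try_create()` for each candidate and keeping only the results that are not `None`.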


API and control-flow changes

Removed old proposal-based flow

The old API centered around:

  • tag_album()
  • tag_item()
  • Proposal

That flow is replaced by candidate collections exposed through tasks.

New task-owned candidate state

Importer tasks now keep candidate state directly:

  • ImportTask.candidates -> AlbumCandidates
  • SingletonImportTask.candidates -> TrackCandidates

Call sites now use:

  • task.candidates.resolve(...)
  • task.candidates.search(...)
  • task.candidates.search_ids(...)
  • task.candidates.recommendation
  • task.candidates[0]

This simplifies session/UI logic because it no longer has to swap whole Proposal objects in and out.
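
A small sketch of what task-owned candidate state could look like follows. It assumes the collection is created lazily and cached on the task; `functools.cached_property` is used here as one plausible way to do that, and `ToyImportTask` and `ToyCandidates` are hypothetical stand-ins rather than the importer's real classes.

```python
from functools import cached_property


class ToyCandidates:
    """Hypothetical collection that records what was searched for."""

    def __init__(self, source: str) -> None:
        self.source = source
        self.found: list[str] = []

    def search(self, query: str) -> None:
        # Stand-in for a metadata lookup: record the query as a "result".
        self.found.append(f"result for {query!r}")


class ToyImportTask:
    def __init__(self, source: str) -> None:
        self.source = source

    @cached_property
    def candidates(self) -> ToyCandidates:
        # Created on first access, then reused for the task's lifetime.
        return ToyCandidates(self.source)


task = ToyImportTask("Some Album")
task.candidates.search("Some Album")
task.candidates.search("manual query")  # a manual search mutates in place
```

Every caller that holds the task sees the same collection, so a manual search from the UI and a later read of the recommendation operate on the same state.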


High-level impact by area

beets.autotag

  • adds new beets.autotag.candidates module
  • re-exports candidate-related types from beets.autotag.__init__
  • removes the older proposal-oriented surface from the main flow
  • clarifies responsibilities between:
    • Info objects
    • Match objects
    • candidate collections

Importer and UI

  • tasks now own persistent candidate collections
  • recommendation is derived from task.candidates.recommendation
  • manual search and manual ID lookup mutate the existing candidate collection instead of returning a new Proposal
  • album and singleton flows now look much more alike

Plugins

Plugins consuming importer state now read recommendation from the task's candidates:

  • beetsplug.mbsubmit now checks task.candidates.recommendation
  • beetsplug.bench now benchmarks through AlbumCandidates(...).resolve(...)

This aligns plugin integrations with the new matching model.

Tests

  • candidate behavior now has its own dedicated test module: test/autotag/test_candidates.py
  • test_match.py becomes narrower and more match-focused
  • test_mbpseudo.py now explicitly marks one case as xfail, documenting an existing design mismatch where the plugin adjusts AlbumMatch dynamically

End-to-end flow after this PR

Importer task
  -> build `Source`
  -> create task-owned `AlbumCandidates` / `TrackCandidates`
  -> `resolve(search_ids)`
      -> explicit ID lookup, if provided
      -> otherwise try current library/source ID
      -> if strong non-timid ID recommendation, stop early
      -> otherwise perform text search
  -> UI reads `task.candidates.recommendation`
  -> selected match comes from `task.candidates[0]`

This is easier to reason about than the old "return proposal, maybe replace proposal, maybe replace recommendation" control flow.
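
The `resolve()` ordering above can be sketched as follows. This is one literal reading of the flow, under stated assumptions: the lookups are replaced by plain dictionaries, the `Rec` enum and the `0.04` cut-off are arbitrary choices for the sketch, and `steps` exists only to make the control flow observable.

```python
from enum import IntEnum
from typing import Optional


class Rec(IntEnum):
    """Hypothetical recommendation levels, weakest to strongest."""

    none = 0
    low = 1
    strong = 2


class ToyCandidates:
    def __init__(
        self,
        id_index: dict[str, float],
        text_results: dict[str, float],
        library_id: Optional[str] = None,
        timid: bool = False,
    ) -> None:
        self.id_index = id_index  # candidates reachable by ID
        self.text_results = text_results  # candidates found by text search
        self.library_id = library_id
        self.timid = timid
        self.distances: dict[str, float] = {}
        self.steps: list[str] = []

    @property
    def recommendation(self) -> Rec:
        if not self.distances:
            return Rec.none
        best = min(self.distances.values())
        return Rec.strong if best <= 0.04 else Rec.low

    def resolve(self, search_ids: tuple[str, ...] = ()) -> None:
        # Explicit IDs win; otherwise fall back to the ID already on file.
        ids = list(search_ids)
        if not ids and self.library_id is not None:
            ids = [self.library_id]
        for candidate_id in ids:
            self.steps.append("id")
            if candidate_id in self.id_index:
                self.distances.setdefault(candidate_id, self.id_index[candidate_id])
        # A strong ID-based result ends the search unless timid mode is on.
        if ids and not self.timid and self.recommendation == Rec.strong:
            return
        self.steps.append("text")
        for candidate_id, distance in self.text_results.items():
            self.distances.setdefault(candidate_id, distance)


id_index = {"mb-1": 0.01}
text_results = {"mb-1": 0.01, "mb-2": 0.30}

confident = ToyCandidates(id_index, text_results, library_id="mb-1")
confident.resolve()  # the ID lookup is strong, so no text search runs

timid = ToyCandidates(id_index, text_results, library_id="mb-1", timid=True)
timid.resolve()  # timid mode still runs the text search

fresh = ToyCandidates(id_index, text_results)
fresh.resolve()  # no ID available, so only the text search runs
```

Keeping the early-stop decision inside `resolve()` means callers only choose which IDs to pass in; they do not need to know when a text search is skipped.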


Practical reviewer mental model

Think of the refactor as three separate layers:

  1. Info objects describe metadata from external sources
  2. Match objects represent one validated tagging outcome
  3. Candidates objects manage the search and ranking of many possible matches

So the importer/UI mostly talks to Candidates, while Candidates creates Match objects, and Match objects wrap Info.


User-visible behavior

This is mostly a structural refactor, not a feature change.

Notable external effects are small:

  • display code normalizes match type casing with .capitalize()
  • one mbpseudo scenario is now explicitly marked as expected-failing rather than implicitly tolerated

Reviewer takeaway

The purpose of this PR is to make album and singleton matching follow the same architecture, with a single object responsible for candidate search state and recommendation logic.

That should make future changes easier in one shared place, especially around:

  • deduplication
  • recommendation thresholds
  • search order
  • new candidate sources
  • keeping album and singleton behavior consistent

Copilot AI review requested due to automatic review settings May 29, 2026 18:54
@snejus snejus requested a review from a team as a code owner May 29, 2026 18:54
@github-actions

Thank you for the PR! The changelog has not been updated, so here is a friendly reminder to check if you need to add an entry.


Copilot AI left a comment


Pull request overview

This PR is a large refactor of beets.autotag: album and singleton matching now use stateful candidate collection objects (AlbumCandidates / TrackCandidates) instead of one-shot tag_album() / tag_item() calls returning a Proposal. The new beets.autotag.candidates module holds search, deduplication, sorting, and recommendation logic, while match.py focuses on match objects and validation.

Changes:

  • add Candidates collections with .resolve(), .search(), .search_ids(), .matches, and .recommendation
  • move recommendation + candidate-search orchestration into new beets/autotag/candidates.py, and move match validation into AlbumMatch.try_create() / TrackMatch.try_create()
  • update importer tasks, terminal UI, and plugins to read the recommendation from task.candidates.recommendation and to mutate the same candidate collection for manual search/ID lookup; update tests accordingly

Reviewed changes

Copilot reviewed 14 out of 14 changed files in this pull request and generated 2 comments.

| File | Description |
| --- | --- |
| beets/autotag/candidates.py | New module for candidate lifecycle: search, dedupe, sorting, recommendation, and resolve flow. |
| beets/autotag/match.py | Removes proposal/tag helpers; makes Match generic and moves validation into try_create. |
| beets/autotag/hooks.py | Adds InfoT typing helper and improves Info.__repr__ for logging/debug. |
| beets/autotag/__init__.py | Re-exports new candidate types; stops exporting old proposal/tag helpers. |
| beets/importer/tasks.py | Tasks now own cached candidates collections; lookup uses .resolve(); singleton task uses TrackCandidates. |
| beets/ui/commands/import_/session.py | UI now uses task.candidates.recommendation; manual search/ID mutate candidate collection (no more Proposal). |
| beets/ui/commands/import_/display.py | Adjusts match typing and normalizes displayed match type casing. |
| beetsplug/mbsubmit.py | Uses task.candidates.recommendation instead of removed task.rec. |
| beetsplug/bench.py | Bench uses Source + AlbumCandidates.resolve() instead of tag_album. |
| test/autotag/test_candidates.py | New tests covering multi-data-source candidate aggregation via new collections. |
| test/autotag/test_match.py | Removes tests for old proposal/tag API; keeps assign_items coverage. |
| test/test_importer.py | Updates importer tests to use set_choice(Action.APPLY) instead of constructing dummy AlbumMatch. |
| test/plugins/test_art.py | Updates fetchart importer test setup to use set_choice(Action.APPLY). |
| test/plugins/test_mbpseudo.py | Marks one scenario as xfail (documents known design mismatch with dynamic match adjustment). |

Comment thread beets/ui/commands/import_/session.py
Comment thread beets/importer/tasks.py

codecov Bot commented May 29, 2026

⚠️ Unsupported file format

Upload processing failed due to unsupported file format. Please review the parser error message:
Error parsing JUnit XML in /home/runner/work/beets/beets/.reports/pytest.xml at 1:82539

Caused by:
RuntimeError: Error converting computed name to ValidatedString

Caused by:
    string is too long

For more help, visit our troubleshooting guide.

@snejus snejus force-pushed the introduce-source branch from 31c9bf2 to 2f5b10e on May 29, 2026 19:31
@snejus snejus force-pushed the introduce-source branch 3 times, most recently from 95e0a81 to 34ae2e4 on May 30, 2026 12:23
@snejus snejus force-pushed the consistent-handling-of-albums-singletons branch from ee1aea8 to 7d412a1 on May 30, 2026 12:23
@snejus snejus force-pushed the introduce-source branch 4 times, most recently from 3ca6d85 to da82f8f on May 30, 2026 13:26
@snejus snejus force-pushed the consistent-handling-of-albums-singletons branch from 7d412a1 to ecbb5b1 on May 30, 2026 15:52
@snejus snejus changed the base branch from introduce-source to refactor-change-representation on May 30, 2026 15:53
@snejus snejus force-pushed the consistent-handling-of-albums-singletons branch 3 times, most recently from adf1b2b to 8f25874 on May 31, 2026 04:10
@snejus snejus force-pushed the refactor-change-representation branch from d26e8f2 to 1952f55 on June 2, 2026 00:21
@snejus snejus force-pushed the consistent-handling-of-albums-singletons branch from 8f25874 to f2ed137 on June 2, 2026 00:21
@snejus snejus force-pushed the refactor-change-representation branch from 1952f55 to 97e8a4f on June 2, 2026 00:29
@snejus snejus force-pushed the consistent-handling-of-albums-singletons branch from f2ed137 to ba32b7b on June 2, 2026 00:29
@snejus snejus force-pushed the refactor-change-representation branch from 97e8a4f to 90e087b on June 2, 2026 07:46
@snejus snejus force-pushed the consistent-handling-of-albums-singletons branch 2 times, most recently from 9f1f381 to d8a6b49 on June 2, 2026 08:02
@snejus snejus force-pushed the refactor-change-representation branch from 90e087b to 3827a35 on June 2, 2026 08:02
@snejus snejus force-pushed the consistent-handling-of-albums-singletons branch from d8a6b49 to 349a55b on June 2, 2026 08:16
@snejus snejus force-pushed the refactor-change-representation branch 2 times, most recently from d466ca8 to 8668ec6 on June 2, 2026 14:48
@snejus snejus force-pushed the consistent-handling-of-albums-singletons branch 2 times, most recently from e6faa37 to 1f90f35 on June 2, 2026 15:29
@snejus snejus force-pushed the refactor-change-representation branch 2 times, most recently from 5e3860a to 609a54c on June 3, 2026 14:42
@snejus snejus force-pushed the consistent-handling-of-albums-singletons branch from 1f90f35 to 83a1c44 on June 3, 2026 14:42
snejus added 4 commits June 3, 2026 16:16
- Replace `tag_album`/`tag_item` proposal flows with `AlbumCandidates`
  and `TrackCandidates` that own candidate aggregation, ID/text search,
  sorting, and recommendation calculation.
- Update importer/session/plugin call sites to use
  `task.candidates.recommendation` and in-place candidate resolution,
  reducing duplicated control flow and centralizing match behavior.
- Adjust related tests and exports for the new API, normalize displayed
  match type casing, and mark the current `mbpseudo` dynamic adjustment
  behavior as expected-failing pending redesign.
@snejus snejus force-pushed the consistent-handling-of-albums-singletons branch from 83a1c44 to bd4b43f on June 3, 2026 15:16
@snejus snejus force-pushed the refactor-change-representation branch from 609a54c to e30f69a on June 3, 2026 15:16