Overload read and blocks functions and methods #484

Open
hunterhogan wants to merge 1 commit into bastibe:master from hunterhogan:overload

@hunterhogan

Contributor
  1. NDArray type for compact typing of indeterminate axes.
  2. _2d_float64 TypeAlias, and three more, for compact typing of 2-axis ndarray.
  3. T_ndarray TypeVar for out parameter because the ndarray assigned to out is the return type.
  4. kwargs_read TypedDict for parameters after out: see notes below.
  5. Use keyword names in calls to methods to help the static type checkers.
  6. In Generator type, remove the optional None values.
  7. In read and blocks methods, add identifier, sound, to help the static type checkers.
  8. In each overload definition, the placement of the line breaks might make it easier to see the patterns.
  9. out overload is first because it's different, but maybe it should be last.
  10. Then overload definitions sorted by the order of the parameters, dtype and always_2d, alphabetical in dtype.
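
To make items 3 and 10 concrete, here is a minimal sketch of the pattern, not the code in this PR. `Float32Frames`, `Float64Frames`, `T_frames`, and this toy `read` are hypothetical stand-ins: the PR annotates numpy arrays, and the real `read` has many more parameters.

```python
from typing import Literal, TypeVar, overload


class Float32Frames(list[float]):
    """Hypothetical stand-in for a float32 ndarray."""


class Float64Frames(list[float]):
    """Hypothetical stand-in for a float64 ndarray."""


# Plays the role of T_ndarray: whatever is passed as `out` is what comes back.
T_frames = TypeVar("T_frames", Float32Frames, Float64Frames)


@overload
def read(frames: int, dtype: str = ..., *, out: T_frames) -> T_frames: ...
@overload
def read(frames: int, dtype: Literal["float32"], out: None = None) -> Float32Frames: ...
@overload
def read(frames: int, dtype: Literal["float64"] = ..., out: None = None) -> Float64Frames: ...


def read(frames, dtype="float64", out=None):
    if out is not None:
        out[:] = [0.0] * len(out)  # fill the caller's buffer in place
        return out
    cls = Float32Frames if dtype == "float32" else Float64Frames
    return cls([0.0] * frames)


a = read(4, "float32")                       # checker infers Float32Frames
b = read(4)                                  # checker infers Float64Frames
c = read(2, out=Float32Frames([1.0, 2.0]))   # checker infers Float32Frames
```

Each literal dtype gets its own overload, so the checker can pick the return type from the argument; the `out` overload ignores dtype and returns the type of the buffer it was given.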

Unpack

If **kwargs: Unpack... were in the primary definition, the parameters in the typed dict would be keyword-only, of course. In the overload definitions, if a user wants to ensure they have the benefit of a static type checker, then they will need to treat some of the parameters in most of the overload definitions as keyword-only, including those in Unpack.

I didn't add Unpack to the overload definition that contains the default values for dtype, always_2d, and out. Therefore, for the default case, for the static type checkers, all parameters are positional-or-keyword. After explaining my "reason", the idea sounds pretty stupid.

@bastibe

bastibe commented Jun 13, 2026

Owner

I'm still thinking about this issue, and am still very torn about it.

Could you demonstrate how this changes your IDE behavior, or point me to a video where some other project shows similar behavior?

@hunterhogan

Contributor Author

> Could you demonstrate how this changes your IDE behavior

Yes, I happen to be refactoring my universal readAudioFile function right now. The following demonstrates there is a difference, but the type annotations in this function (and the ecosystem it supports) are currently kicking my butt. Therefore, I don't have the confidence to claim that my PR is perfect. (omg, my response is so annoying that I am annoying myself. 🤦)

In case these are helpful, the screenshots show files in these exact commits:

Current overloads

image

Giga overloads

image

@hunterhogan

Contributor Author

Wait, wait, a clearer demonstration!

The above screenshots are absurdly complex.

But in this example, I set a static value for the dtype parameter, SoundFile.read(dtype='float32', always_2d=True), and because of that, the exact shape AND exact dtype can propagate through the different functions.

image

Because the type checker's inferred return is truncated, ^^^ I put the full inference in the comment.

@bastibe

bastibe commented Jun 17, 2026

Owner

Thank you very much for providing these examples! This helps me understand what this is trying to achieve.

In your first example, the return type of read is too vague, such that the subsequent call to resampleWaveform cannot be type-checked. I get that that's annoying. But please help me understand: in the line prior to read, you explicitly annotate the samplerate as int. Would your typing issue be similarly solved if you manually annotated the waveform as a 2D NDArray? Why did you annotate the samplerate, but not the waveform?

Just to be clear, this is not a criticism of your examples. I merely want to understand how this stuff is used in practice.

@hunterhogan

Contributor Author

It's cool to ask questions. I have discovered that I have no idea how other people use my code--well, would hypothetically use my code if it were 1% as popular as soundfile. 🤣

> Why did you annotate the samplerate, but not the waveform?

Contemporary Python language servers (lsp) will optionally display the implicit type that their static type checker has computed. In this screenshot, those are the light grey annotations.

image

Partially due to my lack of professional training and partially due to my critical-thinking habits, I almost always add an explicit annotation. (Exception: during development, if I want to see what the type checker is "thinking", I remove explicit annotations.) In the screenshots above, I removed the annotation so you could see what the type checker was computing. My normal annotation:

image

Note that after I added an explicit annotation to the identifier in one location, the lsp stopped displaying all implicit annotations.

> In your first example, the return type of read is too vague, such that the subsequent call to resampleWaveform cannot be type-checked.

That's not completely accurate. The type definitions in soundfile are not vague. Objectively, they are not vague, and compared to the Python ecosystem, the current annotations are in the top 1%.

The "problems" arise because soundfile is typed very well, resampy stubs are typed very well, and that tells the type checker that some soundfile return types are incompatible with some resampy input types.
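
A toy version of that mismatch, using hypothetical stand-ins: `read_any`, `resample_float32`, and the two frame classes are not soundfile or resampy names, and plain lists stand in for ndarrays.

```python
from typing import Union


class Float32Frames(list[float]):
    """Hypothetical stand-in for a float32 ndarray."""


class Int16Frames(list[int]):
    """Hypothetical stand-in for an int16 ndarray."""


def read_any(dtype: str) -> Union[Float32Frames, Int16Frames]:
    # dtype is a plain str here, so the checker cannot tell which branch runs.
    if dtype == "int16":
        return Int16Frames([0, 0])
    return Float32Frames([0.0, 0.0])


def resample_float32(frames: Float32Frames) -> Float32Frames:
    # Toy "resampler": repeat every sample once.
    return Float32Frames(x for x in frames for _ in range(2))


frames = read_any("float32")
# resample_float32(frames)  # a checker reports this: frames may be Int16Frames

if isinstance(frames, Float32Frames):
    resampled = resample_float32(frames)  # fine: isinstance narrowed the union
```

In this sketch, writing `frames: Float32Frames = read_any("float32")` would not help, because assigning the union to the narrower type is itself reported. The caller needs `isinstance`, `typing.cast`, or a more precise return type from the reader.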

Programmatically, soundfile has a pretty limited range of dtypes. Now that the type annotations match the internal logic, type checkers can see when there are potential mismatches--which is the point of the type checker, right?

soundfile is unique, however, because we can tie an input parameter to an output dtype. With the rise of Python typing, apps tend to use a class as the parameter (instead of a string) or use TypeVar to infer the output dtype. But soundfile predates both of those capabilities by at least 8 British Prime Ministers, right?
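
For contrast, the class-as-parameter style mentioned above can be sketched like this. `read_as` is a hypothetical function, not part of soundfile; it only shows how a `TypeVar` carries the argument's type to the return type without any overloads.

```python
from typing import TypeVar

T = TypeVar("T", int, float)


def read_as(sample_type: type[T], frames: int) -> list[T]:
    # Hypothetical API: the class passed in is also the element type returned.
    return [sample_type(0) for _ in range(frames)]


ints = read_as(int, 3)      # checker infers list[int]
floats = read_as(float, 2)  # checker infers list[float]
```

With a string parameter such as `dtype='float32'` there is no class for the `TypeVar` to bind to, which is presumably why a string-keyed API needs one `Literal` overload per dtype instead.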

The great news, however, is that it is completely possible for soundfile to precisely type the return value using the current parameters. In my limited experience, that is unique for apps that have been around for more than five years.

The current typing for soundfile really is excellent: top 1% is not hyperbole.

I rely on hyper-precise annotations, however, so I type everything to death. The folks at typeshed can tell you some war stories about me obsessing over silly things like ast.Constant.


2 participants