Pushdown some expressions to Dict layout reader#8341

Open: myrrc wants to merge 1 commit into develop from myrrc/pushdown-dict
myrrc (Contributor) commented Jun 10, 2026

When we access the values of the Dict layout reader, it canonicalizes them and
stores them in a SharedArray. This means we always pay the cost of
canonicalization, which in turn means we can't do #8310.

To solve this, we need to apply some expressions to the values array before
canonicalizing it. However, we can't push down arbitrary expressions, because
for some of them it is more beneficial to apply them over the canonicalized
array.

One example is LIKE over a Dict array that uses only a few codes: applying LIKE
to the whole values array is not beneficial, since it also evaluates the
pattern on values that no code references.
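
The trade-off can be made concrete with a toy model. The sketch below is not
Vortex code: `DictColumn` and both helper functions are hypothetical names
introduced only for illustration. It counts how many predicate evaluations each
strategy would need, assuming the alternative to pushdown only touches values
that some code references.

```rust
use std::collections::HashSet;

/// Toy dictionary-encoded column (hypothetical, not a Vortex type):
/// `codes[i]` is an index into `values`.
struct DictColumn {
    values: Vec<String>,
    codes: Vec<u32>,
}

/// Predicate evaluations needed if the predicate is pushed down
/// to the whole values array.
fn evals_on_all_values(col: &DictColumn) -> usize {
    col.values.len()
}

/// Predicate evaluations needed if only values referenced by at
/// least one code are evaluated.
fn evals_on_referenced_values(col: &DictColumn) -> usize {
    col.codes.iter().copied().collect::<HashSet<u32>>().len()
}
```

With 1000 dictionary values and codes `[3, 3, 7, 3, 7]`, pushdown costs 1000
evaluations of an expensive predicate such as LIKE, while only 2 distinct
values are ever referenced.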

This PR adds a hardcoded internal is_negative_cost estimate for the expressions
that we want to push down before canonicalization. Good candidates are
expressions whose cost doesn't depend on the size of an individual input. For
example, for every string, len(string) doesn't read the string itself but only
its metadata, and is thus O(1) per input.
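
As a rough illustration of why len is independent of input size, here is a
minimal sketch of an offsets-based string array. `StringArray` and its methods
are hypothetical and only resemble common variable-length binary layouts; this
is not the Vortex implementation, and "length" here means byte length.

```rust
/// Toy variable-length string array (hypothetical): all string bytes
/// live in one buffer, and `offsets` holds n + 1 boundaries into it.
#[allow(dead_code)]
struct StringArray {
    bytes: Vec<u8>,
    offsets: Vec<usize>,
}

impl StringArray {
    fn from_strs(items: &[&str]) -> Self {
        let mut bytes = Vec::new();
        let mut offsets = vec![0];
        for s in items {
            bytes.extend_from_slice(s.as_bytes());
            offsets.push(bytes.len());
        }
        StringArray { bytes, offsets }
    }

    /// Byte length of every string, computed from the offsets only.
    /// The `bytes` buffer is never read, so the cost per element is
    /// constant no matter how long each string is.
    fn lengths(&self) -> Vec<usize> {
        self.offsets.windows(2).map(|w| w[1] - w[0]).collect()
    }
}
```

For the inputs `"a"`, `""`, and `"hello"`, `lengths()` subtracts adjacent
offsets `[0, 1, 1, 6]` and never inspects the string contents.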

We also don't push down fallible (like cast) or null-sensitive (like IS NULL)
expressions, because we want errors to propagate at the call site rather than
upfront.
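
Put together, the eligibility rule might look roughly like the sketch below.
Only the name is_negative_cost comes from this PR; `ExprInfo`, its other
fields, the constants, and `can_push_before_canonicalization` are hypothetical,
and the per-expression flags (in particular treating cast as cheap) are
assumptions chosen for illustration rather than the actual classification.

```rust
/// Hypothetical per-expression properties used by the pushdown check.
#[allow(dead_code)]
struct ExprInfo {
    name: &'static str,
    /// Hardcoded estimate: cheaper to run before canonicalization.
    is_negative_cost: bool,
    /// May return an error for some inputs.
    fallible: bool,
    /// Result depends on which inputs are null.
    null_sensitive: bool,
}

// Flag values below are illustrative assumptions.
const LEN: ExprInfo = ExprInfo { name: "len", is_negative_cost: true, fallible: false, null_sensitive: false };
const LIKE: ExprInfo = ExprInfo { name: "like", is_negative_cost: false, fallible: false, null_sensitive: false };
const CAST: ExprInfo = ExprInfo { name: "cast", is_negative_cost: true, fallible: true, null_sensitive: false };
const IS_NULL: ExprInfo = ExprInfo { name: "is_null", is_negative_cost: true, fallible: false, null_sensitive: true };

/// An expression is pushed down only if it is estimated to be cheap
/// before canonicalization and is neither fallible nor null-sensitive.
fn can_push_before_canonicalization(e: &ExprInfo) -> bool {
    e.is_negative_cost && !e.fallible && !e.null_sensitive
}
```

Under these assumptions only len passes the check; like is rejected on cost,
cast because it is fallible, and is_null because it is null-sensitive.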

Signed-off-by: Mikhail Kot <mikhail@spiraldb.com>
myrrc requested a review from a team June 10, 2026 17:08
myrrc added the changelog/feature (A new feature) label Jun 10, 2026
myrrc requested a review from joseph-isaacs June 10, 2026 17:09

codspeed-hq Bot commented Jun 10, 2026

Merging this PR will degrade performance by 11.93%

⚠️ Unknown Walltime execution environment detected

Using the Walltime instrument on standard Hosted Runners will lead to inconsistent data.

For the most accurate results, we recommend using CodSpeed Macro Runners: bare-metal machines fine-tuned for performance measurement consistency.

❌ 1 regressed benchmark
✅ 1531 untouched benchmarks

Warning

Please fix the performance issues or acknowledge them on CodSpeed.

Performance Changes

| Mode | Benchmark | BASE | HEAD | Efficiency |
|------|-----------|------|------|------------|
| Simulation | bitwise_not_vortex_buffer_mut[128] | 215.3 ns | 244.4 ns | -11.93% |


Comparing myrrc/pushdown-dict (e64f2a2) with develop (3d7bbfb)
