Upgrade reth 1.9.3 → 1.11.3#927
Conversation
|
Claude finished @ZanCorDX's task in 4m 14s —— View job Review: PR #927 — Upgrade reth 1.9.3 → 1.11.3
The migration is well-structured and the Main concern — provider opens moved into hot loops (perf)The shift from a single
Opening a history provider is meaningfully heavier than an Other items
Minor / notes
No blocking correctness or security issues found in the reviewed paths. The perf concern in the hot loops is the one I'd most want validated before merge. |
| let order_id = task.order.id(); | ||
| let start_time = Instant::now(); | ||
| let state_provider = match provider | ||
| .history_by_block_hash(current_sim_context.block_ctx.attributes.parent) |
There was a problem hiding this comment.
Perf regression in the sim hot loop. Previously the state provider was opened once per sim context (outside the inner while let loop) and cheaply Arc-cloned per order. Now history_by_block_hash is called for every order task. Opening a history provider walks reth's state/overlay setup and is materially more expensive than the old Arc::clone, so this adds per-order latency and DB pressure on the simulation path.
Since CachedDB now owns a non-cloneable StateProviderBox, the provider can't simply be hoisted — but it's worth confirming the per-order open cost is acceptable (benchmark mev_boost/sim throughput), or reworking so the provider is opened once per block and the per-order CachedDB borrows it.
There was a problem hiding this comment.
It's ~0.4 µs but I can fix it here
| fn initialize_block_state( | ||
| &mut self, | ||
| state_provider: Arc<dyn StateProvider>, | ||
| ) -> BlockState<CachedDB> { | ||
| let cached = CachedDB::new(state_provider, self.ctx.shared_cached_reads.clone()); | ||
| BlockState::new(cached) | ||
| source: StateProviderSource, | ||
| ) -> ProviderResult<BlockState<CachedDB>> { | ||
| let cached = CachedDB::new( | ||
| source.state_provider()?, | ||
| self.ctx.shared_cached_reads.clone(), | ||
| ); |
There was a problem hiding this comment.
Perf regression in the conflict-resolution hot loop. initialize_block_state now opens a fresh provider via source.state_provider()? on every sequence attempt: resolve_conflict_task loops over sequence_to_try (multiple per task) × many tasks per thread. Before this change the provider was opened once per worker thread in conflict_resolving_pool::start and Arc-cloned per task. The existing // @todo actually reuse it for the duration of the block (line 130) is now much more costly than before. Worth measuring the parallel-builder throughput impact, or reusing one provider per thread/block.
There was a problem hiding this comment.
parallel builder is not working anyway
| .get_or_init(|| { | ||
| let config = match tokio::runtime::Handle::try_current() { | ||
| Ok(handle) => RuntimeConfig::with_existing_handle(handle), | ||
| Err(_) => RuntimeConfig::default(), | ||
| }; | ||
| // Invariant: building the process-wide task runtime is a startup operation. If it | ||
| // fails the builder cannot compute state roots or open the provider factory, so there | ||
| // is no meaningful way to continue. | ||
| RuntimeBuilder::new(config) | ||
| .build() |
There was a problem hiding this comment.
The process-wide OnceLock captures the tokio Handle from whatever context makes the first call, and that decision is permanent. Two consequences worth checking:
- If the first caller runs inside a short-lived runtime (e.g. a
#[tokio::test]), the cachedRuntimeholds a handle to a tokio runtime that is dropped at the end of that test. Any later use of the global runtime (another test, ortest_reth_dbwhich is a plain#[test]) then operates on a dead handle — a plausible source of order-dependent test flakiness, since test execution order within a binary isn't deterministic. - First-call-wins also means the standalone-vs-attached choice can be "wrong" for later callers (attached when the first caller happened to be outside a runtime, or vice versa).
In the production single-runtime path this is fine; the risk is concentrated in the test suite. Consider not attaching to the ambient handle (always build standalone), or documenting that the first call must come from the long-lived runtime.
| fn unpack_nibbles<'a>(nibbles: &Nibbles, buf: &'a mut [u8; 64]) -> &'a [u8] { | ||
| let len = nibbles.len(); | ||
| for (i, nibble) in nibbles.iter().enumerate() { | ||
| buf[i] = nibble; | ||
| } | ||
| &buf[..len] | ||
| } |
There was a problem hiding this comment.
The fixed [0u8; 64] buffer indexes buf[i] without a bound check, so a Nibbles longer than 64 panics with an index-out-of-bounds. 64 is the correct max for account/storage trie keys (32-byte keccak = 64 nibbles), so the invariant holds today — but a debug_assert!(nibbles.len() <= 64) would make the invariant explicit and fail loudly in tests rather than relying on the implicit panic. Same applies to the identical helper in v_experimental/trie/mod.rs.
| + 'static, | ||
| { | ||
| let overlay = OverlayStateProviderFactory::new(provider); | ||
| let overlay = OverlayStateProviderFactory::new(provider, ChangesetCache::new()); |
There was a problem hiding this comment.
A fresh ChangesetCache::new() is allocated on every root-hash computation. If reth intends this cache to be reused across blocks/computations (to amortize changeset reads), a per-call cache provides no benefit. If a per-computation cache is the intended semantics here, ignore — but worth confirming against how reth's node wires OverlayStateProviderFactory.
|
Claude finished @ZanCorDX's task in 1m 15s —— View job Re-review: PR #927 — Upgrade reth 1.9.3 → 1.11.3
New since last review —
|
|
Claude finished @ZanCorDX's task in 1m 20s —— View job Re-review: PR #927 — Upgrade reth 1.9.3 → 1.11.3
New since last review —
|
|
Claude finished @ZanCorDX's task in 56s —— View job Re-review: PR #927 — Upgrade reth 1.9.3 → 1.11.3
New since last review —
|
| rw: bool, | ||
| root_hash_config: Option<RootHashContext>, | ||
| ) -> eyre::Result<ProviderFactoryReopener<NodeTypesWithDBAdapter<EthereumNode, Arc<DatabaseEnv>>>> { | ||
| tracing::info!(?reth_datadir, "AAAAAAAAAAAA!!!!!!!!!"); |
There was a problem hiding this comment.
Leftover debug log — should be removed before merge.
| tracing::info!(?reth_datadir, "AAAAAAAAAAAA!!!!!!!!!"); |
| let a = reth_datadir.unwrap_or("NADA".into()); | ||
| tracing::info!(?reth_rocksdb_path, ?a, "Using RockDB path"); |
There was a problem hiding this comment.
Leftover debug log with a "NADA" placeholder. This also consumes reth_datadir by value (unwrap_or), forcing the three preceding matches to use .clone(). Removing these two lines lets the final match consume reth_datadir directly and drops the added clones.
| let a = reth_datadir.unwrap_or("NADA".into()); | |
| tracing::info!(?reth_rocksdb_path, ?a, "Using RockDB path"); |
|
Claude finished @ZanCorDX's task in 40s —— View job Re-review: PR #927 — Upgrade reth 1.9.3 → 1.11.3
New since last review —
|
| let nibble = p[0]; | ||
| let vec = p.as_mut_vec_unchecked(); | ||
| vec.remove(0); | ||
| let nibble = p.get_unchecked(0); |
There was a problem hiding this comment.
as safe as before..... p[0];
|
|
||
| fn pad_path(mut path: Nibbles) -> B256 { | ||
| path.as_mut_vec_unchecked().resize(64, 0); | ||
| while path.len() < 64 { |
| let mut account_subtree = Vec::with_capacity(reth_proof.account_subtree.len()); | ||
| for (k, v) in reth_proof.account_subtree.into_inner() { | ||
| account_subtree.push((convert_reth_nybbles_to_nibbles(k), v)); | ||
| account_subtree.push((k, v)); |
There was a problem hiding this comment.
So it's different nibles now?
There was a problem hiding this comment.
Upgraded Nibbles so we don't have 2 versions as before.
| impl NodeCursor { | ||
| pub fn new(key: Nibbles, head: u64) -> Self { | ||
| let current_path = Nibbles::with_capacity(key.len()); | ||
| let current_path = Nibbles::new(); |
There was a problem hiding this comment.
new Nibbles.... already have the 256 allocated , no "capacity"
| .other_child_nibble(n) | ||
| .expect("other child must exist"); | ||
| if branch.get_diff_child(other_child_nibble).is_none() { | ||
| let mut other_child_path = c.current_path.clone(); |
There was a problem hiding this comment.
Are we okay that logic differs?
There was a problem hiding this comment.
New nibble is Copy
| .other_child_nibble(n) | ||
| .expect("other child must exist"); | ||
| if branch.get_diff_child(other_child_nibble).is_none() { | ||
| let mut other_child_path = c.current_path.clone(); |
There was a problem hiding this comment.
Are we okay that logic differs?
There was a problem hiding this comment.
Why you say that? Both update the last nibble
| .as_mut_vec_unchecked() | ||
| .insert(0, *child_nibble); | ||
| let leaf_key = leaf_below.key_mut(); | ||
| *leaf_key = |
There was a problem hiding this comment.
This looks like slowdown too?
There was a problem hiding this comment.
New Nibbles new code.
Previous was insert(0) so it was also O(N)
| DiffTrieNodeKind::Extension(ext) => { | ||
| ext.key_mut().as_mut_vec_unchecked().insert(0, child_nibble); | ||
| let ext_key = ext.key_mut(); | ||
| *ext_key = Nibbles::from_nibbles_unchecked([child_nibble]).join(ext_key); |
|
|
||
| fn pad_path(mut path: Nibbles) -> B256 { | ||
| path.as_mut_vec_unchecked().resize(64, 0); | ||
| while path.len() < 64 { |
| #[inline] | ||
| fn nibbles_to_buf<'a>(nibbles: &Nibbles, buf: &'a mut [u8; 64]) -> &'a [u8] { | ||
| for (i, nibble) in nibbles.iter().enumerate() { | ||
| buf[i] = nibble; |
There was a problem hiding this comment.
New nibbles are packed 2 per byte (real nibbles!) so there is no direct slice access.
| let orphan_ptr = orphan_ptr.unwrap(); | ||
| if orphan_ptr.is_remote() { | ||
| let mut orphan_path = Nibbles::with_capacity(path_walked); | ||
| let mut orphan_path = Nibbles::new(); |
| // Random input may repeat the same key; collapse to the last (value, new_value) per key so | ||
| // the helper models last-write-wins. Otherwise a key appearing as insert-then-delete would | ||
| // double-delete (an error) and make the reference trie inconsistent with the updated trie. | ||
| let deduped: Vec<(Vec<u8>, Vec<u8>, Option<Vec<u8>>)> = { |
There was a problem hiding this comment.
In the original PR I took this code from (some russian) the tests were working with 5K random tests but started to fail at 50K.
| let best_block = state.update_with_new_block(block.clone()); | ||
| let best_block = state | ||
| .update_with_new_block(block.try_clone().unwrap()) | ||
| .unwrap(); |
There was a problem hiding this comment.
these are tests, there is a non test unwrap but is valid.
|
Claude finished @ZanCorDX's task in 42s —— View job Re-review: PR #927 — Upgrade reth 1.9.3 → 1.11.3
New since last review —
|
📝 Summary
Bumps reth
27a8c0f→d6324d6(1.11.3) and the coupled major bumps: revm 31→34,alloy-evm 0.23→0.27, alloy 1.x→1.6, alloy-trie 0.8→0.9, revm-inspectors→0.34.
Adapts rbuilder to the new reth/revm APIs and removes the
unsafethat reth'strait-object
Syncchange would otherwise force.Dependencies
reth-*→ revd6324d6; revm 34, revm-inspectors 0.34.2, alloy-evm 0.27.2,alloy 1.6.3, alloy-primitives 1.5.6, alloy-trie 0.9.4.
time = 0.3.47(0.3.48 has an E0119 coherence bug); MSRV 1.85 → 1.88.reth-tasksin rbuilder-utils repointed to the same rev (was stale).Key adaptations
Provider Send/Sync . reth 1.11 dropped
SyncfromStateProviderBox(
Box<dyn StateProvider + Send>). The unit shared acrossbuilding threads is now
StateProviderSource(factory + parent hash,Send+Sync+Clone);each consumer opens its own provider on demand.
CachedDBholds an already-open provider;the block-building helper clones itself via an explicit, fallible
try_clonethat reopensthe provider. This is the reth way.
revm 34.
Database::Erroris nowEvmDatabaseError<ProviderError>;CfgEnv::with_spec→with_spec_and_mainnet_gas_params(sets Osaka's per-tx gas cap,EIP-7825).
order_commitmaps the new error shape.Root hash.
OverlayStateProviderFactory::new(factory, ChangesetCache),ParallelStateRoot::new(.., Runtime); provider bounds gainChangeSetReader + StorageChangeSetReader + DBProvider + StorageSettingsCache.Transparently supports reth storage v1 and v2 (reth reads the mode from the DB;
rbuilder only forwards the bound).
Provider construction.
ProviderFactory::newnow takes a RocksDBProvider + Runtimeand returns
Result; it reads storage settings on open. Adds a process-wide task runtime.eth-sparse-mpt. Unified onto nybbles 0.4 (matching alloy-trie 0.9 / reth): the crate's
Nibblesis now the same packed type reth's trie nodes use, so the conversion shims betweenthe two representations are gone — encode/decode pass
Nibblesstraight into alloy-trie nodetypes, removing per-node conversions in the root-hash hot path. Single nybbles version
across the workspace.
🤖 Generated with Claude Code with me kicking his ass to get it right....
💡 Motivation and Context
I like prime numbers so I switched 9 for 11.
✅ I have completed the following steps:
make lintmake test