Stake with Besu After The Merge With Improved Bonsai Tries State Storage

We are excited to be a part of the future of Ethereum on Proof of Stake (PoS) and are actively working on new features to make participating in PoS easier, more efficient, and more accessible. The Hyperledger Besu team recently brought the state storage paradigm, Bonsai Tries, out of early access and is thrilled to share that Bonsai Tries State Storage is now available for general use, alongside support for the Merge.
The results during early access have been successful and we’re excited to continue to get Bonsai Tries State Storage into the hands of developers. We believe Bonsai Tries State Storage will be a great reason to use Besu as part of staking and validator stacks: it keeps storage requirements low, state growth in check, and near-head performance fast.
This blog is the first part of a series we will be publishing about performance and feature upgrades that make Besu a top-tier choice for small and institutional stakers alike.
How We’ve Improved Bonsai Tries
Until now, Besu’s Ethereum world state storage strategy has been described as a ‘Forest of Tries’ or forest mode. The underlying data store closely mirrors the Merkle trie data structure, with each node in the trie saved in a key-value store by hash. With each new block, the world state is updated with new leaves, nodes, and a new state root. Old leaves and nodes remain in the underlying data store. As the state grows and changes, this strategy leads to two challenges:
- Traversing the trie structures becomes slower and slower
- The footprint on-disk grows substantially
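To make the second challenge concrete, here is a minimal sketch of hash-keyed node storage. It is a toy two-level structure (leaves plus one root), not Besu's actual Merkle Patricia trie, and the names `put_node` and `commit` are hypothetical. It only illustrates that each new block adds nodes while the old ones stay in the store.

```python
import hashlib


def put_node(db, payload):
    """Save a node in the key-value store under the hash of its content."""
    key = hashlib.sha256(payload).hexdigest()
    db[key] = payload
    return key


def commit(db, accounts):
    """Store one leaf per account plus a root node listing the leaf hashes.

    Toy model only: a real trie has many intermediate branch nodes.
    """
    leaf_hashes = [
        put_node(db, f"{address}:{balance}".encode())
        for address, balance in sorted(accounts.items())
    ]
    return put_node(db, ",".join(leaf_hashes).encode())


db = {}
root1 = commit(db, {"alice": 10, "bob": 5})  # writes 2 leaves and 1 root
root2 = commit(db, {"alice": 9, "bob": 5})   # writes 1 new leaf and 1 new root

# The unchanged "bob" leaf is shared, but the old "alice" leaf and the old
# root are never removed, so the store only grows.
print(len(db), root1 in db)
```

In this model nothing is ever deleted, which is why a forest-style store needs explicit pruning to stay small.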
Bonsai Tries has been improved to address these challenges, following a set of three philosophies:
- Leaf Focused
- Well Manicured Branches
- Focused on One Point in Time
This improved approach results in reduced storage requirements and fast read access for accounts. For storage, the numbers speak for themselves. We’ve observed the following:
- Full archive – ~1.14TB
- Freshly synced node (Fast or SnapSync) – ~617GB
- One month old node (Fast or SnapSync) – ~625GB
For stakers, this means less state bloat and less upkeep of your staking infrastructure over time. The Hyperledger Besu team is proud to share how slowly the state database grows as changes are made to the blockchain. This is due to the trie-log layer keeping only the differences in state, and to the implicit pruning that comes with this approach. For more background on the Bonsai Tries philosophy, see our first article here.
Trade-off of Using Bonsai Tries Versus Other Data Storage Options
By making use of the trie-log to keep track of the state tree, we can reduce the state size. However, this increases the time it takes to traverse backwards into the chain, because the diff layers must be applied to recreate the older state.
For account and historical reads, there is a mechanism that checks whether the given account or block is in the flat Bonsai Trie database:
- If it’s not, the trie is checked and the result is put in the flat database. This first read requires a longer O(n) traversal; the next read will be O(1).
- If it is, the item is retrieved directly in O(1).
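The lookup mechanism described above can be sketched as a read-through cache. This is an illustration under assumptions, not Besu's implementation: the class `FlatReadThrough`, its `flat` dictionary, and the `trie_lookup` callable are all hypothetical names standing in for the flat database and the slower trie traversal.

```python
class FlatReadThrough:
    """A flat key-value map placed in front of a slower trie lookup."""

    def __init__(self, trie_lookup):
        self.flat = {}                  # hypothetical flat database
        self.trie_lookup = trie_lookup  # hypothetical slow path through the trie
        self.trie_reads = 0             # counts how often the slow path is used

    def get(self, key):
        # Fast path: the item is already in the flat database.
        if key in self.flat:
            return self.flat[key]
        # Slow path: walk the trie once, then remember the result.
        self.trie_reads += 1
        value = self.trie_lookup(key)
        if value is not None:
            self.flat[key] = value
        return value


# A plain dict stands in for the trie in this sketch.
store = FlatReadThrough({"0xabc": 100}.get)
first = store.get("0xabc")   # goes to the trie and fills the flat database
second = store.get("0xabc")  # served from the flat database
print(first, second, store.trie_reads)
```

Only keys that are actually requested end up in `flat`, which mirrors the point that unused data never takes up space in the flat database.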
This has two consequences.
First, some reads may have the same performance as the old forest mode, but performance improves as data is accessed. After a SnapSync or FastSync, if specific data is needed, it is fetched and put into the flat database for the next time it’s accessed. Data that is unused will not be in the flat database, which reduces the size of the state database even more.
Second, read speed will be fast for accounts and blocks within the database. Writing to the database will require a full O(n) traversal. Bonsai Tries will remain fast for operations near the head of the chain. Forest mode will still be convenient for an archive node that must access data from the genesis block to the head, or for any node that needs to access data far in the past. Bonsai Tries is better for everything else. In validator setups operating near-head, Bonsai Tries is well suited to keeping storage requirements low and block production fast.
If you are familiar with the snapshot implementation in Go-Ethereum (Geth), the comparison is:
- Geth’s snapshots are taken at a fixed point-in-time, with a diff log to construct current values
- Besu’s Bonsai snapshots are for the current point-in-time, with a diff log to construct historical values
This means Bonsai Tries will be fastest at providing the ‘head’ state and will get progressively slower at reproducing historical states, the further back from the head you look. Conversely, Geth snapshots will more readily produce historical states and read performance gets more expensive as the distance from the snapshot to head increases. Besu is also implicitly pruned by this approach. Go-Ethereum snapshots are also supplemental to the existing trie strategy, and result in additional storage costs rather than storage benefits.
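As a rough sketch of why historical reads get slower with distance from the head, the function below rebuilds an older state by undoing diffs one block at a time. The data layout is an assumption made for illustration: each hypothetical trie-log entry maps a key to an `(old, new)` pair, and `state_at` is not a Besu API.

```python
def state_at(head_state, trie_logs, blocks_back):
    """Rebuild the state `blocks_back` blocks behind the head.

    trie_logs[-1] is the diff for the head block. Each diff maps a key to
    an (old, new) pair, where old is None if the key did not exist before.
    """
    if blocks_back > len(trie_logs):
        raise ValueError("not enough diff layers retained")
    state = dict(head_state)
    # Undo the newest diffs first, one block at a time.
    for diff in reversed(trie_logs[len(trie_logs) - blocks_back:]):
        for key, (old, _new) in diff.items():
            if old is None:
                state.pop(key, None)  # the key was created in this block
            else:
                state[key] = old      # restore the previous value
    return state


head = {"a": 3, "b": 7}
logs = [
    {"a": (None, 1)},               # block 1 creates a
    {"a": (1, 2), "b": (None, 7)},  # block 2 updates a and creates b
    {"a": (2, 3)},                  # block 3 (head) updates a
]
print(state_at(head, logs, 0))  # head state, no diffs applied
print(state_at(head, logs, 2))  # state two blocks back
```

The work grows with `blocks_back`, while reading the head costs nothing extra. The `ValueError` branch stands in for the idea of a limit on how many layers back a node is willing to go.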
By default, Bonsai Tries is configured to traverse around 512 blocks back from the head. This option can be changed here, and the right value will depend on the strength of your machine. The good news is there is no unsafe value here. See the --bonsai-maximum-back-layers-to-load option.
Get Started with Besu And Bonsai Tries For Testing Staking and More
We hope you enjoy using Bonsai Tries and get value from using this improved feature in your validator setup. The low storage requirements and quick account and block lookups will be valuable in Proof of Stake at the execution layer and for newly possible low-cost setups like ARM and Raspberry Pi. Plus, with SnapSync now supported in Hyperledger Besu, you can sync the world state and get started faster. Learn more about Besu here and its documentation here.
You can give us feedback on new features like Bonsai Tries in our Discord channel, and tell us about any new features you’d like us to build for you.
We have a lot more news coming out about how Besu can improve the validator and staking UX, so stay tuned for more updates! The Ethereum community thanks you for supporting client diversity by using Hyperledger Besu in your staking infrastructure.