Mempool Data Program
This program makes Blocknative's Mempool Data Archive available for the benefit of the community.
Blocknative actively maintains the most comprehensive historical dataset of mempool transaction events within the Ethereum ecosystem. This collection encompasses, as of August 29, 2023, >15 TB of archive data representing >5 billion transaction detection events since November 1st, 2019.
This uninterrupted dataset covers major scenarios the network has encountered over the years, including massive surges in traffic, huge gas spikes, bidding wars, the launch of MEV-boost, the price of ETH collapsing, EIP-1559, Black Thursday, and major hacks.
This data covers 27 data fields, such as gas details, input data, time pending in the mempool, failure reasons, and regional timestamps for each instance seen by our global network of nodes.
The Mempool Archive set is available for research and may be freely used for non-commercial purposes. Here are the steps to receive access to the dataset:
Blocknative logs all mempool transactions from nodes in multiple geographical regions for the Ethereum mainnet blockchain. The Archive contains historic events for all transactions:
- entering the mempool
- denied entry into the mempool (rejection with reason)
- exiting the mempool (eviction with reason)
- replacing existing mempool transaction (speedup or cancel)
- finalized on chain (confirmed or failed)
The number of times a transaction appears in the Archive corresponds to the number of status changes it undergoes. The detecttime field indicates the time when the status change was first observed.
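Because each status change is its own row, a transaction's history can be rebuilt by grouping rows on the transaction hash and ordering them by detecttime. The sketch below is illustrative only: the column name hash and the sample status values are assumptions, and it assumes detecttime is a fixed-width timestamp string, so that plain string sorting matches chronological order.

```python
from collections import defaultdict


def lifecycle_by_hash(rows):
    """Group archive rows by transaction and order each group by detecttime.

    `rows` is an iterable of dicts, one per archive row. The key "hash" is an
    assumed column name; "detecttime" and "status" come from the text above.
    """
    history = defaultdict(list)
    for row in rows:
        history[row["hash"]].append((row["detecttime"], row["status"]))
    # Sorting the (detecttime, status) tuples orders events chronologically,
    # provided the timestamps are fixed-width strings (an assumption).
    return {tx_hash: sorted(events) for tx_hash, events in history.items()}


# Hypothetical sample rows, not real archive data.
sample = [
    {"hash": "0xaaa", "detecttime": "2023-01-01T00:00:05.000Z", "status": "confirmed"},
    {"hash": "0xaaa", "detecttime": "2023-01-01T00:00:01.000Z", "status": "pending"},
    {"hash": "0xbbb", "detecttime": "2023-01-01T00:00:02.000Z", "status": "rejected"},
]
timeline = lifecycle_by_hash(sample)
```

Here the length of timeline["0xaaa"] is the number of status changes recorded for that transaction.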
Below you can find the complete schema for the data:
The Mempool Data Program can be used to research:
- Historic gas trends
- Transaction inclusion
- Private transactions
- Bug fixes
- Block sequences of interest
- Trading strategies
- Third-party strategies
- Malicious activity
- Probes for potential exploitation
What attribution must I provide when using the Blocknative Mempool Data?
The archive is publicly available according to open data standards, and its datasets are licensed under the Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International license.
2.1 Attribution: End Users must give appropriate credit, provide a link to the license, and indicate if changes were made. End Users may do so in any reasonable manner, but not in any way that suggests the licensor endorses End Users or their use.
2.2 NonCommercial: End Users may not use the material for commercial purposes.
2.3 ShareAlike: If End Users remix, transform, or build upon the material, End Users must distribute their contributions under the same license as the original.
Please use the following as a guideline for attribution:
What format is the data?
The data is stored in hourly slices with the file format *.csv.gz. The data is tab delimited.
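A gzip-compressed, tab-delimited slice can be read with the Python standard library alone. This is a minimal sketch, not an official loader: the function name read_slice is hypothetical, and it assumes each file starts with a header row naming the columns and is UTF-8 encoded.

```python
import csv
import gzip
import io


def read_slice(source):
    """Yield one dict per row from a gzip-compressed, tab-delimited slice.

    `source` may be a file path or a binary file object. Assumes the first
    line is a header row with column names (an assumption, not confirmed).
    """
    with gzip.open(source, mode="rt", encoding="utf-8", newline="") as fh:
        yield from csv.DictReader(fh, delimiter="\t")


# Build a tiny in-memory slice with made-up values to demonstrate usage.
buffer = io.BytesIO()
with gzip.open(buffer, mode="wt", encoding="utf-8", newline="") as out:
    out.write("detecttime\tstatus\ttimepending\n")
    out.write("2023-01-01T00:00:01.000Z\tpending\t0\n")
    out.write("2023-01-01T00:00:05.000Z\tconfirmed\t4000\n")
buffer.seek(0)

rows = list(read_slice(buffer))
```

Every value arrives as a string, so numeric fields such as timepending need converting before comparison. Streaming the rows through a generator avoids loading a whole hourly slice into memory at once.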
How many nodes are gathering mempool data?
We run highly redundant node infrastructure in each region to ensure strong uptime.
How can I identify on-chain transactions?
On-chain transactions have a status of confirmed:
WHERE status = 'confirmed'
How can I identify private transactions?
A private transaction does not have a
timepending is determined from the difference between a transaction's
WHERE timepending = 0
AND status = 'confirmed'
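The same filter can be applied to rows already loaded into Python. The sketch below mirrors the two conditions above. It assumes rows are dicts of strings, as a CSV reader would produce, and it treats an empty or non-numeric timepending as not matching, which is a choice made for this example rather than documented behavior.

```python
def is_private(row):
    """Return True when a row matches timepending = 0 AND status = 'confirmed'."""
    if row.get("status") != "confirmed":
        return False
    try:
        return float(row.get("timepending", "")) == 0
    except (TypeError, ValueError):
        # Missing or non-numeric timepending: treated as no match (assumption).
        return False


# Hypothetical sample rows, not real archive data.
sample = [
    {"status": "confirmed", "timepending": "0"},
    {"status": "confirmed", "timepending": "12500"},
    {"status": "pending", "timepending": "0"},
    {"status": "confirmed", "timepending": ""},
]
private_rows = [row for row in sample if is_private(row)]
```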
What is the difference between a dropped and a rejected transaction?
A dropped transaction might have been valid but deemed less important or lower-priority. A rejected transaction is one that is fundamentally flawed or invalid according to the Ethereum protocol rules.
A drop reason could be that there isn't enough ETH in the EOA to cover gas fees. A rejection reason could be incorrect transaction signatures. Dropped transactions existed in the mempool, but are dropped to make room for incoming transactions. Rejected transactions never make it to the mempool.