By AUJay
Best Practices for Future-Proofing Rollup Proof Throughput
Summary: A 2025 playbook for decision‑makers to keep rollup proofs fast, cheap, and scalable amid Ethereum’s Pectra upgrade, rising blob capacity, new BLS12‑381 precompiles, decentralized prover networks, and rapidly improving proving systems.
Why this matters now
2025 materially changed the throughput and cost envelope for rollup proofs:
- Ethereum’s Pectra mainnet upgrade on May 7, 2025 doubled average blob capacity (EIP‑7691), raised the max per block to 9 blobs, and repriced calldata (EIP‑7623), shifting DA economics decisively toward blobs. (blog.ethereum.org)
- Pectra also shipped EIP‑2537 BLS12‑381 precompiles, lowering on‑chain verification costs for modern SNARKs/BLS signatures and giving teams a higher‑security curve than BN254 with competitive gas. (blog.ethereum.org)
- Optimistic rollups turned the corner on production proofs: Arbitrum’s BoLD went live on mainnet for permissionless validation, and OP Mainnet’s fault proofs reached Stage‑1 decentralization—raising real throughput ceilings without sacrificing safety. (theblock.co)
- ZK rollups and zkVMs improved prover speed by an order of magnitude and broadened decentralization options (e.g., Boojum’s consumer‑GPU path, Plonky3’s CPU throughput records, and StarkWare’s Stwo recursion benchmarks). (zksync.mirror.xyz)
Below is a concrete, numbers‑first guide 7Block Labs uses with clients to future‑proof proof throughput across ZK and optimistic stacks.
The 2025 baseline: what changed and the raw numbers
- Blobs are 4096 field elements of 32 bytes each (~128 KiB). After EIP‑7691, Ethereum targets 6 blobs per block (max 9). With ~12s blocks, that’s ~64 KiB/s average DA bandwidth (6×128 KiB / 12s). (eips.ethereum.org)
- Calldata got a floor price under EIP‑7623 (10 gas per zero byte, 40 per nonzero byte) to reduce worst‑case EL payload sizes and push DA toward blobs instead of calldata. Your DA strategy should be “blob‑first” except for small control paths; a floor‑price sketch follows this list. (eips.ethereum.org)
- BLS12‑381 precompiles (EIP‑2537) introduced cheap curve ops, MSM, and pairings. Pairing checks cost 32,600·k + 37,700 gas vs BN254’s 34,000·k + 45,000 (EIP‑1108), while raising estimated security from roughly 100 to 120+ bits. Translation: you can target BLS12‑381 without paying a gas penalty. (eips.ethereum.org)
- On the optimistic side, Arbitrum’s BoLD dispute protocol is live on One/Nova and OP Mainnet operates with governance‑approved, permissionless fault proofs. If you are building on these stacks, real fault‑/fraud‑proof throughput is now practical, not theoretical. (theblock.co)
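To make the EIP‑7623 floor concrete, here is a minimal sketch of the pricing math as specified in the EIP; the constants come from EIP‑7623, while the function name and example payload are ours:

```python
# Sketch: EIP-7623 calldata floor pricing. Data is counted in "tokens":
# 1 token per zero byte, 4 tokens per nonzero byte.
STANDARD_TOKEN_COST = 4   # standard pricing: 4 gas/zero byte, 16 gas/nonzero byte
FLOOR_TOKEN_COST = 10     # floor pricing: 10 gas/zero byte, 40 gas/nonzero byte
TX_BASE_GAS = 21_000

def tx_gas(data: bytes, execution_gas: int = 0) -> int:
    """Gas charged under EIP-7623; the floor dominates for data-heavy txs."""
    tokens = sum(1 if b == 0 else 4 for b in data)
    return TX_BASE_GAS + max(STANDARD_TOKEN_COST * tokens + execution_gas,
                             FLOOR_TOKEN_COST * tokens)

# 4 KiB of nonzero calldata with minimal execution pays the floor:
print(tx_gas(bytes([1]) * 4096))  # 184840 gas -> a strong push toward blobs
```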
Define “proof throughput” precisely (so you can scale it)
Track these four layers separately:
- Proving layer
- Proofs per second (PPS) at each circuit (execution trace, state root, aggregator/recursive).
- p50/p90/p99 prove latency for each job class.
- Bottleneck kernels (FFT/NTT, MSM, Merkle hashing) utilization and wall time.
- Aggregation/recursion layer
- Aggregation depth per window (e.g., N leaf proofs per aggregator, M aggregators per recursive wrap).
- Time‑to‑final aggregated proof (TFA).
- Data availability (DA) layer
- Blob fill rate (% blob bytes used), blobs per batch, blob fee paid vs target.
- Calldata fallback count and bytes (should trend to near‑zero post‑EIP‑7623).
- L1 verification layer
- Gas per verify (base + pairing cost), bytes posted (blob + on‑chain metadata), failure/retry rates.
Tie SLOs to these: e.g., “99% of batches proven ≤ 6 minutes; 99% of aggregated proofs verified on L1 ≤ 1 slot after availability; average blob fill ≥ 85%.”
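A minimal sketch of encoding those SLOs as explicit checks; the `BatchMetrics` fields mirror the example SLOs above, and the names are illustrative rather than a real monitoring API:

```python
# Sketch: the example SLOs above as explicit pass/fail checks over a window.
from dataclasses import dataclass

@dataclass
class BatchMetrics:
    prove_seconds_p99: float   # 99th-percentile time to prove a batch
    verify_slots_p99: float    # slots from blob availability to L1 verify
    avg_blob_fill: float       # fraction of blob bytes actually used

def slo_ok(m: BatchMetrics) -> bool:
    return (
        m.prove_seconds_p99 <= 6 * 60   # 99% of batches proven <= 6 minutes
        and m.verify_slots_p99 <= 1     # aggregated proof verified <= 1 slot
        and m.avg_blob_fill >= 0.85     # average blob fill >= 85%
    )

print(slo_ok(BatchMetrics(290, 1, 0.88)))  # True: all three SLOs met
```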
Pattern 1: DA‑aware batching after EIP‑7691
What to do now:
- Batch to blobs, not calldata. Use a sizing heuristic that targets 1–3 blobs per batch to hit >85% fill while keeping proof size/verifier costs stable. If a batch would spill into a mostly empty extra blob, hold it for the next slot or increase aggregation depth to amortize L1 costs (see the sizing sketch after this list). (eips.ethereum.org)
- Compute your DA headroom with the new target: average 6 blobs/block × ~128 KiB ≈ 768 KiB every 12s. If your chain’s raw posted data exceeds this sustained rate, consider multi‑DA (e.g., EigenDA or Celestia) with settlement on Ethereum. (eips.ethereum.org)
- If you must use calldata (e.g., tiny control proofs, keeper heartbeats), budget with EIP‑7623’s floor pricing and keep payloads under a few KB. (eips.ethereum.org)
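A minimal sketch of the blob‑first sizing rule from the first bullet; the blob size, 85% fill target, and 1–3 blob window come from the guidance above, while the function name and hold/post framing are illustrative:

```python
# Sketch: blob-first batch sizing. Post only on well-filled blob boundaries.
BLOB_BYTES = 131_072  # 4096 field elements x 32 bytes

def should_post(pending_bytes: int, max_blobs: int = 3,
                fill_target: float = 0.85) -> bool:
    """Post now only if the batch lands near a well-filled blob boundary."""
    if pending_bytes == 0:
        return False
    blobs = -(-pending_bytes // BLOB_BYTES)   # ceil division
    if blobs > max_blobs:
        return True                            # cap batch growth regardless of fill
    fill = pending_bytes / (blobs * BLOB_BYTES)
    # Hold a batch that would spill into a mostly empty extra blob.
    return fill >= fill_target

print(should_post(140_000))  # False: 2 blobs at ~53% fill -> hold or aggregate more
print(should_post(250_000))  # True:  2 blobs at ~95% fill -> post
```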
Signals from alt‑DA:
- Celestia’s sustained blob usage growth and spikes around large mints show capacity elasticity for off‑Ethereum DA; useful for L3/app‑specific rollups. (theblock.co)
- EigenDA is in production with a “free tier” and has shown synthetic and peak throughput ranges; it introduced dual‑quorum security and has V2 live (see L2BEAT milestones). Evaluate it for burst capacity while settling proofs on Ethereum. (cointelegraph.com)
Practical example:
- If your rollup posts 1.8 MB/minute of compressed data, Ethereum blobs cover ~3.9 MB/minute at target (6 × 128 KiB per 12s block), putting you at roughly 46% utilization. You’re safe staying blob‑only; plan an alt‑DA path once you approach >75% sustained blob utilization during peak hours.
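The same arithmetic as a reusable check, assuming the EIP‑7691 target of 6 blobs per 12‑second block:

```python
# Sketch: sustained blob utilization against the post-Pectra target.
BLOB_BYTES = 131_072
TARGET_BLOBS_PER_BLOCK = 6   # EIP-7691 target (max 9)
SECONDS_PER_BLOCK = 12

def blob_utilization(posted_bytes_per_min: float) -> float:
    capacity_per_min = TARGET_BLOBS_PER_BLOCK * BLOB_BYTES * (60 / SECONDS_PER_BLOCK)
    return posted_bytes_per_min / capacity_per_min

u = blob_utilization(1.8e6)   # 1.8 MB/minute of compressed data
print(f"{u:.0%}")             # ~46% -- blob-only is fine; alt-DA trigger at >75%
```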
Pattern 2: Move verification to BLS12‑381 now
Why:
- EIP‑2537 makes BLS12‑381 practical and cheaper than many expect. Pairing check gas is 32,600·k + 37,700; BN254 pairing is 34,000·k + 45,000. For a 3‑pairing verifier (Groth16‑style), that’s 135,500 gas on BLS12‑381 vs 147,000 on BN254 for the pairing component alone, with higher security margins. (eips.ethereum.org)
How:
- If your prover emits BN254 Groth16/Plonk, add a recursive wrapper that compresses to a BLS12‑381 proof for L1 verification. Or, if you use Plonky3/KZG‑style systems, compile directly to BLS12‑381 verifying keys going forward. (polygon.technology)
- Keep old verifiers as an emergency fallback until you’re confident in the migration; deploy a canary path that verifies both for a subset of batches for a week (a minimal routing sketch follows).
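A minimal routing sketch for that canary period; the `submit_*` functions are hypothetical stand‑ins for your L1 submission pipeline, and the 5% rate is an example:

```python
# Sketch: dual-submission canary during a BN254 -> BLS12-381 verifier migration.
import random

CANARY_RATE = 0.05  # dual-verify 5% of batches during the migration window

def submit_bls12_381(batch_id: int) -> None:   # stub: new primary verifier path
    print(f"batch {batch_id}: BLS12-381 verify")

def submit_bn254(batch_id: int) -> None:       # stub: legacy fallback verifier
    print(f"batch {batch_id}: BN254 canary verify")

def submit_batch(batch_id: int) -> None:
    submit_bls12_381(batch_id)
    if random.random() < CANARY_RATE:
        # Dual-submit so any divergence between verifiers surfaces
        # before the BN254 fallback is retired.
        submit_bn254(batch_id)

for i in range(20):
    submit_batch(i)
```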
Bonus:
- Combine with EIP‑4844’s KZG point‑evaluation precompile and BLOBHASH opcode when you need to reference blob commitments on‑chain (e.g., proof carrying data commitments); this is now a first‑class on‑chain flow. (eips.ethereum.org)
Pattern 3: Recursion and aggregation that actually scale
What the new benchmarks tell us:
- Polygon’s Plonky3 has shown multi‑million Poseidon‑hash/sec proving on laptops and very high server‑side throughput, making deep recursion practical for many apps. (polygon.technology)
- StarkWare’s Stwo (Circle‑STARK based) hits 500k–600k hashes/sec on commodity CPUs, with production‑grade recursion on the horizon—great for parallel proving and short aggregation windows. (starkware.co)
Practices we recommend:
- Target 2–3 levels of recursion: leaf proofs (execution/trace), mid‑level aggregators per time window, and a final wrap proof per L1 submission window. Tune windows to your latency SLO (see the tree‑sizing sketch after this list).
- Keep wrap verifiers minimal and BLS12‑381‑friendly. Push bulky checks (e.g., large MSMs) to the recursion levels, not the on‑chain verifier.
- For Nova/HyperNova‑style IVC, ensure your curve cycles and commitment scheme choices support your final L1 target (BLS12‑381 or BN254) and trusted setup posture. (github.com)
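A small sketch of sizing such a recursion tree; the fan‑in of 8 is an assumed example to tune, not a recommendation:

```python
# Sketch: proof counts per level for a leaf -> aggregator -> wrap tree.
import math

def aggregation_levels(leaf_proofs: int, fan_in: int = 8) -> list[int]:
    """How many proofs exist at each recursion level until one wrap remains."""
    levels = [leaf_proofs]
    while levels[-1] > 1:
        levels.append(math.ceil(levels[-1] / fan_in))
    return levels

# 64 leaf execution proofs with fan-in 8 collapse in two aggregation rounds:
print(aggregation_levels(64))  # [64, 8, 1]
```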
Pattern 4: Multi‑proof, multi‑prover architectures
Why:
- Heterogeneous proofs reduce correlated failure risk and let you mix faster/cheaper provers with conservative ones. Taiko already runs a multi‑proof architecture (Succinct + RISC Zero, with SGX today, expanding to more ZK). (prnewswire.com)
How to deploy:
- Specify a policy like “N of M proofs must validate; at least one must be ZK.” Start by requiring ZK for a small percentage of blocks and ramp (as Taiko did); a policy sketch follows this list. (prnewswire.com)
- Use decentralized proving marketplaces to elastically scale: Succinct’s Prover Network (live on mainnet) exposes an auction‑driven market across 1,700+ programs; RISC Zero Bonsai offers managed parallel proving with 99.9% uptime. (theblock.co)
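A minimal sketch of the “N of M, at least one ZK” acceptance policy as code; `ProofKind` and the threshold value are illustrative:

```python
# Sketch: heterogeneous proof acceptance policy for a multi-prover rollup.
from enum import Enum

class ProofKind(Enum):
    ZK = "zk"
    TEE = "tee"        # e.g., SGX attestation
    FRAUD = "fraud"    # optimistic fault proof

def accept_block(valid_proofs: list[ProofKind], n_required: int = 2) -> bool:
    """Require N valid proofs, at least one of which must be ZK."""
    return len(valid_proofs) >= n_required and ProofKind.ZK in valid_proofs

print(accept_block([ProofKind.ZK, ProofKind.TEE]))     # True
print(accept_block([ProofKind.TEE, ProofKind.FRAUD]))  # False: no ZK proof
```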
Procurement reality:
- If your p95 proving queues exceed internal capacity for >30 minutes during peak, burst to a prover network rather than burning capex on idle‑most‑of‑day GPUs.
Pattern 5: Hardware you can actually buy and run
What’s deployable today:
- zkSync’s Boojum can run on consumer GPUs; their docs list a 6 GB VRAM minimum for low‑TPS and a CPU‑only option with 128 GB RAM for test scenarios. Production guidance has ranged up to 16 GB GPU VRAM for practical throughput—plan your fleet to the high end. (docs.zksync.io)
- GPU acceleration libraries like Ingonyama’s ICICLE and Boojum‑CUDA deliver FFT/MSM speedups; use them to lift MSM utilization and reduce tails. (ingonyama.com)
- ASIC/FPGA acceleration is maturing (MSM/NTT on FPGAs; ZK‑specific ASICs showing 10–100× speedups on specific kernels), and some are beginning to plug into decentralized prover networks. Consider these only when you have stable, repeatable workloads. (eprint.iacr.org)
Capacity planning tip:
- Start with 1× A‑class consumer GPU per 15–30 TPS (ZK‑EVM workloads), then measure. If p95 job wait time > 2 minutes, add another GPU per 10 TPS until p95 < 60 seconds.
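That heuristic as a scaling rule; the 60‑second and 2‑minute thresholds come from the tip above, while the add‑one step is an assumed starting point to tune against your own measurements:

```python
# Sketch: GPU fleet scaling rule driven by p95 prover queue wait time.
def gpus_to_add(p95_wait_seconds: float, current_tps: float) -> int:
    if p95_wait_seconds <= 60:
        return 0                          # within SLO: hold fleet size
    if p95_wait_seconds > 120:
        # Badly degraded: add one GPU per 10 TPS of load, per the tip above.
        return max(1, round(current_tps / 10))
    return 1                              # mildly degraded: add a single GPU

print(gpus_to_add(p95_wait_seconds=180, current_tps=40))  # 4
```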
Optimistic rollups: raise the ceiling without breaking safety
- Arbitrum BoLD is live (One/Nova), enabling permissionless validation and a bounded dispute time; this hardens the path to true Stage‑2. If you operate Orbit chains, adopt BoLD per Offchain Labs’ recommended path (keep validation permissioned first, then relax). (theblock.co)
- OP Stack’s Cannon fault proofs are live, with a roadmap to 64‑bit and multi‑threading to lift block gas limits safely. Plan upgrades around Cannon improvements if you need higher block limits without sacrificing provability. (blog.oplabs.co)
Practical governance note:
- L2BEAT now strictly enforces ≥7‑day challenge periods for Stage‑1 optimistic rollups (including “grace” periods). Ensure your withdrawal/challenge configs comply or risk a downgrade. (forum.l2beat.com)
L1 gas math you can budget today
- BN254 pairing (EIP‑1108): 34,000·k + 45,000 gas.
- BLS12‑381 pairing (EIP‑2537): 32,600·k + 37,700 gas.
- KZG point‑evaluation precompile: 50,000 gas per call (EIP‑4844).
For a 3‑pairing verifier plus bookkeeping, expect the pairing portion to be ~135.5k gas on BLS12‑381 vs ~147k on BN254, before adds/MSM and contract overhead. Use this delta to justify a BLS12‑381 migration for security headroom without a gas tax. (eips.ethereum.org)
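The same budget as a quick calculator (pairing portion only; MSM/adds and contract overhead excluded):

```python
# Sketch: L1 pairing gas budgets from the EIP-1108 / EIP-2537 / EIP-4844
# numbers quoted above.
def bn254_pairing_gas(k: int) -> int:
    return 34_000 * k + 45_000          # EIP-1108

def bls12_381_pairing_gas(k: int) -> int:
    return 32_600 * k + 37_700          # EIP-2537

KZG_POINT_EVAL_GAS = 50_000             # EIP-4844 precompile, flat per call

k = 3  # Groth16-style verifier: 3 pairings
print(bn254_pairing_gas(k))             # 147000
print(bls12_381_pairing_gas(k))         # 135500 -> ~11.5k gas saved per verify
```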
Two concrete playbooks
A) ZK rollup (zkEVM‑style) targeting sub‑5‑minute finality
- Batching: target 1–2 blobs per batch with a 30–60s batch window; if blob fill < 70% for 3 consecutive windows, increase the window by 15s (see the controller sketch after this playbook). (eips.ethereum.org)
- Proving: GPU pool sized at 2× your peak batch concurrency; enable recursive aggregation every 2–4 batches; keep the wrap proof small and BLS12‑381‑based. (eips.ethereum.org)
- Verification: deploy BLS12‑381 verifier; retain BN254 fallback for 1–2 weeks with dual‑submission for 5% of batches. (eips.ethereum.org)
- Overflow: integrate a decentralized prover marketplace as burst capacity with a max price cap; monitor p95 queue latency and PPS per circuit. (theblock.co)
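A sketch of the adaptive window controller from the batching bullet; the 70% threshold, 3‑window trigger, and 15s step come from the playbook, while the 30–60s clamp mirrors the stated batch window:

```python
# Sketch: widen the batch window when blobs run persistently under-filled.
def next_window(window_s: float, recent_fills: list[float]) -> float:
    if len(recent_fills) >= 3 and all(f < 0.70 for f in recent_fills[-3:]):
        window_s += 15                    # amortize DA cost over more txs
    return min(max(window_s, 30), 60)     # stay within the 30-60s target band

print(next_window(30, [0.65, 0.62, 0.68]))  # 45 -> wait longer to fill blobs
```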
B) OP Stack chain seeking higher throughput
- Upgrade path: adopt governance‑approved fault proofs (Cannon) to Stage‑1; then evaluate BoLD‑like bounded disputes if/when stack support emerges. (blog.oplabs.co)
- Gas limit: lift MAX_GAS_LIMIT only after adopting the Cannon improvements (64‑bit + multithreading) to keep proofs feasible. (gov.optimism.io)
- DA: move data to blobs post‑EIP‑7623; reserve calldata for minimal control paths. (eips.ethereum.org)
What’s next (6–18 months): prepare now
- Blob capacity will keep increasing (PeerDAS research is active). Design your batcher to adapt to changing target/max and to price blobs vs alternative DA in real time. (blog.ethereum.org)
- Expect more networks to add multi‑proof requirements (e.g., “ZK‑or‑fallback”), and more apps to migrate verification to BLS12‑381. Keep proving and verification modular so you can swap systems without application rewrites. (prnewswire.com)
- Prover decentralization will grow: marketplaces (Succinct) and managed services (Bonsai) will become normal parts of the stack. Get procurement and security comfortable with these now. (theblock.co)
- ZK proving keeps getting faster (Plonky3, Stwo). Revisit recursion depth quarterly; you may be able to shorten windows and still cut costs. (polygon.technology)
Implementation checklist (that we hold teams accountable to)
- Data availability
- Use blobs for DA; set blob fill target ≥ 85%; alert if < 70% for 5 windows. (eips.ethereum.org)
- Keep calldata usage under 1% of DA bytes/month post‑EIP‑7623. (eips.ethereum.org)
- Provers
- Size GPU pool to keep p95 queue wait < 60s; enforce per‑circuit PPS dashboards.
- Enable recursion; cap final wrap verifier to BLS12‑381 pairings only. (eips.ethereum.org)
- Configure burst to a decentralized prover network with a spend cap and proof integrity checks. (theblock.co)
- Verification
- Migrate verifiers to EIP‑2537; compute gas deltas against BN254 and track realized per‑verify gas on mainnet. (eips.ethereum.org)
- Optimistic safety (if applicable)
- Adopt BoLD/Cannon per vendor guidance; ensure ≥7‑day effective challenge periods; publish emergency runbook for disputes. (docs.arbitrum.io)
Appendix: quick reference facts for 2025 planning
- EIP‑7691 (Pectra): target 6 blobs/block, max 9. (eips.ethereum.org)
- Blob size: 4096 field elements × 32 bytes = 131,072 bytes (~128 KiB). (eips.ethereum.org)
- EIP‑7623 calldata floor: 10/40 gas per byte for data‑heavy txs. (eips.ethereum.org)
- EIP‑2537 BLS12‑381 precompiles (addresses 0x0b–0x11), pairing cost 32,600·k + 37,700. (eips.ethereum.org)
- EIP‑1108 BN254 pairing: 34,000·k + 45,000. (eips.ethereum.org)
- KZG point‑evaluation precompile (EIP‑4844) at 0x0a, 50,000 gas per call. (eips.ethereum.org)
- Arbitrum BoLD live on mainnet; permissionless validation path. (theblock.co)
- OP Mainnet fault proofs live; Stage‑1 decentralization. (theblock.co)
- Boojum can run on consumer GPUs; docs list 6–16 GB VRAM paths depending on throughput. (docs.zksync.io)
- Prover marketplace examples: Succinct Prover Network (mainnet), RISC Zero Bonsai (managed). (theblock.co)
- Prover performance examples: Plonky3, Stwo benchmarks show large recursion headroom for 2025. (polygon.technology)
If you want a 7Block Labs architecture review or a proof‑throughput tune‑up, we can benchmark your circuits, right‑size your GPU fleet, and implement a blob‑first batcher plus BLS12‑381 migration with dual‑submit canaries in under four weeks.
Like what you're reading? Let's build together.
Get a free 30‑minute consultation with our engineering team.

