Summary: The next 12–24 months will be defined by how well rollups scale proof throughput—not only how fast you can generate proofs, but how cheaply you can post data, verify results, and recover from fee spikes. This guide distills concrete, field-tested practices for architecting provers, choosing aggregation strategies, and designing sustainable fee policies across Ethereum L1 (blobs), AVS verification layers, and alternative DA.

Best practices for future-proofing rollup proof throughput: Designing for provers, aggregation, and fees

Decision-makers at startups and enterprises increasingly ask the same question: how do we ensure our rollup won’t bottleneck on proving or fee spikes as usage grows? In 2025, there are finally durable answers—but only if you design for them up‑front:

Treat proving as an elastic, multi-tenant service with clear SLOs.
Use recursion and aggregation aggressively to compress on‑chain verification.
Exploit Ethereum’s blobspace economics post‑Pectra and plan for PeerDAS.
Keep optionality: be able to settle directly to L1, or route through AVSs, or switch DA backends without downtime.

Below we lay out opinionated, pragmatic guidance with numbers you can budget against.

1) Design your proving layer like a product, not a script

Your prover is an always-on service with performance targets. Treat it like one.

Define SLOs: maximum prove latency per batch, minimum proofs-per-minute, and maximum queue depth per circuit class. Expose these as Prometheus metrics (e.g., job_wait_seconds, prove_latency_p95, queue_len_by_circuit).
Separate “hot” and “cold” circuits:
- Hot: block-level/state transition circuits and any recursive wrapper. These must meet block time SLOs.
- Cold: heavy analytics, fraud/finality challenges, or occasional coprocessor jobs. These can flex to cheaper hardware and lower priority.
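
The hot/cold split can be sketched as a two-class priority queue. This is a minimal illustration, not a production scheduler: the class names, the `ProverQueue` API, and the strict-priority policy are all assumptions made for the example.

```python
import heapq
import itertools
from dataclasses import dataclass, field

HOT, COLD = 0, 1  # lower number = served first (illustrative convention)


@dataclass(order=True)
class ProofJob:
    priority: int
    seq: int                       # tie-breaker: FIFO within a class
    circuit: str = field(compare=False)


class ProverQueue:
    """Hypothetical queue: every hot job is popped before any cold job."""

    def __init__(self):
        self._heap = []
        self._counter = itertools.count()

    def submit(self, circuit: str, hot: bool) -> None:
        job = ProofJob(HOT if hot else COLD, next(self._counter), circuit)
        heapq.heappush(self._heap, job)

    def next_job(self):
        # Returns the circuit name of the next job, or None when idle.
        return heapq.heappop(self._heap).circuit if self._heap else None

    def depth(self) -> int:
        # Feeds a queue-depth metric such as queue_len_by_circuit.
        return len(self._heap)


q = ProverQueue()
q.submit("analytics", hot=False)
q.submit("state_transition", hot=True)
q.submit("recursive_wrapper", hot=True)
order = [q.next_job() for _ in range(3)]
```

Strict priority can starve cold jobs during long spikes; a real deployment would likely add aging or a reserved cold slice.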

Hardware mix and acceleration

Default to GPUs for modern stacks; they are the fastest path to throughput gains in 2025, with maturing libraries:
- Plonky3 and Stwo show roughly 500k–2M Poseidon hashes/sec on laptops; server GPUs scale much further. This materially reduces wall-clock proving time for hash-heavy circuits. (polygon.technology)
- Ingonyama’s ICICLE provides GPU-accelerated MSM/NTT with active support and production users; it’s a practical on‑ramp to GPU acceleration without rewriting your stack. (ingonyama.com)
Plan capacity in “proofs per dollar”: Succinct’s SP1 benchmarks indicate 10x cheaper cloud costs than alternative zkVMs for typical light-client/EVM workloads when using their GPU prover; in practice this yields ≈0.1¢ proving cost per tx for average Ethereum blocks. Treat this as directional for budget planning, and benchmark your exact circuits. (succinct.xyz)
Pipeline the prover:
- Stage 1: witness generation (CPU-heavy), Stage 2: FFT/MSM/commit (GPU), Stage 3: recursion/wrapping (often STARK→SNARK). Keep these separable so you can scale them independently and avoid blocking on a single bottleneck.
- Coalesce small jobs. Many zkVMs have fixed overhead; SP1 notes that sub‑2M PGU programs are dominated by fixed costs—batch them or prove them inside a recursive accumulator. (docs.succinct.xyz)
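
As a rough sketch of job coalescing, the helper below greedily groups small jobs until each group clears a size threshold. The 2M threshold mirrors the figure quoted above; the function name, the `(job_id, units)` tuple shape, and the greedy strategy are assumptions for illustration.

```python
def coalesce(jobs, min_units=2_000_000):
    """Greedily group (job_id, units) pairs until each group reaches min_units.

    `units` stands in for whatever size measure your prover reports
    (cycles, PGUs, ...); the threshold is an assumed tuning knob.
    """
    groups, current, total = [], [], 0
    for job_id, units in jobs:
        current.append(job_id)
        total += units
        if total >= min_units:
            groups.append(current)
            current, total = [], 0
    if current:  # leftover jobs form a final, possibly undersized, group
        groups.append(current)
    return groups


jobs = [("a", 500_000), ("b", 900_000), ("c", 700_000),
        ("d", 3_000_000), ("e", 100_000)]
batches = coalesce(jobs)
```

Here the three small jobs share one proof, the large job runs alone, and the trailing small job waits in its own group; in practice you would also bound how long a group may wait before it is proved anyway.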

Operational controls

Admission control: cap per-tenant PGU/cycles and concurrency to avoid head-of-line blocking. RISC Zero Bonsai exposes quotas (concurrent proofs, cycles per proof); mirror those controls if you run in-house. (dev.risczero.com)
Pre-emptive scheduling: reserve a slice of GPU time for the recursive wrapper to keep end-to-end latency stable during spikes.
Canary recursion: every N blocks, run a redundant recursive proof via a second stack (e.g., a Groth16 wrapper alongside a Plonk-KZG wrapper) for early detection of soundness bugs in your primary verification path. Recent academic work found latent bugs in multiple zkVMs, so treat diversity as a safety feature. (arxiv.org)
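
A minimal sketch of per-tenant admission control, assuming two quota dimensions (concurrent proofs and cycles per proof) like the ones mentioned above. The `Quota` and `AdmissionController` names and their methods are hypothetical and do not correspond to any vendor API.

```python
from dataclasses import dataclass


@dataclass
class Quota:
    max_concurrent: int
    max_cycles_per_proof: int


class AdmissionController:
    """Hypothetical in-process gate placed in front of the prover queue."""

    def __init__(self, quotas):
        self._quotas = dict(quotas)
        self._running = {tenant: 0 for tenant in self._quotas}

    def try_admit(self, tenant: str, cycles: int) -> bool:
        quota = self._quotas.get(tenant)
        if quota is None or cycles > quota.max_cycles_per_proof:
            return False  # unknown tenant or oversized proof
        if self._running[tenant] >= quota.max_concurrent:
            return False  # tenant is already at its concurrency cap
        self._running[tenant] += 1
        return True

    def release(self, tenant: str) -> None:
        # Call when a proof finishes (successfully or not).
        if self._running.get(tenant, 0) > 0:
            self._running[tenant] -= 1


ctl = AdmissionController({"dex": Quota(max_concurrent=2,
                                        max_cycles_per_proof=10_000_000)})
```

Rejected jobs can be queued or bounced back to the tenant; either way, one noisy tenant cannot occupy every GPU.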

2) Choose aggregation deliberately: recursion trees, SNARK-packers, or external verifiers

There are three major approaches to compress verification costs:

A. Recursive accumulation (STARK/FRI → SNARK/KZG)

Most performant path today for high throughput: generate many leaf proofs, verify them inside a recursion circuit, then wrap into a succinct Groth16/Plonk proof for L1. This swaps millions of gas for ≈200–900k gas once per batch, depending on wrapper and public inputs. (blog.zkcloud.com)
Practical tip: keep public inputs tiny. The verifier cost scales with pairings and MSM on BN254/BLS12-381; EIP‑1108 reduced bn128 pairing to 45,000 + 34,000·k gas (k = pairings). Typical Groth16 verifiers run ≈200k–300k gas. Design circuits to minimize public IO. (eips.ethereum.org)
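
To see why public inputs matter, here is a back-of-the-envelope estimate of the BN254 precompile gas for a Groth16 verify. The constants are the EIP‑1108 prices; the assumption that each public input costs one scalar multiplication plus one point addition, and the default of 4 pairings, are simplifications. Calldata, the 21,000 base transaction cost, and contract overhead are deliberately left out.

```python
PAIRING_BASE = 45_000       # EIP-1108: fixed part of the pairing check
PAIRING_PER_PAIR = 34_000   # EIP-1108: cost per pairing
ECMUL = 6_000               # EIP-1108: BN254 scalar multiplication
ECADD = 150                 # EIP-1108: BN254 point addition


def groth16_precompile_gas(public_inputs: int, pairings: int = 4) -> int:
    """Precompile-only gas estimate for one Groth16 verification.

    Assumes one ECMUL + one ECADD per public input to build the
    input commitment, then a single pairing check over `pairings` pairs.
    """
    pairing_gas = PAIRING_BASE + PAIRING_PER_PAIR * pairings
    input_gas = public_inputs * (ECMUL + ECADD)
    return pairing_gas + input_gas


one_input = groth16_precompile_gas(1)     # 181,000 + 6,150
ten_inputs = groth16_precompile_gas(10)   # 181,000 + 61,500
```

Under these assumptions every extra public input adds about 6k gas, which is why hashing many values into a single public input is a common trick.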

B. Proof aggregation schemes (e.g., SnarkPack, aPlonk)

If you produce many Groth16/Plonk proofs of heterogeneous statements, aggregators like SnarkPack verify thousands in logarithmic time, amortizing gas. Protocol Labs reports aggregating 8192 Groth16 proofs with verification in ≈33–163 ms off‑chain. Use when recursive engineering is costlier than adding an aggregator to your pipeline. (eprint.iacr.org)
On-chain footprint can be held to a single “super-proof” verify (hundreds of thousands of gas) plus a small per-proof inclusion check; concrete implementations report ≈380k base and ≈16k per-proof inclusion call. Budget with headroom and test against your curve and verifier. (docs.electron.dev)
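
The amortization can be made concrete with a short calculation. The 380k base and 16k per-proof figures are the ones quoted above; the 250k direct-verify figure is an assumed midpoint of the Groth16 range, and the function names are illustrative.

```python
def aggregated_gas_per_proof(n, base_gas=380_000, inclusion_gas=16_000):
    """Amortized on-chain gas per proof when n proofs share one super-proof."""
    return base_gas / n + inclusion_gas


def break_even_batch_size(direct_gas=250_000, base_gas=380_000,
                          inclusion_gas=16_000):
    """Smallest n for which aggregation is cheaper than n direct verifies."""
    if inclusion_gas >= direct_gas:
        raise ValueError("aggregation cannot pay off: inclusion >= direct cost")
    n = 1
    while base_gas + inclusion_gas * n >= direct_gas * n:
        n += 1
    return n


per_proof_at_100 = aggregated_gas_per_proof(100)   # 3,800 + 16,000
min_batch = break_even_batch_size()
```

With these inputs aggregation wins from two proofs onward, and the per-proof cost approaches the inclusion cost as batches grow.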

C. Off-chain verification via AVSs (Aligned Layer) with on-chain attestations

For apps tolerant of an AVS trust model, Aligned’s Proof Verification Layer verifies proofs off-chain on a restaked operator set and reports results on Ethereum with aggregated BLS signatures. Reported savings: 90–99% vs direct L1 verification; current gas around tens of thousands per proof, with supported stacks including Risc0, SP1, and Groth16/Plonk. This is compelling for frequent/expensive verifications (e.g., STARKs). (docs.succinct.xyz)
Trade-off: the AVS settlement route adds another dependency and security model. Many projects therefore use a hybrid approach: verify through the AVS during congestion and fall back to direct L1 verification for final checkpoints.

Decision rule of thumb:

If you settle once per N L2 blocks and your wrapper stays <500k gas, prefer recursion (A).
If you need to verify many independent app proofs, consider aggregation (B).
If your proofs are inherently expensive to verify (large STARKs) or you verify frequently, evaluate an AVS (C) with robust fallbacks.

3) Engineer your L1 verification for the chain you’re on: BN254 today, BLS12‑381 now viable

Ethereum’s EIP‑1108 slashed BN254 (alt_bn128) precompile costs; pairing cost is 45k + 34k·k gas, making BN254-friendly Groth16 the cheapest to verify on L1. Use it for your outermost wrapper unless you have a strong reason otherwise. (eips.ethereum.org)
As of May 7, 2025, Pectra added BLS12‑381 precompiles (EIP‑2537). This unlocks native, efficient BLS curves for on‑chain verification, widening your curve choices for SNARKs and signature schemes. If your stack already lives on BLS12‑381 (consensus tooling, bridging, light clients), re‑evaluate end-to-end costs; the precompile removes a key reason to contort everything into BN254. (ethereum.org)

Implementation detail: keep your verification contract upgradable behind a timelock (or immutable with a configurable verifier target), so you can swap BN254 ↔ BLS12‑381 verifiers without migrating the entire rollup contract suite.

4) Blobspace is your friend—if you learn its quirks

Ethereum’s data landscape changed materially post‑Dencun (EIP‑4844) and Pectra.

Blobs are 128 KiB, priced in a separate 1559-style market (“blob gas”) and retained ~18 days on consensus clients; only the KZG commitment persists on L1. Archive providers (Blocknative, Blockscout) offer longer retention. Design your DA retrieval and analytics with this 18‑day window in mind. (eips.ethereum.org)
Capacity and costs:
- Dencun shipped with target/max 3/6 blobs per block (≈384/768 KiB per block). (digitalfinancenews.com)
- Pectra (May 7, 2025) raised blobspace to target/max 6/9 per block. This doubled target capacity and has kept blob fees near floor for long stretches. Plan your batcher to take advantage of the larger target. (ethereum.org)
Blob fee dynamics: the blob base fee rises when a block uses more blobs than the target and falls when it uses fewer, similar to EIP‑1559. Expect low, steady pricing unless several large L2s post heavily at the same time; still guard against spikes. (blocknative.com)
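
For budgeting spikes, it helps to reproduce the fee update rule locally. The sketch below follows the pseudocode in EIP‑4844 (`fake_exponential`, excess blob gas) with the post‑Pectra target and update fraction from EIP‑7691 as I understand them. Verify the constants against the EIPs before relying on them, and note that later forks may change the rule.

```python
GAS_PER_BLOB = 2**17                        # 131,072 blob gas per blob
MIN_BASE_FEE_PER_BLOB_GAS = 1               # wei
TARGET_BLOBS = 6                            # post-Pectra target
BLOB_BASE_FEE_UPDATE_FRACTION = 5_007_716   # post-Pectra update fraction


def fake_exponential(factor: int, numerator: int, denominator: int) -> int:
    """Integer approximation of factor * e**(numerator / denominator)."""
    i = 1
    output = 0
    numerator_accum = factor * denominator
    while numerator_accum > 0:
        output += numerator_accum
        numerator_accum = (numerator_accum * numerator) // (denominator * i)
        i += 1
    return output // denominator


def blob_base_fee(excess_blob_gas: int) -> int:
    """Blob base fee in wei per blob gas for a given excess."""
    return fake_exponential(MIN_BASE_FEE_PER_BLOB_GAS, excess_blob_gas,
                            BLOB_BASE_FEE_UPDATE_FRACTION)


def next_excess_blob_gas(parent_excess: int, parent_blobs_used: int) -> int:
    """Excess accumulates above the target and drains below it (floor 0)."""
    total = parent_excess + parent_blobs_used * GAS_PER_BLOB
    return max(0, total - TARGET_BLOBS * GAS_PER_BLOB)


# Simulate 50 consecutive full (9-blob) blocks starting from the fee floor.
excess = 0
for _ in range(50):
    excess = next_excess_blob_gas(excess, 9)
fee_after_spike = blob_base_fee(excess)
```

Each full block compounds the fee by roughly 8%, so sustained full blocks move the price by orders of magnitude within minutes; that is the scenario a surge buffer and a calldata fallback are meant to cover.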

Batcher configuration best practices (OP Stack, Arbitrum)

Prefer blobs by default with an “auto fallback” to calldata if blobs become temporarily unavailable or overpriced; both stacks support blob posting post‑Dencun. (docs.optimism.io)
Right-size your submission cadence:
- Use MAX_CHANNEL_DURATION to target frequent, full blobs (e.g., 30–60 minutes for medium-throughput chains). Posting half‑empty blobs wastes money; posting too infrequently stalls the “safe” head and degrades UX. (docs.optimism.io)
Multi‑blob transactions:
- After Pectra you can pack up to 9 blobs in a block, but avoid “mega-blob” single txs unless you have builder arrangements; high blob counts per tx can increase replacement risk (blob txs require fee doubling for replacement). Start with 1–3 blobs per tx and scale cautiously. (docs.optimism.io)
Compression: brotli‑10 and careful packing to 131,072‑byte boundaries reduce spillover to an extra blob. Track utilization and simulate “add one more tx?” before posting. (specs.optimism.io)
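
The "add one more tx?" check can be sketched with a few lines of arithmetic. This uses the nominal 131,072-byte blob size from the text; real batchers have slightly less usable payload per blob because of encoding overhead, so treat `BLOB_BYTES` as an assumed parameter. The function names are illustrative.

```python
import math

BLOB_BYTES = 131_072  # nominal blob size; usable payload is somewhat smaller


def blobs_needed(payload_bytes: int) -> int:
    """Number of blobs required to carry the compressed payload."""
    return max(1, math.ceil(payload_bytes / BLOB_BYTES))


def utilization(payload_bytes: int) -> float:
    """Fraction of the paid-for blob space that carries payload."""
    return payload_bytes / (blobs_needed(payload_bytes) * BLOB_BYTES)


def fits_without_spill(current_bytes: int, tx_bytes: int) -> bool:
    """True if adding the tx does not force an extra blob."""
    return blobs_needed(current_bytes + tx_bytes) == blobs_needed(current_bytes)


KIB = 1024
single_batch_fill = utilization(180 * KIB)   # 180 KiB spread over 2 blobs
double_batch_fill = utilization(360 * KIB)   # 360 KiB spread over 3 blobs
```

A 180 KiB payload fills only about 70% of two blobs, while 360 KiB fills about 94% of three, which is the kind of comparison a utilization KPI should surface.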

Budget note: Since Dencun, average L1 gas fell drastically as L2 usage moved to blobs; L2 user fees dropped materially. Even though ETH price and usage fluctuate, the structural shift matters—blobspace keeps data costs predictable at scale. (tradingview.com)

5) DA optionality: EigenDA, Celestia, Avail

Keep the door open to alternative DA backends. Even if you’re on Ethereum blobs today, future price/demand or vertical integration may warrant a move.

EigenDA (AVS on EigenLayer): mainnet live since 2024, with a throughput upgrade in 2025; marketed 100 MB/s capacity with production users (e.g., Fuel, Aevo). Evaluate if you want restaked security, high throughput, and tight Ethereum adjacency. (coindesk.com)
Celestia: widely adopted DA for cost reduction; Manta Pacific and multiple SDKs integrated. Good fit when minimizing DA costs outweighs L1-native settlement. (theblock.co)
Avail: mainnet (2024–2025), chain-agnostic DA with KZG+sampling and growing validator set. Worth a POC if you need portability across ecosystems. (coindesk.com)

Design tip: abstract your DA publishing path behind an interface and emit the same L1 commitment schema (or an adapter) so your L1 contracts and watchers don’t change when you swap DA.
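
One way to picture that abstraction is an interface whose every implementation returns the same commitment record. Everything here is hypothetical: the `DABackend` interface, the `DACommitment` fields, and the in-memory backend that stands in for real blob, EigenDA, Celestia, or Avail clients.

```python
import hashlib
from abc import ABC, abstractmethod
from dataclasses import dataclass


@dataclass(frozen=True)
class DACommitment:
    """Backend-independent record that L1 contracts and watchers consume."""
    backend: str
    payload_hash: str   # hex SHA-256 of the batch bytes (illustrative choice)
    locator: str        # backend-specific pointer to the data


class DABackend(ABC):
    name: str

    @abstractmethod
    def publish(self, batch: bytes) -> DACommitment:
        """Publish the batch and return a uniform commitment."""


class InMemoryBackend(DABackend):
    """Stand-in for a real DA client; stores batches in a dict."""

    def __init__(self, name: str):
        self.name = name
        self._store = {}

    def publish(self, batch: bytes) -> DACommitment:
        digest = hashlib.sha256(batch).hexdigest()
        locator = f"{self.name}:{len(self._store)}"
        self._store[locator] = batch
        return DACommitment(self.name, digest, locator)


def post_batch(backend: DABackend, batch: bytes) -> DACommitment:
    # The batcher depends only on the interface, never on a concrete backend.
    return backend.publish(batch)


blobs = InMemoryBackend("ethereum-blobs")
alt = InMemoryBackend("alt-da")
c1 = post_batch(blobs, b"batch-0")
c2 = post_batch(alt, b"batch-0")
```

Because both commitments share a schema and the same payload hash, downstream consumers can stay unchanged when the backend is swapped.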

6) Verification frequency and fee model that won’t paint you into a corner

When to verify on L1

Verify a succinct proof every M seconds or every N blocks—whichever comes first. Use traffic-aware thresholds to accumulate enough transactions to amortize cost, but bound latency to keep bridges/exchanges happy (e.g., 2–5 minutes in steady state).
If blocks are sparse, fall back to time-based sealing so blobs stay full but settlement isn’t delayed excessively.
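
The "whichever comes first" rule is small enough to write down directly. The thresholds below (5 minutes, 150 blocks) are placeholder values, and `SealPolicy` is a name chosen for this sketch.

```python
from dataclasses import dataclass


@dataclass
class SealPolicy:
    max_seconds: float = 300.0   # time bound M (assumed: 5 minutes)
    max_blocks: int = 150        # block bound N (assumed)

    def should_seal(self, seconds_since_last: float,
                    blocks_since_last: int) -> bool:
        """Seal when either bound is hit; never seal an empty batch."""
        if blocks_since_last == 0:
            return False
        return (seconds_since_last >= self.max_seconds
                or blocks_since_last >= self.max_blocks)


policy = SealPolicy()
busy = policy.should_seal(seconds_since_last=40, blocks_since_last=150)
quiet = policy.should_seal(seconds_since_last=300, blocks_since_last=3)
idle = policy.should_seal(seconds_since_last=900, blocks_since_last=0)
```

Under load the block bound triggers first; in quiet periods the time bound keeps settlement latency capped.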

L2 fee policy hygiene (zkSync as a reference)

Split fees into L2 compute and L1 costs (pubdata + verification). Expose a “gas_per_pubdata_limit” to cap how much L1 data cost can be charged per tx, preventing sticker shock when blob fees spike. This mechanism is proven in production. (docs.zksync.io)
Charge batch overhead proportionally to resource usage (bootloader slots, memory, pubdata bytes). This both aligns incentives and stabilizes economics across varying demand. (docs.zksync.io)
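
A simplified reading of the fee split, assuming the operator quotes a current gas-per-pubdata-byte price and each transaction carries its own `gas_per_pubdata_limit`. This is an illustration of the idea only; the names `FeeQuote` and `quote_fee` are hypothetical, and zkSync's actual accounting is more involved.

```python
from dataclasses import dataclass


@dataclass
class FeeQuote:
    l2_compute_gas: int
    l1_pubdata_gas: int

    @property
    def total_gas(self) -> int:
        return self.l2_compute_gas + self.l1_pubdata_gas


def quote_fee(compute_gas: int, pubdata_bytes: int,
              gas_per_pubdata: int, gas_per_pubdata_limit: int) -> FeeQuote:
    """Split the fee into L2 compute and L1 pubdata components.

    Rejects the transaction when the current pubdata price is above the
    limit the user signed, so the L1 component cannot exceed
    pubdata_bytes * gas_per_pubdata_limit.
    """
    if gas_per_pubdata > gas_per_pubdata_limit:
        raise ValueError("pubdata price exceeds the tx's gas_per_pubdata_limit")
    return FeeQuote(compute_gas, pubdata_bytes * gas_per_pubdata)


quote = quote_fee(compute_gas=100_000, pubdata_bytes=200,
                  gas_per_pubdata=800, gas_per_pubdata_limit=50_000)
```

Publishing both components separately is what lets wallets show users where a fee spike comes from.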

Concrete cost anchors for L1 verification

Groth16 on BN254: ≈200k–300k gas typical; use EIP‑1108 formula to estimate precisely for your k pairings and EC ops. (eips.ethereum.org)
Plonk-KZG: commonly 600k–1M gas depending on implementation and public inputs; optimize away extraneous IO. (blog.zkcloud.com)
STARK direct verification: ≈5–6M gas per proof is a useful planning number; many teams SNARK‑wrap to reduce on‑chain cost. (blog.lambdaclass.com)
AVS verification (Aligned): order‑of‑magnitude 90–99% cheaper per proof; current gas ≈tens of thousands per proof with BLS aggregation on L1. Model both L1 gas and off‑chain verification fees. (docs.succinct.xyz)

7) Reference architectures you can ship

A. High-throughput zkEVM with recursion and blob-first DA

Prover: GPU-heavy pipeline; leaf circuits per block segment; Plonky3/STARK at leaves; Groth16 wrapper on BN254 for L1.
Aggregation: 2‑level recursion tree per batch (~2–4s aggregation budget on mid-tier GPUs).
DA: Ethereum blobs (target 6; max 9 post‑Pectra). Batcher posts 1–3 blobs per tx every 2–5 minutes when utilization >90%; otherwise time‑seal at 5 minutes.
L1 verify: Groth16 verifier with ≤4 pairings and minimal public inputs for ~200k–250k gas. (eips.ethereum.org)

Why it works: predictable blob costs, minimal verifier gas, fast settlement.

B. STARK rollup with SNARK‑wrapping + AVS fallback

Prover: STARK leaf proofs; periodic SNARK wrap (Plonk‑KZG) for L1.
Aggregation: SnarkPack if you have many heterogeneous app proofs; otherwise recursion.
Verification: AVS (Aligned) for frequent intra‑day verifications; direct L1 verify for final checkpoints every X hours. (docs.succinct.xyz)

Why it works: keeps per‑proof costs low during peak demand while preserving hard L1 finality regularly.

C. zkVM coprocessor rollup with external proof markets

Prover: SP1/Risc0 with GPU provers on a decentralized proving market (Succinct Prover Network, RISC Zero Boundless). Set quotas and pricing for bursts. (succinct.xyz)
Aggregation: periodic recursive accumulator sealed to L1. Optional: offer “priority proving” for high-value txs with SLA pricing.
DA: Blobs by default; DA abstraction supports migration to EigenDA/Avail if economics change. (coindesk.com)

Why it works: elastic capacity via open markets, predictable on‑chain footprint via recursion.

8) Don’t forget the blob lifecycle and data ops

Retrieval window: blobs expire ~18 days after inclusion (4096 epochs). Ensure your provers, full nodes, and analytics backfills pull data within window. Subscribe to an archival service (Blocknative Blob Archive API, Blockscout) for long-term access. (consensys.io)
Observability: index blob utilization (bytes/131,072), fee paid per blob gas, and share of batches that spill into an extra blob. Use these KPIs to tune compression and cadence. (specs.optimism.io)
Prepare for PeerDAS: Pectra set the stage with higher blob targets (6/9). PeerDAS will raise ceilings again (client-side sampling). Design your batcher to treat target/max as config, not constants, and rehearse parameter changes in staging. (ethereum.org)
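
Two of the points above reduce to simple configuration and arithmetic. The sketch below derives the retention window from the 4096-epoch figure and keeps blob target/max in a lookup table instead of constants. The fork keys and helper names are illustrative, and a future PeerDAS entry would simply be another row.

```python
from dataclasses import dataclass

SECONDS_PER_SLOT = 12
SLOTS_PER_EPOCH = 32
RETENTION_EPOCHS = 4096


@dataclass(frozen=True)
class BlobParams:
    target: int
    max: int


# Values from the text; new forks are added as config, not code changes.
FORK_PARAMS = {
    "dencun": BlobParams(target=3, max=6),
    "pectra": BlobParams(target=6, max=9),
}


def retention_seconds() -> int:
    """Minimum time consensus clients keep blobs available."""
    return RETENTION_EPOCHS * SLOTS_PER_EPOCH * SECONDS_PER_SLOT


def is_retrievable(included_at: int, now: int) -> bool:
    """True while the blob is still inside the retention window."""
    return now - included_at < retention_seconds()


window_days = retention_seconds() / 86_400   # a little over 18 days
```

Backfill jobs can call `is_retrievable` before hitting a beacon node and switch to the archival provider once it returns False.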

9) Worked examples: plug numbers into your roadmap

Example 1: You settle every 3 minutes with a Groth16 wrapper at 230k gas. With a 45M gas limit and a typical base+tip of 3 gwei, your per‑settlement cost is 230,000 × 3 gwei ≈ 0.00069 ETH. If you batch 10k L2 txs, that’s ≈0.000000069 ETH (6.9×10⁻⁸ ETH) per tx (excluding blob DA). Now add blob DA: at floor blob fees and 1–2 blobs per batch, DA often rounds to fractions of a cent. Keep a 5× surge buffer for spikes. (eips.ethereum.org)
Example 2: A STARK proof verified directly on L1 costs ~5–6M gas. At 3 gwei, that’s ≈0.015–0.018 ETH—too high for frequent verifies. SNARK‑wrap to 700k gas or route to an AVS that reduces on‑chain to ~40k gas, then checkpoint to L1 hourly. (community.starknet.io)
Example 3: Blob packing: your average batch compresses to 180 KiB. If you naively post each batch, it needs 2 blobs (256 KiB), so you waste ~76 KiB per batch: 52 KiB spills into a second blob that ends up only ~41% full. Instead, coalesce two batches (360 KiB) into 3 blobs (384 KiB capacity, ≈94% fill) with brotli‑10; post every 2–3 minutes. Track utilization and tweak MAX_CHANNEL_DURATION to keep ≥90% fill. (docs.optimism.io)
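
The settlement arithmetic in Examples 1 and 2 can be reproduced with integer wei math, which avoids floating-point rounding. The helper names are illustrative; the gas and price inputs are the ones used in the examples.

```python
GWEI = 10**9      # wei per gwei
ETHER = 10**18    # wei per ETH


def settlement_cost_wei(gas_used: int, gas_price_gwei: int) -> int:
    """Cost of one L1 verification transaction, in wei."""
    return gas_used * gas_price_gwei * GWEI


def per_tx_cost_wei(gas_used: int, gas_price_gwei: int, l2_txs: int) -> int:
    """Verification cost amortized over the L2 txs in the batch, in wei."""
    return settlement_cost_wei(gas_used, gas_price_gwei) // l2_txs


# Example 1: Groth16 wrapper, 230k gas at 3 gwei, 10k L2 txs per batch.
groth16_wei = settlement_cost_wei(230_000, 3)          # 0.00069 ETH
groth16_per_tx_wei = per_tx_cost_wei(230_000, 3, 10_000)

# Example 2: direct STARK verification at 5M and 6M gas, 3 gwei.
stark_low_wei = settlement_cost_wei(5_000_000, 3)      # 0.015 ETH
stark_high_wei = settlement_cost_wei(6_000_000, 3)     # 0.018 ETH
```

Swapping in your own gas figures and a stressed gas price (for instance 5× the typical one) gives a quick upper bound for the surge buffer.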

10) Security and upgrade playbook

Guardrails during upgrades:
- Run “shadow proofs” (parallel proofs not used for settlement) when rolling out new provers or recursion code, the way teams have deployed Boojum in staged mode. Promote only after matching live results over several days. (theblock.co)
Circuit/VM diversity:
- Keep a second verification path (e.g., different library or curve) for key circuits. Recent research shows even audited zkVMs have had soundness/completeness bugs discovered; diversity limits blast radius. (arxiv.org)
Verifier upgradability:
- If using EIP‑2537 BLS12‑381 precompiles, migrate verifiers with a time‑locked upgrade and community signaling; test gas and correctness on Sepolia/Holesky forks before mainnet. (ethereum.org)

11) A crisp checklist for CTOs and heads of product

Provers
- GPU-backed pipeline with clear SLOs and per‑tenant quotas.
- Recursion wrapper with minimized public inputs; budget <500k gas.
- Canary recursion on alternate stack in production.
Aggregation/Verification
- Choose between recursion, SnarkPack‑style aggregation, or AVS; document fallbacks.
- Verifier contracts abstracted for BN254/BLS12‑381 swap (post‑Pectra).
DA and batcher
- Blob‑first posting with calldata fallback; cadence tuned to ≥90% blob fill.
- Multi‑blob txs capped at 1–3 initially; replacement policy and fee doublings handled.
- Blob archival subscription; retrieval within 18 days guaranteed.
Fees
- Publish fee breakdown (L2 compute vs L1 pubdata/verify).
- gas_per_pubdata_limit configured; batch overhead allocation in place.

Ship this and you’ll have a rollup that scales proofs and fees with demand, not against it.