By AUJay
What’s the Most Gas-Efficient Way to Batch Thousands of Groth16 Proofs on Ethereum for a High-Throughput Rollup?
Short answer: stop verifying individual Groth16 proofs on L1. Use proof aggregation/recursion so a single on-chain verification attests to the validity of the whole batch, and architect your contracts around “one constant-cost check per batch + compact membership commitments,” not per-proof checks. With Ethereum’s 2025 Pectra upgrade adding BLS12‑381 precompiles, the most future-proof path is Groth16-on‑BLS12‑381 or a Groth16→KZG/Halo2 wrapper, verified once on L1 for a few hundred thousand gas. (blog.ethereum.org)
TL;DR for decision‑makers
- A single Groth16 verification costs ~207,700 gas fixed + ~7,160 gas per public input on BN254; thousands of per‑tx verifications will never fit into L1 gas budgets. Aggregate or recurse instead. (medium.com)
- Best-in-class today: aggregate off-chain (SnarkPack for Groth16, or Halo2/KZG-style UPA) and verify one aggregated proof on L1 for roughly 300k–600k gas, depending on scheme and batch size. Pectra’s BLS12‑381 precompiles (EIP‑2537) make BLS‑curve verification cheap and secure on mainnet as of May 7, 2025. (research.protocol.ai)
The constraint: why naive multi-verify fails
Verifying one Groth16 proof on BN254 (alt_bn128) is dominated by the pairing precompile, introduced in EIP‑197 and repriced by EIP‑1108. In practice you pay a fixed ~207,700 gas plus ~7,160 gas per public signal. Even if you cram multiple proofs into one transaction, total gas scales essentially linearly with the number of proofs, because each proof adds its own multi‑scalar multiplications and enlarges the final multi‑pairing. At 200k–250k gas each, 1,000 proofs would exceed 200M gas, far beyond L1’s ~45M gas per block. (medium.com)
Key constants you can design against (BN254 precompiles):
- ECADD (0x06): 150 gas
- ECMUL (0x07): 6,000 gas
- Pairing (0x08): 45,000 + 34,000·k gas (k = number of pairings)
These are post‑EIP‑1108 figures and underpin the ~207.7k fixed + ~7.16k per public signal formula observed in on-chain benchmarks. (eips.ethereum.org)
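As a rough sketch of how these constants turn into a budget, the snippet below encodes the BN254 pairing formula and the ~207,700 + ~7,160‑per‑input estimate quoted above, then counts how many standalone verifications would fit in one block. The function names and the 45M block gas limit default are illustrative assumptions, and the Groth16 figures are benchmark approximations rather than protocol constants.

```python
# Post-EIP-1108 BN254 precompile prices (from the list above).
ECADD_GAS = 150
ECMUL_GAS = 6_000
PAIRING_BASE_GAS = 45_000
PAIRING_PER_PAIR_GAS = 34_000

# Empirical Groth16 estimate quoted in the text (approximate, not a spec value).
GROTH16_FIXED_GAS = 207_700
GROTH16_PER_INPUT_GAS = 7_160


def bn254_pairing_gas(k: int) -> int:
    """Gas for one call to the pairing precompile (0x08) with k pairs."""
    return PAIRING_BASE_GAS + PAIRING_PER_PAIR_GAS * k


def groth16_verify_gas(public_inputs: int) -> int:
    """Approximate gas for one standalone Groth16 verification on BN254."""
    return GROTH16_FIXED_GAS + GROTH16_PER_INPUT_GAS * public_inputs


def proofs_per_block(public_inputs: int, block_gas_limit: int = 45_000_000) -> int:
    """How many standalone verifications fit if the block held nothing else."""
    return block_gas_limit // groth16_verify_gas(public_inputs)


if __name__ == "__main__":
    print(bn254_pairing_gas(4))      # pairing share of a 4-pair verifier
    print(groth16_verify_gas(2))     # one proof with two public inputs
    print(proofs_per_block(2))       # standalone proofs per 45M-gas block
```

With two public inputs this gives 222,020 gas per proof, so only about 202 standalone verifications would fill an entire 45M‑gas block. The 4‑pair pairing call alone accounts for 181,000 of the ~207,700 fixed cost.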
The 2025–2026 reality also includes EIP‑7623 (calldata repricing), which penalizes data‑heavy txs and nudges you to reduce calldata bytes. Shipping thousands of raw proofs or large public input vectors is now doubly expensive. (eips.ethereum.org)
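To see when the EIP‑7623 floor actually bites, here is a minimal sketch of the transaction gas rule for a non‑creation transaction, using the constants as we read them in the EIP (4 gas per token under the standard rule, 10 gas per token under the floor). Treat the constants as assumptions to confirm against the final spec; the function names are ours.

```python
# EIP-7623 constants as we read them in the EIP; confirm before relying on them.
STANDARD_TOKEN_COST = 4          # gas per calldata token, standard rule
TOTAL_COST_FLOOR_PER_TOKEN = 10  # gas per calldata token, floor rule
TX_BASE_GAS = 21_000             # intrinsic cost of any transaction


def calldata_tokens(data: bytes) -> int:
    """Tokens = zero bytes + 4 * non-zero bytes."""
    zero_bytes = data.count(0)
    return zero_bytes + 4 * (len(data) - zero_bytes)


def tx_gas(data: bytes, execution_gas: int) -> int:
    """Total gas for a non-creation transaction under EIP-7623."""
    tokens = calldata_tokens(data)
    standard = STANDARD_TOKEN_COST * tokens + execution_gas
    floor = TOTAL_COST_FLOOR_PER_TOKEN * tokens
    return TX_BASE_GAS + max(standard, floor)


if __name__ == "__main__":
    # Hypothetical 1 KiB aggregated proof with ~350k gas of verifier execution.
    print(tx_gas(b"\x01" * 1024, 350_000))
    # Hypothetical 1,000 raw 256-byte proofs with little execution: floor applies.
    print(tx_gas(b"\x01" * 256_000, 50_000))
```

In the first case the floor is irrelevant (387,384 gas in total, dominated by execution). In the second the floor rule sets the price at 10,261,000 gas, which is the "doubly expensive" effect described above.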
What changed in 2025: BLS12‑381 precompiles on mainnet
Ethereum’s Pectra upgrade (mainnet activation at epoch 364032 on May 7, 2025) shipped EIP‑2537: seven BLS12‑381 precompiles (G1/G2 add, MSM, field-to-curve maps, and the multi‑pairing check). This finally makes on‑chain verification over BLS12‑381 fast, predictable, and first‑class. Meta‑EIP 7600 lists 2537 as included. Bottom line: you’re no longer forced into BN254 only because of precompiles; you can instantiate verifiers/aggregators over BLS12‑381 with ~128‑bit security. (blog.ethereum.org)
Indicative pairing costs:
- BN254 pairing: 45,000 + 34,000·k gas (EIP‑1108)
- BLS12‑381 pairing: precompile at 0x0f, priced at 37,700 + 32,600·k gas in EIP‑2537; the cost is competitive with BN254 at higher security. Use the MSM precompiles (0x0c for G1, 0x0e for G2) to keep verifier math cheap. (eips.ethereum.org)
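A quick side‑by‑side of the two pairing precompiles, as a sketch. The BN254 formula is the EIP‑1108 one above; the BLS12‑381 formula (37,700 + 32,600·k) is our reading of the final EIP‑2537 pricing and should be confirmed against the spec before you budget on it.

```python
def bn254_pairing_gas(k: int) -> int:
    """EIP-1108 price of the BN254 pairing precompile (0x08) for k pairs."""
    return 45_000 + 34_000 * k


def bls12_381_pairing_gas(k: int) -> int:
    """Assumed EIP-2537 price of the BLS12-381 pairing check (0x0f) for k pairs."""
    return 37_700 + 32_600 * k


if __name__ == "__main__":
    print(f"{'pairs':>5} {'BN254':>10} {'BLS12-381':>10}")
    for k in (2, 4, 13):
        print(f"{k:>5} {bn254_pairing_gas(k):>10,} {bls12_381_pairing_gas(k):>10,}")
```

Under these assumptions the BLS12‑381 check is slightly cheaper at every pair count shown (for example 168,100 vs. 181,000 gas at k = 4), although BLS12‑381 points are larger, so calldata grows.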
The playbook: four ways to batch thousands of Groth16 proofs
1) “Just batch-verify many Groth16s in Solidity” (don’t)
- You can combine per‑proof pairing equations into a single multi‑pairing call, which saves the 45k base cost on every call after the first, but total k still grows at roughly 3–4 pairs per proof (≈4·n for a typical Groth16 verifier), so you remain linear in n. This won’t scale to thousands of proofs under L1 gas limits. Use it only as a stopgap for small n. (eips.ethereum.org)
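The linear growth is easy to check numerically. The sketch below assumes one multi‑pairing call holding 4 pairs per proof on BN254 and counts only the pairing precompile, ignoring calldata and EC operations, so it is an optimistic lower bound. The 45M block gas limit is an assumed parameter.

```python
PAIRING_BASE_GAS = 45_000
PAIRING_PER_PAIR_GAS = 34_000


def batched_pairing_gas(n_proofs: int, pairs_per_proof: int = 4) -> int:
    """Pairing gas when all proofs share one multi-pairing call on BN254."""
    return PAIRING_BASE_GAS + PAIRING_PER_PAIR_GAS * pairs_per_proof * n_proofs


def max_proofs_in_block(block_gas_limit: int = 45_000_000,
                        pairs_per_proof: int = 4) -> int:
    """Largest batch whose pairing cost alone still fits in one block."""
    per_proof = PAIRING_PER_PAIR_GAS * pairs_per_proof
    return (block_gas_limit - PAIRING_BASE_GAS) // per_proof


if __name__ == "__main__":
    print(batched_pairing_gas(1_000))   # pairing gas for 1,000 proofs
    print(max_proofs_in_block())        # ceiling with 4 pairs per proof
```

For 1,000 proofs the pairing component alone is 136,045,000 gas, and even a block devoted entirely to pairings tops out at about 330 proofs.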
2) Groth16→Groth16 aggregation with SnarkPack (logarithmic verification)
- SnarkPack aggregates n Groth16 proofs into one aggregated object with logarithmic verification time and proof size, requiring no extra trusted setup beyond two PoT (powers‑of‑tau) transcripts. Filecoin‑scale engineering shows 8,192 proofs aggregated in about 8–9 seconds on a 32‑core CPU, with verification in tens of milliseconds off‑chain. On chain, your gas is mostly the small number of pairings + a handful of MSMs. You can instantiate on BN254 or, post‑Pectra, on BLS12‑381. (research.protocol.ai)
Practical budgeting tip:
- If your SnarkPack verifier uses, say, O(log n) ≈ 13 pairs in its multi‑pairing for n = 8,192, the BN254 pairing component alone would be ≈ 45,000 + 34,000·13 = 487,000 gas, plus calldata and some EC ops: a low single‑digit percentage of a ~45M‑gas block. BLS12‑381 costs are of similar magnitude with stronger security. Treat this as an engineering estimate; measure your exact pair count during integration. (eips.ethereum.org)
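The budgeting tip can be turned into a small estimator. This sketch adopts the same simplification as the text, namely that the pair count equals ⌈log2 n⌉; a real SnarkPack verifier's pair count will differ and should be measured. The function name is hypothetical.

```python
def snarkpack_pairing_estimate(n_proofs: int,
                               base_gas: int = 45_000,
                               per_pair_gas: int = 34_000) -> tuple[int, int]:
    """Return (assumed pair count, BN254 pairing gas) for a batch of n proofs.

    Assumption: pairs == ceil(log2(n)), mirroring the rough estimate in the text.
    """
    if n_proofs < 2:
        raise ValueError("aggregation needs at least two proofs")
    pairs = (n_proofs - 1).bit_length()  # integer ceil(log2(n))
    return pairs, base_gas + per_pair_gas * pairs


if __name__ == "__main__":
    for n in (1_024, 8_192, 65_536):
        pairs, gas = snarkpack_pairing_estimate(n)
        print(f"n={n:>6}  pairs={pairs:>2}  pairing gas={gas:,}")
```

Going from 1,024 to 65,536 proofs multiplies the batch by 64 but adds only six pairs (385,000 to 589,000 gas under this model), which is the whole point of logarithmic verification.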
Where SnarkPack shines:
- You keep the Groth16 stack you already have, avoid per‑proof storage writes, and only commit to a batch root/public‑input accumulator on L1.
3) Universal Proof Aggregation (Halo2/KZG) and SNARK wrappers
- A practical alternative is to wrap many Groth16 proofs into a single Halo2‑KZG proof and verify that once on L1. Teams shipping this today report a baseline of ~350k gas per aggregated proof verification, with ≈7k gas per included proof only if you also record per‑proof status in contract storage/events (which a rollup usually doesn’t need). If you don’t persist per‑proof state, you’re near the ~350k constant. (blog.nebra.one)
- This approach leverages both EIP‑2537 (BLS12‑381 arithmetic/pairings) and EIP‑4844’s KZG precompile (0x0a) where applicable, keeping L1 verification in the few‑hundred‑thousand gas range even for big batches. (eips.ethereum.org)
When to prefer this:
- Heterogeneous proof systems headed to one settlement check; you want a universal, circuit‑independent setup (Halo2 with KZG) and a well‑maintained “universal aggregator” operated by your team or a vendor.
4) Off-chain verification with BLS attestation, optionally followed by recursion
- Services like Aligned’s Proof Verification Layer verify thousands of proofs off‑chain across a decentralized operator set and post a single BLS aggregate signature back to Ethereum. One batch with a single proof (any system) costs ~350k gas; at batch size 20, the cost is ~40k gas per proof, with verification latency in milliseconds. Their separate Proof Aggregation Service then recursively compresses verified proofs into a single on‑chain proof (~300k gas) if and when you need “hard L1 finality.” This “two‑lane” design (fast AVS + recursive L1 proof) is proving popular for throughput‑sensitive apps. (blog.alignedlayer.com)
Which is “most gas‑efficient” for a rollup?
For a rollup that must settle on L1:
- If you control the proving stack and can tolerate a few minutes of aggregation latency: adopt SnarkPack aggregation (Groth16→Groth16) or a Halo2/KZG wrapper and verify one aggregated proof per batch on L1. Per batch verification stays ~3e5–6e5 gas, independent of the number of constituent proofs. In exchange, you pay off‑chain aggregation time and hardware. (research.protocol.ai)
- If you need sub‑second attestations: consider an AVS like Aligned for immediate BLS‑backed results (~350k gas per batch, per the figures above), and optionally post a recursive proof later for finality. This yields the lowest end‑to‑end latency, at the cost of relying on an extra trust‑minimized layer until the recursive proof lands. (blog.alignedlayer.com)
Either way, the winning pattern is constant‑ish on‑chain work per batch—not per proof.
Concrete budgeting: from 10,000 Groth16s to one L1 check
Assume 10,000 small Groth16 proofs (few public inputs):
- Baseline (no aggregation): 10,000 × ~220k gas ≈ 2.2B gas. Impossible on L1. (medium.com)
- SnarkPack‑style (engineering estimate): O(log n) pairings + MSMs. Suppose k ≈ 14 on BN254: pairing gas ≈ 45,000 + 34,000·14 = 521,000. Add calldata and EC ops; expect the total to stay well under 1M gas for the verify step. That’s a more than 2,000× reduction in on‑chain cost for the same correctness claim. Measure pair counts in your actual verifier to refine this. (eips.ethereum.org)
- Halo2/KZG wrapper (as reported): ~350k gas per aggregated verification; if your batcher emits no per‑proof storage, there’s almost no per‑proof additive gas. Two or three such proofs can settle 10,000 inputs if you use a tree of recursion off‑chain. (docs.nebra.one)
On L2s, the calculus is similar: public numbers show ≈775k L2 execution gas per aggregated verify for batch size 32 (~24k gas per proof), or ~46k gas per proof in total once the query is included, again confirming the amortization effect. (docs.nebra.one)
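The arithmetic in this section can be reproduced in a few lines. The sketch below uses only figures quoted above (~220k gas per standalone proof, the 521,000‑gas pairing estimate, the reported ~350k wrapper verify, and the 775k/32 L2 data point). Note that the SnarkPack number is pairing‑only, so its reduction factor is an upper bound on what you would see in practice.

```python
N_PROOFS = 10_000
STANDALONE_GAS_PER_PROOF = 220_000   # approximate figure from the text

BASELINE_GAS = N_PROOFS * STANDALONE_GAS_PER_PROOF

# Per-batch verify costs quoted in the text (estimates, not measurements).
ROUTES = {
    "snarkpack (pairing component only)": 521_000,
    "halo2/kzg wrapper (as reported)": 350_000,
}


def reduction_factor(baseline_gas: int, aggregated_gas: int) -> float:
    """How many times cheaper the aggregated route is than the baseline."""
    return baseline_gas / aggregated_gas


def amortized_gas(batch_gas: int, batch_size: int) -> float:
    """Verification gas attributed to each proof in a batch."""
    return batch_gas / batch_size


if __name__ == "__main__":
    print(f"baseline: {BASELINE_GAS:,} gas")
    for name, gas in ROUTES.items():
        print(f"{name}: {gas:,} gas, ~{reduction_factor(BASELINE_GAS, gas):,.0f}x cheaper")
    print(f"L2 example: {amortized_gas(775_000, 32):,.2f} gas per proof")
```

Even if calldata and EC operations pushed the SnarkPack verify to a full 1M gas, the reduction would still be 2,200×.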
Engineering blueprint (what we implement for clients)
- Pick your curve and aggregation route
  - Greenfield (new circuits): prefer Groth16 over BLS12‑381 or a Halo2/KZG outer wrapper. BLS12‑381 gives ~128‑bit security and now has native precompiles at 0x0b–0x11; MSMs and pairings are cheap enough to be first‑class in Solidity. (eips.ethereum.org)
  - Existing BN254 circuits: either (a) keep BN254 and aggregate with SnarkPack on BN254; or (b) wrap BN254 Groth16 proofs into a BLS12‑381 Halo2/KZG aggregator, then verify once on L1 via the BLS12 precompiles. Both avoid per‑proof verification. (research.protocol.ai)
- Minimize public inputs and calldata
  - Groth16 gas grows ~7,160 gas per public input on BN254; EIP‑7623 increases costs for calldata‑heavy txs. Hash or Merkle‑commit per‑proof public inputs off‑chain and expose only a batch root. This keeps your on‑chain payload small and EIP‑7623‑safe. (hackmd.io)
- Commit to “which proofs are included” without per‑proof writes
  - Include a Merkle root (or vector commitment) of the per‑proof public inputs/hints inside the aggregated statement and verify exactly one proof on L1. If you need selective membership checks in other contracts, verify an inclusion proof against the batch root instead of touching storage for every proof. This drops the ~7k storage/bookkeeping overhead reported by shared aggregators. (docs.nebra.one)
- Keep the L1 verifier modular and upgradable
  - Put the verifier behind a timelocked proxy so you can swap BN254↔BLS12‑381 or SnarkPack↔Halo2 as the ecosystem evolves. Pectra’s precompiles mean your future upgrades can be cleaner and cheaper. (eips.ethereum.org)
- Provision aggregation hardware for your SLO
  - SnarkPack: 8,192 proofs in ~8–9 seconds on a 32‑core CPU is a useful anchor; for 100k+ proofs per batch, use a tree (local aggregations → higher‑level aggregate) to parallelize. Halo2/KZG recursion latencies are generally minutes for large batches; use a two‑tier strategy (fast BLS attestation, slower recursive proof) if you need immediate UX. (research.protocol.ai)
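The “commit to which proofs are included” step above can be sketched off‑chain as follows. This is an illustration under stated assumptions, not a production scheme: SHA‑256 stands in for the keccak256 you would use on Ethereum, odd levels are padded with a fixed 32‑byte zero node, and all function names are hypothetical.

```python
import hashlib

PAD = bytes(32)  # assumed padding node for levels with an odd number of entries


def _h(data: bytes) -> bytes:
    # SHA-256 is a stand-in; an on-chain verifier would normally use keccak256.
    return hashlib.sha256(data).digest()


def leaf_hash(public_inputs: bytes) -> bytes:
    return _h(b"\x00" + public_inputs)      # domain-separate leaves...


def node_hash(left: bytes, right: bytes) -> bytes:
    return _h(b"\x01" + left + right)       # ...from inner nodes


def build_levels(leaves: list[bytes]) -> list[list[bytes]]:
    """All tree levels, bottom-up; the batch root is levels[-1][0]."""
    if not leaves:
        raise ValueError("need at least one leaf")
    level = [leaf_hash(leaf) for leaf in leaves]
    levels = []
    while True:
        if len(level) > 1 and len(level) % 2 == 1:
            level = level + [PAD]
        levels.append(level)
        if len(level) == 1:
            return levels
        level = [node_hash(level[i], level[i + 1]) for i in range(0, len(level), 2)]


def inclusion_proof(levels: list[list[bytes]], index: int) -> list[bytes]:
    """Sibling hashes from the leaf at `index` up to (not including) the root."""
    proof = []
    for level in levels[:-1]:
        proof.append(level[index ^ 1])
        index //= 2
    return proof


def verify_inclusion(root: bytes, public_inputs: bytes, index: int,
                     proof: list[bytes]) -> bool:
    """What a downstream contract would recompute against the stored batch root."""
    node = leaf_hash(public_inputs)
    for sibling in proof:
        node = node_hash(node, sibling) if index % 2 == 0 else node_hash(sibling, node)
        index //= 2
    return node == root


if __name__ == "__main__":
    inputs = [b"proof-0-inputs", b"proof-1-inputs", b"proof-2-inputs"]
    levels = build_levels(inputs)
    root = levels[-1][0]
    print(verify_inclusion(root, inputs[2], 2, inclusion_proof(levels, 2)))
```

Only `root` needs to reach L1 as part of the aggregated statement. A membership check then costs a logarithmic number of hashes in the consuming contract instead of one storage write per proof in the verifier.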
Solidity-level notes that save gas
- Use one multi‑pairing call per verification. Whether BN254 (0x08) or BLS12‑381 (0x0f), marshal all required pairs into a single call. This avoids paying the precompile’s base cost more than once. (eips.ethereum.org)
- On BLS12‑381, prefer MSM precompiles (0x0c/0x0e) over hand‑rolled ECMUL loops. They’re priced with batched discounts and avoid scaffolding overhead. (eips.ethereum.org)
- If you must verify signatures (e.g., operator attestations), “fast aggregate verify” costs roughly one k=2 pairing check plus a hash‑to‑curve mapping for the common‑message case: about 103k gas for the pairing on BLS12‑381 (37,700 + 32,600·2), excluding the mapping and calldata. Aggregate pubkeys off‑chain. (eips.ethereum.org)
- Avoid per‑proof SSTORE. If you need auditability, emit a single event with the batch root and reuse it across contracts as the reference. That mirrors the ~350k‑only pattern in Halo2/KZG aggregators where per‑proof storage is optional. (docs.nebra.one)
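To make the “one multi‑pairing call” note concrete, the sketch below models the two pairing precompiles as data and compares call cost and input size. The BLS12‑381 prices and the encoded point sizes (128‑byte G1, 256‑byte G2) are our reading of EIP‑2537 and are assumptions to confirm; the class and function names are ours. It is a cost comparison only: splitting one product‑of‑pairings check across several calls is generally not equivalent to the original check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PairingPrecompile:
    address: int
    base_gas: int
    per_pair_gas: int
    bytes_per_pair: int  # encoded G1 point + encoded G2 point


BN254 = PairingPrecompile(0x08, 45_000, 34_000, 64 + 128)
BLS12_381 = PairingPrecompile(0x0F, 37_700, 32_600, 128 + 256)  # assumed values


def call_gas(p: PairingPrecompile, pairs: int) -> int:
    """Gas for a single precompile call carrying `pairs` pairs."""
    return p.base_gas + p.per_pair_gas * pairs


def input_length(p: PairingPrecompile, pairs: int) -> int:
    """Bytes you must marshal into memory for one call."""
    return p.bytes_per_pair * pairs


def extra_base_cost(p: PairingPrecompile, calls: int) -> int:
    """Gas wasted on repeated base costs when using `calls` calls instead of one."""
    return p.base_gas * (calls - 1)


if __name__ == "__main__":
    print(call_gas(BN254, 4), input_length(BN254, 4))
    print(2 * call_gas(BN254, 2) - call_gas(BN254, 4))   # equals extra_base_cost(BN254, 2)
    print(call_gas(BLS12_381, 2), input_length(BLS12_381, 2))
```

On BN254, splitting a 4‑pair check into two 2‑pair calls costs one extra 45,000‑gas base. On BLS12‑381, the k = 2 check used by fast aggregate verify comes to 102,900 gas under these assumed prices.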
“Which option should we pick?” A quick decision guide
- You already run Groth16 and want minimal changes, strong L1 finality, and the lowest L1 gas: choose SnarkPack aggregation over BN254 or BLS12‑381 (post‑Pectra), and verify once per batch on L1. Expect sub‑million‑gas verifies even for multi‑thousand‑proof batches. (research.protocol.ai)
- You want a drop‑in service, a universal (circuit‑independent) setup, and predictable budgets: use a Halo2/KZG universal aggregator; design your batch interface to avoid per‑proof storage so L1 verification stays ≈350k gas. (docs.nebra.one)
- You need near‑instant attestations plus optional hard finality later: use an AVS like Aligned for off‑chain verification and BLS aggregation (~350k gas per batch, ~40k/proof at n=20), then post a recursive proof (~300k gas) on a cadence that fits your risk tolerance. (blog.alignedlayer.com)
Example: turning 8,192 Groth16 proofs into one L1 check
- Off‑chain: aggregate with SnarkPack (≈8–9s on a 32‑core CPU). Include a Merkle root of all per‑proof public inputs in the aggregated statement. (research.protocol.ai)
- On‑chain: verify the aggregated proof via one multi‑pairing call (BN254 0x08 or BLS12‑381 0x0f). Engineering estimate for the pairing component at O(log n) ≈ 13 pairs on BN254 is ≈ 487k gas; with calldata and a few EC ops you’re typically well below 1M gas. That’s an ~99.95% reduction vs. 8,192 standalone verifies. Validate your exact pair count and calldata size during integration. (eips.ethereum.org)
If you prefer a wrapper: generate a single Halo2/KZG proof attesting “all 8,192 Groth16s verified,” then verify that once on L1 for ≈350k gas; no per‑proof storage if you’re only settling the rollup state root. (docs.nebra.one)
2026‑ready best practices (brief but in‑depth)
- Design for BLS12‑381 first, unless you’re locked into BN254: Pectra gives you fast BLS MSMs + pairings, higher security, and simpler interop with blob/KZG tooling. Keep a path to swap verifiers if you still deploy BN254 today. (eips.ethereum.org)
- Keep public inputs tiny: hash‑to‑field off‑chain; commit to roots on‑chain. Every extra public input adds ~7,160 gas (BN254 Groth16), and EIP‑7623 made calldata more expensive for data‑dominant txs. (hackmd.io)
- Separate “attestation latency” from “L1 finality”: if UX needs sub‑second confirmation, use BLS‑aggregated attestations first; post a recursive proof on a schedule (e.g., every N blocks) to cement finality at constant cost. (blog.alignedlayer.com)
- Avoid per‑proof accounting on L1: store a root; if downstream contracts need per‑proof checks, let them verify Merkle inclusion against that root. This eliminates ~7k gas/proof book‑keeping that universal aggregators pay when they track proofs individually. (docs.nebra.one)
- Measure, don’t guess: before mainnet, measure your verifier’s exact multi‑pairing count and calldata footprint; then use EIP‑1108/EIP‑2537 pricing to compute precise upper bounds you can defend in a board deck. (eips.ethereum.org)
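Once you have measured numbers, the “measure, don’t guess” step reduces to a short calculation. The sketch below combines the pairing price, a worst‑case calldata charge (every byte non‑zero, EIP‑7623 rule), and a safety margin. The BLS12‑381 and EIP‑7623 constants are our reading of the respective EIPs, and `other_execution_gas` is a hypothetical parameter standing for whatever your own gas profiling reports for MSMs, hashing, and contract overhead.

```python
PAIRING_PRICES = {
    "bn254": (45_000, 34_000),       # EIP-1108: base, per pair
    "bls12_381": (37_700, 32_600),   # EIP-2537 (assumed): base, per pair
}


def pairing_gas(curve: str, pairs: int) -> int:
    base, per_pair = PAIRING_PRICES[curve]
    return base + per_pair * pairs


def verify_tx_upper_bound(curve: str, pairs: int, calldata_bytes: int,
                          other_execution_gas: int, margin_bps: int = 1_000) -> int:
    """Upper bound on total tx gas, rounded up, with a margin in basis points."""
    tokens = 4 * calldata_bytes                      # worst case: no zero bytes
    execution = pairing_gas(curve, pairs) + other_execution_gas
    total = 21_000 + max(4 * tokens + execution, 10 * tokens)
    return -(-total * (10_000 + margin_bps) // 10_000)   # ceiling division


if __name__ == "__main__":
    # Hypothetical measurements: 13 pairs, 2 KiB of calldata, 100k other gas.
    print(verify_tx_upper_bound("bn254", 13, 2_048, 100_000))
    print(verify_tx_upper_bound("bls12_381", 13, 2_048, 100_000))
```

With these illustrative inputs the BN254 bound comes to 704,845 gas including a 10% margin. Swap in your measured pair count, calldata size, and execution gas to get a figure you can defend.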
Bottom line
- The most gas‑efficient way to batch thousands of Groth16 proofs for a high‑throughput rollup is to avoid per‑proof verification entirely and settle one aggregated or recursive proof on L1.
- If you’re greenfield, target BLS12‑381 (post‑Pectra) and adopt SnarkPack aggregation or a Halo2/KZG wrapper for a single ~3e5–6e5‑gas verify per batch. If you’re latency‑sensitive, combine fast BLS attestation with periodic recursive proofs. This is the design pattern that closes both your gas and throughput budgets in 2026. (eips.ethereum.org)
References and further reading
- Groth16 verification cost breakdown and EIP‑1108 pricing: ~207,700 fixed + ~7,160 per public input; BN254 precompile gas. (medium.com)
- SnarkPack (Groth16 aggregation): 8,192 proofs aggregated in ≈8–9s; logarithmic verifier. (research.protocol.ai)
- Pectra mainnet activation (May 7, 2025) and inclusion of EIP‑2537 (BLS12‑381 precompiles). (blog.ethereum.org)
- NEBRA UPA (Halo2/KZG aggregation) gas: ~350k per aggregated verify (+ optional ~7k/proof storage). (docs.nebra.one)
- Aligned Verification Layer and Aggregation Service: per‑batch ~350k gas; ~40k/proof at batch size 20; recursive proofs ~300k gas. (blog.alignedlayer.com)

