# Benchmarks
AionDB includes local benchmark harnesses under `benchmarks/`. They are intended to make performance claims reproducible and tied to a specific commit, dataset, and machine.
The benchmark docs are deliberately conservative. A fast number without the command, dataset size, durability mode, hardware, and raw output should not be treated as a product claim.
## Available harnesses
List the available harnesses and their options with:

```bash
benchmarks/run.sh --help
```
Current benchmark families:
- `pgbench` for OLTP microbenchmarks.
- `surreal-suite` for SurrealDB 3 article-style CRUD, scan, graph, index, full-text, and vector tests against SurrealDB WS, AionDB pgwire, and PostgreSQL with pgvector/AGE.
- `tpch` for analytical SQL workloads.
- `tpcds` for analytical SQL workloads.
- `job` for join-heavy workloads based on the Join Order Benchmark.
The harnesses are tools, not claims. A benchmark family being present means the repository has a path to run it; it does not mean every query shape is optimized or that AionDB should be expected to win.
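The analytical families take the same single-command shape as the OLTP ones; a minimal sketch, assuming `tpch` honors the same engine defaults as `pgbench`:

```bash
# Run the TPC-H family at scale factor 1 (the harness default listed below)
TPCH_SCALE=1 benchmarks/run.sh tpch
```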
## Basic usage
Run AionDB and PostgreSQL side by side when you want a local reference point:

```bash
benchmarks/run.sh pgbench
```

Run only AionDB:

```bash
BENCH_ENGINES=aiondb benchmarks/run.sh pgbench
```

Run only PostgreSQL:

```bash
BENCH_ENGINES=pg benchmarks/run.sh pgbench
```
Run the SurrealDB 3 article-style comparison:
```bash
SURREAL_SUITE_ITERATIONS=1 \
SURREAL_SUITE_DURATION_SECONDS=20 \
SURREAL_SUITE_ROWS=2000 \
benchmarks/run.sh surreal-suite
```
This wrapper first runs each test as a warmup pass across all selected engines, then one measured 20-second pass across all engines, and writes raw traces, metadata, per-iteration CSV, and summaries under `benchmarks/.state/surreal-suite/<run-id>/`.
Render the latest run as a docs page with visual bars and pgstack-relative ratios:
```bash
benchmarks/surreal-suite/render_docs.py \
  benchmarks/.state/surreal-suite/<run-id> \
  --out docs/content/documentation/evaluate/benchmark-results.md
```
The generated snapshot is visible in Benchmark Results.
## Important variables
```bash
AIONDB_PORT=15432
PG_PORT=5432
PG_DB=bench_ref
TPCH_SCALE=1
TPCDS_SCALE=1
BENCH_AUTO_CLEAN=1
SURREAL_SUITE_ENGINES="surrealdb aiondb pgstack"
SURREAL_SUITE_ITERATIONS=1
SURREAL_SUITE_DURATION_SECONDS=20
SURREAL_SUITE_WARMUP_SECONDS=3
SURREAL_SUITE_ROWS=2000
```
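A sketch of combining these, with illustrative values rather than recommended ones:

```bash
# Illustrative values: more rows, a longer measured pass, two measured iterations
SURREAL_SUITE_ROWS=10000 \
SURREAL_SUITE_DURATION_SECONDS=60 \
SURREAL_SUITE_ITERATIONS=2 \
benchmarks/run.sh surreal-suite
```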
The heavier benchmarks may require external tools or datasets. Read the output before treating a run as comparable.
## Picking the right benchmark
| Goal | Benchmark shape |
|---|---|
| Connection and transaction smoke | small pgbench run |
| Write-path comparison | pgbench with disclosed WAL policy |
| SurrealDB 3 article-style comparison | surreal-suite with raw output retained |
| Analytical scans | TPC-H or TPC-DS subset |
| Join optimizer pressure | Join Order Benchmark |
| Hybrid graph/vector claim | custom schema with published SQL |
For hybrid claims, standard SQL benchmarks are not enough. Publish a small workload that includes relational filters, relationship tables, and vector ranking so readers can inspect the model.
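A minimal sketch of such a workload, assuming pgvector-style vector syntax and hypothetical table, column, and database names (`articles`, `article_tags`, `embedding`, `bench`), run over the PostgreSQL wire protocol:

```bash
# Hypothetical hybrid query: relational filter + relationship join + vector ranking.
# Schema and database names are illustrative, not part of any shipped benchmark.
psql -h 127.0.0.1 -p 15432 -d bench <<'SQL'
SELECT a.id, a.title
FROM articles a
JOIN article_tags t ON t.article_id = a.id
WHERE t.tag = 'databases'
ORDER BY a.embedding <-> '[0.1, 0.2, 0.3]'
LIMIT 10;
SQL
```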
## SurrealDB suite comparison
The `surreal-suite` harness mirrors the public SurrealDB 3 benchmark families by name: create, read, update, scans, filters, ordering, grouping, subqueries, graph traversals, index build/remove, indexed scans, full-text, and HNSW vector search.
That choice is deliberate. Instead of inventing a benchmark mix that happens to fit AionDB especially well, the suite reuses the benchmark families SurrealDB itself publicly highlighted so the comparison is less biased toward workloads chosen by AionDB.
The protocol paths are explicit:
- SurrealDB uses WebSocket JSON-RPC.
- AionDB uses the PostgreSQL wire protocol.
- The PostgreSQL stack uses the PostgreSQL wire protocol plus `pgvector` for vectors and Apache AGE for Cypher graph tests.
If `vector` or `age` is not installed in the local PostgreSQL cluster, the affected PostgreSQL-stack tests are marked UNSUPPORTED and the raw extension error is kept in the trace.
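A quick local check, assuming `psql` can reach the reference cluster configured above:

```bash
# List whether the pgvector and AGE extensions are installable in this cluster
psql -h 127.0.0.1 -p 5432 -d bench_ref -c \
  "SELECT name, default_version FROM pg_available_extensions WHERE name IN ('vector', 'age');"
```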
Useful variables:
```bash
SURREAL_SUITE_ENGINES="surrealdb aiondb pgstack"
SURREAL_SUITE_ROWS=2000
SURREAL_SUITE_WARMUP_SECONDS=3
SURREAL_SUITE_ITERATIONS=1
SURREAL_SUITE_DURATION_SECONDS=20
SURREAL_SUITE_TESTS=all
SURREAL_PATH=memory
```
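For example, a two-engine run that keeps SurrealDB on its in-memory backend; the engine names follow the space-separated list format shown above:

```bash
# Compare only SurrealDB and AionDB, skipping the PostgreSQL stack
SURREAL_SUITE_ENGINES="surrealdb aiondb" \
SURREAL_PATH=memory \
benchmarks/run.sh surreal-suite
```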
The run directory contains:
- `metadata.json` with commits, protocol paths, row count, durations, engines, and test names.
- `traces/*.log` with the raw error or query trace for every warmup and measured pass.
- `raw_results.csv` with every warmup and measured iteration.
- `summary.tsv` and `summary.md` with arithmetic means over the measured iterations.
- `benchmark-results.md`, which can be regenerated from `raw_results.csv` for a visual docs page.
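For instance, to skim the human-readable summary of a finished run (`<run-id>` is a placeholder):

```bash
# Read the measured means for a finished run
cat benchmarks/.state/surreal-suite/<run-id>/summary.md
```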
## Recommended workflow
1. Build the release binary.
2. Run a small smoke benchmark to validate the environment.
3. Increase scale only after both engines complete the same workload.
4. Keep the raw output with the commit hash.
5. Change one variable at a time: clients, scale, durability, indexes, or query timeout.
Example:
```bash
cargo build --release -p aiondb-server --bin aiondb
PGBENCH_SCALE=1 \
PGBENCH_CLIENTS=1 \
PGBENCH_DURATION=10 \
benchmarks/run.sh pgbench
```
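To tie the raw output to a commit (step 4 above), one option is to capture both in a single artifact; a sketch:

```bash
# Record the commit hash and the raw harness output together
{ git rev-parse HEAD; \
  PGBENCH_SCALE=1 PGBENCH_CLIENTS=1 PGBENCH_DURATION=10 benchmarks/run.sh pgbench; } \
  | tee "pgbench-$(date +%Y%m%d).log"
```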
## Correctness before timing
For every benchmark, define a correctness check:
- row counts after load;
- sample query output;
- checksum-style aggregate if useful;
- expected error behavior for unsupported statements;
- same dataset loaded into each compared engine.
Only compare latency after the result is known to be correct.
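For a `pgbench` run, a minimal check can compare row counts after load; the AionDB database name here is an assumption:

```bash
# Row counts must match across engines before latency numbers mean anything
psql -h 127.0.0.1 -p 15432 -d bench -c "SELECT count(*) FROM pgbench_accounts;"
psql -h 127.0.0.1 -p 5432 -d bench_ref -c "SELECT count(*) FROM pgbench_accounts;"
```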
## Interpreting results
Do not compare benchmark results unless both engines are using comparable durability, data volume, hardware, and query timeout settings. AionDB defaults in the harness are chosen to exercise the real server path, but alpha performance can change quickly.
The public benchmark rule for v0.1 is simple: publish commands, dataset size, commit hash, machine details, and raw output with any performance claim.
## Comparison examples
A useful result summary looks like:
```
commit: <sha>
binary: target/release/aiondb
benchmark: pgbench
scale: 1
clients: 1
duration: 10s
durability: AIONDB_STORAGE_DURABLE_WAL_COMMIT_POLICY=always
machine: CPU / RAM / disk / OS
raw output: attached
```
An unusable result summary looks like:
```
AionDB is faster on my machine.
```
The second form cannot be reproduced, debugged, or believed.
See Benchmark Reproducibility and Performance Tuning before changing durability, resource limits, or index definitions.