Benchmark Builder

Compare multiple performance test suites with mean, variance, and ratio analysis. Generate Markdown tables and bullet list reports for your benchmarks.

What is the Benchmark Builder?

The Benchmark Builder is a free online tool for comparing the performance of multiple test suites. It computes statistical measures like mean (average) and variance for each suite, ranks them from fastest to slowest, and shows how each suite compares to the best performer with delta and ratio metrics.

This tool is ideal for developers and performance engineers who need to compare execution times, throughput measurements, or any numeric performance data across multiple test scenarios. All calculations happen instantly in your browser as you enter data, with results displayed in a sortable table.

For other statistical analysis needs, try the Mean Median Mode Calculator, Standard Deviation Calculator, or Variance Calculator.

How to Use the Benchmark Builder

  1. Configure Suites: Each suite card represents one test scenario. Give it a descriptive name and enter comma-separated numeric values representing individual test runs.
  2. Add or Remove Suites: Use the Add button to create additional suites and the Delete button to remove ones you do not need.
  3. Set Unit: Specify the measurement unit (e.g., ms for milliseconds, ops/s for operations per second, MB/s for throughput).
  4. View Results: The results table shows each suite ranked by mean performance, along with sample count, mean value, and variance.
  5. Export: Copy results as a Markdown table for documentation or as a bullet list for quick sharing.
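
As a rough illustration of step 1, the sketch below turns comma-separated input into lists of numbers, one list per suite. The function name `parse_runs` and the suite names are hypothetical, and the handling of blank or invalid entries is an assumption; this page does not describe how the tool itself parses input.

```python
def parse_runs(text: str) -> list[float]:
    """Parse comma-separated numbers into floats.

    Assumption: blank entries (e.g. a trailing comma) are skipped, and a
    non-numeric entry raises ValueError. The tool may behave differently.
    """
    values = []
    for token in text.split(","):
        token = token.strip()
        if token:
            values.append(float(token))
    return values


# One entry per suite card: a descriptive name plus its individual runs.
suites = {
    "Baseline": parse_runs("12.1, 11.8, 12.4, 12.0"),
    "Optimized": parse_runs("9.7, 9.9, 10.1"),
}
```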

Understanding the Statistics

The Benchmark Builder calculates two key statistical measures for each test suite:

  • Mean (Average): The sum of all values in the suite divided by the number of values. For latency metrics a lower mean indicates better performance; for throughput metrics a higher mean is better.
  • Variance: A measure of how far the values spread from the mean, computed from the squared deviations of each value from the mean. Low variance indicates consistent performance across runs, while high variance suggests run-to-run variability that may need investigation. Because the deviations are squared, variance is expressed in squared units (e.g., ms² for values in ms).
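
The two measures can be reproduced with Python's standard `statistics` module, as in the minimal sketch below. This page does not say whether the tool divides by n (population variance) or by n - 1 (sample variance), so both are shown; the sample data is made up for illustration.

```python
import statistics

# Four hypothetical runs of one suite, in ms.
runs = [12.0, 11.0, 13.0, 12.0]

# Mean: sum of values divided by their count.
mean = statistics.fmean(runs)             # 12.0

# Squared deviations from the mean are 0, 1, 1, 0 (sum = 2).
pop_var = statistics.pvariance(runs)      # 2 / 4 = 0.5   (divide by n)
sample_var = statistics.variance(runs)    # 2 / 3 ≈ 0.667 (divide by n - 1)
```

With only a handful of runs the two variance definitions differ noticeably, so it is worth knowing which one a given report uses before comparing numbers across tools.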

The comparison metrics help you understand relative performance. The delta shows the absolute difference from the best suite, while the ratio shows how many times slower (or faster) each suite is compared to the best performer.
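
A minimal sketch of the two comparison metrics follows. It assumes that lower is better, that the delta is a suite's mean minus the best mean, and that the ratio is a suite's mean divided by the best mean; the function name `compare_to_best` and the zero-mean guard are hypothetical and not taken from the tool.

```python
def compare_to_best(means: dict) -> dict:
    """Return delta and ratio for each suite relative to the lowest mean."""
    best = min(means.values())
    result = {}
    for name, mean in means.items():
        result[name] = {
            "delta": mean - best,
            # Assumption: a ratio is undefined when the best mean is zero.
            "ratio": mean / best if best != 0 else None,
        }
    return result


means = {"Baseline": 12.0, "Optimized": 10.0, "Experimental": 15.0}
comparison = compare_to_best(means)
# Optimized:    delta 0.0, ratio 1.0 (the best performer)
# Baseline:     delta 2.0, ratio 1.2
# Experimental: delta 5.0, ratio 1.5
```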

Use Cases

  • Performance Testing: Compare execution times of different algorithms or implementations.
  • Load Testing: Analyze throughput metrics across varying concurrency levels.
  • A/B Testing: Compare response times between different versions of a service.
  • Database Query Analysis: Benchmark query execution times across different optimization strategies.
  • Network Performance: Compare latency and bandwidth measurements across locations or providers.

Frequently Asked Questions

What is a benchmark suite?

A benchmark suite is a collection of test runs with measured numeric results. Each suite represents one specific configuration or implementation being tested. For example, you might have Suite 1 for a baseline implementation and Suite 2 for an optimized version, each with multiple measurement values from repeated test runs.

How is the ranking determined?

Suites are ranked by their mean value from lowest to highest. Lower means are ranked first since lower values typically indicate better performance (e.g., lower execution time is better). If your metric works the opposite way (higher is better), simply interpret the ranking in reverse.
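
The ranking rule amounts to sorting suites by mean, as in this short sketch. The `higher_is_better` flag is a hypothetical convenience for throughput-style metrics; the tool itself always ranks lowest first, as described above.

```python
def rank_suites(means: dict, higher_is_better: bool = False) -> list:
    """Return suite names ordered from best to worst by mean."""
    return sorted(means, key=means.get, reverse=higher_is_better)


means = {"Baseline": 12.0, "Optimized": 10.0, "Experimental": 15.0}

latency_ranking = rank_suites(means)
# ['Optimized', 'Baseline', 'Experimental']

throughput_ranking = rank_suites(means, higher_is_better=True)
# ['Experimental', 'Baseline', 'Optimized']
```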

What does the variance tell me?

Variance measures how consistent your test results are. A low variance means all test runs produced similar values (consistent performance), while a high variance indicates significant fluctuations between runs. High variance in a performance benchmark may indicate external factors affecting results, such as garbage collection, I/O contention, or thermal throttling.

How many decimal places are shown?

All computed values are rounded to 3 decimal places for readability. The raw data you enter is preserved exactly as entered; only the computed statistics are rounded, since additional digits are rarely meaningful in performance benchmarks.
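
For comparison, this is one way to format a statistic to 3 decimal places in Python. It is a sketch only: the helper name `format_stat` is hypothetical, and whether the tool keeps trailing zeros or how it breaks rounding ties is not stated on this page.

```python
def format_stat(value: float) -> str:
    """Format a computed statistic with exactly 3 digits after the decimal point."""
    return f"{float(value):.3f}"


format_stat(2 / 3)   # '0.667'
format_stat(12)      # '12.000'
```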

Can I export the benchmark results?

Yes, you can copy results as a Markdown table (perfect for GitHub issues, documentation, or README files) or as a bullet list (useful for chat messages, emails, or quick notes). Both options copy the formatted output directly to your clipboard.
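
To show what the two export formats might look like, the sketch below builds a Markdown table and a bullet list from already-ranked results. The column names, row keys, and wording are assumptions chosen for illustration; the tool's actual output layout may differ.

```python
def to_markdown_table(rows: list, unit: str) -> str:
    """Render ranked results as a Markdown table (hypothetical column layout)."""
    lines = [
        f"| Rank | Suite | Samples | Mean ({unit}) | Variance |",
        "| --- | --- | --- | --- | --- |",
    ]
    for rank, row in enumerate(rows, start=1):
        lines.append(
            f"| {rank} | {row['name']} | {row['n']} "
            f"| {row['mean']:.3f} | {row['variance']:.3f} |"
        )
    return "\n".join(lines)


def to_bullet_list(rows: list, unit: str) -> str:
    """Render ranked results as a Markdown bullet list (hypothetical wording)."""
    return "\n".join(
        f"- {row['name']}: mean {row['mean']:.3f} {unit}, "
        f"variance {row['variance']:.3f} (n={row['n']})"
        for row in rows
    )


# Made-up results, already sorted from best to worst.
rows = [
    {"name": "Optimized", "n": 3, "mean": 9.9, "variance": 0.0266667},
    {"name": "Baseline", "n": 4, "mean": 12.075, "variance": 0.0469},
]
table = to_markdown_table(rows, "ms")
bullets = to_bullet_list(rows, "ms")
```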