obnam-benchmark — a tool to run benchmarks

2022-09-02 07:02
Obnam is a backup program that aims to have good performance. To achieve that, we run automated benchmarks. To allow flexibility in what benchmark scenarios we have, a tool reads benchmark specifications from a file and runs them. This document describes that tool and its acceptance criteria.
The obnam-benchmark tool runs a set of specified benchmarks for one version of Obnam. It works as follows:
A benchmark specification file is in YAML format. It has a list of individual benchmarks. Each benchmark has a name, and a description of how to prepare the test data for each backup generation. Example:
```yaml
- benchmark: maildir
  backups:
    - changes:
        - create:
            files: 100000
            file_size: 0
        - rename:
            files: 1000
    - changes: []
- benchmark: video-footage
  backups:
    - changes:
        - create:
            files: 1000
            file_size: 1G
    - changes: []
```
The example above specifies two benchmarks: “maildir” and “video-footage”. The names are not interpreted by obnam-benchmark; they are for human consumption only. The backups field specifies a list of changes to the test data; the benchmark tool runs a backup for each item in the list. Each change can create, copy, delete, or rename files, compared to the previous backup. Created files have the specified size, and their content is randomly generated, non-repetitive binary junk. The files are stored in a directory tree that avoids very large numbers of files per directory, as sketched below.
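As an illustration of the last point, here is a minimal Rust sketch of a fan-out directory layout; this is an assumed scheme, not necessarily how obnam-benchmark actually lays out files:

```rust
use std::path::PathBuf;

/// Return a path for generated file number `n`, spreading files over
/// nested subdirectories so that no directory holds too many entries.
/// (Assumed layout for illustration only.)
fn file_path(n: u64) -> PathBuf {
    let mut dir = PathBuf::new();
    // Two decimal digits per level: each leaf directory gets 100 files,
    // and each intermediate directory gets at most 100 subdirectories.
    for chunk in format!("{:08}", n / 100).as_bytes().chunks(2) {
        dir.push(std::str::from_utf8(chunk).unwrap());
    }
    dir.join(format!("file-{}", n))
}

fn main() {
    assert_eq!(file_path(0), PathBuf::from("00/00/00/00/file-0"));
    assert_eq!(file_path(123456), PathBuf::from("00/00/12/34/file-123456"));
}
```

Two digits per level is an arbitrary choice; any fixed fan-out that bounds the number of entries per directory would serve the same purpose.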
By specifying what kind of data is generated, how it changes from backup to backup, and how many backups are made, the specification file can describe synthetic benchmarks for different use cases.
Requirement: The benchmark tool can generate random test data.
We verify this by having a dedicated subcommand generate some test data. If the data doesn’t compress well, we assume it is sufficiently random. We further assume that the same code is used to generate test data for actual benchmark runs; the scenario can’t verify that, but it can be checked by manual code inspection.
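As a rough illustration of the verification idea, here is a self-contained Rust sketch using the rand and flate2 crates (an assumption for this sketch; the actual tool need not use them) that generates random data and asserts it compresses poorly:

```rust
use std::io::Write;

use flate2::write::GzEncoder;
use flate2::Compression;
use rand::RngCore;

fn main() {
    // Fill a 1 MiB buffer with random bytes.
    let mut data = vec![0u8; 1024 * 1024];
    rand::thread_rng().fill_bytes(&mut data);

    // Compress it with gzip.
    let mut encoder = GzEncoder::new(Vec::new(), Compression::default());
    encoder.write_all(&data).unwrap();
    let compressed = encoder.finish().unwrap();

    // Random data should compress poorly: allow only a few percent
    // shrinkage before declaring the data insufficiently random.
    assert!(compressed.len() > data.len() * 95 / 100);
}
```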
Requirement: The benchmark tool can parse a specification file.
We verify this by having the tool output a YAML specification file as JSON. Because the formats differ, this checks that the tool actually parses the file. We use an input file that exercises aspects of the benchmark specification that go beyond plain YAML, on the assumption that the tool uses a well-tested YAML parser, so that we only need to check what is specific to the benchmark tool.
Specifically, we verify that the tool parses file sizes such as “1K” correctly.
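For illustration, a minimal Rust sketch of such size parsing might look like the following. Note that treating “1K” as 1024 rather than 1000 bytes is an assumption made here; the specification format above does not define it:

```rust
/// Parse a human-friendly size such as "0", "1K", or "1G" into bytes.
/// (Sketch only; binary multipliers are an assumption.)
fn parse_size(s: &str) -> Result<u64, String> {
    let s = s.trim();
    let split = s.find(|c: char| !c.is_ascii_digit()).unwrap_or(s.len());
    let (digits, suffix) = s.split_at(split);
    let n: u64 = digits.parse().map_err(|_| format!("bad number in {:?}", s))?;
    let multiplier: u64 = match suffix {
        "" => 1,
        "K" => 1 << 10,
        "M" => 1 << 20,
        "G" => 1 << 30,
        _ => return Err(format!("unknown size suffix in {:?}", s)),
    };
    Ok(n * multiplier)
}

fn main() {
    assert_eq!(parse_size("0"), Ok(0));
    assert_eq!(parse_size("1K"), Ok(1024));
    assert_eq!(parse_size("1G"), Ok(1 << 30));
}
```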
Requirement: The benchmark tool can run benchmarks at all.
We verify this by running a trivial benchmark, which backs up an empty directory.
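Such a trivial benchmark could be specified like this, in the format described above (a sketch; the scenario’s actual spec file may differ):

```yaml
- benchmark: trivial
  backups:
    - changes: []
```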
Requirement: The benchmark tool can run benchmarks with more than one backup.
We verify this by running a benchmark with three backup generations.
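For example, a specification like the following (a sketch, not necessarily the scenario’s actual input) would produce three backup generations:

```yaml
- benchmark: three-generations
  backups:
    - changes:
        - create:
            files: 10
            file_size: 1K
    - changes:
        - rename:
            files: 5
    - changes: []
```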