
BenchmarkDotNet v0.13.2

Highlights

In BenchmarkDotNet v0.13.2, we have implemented support for:

  • .NET 7
  • NativeAOT
  • .NET Framework 4.8.1

We have also introduced new features and improvements including:

  • Possibility to hide selected columns,
  • Allocation Ratio column,
  • Logging progress and estimated finish time,
  • ARM64 support for BenchmarkDotNet.Diagnostics.Windows package,
  • Printing Hardware Intrinsics information,
  • Glob filters support for DisassemblyDiagnoser.

Of course, this release includes dozens of other improvements and bug fixes!

Our special thanks go to @mawosoft, @YegorStepanov and @radical who fixed a LOT of really nasty bugs.

Supported technologies

.NET 7 and .NET Framework 4.8.1

.NET Framework 4.8.1 was released earlier this month, while .NET 7 should land in autumn this year. Now you can use BenchmarkDotNet to compare both!

BenchmarkDotNet=v0.13.1.1845-nightly, OS=Windows 11 (10.0.22622.575)
Microsoft SQ1 3.0 GHz, 1 CPU, 8 logical and 8 physical cores
.NET SDK=7.0.100-preview.6.22352.1
  [Host]     : .NET 7.0.0 (7.0.22.32404), Arm64 RyuJIT AdvSIMD
  Job-QJVIDT : .NET 7.0.0 (7.0.22.32404), Arm64 RyuJIT AdvSIMD
  Job-FNNXOY : .NET Framework 4.8.1 (4.8.9032.0), Arm64 RyuJIT
| Method        | Runtime              | Mean     | Allocated |
|-------------- |--------------------- |---------:|----------:|
| BinaryTrees_2 | .NET 7.0             | 193.6 ms | 227.33 MB |
| BinaryTrees_2 | .NET Framework 4.8.1 | 192.8 ms | 228.01 MB |

Credit for adding .NET 7 support in #1816 goes to @am11. @adamsitnik implemented .NET Framework 4.8.1 support in #2044 and #2067. Big thanks to @MichalPetryka, who was using preview versions of BenchmarkDotNet and reported a bug related to .NET Framework 4.8.1 support (#2059), which got fixed before we released a new version.

NativeAOT

We are really excited to see the experimental CoreRT project grow and become officially supported by Microsoft (under a new name: NativeAOT)! You can read more about it here. Implementing and improving the support was a combined effort of multiple contributors that spanned multiple repositories.

Like every AOT solution, NativeAOT has some limitations, such as limited reflection support and the lack of dynamic assembly loading. Because of that, the host process (what you run from the command line) is never an AOT process, but just a regular .NET process. This process (called the Host process) uses reflection to read the benchmark metadata (find all [Benchmark] methods etc.), generates a new project that references the benchmarks, and compiles it using ILCompiler. The boilerplate code does not use reflection, so the project is built with TrimmerDefaultAction=link (we have greatly reduced build time thanks to that). The compilation produces a native executable, which is later started by the Host process. This second process (called the Benchmark or Child process) performs the actual benchmarking and reports the results back to the Host process. By default, BenchmarkDotNet uses the latest version of Microsoft.DotNet.ILCompiler to build the NativeAOT benchmark according to these instructions. Moreover, BenchmarkDotNet by default targets the current machine's CPU features (change: #1994, discussion: #2061); if you don't like this behavior, you can disable it.

This is why you need to:

  • install the prerequisites required by the NativeAOT compiler
  • target .NET to be able to run NativeAOT benchmarks (example: <TargetFramework>net7.0</TargetFramework> in the .csproj file)
  • run the app as a .NET process (example: dotnet run -c Release -f net7.0)
  • specify the NativeAOT runtime explicitly, either by using the command line argument --runtimes nativeaot7.0 (the recommended approach), by using the [SimpleJob] attribute, or by using the fluent Job config API Job.ShortRun.WithRuntime(NativeAotRuntime.Net70):
dotnet run -c Release -f net7.0 --runtimes nativeaot7.0

For more examples please go to docs.
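
As a rough illustration of the fluent configuration route mentioned above, the following minimal sketch adds a regular .NET 7 job as the baseline next to a NativeAOT job. The benchmark class and its method are hypothetical placeholders, and the sketch assumes the project targets net7.0 and that the NativeAOT prerequisites are installed.

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Environments;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;

// Hypothetical benchmark class, used only to have something to run.
public class StringJoinBenchmarks
{
    [Benchmark]
    public string Join() => string.Join(",", "a", "b", "c");
}

public class Program
{
    public static void Main(string[] args)
    {
        // The Host process stays a regular .NET process; only the generated
        // benchmark project for the second job is compiled with NativeAOT.
        var config = DefaultConfig.Instance
            .AddJob(Job.Default.WithRuntime(CoreRuntime.Core70).AsBaseline())
            .AddJob(Job.Default.WithRuntime(NativeAotRuntime.Net70));

        BenchmarkSwitcher
            .FromAssembly(typeof(Program).Assembly)
            .Run(args, config);
    }
}
```

With a config like this, the app is still started as a regular .NET process (dotnet run -c Release -f net7.0), and the --runtimes argument is not needed because the jobs are defined in code.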

BenchmarkDotNet=v0.13.1.1845-nightly, OS=Windows 11 (10.0.22000.856/21H2)
AMD Ryzen Threadripper PRO 3945WX 12-Cores, 1 CPU, 24 logical and 12 physical cores
.NET SDK=7.0.100-rc.1.22423.16
  [Host]     : .NET 7.0.0 (7.0.22.42223), X64 RyuJIT AVX2
  Job-KDVXET : .NET 7.0.0 (7.0.22.42223), X64 RyuJIT AVX2
  Job-HFRAGK : .NET 7.0.0-rc.1.22424.9, X64 NativeAOT AVX2
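| Method        | Runtime       | Mean     | Ratio | Allocated |
|-------------- |-------------- |---------:|------:|----------:|
| BinaryTrees_2 | .NET 7.0      | 95.06 ms |  1.00 | 227.33 MB |
| BinaryTrees_2 | NativeAOT 7.0 | 90.32 ms |  0.96 | 227.33 MB |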

Some .NET features are not supported by NativeAOT, which is why you may want to filter out the benchmarks that use them with the new [AotFilter] attribute:

[AotFilter("Currently not supported due to missing metadata.")]
public class Xml_FromStream<T>

New features and improvements

Hiding Columns

In #1621 @marcnet80 has reduced the number of columns displayed when multiple runtimes are being compared.

In #1890 @YegorStepanov has implemented a set of new APIs that allow for hiding columns. The feature is also exposed via the -h and --hide command line arguments.

[MemoryDiagnoser] // adds Gen0, Gen1, Gen2 and Allocated Bytes columns
[HideColumns(Column.Gen0, Column.Gen1, Column.Gen2)] // don't display GenX columns
public class IntroHidingColumns
{
    [Benchmark]
    public byte[] AllocateArray() => new byte[100_000];
}

Sample results without [HideColumns]:

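| Method        | Mean     | Error     | StdDev    | Gen0    | Gen1    | Gen2    | Allocated |
|-------------- |---------:|----------:|----------:|--------:|--------:|--------:|----------:|
| AllocateArray | 3.303 us | 0.0465 us | 0.0435 us | 31.2462 | 31.2462 | 31.2462 |  97.69 KB |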

With:

| Method        | Mean     | Error     | StdDev    | Allocated |
|-------------- |---------:|----------:|----------:|----------:|
| AllocateArray | 3.489 us | 0.0662 us | 0.0763 us |  97.69 KB |

Imagine how much time @YegorStepanov has saved all the people who, until now, were manually removing columns from the results before publishing them on GitHub!
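
The same columns can presumably also be hidden from a config object rather than an attribute. The sketch below assumes the HideColumns config extension that accepts column names; the benchmark class name is a hypothetical variation of the sample above.

```csharp
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Columns;
using BenchmarkDotNet.Configs;
using BenchmarkDotNet.Diagnosers;
using BenchmarkDotNet.Running;

// Hypothetical class name; the benchmark body mirrors IntroHidingColumns.
public class IntroHidingColumnsViaConfig
{
    [Benchmark]
    public byte[] AllocateArray() => new byte[100_000];
}

public class Program
{
    public static void Main(string[] args)
    {
        var config = DefaultConfig.Instance
            .AddDiagnoser(MemoryDiagnoser.Default)               // adds the GenX and Allocated columns
            .HideColumns(Column.Gen0, Column.Gen1, Column.Gen2); // hides the GenX columns again

        BenchmarkRunner.Run<IntroHidingColumnsViaConfig>(config);
    }
}
```

Keeping the rule in a config can be handy when the same set of hidden columns should apply to many benchmark classes.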

Allocation Ratio Column

In #1859 @YegorStepanov has added the Allocation Ratio column. It's enabled by default when MemoryDiagnoser is used and one of the benchmarks is marked as [Benchmark(Baseline = true)], or when there are multiple jobs defined and one of them is marked as the baseline.

[MemoryDiagnoser]
public class AllocationColumnSample
{
    [Benchmark(Baseline = true)]
    [Arguments("test")]
    public string Builder(string value)
    {
        StringBuilder sb = new (value);

        for (int i = 0; i < 10; i++)
            sb.Append(value);

        return sb.ToString();
    }

    [Benchmark]
    [Arguments("test")]
    public string Concatenation(string value)
    {
        string result = value;

        for (int i = 0; i < 10; i++)
            result += value;

        return result;
    }
}
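
| Method        | value | Mean     | Error   | StdDev  | Ratio | Gen 0  | Allocated | Alloc Ratio |
|-------------- |------ |---------:|--------:|--------:|------:|-------:|----------:|------------:|
| Builder       | test  | 127.9 ns | 0.49 ns | 0.43 ns |  1.00 | 0.0544 |     456 B |        1.00 |
| Concatenation | test  | 120.2 ns | 0.94 ns | 0.88 ns |  0.94 | 0.0908 |     760 B |        1.67 |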
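
The second trigger for the column (multiple jobs, one of them marked as the baseline) could look roughly like the sketch below. The class name is hypothetical, and the sketch assumes a project that multi-targets net6.0 and net7.0 so that both runtimes can be built.

```csharp
using System.Text;
using BenchmarkDotNet.Attributes;
using BenchmarkDotNet.Jobs;
using BenchmarkDotNet.Running;

[MemoryDiagnoser]
[SimpleJob(RuntimeMoniker.Net60, baseline: true)] // baseline job: its ratios are reported as 1.00
[SimpleJob(RuntimeMoniker.Net70)]                 // compared against the baseline job
public class AllocationRatioAcrossJobs // hypothetical class name
{
    [Benchmark]
    public string Builder()
    {
        StringBuilder sb = new("test");

        for (int i = 0; i < 10; i++)
            sb.Append("test");

        return sb.ToString();
    }
}

public class Program
{
    public static void Main(string[] args)
        => BenchmarkRunner.Run<AllocationRatioAcrossJobs>();
}
```

Here the ratio compares the same method across runtimes instead of comparing two methods within one runtime.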

Progress and estimated finish time

In #1909 @adamsitnik has added logging of progress and estimated finish time.

// ** Remained 5211 (99.9%) benchmark(s) to run. Estimated finish 2022-08-25 22:26 (9h 7m from now) **

ARM64 support for BenchmarkDotNet.Diagnostics.Windows package

Thanks to the update to TraceEvent 3.0, the BenchmarkDotNet.Diagnostics.Windows package now supports ARM64. This means that you can use EtwProfiler and other ETW-based diagnosers on Windows ARM64.

It would not have been possible without @brianrob, who implemented ARM64 support for TraceEvent in #1533!

Hardware Intrinsics information

In #2051 @adamsitnik has extended the hardware information printed in the Summary table with Hardware Intrinsics information.

Since the space in the Summary table is quite limited, the full information is printed only in the log.

Special thanks to @tannergooding who provided a lot of very valuable feedback and @MichalPetryka who contributed an improvement #2066 for older runtimes.

Other improvements

  • WASM toolchain has received a lot of improvements from various .NET Team members: #1769, #1936, #1938, #1982.
  • Dependencies and TFMs updates: #1805, #1978, #2012, #2019, #2035.
  • Ensure proper SummaryStyle handling implemented by @mawosoft in #1828.
  • Preserving EnablePreviewFeatures project setting which gives the possibility to benchmark preview .NET features. Implemented by @kkokosa in #1842.
  • CI: Using non-deprecated macOS pool on Azure Pipelines. Implemented by @akoeplinger in #1847.
  • CI: Updating Cake to 2.0.0, adopting frosting project style. Implemented by @AndreyAkinshin in #1865.
  • Detecting ReSharper's Dynamic Program Analysis. Implemented by @adamsitnik in #1874.
  • Preventing benchmark failure when some of the exporters fail. Implemented by @epeshk in #1902.
  • Don't use the diagnosers when benchmarking has failed. Implemented by @adamsitnik in #1903.
  • Ensuring the default order of benchmarks is the same as declared in source code. Implemented by @adamsitnik in #1907.
  • Making BuildTimeout configurable. Implemented by @adamsitnik in #1906.
  • Notify users about private methods with Setup/Cleanup attributes. Implemented by @epeshk in #1912.
  • Don't run Roslyn Analyzers for the generated code. Implemented by @adamsitnik in #1917.
  • Ensure WorkloadActionUnroll and similar are optimized if possible. Implemented by @AndyAyersMS in #1935.
  • Don't use blocking acknowledgments when there is no need to. Implemented by @adamsitnik in #1940.
  • Executor: Don't use Process.ExitCode, unless the process has exited. Implemented by @radical in #1947.
  • Revise heuristic for initial jitting. Implemented by @AndyAyersMS in #1949.
  • Allow logging build commands output. Implemented by @radical in #1950.
  • Change Mono AOT mode to Normal AOT with LLVM JIT fallback. Implemented by @fanyang-mono in #1990.

Glob filters support for DisassemblyDiagnoser

So far, the disassembler always loaded the type generated by BDN, searched for the benchmark method, and disassembled it. When it encountered direct method calls, it disassembled the called methods as well (if their depth was less than or equal to the configured maximum depth).

This worked fine, but only for direct method calls. For indirect calls, the disassembly was incomplete.

In #2072 @adamsitnik has added the possibility to filter methods disassembled by the DisassemblyDiagnoser.

The users can now pass --disasmFilter $globPattern and it's going to be applied to full signatures of all methods available for disassembling. Examples:

  • --disasmFilter *System.Text* - disassemble all System.Text methods.
  • --disasmFilter * - disassemble all possible methods.
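
To make the matching semantics more concrete, here is a small standalone sketch of how a glob pattern can be applied to full method signatures. It is not BenchmarkDotNet's actual implementation: the GlobSketch helper is hypothetical, it only handles the * wildcard, and the signatures are made-up examples.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper, for illustration only.
public static class GlobSketch
{
    // Translates a glob into an anchored regex: every '*' matches any run of characters,
    // everything else is matched literally.
    public static bool IsMatch(string signature, string glob)
    {
        string pattern = "^" + Regex.Escape(glob).Replace("\\*", ".*") + "$";
        return Regex.IsMatch(signature, pattern, RegexOptions.Singleline);
    }

    public static void Main()
    {
        // Made-up full signatures of methods available for disassembling.
        string[] signatures =
        {
            "System.Text.StringBuilder.Append(System.String)",
            "System.String.Concat(System.String, System.String)",
        };

        foreach (string signature in signatures)
        {
            bool textOnly = IsMatch(signature, "*System.Text*");
            bool everything = IsMatch(signature, "*");
            Console.WriteLine($"{signature}: *System.Text* -> {textOnly}, * -> {everything}");
        }
    }
}
```

Because the pattern is applied to the full signature, a filter such as *System.Text* can also select methods that merely take or return System.Text types, at least under the simple matching assumed in this sketch.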

Moreover, ClrMD was updated to v2 (#2040) and a few disassembler bugs have been fixed (#2075, #2078). We expect the disassembler to be more reliable now.

Docs and Samples improvements

Big thanks to @SnakyBeaky, @Distinctlyminty, @asaf92, @adamsitnik and @eiriktsarpalis who have improved our docs, samples and error messages!

#1776, #1797, #1850, #1861, #1939, #1974, #1997, #2042, #2050, #2068.

Bug fixes

  • WASM: #1811, #1846, #1916, #1926, #1932.
  • Diagnoser-provided Analysers weren't automatically added to Config. Fixed by @mawosoft in #1790.
  • Exporters could be duplicated. Fixed by @workgroupengineering in #1796.
  • Small bug in SummaryStyle. Fixed by @mawosoft in #1801.
  • InvalidOperationException/NullReferenceException in SmartParameter. Fixed by @mawosoft in #1810.
  • Failures caused by colons in benchmark name. Fixed by @ronbrogan in #1823.
  • Some benchmark arguments were not properly escaped and were causing the process launcher to crash. Fixed by @adamsitnik in #1841.
  • Invalid size specifiers for Memory and Disassembly diagnosers. Fixed by @YegorStepanov in #1854 and #1855.
  • Respect LogicalGroup order in DefaultOrderer. Fixed by @AndreyAkinshin in #1866.
  • Endless loop in user interaction with redirected input. Fixed by @tmds in #.
  • Broken power plan support. Fixed by @YegorStepanov in #1885.
  • BytesAllocatedPerOperation was not being output by the JSON and XML exporters. Fixed by @martincostello in #1919.
  • Incorrect default InvocationCount in the summary table. Fixed by @AndreyAkinshin in #1929.
  • Failed build output was printed in reverse order. Fixed by @radical in #1945.
  • Build failures due to NETSDK1150. Fixed by @OlegOLK in #1981.
  • MetricColumn was not respecting provided units when formatting values. Fixed by @mawosoft in #2033.
  • Generating invalid code that was causing benchmark failures. Fixed by @mawosoft in #2041.
  • CI: non-master build branches were publishing artifacts to the CI feed. Fixed by @mawosoft in #2047.
  • Comments in the project files were causing build failures. Fixed by @mawosoft in #2056.

Milestone details

In the v0.13.2 scope, 50 issues were resolved and 124 pull requests were merged. This release includes 147 commits by 34 contributors.

Resolved issues (50)

  • #299 Add API to remove columns for baseline comparison (assignee: @AndreyAkinshin)
  • #384 Print Vector.Count as part of machine info (assignee: @adamsitnik)
  • #722 Add scaled column for Allocated Memory
  • #837 Problems with default UnrollFactor in V0.11.0 (assignee: @adamsitnik)
  • #1177 Public types missing from reference assemblies don't work with ParamsSource
  • #1506 BenchmarkDotNet does not force to High Performance Mode during running (assignee: @YegorStepanov)
  • #1603 Don't display Job and Toolchain column when running benchmarks for multiple runtimes
  • #1669 --buildTimeout does not seem to work (assignee: @adamsitnik)
  • #1680 Cannot override RD.xml for NativeAOT
  • #1711 Add support for IBM Z architecture (assignee: @adamsitnik)
  • #1727 Unhelpful rounding in MemoryDiagnoser
  • #1753 "call: command not found" in .sh build script (assignee: @AndreyAkinshin)
  • #1755 EventPipeProfiler: File names are very verbose
  • #1756 EventPipeProfile: speedscope.app cannot parse result file (assignee: @adamsitnik)
  • #1774 Ability to compare --corerun with --runtimes (assignee: @adamsitnik)
  • #1775 Please add an easy way to remove columns
  • #1789 Small bug in ImmutableConfigbuilder (assignee: @mawosoft)
  • #1794 typo in error message
  • #1800 Small bug in SummaryStyle (assignee: @mawosoft)
  • #1803 Benchmark exception stops entire suite run (assignee: @adamsitnik)
  • #1809 Exception when using ParamsSource with (null) objects
  • #1812 Invalid codegen for Enumerable.Empty returned from ParamsSource
  • #1819 How to change exporter output path?
  • #1836 [Bug] System.InvalidOperationException: There is an error in XML document (0, 0).
  • #1857 Github actions ubuntu-latest "Unable to load shared library 'advapi32.dll' or one of its dependencies" when profiling dotnet 5
  • #1864 Is there a way to join summaries as if the benchmarks were run separately? (assignee: @AndreyAkinshin)
  • #1871 Detect ReSharper's Dynamic Program Analysis (assignee: @adamsitnik)
  • #1872 BenchmarkDontNet should make allowance for projects where Preview Features are enabled
  • #1887 MonoAotLLVM runtime is not actually AOTing things (assignee: @naricc)
  • #1900 Failed to benchmark .NET 7 in release mode
  • #1908 BenchmarkRunner.RunSource and BenchmarkRunner.RunUrl doesn't work
  • #1929 Incorrect default InvocationCount in the summary table (assignee: @AndreyAkinshin)
  • #1934 Ensure WorkloadActionUnroll and similar are optimized if possible
  • #1937 PR builds should not be published to BDN nightly feed (assignee: @mawosoft)
  • #1943 GitHub Actions Windows CI leg failing due to lack of native tools (assignee: @adamsitnik)
  • #1948 questions to help with future PRs
  • #1957 Broken pipe (assignee: @adamsitnik)
  • #1977 More Data for JSON and XML Export
  • #1989 Way to summarize all the params results (assignee: @YegorStepanov)
  • #1992 API Docs on website empty (assignee: @AndreyAkinshin)
  • #2000 DisassemblyDiagnoser broken on Linux (assignee: @adamsitnik)
  • #2009 Cleanup the dependencies (assignee: @martincostello)
  • #2011 Release-only build error when using ParamsSource with a LINQ method: Cannot implicitly convert type 'object' (assignee: @mawosoft)
  • #2016 Support for .NET 7?
  • #2028 Trying to build BDN on a windows arm64 host (assignee: @adamsitnik)
  • #2039 Support .NET Framework 4.8.1 (assignee: @adamsitnik)
  • #2055 Comment after breaks generated project
  • #2059 BenchmarkDotNet misreports .Net Framework as 4.8.1 (assignee: @adamsitnik)
  • #2063 BenchmarkDotNet nightly fails to disassemble .Net 7.0 properly
  • #2074 DisassemblyDiagnoser producing invalid disassembly (assignee: @adamsitnik)

Merged pull requests (124)

Commits (147)

Contributors (34)

Thank you very much!

Additional details

Date: August 26, 2022

Milestone: v0.13.2 (List of commits)

NuGet Packages: