Avoiding Benchmarking Pitfalls with std::hint::black_box
When benchmarking short programs, you often encounter two big problems that mess up your final results: (1) hardware and operating systems are full of side-effects that are neither transparent nor directly manipulable and (2) compilers can optimize in unpredictable ways, requiring IR/Assembly inspection and knowledge of compiler intrinsics.
One such example happened while I was benchmarking a multithreaded queue. I chose my struct alignments in a way that would reduce cache coherency traffic, which should translate to a noticeable improvement in per-thread throughput on write-heavy workloads. However, I measured the exact opposite! This is an excerpt of the code, where we essentially just increment a variable:

    for i in 0..0xffff {
        //...
        *head = (*head + 1) & ((1 << C) - 1);
    }
    LBB0_1:
        add w9, w9, #1
        and w9, w9, #0xffff
        subs x10, x10, #1
        b.ne LBB0_1
        str w9, [x8]
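Read back into Rust, the assembly above keeps the counter in a register for the whole loop and writes it to memory once, after the loop exits. The sketch below spells out that transformation; the function names are hypothetical, and `C = 16` is inferred from the `#0xffff` mask in the listing. Both functions compute the same result, but only the first touches memory on every iteration as written.

```rust
// Assumption: C = 16, which matches the #0xffff mask in the assembly.
const C: u32 = 16;
const MASK: u32 = (1 << C) - 1;

// The loop as written in the source: a load and a store per iteration.
fn bump_as_written(head: &mut u32, iterations: u32) {
    for _ in 0..iterations {
        *head = (*head + 1) & MASK;
    }
}

// What the optimized assembly effectively does: load once, update a
// register inside the loop, store once at the end.
fn bump_as_compiled(head: &mut u32, iterations: u32) {
    let mut reg = *head;
    for _ in 0..iterations {
        reg = (reg + 1) & MASK;
    }
    *head = reg;
}

fn main() {
    let (mut a, mut b) = (0u32, 0u32);
    bump_as_written(&mut a, 0xffff);
    bump_as_compiled(&mut b, 0xffff);
    assert_eq!(a, b);
    println!("both variants end at {a:#x}");
}
```

Because the per-iteration store is gone, a benchmark built on this loop no longer measures the memory traffic that the alignment change was meant to affect.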
    for i in 0..0xffff {
        //...
        *head = (*head + 1) & ((1 << C) - 1);
        black_box(head);
    }
    LBB0_1:
        ldr x11, [sp]
        ldr w12, [x11]
        add w12, w12, #1
        and w12, w12, #0xffff
        str w12, [x11]
        str x9, [sp, #24]
        subs x8, x8, #1
        b.ne LBB0_1
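With black_box in the loop, the listing above loads and stores the counter on every iteration. A minimal, self-contained version of that loop might look like the following. It assumes `head` is a `&mut u32` and `C = 16`, and the function name is hypothetical; the excerpt does not show the real types. With a `&mut` reference, an explicit reborrow (`&mut *head`) is needed so the reference is not moved into black_box on the first iteration.

```rust
use std::hint::black_box;

// Assumption: C = 16, matching the #0xffff mask in the assembly.
const C: u32 = 16;

// Increment a wrapping counter and expose it to black_box each iteration.
// This asks the optimizer (best effort) to treat the pointee as observed,
// which discourages keeping the counter in a register across the loop.
fn bump_observed(head: &mut u32, iterations: u32) {
    for _ in 0..iterations {
        *head = (*head + 1) & ((1 << C) - 1);
        black_box(&mut *head);
    }
}

fn main() {
    let mut head = 0u32;
    bump_observed(&mut head, 0xffff);
    println!("head = {head:#x}");
}
```

Whether the store really stays inside the loop depends on the compiler version and target, so checking the generated assembly is still the only way to be sure.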
    pub fn write_scalar(&mut self, range: AllocRange, val: Scalar) -> ... {
        let range = self.range.subrange(range);
        Ok(self
            .alloc
            .write_scalar(&self.tcx, range, val)
            .map_err(|e| e.to_interp_error(self.alloc_id))?)
    }
- Make sure to pass large objects to black_box via &mut T. If you pass them by value, you will end up with a memcpy even in optimized builds.
- black_box does not guarantee anything; it is only advisory and works on a best-effort basis. It is not an LLVM intrinsic, so manual inspection of IR or assembly is still necessary.
- At the time of writing, black_box was still experimental and awaiting stabilization, part of which was possibly a name and documentation change. (Stabilized as of 2022-12-15.)
- Creating a version of black_box that gives strict guarantees would require a top-to-bottom rework, including patching backends to support these intrinsics.
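The note above about passing large objects by &mut T can be sketched as follows, assuming the advice refers to values handed to black_box. The `Big` type, its size, and the function name are hypothetical and chosen only for illustration.

```rust
use std::hint::black_box;

// Hypothetical large payload; the 4096-byte size is arbitrary.
struct Big {
    data: [u8; 4096],
}

// Fill the buffer, then mark it as observed without moving it.
fn fill_observed(big: &mut Big, value: u8) {
    for byte in big.data.iter_mut() {
        *byte = value;
    }
    // Only a pointer is passed to black_box here. Passing a `Big` by value
    // would move the whole array, which can show up as a memcpy in the
    // measured code.
    black_box(&mut *big);
}

fn main() {
    let mut big = Big { data: [0; 4096] };
    fill_observed(&mut big, 7);
    println!("first byte = {}", big.data[0]);
}
```

The by-reference form keeps the cost of the hint itself small, so the benchmark measures the fill loop rather than a copy of the payload.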
Update (December): With the release of Rust 1.66, black_box has been officially stabilized. You can find more information in the official announcement post.