Commit fb2fea16 authored by Benoit Lize, committed by Commit Bot

[PartitionAlloc] Exponential backoff in SpinningFutex.

The latency of "pause" is unusually high on the Skylake Client
architecture. This is likely not affecting us though, as the spinning
loop is short. Add a comment noting this (with a link to the Intel
optimization manual), and follow the best practice highlighted in the
manual, namely exponential backoff with "pause".

Bug: 1125999
Change-Id: Ia233aa5ae4b9cd58b24897537861b60c7d4ca05a
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2465745
Reviewed-by: Egor Pasko <pasko@chromium.org>
Commit-Queue: Benoit L <lizeb@chromium.org>
Cr-Commit-Position: refs/heads/master@{#816665}
parent c2d62d36
@@ -5,6 +5,7 @@
 #ifndef BASE_ALLOCATOR_PARTITION_ALLOCATOR_SPINNING_FUTEX_LINUX_H_
 #define BASE_ALLOCATOR_PARTITION_ALLOCATOR_SPINNING_FUTEX_LINUX_H_
 
+#include <algorithm>
 #include <atomic>
 
 #include "base/allocator/partition_allocator/yield_processor.h"
@@ -65,13 +66,28 @@ class BASE_EXPORT SpinningFutex {
 
 ALWAYS_INLINE void SpinningFutex::Acquire() {
   int tries = 0;
+  int backoff = 1;
   // Busy-waiting is inlined, which is fine as long as we have few callers. This
   // is only used for the partition lock, so this is the case.
   do {
     if (LIKELY(Try()))
       return;
-    YIELD_PROCESSOR;
-    tries++;
+    // Note: Per the intel optimization manual
+    // (https://software.intel.com/content/dam/develop/public/us/en/documents/64-ia-32-architectures-optimization-manual.pdf),
+    // the "pause" instruction is more costly on Skylake Client than on previous
+    // (and subsequent?) architectures. The latency is found to be 141 cycles
+    // there. This is not a big issue here as we don't spin long enough for this
+    // to become a problem, as we spend a maximum of ~141k cycles ~= 47us at
+    // 3GHz in "pause".
+    //
+    // Also, loop several times here, following the guidelines in section 2.3.4
+    // of the manual, "Pause latency in Skylake Client Microarchitecture".
+    for (int yields = 0; yields < backoff; yields++) {
+      YIELD_PROCESSOR;
+      tries++;
+    }
+    constexpr int kMaxBackoff = 64;
+    backoff = std::min(kMaxBackoff, backoff << 1);
   } while (tries < kSpinCount);
 
   LockSlow();
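
For readers who want to try the backoff pattern outside of Chromium, here is a minimal standalone sketch of the same loop shape. It is not the PartitionAlloc implementation: `BackoffSpinLock`, `CpuRelax` and `BackoffSchedule` are hypothetical names, `CpuRelax` only approximates `YIELD_PROCESSOR`, the slow path yields the thread instead of sleeping on a futex, and `kSpinCount = 1000` is an assumption inferred from the "~141k cycles" figure in the comment above.

```cpp
#include <algorithm>
#include <atomic>
#include <thread>
#include <vector>

// Hypothetical stand-in for Chromium's YIELD_PROCESSOR macro.
inline void CpuRelax() {
#if (defined(__GNUC__) || defined(__clang__)) && \
    (defined(__x86_64__) || defined(__i386__))
  __builtin_ia32_pause();  // Emits the x86 "pause" instruction.
#else
  std::this_thread::yield();  // Portable fallback, not a real "pause".
#endif
}

constexpr int kSpinCount = 1000;  // Assumed value, not taken from the diff.
constexpr int kMaxBackoff = 64;   // Same cap as in the diff.

class BackoffSpinLock {
 public:
  // Returns true if the lock was free and is now held by the caller.
  bool Try() { return !locked_.exchange(true, std::memory_order_acquire); }

  void Release() { locked_.store(false, std::memory_order_release); }

  // Spins with exponential backoff, then falls back to a slow path.
  // Returns the number of CpuRelax() calls made while spinning.
  int Acquire() {
    int tries = 0;
    int backoff = 1;
    do {
      if (Try())
        return tries;
      // Wait 1, 2, 4, ... up to kMaxBackoff pauses between attempts.
      for (int yields = 0; yields < backoff; yields++) {
        CpuRelax();
        tries++;
      }
      backoff = std::min(kMaxBackoff, backoff << 1);
    } while (tries < kSpinCount);

    // Simplified slow path: the real code calls LockSlow(), which is not
    // shown in the diff. Here we just yield until the lock is free.
    while (!Try())
      std::this_thread::yield();
    return tries;
  }

 private:
  std::atomic<bool> locked_{false};
};

// Returns how many pauses each round of the loop above would issue if
// Try() failed every time, which makes the backoff schedule visible.
std::vector<int> BackoffSchedule(int spin_count, int max_backoff) {
  std::vector<int> rounds;
  int tries = 0;
  int backoff = 1;
  do {
    rounds.push_back(backoff);
    tries += backoff;
    backoff = std::min(max_backoff, backoff << 1);
  } while (tries < spin_count);
  return rounds;
}
```

Under these assumptions, `BackoffSchedule(1000, 64)` yields 1, 2, 4, 8, 16, 32 and then 64 pauses per round. Because `tries` is only checked between rounds, the loop can overshoot the spin count slightly, which is consistent with the "~" in the cycle estimate quoted in the comment.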