Commit 21699e36 authored by Dale Curtis, committed by Commit Bot

Improve values for dav1d tile threads and frame threads.

Current AV1 content doesn't use many tiles, so we see much better
performance by increasing the number of frame threads on systems that
can handle the increase versus increasing tile threads.

This patch sets the number of tile threads based on current encoding
practices and the frame threads based on recommendations from dav1d:
https://bugzilla.mozilla.org/show_bug.cgi?id=1536783

It distributes tile threads and frame threads based on the number of
available processors and other //media limits, preferring to fulfill
tile thread recommendations first since the dav1d folk indicate they
are more efficient.

If a system has the cores for it, we'll now use the following:
<300p: 2 tile threads, 2 frame threads.
<700p: 3 tile threads, 2 frame threads.
<1000p: 5 tile threads, 4 frame threads.
>=1000p: 8 tile threads, 8 frame threads.
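
As a rough illustration, the tiers above can be written as a small lookup
on the coded height. This is only a sketch: `ThreadCounts` and
`IdealThreadCounts` are hypothetical names used for illustration and are
not part of the patch, which uses two separate helper functions.

```cpp
#include <cassert>

// Ideal (unclamped) thread counts for one resolution tier.
struct ThreadCounts {
  int tile_threads;
  int frame_threads;
};

// Hypothetical helper: maps a coded height in pixels to the tiers listed
// in the commit message. The result still has to be clamped to what the
// machine can actually run.
ThreadCounts IdealThreadCounts(int coded_height) {
  assert(coded_height > 0);
  if (coded_height >= 1000)
    return {8, 8};
  if (coded_height >= 700)
    return {5, 4};
  if (coded_height >= 300)
    return {3, 2};
  return {2, 2};
}
```

For example, `IdealThreadCounts(720)` yields 5 tile threads and 4 frame
threads, and a height of exactly 1000 falls into the top tier.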

If we don't have the cores for it, i.e., on systems with <= 3 cores,
we'll now use fewer threads. Previously, even if a system had <= 3
cores, we would still allocate >= 3 threads (up to 3 tile and 2 frame).
Now, for such systems, we will set frame_threads=1 (which generates no
more threads, since the calling thread counts as a thread) and at most 1
more tile thread over the core count.

E.g., a 3-core system will use a maximum of 3 tile threads and 1 frame
thread. Single- and dual-core systems will use 2 tile threads and 1
frame thread.
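
The clamping described above can be sketched in isolation as follows. This
is a simplified model, not the patch itself: `RecommendedThreadCount` is a
hypothetical stand-in for VideoDecoder::GetRecommendedThreadCount(), and
its bounds of 2 and 16 are assumptions chosen to agree with the examples
in this message. The real limits live in //media.

```cpp
#include <algorithm>
#include <cassert>

struct ThreadCounts {
  int tile_threads;
  int frame_threads;
};

// Assumed stand-in for VideoDecoder::GetRecommendedThreadCount(): caps the
// request at the core count, then keeps it within [2, 16] (assumed bounds).
int RecommendedThreadCount(int desired_threads, int cores) {
  assert(cores >= 1);
  return std::max(2, std::min(16, std::min(desired_threads, cores)));
}

// Tile threads are satisfied first; frame threads get what is left, but
// never fewer than 1, because the calling thread counts as a thread.
ThreadCounts ClampThreadCounts(ThreadCounts ideal, int cores, bool low_delay) {
  const int max_threads = RecommendedThreadCount(
      ideal.tile_threads + ideal.frame_threads, cores);

  ThreadCounts out = ideal;
  out.tile_threads = std::min(max_threads, ideal.tile_threads);
  if (low_delay)
    out.frame_threads = 1;
  else if (out.frame_threads > max_threads - out.tile_threads)
    out.frame_threads = std::max(1, max_threads - out.tile_threads);
  return out;
}
```

Under these assumptions, `ClampThreadCounts({8, 8}, 3, false)` gives 3 tile
threads and 1 frame thread, and the same request on a single-core machine
gives 2 tile threads and 1 frame thread.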

BUG=954659
TEST=https://www.youtube.com/watch?v=Fmdb-KmlzD8 plays smoothly; see
1m30s+ for most intense areas.

R=chcunningham

Change-Id: I82e5725a32a987a98567af9aeba495d248000105
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1576158
Commit-Queue: Chrome Cunningham <chcunningham@chromium.org>
Reviewed-by: Chrome Cunningham <chcunningham@chromium.org>
Auto-Submit: Dale Curtis <dalecurtis@chromium.org>
Cr-Commit-Position: refs/heads/master@{#652893}
parent 93b4ec1b
@@ -16,6 +16,7 @@
 #include "base/threading/sequenced_task_runner_handle.h"
 #include "media/base/bind_to_current_loop.h"
 #include "media/base/decoder_buffer.h"
+#include "media/base/limits.h"
 #include "media/base/media_log.h"
 #include "media/base/video_util.h"
@@ -25,13 +26,36 @@ extern "C" {
 namespace media {
-// Returns the number of threads.
-static int GetDecoderThreadCount(const VideoDecoderConfig& config) {
-  // For AV1 decode when using the default thread count, increase the number
-  // of decode threads to equal the maximum number of tiles possible for
-  // higher resolution streams.
-  return VideoDecoder::GetRecommendedThreadCount(config.coded_size().width() /
-                                                 256);
+static int GetDecoderTileThreadCount(const VideoDecoderConfig& config) {
+  // Values based on currently available content. Recommended by YouTube.
+  int tile_threads;
+  const int height = config.coded_size().height();
+  if (height >= 1000)
+    tile_threads = 8;
+  else if (height >= 700)
+    tile_threads = 5;
+  else if (height >= 300)
+    tile_threads = 3;
+  else
+    tile_threads = 2;
+  return tile_threads;
+}
+static int GetDecoderFrameThreadCount(const VideoDecoderConfig& config) {
+  // Values based on currently available content.
+  int frame_threads;
+  const int height = config.coded_size().height();
+  if (height >= 1000)
+    frame_threads = 8;
+  else if (height >= 700)
+    frame_threads = 4;
+  else
+    frame_threads = 2;
+  return frame_threads;
 }
 static VideoPixelFormat Dav1dImgFmtToVideoPixelFormat(
@@ -149,11 +173,31 @@ void Dav1dVideoDecoder::Initialize(const VideoDecoderConfig& config,
   Dav1dSettings s;
   dav1d_default_settings(&s);
-  s.n_tile_threads = GetDecoderThreadCount(config);
-  // Use only 1 frame thread in low delay mode, otherwise we'll require at least
-  // two buffers before the first frame can be output.
-  s.n_frame_threads = low_delay ? 1 : 2;
+  // Compute the ideal thread count values. We'll then clamp these based on the
+  // maximum number of recommended threads (using number of processors, etc).
+  s.n_tile_threads = GetDecoderTileThreadCount(config);
+  s.n_frame_threads = GetDecoderFrameThreadCount(config);
+  const int max_threads = VideoDecoder::GetRecommendedThreadCount(
+      s.n_tile_threads + s.n_frame_threads);
+  // First clamp tile threads to the allowed maximum. We prefer tile threads
+  // over frame threads since dav1d folk indicate they are more efficient. In an
+  // ideal world this would be auto-detected by dav1d from the content.
+  //
+  // https://bugzilla.mozilla.org/show_bug.cgi?id=1536783#c0
+  s.n_tile_threads = std::min(max_threads, s.n_tile_threads);
+  // Now clamp frame threads based on the number of remaining threads after tile
+  // threads have been allocated. A thread count of 1 generates no additional
+  // threads since the calling thread (this thread) is counted as a thread.
+  //
+  // We only want 1 frame thread in low delay mode, since otherwise we'll
+  // require at least two buffers before the first frame can be output.
+  if (low_delay)
+    s.n_frame_threads = 1;
+  else if (s.n_frame_threads > max_threads - s.n_tile_threads)
+    s.n_frame_threads = std::max(1, max_threads - s.n_tile_threads);
   // Route dav1d internal logs through Chrome's DLOG system.
   s.logger = {nullptr, &LogDav1dMessage};