Commit 0120e99b authored by Miguel Casas, committed by Commit Bot

RELAND:media/gpu/vaapi: reduce encoding quality on low power ChromeOSes

The original CL (crrev.com/c/2344526) broke the newly added unit tests
on some devices, specifically those that have both a normal and a low
power encoding variant: VaapiWrapper will try to use the low power
version (cf. IsLowPowerEncSupported()), which confused the test.

The solution is to try only VaapiWrapper's preferred entrypoint (and
not all of them); crrev.com/c/2544368/1..5 has the delta. (Verified
on Atlas, which was one of the broken devices.)

Original CL description -----------------------------------------------

This CL queries the driver for the encoding quality range and, if
available and the device is considered "low power" (Celeron, Pentium
and Core Y-series devices), sets it to its maximum value, for highest speed
and lowest power consumption, albeit at lower encoding quality. The
quality difference should not be noticeable for video conference
scenarios.
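
The decision described above can be sketched without libva as follows. This is a minimal illustration, not the CL's code: LooksLikeLowPowerIntel and PickQualityLevel are hypothetical names, the CPU family check is omitted, a regular expression stands in for the glob match on the brand string, and kAttribNotSupported copies the value of libva's VA_ATTRIB_NOT_SUPPORTED.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>
#include <regex>
#include <string>

// Same value as libva's VA_ATTRIB_NOT_SUPPORTED; redefined so the sketch
// builds without va.h.
constexpr uint32_t kAttribNotSupported = 0x80000000u;

// Hypothetical helper: true for Pentium, Celeron and Core Y-series brand
// strings. The real check also verifies the CPU family, omitted here.
bool LooksLikeLowPowerIntel(const std::string& brand) {
  static const std::regex kCoreY(R"(Intel\(R\) Core\(TM\) .*Y CPU.*)");
  return brand.find("Pentium") != std::string::npos ||
         brand.find("Celeron") != std::string::npos ||
         std::regex_match(brand, kCoreY);
}

// Hypothetical helper: returns the quality level to request, or nullopt if
// nothing should be set. |quality_range| is the value the driver reports
// for VAConfigAttribEncQualityRange; its maximum is the fastest setting.
std::optional<uint32_t> PickQualityLevel(const std::string& brand,
                                         uint32_t quality_range) {
  if (!LooksLikeLowPowerIntel(brand))
    return std::nullopt;
  // A range of 0 or 1 means the encoder has a single quality setting.
  if (quality_range == kAttribNotSupported || quality_range <= 1u)
    return std::nullopt;
  return quality_range;
}
```

For example, a Core Y-series brand string with a reported range of 7 yields level 7, while a Core U-series string yields no setting at all.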

I followed the code in both of Intel's VA backends.
For the legacy i965:
- VP8 supports values 1-2 [1]. (ENCODER_LOW_QUALITY is 2, so at least
 this CL can supersede the patch in [2], which is what it set out to
 do.)
- AVC1 and VP9 support 1-7 [3,4]; ENCODER_QUALITY_RANGE_(AVC,VP9) == 7,
 see [5], but only on Gen9 [6a, 6b] (FTR, VP9 encoding is only
 available on Gen9+).

For the modern intel-media-driver iHD, the values are named constants,
1 being TARGETUSAGE_BEST_QUALITY ([7] and its following lines).
By default, TARGETUSAGE_RT_SPEED == 4 is used, and 7 would be
TARGETUSAGE_BEST_SPEED.

Improvements depend on the input content. I measured on kohaku with
EncodeAccelPerf.h264_1080p_i420: ToT gives ~204 fps at 4.75 W, while
this CL gives ~220 fps at 4.42 W.
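
To put the two measurements on a common scale, the relative changes and the energy efficiency can be worked out as below. This is only a sketch of the arithmetic: Measurement, FramesPerJoule and RelativeChange are hypothetical names, and the inputs are the approximate figures quoted above.

```cpp
#include <cassert>

struct Measurement {
  double fps;    // Encoded frames per second.
  double watts;  // Power draw in watts (joules per second).
};

// Frames encoded per joule of energy: (frames/s) / (J/s).
constexpr double FramesPerJoule(const Measurement& m) {
  return m.fps / m.watts;
}

// Relative change from |before| to |after|, e.g. 0.10 for +10%.
constexpr double RelativeChange(double before, double after) {
  return (after - before) / before;
}

constexpr Measurement kToT{204.0, 4.75};
constexpr Measurement kThisCl{220.0, 4.42};
```

With these figures, throughput rises by about 7.8%, power drops by about 6.9%, and frames per joule improve by roughly 16%.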

[1] https://github.com/intel/intel-vaapi-driver/blob/d87db2111a33b157d1913415f15d201cc5182850/src/i965_drv_video.c#L1151
[2] https://source.chromium.org/chromiumos/chromiumos/codesearch/+/master:src/third_party/chromiumos-overlay/x11-libs/libva-intel-driver/files/vp8-encoder-Workaround-to-force-perforamce-mode-enco.patch;l=36
[3] https://github.com/intel/intel-vaapi-driver/blob/d87db2111a33b157d1913415f15d201cc5182850/src/i965_drv_video.c#L1142
[4] https://github.com/intel/intel-vaapi-driver/blob/d87db2111a33b157d1913415f15d201cc5182850/src/i965_drv_video.c#L1149
[5] https://github.com/intel/intel-vaapi-driver/blob/021bcb79d1bd873bbd9fbca55f40320344bab866/src/i965_drv_video.h#L76
[6a] https://github.com/intel/intel-vaapi-driver/blob/021bcb79d1bd873bbd9fbca55f40320344bab866/src/i965_encoder.c#L1599
[6b] https://github.com/intel/intel-vaapi-driver/blob/021bcb79d1bd873bbd9fbca55f40320344bab866/src/i965_encoder.c#L1653
[7] https://github.com/intel/media-driver/blob/105f0341a21324ae3bc458cfa6c25c7df0517f3a/media_driver/agnostic/common/codec/shared/codec_def_common.h#L333

Bug: b:141147405
Change-Id: I231f043edacc79cadd87b9f992505977e5802cda
Commit-Queue: Miguel Casas <mcasas@chromium.org>
Reviewed-by: Andres Calderon Jaramillo <andrescj@chromium.org>
Reviewed-by: Sreerenj Balachandran <sreerenj.balachandran@intel.com>
Reviewed-by: Hirokazu Honda <hiroh@chromium.org>
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/2544368
Cr-Commit-Position: refs/heads/master@{#828535}
parent dfddc5cb
@@ -19,6 +19,7 @@
#include "base/optional.h"
#include "base/process/launch.h"
#include "base/stl_util.h"
#include "base/strings/pattern.h"
#include "base/strings/string_split.h"
#include "base/test/launcher/unit_test_launcher.h"
#include "base/test/test_suite.h"
@@ -245,6 +246,87 @@ TEST_F(VaapiTest, DefaultEntrypointIsSupported) {
}
}
}
// Verifies that VaapiWrapper::CreateContext() will queue up a buffer to set the
// encoder to its lowest quality setting if a given VAProfile and VAEntrypoint
// claims to support configuring it.
TEST_F(VaapiTest, LowQualityEncodingSetting) {
// This test only applies to low powered Intel processors.
constexpr int kPentiumAndLaterFamily = 0x06;
const base::CPU cpuid;
const bool is_core_y_processor =
base::MatchPattern(cpuid.cpu_brand(), "Intel(R) Core(TM) *Y CPU*");
const bool is_low_power_intel =
cpuid.family() == kPentiumAndLaterFamily &&
(base::Contains(cpuid.cpu_brand(), "Pentium") ||
base::Contains(cpuid.cpu_brand(), "Celeron") || is_core_y_processor);
if (!is_low_power_intel)
GTEST_SKIP() << "Not an Intel low power processor";
std::map<VAProfile, std::vector<VAEntrypoint>> configurations =
VaapiWrapper::GetSupportedConfigurationsForCodecModeForTesting(
VaapiWrapper::kEncode);
for (const auto& codec_mode :
{VaapiWrapper::kEncode,
VaapiWrapper::kEncodeConstantQuantizationParameter}) {
std::map<VAProfile, std::vector<VAEntrypoint>> configurations =
VaapiWrapper::GetSupportedConfigurationsForCodecModeForTesting(
codec_mode);
for (const auto& profile_and_entrypoints : configurations) {
const VAProfile va_profile = profile_and_entrypoints.first;
scoped_refptr<VaapiWrapper> wrapper = VaapiWrapper::Create(
VaapiWrapper::kEncode, va_profile, base::DoNothing());
// Depending on the GPU Gen, flags and policies, we may or may not utilize
// all entrypoints (e.g. we might always want VAEntrypointEncSliceLP if
// supported and enabled). Query VaapiWrapper's mandated entry point.
const VAEntrypoint entrypoint =
VaapiWrapper::GetDefaultVaEntryPoint(codec_mode, va_profile);
ASSERT_TRUE(base::Contains(profile_and_entrypoints.second, entrypoint));
VAConfigAttrib attrib{};
attrib.type = VAConfigAttribEncQualityRange;
{
base::AutoLock auto_lock(*wrapper->va_lock_);
VAStatus va_res = vaGetConfigAttributes(
wrapper->va_display_, va_profile, entrypoint, &attrib, 1);
ASSERT_EQ(va_res, VA_STATUS_SUCCESS);
}
const auto quality_level = attrib.value;
if (quality_level == VA_ATTRIB_NOT_SUPPORTED || quality_level <= 1u)
continue;
DLOG(INFO) << vaProfileStr(va_profile)
<< " supports encoding quality setting, with max value "
<< quality_level;
// If we get here it means the |va_profile| and |entrypoint| support
// the quality setting. We cannot inspect what the driver does with this
// number (it could ignore it), so instead just make sure there's a
// |pending_va_buffers_| that, when mapped, looks correct. That buffer
// should be created by CreateContext().
ASSERT_TRUE(wrapper->CreateContext(gfx::Size(64, 64)));
ASSERT_EQ(wrapper->pending_va_buffers_.size(), 1u);
{
base::AutoLock auto_lock(*wrapper->va_lock_);
ScopedVABufferMapping mapping(wrapper->va_lock_, wrapper->va_display_,
wrapper->pending_va_buffers_.front());
ASSERT_TRUE(mapping.IsValid());
auto* const va_buffer =
reinterpret_cast<VAEncMiscParameterBuffer*>(mapping.data());
EXPECT_EQ(va_buffer->type, VAEncMiscParameterTypeQualityLevel);
auto* const enc_quality =
reinterpret_cast<VAEncMiscParameterBufferQualityLevel*>(
va_buffer->data);
EXPECT_EQ(enc_quality->quality_level, quality_level)
<< vaProfileStr(va_profile) << " " << vaEntrypointStr(entrypoint);
}
}
}
}
} // namespace media
int main(int argc, char** argv) {
......
@@ -32,6 +32,7 @@
#include "base/numerics/safe_conversions.h"
#include "base/posix/eintr_wrapper.h"
#include "base/stl_util.h"
#include "base/strings/pattern.h"
#include "base/strings/string_util.h"
#include "base/system/sys_info.h"
#include "base/trace_event/trace_event.h"
@@ -241,7 +242,7 @@ bool IsGen9Gpu() {
constexpr int kSkyLakeModelId = 0x5E;
constexpr int kSkyLake_LModelId = 0x4E;
constexpr int kApolloLakeModelId = 0x5c;
static base::NoDestructor<base::CPU> cpuid;
static const base::NoDestructor<base::CPU> cpuid;
static const bool is_gen9_gpu = cpuid->family() == kPentiumAndLaterFamily &&
(cpuid->model() == kSkyLakeModelId ||
cpuid->model() == kSkyLake_LModelId ||
@@ -259,7 +260,7 @@ bool IsGen95Gpu() {
constexpr int kGeminiLakeModelId = 0x7A;
constexpr int kCometLakeModelId = 0xA5;
constexpr int kCometLake_LModelId = 0xA6;
static base::NoDestructor<base::CPU> cpuid;
static const base::NoDestructor<base::CPU> cpuid;
static const bool is_gen95_gpu = cpuid->family() == kPentiumAndLaterFamily &&
(cpuid->model() == kKabyLakeModelId ||
cpuid->model() == kKabyLake_LModelId ||
@@ -269,6 +270,22 @@ bool IsGen95Gpu() {
return is_gen95_gpu;
}
// Returns true if the SoC is considered a low power one, i.e. it's an Intel
// Pentium, Celeron, or a Core Y-series. See go/intel-socs-101 or
// https://www.intel.com/content/www/us/en/processors/processor-numbers.html.
bool IsLowPowerIntelProcessor() {
constexpr int kPentiumAndLaterFamily = 0x06;
static const base::NoDestructor<base::CPU> cpuid;
static const bool is_core_y_processor =
base::MatchPattern(cpuid->cpu_brand(), "Intel(R) Core(TM) *Y CPU*");
static const bool is_low_power_intel =
cpuid->family() == kPentiumAndLaterFamily &&
(base::Contains(cpuid->cpu_brand(), "Pentium") ||
base::Contains(cpuid->cpu_brand(), "Celeron") || is_core_y_processor);
return is_low_power_intel;
}
bool IsModeEncoding(VaapiWrapper::CodecMode mode) {
return mode == VaapiWrapper::CodecMode::kEncode ||
mode == VaapiWrapper::CodecMode::kEncodeConstantQuantizationParameter;
@@ -740,25 +757,25 @@ bool GetRequiredAttribs(const base::Lock* va_lock,
if (!IsModeEncoding(mode))
return true;
if (profile != VAProfileJPEGBaseline) {
if (mode == VaapiWrapper::kEncode)
required_attribs->push_back({VAConfigAttribRateControl, VA_RC_CBR});
if (mode == VaapiWrapper::kEncodeConstantQuantizationParameter)
required_attribs->push_back({VAConfigAttribRateControl, VA_RC_CQP});
}
if (profile == VAProfileJPEGBaseline)
return true;
if (mode == VaapiWrapper::kEncode)
required_attribs->push_back({VAConfigAttribRateControl, VA_RC_CBR});
if (mode == VaapiWrapper::kEncodeConstantQuantizationParameter)
required_attribs->push_back({VAConfigAttribRateControl, VA_RC_CQP});
constexpr VAProfile kSupportedH264VaProfilesForEncoding[] = {
VAProfileH264ConstrainedBaseline, VAProfileH264Main, VAProfileH264High};
// VAConfigAttribEncPackedHeaders is H.264 specific.
if (base::Contains(kSupportedH264VaProfilesForEncoding, profile)) {
// Encode with Packed header if a driver supports.
VAConfigAttrib attrib;
// Encode with Packed header if the driver supports.
VAConfigAttrib attrib{};
attrib.type = VAConfigAttribEncPackedHeaders;
const VAStatus va_res =
vaGetConfigAttributes(va_display, profile, entrypoint, &attrib, 1);
if (va_res != VA_STATUS_SUCCESS) {
LOG(ERROR) << "vaGetConfigAttributes failed for "
<< vaProfileStr(profile);
LOG(ERROR) << "vaGetConfigAttributes failed: " << vaProfileStr(profile);
return false;
}
@@ -941,6 +958,7 @@ void VASupportedProfiles::FillSupportedProfileInfos(base::Lock* va_lock,
<< vaEntrypointStr(entrypoint);
continue;
}
supported_profile_infos.push_back(profile_info);
}
}
@@ -1659,6 +1677,10 @@ bool VaapiWrapper::CreateContext(const gfx::Size& size) {
flag, empty_va_surfaces_ids_pointer, empty_va_surfaces_ids_size,
&va_context_id_);
VA_LOG_ON_ERROR(va_res, VaapiFunctions::kVACreateContext);
if (IsModeEncoding(mode_) && IsLowPowerIntelProcessor())
MaybeSetLowQualityEncoding_Locked();
return va_res == VA_STATUS_SUCCESS;
}
@@ -2142,9 +2164,8 @@ bool VaapiWrapper::GetVAEncMaxNumOfRefFrames(VideoCodecProfile profile,
attrib.type = VAConfigAttribEncMaxRefFrames;
base::AutoLock auto_lock(*va_lock_);
VAStatus va_res =
vaGetConfigAttributes(va_display_, va_profile,
va_entrypoint_, &attrib, 1);
VAStatus va_res = vaGetConfigAttributes(va_display_, va_profile,
va_entrypoint_, &attrib, 1);
VA_SUCCESS_OR_RETURN(va_res, VaapiFunctions::kVAGetConfigAttributes, false);
*max_ref_frames = attrib.value;
@@ -2296,7 +2317,9 @@ VaapiWrapper::VaapiWrapper(CodecMode mode)
va_lock_(VADisplayState::Get()->va_lock()),
va_display_(NULL),
va_config_id_(VA_INVALID_ID),
va_context_id_(VA_INVALID_ID) {}
va_context_id_(VA_INVALID_ID),
va_profile_(VAProfileNone),
va_entrypoint_(kVAEntrypointInvalid) {}
VaapiWrapper::~VaapiWrapper() {
// Destroy ScopedVABuffer before VaapiWrappers are destroyed to ensure
@@ -2328,7 +2351,7 @@ bool VaapiWrapper::Initialize(CodecMode mode, VAProfile va_profile) {
vaCreateConfig(va_display_, va_profile, entrypoint,
required_attribs.empty() ? nullptr : &required_attribs[0],
required_attribs.size(), &va_config_id_);
va_profile_ = va_profile;
va_entrypoint_ = entrypoint;
VA_SUCCESS_OR_RETURN(va_res, VaapiFunctions::kVACreateConfig, false);
@@ -2558,4 +2581,40 @@ bool VaapiWrapper::MapAndCopy_Locked(VABufferID va_buffer_id,
return memcpy(mapping.data(), va_buffer.data, va_buffer.size);
}
void VaapiWrapper::MaybeSetLowQualityEncoding_Locked() {
DCHECK(IsModeEncoding(mode_));
va_lock_->AssertAcquired();
// Query if encoding quality (VAConfigAttribEncQualityRange) is supported, and
// if so, use the associated value for lowest quality and power consumption.
VAConfigAttrib attrib{};
attrib.type = VAConfigAttribEncQualityRange;
const VAStatus va_res = vaGetConfigAttributes(va_display_, va_profile_,
va_entrypoint_, &attrib, 1);
if (va_res != VA_STATUS_SUCCESS) {
LOG(ERROR) << "vaGetConfigAttributes failed: " << vaProfileStr(va_profile_);
return;
}
// From libva's va.h: 'A value less than or equal to 1 means that the
// encoder only has a single "quality setting,"'.
if (attrib.value == VA_ATTRIB_NOT_SUPPORTED || attrib.value <= 1u)
return;
const size_t temp_size = sizeof(VAEncMiscParameterBuffer) +
sizeof(VAEncMiscParameterBufferQualityLevel);
std::vector<char> temp(temp_size);
auto* const va_buffer =
reinterpret_cast<VAEncMiscParameterBuffer*>(temp.data());
va_buffer->type = VAEncMiscParameterTypeQualityLevel;
auto* const enc_quality =
reinterpret_cast<VAEncMiscParameterBufferQualityLevel*>(va_buffer->data);
enc_quality->quality_level = attrib.value;
const bool success =
SubmitBuffer_Locked({VAEncMiscParameterBufferType, temp_size, va_buffer});
LOG_IF(ERROR, !success) << "Error setting encoding quality to "
<< enc_quality->quality_level;
}
} // namespace media
@@ -253,10 +253,10 @@ class MEDIA_GPU_EXPORT VaapiWrapper
// Releases the |va_surfaces| and destroys |va_context_id_|.
void DestroyContextAndSurfaces(std::vector<VASurfaceID> va_surfaces);
// Creates a VA Context of |size| and sets |va_context_id_|. In the case of a
// VPP VaapiWrapper, |size| is ignored and 0x0 is used to create the context.
// The client is responsible for releasing it via DestroyContext() or
// DestroyContextAndSurfaces(), or it will be released on dtor.
// Creates a VAContextID of |size| (unless it's a Vpp context in which case
// |size| is ignored and 0x0 is used instead). The client is responsible for
// releasing said context via DestroyContext() or DestroyContextAndSurfaces(),
// or it will be released on dtor.
virtual bool CreateContext(const gfx::Size& size) WARN_UNUSED_RESULT;
// Destroys the context identified by |va_context_id_|.
@@ -434,6 +434,7 @@ class MEDIA_GPU_EXPORT VaapiWrapper
private:
friend class base::RefCountedThreadSafe<VaapiWrapper>;
FRIEND_TEST_ALL_PREFIXES(VaapiTest, LowQualityEncodingSetting);
FRIEND_TEST_ALL_PREFIXES(VaapiUtilsTest, ScopedVAImage);
FRIEND_TEST_ALL_PREFIXES(VaapiUtilsTest, BadScopedVAImage);
FRIEND_TEST_ALL_PREFIXES(VaapiUtilsTest, BadScopedVABufferMapping);
@@ -470,6 +471,11 @@ class MEDIA_GPU_EXPORT VaapiWrapper
const VABufferDescriptor& va_buffer)
EXCLUSIVE_LOCKS_REQUIRED(va_lock_) WARN_UNUSED_RESULT;
// Queries whether |va_profile_| and |va_entrypoint_| support encoding quality
// setting and, if available, configures it to its maximum value, for lower
// consumption and maximum speed.
void MaybeSetLowQualityEncoding_Locked() EXCLUSIVE_LOCKS_REQUIRED(va_lock_);
const CodecMode mode_;
// Pointer to VADisplayState's member |va_lock_|. Guaranteed to be valid for
@@ -484,7 +490,8 @@ class MEDIA_GPU_EXPORT VaapiWrapper
// DestroyContext() or DestroyContextAndSurfaces().
VAContextID va_context_id_;
//Entrypoint configured for the corresponding context
// Profile and entrypoint configured for the corresponding |va_context_id_|.
VAProfile va_profile_;
VAEntrypoint va_entrypoint_;
// Data queued up for HW codec, to be committed on next execution.
......