vaapi: Improve VAAPI encoding scheduling

VAAPI usage in Chrome has a global lock that is acquired for each vaapi
call. The assumption was that vaapi calls would not block on the CPU,
but they do. In particular, vaEndPicture seems to take a significant
amount of wall time during encoding.

When multiple encodings are happening, we might end up interleaving
parts of the encodings of different streams and serializing them. As a
result, the wall time measured for one encoding sometimes includes the
wall time of other encodings too.

This CL tries to mitigate that issue in a few different ways:
- In VaapiVideoEncodeAccelerator::ReturnBitstreamBuffer the client is
  notified that the encoding is ready before destroying the surface,
  since destroying the surface reacquires the lock, which might be held
  by another vaapi call in another thread.
- ExecuteAndDestroyPendingBuffers acquires the lock only once, so that
  another vaapi client can't acquire it in between execute and destroy.
- DownloadFromVABuffer doesn't unlock and relock during the copy to
  shmem, since the copy takes on the order of 0.1 ms for a VGA stream
  and unlocking would give another VAAPI client the chance to acquire
  the lock once the encoding is done and before we notify the client.

Bug: 1005592
Test: on meet.google.com, record average WebRTC encoding times, observe that the average drops by ~10%
Change-Id: I807e281ba501b5d05d53902c8257010725730966
Reviewed-on: https://chromium-review.googlesource.com/c/chromium/src/+/1819518
Commit-Queue: Daniele Castagna <dcastagna@chromium.org>
Reviewed-by: Hirokazu Honda <hiroh@chromium.org>
Reviewed-by: Miguel Casas <mcasas@chromium.org>
Cr-Commit-Position: refs/heads/master@{#701010}
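
The locking changes described in this message can be illustrated with a minimal sketch. This is not the Chromium code: g_va_lock, g_trace, ExecuteLocked, DestroyLocked and the three wrapper functions are hypothetical stand-ins, and std::mutex is assumed in place of whatever lock type the real implementation uses. The sketch only shows the difference between taking the lock twice and taking it once, and the ordering of client notification relative to the locked destroy.

```cpp
#include <cassert>
#include <functional>
#include <mutex>
#include <string>
#include <vector>

// Hypothetical stand-in for the process-wide VAAPI lock.
std::mutex g_va_lock;

// Records the order of operations so the sequencing can be inspected.
std::vector<std::string> g_trace;

// Hypothetical stand-ins for driver calls. Callers must hold g_va_lock.
void ExecuteLocked(int stream) {
  g_trace.push_back("execute:" + std::to_string(stream));
}
void DestroyLocked(int stream) {
  g_trace.push_back("destroy:" + std::to_string(stream));
}

// Two critical sections: another thread may take g_va_lock in the gap
// between execute and destroy, delaying the destroy for this stream.
void ExecuteThenDestroyTwoLocks(int stream) {
  {
    std::lock_guard<std::mutex> guard(g_va_lock);
    ExecuteLocked(stream);
  }
  {
    std::lock_guard<std::mutex> guard(g_va_lock);
    DestroyLocked(stream);
  }
}

// One critical section: execute and destroy cannot be separated by
// another holder of g_va_lock.
void ExecuteAndDestroyOneLock(int stream) {
  std::lock_guard<std::mutex> guard(g_va_lock);
  ExecuteLocked(stream);
  DestroyLocked(stream);
}

// Notifies the client before taking the lock for the destroy, so the
// notification never waits behind another thread holding g_va_lock.
void NotifyThenDestroy(int stream,
                       const std::function<void()>& notify_client) {
  assert(static_cast<bool>(notify_client));
  notify_client();  // Runs without g_va_lock held.
  std::lock_guard<std::mutex> guard(g_va_lock);
  DestroyLocked(stream);
}
```

With a single caller both variants produce the same trace. The difference only matters under contention, where the two-lock variant lets another stream's calls run between execute and destroy.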