Vaapi: Remove ConditionVariables and use PostTask

This CL removes the use of the ConditionVariables inside VaVDA and uses instead PostTask()ing. This removes blocking the worker thread, which allows for moving other tasks there, e.g. the Surface Blit (as done in the follow up WIP CL crrev.com/c/947341) For all that goodness, this CL: - Refactors the waiting logic that on ToT is spread between GetCurrInputBuffer_Locked() and WaitForSurfaces_Locked() and simplifies it in DecodeTask() itself. - Moves the |curr_input_buffer_| filling logic in GetCurrInputBuffer_Locked() into DecodeTask() proper. - Reduces the scoped of the |lock_| in DecodeTask() (note that |curr_input_buffer_| and |decode_| are only used on the decoder thread -- added a .h comment). Minor stuff: This CL also moves together the declaration of the related member vars |input_buffers_|, |curr_input_buffer_| and |available_va_surfaces_| TEST=simplechrome+crosvideo (including seeks) on soraka, and v_d_a_unittest with h264, vp8, vp9 (with dcheck_always_on=true). Bug: 717265 Cq-Include-Trybots: master.tryserver.chromium.android:android_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel Change-Id: Ib167115dd2a4ebaeaea3ba3b5a205779d659f479 Reviewed-on: https://chromium-review.googlesource.com/949283 Commit-Queue: Miguel Casas <mcasas@chromium.org> Reviewed-by: Daniele Castagna <dcastagna@chromium.org> Cr-Commit-Position: refs/heads/master@{#542955}

Vaapi: Remove ConditionVariables and use PostTask
This CL removes the use of the ConditionVariables inside VaVDA and uses instead PostTask()ing. This removes blocking the worker thread, which allows for moving other tasks there, e.g. the Surface Blit (as done in the follow up WIP CL crrev.com/c/947341) For all that goodness, this CL: - Refactors the waiting logic that on ToT is spread between GetCurrInputBuffer_Locked() and WaitForSurfaces_Locked() and simplifies it in DecodeTask() itself. - Moves the |curr_input_buffer_| filling logic in GetCurrInputBuffer_Locked() into DecodeTask() proper. - Reduces the scoped of the |lock_| in DecodeTask() (note that |curr_input_buffer_| and |decode_| are only used on the decoder thread -- added a .h comment). Minor stuff: This CL also moves together the declaration of the related member vars |input_buffers_|, |curr_input_buffer_| and |available_va_surfaces_| TEST=simplechrome+crosvideo (including seeks) on soraka, and v_d_a_unittest with h264, vp8, vp9 (with dcheck_always_on=true). Bug: 717265 Cq-Include-Trybots: master.tryserver.chromium.android:android_optional_gpu_tests_rel;master.tryserver.chromium.linux:linux_optional_gpu_tests_rel;master.tryserver.chromium.mac:mac_optional_gpu_tests_rel;master.tryserver.chromium.win:win_optional_gpu_tests_rel Change-Id: Ib167115dd2a4ebaeaea3ba3b5a205779d659f479 Reviewed-on: https://chromium-review.googlesource.com/949283 Commit-Queue: Miguel Casas <mcasas@chromium.org> Reviewed-by: Daniele Castagna <dcastagna@chromium.org> Cr-Commit-Position: refs/heads/master@{#542955}
6683995b · Miguel Casas · Commit Bot · 9fdc3676 · 6683995b · 6683995b
Commit 6683995b authored Mar 14, 2018 by Miguel Casas Committed by Commit Bot Mar 14, 2018
Showing with 63 additions and 158 deletions

media/gpu/vaapi/vaapi_video_decode_accelerator.cc media/gpu/vaapi/vaapi_video_decode_accelerator.cc +55 -129

media/gpu/vaapi/vaapi_video_decode_accelerator.h media/gpu/vaapi/vaapi_video_decode_accelerator.h +8 -29

No files found.
--- a/media/gpu/vaapi/vaapi_video_decode_accelerator.cc
+++ b/media/gpu/vaapi/vaapi_video_decode_accelerator.cc
@@ -132,9 +132,7 @@ VaapiVideoDecodeAccelerator::VaapiVideoDecodeAccelerator(
    const MakeGLContextCurrentCallback& make_context_current_cb,
    const BindGLImageCallback& bind_image_cb)
    : state_(kUninitialized),
-      input_ready_(&lock_),
      vaapi_picture_factory_(new VaapiPictureFactory()),
-      surfaces_available_(&lock_),
      task_runner_(base::ThreadTaskRunnerHandle::Get()),
      decoder_thread_("VaapiDecoderThread"),
      num_frames_at_client_(0),
@@ -287,20 +285,17 @@ void VaapiVideoDecodeAccelerator::QueueInputBuffer(
  }

  TRACE_COUNTER1("media,gpu", "Input buffers", input_buffers_.size());
-  input_ready_.Signal();

  switch (state_) {
    case kIdle:
      state_ = kDecoding;
+      FALLTHROUGH;
+    case kDecoding:
      decoder_thread_task_runner_->PostTask(
          FROM_HERE, base::Bind(&VaapiVideoDecodeAccelerator::DecodeTask,
                                base::Unretained(this)));
      break;

-    case kDecoding:
-      // Decoder already running.
-      break;
-
    case kResetting:
      // When resetting, allow accumulating bitstream buffers, so that
      // the client can queue after-seek-buffers while we are finishing with
@@ -315,127 +310,66 @@ void VaapiVideoDecodeAccelerator::QueueInputBuffer(
  }
 }

-bool VaapiVideoDecodeAccelerator::GetCurrInputBuffer_Locked() {
-  DCHECK(decoder_thread_task_runner_->BelongsToCurrentThread());
-  lock_.AssertAcquired();
-
-  if (curr_input_buffer_.get())
-    return true;
-
-  // Will only wait if it is expected that in current state new buffers will
-  // be queued from the client via Decode(). The state can change during wait.
-  while (input_buffers_.empty() && (state_ == kDecoding || state_ == kIdle))
-    input_ready_.Wait();
-
-  // We could have got woken up in a different state or never got to sleep
-  // due to current state.
-  if (state_ != kDecoding && state_ != kIdle)
-    return false;
-
-  DCHECK(!input_buffers_.empty());
-  curr_input_buffer_ = std::move(input_buffers_.front());
-  input_buffers_.pop();
-  TRACE_COUNTER1("media,gpu", "Input buffers", input_buffers_.size());
-
-  if (curr_input_buffer_->IsFlushRequest()) {
-    VLOGF(4) << "New flush buffer";
-    return true;
-  }
-
-  VLOGF(4) << "New |curr_input_buffer_|, id: " << curr_input_buffer_->id()
-           << " size: " << curr_input_buffer_->shm()->size() << "B";
-  decoder_->SetStream(
-      static_cast<uint8_t*>(curr_input_buffer_->shm()->memory()),
-      curr_input_buffer_->shm()->size());
-
-  return true;
-}
-
-void VaapiVideoDecodeAccelerator::ReturnCurrInputBuffer_Locked() {
-  DCHECK(decoder_thread_task_runner_->BelongsToCurrentThread());
-  lock_.AssertAcquired();
-  DCHECK(curr_input_buffer_.get());
-  curr_input_buffer_.reset();
-}
-
-// TODO(posciak): refactor the whole class to remove sleeping in wait for
-// surfaces, and reschedule DecodeTask instead.
-bool VaapiVideoDecodeAccelerator::WaitForSurfaces_Locked() {
-  DCHECK(decoder_thread_task_runner_->BelongsToCurrentThread());
-  lock_.AssertAcquired();
-
-  while (available_va_surfaces_.empty() &&
-         (state_ == kDecoding || state_ == kIdle)) {
-    surfaces_available_.Wait();
-  }
-
-  return state_ == kDecoding || state_ == kIdle;
-}
-
 void VaapiVideoDecodeAccelerator::DecodeTask() {
  DCHECK(decoder_thread_task_runner_->BelongsToCurrentThread());
-  base::AutoLock auto_lock(lock_);

-  if (state_ != kDecoding)
-    return;
-  VLOGF(4) << "Decode task";
+  {
+    base::AutoLock auto_lock(lock_);
+    if (state_ != kDecoding)
+      return;

-  // Try to decode what stream data is (still) in the decoder until we run out
-  // of it.
-  while (GetCurrInputBuffer_Locked()) {
-    DCHECK(curr_input_buffer_.get());
+    // Extract |curr_input_buffer_| out of |input_buffers_| if we have none, and
+    // set it in |decoder_| if it isn't a Flush Request.
+    if (!curr_input_buffer_ && !input_buffers_.empty()) {
+      curr_input_buffer_ = std::move(input_buffers_.front());
+      input_buffers_.pop();

-    if (curr_input_buffer_->IsFlushRequest()) {
-      FlushTask();
-      break;
-    }
+      if (curr_input_buffer_->IsFlushRequest()) {
+        VLOGF(4) << "New flush buffer";
+        FlushTask();
+        return;
+      }

-    AcceleratedVideoDecoder::DecodeResult res;
-    {
-      // We are OK releasing the lock here, as decoder never calls our methods
-      // directly and we will reacquire the lock before looking at state again.
-      // This is the main decode function of the decoder and while keeping
-      // the lock for its duration would be fine, it would defeat the purpose
-      // of having a separate decoder thread.
-      base::AutoUnlock auto_unlock(lock_);
-      TRACE_EVENT0("media,gpu", "VAVDA::Decode");
-      res = decoder_->Decode();
+      SharedMemoryRegion* const shm = curr_input_buffer_->shm();
+      VLOGF(4) << "New |curr_input_buffer|, id " << curr_input_buffer_->id()
+               << " size: " << shm->size() << "B";
+      decoder_->SetStream(static_cast<uint8_t*>(shm->memory()), shm->size());
    }
+  }

-    switch (res) {
-      case AcceleratedVideoDecoder::kAllocateNewSurfaces:
-        VLOGF(2) << "Decoder requesting a new set of surfaces";
-        task_runner_->PostTask(
-            FROM_HERE,
-            base::Bind(&VaapiVideoDecodeAccelerator::InitiateSurfaceSetChange,
-                       weak_this_, decoder_->GetRequiredNumOfPictures(),
-                       decoder_->GetPicSize()));
-        // We'll get rescheduled once ProvidePictureBuffers() finishes.
-        return;
-
-      case AcceleratedVideoDecoder::kRanOutOfStreamData:
-        ReturnCurrInputBuffer_Locked();
-        break;
+  AcceleratedVideoDecoder::DecodeResult res;
+  {
+    TRACE_EVENT0("media,gpu", "VAVDA::Decode");
+    res = decoder_->Decode();
+  }
+  switch (res) {
+    case AcceleratedVideoDecoder::kAllocateNewSurfaces:
+      VLOGF(2) << "Decoder requesting a new set of surfaces";
+      task_runner_->PostTask(
+          FROM_HERE,
+          base::BindOnce(&VaapiVideoDecodeAccelerator::InitiateSurfaceSetChange,
+                         weak_this_, decoder_->GetRequiredNumOfPictures(),
+                         decoder_->GetPicSize()));
+      // We'll get rescheduled once ProvidePictureBuffers() finishes.
+      return;

-      case AcceleratedVideoDecoder::kRanOutOfSurfaces:
-        // No more output buffers in the decoder, try getting more or go to
-        // sleep waiting for them.
-        if (!WaitForSurfaces_Locked())
-          return;
+    case AcceleratedVideoDecoder::kRanOutOfStreamData:
+      curr_input_buffer_.reset();
+      break;

-        break;
+    case AcceleratedVideoDecoder::kRanOutOfSurfaces:
+      // We'll be rescheduled once we get a VA Surface via RecycleVASurfaceID().
+      break;

-      case AcceleratedVideoDecoder::kNeedContextUpdate:
-        // This should not happen as we return false from
-        // IsFrameContextRequired().
-        NOTREACHED() << "Context updates not supported";
-        return;
+    case AcceleratedVideoDecoder::kNeedContextUpdate:
+      // This shouldn't happen as we return false from IsFrameContextRequired().
+      NOTREACHED() << "Context updates not supported";
+      return;

-      case AcceleratedVideoDecoder::kDecodeError:
-        RETURN_AND_NOTIFY_ON_FAILURE(false, "Error decoding stream",
-                                     PLATFORM_FAILURE, );
-        return;
-    }
+    case AcceleratedVideoDecoder::kDecodeError:
+      RETURN_AND_NOTIFY_ON_FAILURE(false, "Error decoding stream",
+                                   PLATFORM_FAILURE, );
+      return;
  }
 }

@@ -539,7 +473,9 @@ void VaapiVideoDecodeAccelerator::RecycleVASurfaceID(
  base::AutoLock auto_lock(lock_);

  available_va_surfaces_.push_back(va_surface_id);
-  surfaces_available_.Signal();
+  decoder_thread_task_runner_->PostTask(
+      FROM_HERE, base::BindOnce(&VaapiVideoDecodeAccelerator::DecodeTask,
+                                base::Unretained(this)));
 }

 void VaapiVideoDecodeAccelerator::AssignPictureBuffers(
@@ -564,6 +500,7 @@ void VaapiVideoDecodeAccelerator::AssignPictureBuffers(
                                     buffers.size(), &va_surface_ids),
      "Failed creating VA Surfaces", PLATFORM_FAILURE, );
  DCHECK_EQ(va_surface_ids.size(), buffers.size());
+  available_va_surfaces_.assign(va_surface_ids.begin(), va_surface_ids.end());

  for (size_t i = 0; i < buffers.size(); ++i) {
    uint32_t client_id = !buffers[i].client_texture_ids().empty()
@@ -593,9 +530,6 @@ void VaapiVideoDecodeAccelerator::AssignPictureBuffers(
        pictures_.insert(std::make_pair(buffers[i].id(), std::move(picture)))
            .second;
    DCHECK(inserted);
-
-    available_va_surfaces_.push_back(va_surface_ids[i]);
-    surfaces_available_.Signal();
  }

  // Resume DecodeTask if it is still in decoding state.
@@ -755,7 +689,7 @@ void VaapiVideoDecodeAccelerator::ResetTask() {

  // Return current input buffer, if present.
  if (curr_input_buffer_)
-    ReturnCurrInputBuffer_Locked();
+    curr_input_buffer_.reset();

  // And let client know that we are done with reset.
  task_runner_->PostTask(
@@ -780,9 +714,6 @@ void VaapiVideoDecodeAccelerator::Reset() {
  decoder_thread_task_runner_->PostTask(
      FROM_HERE, base::Bind(&VaapiVideoDecodeAccelerator::ResetTask,
                            base::Unretained(this)));
-
-  input_ready_.Signal();
-  surfaces_available_.Signal();
 }

 void VaapiVideoDecodeAccelerator::FinishReset() {
@@ -844,11 +775,6 @@ void VaapiVideoDecodeAccelerator::Cleanup() {
  // TODO(mcasas): consider deleting |decoder_| on
  // |decoder_thread_task_runner_|, https://crbug.com/789160.

-  // Signal all potential waiters on the decoder_thread_, let them early-exit,
-  // as we've just moved to the kDestroying state, and wait for all tasks
-  // to finish.
-  input_ready_.Signal();
-  surfaces_available_.Signal();
  {
    base::AutoUnlock auto_unlock(lock_);
    decoder_thread_.Stop();

--- a/media/gpu/vaapi/vaapi_video_decode_accelerator.h
+++ b/media/gpu/vaapi/vaapi_video_decode_accelerator.h
@@ -23,7 +23,6 @@
 #include "base/memory/linked_ptr.h"
 #include "base/memory/weak_ptr.h"
 #include "base/single_thread_task_runner.h"
-#include "base/synchronization/condition_variable.h"
 #include "base/synchronization/lock.h"
 #include "base/threading/thread.h"
 #include "media/base/bitstream_buffer.h"
@@ -114,21 +113,6 @@ class MEDIA_GPU_EXPORT VaapiVideoDecodeAccelerator
  // Queue a input buffer for decode.
  void QueueInputBuffer(const BitstreamBuffer& bitstream_buffer);

-  // Gets a new |current_input_buffer_| from |input_buffers_| and sets it up in
-  // |decoder_|. This method will sleep if no |input_buffers_| are available.
-  // Returns true if a new buffer has been set up, false if an early exit has
-  // been requested (due to initiated reset/flush/destroy).
-  bool GetCurrInputBuffer_Locked();
-
-  // Signals the client that |curr_input_buffer_| has been read and can be
-  // returned. Will also release the mapping.
-  void ReturnCurrInputBuffer_Locked();
-
-  // Waits for more surfaces to become available. Returns true once they do or
-  // false if an early exit has been requested (due to an initiated
-  // reset/flush/destroy).
-  bool WaitForSurfaces_Locked();
-
  // Continue decoding given input buffers and sleep waiting for input/output
  // as needed. Will exit if a new set of surfaces or reset/flush/destroy
  // is requested.
@@ -202,19 +186,21 @@ class MEDIA_GPU_EXPORT VaapiVideoDecodeAccelerator
    kDestroying,
  };

-  // Protects input buffer and surface queues and state_.
+  // Protects input buffer and surface queues and |state_|.
  base::Lock lock_;
  State state_;
  Config::OutputMode output_mode_;

-  // Queue of available InputBuffers (picture_buffer_ids).
+  // Queue of input InputBuffers (PictureBuffer ids) to decode.
  base::queue<std::unique_ptr<InputBuffer>> input_buffers_;
-  // Signalled when input buffers are queued onto |input_buffers_| queue.
-  base::ConditionVariable input_ready_;
-
-  // Current input buffer at decoder.
+  // Current InputBuffer at |decoder_|.
+  // Only accessed on |decoder_thread_task_runner_| (needs no |lock_|)
  std::unique_ptr<InputBuffer> curr_input_buffer_;

+  // VA Surfaces no longer in use that can be passed back to the decoder for
+  // reuse, once it requests them.
+  std::list<VASurfaceID> available_va_surfaces_;
+
  // Queue for incoming output buffers (texture ids).
  using OutputBuffers = base::queue<int32_t>;
  OutputBuffers output_buffers_;
@@ -235,13 +221,6 @@ class MEDIA_GPU_EXPORT VaapiVideoDecodeAccelerator
  // Return a VaapiPicture associated with given client-provided id.
  VaapiPicture* PictureById(int32_t picture_buffer_id);

-  // VA Surfaces no longer in use that can be passed back to the decoder for
-  // reuse, once it requests them.
-  std::list<VASurfaceID> available_va_surfaces_;
-  // Signalled when output surfaces are queued onto the available_va_surfaces_
-  // queue.
-  base::ConditionVariable surfaces_available_;
-
  // Pending output requests from the decoder. When it indicates that we should
  // output a surface and we have an available Picture (i.e. texture) ready
  // to use, we'll execute the callback passing the Picture. The callback