Bug 1713276 Comment 40 Edit History

Note: The actual edited comment in the bug view page will always show the original commenter’s name and original timestamp.

My impression is that multi-threaded decoding is not a factor, but the mechanism for synchronizing the shmem might be related. The following results cover multi-threaded and single-threaded decoding (for the single-threaded runs, change [this](https://searchfox.org/mozilla-central/rev/55a826a9ef74e92988e56cd9615d4fc6a470695e/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp#404) to 1). They show that decoding to shmem is still slower even with a single thread.

* Linux, Multi-threads, Decode + Copy
[Child 579563: Main Thread]: D/MediaAveragePerf 'RequestDecode' stage for 'V:1440<h<=2160' took `19340.897155` us in the average.

* Linux, Single thread, Decode + Copy
[Child 794861: Main Thread]: D/MediaAveragePerf 'RequestDecode' stage for 'V:1440<h<=2160' took `44225.718321` us in the average.

* Linux, Multi-threads, Decode to shmem 
[Child 579979: Main Thread]: D/MediaAveragePerf 'RequestDecode' stage for 'V:1440<h<=2160' took `20269.318408` us in the average.

* Linux, Single thread, Decode to shmem 
[Child 795783: Main Thread]: D/MediaAveragePerf 'RequestDecode' stage for 'V:1440<h<=2160' took `53721.353087` us in the average.

---

Looking at the stack in comment 37 (decode to shmem), a lot of time is spent in `ZwFreeVirtualMemory`, which is called from `av_buffer_unref`. In `vp9_decode_update_thread_context`, I did see ffvpx check which frames are no longer needed and unref those AVFrames. That in turn seems to trigger resetting data on the shmem, which would cost a lot of time.

However, if that data were reset during decoding, how could we still see the image on the compositor? Because the images are still complete, the shmem buffer should remain unchanged after we receive the decoded video frames. And if the data is not reset, why would `av_buffer_unref` trigger those cleanup methods and take so much time?

My impression is that multi-threaded decoding is not a factor, but the mechanism for synchronizing the shmem might be related. The following results cover multi-threaded and single-threaded decoding (for the single-threaded runs, change [this](https://searchfox.org/mozilla-central/rev/55a826a9ef74e92988e56cd9615d4fc6a470695e/dom/media/platforms/ffmpeg/FFmpegVideoDecoder.cpp#404) to 1). They show that decoding to shmem is still slower even with a single thread.

* Linux, Multi-threads, Decode + Copy
[Child 579563: Main Thread]: D/MediaAveragePerf 'RequestDecode' stage for 'V:1440<h<=2160' took `19340.897155` us in the average.

* Linux, Single thread, Decode + Copy
[Child 794861: Main Thread]: D/MediaAveragePerf 'RequestDecode' stage for 'V:1440<h<=2160' took `44225.718321` us in the average.

* Linux, Multi-threads, Decode to shmem 
[Child 579979: Main Thread]: D/MediaAveragePerf 'RequestDecode' stage for 'V:1440<h<=2160' took `20269.318408` us in the average.

* Linux, Single thread, Decode to shmem 
[Child 795783: Main Thread]: D/MediaAveragePerf 'RequestDecode' stage for 'V:1440<h<=2160' took `53721.353087` us in the average.
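
To put the four averages above side by side, here is a small sketch that computes how much slower decode-to-shmem is than decode + copy for each threading mode. The numbers are copied from the log lines; the dictionary layout and the helper name `shmem_overhead_percent` are only illustrative.

```python
# Average 'RequestDecode' times in microseconds, copied from the log lines above.
AVERAGE_US = {
    ("multi", "copy"): 19340.897155,
    ("single", "copy"): 44225.718321,
    ("multi", "shmem"): 20269.318408,
    ("single", "shmem"): 53721.353087,
}


def shmem_overhead_percent(threads: str) -> float:
    """Relative slowdown of decode-to-shmem versus decode + copy, in percent."""
    copy_us = AVERAGE_US[(threads, "copy")]
    shmem_us = AVERAGE_US[(threads, "shmem")]
    return (shmem_us - copy_us) / copy_us * 100.0


if __name__ == "__main__":
    for threads in ("multi", "single"):
        overhead = shmem_overhead_percent(threads)
        print(f"{threads}-thread: decode to shmem is {overhead:.1f}% slower")
```

With these figures the gap is roughly 5% with multiple threads and roughly 21% with a single thread, so the shmem path is slower in both modes.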

---

Looking at the stack in comment 37 (decode to shmem), a lot of time is spent in `ZwFreeVirtualMemory` (on Windows), which is called from `av_buffer_unref`. In `vp9_decode_update_thread_context`, I did see ffvpx check which frames are no longer needed and unref those AVFrames. That in turn seems to trigger resetting data on the shmem, which would cost a lot of time. Also, on Linux, most of the time is spent in `__pthread_cond_wait`, which also looks like synchronizing shmem data between two processes?

However, if that data were reset during decoding, how could we still see the image on the compositor? Because the images are still complete, the shmem buffer should remain unchanged after we receive the decoded video frames. And if the data is not reset, why would `av_buffer_unref` trigger those cleanup methods and take so much time?
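
One thing that may be relevant to the question above: FFmpeg's AVBuffer is reference counted, and `av_buffer_unref` only invokes the buffer's free callback once the last reference is dropped. The toy model below illustrates just that rule. It is not FFmpeg or Firefox code; `RefCountedBuffer`, `release`, and `demo` are hypothetical names, and whether this accounts for the profile in comment 37 is not established here.

```python
from typing import Callable, Dict, List


class RefCountedBuffer:
    """Toy stand-in for a reference-counted buffer (hypothetical, not AVBufferRef)."""

    def __init__(self, data: bytearray, free_cb: Callable[[bytearray], None]) -> None:
        self.data = data
        self._free_cb = free_cb
        self.refcount = 1  # the creator holds the first reference

    def ref(self) -> "RefCountedBuffer":
        """Take an additional reference to the same underlying data."""
        self.refcount += 1
        return self

    def unref(self) -> None:
        """Drop one reference; run the free callback only when none remain."""
        if self.refcount <= 0:
            raise RuntimeError("unref called on a buffer with no references")
        self.refcount -= 1
        if self.refcount == 0:
            self._free_cb(self.data)


def demo() -> Dict[str, bool]:
    released: List[int] = []

    def release(data: bytearray) -> None:
        # Hypothetical cleanup: record the call and wipe the contents.
        released.append(len(data))
        data[:] = bytes(len(data))

    pixels = b"\x10\x20\x30"
    buf = RefCountedBuffer(bytearray(pixels), release)
    buf.ref()    # a consumer (e.g. whoever displays the frame) takes a reference
    buf.unref()  # the decoder drops its reference: 2 -> 1, nothing is freed yet
    intact = bytes(buf.data) == pixels and not released
    buf.unref()  # the consumer drops the last reference: the callback runs now
    return {"intact_while_referenced": intact, "freed_at_last_unref": released == [3]}


if __name__ == "__main__":
    print(demo())
```

In this model the data stays intact for as long as some holder keeps a reference, and the cleanup cost is only paid at the final unref.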
