Poor performance of TrackBuffersManager::CodedFrameProcessing() due to mInputBuffer.RemoveElement(0,N) on Twitch
Categories
(Core :: Audio/Video: Playback, defect, P2)
| | Tracking | Status |
|---|---|---|
| firefox69 | --- | fixed |
People
(Reporter: cpearce, Assigned: cpearce)
Attachments
(2 files)
Profiling https://www.twitch.tv/videos/427206039 on the HD8 tablet (with my patches from Bug 1554075 applied), I see a lot of time spent in TrackBuffersManager::CodedFrameProcessing() calling nsTArray::RemoveElementAt().
I believe this is caused by the call to mInputBuffer->RemoveElementsAt(0, length).
To test this, I added logging that records the length being removed from the front of the array versus the array's total length. Sure enough, I see output like this (this log was collected on desktop Firefox on Linux):
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] CodedFrameProcessing() removeElements(0, 493) of 20896
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] ProcessTasks AppendBuffer=829613
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] ProcessTasks AppendBuffer=829613 (done) len(mInputBuffer)=829613
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=829613
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=829613
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] CodedFrameProcessing() removeElements(0, 6900) of 829613
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] SegmentParserLoop mProcessedInput=620 len(mInputBuffer)=20403
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] SegmentParserLoop mProcessedInput=620 len(mInputBuffer)=20403
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] CodedFrameProcessing() removeElements(0, 469) of 20403
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=822713
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=822713
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] CodedFrameProcessing() removeElements(0, 27985) of 822713
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] SegmentParserLoop mProcessedInput=620 len(mInputBuffer)=19934
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] SegmentParserLoop mProcessedInput=620 len(mInputBuffer)=19934
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] CodedFrameProcessing() removeElements(0, 434) of 19934
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=794728
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=794728
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] SegmentParserLoop mProcessedInput=620 len(mInputBuffer)=19500
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] SegmentParserLoop mProcessedInput=620 len(mInputBuffer)=19500
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] CodedFrameProcessing() removeElements(0, 489) of 19500
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=794728
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] CodedFrameProcessing() removeElements(0, 7683) of 794728
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] SegmentParserLoop mProcessedInput=620 len(mInputBuffer)=19011
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] SegmentParserLoop mProcessedInput=620 len(mInputBuffer)=19011
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] CodedFrameProcessing() removeElements(0, 496) of 19011
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=787045
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=787045
[Child 14571: MediaPlayback #2]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] CodedFrameProcessing() removeElements(0, 6766) of 787045
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] SegmentParserLoop mProcessedInput=620 len(mInputBuffer)=18515
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] SegmentParserLoop mProcessedInput=620 len(mInputBuffer)=18515
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] CodedFrameProcessing() removeElements(0, 500) of 18515
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=780279
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=780279
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] CodedFrameProcessing() removeElements(0, 22180) of 780279
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] SegmentParserLoop mProcessedInput=620 len(mInputBuffer)=18015
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] SegmentParserLoop mProcessedInput=620 len(mInputBuffer)=18015
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] CodedFrameProcessing() removeElements(0, 517) of 18015
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=758099
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=758099
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=758099
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] CodedFrameProcessing() removeElements(0, 6471) of 758099
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] SegmentParserLoop mProcessedInput=620 len(mInputBuffer)=17498
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] SegmentParserLoop mProcessedInput=620 len(mInputBuffer)=17498
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] CodedFrameProcessing() removeElements(0, 488) of 17498
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=751628
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=751628
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] CodedFrameProcessing() removeElements(0, 6029) of 751628
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] SegmentParserLoop mProcessedInput=620 len(mInputBuffer)=17010
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] SegmentParserLoop mProcessedInput=620 len(mInputBuffer)=17010
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c6fd4000] CodedFrameProcessing() removeElements(0, 494) of 17010
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=745599
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] SegmentParserLoop mProcessedInput=734 len(mInputBuffer)=745599
[Child 14571: MediaPlayback #3]: V/cpearce TrackBuffersManager[0x7fc1c3ff6000] CodedFrameProcessing() removeElements(0, 25106) of 745599
This log shows a big append, followed by a large number of relatively small RemoveElementsAt(0, N) calls that snip off the bytes consumed. Each of these calls requires memmoving the remaining content of the nsTArray<uint8_t> down by N bytes. So in the extreme cases in the log above, slicing ~500 bytes off the front means memmoving almost 800,000 bytes, and we keep memmoving all of the remaining data every time we remove ~500 bytes from the front, until mInputBuffer is empty.
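A rough model of that cost, as a sketch only: it assumes a fixed chunk size, whereas the chunk sizes in the log vary, and the function name BytesMovedByFrontErase is hypothetical (it is not part of TrackBuffersManager or nsTArray). It counts how many bytes a contiguous array has to shuffle down in total while it is drained from the front.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper: total bytes memmoved while draining a contiguous
// array of aTotal bytes from the front, aChunk bytes at a time.
std::uint64_t BytesMovedByFrontErase(std::size_t aTotal, std::size_t aChunk) {
  assert(aChunk > 0);  // A zero-sized chunk would never drain the array.
  std::uint64_t moved = 0;
  std::size_t remaining = aTotal;
  while (remaining > 0) {
    // The last chunk may be shorter than aChunk.
    const std::size_t removed = remaining < aChunk ? remaining : aChunk;
    remaining -= removed;
    // Everything still in the array is shifted down to index 0.
    moved += remaining;
  }
  return moved;
}
```

With the round figures used above, BytesMovedByFrontErase(800000, 500) comes to 639,600,000 bytes moved in order to consume 800,000 bytes of input, which is why the cost grows roughly quadratically with the size of the append.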
We can speed this up. A couple of ideas:
- We could change TrackBuffersManager::mInputBuffer to a ring buffer. It's not obvious that all the demuxers would handle that gracefully: if the content wrapped around the end of the ring buffer, we'd need to pass demuxers two slices, one from the start of the content to the end of the ring buffer, and a second from the beginning of the ring buffer to the end of the content. It looks to me like the WebMParser could handle that, but it's not clear the MP4 parser would.
- Instead of calling RemoveElementsAt(0, N) on TrackBuffersManager::mInputBuffer, we could maintain a slice into this array and advance the slice's start by N elements. We'd need to pass this slice to everything that previously read TrackBuffersManager::mInputBuffer. This is only a good idea if we can be sure that we always process everything in TrackBuffersManager::mInputBuffer before appending more data onto it. Otherwise the capacity of its underlying storage could grow without bound.
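A minimal sketch of the second idea, assuming std::vector in place of nsTArray. The class name FrontSlicedBuffer and its methods are hypothetical, and the compact-on-append step is one possible way to bound the storage growth mentioned above, not something the bug prescribes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical buffer that consumes from the front by advancing an offset
// instead of shifting the remaining bytes down on every removal.
class FrontSlicedBuffer {
 public:
  void Append(const std::uint8_t* aData, std::size_t aLength) {
    // Drop the consumed prefix once per append rather than once per
    // removal, so the storage cannot accumulate dead bytes indefinitely.
    if (mStart > 0) {
      mStorage.erase(mStorage.begin(),
                     mStorage.begin() + static_cast<std::ptrdiff_t>(mStart));
      mStart = 0;
    }
    mStorage.insert(mStorage.end(), aData, aData + aLength);
  }

  // O(1): only the slice start moves, no bytes are copied.
  void RemoveFront(std::size_t aCount) {
    assert(aCount <= Length());
    mStart += aCount;
  }

  // Readers must use these accessors rather than the raw storage.
  const std::uint8_t* Elements() const { return mStorage.data() + mStart; }
  std::size_t Length() const { return mStorage.size() - mStart; }

 private:
  std::vector<std::uint8_t> mStorage;
  std::size_t mStart = 0;
};
```

In this sketch a large append followed by many small RemoveFront() calls costs at most one shuffle per append, instead of one per removal.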
Comment 1 (Assignee) • 5 years ago
As seen in this profile of a Twitch replay: https://perfht.ml/2K9Ydb3 we can
often end up spending time in TrackBuffersManager::CodedFrameProcessing()
shaving off bytes from the front of TrackBuffersManager::mInputBuffer. This
requires all the remaining bytes to be memmove'd down to the start of this
array. Sometimes we have close to 1MB in that buffer, and when we're just
trying to consume a few hundred bytes, that becomes high overhead.
So instead of using this "slice off, shuffle down" approach, change
TrackBuffersManager::mInputBuffer to be a new type, MediaSpan, which maintains a
RefPtr to a MediaByteBuffer and a span defining the subregion of the buffer we
care about. This means the RemoveElementsAt(0,N) operation becomes basically
free, and we can eliminate a few other copies we were doing as well.
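The idea can be sketched roughly as follows. This is not the MediaSpan that landed in the patch: it assumes std::shared_ptr and std::vector as stand-ins for RefPtr and MediaByteBuffer, and the name SpanSketch and its methods are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <memory>
#include <utility>
#include <vector>

using ByteBuffer = std::vector<std::uint8_t>;

// Hypothetical stand-in for a span over a shared, immutable byte buffer.
class SpanSketch {
 public:
  explicit SpanSketch(std::shared_ptr<const ByteBuffer> aBuffer)
      : mBuffer(std::move(aBuffer)),
        mStart(0),
        mLength(mBuffer ? mBuffer->size() : 0) {}

  // O(1): narrows the view, leaving the underlying bytes untouched.
  void RemoveFront(std::size_t aCount) {
    assert(aCount <= mLength);
    mStart += aCount;
    mLength -= aCount;
  }

  const std::uint8_t* Elements() const {
    return mBuffer ? mBuffer->data() + mStart : nullptr;
  }
  std::size_t Length() const { return mLength; }

 private:
  std::shared_ptr<const ByteBuffer> mBuffer;  // Shared, never copied.
  std::size_t mStart;
  std::size_t mLength;
};
```

Copying a SpanSketch copies only the pointer and two integers, so stashing a second view of the same input does not duplicate the bytes, and advancing one view leaves the other unchanged.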
Comment 2 (Assignee) • 5 years ago
This allows us to avoid a (probably small) copy when we stash the pending input.
Depends on D34661
Pushed by cpearce@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/2e64d8db2b4b
Add MediaSpan and use it for TrackBuffersManager::mInputBuffer. r=jya
https://hg.mozilla.org/integration/autoland/rev/098ce3586133
Convert TrackBuffersManager::mPendingInput into a MediaSpan. r=jya
Comment 4 • 5 years ago
Backed out 2 changesets (Bug 1558364) for build bustages at MediaSpan.h
Backout: https://hg.mozilla.org/integration/autoland/rev/a2930bb79701c54c30bc67ff27d35ee928ee4335
Push that started the failures: https://treeherder.mozilla.org/#/jobs?repo=autoland&resultStatus=pending%2Crunning%2Csuccess%2Ctestfailed%2Cbusted%2Cexception&selectedJob=251802908&revision=098ce3586133fe57aa1d53fe4bdf541b26243563
Failure log: https://treeherder.mozilla.org/logviewer.html#/jobs?job_id=251802908&repo=autoland&lineNumber=19520
Pushed by cpearce@mozilla.com:
https://hg.mozilla.org/integration/autoland/rev/f6df4eb84932
Add MediaSpan and use it for TrackBuffersManager::mInputBuffer. r=jya
https://hg.mozilla.org/integration/autoland/rev/f53080eb8868
Convert TrackBuffersManager::mPendingInput into a MediaSpan. r=jya
Comment 6 (bugherder) • 5 years ago
https://hg.mozilla.org/mozilla-central/rev/f6df4eb84932
https://hg.mozilla.org/mozilla-central/rev/f53080eb8868