Closed
Bug 1448863
Opened 6 years ago
Closed 6 years ago
Stop sync dispatching in mozilla::WebrtcGmpVideoDecoder::Decode
Categories
(Core :: WebRTC: Audio/Video, defect, P2)
Tracking
()
RESOLVED
FIXED
mozilla61
People
(Reporter: bwc, Assigned: bwc)
References
Details
Attachments
(1 file)
59 bytes,
text/x-review-board-request
|
jesup
:
review+
RyanVM
:
approval-mozilla-beta+
|
Details |
This is causing frequent deadlocks over in bug 1375540. Also, fixing bug 1407653 makes the deadlocks even more frequent, because frames are actually being decoded at the rate that they are supposed to.
Updated•6 years ago
|
Rank: 15
Priority: -- → P2
Comment hidden (mozreview-request) |
Comment hidden (mozreview-request) |
Comment 3•6 years ago
|
||
Comment on attachment 8962371 [details] Bug 1448863: Stop sync dispatching in Decode. Sorry, GMP is not my turf at all. I asked jesup on IRC, but no response yet. If not him, I'm not sure who can review this.
Attachment #8962371 -
Flags: review?(apehrson)
Comment hidden (mozreview-request) |
Assignee | ||
Comment 5•6 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=26978554a80d
Comment 6•6 years ago
|
||
mozreview-review |
Comment on attachment 8962371 [details] Bug 1448863: Stop sync dispatching in Decode. https://reviewboard.mozilla.org/r/231228/#review238078 r+ assuming there's no frame-copy required, and that we're certain this doesn't introduce shutdown issues. ::: media/webrtc/signaling/src/media-conduit/WebrtcGmpVideoCodec.cpp:806 (Diff revision 3) > - // Would be really nice to avoid this sync dispatch, but it would require a > - // copy of the frame, since it doesn't appear to actually have a refcount. > - // Passing 'this' is safe since this is synchronous. > - mozilla::SyncRunnable::DispatchToThread(mGMPThread, > - WrapRunnableRet(&ret, this, > + if (!aInputImage._length) { > + return WEBRTC_VIDEO_CODEC_ERROR; > + } > + > + mGMPThread->Dispatch(WrapRunnableNM(&WebrtcGmpVideoDecoder::Decode_g, How was the previous item blocking making this async resolved? Doesn't need a comment in the code, but a comment in the bug about why that's no longer a blocker (or what overhead we're adding; which might need a comment) should be done. If we are adding copying of frames, I'd like to see analysis of the impact of this. ::: media/webrtc/signaling/src/media-conduit/WebrtcGmpVideoCodec.cpp:810 (Diff revision 3) > MOZ_ASSERT(!NS_IsMainThread()); > - // Would be really nice to avoid this sync dispatch, but it would require a > - // copy of the frame, since it doesn't appear to actually have a refcount. > - // Passing 'this' is safe since this is synchronous. > - mozilla::SyncRunnable::DispatchToThread(mGMPThread, > - WrapRunnableRet(&ret, this, > + if (!aInputImage._length) { > + return WEBRTC_VIDEO_CODEC_ERROR; > + } > + > + mGMPThread->Dispatch(WrapRunnableNM(&WebrtcGmpVideoDecoder::Decode_g, Question: does this introduce any shutdown-timing issues? ::: media/webrtc/webrtc.mozbuild:25 (Diff revision 3) > + if CONFIG['CC_TYPE'] in ('msvc', 'clang-cl'): > + DEFINES['__PRETTY_FUNCTION__'] = '__FUNCSIG__' > + > if CONFIG['CC_TYPE'] in ('clang', 'gcc'): Perhaps this should be nominated for the master build config?
Attachment #8962371 -
Flags: review?(rjesup) → review+
Assignee | ||
Comment 7•6 years ago
|
||
mozreview-review |
Comment on attachment 8962371 [details] Bug 1448863: Stop sync dispatching in Decode. https://reviewboard.mozilla.org/r/231228/#review238862 ::: media/webrtc/signaling/src/media-conduit/WebrtcGmpVideoCodec.cpp:806 (Diff revision 3) > - // Would be really nice to avoid this sync dispatch, but it would require a > - // copy of the frame, since it doesn't appear to actually have a refcount. > - // Passing 'this' is safe since this is synchronous. > - mozilla::SyncRunnable::DispatchToThread(mGMPThread, > - WrapRunnableRet(&ret, this, > + if (!aInputImage._length) { > + return WEBRTC_VIDEO_CODEC_ERROR; > + } > + > + mGMPThread->Dispatch(WrapRunnableNM(&WebrtcGmpVideoDecoder::Decode_g, Deadlock bugs take precedence over optimizations, in my view, especially when they are causing a lot of pain on treeherder. Optimizing out the buffer copy would best be left to a separate bug in my opinion.
Assignee | ||
Comment 8•6 years ago
|
||
I have profiled this change, and the CPU cycles spent in the new version of Decode while running the h264 test on webrtc-landing are negligible (less than 0.01% of the samples taken across the content process). This isn't the delta, this is the total. Whatever the cost of these copies is, it is not significant.
Flags: needinfo?(rjesup)
Assignee | ||
Comment 9•6 years ago
|
||
mozreview-review-reply |
Comment on attachment 8962371 [details] Bug 1448863: Stop sync dispatching in Decode. https://reviewboard.mozilla.org/r/231228/#review238078 > Question: does this introduce any shutdown-timing issues? I don't see any new issues on try, and I've done lots of retriggers while shaking out other problems. > Perhaps this should be nominated for the master build config? Possibly. I have opened bug 1451497 for this.
Comment 10•6 years ago
|
||
(In reply to Byron Campen [:bwc] from comment #9) > Comment on attachment 8962371 [details] > Bug 1448863: Stop sync dispatching in Decode. > > https://reviewboard.mozilla.org/r/231228/#review238078 > > > Question: does this introduce any shutdown-timing issues? > > I don't see any new issues on try, and I've done lots of retriggers while > shaking out other problems. Good, though I don't believe that the tests will necessarily trigger such problems; I was looking for analysis of the lifetime of these runnables vs the GMP instance and the calling code, and any checks that they've been resolved and that if there are any races. I realize they hold a ref, of course. > > Perhaps this should be nominated for the master build config? > > Possibly. I have opened bug 1451497 for this. thanks > I have profiled this change, and the CPU cycles spent in the new version of Decode while running the h264 test on webrtc-landing are negligible (less than 0.01% of the samples taken across the content process). This isn't the delta, this is the total. Whatever the cost of these copies is, it is not significant. Good, though that uses 640x480, but even 10x that shouldn't be large. I presume it's actually copying, not copying a surface reference? My actual concern was on low-end machines. (IIRC Android uses platform H264, not OpenH264, but worth double-checking. Thanks
Flags: needinfo?(rjesup)
Assignee | ||
Comment 11•6 years ago
|
||
(In reply to Randell Jesup [:jesup] from comment #10) > (In reply to Byron Campen [:bwc] from comment #9) > > Comment on attachment 8962371 [details] > > Bug 1448863: Stop sync dispatching in Decode. > > > > https://reviewboard.mozilla.org/r/231228/#review238078 > > > > > Question: does this introduce any shutdown-timing issues? > > > > I don't see any new issues on try, and I've done lots of retriggers while > > shaking out other problems. > > Good, though I don't believe that the tests will necessarily trigger such > problems; I was looking for analysis of the lifetime of these runnables vs > the GMP instance and the calling code, and any checks that they've been > resolved and that if there are any races. I realize they hold a ref, of > course. From what I can tell, the way this handles shutdown is by unsetting mGMP on the GMP thread, and we check for that in Decode_g. > > I have profiled this change, and the CPU cycles spent in the new version of Decode while running the h264 test on webrtc-landing are negligible (less than 0.01% of the samples taken across the content process). This isn't the delta, this is the total. Whatever the cost of these copies is, it is not significant. > > Good, though that uses 640x480, but even 10x that shouldn't be large. I > presume it's actually copying, not copying a surface reference? My actual > concern was on low-end machines. (IIRC Android uses platform H264, not > OpenH264, but worth double-checking. > > Thanks I'll look into it.
Assignee | ||
Comment 12•6 years ago
|
||
https://treeherder.mozilla.org/#/jobs?repo=try&revision=ab9bb493baeb
Comment hidden (mozreview-request) |
Assignee | ||
Comment 14•6 years ago
|
||
mozreview-review-reply |
Comment on attachment 8962371 [details] Bug 1448863: Stop sync dispatching in Decode. https://reviewboard.mozilla.org/r/231228/#review238078 > How was the previous item blocking making this async resolved? Doesn't need a comment in the code, but a comment in the bug about why that's no longer a blocker (or what overhead we're adding; which might need a comment) should be done. If we are adding copying of frames, I'd like to see analysis of the impact of this. So I've profiled the latest patch (where I've confirmed that we are actually copying the frame), and we're still talking only about 0.01% of samples in Decode. Makes sense, since these are (small) encoded frames with a pretty low frequency (in the big picture).
Comment 15•6 years ago
|
||
Pushed by bcampen@mozilla.com: https://hg.mozilla.org/integration/autoland/rev/549f0b8075d5 Stop sync dispatching in Decode. r=jesup
Comment 16•6 years ago
|
||
bugherder |
https://hg.mozilla.org/mozilla-central/rev/549f0b8075d5
Status: NEW → RESOLVED
Closed: 6 years ago
status-firefox61:
--- → fixed
Resolution: --- → FIXED
Target Milestone: --- → mozilla61
Comment 17•6 years ago
|
||
Is this low enough risk that we could consider backporting it to Beta? Would be nice to resolve bug 1375540 there too before we're stuck with it on ESR for the next year :)
status-firefox60:
--- → affected
Flags: needinfo?(docfaraday)
Assignee | ||
Comment 18•6 years ago
|
||
I think this would be a good thing to uplift.
Flags: needinfo?(docfaraday)
Assignee | ||
Comment 19•6 years ago
|
||
Comment on attachment 8962371 [details] Bug 1448863: Stop sync dispatching in Decode. Approval Request Comment [Feature/Bug causing the regression]: Bug 1040345. [User impact if declined]: Intermittent deadlocks in webrtc h264 calls. [Is this code covered by automated tests?]: Yes. [Has the fix been verified in Nightly?]: Yes. [Needs manual test from QE? If yes, steps to reproduce]: No. [List of other uplifts needed for the feature/fix]: None, I think. [Is the change risky?]: Not particularly. [Why is the change risky/not risky?]: We're just avoiding a sync dispatch, and we have all of the refcount stuff sorted. [String changes made/needed]: None.
Attachment #8962371 -
Flags: approval-mozilla-beta?
Comment 20•6 years ago
|
||
Comment on attachment 8962371 [details] Bug 1448863: Stop sync dispatching in Decode. Low-risk fix for a bunch of oranges and deadlocks during h264 webrtc calls. Approved for 60.0b13.
Attachment #8962371 -
Flags: approval-mozilla-beta? → approval-mozilla-beta+
Comment 21•6 years ago
|
||
bugherder uplift |
https://hg.mozilla.org/releases/mozilla-beta/rev/f71f11c943ea
Comment 22•6 years ago
|
||
This patch causes an issue whereby firefox will not be able to recover the video stream in a timely manner when packet loss is encountered (i.e. many minutes can pass without a recovery and the video will remain frozen in the meantime). I confirmed this by performing a repro testcase both before and after backing out this changeset -- PL handling works as expected when it's backed out, but works poorly with the changeset in place. jesup has the details on how to repro the issue, and has already forwarded those details via email over to bwc and others. jesup suspected that perhaps it has something to do with the loss of return value from Decode_g() -- prior to the patch, it would return int32_t and after the patch is void. Indeed, it's not clear to me how the caller of Decode_g can know if a decode issue has occurred and handle it (e.g. request an IDR)... is that supposed to happen via some other means?
You need to log in
before you can comment on or make changes to this bug.
Description
•